PHYSICS  volume  I 

Foundations and Applications

ROBERT  M.  EISBERG 
LAWRENCE  S.  LERNER 



Selected Physical Quantities

Quantity                              Typical symbol   SI unit                   Dimensions
                                      for magnitude

Mass                                  m                kilogram, kg              M
Length                                l                meter, m                  L
Time                                  t                second, s                 T
Velocity                              v                m/s                       LT⁻¹
Acceleration                          a                m/s²                      LT⁻²
Angle                                 φ, θ             radian, rad               dimensionless
Angular frequency, angular velocity   ω                rad/s                     T⁻¹
Angular acceleration                  α                rad/s²                    T⁻²
Frequency                             ν                hertz, Hz (= s⁻¹)         T⁻¹
Momentum, impulse                     p, I             kg·m/s                    MLT⁻¹
Force                                 F                newton, N (= kg·m/s²)     MLT⁻²
Work, energy                          W, E, K, U       joule, J (= N·m)          ML²T⁻²
Power                                 P                watt, W (= J/s)           ML²T⁻³
Angular momentum                      L                kg·m²/s                   ML²T⁻¹
Torque                                τ                N·m                       ML²T⁻²
Moment of inertia                     I                kg·m²                     ML²
Stress, pressure                      σ, p             pascal, Pa (= N/m²)       ML⁻¹T⁻²
Elastic moduli                        Y, B, G          Pa                        ML⁻¹T⁻²
Compressibility                       κ                Pa⁻¹                      M⁻¹LT²
Viscosity                             η                Pa·s                      ML⁻¹T⁻¹
Temperature                           T                kelvin, K                 dimensionless
Heat                                  H                J                         ML²T⁻²
Entropy                               S                J/K                       ML²T⁻²
Electric charge                       q                coulomb, C                C
Electric flux                         Φ_e              N·m²/C                    ML³T⁻²C⁻¹
Electric field                        ℰ                N/C (= V/m)               MLT⁻²C⁻¹
Electric potential, emf               V                volt, V                   ML²T⁻²C⁻¹
Electric dipole moment                p                C·m                       LC
Capacitance                           C                farad, F (= C/V)          M⁻¹L⁻²T²C²
Electric current                      i                ampere, A (= C/s)         T⁻¹C
Current density                       j                A/m²                      L⁻²T⁻¹C
Conductance                           S                siemens, S (= A/V)        M⁻¹L⁻²TC²
Resistance                            R                ohm, Ω (= V/A)            ML²T⁻¹C⁻²
Conductivity                          σ                S/m                       M⁻¹L⁻³TC²
Resistivity                           ρ                Ω·m                       ML³T⁻¹C⁻²
Magnetic field                                         tesla, T                  MT⁻¹C⁻¹
Magnetic flux                                          weber, Wb (= T·m²)        ML²T⁻¹C⁻¹
Magnetic dipole moment                m                A·m²                      L²T⁻¹C
Magnetic pole strength                φ                A·m                       LT⁻¹C
Magnetic permeability                 μ                T·m/A                     MLC⁻²
Magnetization                         M                A/m                       L⁻¹T⁻¹C
Inductance                            L                henry, H (= V·s/A)        ML²C⁻²

Selected Non-SI Units and Conversion Factors

1 degree of arc (°) = 2π/360 rad = 1/57.3 rad = 0.0175 rad
1 minute of arc (') = 2.91 × 10⁻⁴ rad
1 second of arc (") = 4.85 × 10⁻⁶ rad

1 day (d) = 86 400 s
1 year (yr) = 3.156 × 10⁷ s

1 angstrom unit (Å) = 0.1 nm = 10⁻¹⁰ m
1 inch (in) = 2.54 cm
1 foot (ft) = 0.3048 m
1 mile (mi) = 1.61 km
1 light year (ly) = 9.46 × 10¹⁵ m

1 liter (l) = 10⁻³ m³
1 cm³ = 10⁻⁶ m³

1 atomic mass unit (u) = 1.661 × 10⁻²⁷ kg
1 slug = 14.59 kg

1 mole (mol) = 10⁻³ kmol

1 dyne (dyn) = 10⁻⁵ N
1 avoirdupois pound (lb) = 4.448 N

1 bar = 10⁵ Pa
1 atmosphere (atm) = 1.013 × 10⁵ Pa
1 mm of mercury (Torr) = 133.3 Pa
1 lb/in² = 6.90 × 10³ Pa

1 electron volt (eV) = 1.60 × 10⁻¹⁹ J
1 erg = 10⁻⁷ J
1 kcal = 4186 J
1 cal = 10⁻³ kcal = 4.186 J
1 kilowatt-hour (kWh) = 3.6 × 10⁶ J
1 foot-pound (ft·lb) = 1.356 J
1 British thermal unit (BTU) = 1055 J = 0.252 kcal
1 horsepower (hp) = 746 W

1 gauss (G) = 10⁻⁴ T


Useful Physical Data

Quantity                                          Value

Gravitational acceleration, ground level value
  in United States                                9.80 m/s²
Mass of earth                                     5.99 × 10²⁴ kg
Mass of moon                                      7.36 × 10²² kg
Mass of sun                                       1.99 × 10³⁰ kg
Average radius of earth                           6.367 × 10⁶ m
Average earth-moon distance                       3.84 × 10⁸ m
Average earth-sun distance                        149.6 × 10⁹ m = 1 AU
Triple point temperature of water                 273.16 K = 0.01°C
Absolute zero of temperature                      -273.15°C


Digitized  by  the  Internet  Archive 

in  2017 


https://archive.org/details/physicsfoundatio01eisb_0 




PHYSICS 

Foundations  and  Applications 

volume  I 


ROBERT  M.  EISBERG 

Professor  of  Physics 
University  of  California,  Santa  Barbara 


LAWRENCE  S.  LERNER 


Professor  of  Physics 

California  State  University,  Long  Beach 


McGraw-Hill  Book  Company 

New  York  St.  Louis  San  Francisco  Auckland  Bogota  Hamburg 
Johannesburg  London  Madrid  Mexico  Montreal  New  Delhi 
Panama  Paris  Sao  Paulo  Singapore  Sydney  Tokyo  Toronto 


PHYSICS:  Foundations  and  Applications,  volume  I 

Copyright  © 1981  by  McGraw-Hill,  Inc.  All  rights  reserved.  Printed  in 
the United States of America. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any
means,  electronic,  mechanical,  photocopying,  recording,  or  otherwise, 
without  the  prior  written  permission  of  the  publisher. 

1234567890  RMRM  898765432  1 

This  book  was  set  in  Baskerville  by  Progressive  Typographers. 

The  editor  was  John  J.  Corrigan; 

the  designer  was  Merrill  Haber; 

the  production  supervisor  was  Dominick  Petrellese. 

The  photo  researcher  was  Mira  Schachne. 

The  drawings  were  done  by  J & R Services,  Inc. 

Rand  McNally  & Company  was  printer  and  binder. 


Library  of  Congress  Cataloging  in  Publication  Data 

Eisberg,  Robert  Martin. 

Physics,  foundations  and  applications. 

Includes  index. 

1.  Physics.  I.  Lerner,  Lawrence  S.,  date 
joint  author.  II.  Title. 

QC21.2.E4  530  80-24417 

ISBN  0-07-019091-7  (v.  1) 


Cover:  “Vega-Nor”  by  Victor  de  Vasarely,  reproduced  by  courtesy  of 
the  Albright-Knox  Art  Gallery,  Buffalo,  New  York,  and  the  Vasarely 
Center,  New  York,  New  York. 


Contents 


PREFACE

Chapter 1 AN INTRODUCTION TO PHYSICS
1-1 What Is Physics?
1-2 The Domains of Physics
1-3 Force and Motion in Newtonian Mechanics

Chapter 2 KINEMATICS IN ONE DIMENSION
2-1 One-Dimensional Motion
2-2 Position and Units of Length
2-3 Time and Units of Time
2-4 Velocity
2-5 Differentiation
2-6 Acceleration
2-7 Velocity and Position for Constant Acceleration
2-8 Vertical Free Fall
Exercises

Chapter 3 KINEMATICS IN TWO AND THREE DIMENSIONS
3-1 Projectile Motion
3-2 Properties of Vectors
3-3 Position, Velocity, and Acceleration Vectors
3-4 The Parabolic Trajectory
3-5 Uniform Circular Motion and Centripetal Acceleration
3-6 The Minimum-Orbit Earth Satellite
3-7 The Conical Pendulum and the Banking of Curves
3-8 The Galilean Transformations
Exercises

Chapter 4 NEWTON’S LAWS OF MOTION
4-1 Newton’s First Law and Inertial Reference Frames
4-2 Newton’s Second and Third Laws
4-3 Mass and Momentum Conservation
4-4 Force and Newton’s Second Law
4-5 Momentum Conservation and Newton’s Third Law
4-6 Forces in Mechanical Systems
Exercises

Chapter 5 APPLICATIONS OF NEWTON’S LAWS
5-1 The Free-Body Diagram
5-2 Atwood’s Machine and Similar Systems
5-3 Motion with Contact Friction
5-4 Fictitious Forces
5-5 Rockets
5-6 The Skydiver
Exercises

Chapter 6 OSCILLATORY MOTION
6-1 Stable Equilibrium and Oscillatory Motion
6-2 The Body at the End of a Spring
6-3 The Simple Pendulum
6-4 Numerical Solution of Differential Equations
6-5 Analytical Solution of the Harmonic Oscillator Equation
6-6 The Damped Oscillator
Exercises

Chapter 7 ENERGY RELATIONS
7-1 A Preview of Energy Relations
7-2 Work Done by a Variable Force
7-3 Integration
7-4 Work and Kinetic Energy
7-5 Conservative Forces
7-6 Potential Energy and Energy Conservation
7-7 Evaluation of Force from Potential Energy
Exercises

Chapter 8 APPLICATIONS OF ENERGY RELATIONS
8-1 Power
8-2 Machines
8-3 Impulse and Collisions
8-4 Harmonic Oscillations
8-5 Lightly Damped Oscillations
Exercises

Chapter 9 ROTATIONAL MOTION, I
9-1 Rotational Kinematics for a Fixed Axis
9-2 Rotational Kinematics for a Free Axis
9-3 Angular Momentum
9-4 Torque
9-5 Rotation of Systems and Angular Momentum Conservation
9-6 Static Equilibrium of Rigid Bodies and Center of Mass
9-7 Stability of Equilibrium
Exercises

Chapter 10 ROTATIONAL MOTION, II
10-1 Moment of Inertia
10-2 The Physical Pendulum and the Torsion Pendulum
10-3 The Top
10-4 Rotation about an Accelerating Center of Mass
10-5 Energy in Rotational Motion
Exercises

Chapter 11 GRAVITATION AND CENTRAL FORCE MOTION
11-1 Universal Gravitation
11-2 Determination of the Universal Gravitational Constant G
11-3 The Mechanics of Circular Orbits: Analytical Treatment
11-4 Reduced Mass
11-5 The Mechanics of Orbits: Numerical Treatment
11-6 Energy in Gravitational Orbits
11-7 Perturbations and Orbit Stability
Exercises

Chapter 12 MECHANICAL TRAVELING WAVES
12-1 Traveling Waves
12-2 Wave Trains
12-3 The Wave Equation
12-4 Traveling-Wave Solutions to the Wave Equation
12-5 Energy in Waves
12-6 Longitudinal Waves and Multidimensional Waves
12-7 The Doppler Effect
Exercises

Chapter 13 SUPERPOSITION OF MECHANICAL WAVES
13-1 Superposition of Waves
13-2 Reflection of Waves
13-3 Standing Waves
13-4 Standing-Wave Solutions to the Wave Equation
13-5 Standing Waves on a Circular Membrane
13-6 Acoustics
13-7 Fourier Synthesis
13-8 The Physics of Music
Exercises

Chapter 14 RELATIVISTIC KINEMATICS
14-1 The Relativistic Domain
14-2 The Speed of Light
14-3 The Equivalence of Inertial Frames
14-4 Simultaneity
14-5 Time Dilation and Length Contraction
14-6 The Lorentz Position-Time Transformation
14-7 The Lorentz Velocity Transformation
Exercises

Chapter 15 RELATIVISTIC MECHANICS
15-1 The Basis of Relativistic Mechanics
15-2 Relativistic Mass and Momentum
15-3 Relativistic Force and Energy
15-4 Relativistic Energy Relations
15-5 Energy and Rest Mass in Chemical and Nuclear Reactions
15-6 Nuclear Reaction Q Values
Exercises

ANSWERS

INDEX


Preface 


Science  is  constructed  of  facts,  as  a house  is  of  stones. 
But  a collection  of  facts  is  no  more  a science  than  a 
heap  of  stones  is  a house. 

Henri Poincaré
Science  and  Hypothesis 


In this book we present the science of physics in a carefully structured
manner which emphasizes its foundations as well as its applications. The
structure is flexible enough, however, for there to be paths through it
compatible with the various presentations encountered in introductory physics
courses having calculus as a corequisite or prerequisite.

We have always kept in view the idea that a textbook should be a complete
study aid. Thus we have started each topic at the beginning and have
included everything that a student needs to know. This feature is central to
the senior author’s successful textbooks on modern physics and on quantum
physics.

The book is written in an expansive style. Attention paid to motivating
the introduction of new topics is one aspect of this style. Another is the
space devoted to showing that physics is an experimentally based science.
In Volume I direct experimental evidence is repeatedly brought into the
developments by the use of photographs. And although the experiments
underlying the topics considered in Volume II generally do not lend
themselves to photographic presentation, at least the flavor of the laboratory
work is given by including careful descriptions of the experiments. Still
another aspect of the expansive style is found in the frequent discussions of
the microscopic basis of macroscopic phenomena.

Developments are often presented in “spiral” fashion. That is, a qualitative
discussion is followed by a more rigorous treatment. An example is
found in the development of Newton’s second law. Chapter 1 introduces its
most important features in a purely qualitative way. When the second law is
treated systematically in Chap. 4, Newton’s approach, using intuitive
notions of mass and force, is followed by Mach’s approach, where mass and
force are defined logically in terms of momentum in a manner suggested
by the analysis of a set of collision experiments.

The book contains many features designed to help the student. For
instance, when a term is defined formally or by implication, or is redefined in
a broader way, it is emphasized with boldface letters. And all such items in
boldface are listed in the index to make it easy to locate definitions which a
student may have forgotten.

It  is  not  intended  that  course  lectures  cover  every  point  made  in  the 
book.  The  book  can  be  relied  upon  to  do  many  of  the  straightforward 
things  that  need  to  be  done,  thereby  freeing  the  instructor  to  concentrate 
on  the  things  that  cause  students  the  most  trouble.  Instructors  interested  in 
teaching  a self-paced  course  will  find  that  the  completeness  of  this  book 
makes  it  well  adapted  to  use  in  such  a course. 

A novel feature of this book is the use of numerical procedures
employing programmable calculating devices. At the risk of giving them more
emphasis than is warranted by their importance to the book, we describe
in the following paragraphs what these procedures make possible, and how
they can be implemented. Numerical procedures are used for:

1. Numerical differentiation and integration. For students concurrently
studying calculus this drives home the fundamental concepts of a
limit, a derivative, and an integral.

2. Assistance in curve plotting. This is put to good use in studying
ballistic trajectories, electric field lines and equipotentials, and wave groups.

3. Numerical solution of differential equations. This procedure
permits the use of Newton’s second law in a variety of cases involving varying
forces. It also is applied to the vibration of a circular drumhead, LRC
circuits, and Schrödinger’s equation.

4. Simulation of statistical experiments. The procedure allows
fundamental topics of statistical mechanics to be introduced
in an elementary way.

5. Multiplication of several 2 by 2 matrices. This makes practical the
introduction of a very simple yet very powerful method of doing ray optics.
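As a rough illustration of item 1, the sketch below estimates a derivative from the slope of a chord and shows the estimate settling toward a limit as the chord shrinks. It is written in Python rather than for a programmable calculator, it is not taken from the Numerical Calculation Supplement, and the function names and step sizes are hypothetical choices made only for this example.

```python
import math

def central_difference(f, x, h):
    """Estimate f'(x) as the slope of the chord from x - h to x + h."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def shrinking_estimates(f, x, steps=(1.0, 0.1, 0.01, 0.001)):
    """Return (h, slope) pairs for successively smaller half-widths h."""
    return [(h, central_difference(f, x, h)) for h in steps]

if __name__ == "__main__":
    # The exact derivative of sin at x = 0 is cos(0) = 1.
    for h, slope in shrinking_estimates(math.sin, 0.0):
        print(f"h = {h:<6} slope = {slope:.6f}")
```

Running the script prints one slope per step size; the values should approach 1 as h decreases, which is the behavior the concept of a limit describes.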

The principal advantage of using numerical procedures in the
introductory course is that it frees the physics content of the course from the
limitations normally imposed by the students’ inability to manipulate
differential equations analytically or to handle certain other analytical techniques.
To give just one example of the many embodied in this book, we have
found that students are quite interested in the numerical work on celestial
mechanics and are well able to understand the physics involved. It is the
mathematical difficulty of the traditionally used analytical techniques that
normally mandates the deferral of this material to advanced courses.

The advantages of the numerical procedures go the other way as well:
they open up mathematical horizons not usually accessible to the
introductory-level student. The analytical solution of a differential equation
generally requires an educated guess at the form of the solution. It is
precisely such a guess that the student is not prepared to make, or to accept
from others. But the numerical solution suggests the correct guess strongly
and directly. Our experience is that students armed with such insight can
go through the analytical solution confidently. The book exploits this
advantage on several occasions.
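A minimal sketch of how a numerical solution can suggest the analytical guess, under stated assumptions: a body on a spring obeying a = -(k/m)x is stepped forward in time with the Euler-Cromer (semi-implicit Euler) rule. That rule is chosen here only for brevity and is not necessarily the procedure used in the book; the function name, parameter values, and time step are likewise hypothetical.

```python
import math

def simulate_oscillator(k=1.0, m=1.0, x0=1.0, v0=0.0,
                        dt=0.001, t_end=2 * math.pi):
    """Step x'' = -(k/m) x forward in time; return a list of (t, x)."""
    n = int(round(t_end / dt))
    x, v = x0, v0
    history = [(0.0, x)]
    for i in range(1, n + 1):
        a = -(k / m) * x      # Newton's second law for the spring force
        v += a * dt           # update velocity first (Euler-Cromer)
        x += v * dt           # then position, using the new velocity
        history.append((i * dt, x))
    return history

if __name__ == "__main__":
    # Print a few samples next to the candidate guess x0 * cos(omega * t).
    omega = math.sqrt(1.0 / 1.0)
    for t, x in simulate_oscillator()[::1000]:
        print(f"t = {t:5.2f}  numerical x = {x:+.4f}  cos = {math.cos(omega * t):+.4f}")
```

With these values the tabulated positions should track a cosine closely, which is the kind of hint that makes the trial solution x = A cos(ωt) feel natural before it is verified analytically.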

The numerical work can be presented in the lecture part of a course in
various ways. One which has proven to be successful is to demonstrate to
the students the first numerical procedure that is emphasized by using a
closed-circuit TV system to provide an enlarged view of the display of a
programmable calculator or small computer running through the
procedure. (Programs for every numerical procedure used in the book, and
step-by-step operating instructions, are given in the accompanying pamphlet,
the Numerical Calculation Supplement.) After the demonstration, a graph of
the results obtained is shown to the students by projecting a transparency
made from the appropriate figure in the book, and the significance of the
results is explained. In subsequent lectures involving numerical
procedures, all that need be done is to graph their results and then discuss the
meaning of the results. An instructor who is more inclined to numerical
procedures may want to give more demonstrations; one who is less
convinced of their worth need not give any. The essential point is that
explanations of the physics emerging from the numerical work can be well
understood by students who do no more than look carefully at graphs of the
results obtained.

But it goes without saying that students will get more out of an active
involvement with the numerical procedures than a passive one. The most
active approach is to ask the students to do several of the homework
exercises labeled Numerical in each of the fourteen chapters where some use is
made of numerical procedures. But the instructor should not assign too
many numerical exercises, particularly at first, because some are rather
time-consuming. A good way to start is to make the numerical exercises
optional or to give extra credit for them. Instruction in operating a
programmable calculator or small computer can be given in a laboratory period or
in one or two discussion periods.

We  now  describe  paths  which  may  be  taken  through  this  book,  other 
than  the  one  going  continuously  from  the  beginning  to  the  end. 

1. Several entire topics can be deleted without difficulty. These are:
relativity, Chaps. 14 and 15; fluid dynamics, Secs. 16-6 and 16-7; thermal
physics, Chaps. 17 through 19; changing electric currents, Chap. 26;
electromagnetic waves, Chap. 27; optics, Chaps. 28 and 29; and quantum
physics, Chaps. 30 and 31.

2.  We  believe  the  book  contains  as  much  modern  physics  as  should  be 
in  the  introductory  course.  This  material  is  distributed  throughout  the 
book,  but  it  has  been  written  in  such  a way  that  there  will  be  no  problem  in 
presenting  it  all  in  the  final  term.  To  do  so,  the  following  material  should 
be  skipped  in  proceeding  through  the  book,  and  presented  at  the  end: 
Chaps.  14  and  15;  Secs.  20-1,  20-3,  22-4,  22-5,  23-3,  24-2,  24-4,  and  24-5. 
Then  close  with  Chaps.  30  and  31. 

3.  In  some  schools  the  study  of  thermal  physics  is  undertaken  before 
that  of  wave  motion.  For  such  a purpose  Chaps.  16  through  19  can  be 
treated  before  Chaps.  12  and  13. 

4.  If  it  is  desired  to  present  a shorter  course  in  which  no  major  topics 
are  to  be  deleted,  the  sections  in  the  following  list  can  be  dropped  without 
significantly  interrupting  the  flow  of  the  argument  and  without  passing 
over  material  essential  to  subsequent  subject  matter.  (In  some  cases  it  will 
be  necessary  to  substitute  a very  brief  qualitative  summary  of  the  ideas  not 
treated  formally  when  the  need  for  these  ideas  arises.  Sections  marked  with 
an  asterisk  are  those  to  be  deleted  if  it  is  desired  to  avoid  entirely  the  wave 
equation  in  its  various  forms.  If  this  is  done,  electromagnetic  radiation  may 
still  be  treated  on  a semiquantitative  basis.)  The  sections  which  can  be 
dropped  are:  2-5,  2-8,  3-7,  4-2  (if  some  of  the  examples  are  used  later),  5-4, 
5-5,  6-1,  6-6,  7-1,  7-3,  8-2,  8-5,  9-7,  10-2,  10-3,  11-2,  11-4,  11-7,  12-3*, 
12-4*,  12-5,  12-6,  13-4*,  13-5*,  13-6,  13-7,  13-8,  15-5,  15-6,  16-5,  16-6, 
17-5,  18-6,  19-6,  19-7,  20-1,  20-3,  21-5,  21-8,  22-4,  22-5,  23-3,  24-2,  24-4, 
24-5,  25-4,  26-6,  26-7,  26-8,  26-9,  27-3*,  27-4*,  27-6,  28-5,  28-7,  29-2,  29-6, 
29-7,  30-3,  30-4,  31-3,  and  31-4.  In  addition,  any  material  in  small  print  can 
be  dropped. 

Many persons have assisted us in writing this book. In particular,
advice on presentation or on technical points and/or aid in producing many
of the photographs was given by R. Dean Ayers, Alfred Bork, John
Clauser, Roger H. Hildebrand, Daniel Hone, Anthony Korda, Jill H.
Larken, Isidor Lerner, Narcinda R. Lerner, Ralph K. Myers, Roger Osborne,
and Abel Rosales. The manuscript was reviewed at various stages, in part
or in whole, by Raymond L. Askew, R. Dean Ayers, Carol Bartnick,
George H. Bowen, Sumner P. Davis, Joann Eisberg, Lila Eisberg, Austin
Gleeson, Russell K. Hobbie, William H. Ingham, Isidor Lerner, Ralph K.
Myers, Herbert D. Peckham, Earl R. Pinkston, James Smith, Jacqueline D.
Spears, Edwin F. Taylor, Gordon G. Wiseman, Mason Yearian, Arthur M.
Yelon, and Dean Zollman. Isidor Lerner contributed many of the
exercises; others were written by Van Bluemel, Don Chodrow, Eugene
Godfredsen, John Hutcherson, William Ingham, Daniel Schechter, and Mark F.
Taylor. Dean Zollman assisted greatly in selecting and editing exercises.
Don Chodrow and William Ingham checked all solutions, compiled the
short answers that appear in the back of the book, and prepared the
Solutions Manual. Herbert D. Peckham wrote the original versions of the
computer programs for the Numerical Calculation Supplement. Lila Eisberg
played a major role in reading proof and prepared the index. Important
contributions to the development of the manuscript and its transformation
into a book were made by John J. Corrigan, Mel Haber, Annette Hall,
Alice Macnow, Peter Nalle, Janice Rogers, Jo Satloff, and Robert Zappa at
McGraw-Hill, and by our photo researcher, Mira Schachne. Many students
at the University of California, Santa Barbara, and at California State
University, Long Beach, had a real impact on the manuscript by asking just
the right questions in class. To all these persons we express our warmest
thanks.


Robert  M.  Eisberg 
Lawrence  S.  Lerner 




An  Introduction 
to  Physics 


1-1 WHAT IS PHYSICS?

Physics is the systematic study of the basic properties of the universe. Each
of these properties is related to interactions among the objects found in the
universe. In the branch of physics known as cosmology, the overall
structure of the universe itself is analyzed by taking into account interactions
between every part of the universe and every other part. But in more
typical branches of physics any one object can be assumed to interact in a
significant way only with other objects that are not too distant. Thus a property
usually can be studied by considering the interactions among a limited set
of objects, as well as the interactions this set may have with a few other
objects in its vicinity. A set of objects on which attention is focused is called a
system. The smallest physical systems are those investigated in
elementary-particle physics; they are the tiny building blocks from which
everything else is constructed. The largest physical system is the entire
universe, as it is treated in cosmology. Between these extremes lies a
tremendous variety of systems that are studied in physics.

The motivations for studying physical systems are almost as varied as
the systems themselves. A container of the superfluid liquid helium is an
example of a system that may be studied because of its inherent interest.
Often, however, the investigation of a system is directed toward a
commercial or social end. This happens in semiconductor physics and medical
physics. Sometimes the motivation lies in the light the work will shed on a
field of science other than physics; an example is the application of physics
to molecular biology. In such a situation, the line separating that field and
physics is sure to become blurred and eventually obliterated. This has
already occurred in astronomy, chemistry, and several branches of
engineering, and is currently occurring in molecular biology.




The strategy used in the study of physical systems is one of the most
powerful inventions of the human mind. Its fruits have completely
transformed the way the human race lives, the way its members think, and the
world they inhabit. The strategy of physics has three characteristic
features.

The first feature is that the analysis of a physical system tends to be
carried out in terms of the properties of simpler systems. In investigating a
system, a physicist seeks to treat separately each factor influencing its
behavior. This usually involves switching attention to a collection of simpler
systems. Each of these is related in some important way to the original
system, but has fewer factors that are vital to its behavior. Being simpler,
these systems can be investigated to the extent that their properties are well
understood. When this has been done, the information obtained can be
mentally reassembled into an understanding of the properties of the
original system. But sometimes this last step is not taken, or only partly taken,
because it is found that the information of most general significance to
physics is obtained in the analysis of one of the simpler systems. You will
see an example of this feature in Sec. 1-3.

The  second  feature  of  the  strategy  is  that  physics  is  uncompromisingly 
based  on  experiment.  You  will  see  an  example  of  this  also  in  Sec.  1-3. 
Sometimes  theory  suggests  experiment.  Much  more  frequently,  though,  an 
experimentalist  does  the  pioneering  work  in  a particular  area  of  physics, 
and  the  theoretician  then  synthesizes  the  results  of  the  experiments  and 
deepens  the  understanding  of  their  significance.  A really  good  theory 
suggests  new  and  interesting  experiments  which  can  be  used  to  confirm  it. 
But  beyond  this,  such  a theory  goes  on  to  suggest  new  areas  of  interest. 

Whether  he  or  she  is  experimentally  or  theoretically  inclined,  the 
physicist  acknowledges  that  ideas  must  be  tested  by  experiments.  No 
matter  how  beautiful  an  idea  may  seem,  no  matter  how  attached  he  or  she 
may  have  become  to  it,  the  physicist  agrees  in  advance  to  abandon  the  idea 
if  it  conflicts  with  the  evidence  furnished  by  a crucial  experiment.  But 
experience  repeated  innumerable  times  has  shown  that  the  abandoned 
idea  is  never  as  beautiful  as  the  one  which  ultimately  emerges  to  provide  an 
accurate  description  of  nature. 

A third feature of the strategy of physics is the frequent use of
mathematics. You may have heard the saying that mathematics is the language of
physics. It is worth exploring what the saying means. Physics is concerned
with the interactions among objects. We believe that objects interact
according to certain laws, whether we know those laws or not. Since physical
laws are almost always quantitative, it becomes essential to be able to trace
quantitative logical connections in studying physical systems. The rules
governing all such connections (whether they have anything to do with the
physical universe or not) are the subject of mathematics. Thus most of the
rules and procedures of mathematics are directly applicable to the
understanding of physics. Mathematics is used in physics because in all except the
simplest situations it provides by far the most convenient way to trace the
logical relationships that arise in the analysis of physical systems.

Granted that mathematics is the language of physics, we must stress
that this does not mean that mathematics is physics, or vice versa.
Consequently, when we obtain a result from a mathematical argument, we will be
interested principally in both the physical meaning of the steps used to
obtain it and the experimental verifiability of the result.




The part of the mathematical language which you will need to use most
frequently at first relates certain quantities to the rate of change of other quantities,
that is, differential calculus. We assume that you come to the study of physics
without having previously studied calculus. All aspects of calculus that you will
need for using this book are developed from scratch, as they are needed. But this is
not a calculus book, and you certainly should be working through such a book as
you work through this one.

In the many years since it evolved in the sixteenth century, the use of
the strategy of physics has spread to all fields of science. Indeed, some
fields, such as psychology and economics, are considered “scientific” to the
extent that they make use of the strategy or parts of it. The strategy is most
successfully applied in physics, however, because it is especially suitable for
the relatively simple systems which are the main concern of physics. It may
be far from easy for a physicist to analyze one by one the factors affecting
the behavior of a simple atom, using the language of mathematics, and
then to verify in the laboratory the results obtained. But it is still much less
difficult than it is for a chemist to carry out the analogous task for a
complex molecule. More difficult yet is the task of a biologist in carrying out a
similar procedure on the enormously complex molecule called a gene. To
make full application of the strategy of physics in studying a living animal is
a practical impossibility. The point is that the number of significant factors
increases rapidly with the complexity of a system, and these factors become
so intimately interrelated that they cannot be separated. Put briefly, physics
is the simplest science because it studies the simplest systems. For this
reason physics forms the foundation of all other sciences.

The  connection  between  physics  and  engineering  is  even  more  direct 
than  that  between  physics  and  any  of  the  other  sciences.  An  engineer  deals 
with  systems  to  which  the  principles  of  physics  are  immediately  applicable. 
Electrical  engineering,  as  an  example,  is  in  large  part  a matter  of  making 
highly  sophisticated  practical  applications  of  the  work  of  nineteenth-  and 
twentieth-century physicists. No matter what field of engineering or science
you are planning to enter, you will find continued use for what you
learn  in  the  study  of  physics.  You  will  find  use  for  the  specific  facts  of  phys- 
ics, for  the  techniques  used  in  solving  physics  problems,  and  for  the  frame 
of  mind  you  inevitably  acquire  through  the  study  of  physics. 

It  is  thus  that  physics  assumes  a central  role.  Its  facts,  its  procedures, 
and  its  view  of  the  world  find  vital  applications  everywhere.  In 
addition — and  this  is  a point  of  great  significance  to  the  physicist — the  edi- 
fice of  physics  is  a beautiful  and  fascinating  work  of  the  intellect.  Albert 
Einstein  called  it  a miracle  that  the  human  mind  is  so  constructed  as  to  have 
the  ability  to  comprehend  the  universe  of  which  it  is  a part.  Whether  you 
wish  to  call  it  a miracle  or  not,  you  will  come,  as  your  understanding  of  the 
subject  develops,  to  agree  with  the  gist  of  his  statement. 


1-2 THE DOMAINS OF PHYSICS

It is always useful to divide a complex field into domains. For physics one
such division can be made according to the sizes of the objects studied.
Specifically, objects are considered small if their sizes are comparable to or
smaller than the size of an atom. Objects are considered large if they are
larger than an atom. This division into the domains of small and large objects
is shown schematically in Fig. 1-1a. The domain of small objects is called the
quantum domain, and the domain of large objects is called the nonquantum
domain. (This division according to size is not a completely rigid one since in
certain circumstances phenomena of the quantum domain can be seen in large
objects. The superfluid properties of a container of liquid helium provide an
example.)

Fig. 1-1 (a) A schematic illustration of the division of physics into the
nonquantum domain, in which the objects studied are large compared to an atom,
and the quantum domain, in which the objects studied are comparable in size to
or smaller than an atom. (b) The division of physics into the nonrelativistic
domain, in which the speeds of the objects studied are low compared to the speed
of light, and the relativistic domain, in which the speeds are comparable to the
speed of light. (c) The domain in which the objects studied both are large
compared to an atom and have speeds low compared to the speed of light is called
the newtonian domain. The name honors Isaac Newton, who developed the form of
mechanics that is used to obtain accurate predictions of the motion of objects
in this domain.

Another  division  of  physics  into  domains  can  be  made  according  to  the 
speeds of the objects being studied, as shown in Fig. 1-1b. If the speeds are
small  compared  to  the  speed  of  light,  they  are  considered  low.  If  they  are 
comparable  to  the  speed  of  light,  they  are  considered  high.  The  domain  of 
high  speeds  is  known  as  the  relativistic  domain,  and  the  domain  of  low 
speeds  is  known  as  the  nonrelativistic  domain.  (There  is  flexibility  in  the 
division  according  to  speed,  just  as  there  is  in  the  division  according  to  size. 
An  example  is  found  in  the  magnetic  effects  produced  when  electrons 
move  through  a wire.  Although  the  speeds  of  the  electrons  are  small  com- 
pared to  the  speed  of  light,  the  effects  are,  basically,  caused  by  a phenome- 
non of  the  relativistic  domain.) 

A combination  of  the  divisions  according  to  size  and  speed  is  indicated 
in Fig. 1-1c. Emphasis has been put on the region which lies in both the domain
of large objects and the domain of low speeds. This region has a special
significance (it is the one we deal with in our everyday lives) and is
given  a special  name.  It  is  called  the  newtonian  domain,  in  honor  of  Isaac 
Newton,  the  seventeenth-century  physicist  who  played  the  key  role  in  devel- 
oping the  physics  of  large  objects  moving  at  low  speeds. 

When  objects  interact,  they  exert  forces  on  one  another.  The  force 
acting  on  an  object  determines  how  the  object  moves.  Mechanics  is  the 
study  of  the  relation  between  the  force  and  resulting  motion.  That  is, 
mechanics  seeks  to  account  quantitatively  for  the  motions  of  objects  having 
given  properties  in  terms  of  the  forces  acting  on  them.  The  mechanics  of 
the  newtonian  domain  is  known  as  newtonian  mechanics.  It  is  the  me- 
chanics of  systems  containing  objects  which  are  large  compared  to  an  atom 
and  which  move  at  speeds  that  are  low  compared  to  the  speed  of  light.  An 
example  is  our  planetary  system.  Planets  are  certainly  very  large  compared 
to  an  atom,  and  they  move  with  speeds  quite  low  in  comparison  to  the 
speed  of  light.  Near  the  end  of  the  seventeenth  century  Newton  took  a 
giant  step  in  explaining  the  astronomical  observations  that  had  been  made 
on  the  planets  over  the  preceding  centuries.  Out  of  Newton’s  work  flowed  a 
powerful  science  capable  of  explaining  an  immense  variety  of  familiar  nat- 
ural phenomena. 

Until  shortly  after  the  turn  of  the  present  century  it  was  thought  that 
there  were  no  limits  to  the  applicability  of  newtonian  mechanics.  But  in 
1905  Einstein  showed  that  a quite  different  (though  not  unrelated)  ap- 
proach was  necessary  for  the  study  of  objects  moving  with  speeds  so  high  as 
to  be  comparable  to  the  speed  of  light.  These  objects  are  in  the  relativistic 
domain  and  must  be  treated  by  relativistic  mechanics. 

At  about  the  same  time  Max  Planck,  Louis  de  Broglie,  Erwin 
Schrödinger, and others found that newtonian mechanics could not explain
the motion of objects whose size is on the atomic scale or smaller.
These  objects  require  the  use  of  quantum  mechanics  because  they  are  in 
the  quantum  domain. 

Some  of  the  most  exciting  work  in  contemporary  physics  lies  in  the 
quantum domain, or in the relativistic domain, or in both. Nevertheless,
the  logical  structure  of  physics  demands  that  this  book  begin  in  the  new- 
tonian  domain  with  newtonian  mechanics.  Furthermore,  newtonian  me- 
chanics is  very  important  in  itself.  The  fact  that  it  is  the  oldest  form  of  me- 
chanics by  no  means  renders  it  obsolete  or  inactive.  In  particular,  many 
contemporary  engineering  and  biological  applications  of  physics  are  based 
completely  on  newtonian  mechanics.  However,  two  chapters  in  the  middle 
of  the  book  are  devoted  to  relativistic  mechanics,  and  two  chapters  at  the 
end  of  the  book  to  quantum  mechanics. 

You  may  have  expected  mention,  in  this  categorization  of  physics,  of 
topics  such  as  electromagnetism,  heat,  acoustics,  and  solid-state  physics. 
How  do  these  fit  into  the  structure  we  have  set  forth? 

Solid-state  physics  is  an  example  of  a branch  of  physics  defined  by  its 
subject  matter — the  properties  of  solids — rather  than  by  a certain  proce- 
dure used  to  study  the  subject  matter.  A particular  problem  in  solid-state 
physics,  for  instance  the  properties  of  materials  used  in  transistors,  is  at- 
tacked by  employing  the  mechanics  of  whichever  domain  is  most  appropri- 
ate. In  most  cases,  this  is  quantum  mechanics  or  newtonian  mechanics. 

Acoustics  and  heat  concern  particular  phenomena  in  various  kinds  of 
matter,  for  instance  the  creation  of  sonic  booms  by  a supersonic  airplane. 
Very  often  they  can  be  understood  completely  in  terms  of  systems  of  par- 
ticles acting  in  conformity  with  the  rules  of  newtonian  mechanics,  though 
some  cases  require  quantum  mechanics. 

Electromagnetism  is  a study  of  the  properties  and  consequences  of  the 
electromagnetic  force,  which  is  one  of  the  four  fundamental  forces  in  na- 
ture. The  electromagnetic  force  plays  an  important  role  in  all  domains  of 
physics.  For  instance,  it  is  the  force  of  overwhelming  importance  in 
solid-state  physics,  the  study  of  which  involves  several  domains. 

Of  the  three  fundamental  forces  other  than  the  electromagnetic  one, 
the  gravitational  force  is  the  most  familiar.  The  remaining  two,  called  the 
strong  nuclear  force  and  the  weak  nuclear  force,  are  unfamiliar  because 
they  operate  over  only  very  small  distances.  Thus  we  do  not  experience 
them  directly  with  our  senses.  Nevertheless,  the  strong  and  weak  nuclear 
forces  are  the  dominant  forces  holding  together  the  smallest  parts  of  the 
universe.  Since  the  larger  parts  are  built  up  of  the  smaller  ones,  many 
large-scale  properties  of  the  universe  do  depend  ultimately  on  the  nuclear 
forces.  We  encounter  all  four  fundamental  forces  in  our  study  of  physics, 
but  electromagnetism  and  gravitation  will  occupy  most  of  our  attention. 

In  the  next  section  we  begin  to  look  at  the  newtonian  domain  through 
the  eyes  of  the  physicist  by  asking  the  basic  question  of  newtonian  me- 
chanics: What  is  the  relation  between  force  and  motion? 


1-3 FORCE AND MOTION IN NEWTONIAN MECHANICS


Whether  or  not  you  have  previously  taken  a physics  course,  you  come  to 
the  study  of  physics  in  this  book  having  more  experience  with  the  subject 
than  you  may  think.  You  could  not  have  survived  in  the  physical  world 
without  a considerable  grasp  of  how  it  operates,  particularly  in  its  mechan- 
ical aspects.  In  this  section  we  make  a first  approach  to  newtonian  me- 
chanics by  investigating  the  relation  between  force  and  motion.  This  ap- 
proach takes  the  form  of  qualitative  discussions  based  partly  on  the  me- 
chanical intuition  you  have  gained  from  everyday  experience  and  partly  on 
experimental evidence provided by a set of photographs. Your intuitive
understanding  of  many  of  the  topics  considered  here  makes  it  possible  for 
the  discussions  to  be  quite  brief.  But  each  topic  is  treated  in  a much  more 
thorough  and  systematic  way  in  subsequent  chapters. 

One  purpose  of  presenting  such  material  at  this  point  is  to  give  you  a 
preliminary  example  of  the  strategy  of  physics.  The  arguments  must  be 
qualitative at this stage because we have not established the physical foundations
for quantitative ones. Thus mathematics is not applicable, and so
the  third  feature  of  the  strategy  is  not  used.  Otherwise,  the  approach  is  rep- 
resentative of  that  actually  employed  in  physics  because  the  first  and  sec- 
ond features  are  used.  In  particular,  this  section  should  give  you  some  idea 
of  (1)  the  way  a physicist’s  investigation  of  a complicated  system  leads  to  the 
investigation  of  a set  of  related  but  progressively  simpler  systems  and  (2) 
the  way  experiment  is  used  in  this  process. 

As  was  said  earlier,  newtonian  mechanics  is  concerned  with  familiar 
objects  (objects  not  very  small)  moving  in  familiar  ways  (at  speeds  not  very 
high). The aim is to explain the motion of an object in terms of the force
acting  on  it.  Motion  occurs  when  an  object  changes  its  position.  Here  we 
consider  the  simplest  kind  of  motion,  in  which  the  object  moves  always  in 
the  same  direction  along  a straight  line.  One  of  the  terms  used  to  describe 
the  motion  is  commonplace.  It  is  speed.  (In  this  case  of  motion  along  a 
straight  line  in  only  one  direction,  it  is  not  necessary  to  make  the  technical 
distinction  between  speed  and  velocity  because  the  two  quantities  are  equiv- 
alent.) Speed  is  a measure  of  how  much  the  position  of  the  object 
changes  — in  other  words,  how  far  it  moves  — in  a certain  small  interval  of 
time.  In  fact,  the  speed  of  an  object  moving  in  a particular  direction  along  a 
straight  line  is  just  the  change  in  its  position  divided  by  the  time  interval  in 
which  the  change  occurs. 
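
The definition of speed just given amounts to a single division. The short Python sketch below only illustrates it; the function name average_speed and the numbers are assumptions chosen for this example and do not come from the text.

```python
def average_speed(x_start, x_end, dt):
    """Change in position divided by the time interval in which it occurs.

    Assumes motion in one direction along a straight line and a positive
    time interval dt.
    """
    if dt <= 0:
        raise ValueError("time interval must be positive")
    return (x_end - x_start) / dt


# Hypothetical numbers: the object moves from 2.0 m to 5.0 m in 1.5 s.
speed = average_speed(2.0, 5.0, 1.5)
print(speed)  # 2.0 (meters per second)
```

The shorter the time interval used, the more closely this quotient describes the speed at a particular instant.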

The other term used to describe the motion we are considering is almost
as widely used in everyday language as speed. It is acceleration.
The  term  acceleration  is  useful  if  the  speed  of  the  object  is  not  constant. 
Acceleration  is  a measure  of  how  much  the  speed  of  the  object  changes  in  a 
certain  small  interval  of  time.  Specifically,  the  acceleration  of  an  object 
moving  in  a particular  direction  along  a straight  line  is  just  the  change  in 
speed  divided  by  the  time  interval  in  which  the  change  occurs. 

If  an  object  moving  in  only  one  direction  along  a straight  line  has  a con- 
stant speed,  then  it  has  no  acceleration.  If  the  object  has  an  increasing  speed, 
then  in  common  language  it  is  said  to  be  accelerating,  and  in  technical  lan- 
guage it  is  said  to  have  an  acceleration  in  the  same  direction  as  the  direction 
of  its  motion.  If  the  object  has  a decreasing  speed,  then  commonly  it  is  said  to 
be  decelerating,  and  technically  it  is  said  to  have  an  acceleration  in  the  oppo- 
site direction  to  its  direction  of  motion. 
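
The three cases just described can be told apart by the sign of the change in speed. The following Python sketch is one way to express this, assuming that a positive value means "in the direction of motion". The function names and the speeds are hypothetical and serve only as an illustration.

```python
def average_acceleration(v_start, v_end, dt):
    """Change in speed divided by the time interval in which it occurs."""
    if dt <= 0:
        raise ValueError("time interval must be positive")
    return (v_end - v_start) / dt


def describe(a):
    """Translate the sign of the acceleration into the wording of the text.

    Positive: acceleration along the direction of motion.
    Negative: acceleration opposite to the direction of motion.
    """
    if a > 0:
        return "acceleration in the direction of motion"
    if a < 0:
        return "acceleration opposite to the direction of motion"
    return "no acceleration"


# Hypothetical speeds in m/s, each change taking place over 2.0 s.
print(describe(average_acceleration(10.0, 10.0, 2.0)))  # constant speed
print(describe(average_acceleration(10.0, 16.0, 2.0)))  # increasing speed
print(describe(average_acceleration(10.0, 4.0, 2.0)))   # decreasing speed
```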

Speed  and  acceleration  are  related  but  distinctly  different  quantities. 
Consider  the  following  examples.  A car  which  is  stopped  at  a red  light  has 
no  speed  and  no  acceleration.  A short  time  after  the  driver  sees  the  light 
turn  green  and  steps  hard  on  the  gas  pedal,  the  car  has  a small  speed  and  a 
large  acceleration  in  the  direction  of  its  motion.  When  the  car  is  moving 
along  a straight  stretch  of  highway  and  its  speed  has  nearly  reached  the 
legal  limit,  it  has  a large  speed  and  a small  acceleration  in  the  direction  of  its 
motion.  Shortly  after  the  driver  sees  a traffic  jam  ahead  and  presses  hard 
on  the  brake  pedal,  the  car  has  a large  speed  and  a large  acceleration  in  the 
direction  opposite  to  its  motion.  The  direction  of  its  acceleration  is  opposite 
to  the  direction  of  its  motion  because  its  speed  is  decreasing. 


Force  is  another  term  commonly  used  in  everyday  language,  although 
in  some  of  these  uses  the  meaning  of  the  word  has  little  relation  to  its  tech- 
nical meaning.  You  certainly  know  that  you  can  exert  a force  on  an  object 
by  pushing  or  pulling  on  it.  The  force  acts  in  the  direction  in  which  you 
push  or  pull,  and  its  strength  is  a measure  of  how  hard  you  do  it.  You  prob- 
ably also  know  that  the  earth  exerts  a force  on  an  object  near  its  surface  by 
means  of  gravitational  attraction,  pulling  the  object  in  the  downward  direc- 
tion (toward  the  center  of  the  earth).  The  strength  of  the  force  is  a measure 
of  the  weight  of  the  object;  in  fact,  it  is  the  weight. 

Familiar  but  less  well  understood  than  the  two  forces  just  discussed  is 
the  frictional  force.  This  force  is  produced  by  the  frictional  effects  which  in 
many  situations  are  experienced  by  objects.  If  an  object  is  moving,  the  fric- 
tional force  exerted  on  it  by  another  object  is  in  whatever  direction  is  oppo- 
site to  its  direction  of  motion  past  that  other  object.  The  strength  of  the 
frictional  force  depends  on  circumstances  that  will  be  mentioned  shortly. 

With  these  ideas  in  mind  about  the  quantities  used  to  describe  force 
and motion, let us return to the central question: What is the relation
between  force  and  motion?  Three  possible  answers  are  (1)  there  is  no  rela- 
tion, (2)  the  force  applied  to  an  object  is  related  to  its  speed,  and  (3)  the 
force  applied  to  an  object  is  related  to  its  acceleration.  We  will  find  the  cor- 
rect answer  by  studying  a set  of  simple  experiments. 

Figure 1-2 is a stroboscopic photograph of the first experiment. A squat
cylindrical  object,  called  a puck,  was  initially  motionless  on  the  horizontal 
top  of  a table.  Then  the  experimenter  pushed  the  puck  in  a particular 
direction  so  that  it  moved  uniformly  in  a straight  line  across  the  unlubri- 
cated tabletop.  He  reported  that  in  doing  this  he  felt  the  sensation  that  the 
force  applied  to  the  puck  by  his  hand  was  of  constant  strength.  Strobo- 
scopic lights  were  used  to  illuminate  the  puck  while  it  was  moving.  These 
lights  flashed  very  briefly  at  regular  intervals  of  time.  A camera  viewed  the 
puck  from  above  with  its  shutter  remaining  open.  With  each  light  flash  the 
position of the puck at that instant was recorded on the film. So the strobe
photograph is, in effect, a sequence of “snapshots” showing the positions of
the puck at equally separated times.

Fig. 1-2 A strobe photo looking down on a puck that is being continually pushed
at constant speed across the top of a table.

You  can  see  from  the  strobe  photo  that  the  puck  moved  very  nearly 
equal  distances  in  each  of  the  equal  time  intervals.  Thus  its  speed  was 
approximately  constant  while  the  hand  was  exerting  an  approximately  con- 
stant force  on  the  puck. 

From  this  experiment  you  can  conclude  that  possible  answer  1 to  the 
question  posed  is  wrong.  It  is  evident  that  there  is  a relation  between  force 
and  motion.  Before  the  hand  pushed  the  puck,  it  was  stationary.  After  the 
hand  began  exerting  the  force,  the  puck  moved. 

Another  conclusion  you  might  draw  from  the  experiment  is  that 
answer  2 is  right.  That  is,  you  might  conclude  that  force  is  related  to  speed, 
since  an  approximately  constant  force  applied  by  the  hand  led  to  an 
approximately  constant  speed  of  the  puck.  However,  this  is  not  a correct 
conclusion. 

If  you  came  to  that  conclusion,  take  comfort,  because  you  are  in  good 
company.  Up  to  the  time  of  Galileo  Galilei  (1564-1642)  almost  all  scholars 
who  were  concerned  about  the  question  believed  that  a body  acted  on  by  a 
constant  force  moves  with  a constant  speed.  Consider  a modern  example. 
You  are  driving  a car  along  a level,  straight  road  at  a certain  constant  speed 
by  keeping  the  gas  pedal  depressed  a certain  amount,  thereby  making  the 
engine  produce  a constant  force  of  propulsion.  If  you  want  to  travel  at  a 
higher  constant  speed,  you  must  depress  the  gas  pedal  more,  so  as  to  in- 
crease the  force  produced  by  the  engine  of  the  car.  These  considerations 
seem  to  imply,  again,  that  force  is  related  to  speed.  But  this  is  not  so. 

What  was  not  recognized  by  the  ancients,  and  is  still  overlooked  by 
some moderns who have not studied physics, is that two significant forces
act  on  the  car,  or  on  the  puck.  Each  is  a bona  fide  force,  although  one  is 
more  obvious  than  the  other.  The  obvious  force  is  the  one  that  the  engine 
causes  to  be  applied  to  the  car,  or  the  even  more  obvious  one  that  is  applied 
by  the  hand  to  the  puck.  The  other  force  is  a less  apparent — but  equally 
important — force  applied  by  frictional  effects.  In  the  case  of  the  car,  this 
force  arises  primarily  from  air  resistance.  For  the  puck  moving  at  rather 
low  speed  across  the  table,  the  frictional  force  acting  on  it  is  almost  entirely 
due  to  the  contact  between  its  slightly  rough  bottom  surface  and  the  slightly 
rough  surface  of  the  tabletop.  But  in  both  cases  the  frictional  force  under 
consideration  always  acts  in  the  direction  opposite  to  the  direction  of  mo- 
tion of  the  object  on  which  it  is  exerted. 

The  frictional  force  of  air  resistance  is  rather  complicated  in  that  its 
strength  increases  rapidly  with  the  speed  of  the  car.  This  is  why  the  engine 
must  produce  a stronger  propulsive  force  to  move  the  car  at  a higher  fixed 
speed.  The  engine  must  cause  a harder  push  to  be  applied  to  the  car  to 
compensate  for  the  stronger  frictional  retarding  force  that  is  experienced 
when  the  car  is  moving  faster.  A contact  friction  force  has  the  simple  prop- 
erty that  its  strength  is  almost  independent  of  the  speed  of  the  moving  ob- 
ject. 


The  strobe  photo  in  Fig.  1-3  illustrates  the  effect  of  contact  friction 
on  the  motion  of  the  puck  in  the  more  elementary  situation  where  it  is  the 
only  force  acting  along  the  line  of  motion  (in  other  words,  it  is  the  only  force 
acting  either  in  the  direction  of  motion  or  in  the  opposite  direction).  In  the 
experiment  recorded  in  this  photograph,  the  propelling  hand  stopped 
short after the puck had been set into motion. The puck then moved across
the table under the influence of the only significant force acting on it. This
force was the contact friction force acting in the direction opposite to the
direction of the puck’s motion. It is not surprising that the puck slowed
down and soon came to rest. You can see that this is so by noting that in the
equal time intervals between consecutive stroboscopic light flashes, the
puck moved an ever-decreasing distance. So its speed decreased
throughout its motion, until it stopped.

Fig. 1-3 A strobe photo looking down on a puck that is given an initial push
across the top of a table. It goes slower and slower, until it comes to rest,
because there is appreciable friction. Later in the argument you will be asked
to show that the acceleration of the puck is nearly constant while it is slowing
down. When it is time to do this, read the following instructions. Begin by
labeling as 1 the point at the center of the first image of the puck that is
well away from the hand, labeling the next center point as 2, and so on. Then
lay a ruler having decimal subdivisions along the line connecting these points,
with its left end at some convenient location near the image of the hand. Use
the ruler to determine carefully the numbers specifying the locations of the
points labeled 1, 2, and so on. These are the “coordinates” of the points. Find
the difference between the first pair of coordinates by subtracting coordinate 1
from coordinate 2. This gives the change in the puck’s position in the time
interval between the corresponding pair of stroboscopic light flashes. So it is
a measure of the puck’s speed in that time interval. Do the same for coordinates
2 and 3 to measure the speed of the puck in the next time interval. Then
subtract the 1,2 difference from the 2,3 difference. The result is a measure of
the change in speed between the consecutive time intervals 1,2 and 2,3.
Therefore it is a measure of the puck’s acceleration in the first part of the
motion. The value will be negative because the puck is slowing down. Now repeat
the entire procedure using the coordinates of the next three points to obtain a
measure of the puck’s acceleration in the next part of its motion. You will find
that the two values are equal, within the limited accuracy that can be expected
from this procedure.

While  the  puck  was  slowing  down,  its  acceleration  was  in  the  direction 
opposite  to  the  direction  of  its  motion.  And  while  this  was  happening,  the 
only  significant  force  acting  on  the  puck  was  that  of  contact  friction,  which 
also  acted  in  the  direction  opposite  to  the  direction  of  its  motion.  Thus  the 
experiment  shows  that  a force  exerted  on  the  puck  in  a certain  direction  leads  to 
an  acceleration  in  that  direction.  The  experimental  result  is  suggestive. 

If  you  believe  that  the  strength  of  a contact  friction  force  acting  on  an 
object  is  almost  constant  as  long  as  the  object  is  moving,  then  the  experi- 
ment can  be  even  more  suggestive.  Use  the  procedure  explained  in  the  cap- 
tion to  Fig.  1-3,  and  you  will  find  that  the  amount  of  the  puck’s  acceleration 
was  nearly  constant  while  it  was  coming  to  rest.  Thus  the  experiment  shows 
that  an  approximately  constant  force  produced  an  approximately  constant 
acceleration  in  the  direction  of  the  force.  It  strongly  suggests  that  the  cor- 
rect answer  to  the  question  posed  is  answer  3:  The  force  applied  to  an  object  is 
related  to  its  acceleration. 
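
The ruler procedure described in the caption to Fig. 1-3 is a calculation of differences, followed by differences of those differences. The Python sketch below shows the arithmetic under stated assumptions: the coordinates are invented for illustration and are not measured from Fig. 1-3, and the helper name successive_differences is hypothetical. For brevity it takes the differences over every consecutive pair of points instead of two separate groups of three.

```python
def successive_differences(values):
    """Return the differences between neighboring entries of a list."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Hypothetical ruler readings (in cm) for points 1, 2, 3, ... of a puck
# that is slowing down. These are NOT measured from Fig. 1-3.
coordinates = [2.0, 7.0, 11.0, 14.0, 16.0]

# Distance moved in each flash interval: a measure of the speed.
speed_measures = successive_differences(coordinates)

# Change in that distance from one interval to the next:
# a measure of the acceleration.
acceleration_measures = successive_differences(speed_measures)

print(speed_measures)         # [5.0, 4.0, 3.0, 2.0]
print(acceleration_measures)  # [-1.0, -1.0, -1.0]
```

In this made-up data the second differences are all equal and negative, which is the pattern the caption says to look for: a constant acceleration directed opposite to the motion. Real measurements would agree only within the accuracy of the ruler readings. Because the flashes are equally spaced in time, the differences can be compared with one another without knowing the flash interval.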

There  is  additional  support  for  this  statement  in  that  it  allows  you  to 
understand  the  unaccelerated  motion  of  the  puck  traveling  at  constant  speed 
across the table in the experiment shown in Fig. 1-2. In that experiment the
constant  force  applied  to  the  puck  by  the  hand  acted  in  the  direction  of  the 
puck's  motion,  while  the  constant  force  applied  to  the  puck  by  friction 
acted  in  the  direction  opposite  to  its  motion.  In  order  to  make  the  puck 
move  uniformly,  the  experimenter  adjusted  the  strength  of  the  hand- 
applied  force  to  equal  that  of  the  oppositely  directed  friction-applied  force. 
The  moving  puck  was  acted  on  by  two  forces  of  equal  strength  whose  ef- 
fects canceled  each  other  because  they  were  applied  in  opposite  directions. 
In  other  words,  the  puck  experienced  no  net  force.  If  the  force  applied  to  an 
object  is  related  to  its  acceleration,  as  suggested  in  the  preceding  para- 
graph, and  if  the  word  “force”  is  interpreted  to  mean  “net  force,”  then  the 
unaccelerated  motion  of  the  puck  in  Fig.  1-2  can  be  understood. 
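
The idea of a net force can be illustrated by adding signed strengths along the line of motion. The Python sketch below assumes that positive values act in the direction of motion and negative values act opposite to it. The strengths, in newtons, are invented for the example, and the function name net_force is hypothetical.

```python
def net_force(forces):
    """Sum of signed force strengths along the line of motion.

    Positive values act in the direction of motion, negative values
    act opposite to it.
    """
    return sum(forces)


# Hypothetical strengths in newtons.
hand_push = 0.5    # hand pushes along the direction of motion
friction = -0.5    # contact friction acts opposite to the motion

# Puck pushed steadily, as in Fig. 1-2: the two forces cancel.
pushed = net_force([hand_push, friction])
print(pushed)  # 0.0, no net force, so no acceleration

# Puck after the hand stops, as in Fig. 1-3: only friction remains.
released = net_force([friction])
print(released)  # -0.5, net force opposite to the motion, so the puck slows
```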

It  certainly  would  be  worthwhile  to  test  these  ideas  by  studying  the  mo- 
tion of  the  puck  when  there  really  is  no  force  at  all  acting  on  it.  Unfortu- 
nately, this  cannot  be  done.  But  it  is  possible  to  eliminate  almost  completely 
those  forces  which  act  along  the  puck's  line  of  motion,  and  these  are  the  im- 
portant forces  for  our  considerations.  This  is  accomplished  in  an  experi- 
ment using  a table  with  an  almost  friction-free  top,  called  an  air  table. 

A picture  of  an  air  table  is  shown  in  Fig.  1-4.  It  has  a very  flat  top  with  a 
closely  spaced  gridwork  of  small  holes  through  which  air  is  forced  by  a 
blower.  The  air  coming  up  through  the  holes  under  pressure  provides  a 
horizontal  cushion  on  which  a puck  rides  with  almost  no  friction.  This  is  be- 
cause the  puck  does  not  touch  the  tabletop  itself,  but  literally  rides  on  a 
thin  layer  of  air.  If  a puck  is  set  into  motion  by  a short  initial  push,  it  will 
then  continue  to  move  freely  across  the  table.  For  all  practical  purposes, 
after  the  puck  has  started,  no  force  acts  on  it  in  any  horizontal  direction 
until  the  puck  hits  a barrier  at  the  edge  of  the  table. 

However, two forces always act on the puck in vertical directions. One
is  the  downward  force  of  gravity,  and  the  other  is  the  exactly  canceling  up- 
ward supporting  force  of  the  air  cushion.  But  these  forces  are  not  signifi- 
cant in  the  air  table  experiment,  since  they  do  not  affect  motion  in  a hori- 
zontal direction.  Downward  and  upward  forces  which  cancel  one  another 
were  also  acting  on  the  puck  in  the  earlier  experiments. 

Figure  1-5  shows  a strobe  photo  of  a puck  moving  across  the  air  table 
after  being  initially  set  into  motion.  It  is  apparent  that  the  puck  does  indeed 
move  in  a straight  line  with  constant  speed  when  no  force  at  all  acts  on  it 
along  its  line  of  motion.  Another  way  of  stating  this  result  is  to  say  that  when 
there  is  no  net  force,  there  is  no  acceleration. 


Fig.  1-4  The  apparatus  used  to  obtain  the  air  table  strobe  photos  in 
this  book.  A camera  mounted  on  a lab  bench  records  a view  from  above 
of  the  air  table  by  means  of  an  inclined  mirror.  The  air  table  has  a black 
surface  to  contrast  with  the  white  pucks  and  is  illuminated  by  a pair  of 
strobe  lights  located  to  the  left  of  the  table.  The  flexible  pipe  delivering 
compressed  air  to  the  table  can  be  seen  extending  from  its  right-rear 
corner  to  the  air  pump  behind  it. 




Fig.  1-5  A strobe  photo  looking  down  on  a puck  that 
is  given  a short  initial  push  across  the  top  of  an  air 
table.  It  continues  to  move  at  constant  speed  because 
friction  is  negligible. 


The  final  experiment  to  be  considered  in  this  section  is  the  simplest  of 
all  because  it  studies  the  motion  of  the  puck  when  only  one  single  force  acts 
on  it.  Furthermore,  you  will  find  it  easy  to  agree  that  the  strength  of  the 
force  is  essentially  constant  throughout  the  motion  of  the  puck.  The  exper- 
iment is  performed  by  removing  the  support  from  beneath  the  puck, 
thereby  allowing  it  to  fall  a short  distance  near  the  surface  of  the  earth. 
Since  it  does  not  have  a chance  to  pick  up  much  speed,  it  never  experiences 
a significant  amount  of  air  resistance.  Consequently,  only  one  force  acts  on 
the  puck  while  it  is  falling.  This  net  force  is  the  force  of  gravity  exerted  on 
the  puck  by  the  earth.  The  direction  of  the  force  is  downward.  Its  strength 
is  what  is  called  the  weight  of  the  puck.  Since  the  weight  is  constant 
throughout  the  fall,  the  strength  of  the  net  force  acting  on  the  puck  is  con- 
stant. 

The  experiment  is  illustrated  in  the  strobe  photo  of  Fig.  1-6.  You 
can  see  that  the  speed  of  the  puck  is  increasing  throughout  its  motion.  It 
therefore  has  an  acceleration  in  the  downward  direction  of  its  motion.  If 
you  carry  out  the  analysis  suggested  in  the  caption  to  Fig.  1-6,  you  will  see 
that  the  amount  of  acceleration  is  constant  while  the  puck  is  falling. 

Thus  the  very  simple  experiment  of  Fig.  1-6  shows  that  a constant  net 
force  leads  to  a constant  acceleration  in  the  direction  of  the  force.  The  not 
quite  so  simple  experiment  of  Fig.  1-5  showed  that  when  there  is  no  net 
force,  there  is  no  acceleration.  These  two  statements  are  in  agreement  with 
the  even  more  complicated  experiments  of  Figs.  1-3  and  1-2.  The  experi- 
ments on  the  set  of  progressively  simpler  systems  are  summarized  in  Fig. 


Fig.  1-6  A strobe  photo  looking  at  a falling  puck  in  side  view.  Its 
speed  increases  continually  because  gravity  continually  pulls  it  toward 
the  earth.  To  show  that  during  its  fall  the  puck’s  acceleration  is 
constant,  within  the  accuracy  of  the  analysis,  you  can  carry  out  the 
same  procedure  as  that  described  in  the  caption  to  Fig.  1-3.  However, 
in  this  case  the  value  you  obtain  will  be  positive  because  the  puck  is 
speeding  up. 




[Fig. 1-7 diagram: four schematic side views of the puck, labeled (a) to (d). (a) Forces shown: hand, friction, gravity, support; direction of motion indicated; no acceleration, no net force. (b) Forces shown: friction, gravity, support; directions of motion, acceleration, and net force indicated. (c) Forces shown: gravity, support; direction of motion indicated; no acceleration, no net force. (d) Force shown: gravity; directions of motion, acceleration, and net force indicated.]

Fig.  1-7  A summary  of  experiments  leading  to  the  conclusion  that  the  net  force  applied  to  an 
object  is  related  to  its  acceleration.  Parts  a,  b,  c,  and  d are  schematic  views  from  the  side  of  the 
puck  motions  illustrated  in  the  strobe  photos  of  Figs.  1-2,  1-3,  1-5,  and  1-6,  respectively.  All 
the  forces  acting  on  a puck  are  indicated  by  arrows  on  the  first  puck  image.  The  direction  of 
each  of  these  arrows  shows  the  direction  of  the  force,  and  its  length  indicates  qualitatively  the 
strength  of  the  force.  To  the  right  of  each  part  of  the  figure  arrows  show  the  direction  of  motion 
of  the  puck  and,  if  applicable,  the  direction  of  its  acceleration  and  of  the  net  force  acting  on  it. 
The  lengths  of  these  arrows  have  no  significance.  Note  how  the  experiments  become 
progressively  simpler  in  that  four  forces  act  on  the  puck  in  the  first  experiment,  three  in  the 
next,  two  in  the  next,  and  only  one  in  the  last.  This  is  an  example  of  the  way  that  physicists  try  to 
focus  their  attention  on  simpler  systems  that  have  fewer  factors  to  influence  the  behavior. 


1-7.  All  the  experiments  support  the  conclusion  that  the  net  force  applied  to 
an  object  is  related  to  its  acceleration. 

We  are  not  yet  in  a position  to  establish  the  precise  nature  of  the  rela- 
tion between  the  net  force  acting  on  an  object  and  its  acceleration.  The 



most  rudimentary  possibility  would  be  for  the  two  quantities  which  are  in 
the  same  direction  to  have  values  which  are  in  direct  proportion.  In  Chap. 
4 we  find  that  this  is,  in  fact,  the  case.  We  also  find  there  that  the  propor- 
tionality constant  in  the  relation  between  their  values  is  the  mass  of  the  ob- 
ject. To  be  specific,  we  will  learn  that  when  a net  force  of  strength  F acts  on 
a body  of  mass  m,  it  experiences  an  acceleration  in  the  direction  of  the  force 
whose value a satisfies the relation F = ma. This is the most important
equation  of  newtonian  mechanics;  it  is  one  of  Newton’s  laws  of  motion. 
These  laws  make  it  possible  to  predict  accurately  the  behavior  of  any  system 
of  objects  in  the  newtonian  domain  if  their  masses,  and  the  forces  acting  on 
them,  are  known.  The  detailed  development  of  the  laws  of  motion  in  Chap. 
4 is  founded  on  a quantitative  analysis  of  experiments  in  which  strobe  pho- 
tos record  the  motion  of  pucks  on  an  air  table.  In  this  section  you  have  been 
given  a qualitative  introduction  to  the  experimental  technique. 
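
As a small illustration of the relation F = ma quoted above, the following sketch computes the acceleration of a puck from the net force acting along its line of motion. The function name, the puck mass, and the force values are hypothetical, chosen only to mirror the air table and falling-puck experiments; they are not measurements from this book.

```python
def acceleration(net_force: float, mass: float) -> float:
    """Return the acceleration a (in m/s^2) that satisfies F = m a.

    net_force is in newtons (its sign gives the direction along the line
    of motion) and mass is in kilograms.
    """
    if mass <= 0:
        raise ValueError("mass must be positive")
    return net_force / mass


# Hypothetical puck mass, assumed only for illustration.
puck_mass = 0.50  # kg

# Air table (Fig. 1-5): no net force along the line of motion.
print(acceleration(0.0, puck_mass))

# Falling puck (Fig. 1-6): the net force is the weight, assumed here to be 4.9 N.
print(acceleration(4.9, puck_mass))
```

With no net force the computed acceleration is zero, and with a constant net force it is a constant in the direction of the force, in agreement with the qualitative conclusions of this section.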

Before  we  embark  on  the  development  of  Newton’s  laws  of  motion,  we 
must  first  find  out  how  to  specify  precisely  what  is  meant  by  acceleration  in 
situations  which  are  more  general  than  those  considered  in  this  section. 
That  is,  we  must  set  up  a description  of  motion  that  will  provide  an  exact 
connection  between  acceleration  and  the  directly  measurable  quantities  po- 
sition and  time,  for  any  situation  of  interest.  We  begin  this  task  in  Chap.  2. 




Kinematics 

in  One  Dimension 


2-1 ONE-DIMENSIONAL MOTION


Fig. 2-1 (a) An object changing its orientation without changing the position of its center. (b) An object changing the position of its center without changing its orientation.


Kinematics  (from  the  Greek  word  kinematos,  meaning  motion)  is  the 
description  of  motion,  without  regard  to  its  cause.  Kinematics  provides  the 
language  needed  to  describe  how  objects  move.  But  it  says  nothing  about 
why  they  do  so;  that  topic  starts  in  Chap.  4. 

An object can move in two basically different ways: it can change only its orientation, or it can change only its location. Figure 2-1a illustrates the first type of motion. The brick shown in the figure changes its orientation by rotating about its center, while keeping the position of that point fixed. Figure 2-1b shows the brick undergoing the second type of motion. It changes its location by changing the position of its center, while not rotating about that point. Of course, an object can also experience both types of motion at the same time. But in the first part of this book we are concerned only with changes in location. The study of changes in orientation is deferred until Chap. 9.

Thus we are concerned at first with an object whose behavior of interest can be described in terms of the motion of its center, or of some other
point  conveniently  used  to  specify  its  location.  As  an  example,  if  you  are 
studying  the  annual  motion  of  the  earth  about  the  sun,  you  can  ignore  the 
fact  that  it  spins  daily  about  its  center  and  focus  your  attention  on  the 
changes  in  position  of  the  center. 

The most general way for a point locating an object to move is in three dimensions. An example of such motion is Fig. 2-2, which shows the changing location of an airplane gaining altitude by flying in a helical path. A less general, but simpler, motion is that confined to two dimensions. An airplane flying in a circle at a constant altitude provides the example illustrated in Fig. 2-3. The most restricted, but simplest, case is motion along a straight line, that is, in one dimension. See Fig. 2-4, which represents an airplane flying a fixed course at a constant altitude. Since we always try to start a subject by treating the simplest case, we begin the development of kinematics by considering only motion in one dimension. When the physics to be studied in the next chapter leads us to two- or three-dimensional motion, you will find that the extension can be made without too much difficulty.

Fig. 2-2 An airplane flying in a helical climb.

Fig. 2-3 An airplane flying in a circle at fixed altitude.

Fig. 2-4 An airplane flying straight and level.

The  location  of  the  airplane  in  Fig.  2-4  can  be  specified  concisely  at  any 
instant  by  using  its  line  of  motion  to  define  an  x axis.  (For  the  horizontal 
motion  considered  here  it  is  natural  to  call  the  axis  the  x axis.  But  we  will 
call  the  single  axis  that  suffices  to  describe  one-dimensional  motion  the  x 
axis,  no  matter  how  it  is  oriented  in  space.)  Some  conveniently  located  fixed 
reference  point,  labeled  O in  the  figure,  is  chosen  as  the  origin  of  the  x axis. 
Then  the  location  of  the  airplane  is  given  by  stating  the  distance  from  O to 
the  airplane  and  specifying  whether  the  airplane  is  on  one  side  of  O or  on 
the  other  side.  This  is  done  by  giving  the  value  of  a quantity  x,  the  x coordi- 
nate of  the  airplane.  The  magnitude  of  x (that  is,  its  absolute  value  |x|)  equals 
the  distance  from  O to  the  airplane.  The  sign  of  x is  positive  if  the  direction 
from  O to  the  airplane  is  in  a direction  that  has  been  chosen  to  be  the  posi- 
tive direction,  and  negative  otherwise.  (For  the  case  illustrated  in  Fig.  2-4  it 
would  be  convenient  to  choose  the  positive  x direction  to  be  to  the  right. 
But  the  choice  used  does  not  really  matter,  as  long  as  it  is  used  consistently.) 

Consider  another  example  of  one-dimensional  motion.  A car  is 
moving  in  a certain  direction  on  a straight  road.  The  most  rudimentary 
way  of  depicting  the  nature  of  its  motion  is  on  a figure  which  shows  the  lo- 
cation of  the  car  at  a succession  of  equally  separated  times.  Figure  2-5  is 
such  a strobe-photo-like  illustration  showing  a car  moving  uniformly  on  a 
straight road past a green traffic light. It is said to have uniform motion because it moves the same distance in each of the equal time intervals between
the  instants  at  which  its  location  is  depicted.  In  contrast,  Fig.  2-6  shows  a car 
moving  nonuniformly  as  it  races  away  from  a traffic  light  when  the  light 
turns  green.  These  figures  allow  a distinction  to  be  made  between  uniform 
and  nonuniform  motion.  Such  a distinction  cannot  be  made  in  Fig.  2-4. 
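
The test for uniform motion described above (equal distances in equal time intervals) can be sketched in a few lines. The position lists below are hypothetical stand-ins for the car locations in Figs. 2-5 and 2-6, and the function names are assumptions introduced only for this illustration.

```python
import math


def displacements(positions):
    """Distances moved between successive, equally spaced instants."""
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


def is_uniform(positions, rel_tol=1e-9):
    """True if the object moves the same distance in every time interval."""
    steps = displacements(positions)
    return all(math.isclose(step, steps[0], rel_tol=rel_tol) for step in steps)


# Hypothetical car locations (in meters) at equally separated times.
uniform_x = [10.0, 20.0, 30.0, 40.0]      # like Fig. 2-5
nonuniform_x = [1.0, 4.0, 9.0, 16.0]      # like Fig. 2-6

print(displacements(uniform_x), is_uniform(uniform_x))
print(displacements(nonuniform_x), is_uniform(nonuniform_x))
```

The first list has equal steps and is reported as uniform; the second has growing steps and is not. With real strobe data you would need a looser tolerance, since measured positions are never exact.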

Also  indicated  in  Figs.  2-5  and  2-6  is  an  x axis  laid  out  along  the  road 
with  its  positive  direction  to  the  right  and  its  origin  located  at  the  traffic 
light.  The  location  of  the  car  at  a certain  time  is  given  by  the  value  of  x. 
That  time  is  specified  by  the  quantity  t,  which  is  the  amount  of  time  that  has 
passed  since  the  instant  when  the  car  goes  by  the  light.  The  subscripts  1,  2, 
and so on are used on x and t to associate corresponding values. That is, $x_1$ is the coordinate corresponding to time $t_1$, $x_2$ corresponds to $t_2$, and so


Fig.  2-5  An  automobile  moving  uniformly  along  a straight 
road. The first three views show its locations at times $t_1$, $t_2$, and $t_3$ when its distances from a traffic light are, respectively, $x_1$, $x_2$, and $x_3$. All times are measured from when the car passes the
light,  and  all  the  intervals  between  successive  times  are  equal. 


Fig.  2-6  An  automobile  moving  nonuniformly  along  a straight 
road. 




Fig.  2-7  Graphical  presentation  of  the  uniform  motion  illus- 
trated in  Fig.  2-5.  The  distance  x from  the  moving  object  to 
some  stationary  point  increases  uniformly  with  increasing  values 
of  the  time  t that  has  passed  since  some  initial  instant. 


Fig.  2-8  Graphical  presentation  of  the  nonuniform  motion 
illustrated  in  Fig.  2-6. 


forth.  Actually,  there  is  a value  of  x corresponding  to  every  possible  value 
of  t.  The  relation  between  the  quantities  x and  t completely  describes  the 
motion  of  the  car. 

In  Fig.  2-7  this  relation  is  plotted  for  the  motion  illustrated  in  Fig.  2-5, 
and  the  same  is  done  in  Fig.  2-8  for  the  motion  in  Fig.  2-6.  Each  plot  shows 
the  values  of  x for  all  values  of  t that  are  of  interest.  So  each  tells  everything 
there  is  to  know  about  the  relation  between  x and  t,  and  therefore  about  the 
motion  of  the  car.  But  the  x versus  t plots  present  this  information  in  a 
very  much  more  useful  form  than  the  strobe-photo-like  figures  do.  Fur- 
thermore, they  lead  naturally  to  the  fruitful  idea  of  using  mathematical 
functions  to  describe  the  motion.  That  is,  the  same  information  which  is  ex- 
pressed graphically  by  plotting  x versus  t can  usually  be  expressed  by  a 
mathematical  function  giving  the  form  of  the  relation  between  x and  t.  The 
symbolism  x(t),  seen  in  Figs.  2-7  and  2-8,  is  used  to  indicate  that  x is  a func- 
tion of  t. 
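
To make the idea of a function x(t) concrete, here is a minimal sketch in which the position is computed from the time. The two functional forms and their constants are assumptions chosen only for illustration: a position proportional to t for uniform motion, and a position proportional to the square of t as one possible nonuniform motion. The actual curves of Figs. 2-7 and 2-8 are not specified numerically in the text.

```python
def x_uniform(t: float, v: float = 10.0) -> float:
    """Position (m) at time t (s) for an assumed uniform motion x = v t."""
    return v * t


def x_nonuniform(t: float, k: float = 2.0) -> float:
    """Position (m) at time t (s) for an assumed nonuniform motion x = k t^2."""
    return k * t ** 2


# Tabulate both functions at equally separated times.
for t in [0.0, 1.0, 2.0, 3.0]:
    print(t, x_uniform(t), x_nonuniform(t))
```

Each function returns a value of x for every value of t, which is exactly the information carried by an x versus t plot.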

We  will  soon  explore  the  properties  of  the  x versus  t plots.  But  first  we 
must  find  clear  definitions  of  the  quantities  x and  t that  specify  position  and 
time.  In  the  next  two  sections  we  do  this  by  considering  procedures  that  can 
be  used  to  measure  these  two  quantities. 


2-2 POSITION AND UNITS OF LENGTH

If a point locating an object moves in a certain direction along a straight line, its position at any instant can be given by stating its distance from some previously chosen reference point. But how can this distance be measured?
The  answer  is:  Construct  a measuring  stick  of  a certain  length.  Then  agree 
that  this  will  be  the  standard  unit  of  length.  Finally,  measure  the  distance 
by  counting  the  number  of  times  the  length  of  the  stick  can  be  fitted  into 
the  distance.  The  number  obtained  (not  necessarily  an  integer)  specifies  the 
distance  in  terms  of  the  length  unit  used. 

In physics the agreed-upon unit of length is the meter, with its decimal multiples and submultiples. The meter is used universally in the other sciences too. It is also used in engineering, technology, manufacturing, and everyday affairs throughout the world, with the principal exception of the United States. It is the unit of length used in this book. Indeed, the contemporary version of the metric system, called the Système International (SI), is used almost exclusively. However, we do make occasional comparisons between SI units and those familiar but cumbersome ones still used for nonscientific purposes in the United States. At a later stage we also discuss some units closely related to SI units that are commonly employed in certain branches of physics.

Fig. 2-9 The original definition of the metric unit of length was based on the dimensions of the earth. A meter was defined as 1/10,000,000 of the distance from the equator to the North pole, measured along the circle (called a meridian circle) passing through both poles and Paris.

After  much  study,  the  metric  system  was  legally  adopted  in  France  in  1799, 
during  the  later  phases  of  the  French  Revolution.  Its  use  was  compulsory  there 
after 1840. It turned out to be a highly successful attempt to make measurement
simpler  and  more  rational.  As  the  metric  system  spread  rapidly  through  Europe  on 
the  winds  of  revolution,  it  largely  replaced  a confusing  muddle  of  local  units, 
which  often  employed  the  same  name  for  varying  magnitudes.  The  metric  system 
was  perhaps  the  most  popular  legacy  of  the  French  Revolution,  and  even  in  the 
subsequent  political  reaction  no  serious  attempt  was  made  to  abolish  it. 

The  original  intent  of  the  metric  system  was  to  define  the  meter  (ab- 
breviated m)  in  terms  of  the  earth  itself.  The  distance  from  the  equator 
to  the  North  pole,  along  the  meridian  (longitude)  circle  passing  through 
Paris, was to be precisely $10^7$ m, by definition. This definition is illustrated
in  Fig.  2-9.  It  soon  became  clear,  however,  that  it  was  not  possible  to  mea- 
sure the  dimensions  of  the  earth  with  anywhere  near  the  same  accuracy 
that  could  be  achieved  in  the  laboratory  for  a much  shorter  length.  Conse- 
quently, the  meter  was  actually  defined  to  be  the  distance  between  two 
marks  on  a standard  meter  bar.  Considerable  pains  were  taken  to  make 
that  distance  as  close  as  practicable  to  the  best  available  value  for  the  “ter- 
restrial" meter.  But  once  the  meter  was  defined  in  1795,  terrestrial  distance 
was  measured  in  terms  of  the  standard  meter  bar,  not  vice  versa.  The  bar 
that  was  used  as  the  standard  meter  is  located  in  a vault  of  the  International 
Bureau  of  Weights  and  Measures  in  a suburb  of  Paris.  It  was  very  carefully 
constructed  and  supported  and  was  maintained  at  a fixed  temperature.  In 
due  course,  copies  were  sent  to  other  bureaus  throughout  the  world. 

A significant  change  in  the  definition  of  the  meter  was  made  in  1960. 
By  that  time  it  had  become  possible  to  measure  a wavelength  of  certain 
kinds  of  light  much  more  accurately  than  the  standard  meter  bar  itself 
could  be  measured.  So  just  as  the  meter  had  been  redefined  in  terms  of  the 
standard  meter  bar  instead  of  the  earth,  the  meter  was  newly  redefined  in 
terms of the wavelength of the orange light emitted by atoms of krypton-86
in  an  electric  discharge  tube.  Specifically,  the  meter  is  now  defined  to  be  ex- 
actly 1,650,763.73  wavelengths  of  this  light.  To  make  the  new  standard 
compatible  with  the  old,  the  value  was  obtained  from  a set  of  careful  mea- 
surements of  the  length  of  the  previous  standard  meter  bar,  using 
krypton-86  light  and  an  optical  instrument  called  an  interferometer,  which 
is  described  later  in  this  book.  Aside  from  precision,  the  advantages  of  the 
atomic  wavelength  definition  of  the  meter  include  indestructibility  and  the 
ability  to  reproduce  it  in  any  properly  equipped  laboratory  without  physi- 
cally transporting  a copy  of  the  meter  bar  from  Paris.  A particular  advan- 
tage is  stability:  We  have  every  reason  to  believe  that  the  atomic  emission 
wavelength  is  more  stable  than  the  length  of  a metal  bar. 
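
The krypton-86 definition can be turned around to find the wavelength itself. The short sketch below simply divides 1 m by the defining number of wavelengths quoted above; the variable names are assumptions introduced for this illustration.

```python
# Number of krypton-86 wavelengths in exactly 1 m, as quoted in the text.
WAVELENGTHS_PER_METER = 1_650_763.73

# Length of one wavelength, in meters and in nanometers (1 nm = 1e-9 m).
wavelength_m = 1.0 / WAVELENGTHS_PER_METER
wavelength_nm = wavelength_m * 1e9

print(f"{wavelength_nm:.2f} nm")
```

The result is about 606 nm, which lies in the orange part of the visible spectrum, consistent with the description of the light given above.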


One  convenience  of  the  metric  system  is  that  it  is  a decimal  system.  In 
other  words,  the  multiples  or  submultiples  of  each  basic  unit  are  larger  or 
smaller  than  the  basic  unit  by  exact  factors  of  10.  For  instance,  a commonly 




used multiple of the meter is the kilometer (km), which is $10^3$ m:

$$1\ \text{km} = 10^{3}\ \text{m}$$

Some common submultiples are the centimeter (cm), which has the value

$$1\ \text{cm} = 10^{-2}\ \text{m}$$

and the millimeter (mm), which has the value

$$1\ \text{mm} = 10^{-3}\ \text{m}$$

One  of  the  tables  inside  the  covers  of  this  book  lists  the  names  and  values  of 
the  prefixes  used  in  the  metric  system. 

The  relation  between  the  meter  and  the  unit  of  length  still  frequently 
used  in  the  United  States  is  precisely 

1 yard  = 0.9144  m 

This  is  equivalent  to  the  precise  relation 

1 inch  = 2.54  cm 

There  is  no  longer  a standard  yardstick.  The  yard  (yd)  and  the  inch  (in) 
were  redefined  in  terms  of  the  standard  meter  in  1959.  Other  length  units 
and  conversion  factors  can  be  found  in  tables  inside  the  covers  of  the  book. 
Examples  2-1  and  2-2  will  give  you  some  experience  with  metric  length 
units  and  conversion  factors. 


EXAMPLE  2-1 

Evaluate  the  radius  of  the  earth  in  meters  by  using  the  relation  between  the  original 
terrestrial  definition  of  the  meter  and  the  dimensions  of  the  earth.  Bearing  in  mind 
the limited accuracy of that relation (about 1 part in $10^4$), quote the value you obtain
to  only  three  significant  figures.  That  is,  express  the  value  in  power-of-ten  nota- 
tion, with  three  digits  appearing  in  the  factor  multiplying  the  power  of  10. 

■ Since $2\pi r$ is the circumference of a meridian circle of radius $r$ passing through the poles and Paris, the quantity $2\pi r/4$ is very close to the distance from the equator to the North pole measured along that meridian. (The value $2\pi r/4$ is not exact because the earth is not exactly spherical.) Invoking the definition, you set $2\pi r/4$ equal to precisely $10^7$ m:

$$\frac{2\pi r}{4} = 10^{7}\ \text{m}$$

Solving for $r$, you obtain

$$r = \frac{4 \times 10^{7}\ \text{m}}{2\pi}$$

Carrying out the division and expressing the result to three significant figures, you have

$$r = 6.37 \times 10^{6}\ \text{m}$$

It is said that there are three significant figures in this result. They are the digits in the factor 6.37. This means that the value quoted for $r$ is $6.37 \times 10^6$ m, in contrast to $6.38 \times 10^6$ m or $6.36 \times 10^6$ m. But no attempt is being made to distinguish between $6.370 \times 10^6$ m and $6.371 \times 10^6$ m or $6.369 \times 10^6$ m.
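
The arithmetic of Example 2-1 can be checked with a few lines of code. This is only a sketch of the calculation described in the example; the function and constant names are assumptions made for the illustration.

```python
import math

# Original terrestrial definition: equator to North pole is 10^7 m.
QUARTER_MERIDIAN_M = 1.0e7


def earth_radius_m(quarter_meridian_m: float = QUARTER_MERIDIAN_M) -> float:
    """Solve 2*pi*r/4 = quarter_meridian for r, treating the earth as a sphere."""
    return 4.0 * quarter_meridian_m / (2.0 * math.pi)


r = earth_radius_m()

# Quote the result to three significant figures in power-of-ten notation.
print(f"{r:.2e} m")
```

The format specifier `.2e` keeps two digits after the decimal point in the factor multiplying the power of 10, that is, three significant figures in all.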




EXAMPLE  2-2 


Express  the  value  of  the  earth  radius  r,  obtained  in  Example  2-1,  in  inches. 
■ You  have 


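$$r = 6.37 \times 10^{6}\ \text{m}$$

To convert $r$ to inches, first multiply the right side of this expression by the fraction 100 cm/1 m. Since 100 cm = 1 m, the value of the fraction is 1, and so the multiplication does not change the value of $r$. Thus you have

$$r = 6.37 \times 10^{6}\ \text{m} \times \frac{100\ \text{cm}}{1\ \text{m}}$$

Next multiply by the fraction 1 in/2.54 cm. Again, the value of the fraction is 1, and you have

$$r = 6.37 \times 10^{6}\ \text{m} \times \frac{100\ \text{cm}}{1\ \text{m}} \times \frac{1\ \text{in}}{2.54\ \text{cm}}$$

Now write the expression in the expanded form

$$r = 6.37 \times 10^{6} \times 1\ \text{m} \times \frac{100 \times 1\ \text{cm}}{1\ \text{m}} \times \frac{1\ \text{in}}{2.54 \times 1\ \text{cm}}$$

Then simplify it by canceling the quantity 1 cm appearing in a numerator against the 1 cm appearing in a denominator, and do the same for the quantity 1 m which also appears in both a numerator and a denominator:

$$r = 6.37 \times 10^{6} \times \cancel{1\ \text{m}} \times \frac{100 \times \cancel{1\ \text{cm}}}{\cancel{1\ \text{m}}} \times \frac{1\ \text{in}}{2.54 \times \cancel{1\ \text{cm}}}$$

You now have

$$r = \frac{6.37 \times 10^{6} \times 100 \times 1\ \text{in}}{2.54}$$

or

$$r = 2.51 \times 10^{8}\ \text{in}$$

An abbreviated procedure is to write

$$r = 6.37 \times 10^{6}\ \text{m} \times \frac{100\ \text{cm}}{1\ \text{m}} \times \frac{1\ \text{in}}{2.54\ \text{cm}}$$

Then treating the symbols m and cm themselves as algebraic quantities which can be canceled, make two cancellations, to give

$$r = 6.37 \times 10^{6} \times 100 \times \frac{1\ \text{in}}{2.54} = 2.51 \times 10^{8}\ \text{in}$$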


Note  that  the  result  is  quoted  to  only  three  significant  figures  since  it  was  ob- 
tained from  the  value  of  r in  meters,  which  was  quoted  to  only  three  significant  fig- 
ures. To  give  more  significant  figures  in  the  result  would  be  to  give  a misleading 
impression  of  its  accuracy. 
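
The chain of conversion factors in Example 2-2 can be written as a short routine. The sketch below multiplies by 100 cm/1 m and then by 1 in/2.54 cm, exactly as in the example; the names are assumptions introduced for the illustration.

```python
CM_PER_M = 100.0    # 100 cm = 1 m
CM_PER_IN = 2.54    # 1 in = 2.54 cm (exact)


def meters_to_inches(length_m: float) -> float:
    """Convert meters to inches by way of centimeters."""
    length_cm = length_m * CM_PER_M      # multiply by 100 cm / 1 m
    return length_cm / CM_PER_IN         # multiply by 1 in / 2.54 cm


r_in = meters_to_inches(6.37e6)

# Quote only three significant figures, matching the input value.
print(f"{r_in:.2e} in")
```

The unrounded quotient carries many more digits, but only the first three are meaningful because the radius in meters was itself quoted to three significant figures.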


2-3 TIME AND UNITS OF TIME

With the aim of continuing the development of a precise description of motion, we turn now to the definition and measurement of time. Any physical
system  which  has  a repetitive  behavior  can  be  used  to  measure  the  passage 
of  time,  if  there  is  reason  to  believe  that  each  cycle  of  its  behavior  accurately 
reproduces  each  preceding  cycle.  The  time  for  one  repetition  of  a cycle  of 
motion  is  called  a period.  Thus,  any  system  which  has  a constant  period  is 
capable  of  being  used  as  a clock  to  measure  time.  Examples  of  systems 
which  might  be  expected  to  have  this  property,  to  a greater  or  lesser  de- 




gree,  are  the  pulse  beat,  the  oscillation  of  a pendulum,  the  rotation  of  the 
earth,  and  the  vibration  of  the  electrons  in  an  atom.  In  fact,  each  of  these 
systems  has  been  used  to  measure  time.  No  matter  which  system  is  used, 
the  procedure  is  to  count  the  number  of  repetitions,  that  is,  periods,  which 
occur  during  the  interval  of  time  to  be  measured. 

For  instance,  some  time  intervals  can  be  measured  conveniently  by  re- 
cording how  many  earth  rotations  occur  during  the  interval.  The  time 
lapse  at  a particular  location  from  one  noon  (the  instant  when  the  sun  ap- 
pears to  reach  its  highest  point  in  the  sky  and  pass  over  the  meridian  circle 
of  the  observer)  to  the  next  noon,  averaged  over  one  year,  is  used  to  define 
the  unit  of  time  called  the  mean  solar  day.  Until  recently  the  second  (s)  was 
defined as (1/60)(1/60)(1/24) = 1/86,400 of a mean solar day. The first of
these  factors  arises  because  a second  is  one-sixtieth  of  a minute,  the  next 
because  a minute  is  one-sixtieth  of  an  hour,  and  the  last  because  an  hour  is 
one  twenty-fourth  of  a day. 
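
The product of the three factors can be verified exactly with rational arithmetic. This is a minimal check of the number quoted above, with variable names assumed for the illustration.

```python
from fractions import Fraction

# One second as a fraction of a minute, a minute of an hour, an hour of a day.
second_per_minute = Fraction(1, 60)
minute_per_hour = Fraction(1, 60)
hour_per_day = Fraction(1, 24)

second_as_fraction_of_day = second_per_minute * minute_per_hour * hour_per_day
print(second_as_fraction_of_day)

# Equivalently, the number of seconds in one mean solar day.
seconds_per_day = 1 / second_as_fraction_of_day
print(seconds_per_day)
```

Using `Fraction` avoids any rounding, so the result is exactly 1/86,400 of a day, or 86,400 s in one mean solar day.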


When  the  meter  was  first  introduced,  an  attempt  was  also  made  to  introduce  a 
time  unit  that  was  a purely  decimal  subdivision  of  the  day.  But  few  people  were 
willing  to  accept  it  since  relatively  precise  timekeeping  was  already  a well- 
established  custom.  And  so  the  second  was  retained  as  the  basic  unit  of  time,  in 
spite  of  the  nondecimal  factors  relating  it  to  other  common  units  of  time.  In  most 
branches  of  physics,  however,  the  second  is  used  almost  exclusively  as  the  unit  of 
time.  Thus  the  awkward  factors  do  not  cause  much  trouble. 


How  is  it  determined  that  a repetitive  phenomenon  used  to  define  the 
passage  of  time  actually  continues  to  repeat  itself  uniformly?  It  can  be  by 
comparison  of  the  phenomenon  with  some  other  repetitive  phenomenon. 
For  example,  the  rotational  motion  of  the  earth  can  be  compared  to  the  os- 
cillatory motion  of  a pendulum  by  coupling  the  pendulum  to  a mechanical 
counting  device  to  form  a pendulum  clock.  If  the  oscillations  of  the  pen- 
dulum are  very  carefully  stabilized,  it  is  found  that  the  number  occurring 
during  one  rotation  of  the  earth  is  reproducible  and  constant,  within  the 
accuracy  of  the  measurement. 

An  atomic  clock  can  be  made  by  coupling  the  vibrations  of  the  elec- 
trons in  cesium  atoms  to  an  electronic  counter.  Figure  2-10  is  a photograph 
of  one  such  clock  used  by  the  U.S.  Bureau  of  Standards.  When  extremely 
careful  comparisons  are  made  among  several  different  atomic  clocks,  it  is 
found that they agree with one another to better than 1 part in $10^{11}$. But when the atomic clocks are compared with the rotation period of the earth, it is found that the earth does not keep exactly in step with the atomic clocks. Instead there are seasonal fluctuations in the earth’s rotation period, as judged by the atomic clocks, of about 1 part in $10^8$ (about 1000 times greater than the disagreement among the various atomic clocks). There is also an average annual increase of the earth’s rotation period of about 2 parts in $10^9$. Seasonal and annual changes in the rotation period of
the  earth  can  be  plausibly  attributed  to  known  effects  (for  example,  sea- 
sonal variations  in  wind  patterns  and  continual  drag  from  ocean  tides). 
There  are  no  known  reasons  why  there  could  be  such  variations  in  the 
period  of  vibration  of  atomic  electrons.  It  is  therefore  believed  that  atomic 
clocks  provide  a true  measure  of  the  changes  in  the  rotation  period  of  the 
earth  since  they  depend  on  what  is  thought  to  be  an  accurately  repetitive 
phenomenon. 




Fig.  2-10  The  most  recent  definition 
of  the  metric  unit  of  time  is  based  on 
the  period  elapsing  between  successive 
vibrations of electrons in cesium-133 atoms. The atomic clock shown in the photograph is an electronic device that counts these vibrations. (Courtesy of the National Bureau of Standards.)


As a consequence, the second was redefined in 1967 in terms of the vibration period of the electrons in cesium-133 atoms, when the electrons are
making  transitions  between  their  so-called  hyperfine  ground-state  levels. 
The  second  is  now  defined  as  exactly  9,192,631,770  of  these  periods.  Other 
time  units,  used  in  certain  circumstances,  can  be  found  in  the  tables  inside 
the  book  covers. 


Comparison of several atomic clocks indicates that a cesium-133 atomic clock will maintain a constant rate to about 1 part in $10^{11}$. Get a feeling for this accuracy by estimating how many years it would take an improperly maintained clock whose rate is consistently slow (or fast) by 1 part in $10^{11}$, compared to a properly maintained one, to lose (or gain) 1 s.

■ Since  it  is  a convenient  conversion  factor  to  know,  you  might  evaluate  first 
the  approximate  number  of  seconds  in  1 year  (yr): 


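$$1\ \text{yr} = 365\ \text{days} \times \frac{24\ \text{h}}{1\ \text{day}} \times \frac{60\ \text{min}}{1\ \text{h}} \times \frac{60\ \text{s}}{1\ \text{min}} \simeq 3 \times 10^{7}\ \text{s}$$

(The symbol $\simeq$ means “approximately equal to.” The letter h stands for hour, and
min abbreviates minute.) Then use this factor to convert $10^{11}$ s to years:

$$10^{11}\ \text{s} = 10^{11}\ \text{s} \times \frac{1\ \text{yr}}{3 \times 10^{7}\ \text{s}} \simeq 3 \times 10^{3}\ \text{yr}$$

From this you can conclude that in about 3000 yr a clock running slow (or fast) by 1
part in $10^{11}$ would lose (or gain) 1 s.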

But  it  is  extremely  unlikely  that  an  actual  atomic  clock  would  accumulate  any- 
thing like  a 1-s  error  in  3000  yr  (if  it  was  maintained  in  good  operating  order).  The 
reason  is  that  its  rate  will  not  be  consistently  slow,  or  fast,  but  instead  will  fluctuate 
between  slow  and  fast.  Thus  there  would  be  a very  high  degree  of  error  compensa- 
tion over  the  3000-yr  period,  and  the  accumulated  error  would  be  a very  small  frac- 
tion of  1 s. 
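
The estimate above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original example; the names `SECONDS_PER_YEAR` and `years_to_drift` are illustrative, and a 365-day year is assumed as in the text.

```python
# Number of seconds in a 365-day year, built from the same conversion factors.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_to_drift(drift_seconds, fractional_rate_error):
    """Years needed for a clock with a constant fractional rate error
    to accumulate the given drift (hypothetical helper)."""
    elapsed_seconds = drift_seconds / fractional_rate_error
    return elapsed_seconds / SECONDS_PER_YEAR


# A clock that is consistently off by 1 part in 10^11, drifting by 1 s.
years = years_to_drift(1.0, 1e-11)
print(f"{SECONDS_PER_YEAR:.1e} s in one year")
print(f"about {years:.0f} yr to lose (or gain) 1 s")
```

Using the unrounded number of seconds per year gives a figure slightly above 3000 yr, which agrees with the one-significant-figure estimate in the text.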




2-4  VELOCITY 


Fig. 2-11 The position x versus the time t for an object moving with constant
velocity of magnitude 10 m/s in the direction of increasing x. The velocity
has the value v = +10 m/s.


Having  considered  the  procedures  used  to  measure  position  and  time,  we 
turn  our  attention  to  a fundamentally  important  quantity  that  involves  both 
position  and  time.  This  is  velocity,  the  quantity  specifying  how  rapidly  the 
position  of  an  object  changes  with  the  passage  of  time.  As  a simple  specific 
example,  the  velocity  of  a car  moving  uniformly  along  a straight  road  is  the 
change  in  its  position  in  a certain  time,  divided  by  that  time.  The  magnitude 
of  the  velocity  of  the  car  can  be  expressed  in  meters  per  second  or  in  other 
units  of  length  per  time  (such  as  miles  per  hour  in  the  United  States). 
These  are  also  the  units  for  speed.  Indeed,  the  words  “velocity”  and 
“speed”  are  used  interchangeably  in  everyday  language.  But  a significant 
distinction  between  the  two  is  made  in  technical  language.  Velocity  has  a 
direction  as  well  as  a magnitude,  whereas  speed  has  only  magnitude.  The  direc- 
tion of  a velocity  specifies  the  direction  of  the  motion  it  describes.  A car 
moving  in  one  direction  along  a straight  road  at  100  kilometers  per  hour 
(km/h)  does  not  have  the  same  velocity  as  one  moving  at  100  km/h  in  the 
opposite  direction.  The  velocities  of  the  two  cars  have  opposite  directions. 
But  the  cars  have  the  same  speeds.  In  this  section  you  will  see  how  these 
ideas  are  extended  to  handle  the  very  important  case  of  one-dimensional 
motion  which  is  not  uniform.  You  will  also  see  how  the  mathematical  con- 
cept of  a derivative  arises  naturally  when  such  a case  is  considered. 


Imagine  that  the  car,  having  uniform  motion  in  a certain  direction  along 
a straight  road,  is  traveling  a distance  of  10  m each  second.  Figure  2-11  de- 
scribes this  motion  by  plotting  the  straight  line  giving  the  position  x of  the 
uniformly  moving  car  versus  the  time  t.  The  clock  used  to  measure  t was 
zeroed  at  the  instant  the  car  passed  the  mark  on  the  road  from  which  x is 
measured.  That  is  why  the  straight  line  showing  the  dependence  of  x on  t 
passes  through  the  point  at  x = 0 m and  t = 0 s,  the  origin  of  both  the  x 
and  t axes.  The  time  t is  plotted  so  that  it  increases  to  the  right,  and  the  po- 
sition x is  plotted  so  that  positive  values  of  x are  plotted  upward.  Positive 
values  of  x are  those  in  an  agreed-upon  direction  from  the  mark  on  the 
road.  As  the  values  of  x become  more  positive  with  increasing  values  of  t, 
the  car  is  moving  in  the  direction  of  positive  x. 

That  the  car  is  moving  uniformly  is  shown  in  the  way  x increases  by  the 
same  amount  in  equal  intervals  of  t.  The  figure  indicates  that  the  car  is 
10  m past  the  mark  after  1 s has  elapsed,  20  m past  after  2 s,  and  so  on. 
The  car  is  said  to  have  a constant  velocity  of  10  meters  per  second  (m/s)  in 
the  positive  x direction,  because  x increases  by  10  m every  time  t increases 
by  1 s.  Velocity  is  assigned  the  symbol  v.  The  constant  velocity  v of  an  object  is 
defined  to  have  a value  equal  to  the  change  in  its  position  in  a certain  time  interval, 
divided  by  the  duration  of  that  time  interval. 

We write the values of the pair of quantities specifying position and
time, at the first instant considered, as $x_i$ and $t_i$. These are called initial
values, and the subscript i stands for initial. A subsequent pair of position and
time values is written as x and t. In particular, we let $x_i$ and $t_i$ be the values at
the beginning of the time interval used in the definition of constant velocity.
And we let x and t be the pair at the end of the time interval. Then the
definition of constant velocity v can be written


$$v = \frac{x - x_i}{t - t_i} \qquad \text{for constant } v \tag{2-1a}$$


We  were  using  this  definition  implicitly  in  the  paragraph  before  the 
last  when  we  evaluated  v for  the  motion  described  in  Fig.  2-11.  Now  let  us 




Fig. 2-12 The position x versus the time t for an object moving with constant
velocity of magnitude 10 m/s in the direction of decreasing x. The velocity
has the value v = -10 m/s.

Fig. 2-13 The position x versus the time t for an object moving with constant
velocity v = +20 m/s.


use the definition explicitly. To do this as it was done in that paragraph, we
choose, from all the possible pairs of position and time values that the figure
shows to be in correspondence, the pair x = 0 and t = 0 as the initial
values. That is, we take $x_i = 0$ m and $t_i = 0$ s. Let us also choose for the
position and time values at the end of the interval the pair x = 10 m and t =
1 s. Then we have


$$v = \frac{10\ \text{m} - 0\ \text{m}}{1\ \text{s} - 0\ \text{s}} = \frac{+10\ \text{m}}{+1\ \text{s}} = +10\ \text{m/s}$$


We can also choose any other pair of x and t values to be those at the end of
the interval, but we will obtain the same result for v. Try it with x = 20 m
and t = 2 s. Furthermore, we can alter our choice for the pair of initial
values without affecting the result obtained. For example, we can take $x_i = 10$ m
and $t_i = 1$ s and, say, x = 20 m and t = 2 s, to obtain again


$$v = \frac{20\ \text{m} - 10\ \text{m}}{2\ \text{s} - 1\ \text{s}} = \frac{+10\ \text{m}}{+1\ \text{s}} = +10\ \text{m/s}$$


No matter how the definition of Eq. (2-1a) is used on the x versus t curve of
Fig. 2-11, it consistently yields the same result for v. That is what we mean
when we say that the curve describes motion with a constant velocity.
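
The claim that every choice of initial and final pair gives the same v can be checked mechanically. The sketch below is illustrative only: the function name `constant_velocity` is an assumption, and the listed points are the values the text reads from Fig. 2-11 (10 m past the mark after 1 s, 20 m after 2 s, and so on).

```python
def constant_velocity(x_i, t_i, x, t):
    """Eq. (2-1a): change in position divided by the elapsed time."""
    return (x - x_i) / (t - t_i)


# (t in s, x in m) pairs lying on the straight line of Fig. 2-11.
points = [(0.0, 0.0), (1.0, 10.0), (2.0, 20.0), (3.0, 30.0)]

# Try every initial pair with every later pair as the end of the interval.
velocities = []
for i, (t_i, x_i) in enumerate(points):
    for t, x in points[i + 1:]:
        v = constant_velocity(x_i, t_i, x, t)
        velocities.append(v)
        print(f"from t = {t_i} s to t = {t} s: v = {v:+.1f} m/s")
```

Every line printed shows v = +10.0 m/s, which is what it means for the x versus t curve to be a straight line.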


The definition of Eq. (2-1a) gives both the magnitude of the velocity
and its direction. For one-dimensional motion with constant velocity, the
magnitude of a velocity is just the magnitude (in other words, the absolute
value) of the fraction $(x - x_i)/(t - t_i)$. The direction of a velocity is given
by the sign of this fraction for motion in one dimension. If an object is
moving in the direction chosen to be the direction of positive x, as in Fig.
2-11, then x is more positive than $x_i$. So the numerator of the fraction,
$x - x_i$, is positive. Since time always increases, the denominator of the
fraction, $t - t_i$, is positive in all circumstances. Thus the value of v is positive.

But if an object is moving in the direction opposite to the direction of
positive x, as in Fig. 2-12, then x is less positive than $x_i$. This causes the value
of v calculated from Eq. (2-1a) to be negative. You should use that equation
to show that the value of v for Fig. 2-12 is v = -10 m/s.

In  summary,  for  one-dimensional  motion  the  direction  of  a velocity  is 
given  by  its  algebraic  sign.  Positive  v means  motion  in  the  agreed-upon 
direction  of  positive  x.  Negative  v means  motion  in  the  opposite  direction. 
Later  in  this  section,  we  discuss  a specific  mechanical  system  in  which  nega- 
tive velocities  must  be  used. 

Figure 2-13 plots x versus t for a car moving with the constant velocity
v = +20 m/s past the location x = 0 at time t = 0. You should verify the
value of v by applying Eq. (2-1a) to data obtained from the figure. Then
compare Fig. 2-13 to Fig. 2-11. You will note that the characteristic which
distinguishes them is that the straight line produced by plotting x versus t
has a steeper positive slope for the figure corresponding to the greater
positive velocity. So, the slope of the line describing the function x(t) on an x
versus t plot is a measure of the velocity of the uniformly moving object. In
fact, the slope is numerically equal to the velocity since the definition of Eq.
(2-1a) is just a way of calculating the slope. The magnitude of the slope
gives the magnitude of the velocity, and its sign gives the sign of the velocity.
An x versus t plot with a negative slope was shown in Fig. 2-12.




Fig.  2-14  A falling  puck. 


It might seem that the slope of the line describing a function x(t) depends on
the scale used to construct the x versus t plot. For instance, if Fig. 2-13 were
redrawn with the numbers on the x axis twice as close together, its slope could be
made to appear superficially the same as the slope of the line in Fig. 2-11. But as
soon as the difference in scales of the x axes was noted, the actual difference in the
slopes of the functions described by the two figures would be apparent. The point
is that the slope of a line plotting x versus t is measured by a change in x in meters,
divided by the corresponding change in t in seconds. This quantity is independent
of the scales used for the x and t axes.

Before continuing the development of one-dimensional kinematics,
we need to introduce a more convenient way of expressing the content of
Eq. (2-1a). Instead of writing it as

$$v = \frac{x - x_i}{t - t_i} \qquad \text{for constant } v$$

we write the equation as

$$v = \frac{\Delta x}{\Delta t} \qquad \text{for constant } v \tag{2-1b}$$

That is, we express $x - x_i$ as $\Delta x$ and $t - t_i$ as $\Delta t$. We do this by using the
uppercase Greek letter $\Delta$ (delta) to mean “change in,” a usage that is universal
in physics and mathematics. Thus $\Delta x$ means “change in x” and $\Delta t$ means
“change in t.” Specifically,

$$\Delta x = x - x_i \tag{2-2a}$$

and

$$\Delta t = t - t_i \tag{2-2b}$$

Remember that the symbol $\Delta$ is not itself an algebraic quantity; $\Delta x$ is not
“$\Delta$ times x,” and $\Delta t$ is not “$\Delta$ times t.”


Armed with this compact notation, we begin an investigation of nonuniform
motion in one dimension. An example is depicted in Fig. 2-14, which
reproduces the strobe photo of a falling puck considered in Chap. 1. As
we will demonstrate by direct experiment at the end of this chapter, Fig.
2-15 gives quantitatively the position versus time plot for the motion of the
puck, or any other object falling freely (that is, with negligible air
resistance) near the surface of the earth. Positive values of the position x are
measured vertically downward from the point at which the object was
released from rest. The values of the time t are measured from the instant of
release.

The motion of the falling object is nonuniform because its velocity is
not constant. So the plot of the function x(t) in Fig. 2-15 is not a line of
constant slope. Your intuition may be able to make an immediate connection
between the ever-increasing slope of the x(t) curve and the ever-increasing
velocity of the falling object. But intuition is not enough. We must learn
how to determine precisely from x(t) the velocity of a nonuniformly moving
object at any instant t. This will be done by applying the definition
$v = \Delta x/\Delta t$ while letting $\Delta t$ become smaller and smaller.

Fig. 2-15 The distance fallen x versus the time t elapsed since the beginning of
the fall, for an object in free fall near the surface of the earth. Free fall means that
air resistance is negligible. This will be the case for a dense object like a steel ball
in the early part of its fall where its speed is low.

Fig. 2-16 An expanded view of the first part of the x(t) curve in Fig. 2-15.

Suppose that we want to determine the value of the velocity when the
value of the time is 1 s. We express this time as $t_i = 1$, simplifying the
symbolism to be used in the following discussion by agreeing that in the
discussion we will measure time in seconds. In Fig. 2-16 (which is Fig. 2-15 drawn
to an expanded scale) we illustrate how the value of the velocity v can be
determined. Take a small time interval beginning at $t_i = 1$ and ending at
$t = 1 + \Delta t$. Then the corresponding values of $x_i$ and x at the beginning and
end of the interval are found from the x(t) curve. Their difference,
$\Delta x = x - x_i$, is the actual change in position in the particular time interval
of duration $\Delta t$ which begins at $t_i = 1$. Next, evaluate the fraction $\Delta x/\Delta t$.
This fraction is clearly related to the velocity of the body at $t_i = 1$. But it is
not the velocity at the precise instant $t_i = 1$. For one thing, it has to do with
the total motion over the total time interval from $t_i = 1$ to $t = 1 + \Delta t$, not
just with the motion at the instant $t_i = 1$. Also, the definition $v = \Delta x/\Delta t$ of
Eq. (2-1b) was restricted to the case where x(t) plots as a straight line. Here
the plot is not a straight line.

But is it really necessary that x(t) yield a straight line for all values of
time in order for the definition $v = \Delta x/\Delta t$ to be applied in the immediate
vicinity of $t_i = 1$? No. It is necessary only that the part of the x(t) curve
which is actually used to evaluate $\Delta x/\Delta t$ be a sufficiently good approximation to a
straight line.

If you look at the entire x(t) curve in Fig. 2-16, it is obviously not
straight. However, if you restrict your attention to the part struck off by $\Delta t$,
you must inspect it quite carefully to discern that it is not straight. The
point is made in Fig. 2-17. The actual motion of the object is almost uniform
over the time interval $\Delta t$. So it should be possible to obtain a good
approximation to its velocity v at the instant occurring at the beginning of the
interval by evaluating $\Delta x/\Delta t$. In other words, we can use the approximate
equality

$$v \simeq \frac{\Delta x}{\Delta t} \qquad \text{for nonuniform motion when } \Delta t \text{ is small} \tag{2-3}$$


Is the approximation good enough? If it is not, we can always improve
it by simply reducing the size of the time interval $\Delta t$. Referring to Fig. 2-16,
we can obtain an even better approximation to the value of v at the instant
$t_i = 1$ by taking the smaller time interval $\Delta t'$ and then evaluating the ratio
$\Delta x'/\Delta t'$. Over this shorter interval the actual x(t) curve is even more closely
approximated by a straight line. If still greater precision is required, the
value of v at $t_i = 1$ can be approximated by using an even smaller time
interval $\Delta t''$ and evaluating $\Delta x''/\Delta t''$. And so on.

Numerical examples of this procedure are given in Sec. 2-5. When you
inspect them, you will see that there is a substantial difference between the
values of $\Delta x/\Delta t$ and $\Delta x'/\Delta t'$. But the difference between $\Delta x'/\Delta t'$ and
$\Delta x''/\Delta t''$ is appreciably smaller. Continuing the process, it is found that
further reduction in the time interval no longer produces a discernible
change in the approximation to the velocity at the instant when the time
interval begins. The reason is that the time interval has become so short that
the part of the x(t) curve lying within it is essentially a straight line. When
this happens, the sequence of approximations to the velocity is said to have
converged to its limit. The velocity v is defined to equal this limit. In other
words, by definition v equals the limit of $\Delta x/\Delta t$ as $\Delta t$ approaches zero. In
mathematical notation the definition is written

$$v = \lim_{\Delta t \to 0} \frac{\Delta x}{\Delta t} \tag{2-4}$$
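
The convergence described here is easy to watch numerically. The following is a rough sketch rather than the program supplied with the book: it assumes the free-fall function x(t) = (4.90 m/s²)t² that the text gives as Eq. (2-10), and the name `difference_quotients` is hypothetical.

```python
def x(t):
    """Distance fallen in meters after t seconds (free fall, Eq. 2-10)."""
    return 4.90 * t**2


def difference_quotients(position, t_i, dt, steps):
    """Return delta_x / delta_t for a time interval that starts at t_i
    and is halved at each step (hypothetical helper)."""
    ratios = []
    for _ in range(steps):
        ratios.append((position(t_i + dt) - position(t_i)) / dt)
        dt /= 2
    return ratios


# Velocity at t_i = 1 s, starting from an interval of 0.1 s.
ratios = difference_quotients(x, t_i=1.0, dt=0.1, steps=12)
for step, ratio in enumerate(ratios):
    print(f"dt = {0.1 / 2**step:.6f} s   dx/dt = {ratio:.4f} m/s")
```

The first ratios differ noticeably from one another, while the later ones change only in decimal places far to the right. The value they settle toward is the limit that Eq. (2-4) defines as the velocity.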


We have been led by the need to evaluate the velocity for a nonuniformly
moving body to the most basic concept of differential calculus. In
fact, the development of differential calculus (the work of Isaac Newton
and, independently, of Gottfried Wilhelm von Leibniz around 1666) was
motivated by the same need. The quantity on the right side of Eq. (2-4) is
the instantaneous rate of change of x with respect to t. In differential calculus it is
called the derivative of x with respect to t and is commonly written as dx/dt.
That is,


$$\frac{dx}{dt} \equiv \lim_{\Delta t \to 0} \frac{\Delta x}{\Delta t} \tag{2-5}$$

(The symbol $\equiv$ means “identical to.” But dx does not mean “d times x,” and
dt does not mean “d times t.”) With this symbolism, the definition of the
velocity v can be expressed by

$$v = \frac{dx}{dt} \tag{2-6}$$




The  velocity  is  the  derivative  of  position  with  respect  to  time.  To  be  more  specific, 
at  any  instant  the  velocity  v of  an  object  is  obtained  by  evaluating  at  that  in- 
stant the  derivative  of  its  position  x with  respect  to  time  t. 

Note that there are no restrictions on the definition of velocity in
one-dimensional motion given by Eq. (2-6), as there were on earlier definitions. It
applies in all circumstances. Can you see how each of the earlier definitions is
contained in Eq. (2-6) as a special case? Specific examples of the evaluation of
derivatives are given in Sec. 2-5, using both numerical and analytical methods.

The convergence of $\Delta x/\Delta t$ to its limit, as $\Delta t$ approaches zero, is
illustrated and interpreted from geometric considerations in Fig. 2-18. [This
figure is like an expanded view of the important part of Fig. 2-16. But for
the sake of clarity it plots a function x(t) with considerably greater curvature
than the one plotted in Fig. 2-16.] You can see that the value of $\Delta x/\Delta t$ gives
the slope of the straight line connecting the point on the curve at $t_i = 1$ to
the farthest point on the curve. The value of $\Delta x'/\Delta t'$ gives the slope of the
straight line connecting the $t_i = 1$ point to the intermediate point on the
curve. And the value of $\Delta x''/\Delta t''$ gives the slope of the straight line
connecting the $t_i = 1$ point to the point nearest to it on the curve. As the time
interval is reduced, the slopes of the sequence of straight lines generated
approach a value which is the slope of the tangent to the curve of x(t) at
$t_i = 1$. Stated in geometric terms, the chords drawn to the curve for successively
smaller intervals approximate more and more closely the tangent to the
curve, and their slopes approximate more and more closely the slope of the
tangent. The slope of the tangent to a curve at some point is commonly
called just the “slope of the curve” at that point. So we can say that the slope
of the x versus t curve at some instant equals the limit as $\Delta t$ approaches zero
of $\Delta x/\Delta t$ at that instant. Since the limit is the derivative and the derivative is
the velocity, we can also say that the velocity at a particular instant is given by the
slope of the position versus time curve at that instant.

Fig. 2-18 A geometrical representation of the derivative of
x with respect to t. The derivative at the value $t_i$ is the slope of
the line that is tangent to the x versus t curve at the point
corresponding to $t_i$.

The magnitude of a velocity is called a speed. Thus speed is never a
negative quantity. It is written symbolically as $|v|$. (The bars are the
mathematical symbols for taking the absolute value of a quantity; that is, for
deleting the minus sign in the numerical value if the numerical value of the
quantity is negative.) The speed $|v|$ provides a useful description of the
motion of an object in circumstances where the rapidity of the motion is
important but the direction of the motion is not. For instance, the condition
that motion be governed by the laws of newtonian mechanics, instead of
relativistic mechanics, is that the speed be small compared to the speed of
light. The direction of the motion is of no consequence, as far as this
distinction is concerned. But in most circumstances the direction in which an
object moves is important, and so the velocity, not the speed, must be used
to describe the motion.


For nonuniform motion it is sometimes useful to speak of the average
velocity, symbolized by $\langle v \rangle$. (The brackets are the mathematical symbols
for taking the average value of a quantity.) If in a time interval of duration
$\Delta t$ the change in the x coordinate of an object is $\Delta x$, then its average velocity
for the time interval is defined as

$$\langle v \rangle = \frac{\Delta x}{\Delta t} \tag{2-7}$$


As the lack of restrictions implies, the time interval is not necessarily small.

The average velocity $\langle v \rangle$ can be of greater practical interest than the
instantaneous velocity v. For instance, in planning a long automobile trip (on a
straight road since we are working in one dimension) you know that there
will be variations in v because of traffic, rest stops, and so forth. But to
predict the distance you will travel in a day, you can often make a reasonably
accurate estimate of $\langle v \rangle$, even though v is unpredictable from minute to
minute during the trip. Then you can solve Eq. (2-7) for $\Delta x$, to obtain

$$\Delta x = \langle v \rangle\, \Delta t \tag{2-8}$$

Inserting the estimated value of $\langle v \rangle$ and the time $\Delta t$ that you plan to spend
in traveling, you obtain immediately an estimate for the distance $\Delta x$ that
you will travel.
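
As a concrete illustration of Eqs. (2-7) and (2-8), the sketch below estimates an average velocity from an earlier trip and uses it to predict a distance. All the numbers are hypothetical, as are the names `average_velocity` and `predicted_distance`; the units are kilometers and hours rather than SI units.

```python
def average_velocity(x_i, t_i, x, t):
    """Eq. (2-7): <v> = delta_x / delta_t over the whole interval."""
    return (x - x_i) / (t - t_i)


def predicted_distance(v_avg, duration):
    """Eq. (2-8): delta_x = <v> * delta_t."""
    return v_avg * duration


# Hypothetical log of an earlier trip: (time in h, position in km).
# The half-hour with no change in position is a rest stop.
log = [(0.0, 0.0), (2.0, 180.0), (2.5, 180.0), (6.0, 480.0)]

# Only the first and last entries matter for the average velocity.
(t_start, x_start), (t_end, x_end) = log[0], log[-1]
v_avg = average_velocity(x_start, t_start, x_end, t_end)

planned_hours = 7.0  # hypothetical time to be spent traveling
estimate = predicted_distance(v_avg, planned_hours)
print(f"<v> = {v_avg:.0f} km/h, estimated distance = {estimate:.0f} km")
```

The instantaneous velocity varies along the way (it is zero during the rest stop), yet the estimate needs only the overall change in position and the total elapsed time.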

For uniform motion the velocity v is constant, and so the average velocity
$\langle v \rangle$ has the same value as v. The more nonuniform the motion and the
larger the time interval $\Delta t$, the greater can be the difference between $\langle v \rangle$
and v. Example 2-4 illustrates the relation between these two quantities, as
well as the other relations developed in this section.


EXAMPLE 2-4

Figure 2-19 shows a strobe photo of an oscillating pendulum. It comprises an
object called a bob tied to one end of a cord of moderate length, whose other end is
tied to a fixed support. Figures 2-20a and b indicate the position at successive times
through one oscillation cycle of the bob in a pendulum with a very long cord. Since
the distance between the extreme positions of the bob is small compared to the
length of the cord, the motion is very nearly one-dimensional. The caption to Fig.
2-20a describes the motion in detail. Use Fig. 2-20b to answer the following
questions concerning the motion.

a.  When  does  the  velocity  have  the  largest  positive  value? 

b.  When  does  the  velocity  have  the  largest  negative  value? 

c.  Is  the  velocity  ever  zero?  If  so,  when? 

d.  What  are  the  answers  to  the  three  preceding  questions  if  they  are  asked 
about  the  speed? 


Fig. 2-19 A strobe photo of a pendulum with a cord of moderate length.

Fig. 2-20 (a) A strobe-photo-like representation of one
oscillation cycle of a pendulum with a very long cord. The
pendulum bob supported by the cord actually travels
along a circle centered on the fixed end of the cord (the
end being too distant to show on the figure). But since the
maximum change in the position of the bob is small
compared to the length of the cord, its path will lie very
close to the straight line defining the x axis. When t = 0 s,
the bob swings past x = 0 m, going to the right. When
t = 1 s, the bob passes x = 0.7 m. When t = 2 s, the bob
comes instantaneously to rest at x = 1.0 m. Then it
reverses its direction of motion, passing x = 0.7 m again
at t = 3 s. The bob continues past x = 0 m at t = 4 s, and
x = -0.7 m at t = 5 s, coming instantaneously to rest at
x = -1.0 m when t = 6 s. Then it reverses its direction of
motion once more, passing x = -0.7 m at t = 7 s and
x = 0 m at t = 8 s. Having completed the first cycle of its
oscillation, the bob continues moving into the next cycle,
but this is not shown in the figure. (b) An x versus t plot for
the motion depicted in part a of this figure. At several
values of t the slope of the curve is shown by drawing its
tangent.


e. What is the average velocity for the time interval covered by the first
half-cycle?

■ a. Since the value of the velocity v is the slope of the position versus time
curve, and since the slope has its largest positive value at t = 0 s and again at t = 8 s,
the velocity has its largest positive value at these times.

b. The velocity has its largest negative value at t = 4 s because at that time the
slope of the curve is most negative.

c. The velocity is zero at t = 2 s, as well as at t = 6 s, since the slope is zero at
these instants. If this bothers you, consider the fact that just before t = 2 s the bob is
moving slowly in the positive x direction, while just after t = 2 s it is moving slowly
in the negative x direction. Since it switches the direction of its motion, surely there
will be an instant when it is not moving.

d. Since speed $|v|$ is the magnitude of velocity, the speed has the same value at
t = 0 s, t = 4 s, and t = 8 s. These are the three times when the speed has the
largest positive value. The speed never has a negative value. Its value is zero when the
value of the velocity is zero, that is, at t = 2 s and t = 6 s.

e. The half-cycle begins at t = 0 s and ends at t = 4 s. At both times the bob is
at x = 0 m. Using these values in the definition of the average velocity for the time
interval, you find

$$\langle v \rangle = \frac{0\ \text{m} - 0\ \text{m}}{4\ \text{s} - 0\ \text{s}} = 0\ \text{m/s}$$

You can understand the meaning of this result by noting the symmetry of the x
versus t curve about t = 2 s. Because of this symmetry the set of values taken on by
the velocity from t = 0 s to t = 2 s is exactly the same as the set of values taken on
from t = 2 s to t = 4 s, except that the signs are reversed. Since $\langle v \rangle$ represents the
average of the values of v from t = 0 s to t = 4 s, and since the contributions to this
average from the positive values will just cancel the contributions from the negative
values, it is not surprising that the direct application just made of the definition of
$\langle v \rangle$ yields the result $\langle v \rangle$ = 0 m/s. But such a result is not always obtained. For
instance, $\langle v \rangle$ is certainly greater than zero for the interval from t = 0 s to t = 2 s.


2-5  DIFFERENTIATION


Unless you are already familiar with differential calculus, it will be
necessary to take a short break from physics in this section to study
differentiation, the mathematical process of evaluating a derivative.

The definition of a derivative given in Eq. (2-5),

$$\frac{dx}{dt} \equiv \lim_{\Delta t \to 0} \frac{\Delta x}{\Delta t}$$

is not very explicit. A more detailed form of the definition provides more
information about the procedure actually used in evaluating a derivative.
Consider a varying quantity x whose value depends on the value of a
varying quantity t. The quantity x is called the dependent variable, and the
quantity t is called the independent variable. The nature of the dependence
of x on t is usually given in the form of a mathematical function x(t). By
definition, the derivative of x with respect to t, evaluated at $t_i$, is

$$\left(\frac{dx}{dt}\right)_{t_i} \equiv \lim_{\Delta t \to 0} \frac{\Delta x}{\Delta t} = \lim_{\Delta t \to 0} \frac{x(t_i + \Delta t) - x(t_i)}{\Delta t} \tag{2-9}$$

This says that the derivative, at some value of t designated as $t_i$, is obtained
by first finding the difference between the value of the function x(t) when t
is $t_i + \Delta t$ and its value when t is $t_i$, then dividing that difference by $\Delta t$, and
finally taking the limit of the sequence of results obtained by using
successively smaller values of $\Delta t$.
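
The same recipe can also be carried out symbolically. As a short worked illustration (a sketch, not taken from the text), apply it to the free-fall function $x(t) = 4.90\,t^2$ (x in m, t in s) that appears in Example 2-5. Forming the difference and dividing by $\Delta t$ gives

$$\frac{x(t_i + \Delta t) - x(t_i)}{\Delta t} = \frac{4.90\,(t_i + \Delta t)^2 - 4.90\,t_i^2}{\Delta t} = \frac{4.90\,(2\,t_i\,\Delta t + \Delta t^2)}{\Delta t} = 9.80\,t_i + 4.90\,\Delta t$$

As $\Delta t$ is made successively smaller, the term $4.90\,\Delta t$ shrinks toward zero while $9.80\,t_i$ is unaffected, so

$$\left(\frac{dx}{dt}\right)_{t_i} = \lim_{\Delta t \to 0}\,(9.80\,t_i + 4.90\,\Delta t) = 9.80\,t_i$$

For $t_i = 1$ this is 9.80 m/s. With $\Delta t = 0.1$ the ratio before taking the limit is 9.80 + 0.49 = 10.29, which shows how far a finite interval can be from the limiting value.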




Example 2-5 demonstrates the process of taking the limit and thereby
illustrates the most important idea of differential calculus. The
demonstration involves going through a sequence of numerical calculations to
evaluate the derivative, at a particular value of $t_i$, of a function x(t) that has
a particular mathematical form. You will find it worthwhile to employ this
numerical method a few times. You can use it to evaluate derivatives for
other values of $t_i$ as well as for functions x(t) having other forms. It is
feasible to perform the numerical calculations on a manually operated pocket
calculator. But it is much easier to use a programmable pocket calculator or
a small computer. A suitable program and operating instructions are found
in the pamphlet called the Numerical Calculation Supplement that is
provided with this book. The program is identified as the numerical
differentiation program.


EXAMPLE  2-5

The motion of an object falling freely near the surface of the earth was plotted in
Fig. 2-15. The information given by the plot can also be given by the function

\[ x(t) = (4.90\ \mathrm{m/s^2})\,t^2 \tag{2-10} \]

To verify this, let t have the precise values t = 0 s, 1 s, and 2 s in the equation to obtain
x = 0 m, 4.90 m, and 19.60 m. Then look at the plot. Use the definition of Eq. (2-9)
to make a direct numerical evaluation of $(dx/dt)_{t_i}$ for $t_i$ = 1 s. Let the first value of
$\Delta t$ be precisely $\Delta t$ = 0.1 s and make each subsequent value smaller by a factor of 1/2
(that is, use $\Delta t$ = 0.1 s, 0.05 s, 0.025 s, and so on). Obtain results to an accuracy of
two decimal places.

■ The first thing to do is to reexpress Eq. (2-10) in the completely equivalent
form

\[ x(t) = 4.90\,t^2 \qquad (x \text{ in m},\ t \text{ in s}) \tag{2-11} \]

Since a calculating device deals only with numbers and not with units, you must
keep track of the units because it does not. The units are specified in Eq. (2-11) in a
way that makes this fact more apparent. Similarly, reexpress $t_i$ and the first value of
$\Delta t$ as $t_i$ = 1 and $\Delta t$ = 0.1 (t in s). Then perform the calculations, either manually by
carrying out the procedure called for in Eq. (2-9) or by using the program which
will make a programmable calculator or computer carry out the procedure for you.
First you calculate the number


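\[ \frac{4.90(1.1)^2 - 4.90(1)^2}{0.1} = 4.90\left[\frac{(1.1)^2 - 1}{0.1}\right] = 4.90\left(\frac{1.21 - 1}{0.1}\right) = 10.29 \]

The next number you calculate is

\[ \frac{4.90(1.05)^2 - 4.90(1)^2}{0.05} = 4.90\left[\frac{(1.05)^2 - 1}{0.05}\right] = 4.90\left(\frac{1.1025 - 1}{0.05}\right) = 10.05 \]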


You  continue  with  these  calculations,  with  each  successive  value  of  At  half  as  large  as 
the  one  before. 

You  will  obtain  the  following  sequence. 

\[ \left(\frac{\Delta x}{\Delta t}\right)_{t_i = 1} = 10.29,\ 10.05,\ 9.92,\ 9.86,\ 9.83,\ 9.82,\ 9.81,\ 9.80,\ 9.80,\ 9.80 \qquad (x \text{ in m},\ t \text{ in s}) \]

Inspection  shows  that  the  sequence  of  numbers  gradually  approaches  the  number 

2-5  Differentiation  31 


9.80.  In  other  words,  the  sequence  converges  to  the  limit  9.80.  Consequently,  the 
derivative  has  the  value 

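\[ \left(\frac{dx}{dt}\right)_{t_i = 1} = \lim_{\Delta t \to 0} \left(\frac{\Delta x}{\Delta t}\right)_{t_i = 1} = 9.80 \qquad (x \text{ in m},\ t \text{ in s}) \]

Inserting the units for x and t to express the result in the manner of Eq. (2-10), you
have

\[ \left(\frac{dx}{dt}\right)_{t_i = 1\ \mathrm{s}} = 9.80\ \mathrm{m/s} \]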

The  units  in  the  result  are  meters  per  second  since  a velocity  is  a length  divided  by  a 
time,  and  lengths  are  measured  in  meters  while  times  are  measured  in  seconds. 
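
The halving procedure of Example 2-5 is easy to try on any computer. The sketch
below is not the program from the Numerical Calculation Supplement; it is a minimal
Python illustration whose function names (difference_quotient, halving_sequence) are
chosen here for convenience. It evaluates the quotient of Eq. (2-9) for Eq. (2-11)
at $t_i$ = 1, starting with $\Delta t$ = 0.1 and halving ten times.

```python
def difference_quotient(x, t_i, dt):
    """Return [x(t_i + dt) - x(t_i)] / dt, the quantity whose limit appears in Eq. (2-9)."""
    return (x(t_i + dt) - x(t_i)) / dt


def x_free_fall(t):
    """Eq. (2-11): x in m, t in s. The program carries numbers only, not units."""
    return 4.90 * t**2


def halving_sequence(x, t_i, first_dt=0.1, steps=10):
    """Difference quotients for dt = first_dt, first_dt/2, first_dt/4, and so on."""
    results = []
    dt = first_dt
    for _ in range(steps):
        results.append(difference_quotient(x, t_i, dt))
        dt /= 2
    return results


if __name__ == "__main__":
    for k, quotient in enumerate(halving_sequence(x_free_fall, 1.0)):
        print(f"dt = {0.1 / 2**k:.7f} s   quotient = {quotient:.4f} m/s")
```

Printing four decimal places instead of two shows that the quotients keep creeping
toward 9.80 even after the two-decimal values have stopped changing. Replacing
x_free_fall or the value of t_i lets you repeat the experiment for other functions
and other times.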


The numerical method of evaluating a derivative that was used in Example
2-5 has the pedagogical advantage of providing a real feeling for the
limiting process. But it has the practical disadvantage of being extremely
specific. To evaluate $(dx/dt)_{t_i}$ for other values of $t_i$ you must perform the
entire sequence of calculations at each value. Once you understand the
concept of a limit and the meaning of a derivative, it is much more convenient
to evaluate $(dx/dt)_{t_i}$ from the functional form of x(t) by using the analytical
method indicated in Example 2-6.


EXAMPLE  2-6 

Make an analytical evaluation of $(dx/dt)_{t_i}$ for $x(t) = ct^2$, where c is a constant, by
employing the definition of Eq. (2-9).

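■ You must first evaluate the quantity

\[ \left(\frac{\Delta x}{\Delta t}\right)_{t_i} = \frac{x(t_i + \Delta t) - x(t_i)}{\Delta t} \]

Setting t equal to $t_i + \Delta t$ in the expression $x(t) = ct^2$, you have $x(t_i + \Delta t) =
c(t_i + \Delta t)^2$. Doing the same with t set equal to $t_i$ gives you $x(t_i) = ct_i^2$. So you have

\[ \left(\frac{\Delta x}{\Delta t}\right)_{t_i} = \frac{c(t_i + \Delta t)^2 - ct_i^2}{\Delta t} = c\left[\frac{t_i^2 + 2t_i\,\Delta t + (\Delta t)^2 - t_i^2}{\Delta t}\right] = c(2t_i + \Delta t) \]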


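Then take the limit as $\Delta t \to 0$. You get

\[ \lim_{\Delta t \to 0} \left(\frac{\Delta x}{\Delta t}\right)_{t_i} = \lim_{\Delta t \to 0} c(2t_i + \Delta t) = 2ct_i \]

Thus

\[ \left(\frac{dx}{dt}\right)_{t_i} = 2ct_i \qquad \text{for } x(t) = ct^2 \text{ and constant } c \tag{2-12a} \]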

This result is very useful because it applies to any value of c and any value of $t_i$.
To make a comparison with the specific result of Example 2-5, set c = 4.90 m/s$^2$
and $t_i$ = 1 s. You find

\[ \left(\frac{dx}{dt}\right)_{t_i = 1\ \mathrm{s}} = 2 \times 4.90\ \mathrm{m/s^2} \times 1\ \mathrm{s} = 9.80\ \mathrm{m/s} \]

in  agreement  with  what  was  found  from  using  the  numerical  method  for  evaluating 
the  derivative. 

Since Eq. (2-12a) is valid for any time $t_i$, the specific time $t_i$ at which dx/dt is
evaluated can be dropped from the left side of the equation if the subscript i is deleted
from the quantity t on the right side. By doing this, and also substituting $ct^2$ for x on
the left side, the equation assumes the more compact and useful form

\[ \frac{d(ct^2)}{dt} = 2ct \qquad \text{for constant } c \tag{2-12b} \]


Equation (2-12b) leads to a particular example of an important general
rule for the derivative of an expression containing a constant factor such as
c. Since the equation is valid for any value of c, it must be valid for the special
case c = 1. In this case we can write

\[ \frac{d(t^2)}{dt} = 2t \tag{2-12c} \]

Using Eq. (2-12c) to substitute $d(t^2)/dt$ for 2t in Eq. (2-12b), we have

\[ \frac{d(ct^2)}{dt} = c\,\frac{d(t^2)}{dt} \qquad \text{for constant } c \]

This is the particular example of the general rule.

The rule itself applies to any function of t, not just $t^2$. That is, for
any f(t)

\[ \frac{d[cf(t)]}{dt} = c\,\frac{d[f(t)]}{dt} \qquad \text{for constant } c \tag{2-13} \]


The validity of this rule can be seen in an intuitive way by using the relation
between derivatives and slopes to interpret its meaning in terms of slopes.
In these terms the rule says simply that for any value of the independent
variable t, the slope of the curve plotted for cf(t) is c times as steep as the
slope of the curve plotted for f(t).
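
In symbols, the slope statement can be traced back to the definition of Eq. (2-9).
The lines below are a sketch of that argument rather than a rigorous proof; the
one assumption made is that the constant factor c, which does not depend on
$\Delta t$, may be taken outside the limit.

```latex
\begin{align*}
\left\{\frac{d[cf(t)]}{dt}\right\}_{t_i}
  &= \lim_{\Delta t \to 0} \frac{cf(t_i + \Delta t) - cf(t_i)}{\Delta t} \\
  &= \lim_{\Delta t \to 0} c\,\frac{f(t_i + \Delta t) - f(t_i)}{\Delta t} \\
  &= c \lim_{\Delta t \to 0} \frac{f(t_i + \Delta t) - f(t_i)}{\Delta t}
   = c \left\{\frac{d[f(t)]}{dt}\right\}_{t_i}
\end{align*}
```

Since the result holds at any time $t_i$, the subscript can be dropped, just as it
was in going from Eq. (2-12a) to Eq. (2-12b).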

You can make a formal proof of Eq. (2-13) by following the procedure
of Example 2-6. In the same way you can prove two other important general
rules:

\[ \frac{d[f(t) + g(t)]}{dt} = \frac{d[f(t)]}{dt} + \frac{d[g(t)]}{dt} \tag{2-14} \]

and

\[ \frac{d[f(t)g(t)]}{dt} = f(t)\,\frac{d[g(t)]}{dt} + g(t)\,\frac{d[f(t)]}{dt} \tag{2-15} \]

where f(t) and g(t) are any two functions of t.
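
Before attempting the proofs, you may find it reassuring to test Eqs. (2-14) and
(2-15) numerically. The sketch below does this for two functions picked purely as
illustrations, $f(t) = t^2$ and $g(t) = \sin(3t)$. The helper named derivative is an
assumption of this sketch: it uses a symmetric (central) difference with a small
fixed step instead of the one-sided quotient of Eq. (2-9), because that gives a
closer estimate for the same step size.

```python
import math


def derivative(func, t, h=1e-6):
    """Central-difference estimate of d(func)/dt at t; an approximation, not an exact limit."""
    return (func(t + h) - func(t - h)) / (2 * h)


def f(t):
    """Illustrative choice of f(t)."""
    return t**2


def g(t):
    """Illustrative choice of g(t)."""
    return math.sin(3.0 * t)


def check_rules(t):
    """Return (left side, right side) pairs for Eq. (2-14) and Eq. (2-15) at time t."""
    df = derivative(f, t)
    dg = derivative(g, t)
    sum_pair = (derivative(lambda s: f(s) + g(s), t), df + dg)
    product_pair = (derivative(lambda s: f(s) * g(s), t), f(t) * dg + g(t) * df)
    return sum_pair, product_pair


if __name__ == "__main__":
    for t in (0.3, 1.0, 2.5):
        (sum_left, sum_right), (prod_left, prod_right) = check_rules(t)
        print(f"t = {t}: sum rule {sum_left:.6f} vs {sum_right:.6f}, "
              f"product rule {prod_left:.6f} vs {prod_right:.6f}")
```

Agreement at a handful of times is evidence, not proof; the proofs themselves
follow the pattern of Example 2-6.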

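Following Example 2-6, you can prove each of the relations

\[ \frac{d(t^2)}{dt} = 2t^1 \qquad \frac{d(t^1)}{dt} = 1t^0 \qquad \frac{d(t^0)}{dt} = 0\,t^{-1} = 0 \]

\[ \frac{d(t^{-1})}{dt} = -1\,t^{-2} \qquad \frac{d(t^{-2})}{dt} = -2\,t^{-3} \]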



The first relation is just Eq. (2-12c). In the second and third relations $t^0$ =
1. Written in these consistent forms, the general pattern is easy to discern:

\[ \frac{d(t^n)}{dt} = nt^{n-1} \tag{2-16} \]


Any  calculus  text  will  prove  that  this  is  valid  for  all  values  of  n,  integral  or 
nonintegral. 
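
A numerical spot check of Eq. (2-16), including nonintegral exponents, can be made
with the quotient of Eq. (2-9) and a single small value of $\Delta t$. In the sketch
below the evaluation time t = 2 and the step $\Delta t = 10^{-6}$ are arbitrary
choices, and the function names are invented for this illustration.

```python
def forward_quotient(func, t, dt):
    """[func(t + dt) - func(t)] / dt, as in Eq. (2-9) before the limit is taken."""
    return (func(t + dt) - func(t)) / dt


def power_rule(n, t):
    """Right side of Eq. (2-16): n * t**(n - 1)."""
    return n * t ** (n - 1)


def compare(n, t=2.0, dt=1e-6):
    """Return (numerical estimate, power-rule value) for the derivative of t**n at t."""
    numeric = forward_quotient(lambda s: s**n, t, dt)
    return numeric, power_rule(n, t)


if __name__ == "__main__":
    for n in (2, 1, 0, -1, -2, 0.5, 1.5):
        numeric, exact = compare(n)
        print(f"n = {n:>4}: quotient = {numeric:.6f}   n t^(n-1) = {exact:.6f}")
```

The two columns differ only by an amount of the order of the step $\Delta t$, which
is what you would expect of a quotient that has not yet been carried to the limit.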


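Two other derivatives, which we will soon find useful, are

\[ \frac{d[\sin(\omega t)]}{dt} = \omega \cos(\omega t) \qquad \text{for constant } \omega \tag{2-17} \]

and

\[ \frac{d[\cos(\omega t)]}{dt} = -\omega \sin(\omega t) \qquad \text{for constant } \omega \tag{2-18} \]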


(The symbol $\omega$ is the Greek letter omega.) The quantity $\omega t$, whose sine or
cosine is being taken, is an angle expressed in radians. The radian measure
of an angle is the arc length intercepted by the angle on a circle whose
center is at the apex of the angle, divided by the length of the circle's
radius. Being a length divided by a length, the angle is unitless. Since $\omega t$
must be unitless, and since the units of t are seconds, the units of $\omega$ must be
reciprocal seconds (s$^{-1}$).
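
Equations (2-17) and (2-18) can be spot-checked in the same numerical spirit as
Example 2-5. The sketch below assumes the illustrative values $\omega$ = 2 s$^{-1}$ and
t = 0.4 s, neither of which has any special significance, and it relies on Python's
math.sin and math.cos, which take their arguments in radians.

```python
import math


def forward_quotient(func, t, dt):
    """[func(t + dt) - func(t)] / dt, the quotient of Eq. (2-9) for one value of dt."""
    return (func(t + dt) - func(t)) / dt


def check_sine_and_cosine(omega, t, dt=1e-6):
    """Return (numerical, analytical) pairs for Eq. (2-17) and Eq. (2-18)."""
    d_sin = forward_quotient(lambda s: math.sin(omega * s), t, dt)
    d_cos = forward_quotient(lambda s: math.cos(omega * s), t, dt)
    sine_pair = (d_sin, omega * math.cos(omega * t))
    cosine_pair = (d_cos, -omega * math.sin(omega * t))
    return sine_pair, cosine_pair


if __name__ == "__main__":
    sine_pair, cosine_pair = check_sine_and_cosine(omega=2.0, t=0.4)
    print("d[sin(wt)]/dt:", sine_pair)
    print("d[cos(wt)]/dt:", cosine_pair)
```

Note that the factor $\omega$ in front of the cosine and sine matters: with
$\omega$ = 2 s$^{-1}$ the derivatives are twice as large as they would be for
$\omega$ = 1 s$^{-1}$ at the same value of $\omega t$.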


EXAMPLE  2-7 


Derive  Eq.  (2-17). 

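■ Applying the definition of Eq. (2-9), you obtain

\[ \left\{\frac{d[\sin(\omega t)]}{dt}\right\}_{t_i} = \lim_{\Delta t \to 0} \left[\frac{\sin(\omega t_i + \omega\,\Delta t) - \sin(\omega t_i)}{\Delta t}\right] \]

Using the trigonometric identity for the sine of the sum of two angles, you get

\[ \left\{\frac{d[\sin(\omega t)]}{dt}\right\}_{t_i} = \lim_{\Delta t \to 0} \left[\frac{\sin(\omega t_i)\cos(\omega\,\Delta t) + \cos(\omega t_i)\sin(\omega\,\Delta t) - \sin(\omega t_i)}{\Delta t}\right] \]

For the very small values of $\Delta t$ that you will have when taking the limit as $\Delta t \to 0$,
the values of $\omega\,\Delta t$ will be very small compared to 1. This suggests that you make use
of the relations

\[ \cos(\omega\,\Delta t) \simeq 1 \qquad \text{for } \omega\,\Delta t \ll 1 \]

and

\[ \sin(\omega\,\Delta t) \simeq \omega\,\Delta t \qquad \text{for } \omega\,\Delta t \ll 1 \]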


[The first of these makes sense since cos 0 = 1. The second may be new to you. If
so, verify it by setting a calculator to the radian mode and evaluating the sine of 1
radian (rad), 0.1 rad, 0.01 rad, and 0.001 rad. Then for each case compare the sine
of the angle with the angle itself. Can you use the definitions of the sine and of the
radian to explain your observations?] These relations allow you to express the quantity
whose limit must be calculated by the approximation

\[ \frac{\sin(\omega t_i) + \cos(\omega t_i)(\omega\,\Delta t) - \sin(\omega t_i)}{\Delta t} = \omega \cos(\omega t_i) \]
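
The calculator exercise suggested in the bracketed remark can also be run as a short
Python sketch. The function name small_angle_table is made up for this illustration;
the four angles are the ones named in the remark.

```python
import math


def small_angle_table(angles=(1.0, 0.1, 0.01, 0.001)):
    """Rows of (angle, sin(angle), cos(angle)) for angles given in radians."""
    return [(theta, math.sin(theta), math.cos(theta)) for theta in angles]


if __name__ == "__main__":
    for theta, sine, cosine in small_angle_table():
        print(f"{theta:7.3f} rad   sin = {sine:.9f}   "
              f"sin/angle = {sine / theta:.9f}   cos = {cosine:.9f}")
```

The column sin/angle moves toward 1 and the column cos moves toward 1 as the angle
shrinks, which is the content of the two small-angle relations.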


As $\Delta t$ (and therefore $\omega\,\Delta t$) becomes smaller, the approximation becomes more
accurate. In the limit $\Delta t \to 0$ it is exact. Thus taking the limit is a matter of replacing the
quantity by $\omega \cos(\omega t_i)$. Doing this, you have


\[ \left\{\frac{d[\sin(\omega t)]}{dt}\right\}_{t_i} = \omega \cos(\omega t_i) \]

This equation can be written as

\[ \left\{\frac{d[\sin(\omega t)]}{dt}\right\}_{t_i} = [\omega \cos(\omega t)]_{t_i} \]

Since the equation is valid for any value of time $t_i$, the $t_i$ can be deleted from both
sides. Then it becomes Eq. (2-17). A procedure similar to the one used here will
yield Eq. (2-18).


The formulas displayed in this section, and a very few more that will be
introduced as needed, allow you to evaluate analytically the derivative of
any of the functions dealt with in this book.


2-6  ACCELERATION


Let  us  return  to  physics  and  to  the  task  of  describing  motion  in  one  dimen- 
sion. When  the  velocity  of  a particle  is  changing,  it  is  said  to  be  accelerating. 
In  Sec.  1-3  we  considered  qualitatively  several  examples  of  acceleration  in 
one-dimensional  motion,  for  instance  the  motion  of  a falling  object.  There 
we  also  saw  that  acceleration  plays  an  extremely  important  role  in  new- 
tonian  mechanics.  Now  we  consider  a quantitative  way  of  evaluating  acceler- 
ation for  one-dimensional  motion. 

By  definition,  the  acceleration  is  the  derivative  of  velocity  with  respect  to  time. 

Using the symbol a for acceleration, we can write the definition

\[ a = \frac{dv}{dt} \tag{2-19} \]

Since the velocity v is related to the position x and time t by the definition

\[ v = \frac{dx}{dt} \]

an expression for a, equivalent to Eq. (2-19), is

\[ a = \frac{d}{dt}\left(\frac{dx}{dt}\right) \tag{2-20} \]


The right side of Eq. (2-20) is the derivative with respect to t of the derivative
of x with respect to t; in other words, the instantaneous rate of change
with respect to t of the instantaneous rate of change of x with respect to t. It
is called a second derivative, and in the notation of differential calculus it is
written in the abbreviated form

\[ \frac{d}{dt}\left(\frac{dx}{dt}\right) = \frac{d^2x}{dt^2} \tag{2-21} \]


Thus we can also say that the acceleration is the second derivative of position with
respect to time:

\[ a = \frac{d^2x}{dt^2} \tag{2-22} \]

(Note that just as dx does not mean "d times x," the symbol $d^2x$ does not
mean "$d^2$ times x." Nor does $dt^2$ mean "d times $t^2$.")


The definition of Eq. (2-19) can be expressed in geometrical terms by
saying the acceleration at a particular instant is given by the slope of the velocity
versus time curve at that instant. In similar terms, Eq. (2-20) can be stated: the
acceleration at a particular instant is given by the rate of change of the slope of the
position versus time curve at that instant.

In  the  metric  system  the  unit  for  measuring  the  magnitude  of  accelera- 
tion is  meters  per  second  per  second,  since  it  measures  the  change  per  sec- 
ond of  the  velocity,  which  is  measured  in  meters  per  second.  A shorter  way 
of  expressing  the  unit  is  meters  per  second  squared  (m/s2). 

Just  as  a velocity  has  both  a magnitude  and  a direction,  so  does  an 
acceleration.  And  just  as  is  the  case  for  velocity,  the  direction  of  an  accelera- 
tion in  one-dimensional  motion  is  specified  by  its  sign.  A positive  value  of  a 
means  that  the  direction  of  the  acceleration  is  the  same  as  that  chosen  to  be 
the  direction  from  the  coordinate  origin  to  positive  values  of  x.  In  particu- 
lar, if  an  object  is  moving  in  the  positive  x direction  so  that  the  sign  of  v is 
positive,  and  if  the  magnitude  of  v is  increasing,  then  its  acceleration  is  in 
the  positive  x direction  and  a is  positive.  If  the  sign  of  v is  positive  but  its 
magnitude  is  decreasing,  then  the  acceleration  of  the  object  is  in  the  nega- 
tive x direction  and  a is  negative.  What  would  you  expect  for  the  sign  of  a if 
v is  negative  and  its  magnitude  is  increasing?  What  is  it  if  v is  negative  and 
its  magnitude  is  decreasing?  (These  questions  are  answered  later  in  this  sec- 
tion.) 

Acceleration  is  a quantity  of  fundamental  interest  in  mechanics  be- 
cause it  is  the  effect  that  is  directly  related  to  the  cause — force.  Example 
2-8  will  give  you  some  experience  in  calculating  acceleration. 

EXAMPLE  2-8

In Examples 2-5 and 2-6 the expression

\[ x(t) = (4.90\ \mathrm{m/s^2})\,t^2 \tag{2-23} \]

was used to represent the motion of an object falling freely near the surface of the
earth after it is released from rest at x = 0 m and t = 0 s, with positive values of x
measured downward. Calculate the acceleration of the object.

■ First, evaluate the velocity v = dx/dt. You have

\[ v = \frac{dx}{dt} = \frac{d[(4.90\ \mathrm{m/s^2})\,t^2]}{dt} \]

You can use Eq. (2-13) to remove the constant factor, 4.90 m/s$^2$, from the derivative.
This gives you

\[ v = (4.90\ \mathrm{m/s^2})\,\frac{d(t^2)}{dt} \]

You can now use Eq. (2-16), with n set equal to 2, to evaluate the derivative of $t^2$.
You obtain

\[ v = (4.90\ \mathrm{m/s^2})\,2t \]

Simplifying, and indicating explicitly that v depends on t, you obtain

\[ v(t) = (9.80\ \mathrm{m/s^2})\,t \tag{2-24} \]

Then you evaluate the acceleration:

\[ a = \frac{dv}{dt} = \frac{d[(9.80\ \mathrm{m/s^2})\,t]}{dt} \]

Using Eq. (2-13) again, you have

\[ a = (9.80\ \mathrm{m/s^2})\,\frac{dt}{dt} \]

Setting dt/dt equal to 1, in agreement both with Eq. (2-16) for n = 1 and with
common sense, you obtain

\[ a = 9.80\ \mathrm{m/s^2} \tag{2-25} \]

Thus,  you  can  see  from  this  result  that  if  the  falling  object  obeys  the  relation 
between  the  distance  of  fall  x and  time  of  fall  t given  by  Eq.  (2-23),  it  must  be  falling 
with  the  constant  acceleration  a = 9.80  m/s2.  The  value  of  a is  positive  because  the 
acceleration  is  in  the  direction  that  has  been  chosen  to  be  the  positive  x direction, 
namely,  downward. 
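
The result of Example 2-8 can be cross-checked numerically without using any of the
differentiation rules. The sketch below estimates v and a directly from x(t) with
symmetric differences; the step h = 0.001 s and the function names are assumptions
of this illustration, not part of the example.

```python
def x_free_fall(t):
    """Eq. (2-23) with units stripped: x in m, t in s."""
    return 4.90 * t**2


def velocity_estimate(x, t, h=1e-3):
    """Symmetric difference quotient approximating dx/dt at time t."""
    return (x(t + h) - x(t - h)) / (2 * h)


def acceleration_estimate(x, t, h=1e-3):
    """Second difference approximating the second derivative of x at time t."""
    return (x(t + h) - 2 * x(t) + x(t - h)) / h**2


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0, 3.0):
        v = velocity_estimate(x_free_fall, t)
        a = acceleration_estimate(x_free_fall, t)
        print(f"t = {t:.1f} s   v = {v:.4f} m/s   a = {a:.4f} m/s^2")
```

The velocity estimate grows in proportion to t, as in Eq. (2-24), while the
acceleration estimate stays at the same value for every t, as in Eq. (2-25).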

The  strobe  photo  shown  in  Fig.  2-14  and  the  analysis  of  the  identical  pho- 
tograph outlined  in  the  caption  to  Fig.  1-6  provide  an  experimental  demonstration 
that  the  motion  of  an  object  falling  with  negligible  air  resistance  actually  is  a case  of 
motion  with  constant  acceleration.  Thus  the  proportionality  between  x and  t2  of  Eq. 
(2-23)  is  physically  correct— that  is,  it  conforms  to  what  actually  happens— since 
this  is  what  leads  to  the  constant  acceleration  of  Eq.  (2-25).  Whether  or  not  the  nu- 
merical value  of  the  factor  4.90  m/s2  connecting  x and  t2  is  correct  must  also  be  set- 
tled by  experiment.  Later  in  this  chapter  you  will  see  experimental  evidence 
demonstrating  that  the  value  of  the  acceleration  actually  is  a = 9.80  m/s2,  and  con- 
sequently that  the  numerical  factor  quoted  in  Eq.  (2-23)  is  correct. 


Additional experience in calculating acceleration will be obtained by
considering again the motion of a pendulum bob oscillating between positions
whose separation is small compared to the length of the cord from
which the bob is suspended. Figure 2-21 plots the position, velocity, and
acceleration of the bob. The position curve is just like the one in Fig. 2-20b,
except that the numerical values are different. An equation giving the position
of a pendulum bob executing small oscillations is

\[ x(t) = c \sin(\omega t) \tag{2-26} \]


Fig.  2-21  The  position  x,  velocity  v,  and  acceleration  a, 
plotted  versus  the  time  t,  for  one  cycle  of  the  motion  of  a bob  at 
the  end  of  a long  pendulum  cord. 




For the particular case illustrated in Fig. 2-21, the constant c has the numerical
value 0.1 m and the constant $\omega$ has the numerical value 2 s$^{-1}$. The
top part of the figure is just a plot of Eq. (2-26) for these values. {To verify
this, check a few points. For t = 0 s the equation gives x = (0.1 m)(sin 0) =
0 m. For t = $\pi/4$ s, it gives x = (0.1 m)[sin(2 s$^{-1}$ × $\pi/4$ s)] = (0.1 m) ×
(sin $\pi/2$) = 0.1 m. For t = $\pi/2$ s, it gives x = (0.1 m)[sin(2 s$^{-1}$ × $\pi/2$ s)] =
(0.1 m)(sin $\pi$) = 0 m.} The middle and lower parts of Fig. 2-21 are plots of
equations that will be obtained in Example 2-9 from Eq. (2-26) by differentiating
x(t) to obtain v(t) and then by differentiating v(t) to obtain a(t).


EXAMPLE  2-9

Evaluate  the  velocity,  and  then  the  acceleration,  for  the  pendulum  bob  whose  posi- 
tion is  given  by  Eq.  (2-26). 

■ To determine the velocity, use its definition and Eq. (2-26) to obtain

\[ v = \frac{dx}{dt} = \frac{d[c \sin(\omega t)]}{dt} \]

Remembering that c is a constant, you next use Eq. (2-13) and get

\[ v = c\,\frac{d[\sin(\omega t)]}{dt} \]

Since $\omega$ is also a constant, Eq. (2-17) is applicable, and it gives you

\[ v(t) = c\omega \cos(\omega t) \tag{2-27} \]

The notation makes explicit the fact that v depends on t. For c = 0.1 m and $\omega$ = 2 s$^{-1}$,
this result is in agreement with the curve plotted in Fig. 2-21.

You determine the acceleration by using the expression for v(t) just obtained in
the definition:

\[ a = \frac{dv}{dt} = \frac{d[c\omega \cos(\omega t)]}{dt} \]

Again employing Eq. (2-13), you have

\[ a = c\omega\,\frac{d[\cos(\omega t)]}{dt} \]

Then Eq. (2-18) is used to yield

\[ a(t) = -c\omega^2 \sin(\omega t) \tag{2-28} \]

For c = 0.1 m and $\omega$ = 2 s$^{-1}$, this result is also in agreement with the curve plotted
in Fig. 2-21.
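
To see the numbers behind the three curves, Eqs. (2-26), (2-27), and (2-28) can be
evaluated at a few times. The sketch below uses the values quoted for Fig. 2-21,
c = 0.1 m and $\omega$ = 2 s$^{-1}$; the constant names C and OMEGA and the choice of
sample times are assumptions made for this illustration.

```python
import math

C = 0.1      # amplitude c in m, the value quoted for Fig. 2-21
OMEGA = 2.0  # constant omega in 1/s, the value quoted for Fig. 2-21


def x(t):
    """Position, Eq. (2-26)."""
    return C * math.sin(OMEGA * t)


def v(t):
    """Velocity, Eq. (2-27)."""
    return C * OMEGA * math.cos(OMEGA * t)


def a(t):
    """Acceleration, Eq. (2-28)."""
    return -C * OMEGA**2 * math.sin(OMEGA * t)


if __name__ == "__main__":
    for label, t in (("0", 0.0), ("pi/4", math.pi / 4), ("pi/2", math.pi / 2)):
        print(f"t = {label:>4} s   x = {x(t):+.3f} m   "
              f"v = {v(t):+.3f} m/s   a = {a(t):+.3f} m/s^2")
```

Because floating-point arithmetic is used, values that are exactly zero in the
equations may print as +0.000 or -0.000; only the magnitudes are significant.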


The relation between the x(t) and v(t) curves in Fig. 2-21 was explained
qualitatively in Example 2-4. The explanation used the fact, emphasized in
Sec. 2-4, that at any t the slope of the x(t) curve equals the value of the v(t)
curve, because dx/dt = v. Earlier in this section we pointed out a similar
explanation for the relation between the v(t) and a(t) curves. The relation
depends on the fact that at any t the slope of the v(t) curve equals the value of
the a(t) curve, since dv/dt = a. For examples of both relations, consider Fig.
2-21 at t = 0 s. At this instant x(t) has a maximum positive slope and v(t) has
a maximum positive value. Also v(t) has zero slope, and a(t) has zero value.
At t = $\pi/4$ s, x(t) has zero slope and v(t) has zero value. Also, v(t) has its
most negative slope and a(t) has its most negative value. You can continue
the analysis yourself.




Now we will consider a direct relation between the curves for x(t) and
a(t). At any t the rate of change of slope of the x(t) curve equals the value of
the a(t) curve, since d(dx/dt)/dt = a. For an example, look at the x(t) curve
of Fig. 2-21 in the interval from t = 0 s to t = $\pi/2$ s. The slope of x(t) is
always becoming less positive, or more negative. The rate of change of
slope is therefore negative. This is why a(t) is negative in the interval. For
the interval from t = $\pi/2$ s to t = $\pi$ s, the rate of change of slope of x(t) is
positive because the slope is always becoming less negative, or more positive.
As a consequence, a(t) is positive in the interval.

In the interval from t = 0 s to t = $\pi/2$ s, the x(t) curve is said to be concave
downward. From t = $\pi/2$ s to t = $\pi$ s, it is said to be concave upward.
In these terms, the relation between x(t) and a(t) can be expressed by saying
the direction of curvature of the x(t) plot determines the sign of a(t). A concave
upward x(t) means a positive a(t), and a concave downward x(t) means a negative
a(t).

Furthermore, the magnitude of curvature of the x(t) plot determines the magnitude
of a(t). That is, the more sharply the x(t) plot curves, the larger the
magnitude of a(t). You can see the truth of this statement in Fig. 2-21. Note
that in the vicinity of t = $\pi/4$ s the x(t) plot curves most sharply and a(t) has
a maximum magnitude. Near t = $\pi/2$ s, x(t) has minimum curvature and
a(t) has minimum magnitude.

In  subsequent  chapters  you  will  see  that  these  geometric  relations 
between  the  x(t),  v(t),  and  a(t)  curves  can  help  you  obtain  an  intuitive  under- 
standing of  the  behavior  of  systems  governed  by  newtonian  mechanics. 
You  will  also  see  that  very  similar  analyses  can  fruitfully  be  employed  on 
other  systems  too. 


2-7  VELOCITY AND POSITION FOR CONSTANT ACCELERATION


In  Examples  2-8  and  2-9  the  position  x of  an  object  was  expressed  as  a 
function  of  the  time  t by  quoting  the  mathematical  function  relating  x to  t. 
Then  its  velocity  v was  evaluated  by  calculating  the  derivative  of  x with 
respect  to  t.  Finally,  the  acceleration  a of  the  object  was  evaluated  by  calcu- 
lating the  derivative  of  v with  respect  to  t.  No  matter  how  an  object  moves 
in  one  dimension,  if  x is  known  as  a function  of  t,  it  is  always  possible  to  go 
from  x to  v to  a by  two  consecutive  differentiations  with  respect  to  t. 

But the conclusion of Sec. 1-3 implies that in newtonian mechanics the
process must generally be carried out in the inverse order, that is, from a to
v to x. The point is that in newtonian mechanics force is related to acceleration.
In analyzing a mechanical system, the net force acting on an object will
usually be determined first. Then Newton's laws of motion will be applied
to evaluate the acceleration of the object from the force. To use this information
in making a prediction of the object's motion that can be compared
with experiment, v and then x must be determined from a.

Is there a way to go from a to v to x? Yes, by a process called integration.
Integration is the mathematical inverse of differentiation. It is generally a
more difficult task than differentiation. In fact, there are cases of great
physical interest where integration cannot be performed by an analytical
method and a more cumbersome (though conceptually simple)
numerical method must be used. (Some idea of what is meant by the distinction
between numerical and analytical methods of integration can be
obtained by referring to Examples 2-5 and 2-6. Although they involved differentiation,
not integration, the first one used a numerical method and the
second used an analytical method.) A simple numerical method, carrying
out a process amounting to integration because it goes from a to v to x, is
introduced in Chap. 5. Considerable use will be made of this method, and
variations of it, throughout the book. Analytical integration is deferred as
long as possible, specifically until Chap. 7. The purpose is to allow you
more time to reach the topic of integration if you are concurrently studying
calculus. And in Chap. 7 we present a self-contained, albeit concise, treatment
of integration, just as we have done for differentiation in this chapter.
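
To give a rough idea of what "going from a to v to x" numerically can look like,
here is a minimal Python sketch. It is not the method of Chap. 5, which is not
described in this section; it is simply one plausible stepping rule, assumed here
for illustration, in which time is advanced in many small steps and v and x are
updated at each step. The function name integrate_motion and its parameters are
invented for this sketch.

```python
def integrate_motion(a, x_i, v_i, t_end, steps):
    """Step from acceleration to velocity to position using many small time steps.

    a      : function giving the acceleration at time t
    x_i    : position at t = 0
    v_i    : velocity at t = 0
    t_end  : time at which the position and velocity are wanted
    steps  : number of equal time steps (more steps give a closer estimate)
    """
    dt = t_end / steps
    t, x, v = 0.0, x_i, v_i
    for _ in range(steps):
        x += v * dt       # position changes by (velocity) x (time step)
        v += a(t) * dt    # velocity changes by (acceleration) x (time step)
        t += dt
    return x, v


if __name__ == "__main__":
    for steps in (10, 100, 1000, 100000):
        x, v = integrate_motion(lambda t: 9.80, 0.0, 0.0, 2.0, steps)
        print(f"steps = {steps:>6}   x = {x:.5f} m   v = {v:.5f} m/s")
```

For free fall from rest the estimate of x creeps toward 19.60 m at t = 2 s as the
number of steps grows, in agreement with the value obtained from Eq. (2-10) in
Example 2-5.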

Fortunately,  if  an  object  moving  in  one  dimension  has  constant  accelera- 
tion, as  in  free  fall  and  many  other  important  situations,  there  happens  to 
be  a very  easy  way  to  go  from  a to  v to  x.  In  fact,  it  is  not  necessary  to  use  ex- 
plicitly any  of  the  methods  of  calculus  in  the  argument. 

Here  is  the  argument.  Consider  an  object  that  is  released  from  an  ini- 
tial position  where  x = 0 at  an  initial  time  when  t = 0 with  an  initial  velocity 
v = 0.  The  object  released  from  rest  then  falls  freely  downward  with  a con- 
stant acceleration  a.  Positive  values  of  x are  in  the  downward  direction,  so 
the  same  is  true  of  v and  a.  Since  a is  constant,  with  the  passage  of  time  v in- 
creases steadily  from  its  initial  value  zero  at  a constant  rate  a.  Therefore  at  a 
subsequent  time  t the  value  of  v is 

v = at 

Furthermore, since v does increase at a constant rate from zero as t increases,
the average velocity $\langle v \rangle$ of the body over the interval beginning at
time zero and ending at time t will equal one-half the final velocity v. (If this
is not apparent, look at Fig. 2-22a, which is a plot of v = at, and read the
explanatory caption.) Thus, for constant acceleration

\[ \langle v \rangle = \frac{v}{2} \]

Using v = at, we obtain

\[ \langle v \rangle = \frac{at}{2} \]

Fig. 2-22 (a) Velocity v versus time t for an object starting from rest and moving with constant
acceleration. Because the value of the velocity increases uniformly from zero as time passes, the
average velocity $\langle v \rangle$ over a time interval zero to t is exactly one-half its instantaneous value v at
the end of the interval. That is, $\langle v \rangle$ = v/2. A more detailed explanation of why this is so follows.
If the velocity is sampled at each of a set of times distributed uniformly over the interval zero
to t, the average of these velocities (that is, the sum of their values divided by the number of
values being summed) will equal $\langle v \rangle$. The uniformly distributed times are indicated by the
ticks on the time axis. For each of these the corresponding value of velocity is indicated by a
tick on the velocity axis. Since the relation between velocity and time is given by the straight line
plotting v(t) for the case of constant acceleration, the ticks on the velocity axis are also uniformly
distributed. As a consequence, when the average velocity $\langle v \rangle$ is computed, its value will be
exactly at the center of the range zero to v. Thus $\langle v \rangle$ equals v/2, the value at the center of this
range. The argument can be further clarified by considering the counterexample illustrated
in the other part of this figure. (b) Here $\langle v \rangle$ is not equal to v/2 because the acceleration is not
constant. The example illustrated is one in which the acceleration gradually becomes more
positive through the time interval since the slope of v(t) is gradually becoming more positive.
(This is the case for a jet plane preparing to take off.) In these circumstances the ticks along the
velocity axis are concentrated in the lower part of the range zero to v, and so $\langle v \rangle$ is less than
v/2. You should sketch a figure illustrating an example in which the acceleration is gradually
becoming less positive.

Fig. 2-23 The position x, velocity v, and acceleration a of an object falling from
rest at x = 0 with negligible air resistance, plotted versus the elapsed time t.
All three quantities pertain to motion along a vertically oriented axis, whose
positive direction is downward.
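
The sampling argument given in the caption to Fig. 2-22 can be tried numerically.
In the sketch below, velocities are sampled at uniformly spaced times and averaged.
The straight-line case uses a = 9.8 m/s$^2$; the curved case $v = t^2$ is a
hypothetical stand-in for part (b) of the figure, chosen only because its slope
grows with time. All function names are invented for this illustration.

```python
def average_of_samples(v, t, n=1001):
    """Average v over n times spread uniformly from 0 to t, both endpoints included."""
    times = [t * k / (n - 1) for k in range(n)]
    return sum(v(s) for s in times) / n


def v_constant_a(s, a=9.8):
    """Velocity for constant acceleration starting from rest: v = a t."""
    return a * s


def v_increasing_a(s):
    """Hypothetical velocity whose slope (the acceleration) grows with time."""
    return s**2


if __name__ == "__main__":
    t = 4.0
    for name, v in (("constant a", v_constant_a), ("increasing a", v_increasing_a)):
        print(f"{name:>12}: <v> = {average_of_samples(v, t):.4f}   v/2 = {v(t) / 2:.4f}")
```

For the straight-line case the two numbers agree; for the curved case the average
falls below v/2, as the caption predicts.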

But the total change in position is always the average velocity multiplied by
the time interval, as stated in Eq. (2-8):

\[ \Delta x = \langle v \rangle\,\Delta t \]

Since the values of both position and time are zero at the beginning of the
time interval, their values x and t at the end of the interval give the changes
$\Delta x$ and $\Delta t$ in the values. Thus $\Delta x$ = x and $\Delta t$ = t, and we have from Eq.
(2-8) the result

\[ x = \langle v \rangle t = \frac{at}{2}\,t \]

or

\[ x = \frac{at^2}{2} \]

After specifying the initial values of a body's position and velocity (x =
0 and v = 0) and the initial value of the time (t = 0), we have used a statement
about its acceleration (a = constant) to find an expression for its
velocity (v = at) and then an expression for its position (x = $at^2/2$). Thus we
have gone from a to v to x, for this case of constant a. The time dependences
of the quantities a, v, and x are shown in Fig. 2-23. A numerical example of
the argument used to relate these quantities is given in Table 2-1.
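
The chain of steps in the argument (from a to v, from v to the average velocity,
and from the average velocity to x) can be followed mechanically. The Python sketch
below builds one row per time for a = 9.8 m/s$^2$ and t = 0, 1, ..., 5 s, and
places the value of $at^2/2$ alongside for comparison; the function name table_rows
is an invention of this sketch.

```python
def table_rows(a=9.8, times=(0, 1, 2, 3, 4, 5)):
    """Return rows (t, v, <v>, x, a t^2 / 2) following the constant-acceleration argument."""
    rows = []
    for t in times:
        v = a * t           # velocity grows at the constant rate a
        v_avg = v / 2       # average velocity is half the final velocity
        x = v_avg * t       # change in position = average velocity x elapsed time
        rows.append((t, v, v_avg, x, a * t**2 / 2))
    return rows


if __name__ == "__main__":
    print("t (s)   v (m/s)   <v> (m/s)   x (m)   a t^2/2 (m)")
    for t, v, v_avg, x, check in table_rows():
        print(f"{t:>5}   {v:>7.1f}   {v_avg:>9.1f}   {x:>5.1f}   {check:>11.1f}")
```

The last two columns agree row by row, which is the point of the argument: the
position found from the average velocity is the same as $at^2/2$.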

Now we modify the argument to treat a more general case in which the
body experiencing constant acceleration has nonzero initial values of x and
v, which we designate as $x_i$ and $v_i$. But since we can usually zero the clock
used to measure time at the initial instant, the initial time is again taken to
be when t = 0. The subsequent values of position, velocity, and time are x,
v, and t, as before. The modified argument is as follows. Since a is constant,
v increases steadily from its initial value $v_i$ at the constant rate a. Therefore
at the time t the instantaneous value of v is given by the important equation

\[ v = v_i + at \qquad \text{for constant } a \text{ and } v = v_i \text{ at } t = 0 \tag{2-29} \]


Table 2-1

Numerical Example with a = 9.8 m/s² of Argument Leading to
the Equation x = at²/2, for x = 0 and v = 0 at t = 0

t         v = at      ⟨v⟩ = v/2    x = ⟨v⟩t    9.8t²/2 = at²/2
(in s)    (in m/s)    (in m/s)     (in m)      (in m)

0         0           0            0           0
1         9.8         4.9          4.9         4.9
2         19.6        9.8          19.6        19.6
3         29.4        14.7         44.1        44.1
4         39.2        19.6         78.4        78.4
5         49.0        24.5         122.5       122.5
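
The arithmetic behind Table 2-1 can also be checked with a short program. The following is only a sketch: the function name `table_row` and the output layout are illustrative choices, not part of the text. Each row is built in the same order as the argument, from v = at to the average velocity and then to the position, and the result is compared with at²/2.

```python
# Sketch: rebuild the rows of Table 2-1 for a body starting from rest at x = 0.
A = 9.8  # acceleration in m/s^2, the value used in Table 2-1


def table_row(t, a=A):
    """Return (t, v, v_avg, x_from_avg, x_from_formula) for a time t in seconds."""
    v = a * t                       # instantaneous velocity, v = at
    v_avg = v / 2                   # average velocity when v grows uniformly from zero
    x_from_avg = v_avg * t          # position from x = <v> t
    x_from_formula = a * t**2 / 2   # position from x = a t^2 / 2
    return t, v, v_avg, x_from_avg, x_from_formula


rows = [table_row(t) for t in range(6)]
for row in rows:
    print("{:>3d} {:>6.1f} {:>6.1f} {:>7.1f} {:>7.1f}".format(*row))
```

The last two columns agree in every row, which is the point of the table: multiplying the average velocity by the elapsed time gives the same position as at²/2.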

2-7  Velocity  and  Position  for  Constant  Acceleration  41 


Furthermore, since v does increase at a constant rate from the initial value
$v_i$ to the value v in the time interval from zero to t, the average velocity ⟨v⟩ is

$$\langle v \rangle = \frac{v_i + v}{2}$$

(How would you modify Fig. 2-21, and its caption, to prove this?) Using Eq.
(2-29) to evaluate v, we have

$$\langle v \rangle = \frac{v_i + v_i + at}{2}$$


It is still true that ⟨v⟩ relates the change in position Δx to the time interval
Δt by the equation Δx = ⟨v⟩ Δt, and that Δt = t. But in the present case the
initial value of the position is $x_i$, not zero. So Δx, the difference between the
subsequent and initial positions, has the value Δx = x − $x_i$. Therefore we
have

$$x - x_i = \langle v \rangle t = \left( v_i + \frac{at}{2} \right) t = v_i t + \frac{at^2}{2}$$

or

$$x = x_i + v_i t + \frac{at^2}{2} \qquad \text{for constant } a \text{ and } x = x_i,\ v = v_i \text{ at } t = 0 \tag{2-30}$$

Equation (2-30) is important because it is the general expression for
the position of an object in the frequently studied case of one-dimensional
motion with constant acceleration. Let us interpret each term on the right
side of the equation. The third term is the same as the single term at²/2,
obtained in the original argument, which gives the value of the position x at
time t for motion with constant acceleration a in a case when both the initial
position $x_i$ and the initial velocity $v_i$ are zero. The second term takes into
account the fact that if $v_i$ is not zero, this velocity acting over a time interval of
duration t will make an additional contribution $v_i t$ to the value of x. The first
term is present to account for the fact that if $x_i$ is not zero, its value must be
added into the value of x. Example 2-10 demonstrates the use of Eq. (2-30),
and also of Eq. (2-29), in a common situation.


EXAMPLE  2-10 


a. A car is traveling along a straight road at a speed of 71 km/h. Seeing a
traffic jam ahead, the driver applies the brakes for 2.3 s and reduces the speed to
47 km/h. Assuming the acceleration is constant during the braking period,
calculate its value.

b. If the driver continued to apply the brakes so as to maintain this
acceleration, what distance would be required to bring the car to a halt from the speed of
47 km/h?

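■ a. First you should convert the speeds from kilometers per hour to meters
per second. You have for the initial speed

$$71\ \text{km/h} = 71\,\frac{\text{km}}{\text{h}} \times \frac{10^3\ \text{m}}{1\ \text{km}} \times \frac{1\ \text{h}}{60\ \text{min}} \times \frac{1\ \text{min}}{60\ \text{s}} = 19.7\ \text{m/s}$$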


In power-of-ten notation, 71 km/h is 7.1 × 10¹ km/h and 19.7 m/s is 1.97 ×
10¹ m/s. So the velocity is expressed in meters per second to three significant
figures even though it comes from a value expressed in kilometers per hour to only
two significant figures. This means that the accuracy of the third significant figure is

42  Kinematics  in  One  Dimension 


doubtful. However, it is better to retain it at this stage than to round off and write
19.7 m/s as 20 m/s. The reason is that the value will be used in subsequent
calculations, and rounding off could impair their accuracy needlessly. In general, an extra
significant figure should be carried in all calculated numbers that will be used in
subsequent calculations. But when final results are obtained, they should be
rounded off to no more significant figures than are in the least accurate value used
to begin the calculations. (The number of significant figures in a final result can be
appreciably fewer than in any value entering a calculation if at some point two
nearly equal numbers are subtracted. For an example, consider 1.23 − 1.22 =
0.01.)

Converting  the  final  speed  from  kilometers  per  hour  to  meters  per  second,  as 
above,  you  find 


47  km/h  = 13.1  m/s 

Next choose the direction of motion of the car to be the positive direction, and
let time have the value zero at the instant that the braking period begins. The initial
and final velocities are then $v_i$ = 19.7 m/s, v = 13.1 m/s, and the final time is t =
2.3 s. These quantities are related to the acceleration a through Eq. (2-29):

$$v = v_i + at$$

Solving for a, you obtain

$$a = \frac{v - v_i}{t} = \frac{13.1\ \text{m/s} - 19.7\ \text{m/s}}{2.3\ \text{s}}$$

or

$$a = -2.87\ \text{m/s}^2$$

The acceleration is negative since the magnitude of the velocity is decreasing. That
is, the acceleration is in the direction opposite to the direction of motion. As far as
the answer to this part of the example is concerned, the value of a should be
rounded off to two significant figures and quoted as

$$a = -2.9\ \text{m/s}^2$$

But when a is used in the next part, all three significant figures should be retained.

b. You can calculate the distance required to bring the car to a stop from a
velocity of 47 km/h = 13.1 m/s, at an acceleration of −2.87 m/s², by first using Eq.
(2-29) again to calculate the time required and then calculating the distance traveled
in this time from Eq. (2-30). In the first step, you use a new choice for time zero,
taking it to be the instant when the velocity is 13.1 m/s. Equating the final velocity in
Eq. (2-29) to zero, you have

$$0 = v_i + at$$

or

$$t = -\frac{v_i}{a}$$

Setting $v_i$ = 13.1 m/s and a = −2.87 m/s², you find the value of time when the
final velocity is zero to be

$$t = -\frac{13.1\ \text{m/s}}{-2.87\ \text{m/s}^2} = 4.56\ \text{s}$$


In  the  next  step  you  use  Eq.  (2-30): 


$$x = x_i + v_i t + \frac{at^2}{2}$$




Measuring x from the point where the velocity is $v_i$ = 13.1 m/s, you have $x_i$ = 0.
Setting a = −2.87 m/s² and t = 4.56 s, you find

$$x = 13.1\ \text{m/s} \times 4.56\ \text{s} + \frac{-2.87\ \text{m/s}^2 \times (4.56\ \text{s})^2}{2} \tag{2-31}$$

or

$$x = 30\ \text{m}$$


This  is  the  distance  traveled  by  the  car  in  coming  to  rest  from  a velocity  of  47  km/h, 
quoted  to  two  significant  figures. 

Calculate  the  distance  the  car  would  travel  in  the  time  required  for  it  to  come  to 
rest,  if  it  continued  to  move  at  47  km/h  instead  of  having  a negative  acceleration. 
You  will  find  that  it  is  just  twice  as  large  as  the  distance  traveled  in  coming  to  rest. 
Can  you  explain  why? 
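
The steps of Example 2-10 can be traced numerically. The sketch below is one possible arrangement, with illustrative function names that do not come from the text. It assumes the rounded speeds 19.7 m/s and 13.1 m/s carried in the example as inputs, and it keeps full precision for the intermediate values of a and t, so the last digit of an intermediate result may differ slightly from the hand calculation.

```python
# Sketch of Example 2-10, starting from the rounded speeds used in the text.

def braking_acceleration(v_initial, v_final, duration):
    """Constant acceleration from Eq. (2-29): a = (v - v_i) / t."""
    return (v_final - v_initial) / duration


def stopping_time(v_initial, a):
    """Time at which the velocity reaches zero, from 0 = v_i + a t."""
    return -v_initial / a


def position(x_initial, v_initial, a, t):
    """Eq. (2-30): x = x_i + v_i t + a t^2 / 2."""
    return x_initial + v_initial * t + a * t**2 / 2


a = braking_acceleration(19.7, 13.1, 2.3)   # part a, in m/s^2
t_stop = stopping_time(13.1, a)             # part b, time to halt in s
x_stop = position(0.0, 13.1, a, t_stop)     # part b, distance to halt in m
x_coasting = 13.1 * t_stop                  # distance covered at constant 13.1 m/s

print(f"a = {a:.2f} m/s^2")
print(f"t_stop = {t_stop:.2f} s")
print(f"x_stop = {x_stop:.1f} m, x_coasting = {x_coasting:.1f} m")
```

The last line bears on the question posed at the end of the example: the distance covered at the constant speed in the same time comes out as twice the stopping distance.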


The calculation in Example 2-10 suggests that a useful general relation
can be obtained by solving Eq. (2-29) for t and then substituting the expression
obtained for t into Eq. (2-30). This is exactly what was done in the example
to evaluate x. But there it was done for a specific value of t, whereas
the general result will apply to any value of t. We have from Eq. (2-29)

$$t = \frac{v - v_i}{a}$$

Inserting this in Eq. (2-30), we obtain

$$x = x_i + v_i\,\frac{v - v_i}{a} + \frac{a}{2}\,\frac{(v - v_i)^2}{a^2}
    = x_i + \frac{2 v_i v - 2 v_i^2}{2a} + \frac{v^2 - 2 v_i v + v_i^2}{2a}$$

or

$$x = x_i + \frac{v^2 - v_i^2}{2a} \qquad \text{for constant } a \text{ and } x = x_i,\ v = v_i \text{ at } t = 0 \tag{2-32}$$

This  relation  is  used  to  find  directly  the  change  in  position  of  an  object 
moving  with  a known  constant  acceleration,  while  its  velocity  changes  from 
one  known  value  to  another.  Use  it  to  recalculate  the  distance  required  for 
the  car  in  Example  2-10  to  stop. 
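
As a quick check of Eq. (2-32), the stopping distance can be evaluated directly from the two speeds and the acceleration, without first finding the time. This is a minimal sketch; the function name `displacement_from_speeds` is an illustrative choice, and the inputs are the rounded values from Example 2-10.

```python
def displacement_from_speeds(v_initial, v_final, a):
    """Change in position from Eq. (2-32): x - x_i = (v^2 - v_i^2) / (2 a)."""
    return (v_final**2 - v_initial**2) / (2 * a)


# Car of Example 2-10: from 13.1 m/s to rest at a = -2.87 m/s^2.
stop_distance = displacement_from_speeds(13.1, 0.0, -2.87)
print(f"stopping distance = {stop_distance:.1f} m")
```

Rounded to two significant figures, the result agrees with the 30 m found in the example by way of the stopping time.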

Another convenient relation can be obtained by writing Eq. (2-29) as

$$a = \frac{v - v_i}{t}$$

and then substituting this expression for the a in Eq. (2-30). The result is

$$x = x_i + v_i t + \frac{(v - v_i)\,t^2}{2t}$$

which simplifies to

$$x = x_i + \frac{(v + v_i)\,t}{2} \qquad \text{for constant } a \text{ and } x = x_i,\ v = v_i \text{ at } t = 0 \tag{2-33}$$


This  relation  is  employed  to  find  the  change  in  position  when  the  velocity 
changes  from  one  known  value  to  another,  if  the  value  of  the  constant 
acceleration  is  not  known  but  the  time  during  which  the  velocity  changes  is 
known. 




The calculation in Example 2-10 provides an excellent illustration of the way
each term in a correct physical equation has the proper units. Look at Eq. (2-31):

$$x = 13.1\ \text{m/s} \times 4.56\ \text{s} + \frac{-2.87\ \text{m/s}^2 \times (4.56\ \text{s})^2}{2}$$

In the first term on the right side the seconds cancel, leaving the units for the term
to be meters. A similar cancellation occurs in the second term, so that its units are
also meters. Since the proper units for the term on the left side are meters too, the
equation is consistent as far as units are concerned. If somehow an error had been
made in deriving Eq. (2-30),

$$x = x_i + v_i t + \frac{at^2}{2}$$

so that the last term on the right side was mistakenly thought to be at/2, Example
2-10 would have made the mistake very apparent. In such a situation the
corresponding term in Eq. (2-31) would have been found to have the units meters per
second, instead of the required meters.

But it is not necessary to work through a specific numerical calculation [such
as in Eq. (2-31)] to search for errors in a newly obtained equation [like Eq. (2-30)]
by checking the consistency of the units. Just inspect the factors in each term of
the equation from the point of view of their units. Any system of units can be used
for this purpose. What is really important in the analysis of Eq. (2-30) is that x is a
length (not that it is a length measured in the particular units called meters), that v
is a length divided by a time, that a is a length divided by the square of a time, and
that t is a time. It is said that x has the dimensions of length. The dimensions of v
are length divided by time, that is, (length)(time)⁻¹. The dimensions of a are
length divided by time squared, that is, (length)(time)⁻². And the dimensions of t
are time. A dimensional analysis demonstrating the consistency of Eq. (2-30) is
carried out by writing it and then writing beneath it an equation showing the
dimensions of each term:

$$x = x_i + v_i t + \frac{at^2}{2}$$

$$\text{length} = \text{length} + (\text{length})(\text{time})^{-1}(\text{time}) + (\text{length})(\text{time})^{-2}(\text{time})^{2}$$

(No dimensions are indicated for the factor ½ because it is a pure number.) By
treating the words “length” and “time” as quantities that can be manipulated
according to the rules of algebra, it is seen that each of the three terms on the right side
has the dimensions of length. Since the dimensions of the term on the left side are
also length, the equation is dimensionally consistent.

Dimensional analysis is a very useful tool for finding errors. Carry out a
dimensional analysis showing the inconsistency of an equation which is like Eq.
(2-30) except that the third term on the right side has the form a²t²/2, instead of
at²/2. It would be a very good idea for you to get into the habit of doing a
dimensional analysis on any equation you develop in the process of working through the
exercises in this book. If you find an equation has inconsistent dimensions, you
know that an error has been made. Of course, you cannot use dimensional analysis
to check the consistency of numerical factors in equations, such as the factor ½ in
the third term of Eq. (2-30), since pure numbers are dimensionless and thus play
no role in the analysis.
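
The bookkeeping of a dimensional analysis is simple enough to automate. The sketch below is an illustration only, not a method from the text: it assumes each dimension is stored as a pair of exponents (length, time), so that multiplying quantities adds exponents. All names in it are hypothetical.

```python
# Sketch: a dimension is a pair of exponents (length, time).
LENGTH = (1, 0)
TIME = (0, 1)
PURE_NUMBER = (0, 0)   # e.g. the factor 1/2, which plays no role in the analysis


def times(*dims):
    """Dimensions of a product: the exponents add."""
    return (sum(d[0] for d in dims), sum(d[1] for d in dims))


def power(dim, n):
    """Dimensions of a quantity raised to the power n: the exponents scale."""
    return (dim[0] * n, dim[1] * n)


VELOCITY = times(LENGTH, power(TIME, -1))       # (length)(time)^-1
ACCELERATION = times(LENGTH, power(TIME, -2))   # (length)(time)^-2


def is_consistent(left, terms):
    """True if every term on the right has the dimensions of the left side."""
    return all(term == left for term in terms)


# Right side of Eq. (2-30): x_i, v_i t, and (1/2) a t^2.
correct_terms = [LENGTH,
                 times(VELOCITY, TIME),
                 times(PURE_NUMBER, ACCELERATION, power(TIME, 2))]
# Same equation with the third term mistakenly written as a^2 t^2 / 2.
wrong_terms = [LENGTH,
               times(VELOCITY, TIME),
               times(PURE_NUMBER, power(ACCELERATION, 2), power(TIME, 2))]

print(is_consistent(LENGTH, correct_terms))
print(is_consistent(LENGTH, wrong_terms))
```

Because the pure number contributes zero exponents, the check is blind to numerical factors, which matches the limitation noted above.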

You can give Eq. (2-30), and also Eq. (2-29), a complete check by
differentiating. This has been done already in Example 2-8 for the special case $x_i$ = 0,
$v_i$ = 0, and a = 9.80 m/s². Repeat the calculation of Example 2-8 without
specifying the values of $x_i$, $v_i$, and a. By differentiating Eq. (2-30) with respect to t, you
will show that the velocity is given by Eq. (2-29). Differentiating again, you will
show that the acceleration has the value a, where a is a constant. This successfully
finishes the verification, since the assumption of a constant acceleration a was the
basis used to obtain Eqs. (2-29) and (2-30).
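
The same check can be made numerically, as a supplement to the analytical differentiation. The sketch below approximates each derivative by a central difference; the helper names and the particular values of $x_i$, $v_i$, and a are arbitrary assumptions chosen only for illustration.

```python
def position(t, x_i, v_i, a):
    """Eq. (2-30): x = x_i + v_i t + a t^2 / 2."""
    return x_i + v_i * t + a * t**2 / 2


def central_difference(f, t, h=1e-3):
    """Approximate df/dt at time t from values of f slightly before and after t."""
    return (f(t + h) - f(t - h)) / (2 * h)


# Arbitrary illustrative initial values and acceleration (SI units).
x_i, v_i, a = 2.0, 3.0, -1.5


def x_of_t(t):
    return position(t, x_i, v_i, a)


def v_of_t(t):
    # First derivative of position: should reproduce Eq. (2-29).
    return central_difference(x_of_t, t)


def a_of_t(t):
    # Second derivative of position: should reproduce the constant a.
    return central_difference(v_of_t, t)


for t in (0.0, 1.0, 2.5):
    print(t, v_of_t(t), v_i + a * t, a_of_t(t), a)
```

At each sampled time the numerical velocity matches $v_i$ + at and the numerical acceleration matches a, to within rounding error.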




Section 2-8 closes the chapter by presenting additional applications of
Eqs. (2-29) and (2-30). It will also serve to remind you that even though
physics makes much use of mathematics, it is a science based on experiment.


2-8 VERTICAL FREE FALL


Fig.  2-24  An  experiment  measuring 
the  gravitational  acceleration  g. 


In Fig. 2-14 we presented a strobe photo of an object falling with negligible
air resistance very near the surface of the earth. It showed qualitatively
that the object experiences a constant downward acceleration. Quantitative
results were quoted in Example 2-8 which led to the numerical
value of the gravitational acceleration. But no experimental basis for the
quantitative results was given. Here we rectify this omission by considering
a measurement of the value of the very important quantity, the acceleration
due to gravity.

One way to make such a measurement would be to obtain a strobe photo
much like the one in Fig. 2-14, but with a meter stick and a clock included.
It could then be analyzed quantitatively in the manner indicated in
the caption to Fig. 1-6, the one from which Fig. 2-14 is reproduced. If you
look back, you will see how this could lead to a numerical value of the
acceleration of the falling object. However, you could not expect the value
obtained to be very accurate because the analysis involves measuring first the
difference between two pairs of positions, to obtain two velocities, and then
the difference between these two velocities. Precision is lost in the succession
of subtractions.

A reasonably accurate value of the gravitational acceleration can be
obtained by measuring the total time required for an object to fall freely from
rest over a certain total distance. Then the measurement can be analyzed by
applying Eq. (2-30), with $x_i$ = 0 and $v_i$ = 0:

$$x = \frac{at^2}{2}$$


Figure  2-24  is  a photograph  of  such  a measurement.  A steel  ball  is  initially 
held  at  the  top  of  a meter  stick  by  an  electromagnet.  When  a switch  is 
thrown,  a relay  interrupts  the  current  to  the  magnet.  This  releases  the  ball, 
allowing  it  to  fall  from  rest.  The  relay  simultaneously  sends  current  to  start 
the  clock.  After  falling  some  distance,  the  ball  knocks  a switch  open, 
thereby  interrupting  the  current  to  the  clock  and  stopping  it. 

The  photograph  shows  that  0.45  s of  time  is  required  for  the  ball  to 
fall  from  rest  through  a distance  of  100  cm.  Solving  the  preceding  equation 
for  a and  substituting  in  these  values  for  t and  x,  we  obtain 


$$a = \frac{2x}{t^2} = \frac{2(100\ \text{cm})}{(0.45\ \text{s})^2} = 9.9 \times 10^2\ \text{cm/s}^2 = 9.9\ \text{m/s}^2$$


This result is within 1 percent of the value that is obtained from a series of
more accurate measurements of the magnitude of the gravitational
acceleration near the earth’s surface. That magnitude is

$$g = 9.80\ \text{m/s}^2 \tag{2-34}$$

Following common convention, the symbol g is used for the magnitude of
the gravitational acceleration near the surface of the earth. For reasons that
are explained in Sec. 5-4, the value of g differs from place to place by as
much as several parts in 1000. The value tends to be smaller near the equator
and larger near the poles, but there are also minute local variations.
The value quoted in Eq. (2-34) has been averaged over various locations in
the United States and then rounded off to three significant figures. This
value of g is generally used throughout the book.
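
The analysis of the free-fall measurement amounts to one line of arithmetic, sketched below with illustrative names. It assumes the readings quoted for Fig. 2-24 (a fall of 100 cm in 0.45 s) and compares the result with the value of Eq. (2-34).

```python
def g_from_fall(distance_m, time_s):
    """Solve x = a t^2 / 2 for a, for an object released from rest."""
    return 2 * distance_m / time_s**2


G_QUOTED = 9.80                          # m/s^2, Eq. (2-34)
g_measured = g_from_fall(1.00, 0.45)     # 100 cm in 0.45 s
percent_difference = 100 * abs(g_measured - G_QUOTED) / G_QUOTED

print(f"g_measured = {g_measured:.2f} m/s^2")
print(f"difference from quoted value = {percent_difference:.2f} percent")
```

The same function can be applied to readings taken at several fall distances, which is the procedure suggested for testing whether the acceleration is constant.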

It  is  possible  to  verify  that  the  gravitational  acceleration  is  essentially 
constant  throughout  the  motion  of  an  object  falling  freely  near  the  surface 
of  the  earth,  and  to  do  so  with  more  accuracy  than  is  possible  from  the 
experiment  and  analysis  of  Fig.  1-6.  This  can  be  done  by  repeating  the 
experiment  and  analysis  of  Fig.  2-24  for  several  different  values  of  x, 
showing  that  the  same  value  of  g is  always  obtained. 

It  is  also  possible  to  show  experimentally  that  g does  not  depend  on  the 
nature  of  the  freely  falling  object.  Take  two  different  objects  which  are 
both  compact  enough  to  fall  a short  distance  without  air  resistance  playing 
a significant  role,  say  a small  coin  and  a large  coin.  Hold  them  at  the  same 
height  above  the  floor,  and  then  release  them  at  the  same  time.  You  will  see 
and  hear  them  hit  the  floor  at  very  nearly  the  same  time,  showing  that  they 
have  traveled  with  very  nearly  the  same  acceleration.  In  Sec.  4-2  we  discuss 
the  reason  for  this. 

The experiment just suggested is sometimes called the “Leaning Tower of
Pisa” experiment. According to legend, Galileo dropped a large cannonball and a
small musket ball from the Leaning Tower and proved to a large audience of
dumbstruck university professors that the two balls hit the ground
simultaneously, in contrast to their expectation that the heavier ball would hit much
sooner. In fact, Galileo probably never dropped anything from the Leaning Tower,
though he almost certainly did the experiment we suggest that you do. In any
case, no public demonstration of this sort was recorded.

However, in his famous Dialogues Concerning the Two Great World Systems,
Galileo describes such an experiment in a hypothetical way and states clearly that
the two balls will not quite strike the ground simultaneously. He then attributes
the small difference in time of fall to air resistance. He bases his argument for an
equal gravitational acceleration of all objects, in the absence of air resistance, on
the following subtle, somewhat negative grounds: Suppose that a heavy object
does fall with greater acceleration than a light one. Tie the two together with a
string. Will the heavy object then pull the light one down faster than it would go
by itself, and will the light object retard the heavy one, so that the acceleration of
the combination is intermediate between those of the two separate objects? Or
does the combination constitute an object heavier than either alone, which
therefore falls faster than either? The only way to escape this contradiction is to agree
that there is no difference in the motions in the first place.

Example 2-11 uses the equations relating position, velocity, and
acceleration in one-dimensional motion with constant acceleration, and the
measured value of the gravitational acceleration, to solve a rather
complicated problem involving vertical free fall.


EXAMPLE  2-11 

A child  leans  out  of  the  window  of  a building  at  a height  10.0  m above  the  ground 
and  throws  a ball  vertically  upward  with  velocity  12.0  m/s.  Neglecting  air  resistance, 
predict  the  maximum  height  above  ground  attained  by  the  ball  and  also  the  total 
elapsed  time  at  the  moment  it  hits  the  ground. 

■ Call the initial height above ground of the ball h and its maximum height H,
the initial vertical velocity $v_i$, and the total elapsed time T. Your task is to find H and T
in terms of the given values of h, $v_i$, and the known value of the gravitational
acceleration g. It is best not to insert the actual numerical values of h, $v_i$, and g until the end
of the analysis. You should take the vertical line on which the ball moves as the x
axis. Also, you can choose the origin of that axis at the initial location of the ball, and
choose its positive direction to be upward.

The two useful relations are those describing one-dimensional motion with
constant acceleration, Eq. (2-29):

$$v = v_i + at$$

and Eq. (2-30):

$$x = x_i + v_i t + \frac{at^2}{2}$$

Choosing the positive x direction to be upward means that you must set a = −g. The
negative sign expresses the fact that the gravitational acceleration, being always
downward, is in the negative x direction. Also, you will have $x_i$ = 0, since the origin
of the x axis has been fixed at the initial location of the ball. So you have

$$v = v_i - gt \tag{2-35}$$

and

$$x = v_i t - \frac{gt^2}{2} \tag{2-36}$$


At the top of the ball's path its velocity will instantaneously be v = 0. It has
finished going up, and it has not yet started going down. Using this condition in Eq.
(2-35) yields

$$0 = v_i - gt$$

or

$$t = \frac{v_i}{g}$$

for the time when the ball reaches the top of the path. When you substitute this
value of t into Eq. (2-36), you obtain

$$x = v_i \frac{v_i}{g} - \frac{g}{2} \left( \frac{v_i}{g} \right)^2$$

or

$$x = \frac{v_i^2}{2g}$$

for the maximum x coordinate of the ball. To find its maximum height H above the
ground, you add the height of the window above the ground to the quantity x and
obtain

$$H = \frac{v_i^2}{2g} + h \tag{2-37}$$

You can find the time at which the ball hits the ground from Eq. (2-36). Set
x = −h, and thus equate the position of the ball with ground level. This gives you

$$-h = v_i t - \frac{gt^2}{2}$$

or

$$\frac{gt^2}{2} - v_i t - h = 0$$

Now solve this equation for the value of t corresponding to x = −h by applying the
standard expression for the solution to a general quadratic equation. (Recall that if
$at^2 + bt + c = 0$, then $t = [-b \pm (b^2 - 4ac)^{1/2}]/2a$.) You get

$$t = \frac{v_i \pm \sqrt{v_i^2 + 2gh}}{g}$$



Since $(v_i^2 + 2gh)^{1/2}$ is greater than $v_i$, and since you should have t greater than
zero, you will want the positive root. Thus the total elapsed time is

$$T = \frac{v_i + \sqrt{v_i^2 + 2gh}}{g} \tag{2-38}$$


Determining numerical values of H and T, in terms of the given numerical
values of h, $v_i$, and g, is now simply an exercise in “plugging in” values and doing
arithmetic. From Eq. (2-37) you find

$$H = \frac{(12.0\ \text{m/s})^2}{2 \times 9.80\ \text{m/s}^2} + 10.0\ \text{m} = 17.4\ \text{m}$$

And Eq. (2-38) gives you

$$T = \frac{1}{9.80\ \text{m/s}^2} \left[ 12.0\ \text{m/s} + \sqrt{(12.0\ \text{m/s})^2 + 2 \times 9.80\ \text{m/s}^2 \times 10.0\ \text{m}} \right] = 3.11\ \text{s}$$


Note the way that values expressed to three significant figures are used consistently
throughout these two independent calculations. In contrast to the situation in
Example 2-10, here no calculated numbers are used in subsequent calculations. So it is
not necessary to bother with carrying an extra significant figure on numbers
produced at intermediate stages. This is one of the advantages of not inserting actual
numerical values until the end of an analysis.

You could have chosen x = 0 to be at ground level and chosen the positive
direction of the x axis to be downward. It will be worthwhile to repeat the analysis,
making these choices, and show that the same final results are obtained. Still
another approach to the problem is possible, if you wish to avoid the general quadratic
equation that arises in determining the total time for the ball to hit the ground. You
can break the analysis into two parts: (1) the trip up to the maximum height, and (2)
the trip from that height to ground. Do this, and compare the results with those
obtained here.
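
The closed-form results of Example 2-11 lend themselves to a short numerical check. The sketch below uses illustrative function names that are not part of the text. It evaluates Eqs. (2-37) and (2-38) and also computes the total time by the two-part route just described, in which the fall from the maximum height starts from rest.

```python
import math


def max_height(h, v_i, g):
    """Eq. (2-37): H = v_i^2 / (2 g) + h."""
    return v_i**2 / (2 * g) + h


def total_time(h, v_i, g):
    """Eq. (2-38): T = [v_i + sqrt(v_i^2 + 2 g h)] / g."""
    return (v_i + math.sqrt(v_i**2 + 2 * g * h)) / g


def total_time_two_parts(h, v_i, g):
    """Time to rise to the top plus time to fall from rest through the height H."""
    t_up = v_i / g                                       # from 0 = v_i - g t
    t_down = math.sqrt(2 * max_height(h, v_i, g) / g)    # from H = g t^2 / 2
    return t_up + t_down


h, v_i, g = 10.0, 12.0, 9.80   # values given in Example 2-11 (SI units)
H = max_height(h, v_i, g)
T = total_time(h, v_i, g)
T_split = total_time_two_parts(h, v_i, g)

print(f"H = {H:.2f} m")
print(f"T = {T:.2f} s, two-part T = {T_split:.2f} s")
```

The two routes to the total time agree, as they must, since both follow from the same constant-acceleration equations.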


EXERCISES 

Group  A 

2-1. Speeds of various objects. Give the approximate
value of the speed, in meters per second (m/s), for each
object listed. Also express each speed as a fraction of the
speed of light, which is 3.00 × 10⁸ m/s. In cases where
full information is not provided, make reasoned estimates
in order to obtain your results. (Note the table of conversion
factors inside one of the book covers.)

a. An ant crawling

b. A person walking at a comfortable pace

c. A track star running the mile

d. An automobile on a superhighway

e. A cruising jet airliner (650 mi/h, or about 90 percent
of the speed of sound)

f. A near-earth artificial satellite (orbital radius of
7000 km; orbital period of approximately 90 min)

g. The moon in its orbit around the earth (orbital
radius of approximately 380,000 km; orbital period of
27.3 days)

h. The earth in its orbit around the sun (orbital radius
of approximately 93,000,000 mi; orbital period of
365.3 days)


2-2. Travel time in radio communication. The speed of
radio waves is the same as the speed of light waves, namely
3.00 × 10⁸ m/s. Calculate the time required for radio
waves to make each of the trips listed.

a. From a seacoast radio transmitter to a ship located
100 km offshore

b. From a ground station to a synchronous communications
satellite orbiting at an altitude of 36,000 km

c. From an astronaut on the lunar surface to a control
center on Earth (a distance of approximately
380,000 km)

d. Back to Earth from the Viking space probes upon
their arrival at Mars (a distance of approximately
380,000,000 km)

2-3. Debunking a rumor. Suppose that a person
claiming to possess psychic powers announces to you that
the sun has just exploded. How long will you have to wait
in order to be sure that the claim is untrue? (The distance
from the sun to earth is 1.50 × 10⁸ km, and the speed of
light is given in Exercise 2-1.)




2-4.  The  definition  of  velocity.  Explain  how  the  general 
definition  of  velocity  for  one-dimensional  motion,  Eq. 
(2-6),  contains  in  it  the  definitions  given  by  Eqs.  (2-1)  and 
(2-3). 


2-5. The loneliness of planet dwellers. The tremendous
distances encountered in astronomy have led to the use of
the travel time of light over a fixed distance to define a suitably
large distance unit. A light-year is the distance traveled
by light during one year.

a. Find the number of kilometers in one light-year.

b. Proxima Centauri, the nearest known star beyond
the sun, is located 4.1 × 10¹³ km away. Express its distance
in light-years.

c. Suppose an earthling decides to attempt to establish
radio contact with inhabitants of a (hypothetical)
planet orbiting Proxima Centauri. How long after initiating
transmission should the earthling begin to listen
for a reply?

2-6. No house calls. An unstaffed space probe passing
near the planet Saturn is sending a continuous radio
transmission back toward Earth. Suddenly one of the
probe’s instruments malfunctions in a way that cannot be
handled by the probe’s onboard computer. However, the
malfunction affects the radio transmission in a recognizable
manner. If the control personnel on Earth send corrective
commands as soon as they receive the first sign of
trouble, how much time elapses between the malfunction
and the arrival at the probe of the corrective commands?
Saturn is about 1.4 × 10⁹ km from Earth.

2-7. Strike up the band. A marching band is performing
on a football field during half time. The band
members are standing in a rectangular formation, with
the back row located 30 m behind the front row. They are
playing a musical composition whose tempo is 150 beats
per minute.

a. After the sound from the front row passes a listener
on the sideline, how much time elapses before the
sound from the back row reaches the listener? Assume
that the band members are perfectly synchronized. Sound
waves travel at 340 m/s.

b. What fraction of a beat is your result for part a?

2-8. Velocity from a graph of position versus time.

a. Calculate the velocity for the motion depicted in
Fig. 2-11, using values for a time interval starting at
$t_i$ = 0 s and ending at t = 2 s.

b. Repeat the procedure of part a for the motion depicted
in Fig. 2-12.

c. Repeat the procedure of part a for the motion depicted
in Fig. 2-13.

2-9. Jumping the gun? A person watching a track meet
is sitting 150 m from the starting line.

a. After each race starts, how much time elapses before
the spectator hears the starter’s pistol? The speed of
sound is 340 m/s.

b. Assuming that the runners’ reaction time is 0.2 s,
approximately how far from the starting line will the
runners be at this instant?


2-10. Analytical differentiation. Use the analytical
method of Example 2-6 to evaluate $(dx/dt)_{t_i}$, where
x(t) = ct³. Here c is a constant. Determine the value of the
derivative for c = 2 m/s³ at $t_i$ = 1 s and at $t_i$ = 2 s. (The
numerical method is treated in Exercise 2-42.)


2-11.  Rules  for  derivatives. 

a. Following Example 2-6, verify Eq. (2-16) for the
case n = −1.

b.  Prove  Eq.  (2-14). 

c.  Prove  Eq.  (2-15). 

d.  Following  Example  2-7,  verify  Eq.  (2-18). 

2-12.  Sandy  landing.  A steel  ball  falls  from  a height  of 
8.0  m onto  smooth  sand.  It  makes  a depression  in  the 
sand  0.40  cm  deep. 

a.  What  is  the  average  acceleration  during  the  time  it 
takes  to  stop  the  ball? 

b.  How  long  does  it  take  to  stop  the  ball? 


2-13. Flipping a coin to determine g. In a crude experiment
to determine the magnitude g of the acceleration of
gravity, a coin is tossed vertically upward and its height of
rise is measured. If it rises 1.0 m and the time between
leaving and returning is 1.0 s as nearly as can be determined,
what value of g results from these measurements?


2-14.  Looking  for  constant  acceleration.  Which  equa- 
tion(s)  among  the  following  represent(s)  motion  in  which 
the acceleration is constant? The symbols t, s, and v represent
time, distance traveled, and velocity for one-dimensional
motion. In each case, the symbol k represents
a constant with the appropriate physical dimensions.

a.  $v = kt$

b.  $s = kt$

c.  $v = ks$

d.  $s = kt^2$

e.  $v^2 = ks$


2-15.  Deer  on  the  road.  A motorist  is  traveling  along  a 
straight  highway  at  an  initial  speed  of  20  m/s.  A deer 
ambles  onto  the  road  50  m ahead  and  stops. 

a.  What  is  the  minimum  deceleration  which  will 
bring  the  car  to  a halt  before  it  strikes  the  deer? 

b.  Reevaluate  your  result  for  part  a to  take  into  ac- 
count that  the  motorist  has  a reaction  time  of  0.30  s. 


2-16.  Faster  and  faster.  Starting  at  t = 0 s,  a body 
accelerates from rest, moving in a straight line with acceleration
numerically equal to g. Find the body’s final speed,
distance  traveled,  and  average  speed  for  the  time  interval 
starting  at  t = 0 s and  ending  at 

a.  t = 1.0  s 

b.  t = 1.0  min 

c.  t = 1.0  h 

d.  t = 1.0  day 

e.  t = 1.0  month 




Express  the  speeds  in  meters  per  second  and  also  as  a frac- 
tion of  the  speed  of  light.  Express  distances  traveled  in 
meters  and,  for  part  e,  in  light-years.  (The  light-year  is  de- 
fined in  Exercise  2-5.) 

2-17.  On  the  open  road. 

a.  Estimate  the  average  speed  in  miles  per  hour  (mi/h) 
of  an  automobile  that  is  used  primarily  for  highway  driving. 
(In  estimating  the  average  speed,  count  only  the  time  that 
the  car  is  on  the  road!) 

b.  How many hours are spent in driving 100,000 mi?

c.  If  in  one  year  a woman  drives  15,000  mi,  how 
many  hours  per  day  does  she  spend  driving?  What  fraction 
of  a typical  16-h  waking  period  is  this? 

2-18.  In  city  traffic. 

a.  Estimate  the  average  speed  in  miles  per  hour  (mi/h) 
of  an  automobile  which  is  used  primarily  in  city  traffic.  (In 
estimating  the  average  speed,  count  only  the  time  that  the 
car  is  on  the  road!) 

b.  How  many  hours  are  spent  in  driving  100,000  mi? 

c.  If  in  one  year  a commuter  drives  8000  mi,  how 
many  hours  per  day  does  he  spend  driving?  What  fraction 
of  a typical  16-h  waking  period  is  this? 

2-19.  When  do  we  start?  Modify  Eqs.  (2-29),  (2-30), 
(2-32),  and  (2-33)  so  that  they  apply  to  a situation  in  which 
the value of time at the initial instant is $t = t_i \neq 0$, rather
than  t = 0. 

2-20.  What if? Calculate the distance the car in Example
2-10b would travel in 4.56 s if it continued to move
at  47  km/h.  Explain  why  this  distance  is  twice  the  distance 
it  travels  in  coming  to  rest  in  4.56  s with  a constant  negative 
acceleration. 

2-21.  Average  velocity.  Evaluate  the  average  velocity  in 
the  motion  depicted  in  Fig.  2-20  for  the  time  interval 

a.  Starting  at  t = 0 s and  ending  at  t = 2 s. 

b.  Starting  at  t = 2 s and  ending  at  t = 6 s. 

c.  Starting  at  t = 0 s and  ending  at  t = 6 s. 

d.  Starting  at  t = 0 s and  ending  at  t = 8 s. 

2-22.  Basic  juggling.  A juggler  wishes  to  have  exactly 
three  balls  in  the  air  at  all  times.  He  wants  to  throw  a ball 
every  0.50  s. 

a.  With  what  initial  speed  must  he  throw  each  ball? 

b.  How  high  will  each  ball  rise  above  his  hands? 

2-23.  Well,  well.  A boy  drops  a stone  into  a deep  water 
well  and  hears  the  splash  3.0  s later.  How  far  did  the  stone 
fall  before  striking  the  surface  of  the  water?  The  speed  of 
sound  is  340  m/s. 

Group  B 

2-24.  Up  and  back.  A rocket  rises  vertically  with  con- 
stant acceleration  2.0g  (g  = 9.8  m/s2).  The  acceleration 
lasts  for  1.0  min,  after  which  the  engine  is  cut  off. 

a.  How  fast  is  the  rocket  going  at  cutoff? 

b.  How  high  is  the  rocket  at  cutoff? 

c.  How  much  higher  will  the  rocket  rise  after  cutoff, 


assuming  that  the  value  of  g due  to  the  attraction  of  the 
earth  is  substantially  constant,  and  neglecting  air  resistance? 

d.  How  long  does  it  take  to  achieve  this  additional 
height? 

e.  Neglecting  air  resistance,  how  long  will  it  take  the 
rocket  to  return  to  the  ground  from  its  highest  point? 

f.  With  what  speed  does  the  rocket  hit  the  ground? 

2-25.  Back  and  forth.  Beginning  at  time  t = 0,  a pen- 
dulum bob  executes  small  oscillations,  so  that  its  position  x 
is given by x = 0.030 sin(2t), where x is measured in
meters  and  t in  seconds. 

a.  During  the  first  cycle  of  oscillation,  for  what  value 
of  t does  the  position  have  its  maximum  positive  value? 

b.  What  is  the  velocity  of  the  pendulum  bob  at  this  in- 
stant? 

c.  When  during  the  first  cycle  does  the  acceleration 
attain  its  maximum  positive  value?  What  is  that  maximum 
value? 

2-26.  A confirmation.  Verify  that  Eqs.  (2-29)  and  (2-30) 
describe  motion  with  constant  acceleration.  That  is,  dif- 
ferentiate Eq.  (2-30)  with  respect  to  time  to  show  that  the 
velocity  is  given  by  Eq.  (2-29).  Then  differentiate  Eq. 
(2-29)  with  respect  to  time  to  show  that  the  acceleration  is 
indeed  equal  to  the  constant  denoted  by  the  symbol  a. 

2-27.  Average velocity for motion under constant acceleration.
Figure 2-22 and its caption describe the motion of an
object starting from rest at time zero and undergoing
constant acceleration. The description establishes that the
average velocity $\langle v \rangle$ over a time interval beginning at time
t = 0 and ending at time t is given by $\langle v \rangle = v/2$, where v
is the velocity at time t. Construct an appropriate graph
and accompanying discussion to prove that, for motion
under constant acceleration, the average velocity $\langle v \rangle$ over
any time interval is given by $\langle v \rangle = (v_i + v)/2$, where $v_i$ is
the velocity at the beginning of the interval and v is the
velocity at the end of the time interval.

2-28.  Carry on. The text presents an analysis of Fig.
2-21, describing the relations between the x(t) and v(t)
curves and between the v(t) and a(t) curves. The analysis
covers the time interval starting at t = 0 s and ending at
$t = \pi/2$ s. Extend the analysis to cover the time interval
starting at $t = \pi/2$ s and ending at $t = \pi$ s.

2-29.  Gasoline  alley.  A car  built  for  drag  racing  is  able 
to  accelerate  from  rest  to  60  mi/h  in  3.5  s. 

a.  What  is  its  average  acceleration  during  this  time? 
Express  your  result  in  miles  per  hour  per  second,  in  meters 
per  second  squared,  and  as  a multiple  of  g,  the  accelera- 
tion due  to  gravity. 

b.  Assuming  that  the  acceleration  is  actually  constant, 
what  distance  does  the  car  travel  during  this  time? 

c.  Suppose that throughout a ¼-mi race this car maintains
an acceleration equal to that found in part a. What
would  its  final  speed  be?  What  would  its  average  speed  be? 
How  long  would  it  take  to  complete  the  race? 




2-30.  Intermediate  juggling.  If  a juggler  knows  that  the 
shortest  time  interval  she  needs  between  tossing  successive 
balls  into  the  air  is  0.30  s,  what  is  the  maximum  number  of 
balls  she  can  keep  in  the  air  in  a room  in  which  the  ceiling 
is  3.0  m above  her  hands? 

2-31.  Roger  the  scientific  detective.  Roger,  the  staff  resi- 
dent in  a college  dormitory,  looks  out  his  window  and  sees 
that  water  balloons  are  falling  past  his  window.  He  is 
unable  to  lean  out  far  enough  to  see  the  culprit,  but  he  no- 
tices that  each  water  balloon  strikes  the  sidewalk  0.80  s 
after  passing  his  window.  Roger’s  room  is  on  the  fifth 
floor,  15.0  m above  the  sidewalk.  Assuming  that  the  bal- 
loons are  being  released  from  rest,  how  far  above  Roger  is 
the  release  point? 

2-32.  Matters  of  choice. 

a.  Repeat  the  calculation  of  Example  2-11,  choosing 
x = 0 to  be  at  ground  level. 

b.  Repeat  the  calculation  of  Example  2-11,  choosing 
the  positive  direction  of  the  x axis  to  be  downward. 

c.  Repeat  the  calculation  of  Example  2-11,  breaking 
the  motion  into  two  parts:  the  trip  up  to  maximum  height 
and  the  descent. 

2-33.  Major  league  pop-up.  Professional  baseball 
players  sometimes  hit  “pop-ups”  that  go  straight  up,  so 
that  they  can  be  caught  right  at  home  plate.  Such  pop-ups 
can  remain  airborne  for  several  seconds. 

a.  Neglecting  air  resistance,  what  is  the  initial  speed 
of  a ball  popped  straight  up  if  6.0  s elapses  before  it  is 
caught? 

b.  What  is  the  maximum  height  of  the  pop-up 
described  in  part  a? 

Group  C 

2-34.  Same  time,  same  station.  A homebound  com- 
muter usually  arrives  at  her  hometown  train  station  at  5 
p.m.,  just  as  her  husband  is  arriving  to  meet  her  in  the 
family  car.  One  day  she  leaves  work  early  and  catches  a 
train  that  arrives  at  the  station  at  4 p.m.  She  decides  to 
walk  home  and  immediately  starts  out  along  the  same 
route  that  her  husband  uses.  Her  husband  leaves  home  at 
the  usual  time,  driving  at  his  usual  speed  of  50  km/h. 
They  meet  on  the  way  and  drive  home  at  the  same  speed, 
arriving  15  min  earlier  than  usual. 

a.  How  far  did  the  commuter  walk? 

b.  When  did  husband  and  wife  meet? 

c.  What  was  the  commuter’s  walking  speed? 

d.  Can  you  determine  the  total  distance  between  the 
commuter’s  home  and  the  train  station?  Explain  your 
answer. 

2-35.  One  after  another.  At  t = 0 s,  a child  throws  a 
ball  upward  with  an  initial  speed  of  15.0  m/s.  At  t = 0.50  s, 
he  tosses  a second  ball  upward  at  the  same  speed. 

a.  Construct  graphs  showing  how  the  acceleration, 


velocity,  and  position  of  the  first  ball  depend  on  time, 
from  t = 0 s until  the  ball  returns  to  the  ground. 

b.  Describe  how  to  obtain  the  corresponding  graphs 
for  the  second  ball. 

c.  At  what  instant  do  the  two  balls  have  the  same  ver- 
tical position?  What  is  their  common  position  at  that  in- 
stant? Obtain  your  results  using  both  an  analytical  method 
and  a graphical  method. 

2-36.  Safe  passage.  Mary  Smith  is  driving  along  a 
two-lane  highway,  following  another  motorist,  who  is  trav- 
eling at  80  km/h.  Mary  is  keeping  a safe  distance,  with  (the 
center  of)  her  car  30  m behind  (the  center  of)  the  car 
ahead  of  her.  However,  the  speed  limit  is  90  km/h,  and 
Mary  wishes  to  pass. 

a.  When  she  finds  a clear  opportunity,  Mary  begins 
to  accelerate  and  changes  to  the  other  lane.  She  acceler- 
ates uniformly  from  80  km/h  to  90  km/h  in  5.0  s and  then 
maintains  a constant  speed.  At  the  end  of  the  acceleration, 
how  far  behind  the  other  car  is  Mary? 

b.  From  the  time  she  begins  to  accelerate,  how  long  is 
it  before  she  pulls  even  with  the  other  car?  How  far  does 
Mary  travel  during  this  time? 

c.  When  she  is  30  m ahead  of  the  other  car,  Mary 
changes  back  to  the  travel  lane.  How  long  does  the  entire 
passing  procedure  require?  How  far  does  Mary  travel 
during  this  time? 

d.  If  another  car  is  approaching  at  a speed  of  90 
km/h,  how  far  away  must  it  be  from  Mary’s  car  at  the 
beginning  of  the  procedure,  in  order  for  Mary  to  be  able 
to  vacate  the  passing  lane  by  the  time  the  approaching  car 
is  150  m away? 

2-37.  Catching  up.  A college  student  drops  a ball  out  a 
window  on  the  top  floor  of  the  science  building.  She 
throws  another  ball  straight  down  after  the  first  one  1.0  s 
later.  The  second  ball  leaves  her  hand  with  a speed  of 
20  m/s. 

a.  Neglecting  air  resistance,  and  assuming  that  the 
balls  continue  to  fall  freely,  how  long  after  the  first  ball  is 
dropped  will  the  second  ball  overtake  it? 

b.  How  far  above  the  ground  must  the  release  point 
be  located  in  order  for  the  two  balls  to  hit  the  ground  at 
the  same  time? 

2-38.  A daredevil  parachutist.  An  aerial  stunt  man 
plans  a spectacular  jump  from  the  World  Trade  Center  in 
New York City. His assistant will drop a (packed) parachute
from the top of one of the towers, 411 m above the
street.  The  stunt  man  will  then  drop  from  a launch  point 
50  m below  his  assistant,  timing  his  jump  so  that  he  can 
grab  the  parachute  as  it  falls  by  him,  strap  it  on,  open  the 
chute,  and  float  safely  to  the  street. 

a.  The  ripcord  must  be  pulled  when  the  parachutist 
is  at  least  250  m above  the  street  in  order  for  him  to  land 
safely.  The  stunt  man  has  found  from  experience  in 
high-altitude  skydiving  that  he  can  strap  on  a parachute 
and  pull  the  ripcord  in  3.0  s flat.  After  his  assistant  drops 




the  chute,  what  is  the  earliest  time  at  which  the  stunt  man 
can  jump?  The  latest  time?  (Neglect  any  change  in  the 
stunt  man's  motion  as  he  catches  the  chute.) 

b.  If  things  were  to  go  badly  and  the  stunt  man  fell 
freely  all  the  way  to  the  street,  how  long  would  it  take? 
What  speed  would  he  have  on  impact?  (Note:  Air  resis- 
tance, which  would  need  to  be  taken  into  account  in  order 
to  obtain  accurate  results,  is  analyzed  in  Chap.  5.) 

c.  At  the  insistence  of  city  officials,  a layer  of  foam 
10  m thick  is  placed  over  the  street  during  the  jump.  If 
the  stunt  man  hit  the  foam  traveling  at  the  speed  found  in 
part  b,  what  constant  upward  acceleration  would  be  just 
sufficient  to  bring  him  to  rest  within  a distance  of  10  m? 
Give  your  result  in  meters  per  second  squared  and  also  as 
a multiple  of  g. 

2-39.  Upstairs,  downstairs.  Two  dormitory  roommates, 
Hugh  and  Lou,  decide  to  play  an  unusual  game  of  catch. 
Hugh  stands  on  a balcony,  and  Lou  stands  on  the  ground 
directly  below.  The  balcony  is  10  m above  the  ground. 
Hugh  and  Lou  each  throw  the  ball  directly  toward  the 
other  with  an  initial  speed  of  15  m/s. 

a.  How  long  does  it  take  for  the  ball  to  travel  from 
Lou  up  to  Hugh?  How  fast  is  the  ball  traveling  when 
Hugh  catches  it? 

b.  How  long  does  it  take  for  the  ball  to  descend  from 
Hugh  to  Lou?  How  fast  is  the  ball  traveling  when  Lou 
catches  it? 

c.  Compare  the  round-trip  travel  time  with  the  time 
that  would  be  required  if  the  ball  traveled  with  a constant 
speed  of  15  m/s  in  both  directions. 

d.  Alice,  another  resident  of  the  dormitory,  lives 
15  m above  the  two  roommates.  She  decides  to  douse 
Hugh  and  Lou  with  two  water-filled  balloons.  She  watches 
them toss the ball back and forth several times and learns
to  anticipate  when  Hugh  will  release  the  ball.  Alice  wants 
to  drop  the  balloons  so  that  Hugh  and  Lou  will  be  hit 
simultaneously,  just  as  the  ball  is  reaching  Lou.  When 
should  Alice  drop  each  balloon?  Must  she  drop  both  bal- 
loons before  Hugh  throws  the  ball  downward?  How  long 
after  Hugh  throws  the  ball  will  the  first  balloon  fall  past 
him?  How  long  after  that  will  the  second  balloon  strike 
him? 

e.  Alice’s  diabolical  plot  works  perfectly.  However, 
Hugh  and  Lou  are  accustomed  to  Alice’s  pranks,  and  each 
of  them  has  a rotten  tomato  handy.  They  grab  the  to- 
matoes, count  to  three,  and  simultaneously  hurl  the  to- 
matoes at  Alice.  Each  tomato  has  an  initial  speed  of 
30  m/s.  How  soon  after  the  tomatoes  are  thrown  does 
Alice  need  to  be  out  of  the  way? 

f.  Both  tomatoes  miss  on  the  way  up,  but  then  Alice 
makes  a mistake.  She  leans  out  to  gloat  at  her  wet  victims. 
Both  tomatoes  strike  her  on  the  back  of  the  head.  Whose 
tomato  hits  Alice  first?  When  does  it  hit  her,  and  how  fast 
is  it  traveling?  When  does  the  other  tomato  hit,  and  how 
fast  is  it  traveling? 


Numerical 

2-40.  Numerical  evaluation  of  a derivative:  I.  Use 
the numerical method of Example 2-5 to evaluate $(dx/dt)_{t_i}$,
where $x(t) = (4.90\ \text{m/s}^2)\,t^2$, for $t_i = 2$ s. Stop the calculation
after several consecutive numbers in the sequence
have  the  same  value  to  two  decimal  places.  This  value  is 
the  two-decimal-place  limit. 

2-41.  Numerical  evaluation  of  a derivative:  II.  Con- 
tinue the  calculation  of  Exercise  2-40  to  obtain  numbers  in 
the  sequence  of  results  beyond  the  two-decimal-place 
limit.  Show  that  the  numbers  in  the  sequence  eventually 
begin  to  fluctuate  about  the  limit.  These  fluctuations  are  a 
result  of  calculator  round-off  error.  Use  the  definition  of 
a derivative  to  explain  this  phenomenon  in  detail.  Will  it 
be  a practical  limitation  to  the  utility  of  the  numerical 
method  for  evaluating  derivatives? 

2-42.  Numerical method in action: I. Use the numerical
method of Example 2-5 to evaluate $(dx/dt)_{t_i}$, where
$x = (2\ \text{m/s}^3)\,t^3$, for $t_i = 1$ s and for $t_i = 2$ s. Compare your
answers with the result of Exercise 2-10.

2-43.  Numerical  method  in  action:  II.  Use  the  numerical 
method of Example 2-5 to evaluate $(dx/dt)_{t_i}$, where
$x(t) = \sin[(2\ \text{s}^{-1})\,t]$, for $t_i = 0$ s, $\pi/8$ s, $\pi/4$ s, $3\pi/8$ s, and
$\pi/2$ s. Plot x versus t and also plot dx/dt versus t for these
values of $t_i$. Comment on the apparent relation between
the  two  plots. 


2-44.  Midpoint  versus  endpoint.  The  definition  of  a 
derivative  given  in  Eq.  (2-9)  is  called  the  endpoint  defini- 
tion. An  alternative  definition,  called  the  midpoint  defini- 
tion, is 


$$\left(\frac{dx}{dt}\right)_{t_i} = \lim_{\Delta t \to 0} \frac{x(t_i + \Delta t/2) - x(t_i - \Delta t/2)}{\Delta t}$$


These two definitions are equivalent since they lead to the
same limiting value as $\Delta t \to 0$. However, the midpoint
definition  forms  a basis  of  a numerical  method  for  eval- 
uating derivatives  superior  to  the  method  based  on  the 
endpoint  definition.  That  is,  the  sequence  of  numbers  ob- 
tained using  the  midpoint  method  converges  more  rap- 
idly to  the  limit  than  does  the  sequence  of  numbers  ob- 
tained using  the  endpoint  method. 

a.  Convert  the  endpoint  program  for  numerical  dif- 
ferentiation, given  in  the  Numerical  Calculation  Supple- 
ment, into  a midpoint  program. 

b.  Use  this  midpoint  program  to  repeat  the  calcula- 
tions of  Exercise  2-42.  Compare  the  rates  of  convergence 
of  the  midpoint  and  endpoint  methods.  How  many  steps 
are  required  to  reach  the  two-decimal-place  limit  in  each 
case? 

c.  Use  the  midpoint  program  to  repeat  the  calcula- 
tion of  Example  2-5.  Explain  why  the  midpoint  results 
converge  immediately  in  this  particular  case. 
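The comparison asked for in Exercise 2-44 can be set up along the following lines. This is a minimal Python sketch, not the program from the Numerical Calculation Supplement: the function names are hypothetical, and the endpoint formula is assumed to be the forward difference $[x(t_i + \Delta t) - x(t_i)]/\Delta t$, which is the usual reading of an endpoint definition such as Eq. (2-9).

```python
def endpoint_derivative(x, t_i, dt):
    """Endpoint estimate (assumed form): [x(t_i + dt) - x(t_i)] / dt."""
    return (x(t_i + dt) - x(t_i)) / dt


def midpoint_derivative(x, t_i, dt):
    """Midpoint estimate: [x(t_i + dt/2) - x(t_i - dt/2)] / dt."""
    return (x(t_i + dt / 2) - x(t_i - dt / 2)) / dt


def cubic(t):
    """x(t) = (2 m/s^3) t^3, the function of Exercises 2-10 and 2-42."""
    return 2.0 * t**3


def quadratic(t):
    """x(t) = (4.90 m/s^2) t^2, the function of Exercise 2-40."""
    return 4.90 * t**2


if __name__ == "__main__":
    # Halve the interval repeatedly and watch both sequences approach the limit.
    dt = 1.0
    for _ in range(8):
        e = endpoint_derivative(cubic, 1.0, dt)
        m = midpoint_derivative(cubic, 1.0, dt)
        print(f"dt = {dt:<10.6f} endpoint = {e:.6f}  midpoint = {m:.6f}")
        dt /= 2
```

For the cubic, the endpoint error shrinks roughly in proportion to $\Delta t$, while the midpoint error shrinks in proportion to $(\Delta t)^2$. For a quadratic such as `quadratic`, the midpoint estimate equals the exact derivative for any $\Delta t$, apart from round-off.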




3 

Kinematics  in 
Two  and  Three 
Dimensions 


3-1  PROJECTILE MOTION

If the universe were one-dimensional, physics would be much simpler. But
that would hardly compensate for the loss of richness of phenomena which
makes the physical world so fascinating. Very many of the most important
phenomena of physics simply could not take place in a one-dimensional
world.

Now  that  you  have  begun  to  feel  at  home  with  the  kinematics  used  to 
describe  motion  in  one  dimension,  it  is  time  to  see  what  generalizations  are 
required  to  extend  kinematics  so  that  it  applies  to  motion  in  two  dimen- 
sions, and  ultimately  in  three  dimensions  as  well.  This  extension  not  only 
introduces  additional  physics,  but  also  calls  for — and  leads  naturally  to — a 
convenient  and  powerful  mathematical  tool  known  as  the  vector. 

Let  us  begin  our  study  of  physics  in  two  dimensions  by  considering  a 
straightforward  question:  After  an  object  is  projected  in  a direction  parallel 
to  the  earth's  surface,  it  has  a horizontal  motion.  But  it  also  has  a downward 
vertical  motion  resulting  from  the  influence  of  gravity.  Is  the  vertical  mo- 
tion affected  by  the  horizontal  motion,  and  vice  versa? 

Every  child  is  aware  that  there  are  two  conceivable  answers  to  this 
question.  The  first  one,  which  has  strong  appeal  in  spite  of  its  implausibil- 
ity,  appears  regularly  in  the  cartoons  of  Saturday  morning  television.  See 
Fig. 3-1. The villain, in pursuit of the hero, runs over the edge of a cliff. He
continues  to  move  horizontally  for  some  distance  until  he  comes  to  a stop 
with  feet  windmilling,  and  only  then  does  he  plunge  vertically  downward. 

Children  never  fail  to  laugh  at  this  cliche.  It  is  based  on  a notion  that 
things  should  not  be  able  to  move  simultaneously  in  both  the  horizontal 
and  vertical  directions,  that  somehow  they  should  have  to  lose  their  hori- 
zontal motion  before  they  can  begin  vertical  motion.  This  view  is  not 



Fig.  3-1  Motion  in  the  “cartoon  universe,”  in  which  horizontal  and  vertical  motion  cannot  take 
place  simultaneously.  In  this  sequence  of  cartoon  stills,  the  villain  runs  off  the  edge  of  the 
cliff,  but  does  not  begin  to  fall  until  his  horizontal  motion  has  completely  ceased.  Only  then  does 
he  plunge  straight  downward  to  disaster. 


merely  childish;  it  was  shared  in  at  least  some  degree  by  most  physicists 
until  near  the  end  of  the  Renaissance.  And  yet  children  know  that  real  ob- 
jects do  not  behave  in  “cartoon”  fashion.  A child  who  really  thought  they 
did  could  never  learn  to  catch  a ball. 

You  might  guess  that  the  analysis  of  motion  would  be  very  simple  in  a 
universe  where  things  could  not  move  horizontally  and  vertically  at  the 
same  time,  since  the  motions  could  be  handled  one  at  a time.  But  in  fact 
such  a universe  would  be  more  complicated  than  the  one  in  which  we  actu- 
ally live.  The  “cartoon  law  of  motion”  implies  a very  strong  interaction 
between  horizontal  and  vertical  motion,  which  would  have  to  be  taken  into 
account  in  a complicated  way. 

Fortunately,  the  way  things  actually  behave  is  much  simpler.  The  rule 
underlying  that  behavior  is  exactly  the  opposite  of  what  happens  in  the  car- 
toon: Under  the  influence  of  gravity,  objects  change  their  vertical  positions 
by  amounts  which  have  absolutely  nothing  to  do  with  changes  in  their  horizon- 
tal positions,  even  though  the  vertical  and  horizontal  motions  take  place 
simultaneously.  The  point  is  made  clear  by  the  experiment  shown  in  the 
strobe  photo  of  Fig.  3-2.  In  this  experiment,  two  steel  balls  are  initially  at 
the  same  height.  Ball  1 is  loaded  into  a spring  gun  which  can  shoot  it  as  a 
projectile,  and  ball  2 is  held  by  an  electromagnet.  A single  switch  simulta- 
neously triggers  the  gun  and  turns  off  the  electromagnet,  and  both  balls 
start  to  move  simultaneously.  Ball  2 is  simply  dropped  and  has  no  horizon- 


Fig.  3-2  Strobe  photograph  of  two  balls  which  are  set 
into  motion  simultaneously,  but  in  different  ways,  and 
which  then  continue  to  move  under  the  influence  of 
gravity. Ball 1, on the right, is shot horizontally from a
spring  gun.  Ball  2,  on  the  left,  is  simply  released  at  the 
same  moment  by  turning  off  the  electromagnet  which 
holds  it  at  the  same  initial  height  as  ball  1. 




tal  velocity.  The  spring  gun  has  been  aimed  in  the  horizontal  plane,  so  that 
ball  1 leaves  the  gun  with  a horizontal  velocity.  So  far  as  horizontal  motion  is 
concerned,  ball  1 acts  like  the  air  table  puck  in  Fig.  1-5,  which  moved  equal 
distances  in  the  equal  time  intervals  from  one  light  flash  to  the  next.  That 
is,  the  horizontal  velocity  of  the  ball  remains  constant.  This  can  be  verified 
by  measuring  the  distances  through  which  the  ball  travels  horizontally 
between  its  successive  positions.  The  vertical  motion  of  ball  1 is  the  same  as 
that  of  ball  2,  which  falls  straight  down  with  constant  acceleration.  This 
statement  is  based  directly  on  the  evidence  of  Fig.  3-2,  since  both  balls 
travel  the  same  distances  vertically  between  their  successive  positions. 
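The two observations drawn from Fig. 3-2 can be imitated numerically. The sketch below is illustrative only: the flash interval of 0.040 s and the launch speed of 1.2 m/s are assumed values, and the function name is hypothetical. It takes for granted what the photograph shows, namely a constant horizontal velocity and a constant downward acceleration of magnitude g, and lists the positions of both balls at successive flashes.

```python
G = 9.8  # magnitude of the gravitational acceleration, m/s^2


def strobe_positions(v_horizontal, flash_interval, n_flashes):
    """Return (x, y) at each flash for a ball released from the origin at t = 0.

    x is horizontal; y is measured upward, so a falling ball has negative y.
    """
    positions = []
    for k in range(n_flashes):
        t = k * flash_interval
        x = v_horizontal * t   # constant horizontal velocity
        y = -0.5 * G * t**2    # constant downward acceleration from rest
        positions.append((x, y))
    return positions


FLASH_INTERVAL = 0.040  # s, assumed for illustration

ball_1 = strobe_positions(1.2, FLASH_INTERVAL, 6)  # shot horizontally
ball_2 = strobe_positions(0.0, FLASH_INTERVAL, 6)  # simply dropped

for (x1, y1), (x2, y2) in zip(ball_1, ball_2):
    print(f"ball 1: x = {x1:.3f} m, y = {y1:.4f} m    "
          f"ball 2: x = {x2:.3f} m, y = {y2:.4f} m")
```

In the printed table the x values of ball 1 increase by the same amount from one flash to the next, and the y values of the two balls coincide at every flash, which is the pattern seen in the photograph.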

Thus the ball shot from the gun does not “know” it is moving vertically,
as far as its horizontal motion is concerned. Likewise, it does not
“know”  it  is  moving  horizontally,  as  far  as  its  vertical  motion  is  concerned. 
This  is  fortunate,  since  we  already  understand  how  to  analyze  both  motion 
with  constant  velocity  and  motion  with  constant  acceleration.  The  problem, 
then,  is  to  combine  these  motions  in  two  dimensions  and  thus  to  determine 
the  path  of  the  ball  (or  projectile),  which  is  called  its  trajectory. 

In  order  to  describe  the  trajectory  in  a specific  fashion,  it  is  necessary 
first to choose an origin, positive directions, and distance scales for the
coordinate  axes.  In  making  these  choices,  we  specify  what  mathematicians 
call  a coordinate  system.  There  is  a further  requirement,  however,  if  a 
coordinate  system  is  to  be  useful  in  making  physical  measurements.  There 
must  be  an  observer,  real  or  imagined,  who  makes  the  necessary  measure- 
ments and  specifies  them  in  terms  of  the  coordinates.  The  observer  has  a 
fixed  location  with  respect  to  the  coordinate  axes  being  used.  Taken 
together,  the  observer  and  the  coordinate  axes  constitute  a frame  of  refer- 
ence. 

In  Fig.  3-2  it  is  convenient  to  measure  both  the  horizontal  coordinate  x 
and  the  vertical  coordinate  y of  ball  1 from  its  starting  point  in  the  spring 
gun.  So  we  fix  the  origin  of  these  coordinates  at  the  gun.  Also,  we  choose 
the  positive  direction  of  the  x axis  toward  the  right  and  the  positive  direc- 
tion of  the  y axis  upward.  Since  the  ball  always  moves  downward,  this 
choice  means  that  the  values  of  y will  all  be  negative.  (While  it  might  seem 
“natural”  to  avoid  negative  quantities  by  choosing  the  downward  direction 
as  that  of  positive  y,  this  is  not  done  because  it  is  conventional  to  draw  the 
positive  y axis  of  a coordinate  system  in  the  direction  90°  counterclockwise 
from  that  of  the  positive  x axis.)  We  denote  the  initial  velocity  of  ball  1 as  it 
is shot from the gun by $v_i$, the velocity of its motion in the horizontal direction
at any time by $v_x$, and the velocity of its motion in the vertical direction
at any time by $v_y$. Since the horizontal and vertical motions are mutually
independent,  the  equations  describing  the  motion  of  the  ball  after  it  leaves 
the  gun  can  be  written  separately.  We  use  g to  represent  the  magnitude  of 
the  gravitational  acceleration.  These  equations  are  then  written 


Horizontal

$$v_x = v_i = \text{constant} \tag{3-1a}$$

which leads to the relation

$$x = v_i t \tag{3-2a}$$

Vertical

$$v_y = -gt \tag{3-1b}$$

which leads to the relation

$$y = -\frac{gt^2}{2} \tag{3-2b}$$




Equations (3-2a) and (3-2b) are independent descriptions of motion,
one involving the coordinate x and the other the coordinate y. But x and y
both depend on a common variable, the time t. Such equations are called parametric
equations, and the common variable (here t) is called the parameter.

The  algebraic  operation  of  eliminating  t from  the  pair  of  equations 
yields  a single  equation  expressing  a relation  between  the  value  of  x and 
that of y for the position of the ball at any value of t. To find this equation,
square Eq. (3-2a) to obtain $x^2 = v_i^2 t^2$, and solve for $t^2$:

$$t^2 = \frac{x^2}{v_i^2}$$

Insert this value of $t^2$ into Eq. (3-2b) to obtain

$$y = -\frac{g}{2v_i^2}\,x^2 \tag{3-3}$$


Equation  (3-3)  describes  all  the  points  through  which  the  ball  passes.  It  is 
the  equation  of  a parabola.  Thus  the  trajectory  is  parabolic.  This  result  was 
first  derived  by  Galileo,  probably  quite  early  in  the  seventeenth  century. 
Specific cases of Eqs. (3-2a) and (3-2b), and of the connection between them
given  by  Eq.  (3-3),  are  worked  out  in  Example  3-1. 


EXAMPLE  3-1 


Measurements  made  on  the  strobe  photo  of  Fig.  3-2  show  that  ball  1 leaves  the 
gun with a horizontal velocity $v_x = v_i = 1.2$ m/s.

a.  Calling the time when the ball leaves the gun t = 0, find the quantities x and
y,  which  describe  the  horizontal  and  vertical  positions  of  the  ball  with  respect  to  the 
gun,  at  the  times  t = 0.25  s and  t = 0.50  s. 

b.  Give  the  equation  for  the  trajectory  of  the  ball,  and  then  use  it  to  check  the 
relation  between  the  values  of  x and  y found  for  t = 0.50  s. 

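■ a.  When you insert the numerical values into Eq. (3-2a), you obtain for
t = 0.25 s the value

$$x = v_i t = 1.2\ \text{m/s} \times 0.25\ \text{s} = 0.30\ \text{m}$$

From Eq. (3-2b) you obtain

$$y = -\frac{gt^2}{2} = -\frac{9.8\ \text{m/s}^2 \times (0.25\ \text{s})^2}{2} = -0.31\ \text{m}$$

For t = 0.50 s you have

$$x = 1.2\ \text{m/s} \times 0.50\ \text{s} = 0.60\ \text{m}$$

and

$$y = -\frac{9.8\ \text{m/s}^2 \times (0.50\ \text{s})^2}{2} = -1.2\ \text{m}$$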



b.  Substituting the numerical values of g and $v_i$ into Eq. (3-3), you can express
the trajectory in the form

$$y = -\frac{9.8\ \text{m/s}^2}{2 \times (1.2\ \text{m/s})^2}\,x^2 = (-3.4\ \text{m}^{-1})\,x^2$$

For the particular value x = 0.60 m, which you found for t = 0.50 s, this equation
predicts

$$y = (-3.4\ \text{m}^{-1}) \times (0.60\ \text{m})^2 = -1.2\ \text{m}$$


in  agreement  with  the  corresponding  value  of  y found  in  part  a. 
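The arithmetic of Example 3-1 can be checked with a few lines of Python. This is a sketch using the example's own values ($v_i$ = 1.2 m/s, g = 9.8 m/s²); the function names are hypothetical and chosen only for illustration.

```python
G = 9.8    # magnitude of the gravitational acceleration, m/s^2
V_I = 1.2  # horizontal launch speed from Example 3-1, m/s


def position(t, v_i=V_I, g=G):
    """Parametric form: x = v_i * t and y = -g * t**2 / 2."""
    return v_i * t, -g * t**2 / 2


def trajectory_coefficient(v_i=V_I, g=G):
    """Coefficient k in the trajectory y = k * x**2, with k = -g / (2 * v_i**2)."""
    return -g / (2 * v_i**2)


for t in (0.25, 0.50):
    x, y = position(t)
    print(f"t = {t:.2f} s: x = {x:.3f} m, y = {y:.3f} m")

k = trajectory_coefficient()
x, y = position(0.50)
print(f"k = {k:.3f} m^-1, k * x^2 = {k * x**2:.3f} m, y = {y:.3f} m")
```

Rounded to two significant figures, the printed values agree with those of the example, and the last line shows that the trajectory equation and the parametric equations give the same y at t = 0.50 s.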


Now  we  will  try  out  all  the  ideas  we  have  developed  on  a more  general 
case  of  projectile  motion.  The  experiment  illustrated  by  the  strobe  photo  of 
Fig.  3-3  demonstrates  what  happens  if  the  spring  gun  used  to  project  the 
ball  is  not  aimed  horizontally.  (For  practical  reasons,  it  is  necessary  to 
compress the spring more tightly in this experiment than in the experiment
of Fig. 3-2.)

Inspection  of  Fig.  3-3  shows  immediately  that  the  horizontal  distance 
between  successive  positions  of  the  ball  is  constant,  just  as  was  the  case  in 
Fig. 3-2. If we call the horizontal velocity $v_x$, the horizontal distance
covered is given by the relation

$x = v_x t$  (3-4)

which  is  valid  for  constant  vx  if  we  set  t = 0 at  the  moment  of  launching. 

Is it still true that the vertical motion is one of constant downward
acceleration g, in spite of the fact that the ball has an initial upward vertical
velocity $v_{yi}$? It is not easy to answer this question by looking at the strobe
photo. But let us assume that again the horizontal motion is irrelevant,
as far as the vertical motion is concerned. If this is so, the vertical motion
of the ball will be just the same as if it were moving in only the vertical
direction. We know from the study of one-dimensional vertical motion in Chap.
2 that the acceleration will then indeed be downward with constant
magnitude g.

If the assumption is correct, it also follows that Eq. (2-30),
$x = x_i + v_i t + a t^2/2$, can be applied to the vertical motion with appropriate
changes in notation. Since we are calling the vertical coordinate y, with the
positive direction upward and the gun located at the origin where $y_i = 0$,
Eq. (2-30) becomes

$y = v_{yi} t - \dfrac{g t^2}{2}$  (3-5)

Fig. 3-3 Strobe photograph of a ball shot from a spring gun in a nonhorizontal
direction. The clock shows that 0.040 s passes between each flash of the
strobe light and the next one. Only part of the spring gun can be seen.


Equations (3-4) and (3-5) have the common variable time, just like the
simpler Eqs. (3-2a) and (3-2b), which apply to the case of the horizontally
aimed gun. Solving Eq. (3-4) for t and substituting into Eq. (3-5) gives

$y = \dfrac{v_{yi}}{v_x}\, x - \dfrac{g}{2 v_x^2}\, x^2$  (3-6)


Like  Eq.  (3-3),  this  is  the  equation  of  a parabola,  although  this  time  the 
turning  point  (the  point  where  the  slope  of  the  parabola  is  zero)  is  not  at 
the  origin.  Example  3-2  will  give  you  a direct  experimental  verification  of 
Eq.  (3-6),  and  therefore  also  of  Eqs.  (3-4)  and  (3-5). 
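The substitution that leads from Eqs. (3-4) and (3-5) to Eq. (3-6) can be checked numerically. The following Python sketch evaluates the position both ways and confirms that they agree. The launch values are illustrative assumptions chosen for a tabletop scale, not data taken from the text, and the function names are hypothetical.

```python
def position(t, v_x, v_yi, g):
    """Eqs. (3-4) and (3-5): position at time t, with launch at the origin at t = 0."""
    return v_x * t, v_yi * t - g * t**2 / 2


def trajectory_y(x, v_x, v_yi, g):
    """Eq. (3-6): height as a function of horizontal distance."""
    return (v_yi / v_x) * x - g / (2 * v_x**2) * x**2


# Illustrative values only (assumptions), in centimeters and seconds.
V_X, V_YI, G = 90.0, 280.0, 980.0

for t in (0.1, 0.2, 0.3, 0.4):
    x, y = position(t, V_X, V_YI, G)
    # The parametric description and the trajectory equation must agree.
    assert abs(y - trajectory_y(x, V_X, V_YI, G)) < 1e-9
    print(f"t = {t:.1f} s: x = {x:6.1f} cm, y = {y:6.1f} cm")

# Setting the slope dy/dx of Eq. (3-6) to zero locates the turning point
# at x = v_x * v_yi / g, which is not at the origin.
x_top = V_X * V_YI / G
print(f"turning point: x = {x_top:.1f} cm, "
      f"y = {trajectory_y(x_top, V_X, V_YI, G):.1f} cm")
```

Passing the launch values as arguments, rather than fixing them as constants inside the functions, makes it easy to try other values of $v_x$ and $v_{yi}$.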


EXAMPLE  3-2 

Using measurements made on the photograph of Fig. 3-3, find the position
coordinates x and y for one of the images of the ball. Measure also the (constant)
horizontal velocity $v_x$ and the initial vertical velocity $v_{yi}$. Use Eq. (3-6) and
the measured values of x, $v_x$, and $v_{yi}$ to calculate y, and compare the
calculated value of y with the measured value.

■ Choose any image of the ball, say the eleventh, counting the image of the ball in
the gun as zero. (This is the fourth image of the ball past the top of the trajectory.)
Choose the positive x direction to the right and the positive y direction upward.
Take the starting position of the ball (in the gun) to be the origin, so that $x_i = 0$
and $y_i = 0$. When you scale off horizontal and vertical distances, you find that the
coordinates of the eleventh image of the ball are

$x_{11} = 41.5$ cm and $y_{11} = 25.7$ cm


A digression concerning significant figures is appropriate here. You found in
making measurements on Fig. 3-3 that it is not possible to measure the location of
an image of the ball within 0.1 cm. That is, the last digit in the value of x or y
immediately above is not a significant figure in the sense of the discussion in
Example 2-1. For example, x = 41.5 cm cannot be taken to mean that x is less than
41.6 cm and more than 41.4 cm. Nevertheless, it is not desirable to drop the last
digit completely, because you can make the measurements, or at least estimate
them reliably, within a few tenths of a centimeter. Put another way, the possible
precision of the measurement is such that you would not be exploiting it fully if
you discarded the digits to the right of the decimal point. In cases like this, you
should retain the last digit even though it is not fully significant. However, you
must bear in mind that the significance of the last digit in the final result will be
the same (at best) as that of the corresponding digit in the least precise
measurement.


To find an accurate value of $v_x$, you choose two well-separated images of the
ball, say image 3 and image 11. Call the time interval between successive flashes of
the strobe light a flash interval. Since eight flash intervals separate the two images
of interest, the time elapsed as the ball passes from image 3 to image 11 is
$\Delta t = 8$ flash intervals. Scaling off the horizontal distance, you obtain
$x_{11} - x_3 = 29.7$ cm. Thus you have for the horizontal velocity

$v_x = \dfrac{x_{11} - x_3}{\Delta t} = \dfrac{29.7\ \text{cm}}{8\ \text{flash intervals}} = 3.71\ \text{cm/flash interval}$


Using the timer in the figure to measure the number of flashes occurring in 1 s, you
find that 1 flash interval = 0.0400 s. You thus have

$v_x = \dfrac{3.71\ \text{cm}}{1\ \text{flash interval}} \times \dfrac{1\ \text{flash interval}}{0.0400\ \text{s}} = 92.8\ \text{cm/s}$

It is not possible to obtain a precise value of $v_{yi}$ from the strobe photo by direct
measurement, since the ball has that initial vertical velocity only at the instant of
departure from the gun. Nevertheless, you can make an approximation by working
backward. Call the measured heights of images 1 and 2 of the moving ball $y_1$ and
$y_2$. The average vertical velocity of the ball between flashes 1 and 2 is

$\langle v_y \rangle_{1 \to 2} = \dfrac{y_2 - y_1}{\Delta t} = \dfrac{19.2\ \text{cm} - 10.5\ \text{cm}}{0.0400\ \text{s}} = 218\ \text{cm/s}$

According to the argument made in Sec. 2-7, this is also the instantaneous velocity at
the midpoint of the time interval between flashes 1 and 2. That is, it is the
instantaneous velocity in the y direction at approximately 1 1/2 flash intervals =
0.0600 s after the ball is launched. This is an approximation, since the ball was
probably not launched simultaneously with a flash. Nevertheless, you can see from
the figure that the vertical distance between images 0 and 1 is greater than that
between images 1 and 2, so that the chances are that the firing of the gun nearly
coincided with a flash. Thus you can take the value of the average velocity
$\langle v_y \rangle_{1 \to 2}$ to be the value of the instantaneous velocity $v_{y,1\frac{1}{2}}$.

Now that you know $v_{y,1\frac{1}{2}} = 218$ cm/s (to a good approximation), you can
apply Eq. (2-29) to evaluate the ball's initial vertical velocity $v_{yi}$. In the present
notation, it is

$v_{y,1\frac{1}{2}} = v_{yi} - g t$

where t = 0.0600 s is the time 1 1/2 flash intervals after the initial time t = 0.
Solving for $v_{yi}$, you obtain

$v_{yi} = v_{y,1\frac{1}{2}} + g t$

Since you are measuring distances in centimeters, you must express the value of g in
centimeters per second squared. You have

$g = 9.80\ \text{m/s}^2 \times \dfrac{100\ \text{cm}}{1\ \text{m}} = 980\ \text{cm/s}^2$

You insert this value, and the other numerical values, into the equation for $v_{yi}$
above to obtain, within the accuracy of the approximation and the measurements
themselves,

$v_{yi} = 218\ \text{cm/s} + 980\ \text{cm/s}^2 \times 0.0600\ \text{s} = 277\ \text{cm/s}$

Using this value in Eq. (3-6), you obtain the result

$y = \dfrac{277\ \text{cm/s}}{92.8\ \text{cm/s}} \times 41.5\ \text{cm} - \dfrac{980\ \text{cm/s}^2}{2 \times (92.8\ \text{cm/s})^2} \times (41.5\ \text{cm})^2$

or

$y = 25.9$ cm

This compares well with the directly measured value, which is y = 25.7 cm. Indeed,
you could not expect a better correspondence, given the limits on the accuracy of
the measurements used.


3-2  PROPERTIES OF VECTORS


In the preceding section we learned how to find the quantities x and y that
specify, at any time t, the horizontally- and vertically-measured positions of
a ball with respect to a gun from which it is shot. But there are other
quantities which are useful in providing the same kind of information. One of
these is the "slant distance" from the gun to the ball. This distance r is the
length of a straight line extending from the gun to the ball. Refer to Fig.
3-4. In this figure, the pythagorean theorem tells us that

$r^2 = x^2 + y^2$

Thus the distance from the gun to the ball is

$r = \sqrt{x^2 + y^2}$  (3-7)

Fig. 3-4 The slant distance r from the gun to the ball at any moment is the
hypotenuse of the right triangle of which the horizontal distance x and the
vertical distance y are the sides. The direction from the gun to the ball is
specified by the angle $\phi$. The dashed gray curve represents the trajectory.

The positive root is taken because a distance is always positive. The value of
r provides a partial specification of the position of the ball relative to the
gun. To complete the specification it is necessary to give also the direction
from the gun to the ball. The angle $\phi$ in Fig. 3-4 does this. It is measured
counterclockwise from the positive x axis to the line of length r and, by the
definition of the tangent function, satisfies the relation

$\tan\phi = \dfrac{y}{x}$

Thus the angle $\phi$ is given by

$\phi = \tan^{-1}\dfrac{y}{x}$  (3-8)

(The symbol "$\tan^{-1}$" means that $\phi$ is the angle whose tangent is y/x.) For
any value of t, the values of x and y can be found from Eqs. (3-4) and (3-5),
and then used in Eqs. (3-7) and (3-8) to evaluate the quantities r and $\phi$.
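As a concrete illustration of Eqs. (3-7) and (3-8), the short Python sketch below evaluates r and $\phi$ for the position found in Example 3-1 at t = 0.50 s. The function name is hypothetical. The sketch uses `math.atan2` rather than a plain inverse tangent of y/x, which is a choice made here (not something stated in the text) so that the angle lands in the correct quadrant even when x is negative or zero.

```python
import math


def slant_distance_and_angle(x, y):
    """Return (r, phi in degrees) for a ball at horizontal position x and height y."""
    r = math.hypot(x, y)                    # Eq. (3-7): r = sqrt(x^2 + y^2)
    phi = math.degrees(math.atan2(y, x))    # Eq. (3-8), with the quadrant resolved
    return r, phi


# x and y from Example 3-1 at t = 0.50 s, in meters.
r, phi = slant_distance_and_angle(0.60, -1.2)
print(f"r = {r:.2f} m, phi = {phi:.1f} degrees")
```

A negative value of $\phi$ here means an angle measured clockwise from the positive x axis, which is the case for a ball that has fallen below the level of the gun.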

There is a striking contrast between the way the horizontally- and
vertically-measured relative positions are added in Eq. (3-7) and the
familiar way, called algebraic addition, that quantities such as time and volume
are added. For example, a 2-s time interval followed by a 3-s time interval
constitutes a total time interval of 2 s + 3 s = 5 s. For another example, if
you fill a 5-liter (L) bottle with water and then fill a 1-L bottle from the 5-L
bottle, the water remaining in the latter amounts to 5 L + (-1 L) = 4 L. A
quantity which adds to a like quantity in this manner can be completely
specified, in terms of an agreed-upon unit, by a single number preceded by a
positive sign (usually unwritten) or by a negative sign. Such a quantity is
called a scalar.

A quantity  which  adds  to  a like  quantity  in  the  different  way  that  relative 
positions  add  requires  more  than  a single  number  to  specify  it  completely. 
Such  a quantity  is  called  a vector.  Relative  position  is  not  the  only  vector 
quantity.  For  example,  velocity  and  acceleration  are  vectors,  as  we  will  see 
in  the  next  section.  In  this  section  we  develop  the  properties  of  vectors.  Our 
development  will  be  guided  by  considering  the  mathematical  properties  of 
position vectors. But any other quantity which partakes of the same
mathematical properties is a vector, no matter what physical attribute it describes.

We  have  learned  already  that  there  are  two  different  ways  to  specify  a 
vector: 


1.  In  a situation  confined  to  two  dimensions,  a vector  can  be  specified 
in  terms  of  two  scalars.  These  scalars  are  the  two  components  of  the  vector. 
For  example,  a vector  describing  the  position  of  an  object  with  respect  to 
the  origin  can  be  specified,  using  a set  of  mutually  perpendicular  axes,  in 
terms  of  its  components.  These  are  the  pair  of  scalars  designated  as  x and  y 
in Fig. 3-4. Because the properties of vectors are general, we now introduce
a notation that more clearly applies to any vector and not just a position
vector. We do this by replacing the symbols x and y by the symbols $A_x$
and $A_y$.

Fig. 3-5 Illustrating two ways of describing the position of the point
($A_x$, $A_y$) with respect to the origin (0, 0), as discussed in the text. The
figure assumes that both $A_x$ and $A_y$ have positive values.

In Fig. 3-5, a pair of mutually perpendicular coordinate axes has been
drawn on a plane. The origin of these coordinates is the point in the plane
specified by the pair of values ($A_x = 0$, $A_y = 0$), or simply (0, 0) for short.
Now, any other point in the plane can be uniquely specified by an ordered pair of
scalars (numbers) ($A_x$, $A_y$). [It is purely conventional that $A_x$ is written
first, but it is important to stick to the convention once it is established; (5, 3)
is not the same point as (3, 5).]

In essence, this ordered pair prescribes the position of the point ($A_x$,
$A_y$) with respect to the origin. It does so by specifying a pathway for
reaching it from the origin. The prescription reads: Beginning at the origin,
measure off a length equal to the magnitude of $A_x$, extending in the positive
direction along the x axis if $A_x$ is positive and in the negative direction if $A_x$
is negative. Next turn your ruler by 90°, so that it is parallel to the y axis. Then
measure off a length equal to the magnitude of $A_y$, extending in the positive
direction along the y axis if $A_y$ is positive and in the negative direction if $A_y$ is
negative. You have now located the point ($A_x$, $A_y$) in a completely
unambiguous way.

2. In two dimensions, a different pair of scalars can also be used to
specify a vector. One of these scalars gives the magnitude of the vector
(that is, its absolute numerical value), and the other gives its direction. An
example is the pair of scalars (r, $\phi$) in Fig. 3-4. This second way of
specifying a vector is the algebraic equivalent of the "natural" way to depict a
vector geometrically, as an arrow whose length and orientation represent
the magnitude and direction of the vector. Such an arrow, of length A
directed at angle $\phi_A$, can be seen in Fig. 3-5.

The ordered pair of scalars (A, $\phi_A$) can be used in a prescription for
locating a point on the plane by following a different pathway: Begin at the
origin, with your ruler along the positive x axis. Then turn the ruler counterclockwise
through an angle $\phi_A$. Next measure off a length A. You are now at the same point
on the plane as prescribed in 1, provided that

$A = \sqrt{A_x^2 + A_y^2}$ and $\phi_A = \tan^{-1}\dfrac{A_y}{A_x}$  (3-9)

The rules quoted in Eqs. (3-9) for obtaining the values of the ordered
pair (A, $\phi_A$) from those of the ordered pair ($A_x$, $A_y$) are identical to the rules
set forth in Eqs. (3-7) and (3-8), except for the change to the more general
notation. There are also rules for obtaining ($A_x$, $A_y$) from (A, $\phi_A$). These
rules are found by applying the definitions of the sine and cosine functions
to the triangle with sides $A_x$, $A_y$, and A in Fig. 3-5. These definitions are

$\cos\phi_A = \dfrac{A_x}{A}$ and $\sin\phi_A = \dfrac{A_y}{A}$

Multiplication of both sides of each of these equalities by A yields

$A_x = A\cos\phi_A$ and $A_y = A\sin\phi_A$  (3-10)
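The two sets of rules, Eqs. (3-9) and Eqs. (3-10), are inverses of each other, and this can be tried out directly. The Python sketch below converts a few ordered pairs of components to magnitude and direction and back again. The function names are hypothetical, and the use of `math.atan2` in place of a bare inverse tangent is an assumption of this sketch: the ratio $A_y/A_x$ alone cannot distinguish a vector from its opposite, so the signs of both components are used to fix the quadrant.

```python
import math


def to_polar(a_x, a_y):
    """Eqs. (3-9): components -> (magnitude, direction in degrees)."""
    return math.hypot(a_x, a_y), math.degrees(math.atan2(a_y, a_x))


def to_components(a, phi_deg):
    """Eqs. (3-10): (magnitude, direction in degrees) -> components."""
    phi = math.radians(phi_deg)
    return a * math.cos(phi), a * math.sin(phi)


# (5, 3) and (3, 5) are different points. So are (3, 5) and (-3, -5),
# even though the last two share the same ratio A_y / A_x.
for point in [(5.0, 3.0), (3.0, 5.0), (-3.0, -5.0)]:
    a, phi = to_polar(*point)
    back = to_components(a, phi)
    print(point, "->", (round(a, 3), round(phi, 1)),
          "->", tuple(round(c, 3) for c in back))
```

Each pair returns to its starting components, which is the numerical counterpart of the statement that either ordered pair specifies the same point.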

We now make an important observation. Given the ordered pair ($A_x$,
$A_y$), we can find the ordered pair (A, $\phi_A$), and vice versa. Both ordered pairs
uniquely specify the same point in space, which can therefore be labeled
equally well with either pair. To put it another way, the vector $\mathbf{A}$ (which
determines a certain position with respect to the origin) is uniquely specified by an
ordered pair of numbers, each of which is a scalar.

Fig. 3-6 All the vectors shown are the same vector, since they have identical
magnitudes and directions. The location of a vector is not one of its intrinsic
properties.

It is conventional in printed material to distinguish vectors from
scalars by using boldface type for all letters designating vector quantities, as
has been done immediately above for the vector $\mathbf{A}$. (In handwritten
material, vectors are usually denoted by drawing small arrows above the
symbols: $\vec{A}$. This is a more evocative notation, but it is awkward to set the
arrows in type.) In dealing with situations which are definitely two- or
three-dimensional, we will always use a letter in boldface type to represent
a vector and the same letter in italic type to represent the magnitude of that
vector. Thus the magnitude of the vector $\mathbf{A}$ is the scalar $A$.

Now that we have said what a vector is, it is equally important to make
clear what it is not. In the (A, $\phi_A$) representation (which corresponds most
closely to intuition) a vector is completely specified by a magnitude and a
direction. Thus a vector has no other properties. In particular (and this
sometimes strikes people as surprising at first) a vector has no particular
location. For example, all the vectors $\mathbf{A}$ shown in Fig. 3-6 are the same vector.
Thus a vector may be moved without changing it in any way, provided no
change is made in its magnitude or its direction. In other words, a vector can
be moved parallel to itself without changing it.

If the vectors in Fig. 3-6 are position vectors, then each shows the
position of the point at its head with respect to that of the point at its tail. For
one of these vectors the tail happens to be at the origin and so it gives the
position of a point with respect to the origin. For the other vectors this is
not the case. But in all cases the vector depicts exactly the same position of
the point at its head relative to that of the point at its tail. This is all the
information that can be provided by a position vector since, like any other
vector, it is completely specified by its magnitude and its direction.
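The statement that a vector is unchanged when it is moved parallel to itself can be illustrated with a few lines of Python. In this sketch a point is represented as an (x, y) tuple, which is an assumption made for illustration, and the shift applied to the arrow is an arbitrary hypothetical value.

```python
def displacement(tail, head):
    """Components of the vector drawn from the point `tail` to the point `head`."""
    return (head[0] - tail[0], head[1] - tail[1])


def translate(point, shift):
    """Move a point by the given shift, without any rotation."""
    return (point[0] + shift[0], point[1] + shift[1])


tail, head = (0.0, 0.0), (3.0, 2.0)   # an arrow with its tail at the origin
shift = (4.0, -1.0)                   # hypothetical parallel move of the whole arrow

original = displacement(tail, head)
moved = displacement(translate(tail, shift), translate(head, shift))

# Both arrows have the same components, so they are the same vector.
print(original, moved, original == moved)
```

Only the difference between head and tail enters the result, so the shift cancels and the position of the head relative to the tail is the same wherever the arrow is drawn.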


Sometimes the location of a vector must be specified for other reasons. For
example, the effect produced by a force exerted on a lever depends on the point of
application of the force. So even though a force can be represented completely by
a magnitude and a direction, which are not altered by changing its location (that
is, force is a vector), its physical effect depends on the point at which it is
applied. But note that the significance of the point of application of the force is a
consequence of the overall physical situation and is not a property of the vector
itself. You will study cases of this sort in Chap. 9.


How  are  vectors  “added”?  We  have  already  seen  how  to  construct  a 
vector  out  of  its  components.  This  is  what  is  done  in  Fig.  3-5.  The  operation 
already  contains  implicitly  the  essentials  of  vector  addition.  We  now  make 
the  vector  addition  process  explicit. 

Consider a pair of components ($A_x$, $A_y$), assuming for simplicity that
both have positive values. Figure 3-7a shows a vector $\mathbf{A}_x$ of length $A_x$
extending in the positive direction along the x axis. Also shown is a vector
$\mathbf{A}_y$ of length $A_y$ extending in the positive direction along the y axis.

The vector $\mathbf{A}_y$ can be moved parallel to itself so that its tail coincides
with the head of the vector $\mathbf{A}_x$, as shown in Fig. 3-7b. For the reasons just
discussed, this has no effect on the vector. However, the vectors shown in
Fig. 3-7b display pictorially the prescription of Fig. 3-5 for constructing the
vector $\mathbf{A}$, which has the components ($A_x$, $A_y$). That is, $\mathbf{A}_x$ represents the
process of measuring off a length $A_x$ in the positive direction along the x
axis, while $\mathbf{A}_y$ represents the subsequent process of measuring off a length
$A_y$ parallel to the y axis in the positive direction of that axis. Thus the sequence
of the two processes is equivalent to specifying the vector $\mathbf{A}$. This fact is
represented in mathematical notation in the form

$\mathbf{A} = \mathbf{A}_x + \mathbf{A}_y$  (3-11)

We therefore say that $\mathbf{A}$ is the vector sum of the vectors $\mathbf{A}_x$ and $\mathbf{A}_y$. The
use of the symbol "+" and of the term "sum" does not imply that a vector sum is
the same as an ordinary algebraic sum. Rather it implies an analogy between
the two distinct mathematical operations.

Fig. 3-7 (a) The special vector $\mathbf{A}_x$, whose direction is that of the x axis, has
magnitude $A_x$. Similarly, the special vector $\mathbf{A}_y$, whose direction is that of the
y axis, has magnitude $A_y$. (b) The vector $\mathbf{A}_y$ is unchanged by moving it parallel
to itself until its tail coincides with the head of the vector $\mathbf{A}_x$. A vector $\mathbf{A}$ is
constructed with its tail at the origin and its head coincident with the head of
$\mathbf{A}_y$. This vector, whose magnitude is $A = (A_x^2 + A_y^2)^{1/2}$ and whose direction
is $\phi = \tan^{-1}(A_y/A_x)$, is the vector sum of the vectors $\mathbf{A}_x$ and $\mathbf{A}_y$; that is,
$\mathbf{A} = \mathbf{A}_x + \mathbf{A}_y$. The components of $\mathbf{A}$ are ($A_x$, $A_y$).

While it is not a universal convention, we will call $\mathbf{A}_x$ and $\mathbf{A}_y$ the
constituent vectors of $\mathbf{A}$. This is to distinguish them from the components $A_x$
and $A_y$, which are scalars.

Although the components $A_x$ and $A_y$ are not the same as the constituent
vectors $\mathbf{A}_x$ and $\mathbf{A}_y$, there is a close connection between them. The
connection can be made explicit by defining a quantity called the unit vector.
We define the unit vector $\hat{\mathbf{x}}$ (spoken as "x hat") to be the vector in the
positive x direction having magnitude 1, that is, unit magnitude. The vector $\mathbf{A}_x$,
which has magnitude $A_x$, is just $A_x$ times as "long" as the unit vector $\hat{\mathbf{x}}$ and
has the same direction as $\hat{\mathbf{x}}$. This is represented mathematically by the identity

$\mathbf{A}_x = A_x \hat{\mathbf{x}}$  (3-12)

(This "product" is a special case of multiplication of a vector by a scalar, an
operation which is defined and discussed more generally later in this
section.) Equation (3-12) is a very convenient way of expressing individually
the two essential properties of the vector $\mathbf{A}_x$: its magnitude $A_x$ and its
direction $\hat{\mathbf{x}}$. The idea is easily extended to any other vector. For the vector
$\mathbf{A}_y$ we have

$\mathbf{A}_y = A_y \hat{\mathbf{y}}$  (3-13)

And for any vector $\mathbf{A}$, having any direction $\hat{\mathbf{A}}$, whether it is a position
vector or some other type of vector, we can write

$\mathbf{A} = A \hat{\mathbf{A}}$  (3-14)

If $\mathbf{A}$ is a position vector it is not correct to say that the unit vector $\hat{\mathbf{A}}$ has
magnitude 1 m. The unit of measurement is associated with the magnitude
of a vector, not with its direction. This point is demonstrated in Example
3-3.


EXAMPLE 3-3


If the direction of the unit vector $\hat{\mathbf{A}}_x$ denotes eastward, and the direction of
the unit vector $\hat{\mathbf{A}}_y$ denotes northward, find the components of the position
vector $\mathbf{A}$ when A = 50.0 m and its direction is 30° north of east. Then write
expressions for the constituent vectors $\mathbf{A}_x$ and $\mathbf{A}_y$ of $\mathbf{A}$.

■ You begin by drawing a sketch of the situation, as in Fig. 3-8a. The orientation
of the vector $\mathbf{A}$ is shown relative to the directions of the unit vectors $\hat{\mathbf{A}}_x$
and $\hat{\mathbf{A}}_y$.

Fig. 3-8a The vector $\mathbf{A}$ oriented with respect to the unit vector $\hat{\mathbf{A}}_x$,
representing "eastward," and the unit vector $\hat{\mathbf{A}}_y$, representing "northward."

You now use Eqs. (3-10) to find the components $A_x$ and $A_y$ of the vector $\mathbf{A}$.
The first of these equations gives you

$A_x = A\cos\phi_A = 50.0\ \text{m} \times \cos 30^\circ$

or

$A_x = 43.3$ m

The second gives you

$A_y = A\sin\phi_A = 50.0\ \text{m} \times \sin 30^\circ$

or

$A_y = 25.0$ m

The corresponding constituent vector $\mathbf{A}_x$ is the product of the component $A_x$
and the unit vector $\hat{\mathbf{A}}_x$, and similarly for the constituent vector $\mathbf{A}_y$. Thus

$\mathbf{A}_x = A_x \hat{\mathbf{A}}_x = (43.3\ \text{m})(1\ \text{eastward})$

and

$\mathbf{A}_y = A_y \hat{\mathbf{A}}_y = (25.0\ \text{m})(1\ \text{northward})$

The constituent vectors are shown in Fig. 3-8b. The vector sum of the two
constituent vectors is equivalent to the vector $\mathbf{A}$; that is,

$\mathbf{A} = \mathbf{A}_x + \mathbf{A}_y$

Fig. 3-8b Illustrating the constituent vectors $\mathbf{A}_x$ and $\mathbf{A}_y$ of the vector $\mathbf{A}$.

Fig. 3-9 The point ($C_x$, $C_y$) can be reached from the origin either via the
straight pathway along the vector $\mathbf{C}$ or by first following the pathway from the
origin to the head of the vector $\mathbf{A}$ at ($A_x$, $A_y$), and then following the vector
$\mathbf{B}$ to ($C_x$, $C_y$). The equivalence of the two processes is denoted by the vector
sum $\mathbf{C} = \mathbf{A} + \mathbf{B}$.
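The distinction drawn in Example 3-3 between components (scalars carrying the unit of measurement) and unit vectors (pure directions) can be mirrored in a short Python sketch. Representing a vector as an (east, north) tuple, and the helper names `scale` and `add`, are assumptions made for illustration only.

```python
import math

EAST = (1.0, 0.0)     # unit vector: a pure direction, with no physical unit attached
NORTH = (0.0, 1.0)


def scale(scalar, vector):
    """Multiply a vector by a scalar, as in Eqs. (3-12) and (3-13)."""
    return (scalar * vector[0], scalar * vector[1])


def add(u, v):
    """Combine two vectors given as (east, north) tuples."""
    return (u[0] + v[0], u[1] + v[1])


A = 50.0                        # magnitude in meters
phi_A = math.radians(30.0)      # 30 degrees north of east

A_x = A * math.cos(phi_A)       # components (scalars), Eqs. (3-10)
A_y = A * math.sin(phi_A)

vec_A_x = scale(A_x, EAST)      # constituent vector along east
vec_A_y = scale(A_y, NORTH)     # constituent vector along north
vec_A = add(vec_A_x, vec_A_y)   # Eq. (3-11)

print(f"A_x = {A_x:.1f} m, A_y = {A_y:.1f} m")
print(f"magnitude of A_x + A_y = {math.hypot(*vec_A):.1f} m")
```

The magnitude recovered from the sum of the constituent vectors is the 50.0 m that was given, as Eq. (3-11) requires.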


We now generalize the idea of vector addition so as to give meaning to
the addition of two vectors having any orientation. Earlier in this section we
specified the position of a point with respect to the origin by prescribing
two pathways for reaching it. One was the "direct route" along the vector
$\mathbf{A}$. The other followed the constituent vectors $\mathbf{A}_x$ and $\mathbf{A}_y$ in succession. In
Fig. 3-9, two pathways are shown from (0, 0) to the arbitrary point ($C_x$, $C_y$).
One is again the "direct route." The other follows the vectors $\mathbf{A}$ and $\mathbf{B}$ in
succession. While $\mathbf{A}$ and $\mathbf{B}$ are not constituent vectors, they certainly do
comprise a pathway from the origin to the point ($C_x$, $C_y$). It is entirely
reasonable to state the equivalence of the two pathways in the figure by means
of the equation

$\mathbf{C} = \mathbf{A} + \mathbf{B}$  (3-15)

The sum of two vectors is found geometrically by making the tail of the second
coincide with the head of the first and then drawing a vector from the tail of the
first to the head of the second. To put it another way, the vector $\mathbf{A}$ locates the
point at its head having coordinates ($A_x$, $A_y$) with respect to the origin (0, 0).
The vector $\mathbf{B}$ locates, with respect to the point ($A_x$, $A_y$), the point at its head
having coordinates ($C_x$, $C_y$). The vector $\mathbf{C}$ locates the point having
coordinates ($C_x$, $C_y$) directly with respect to the origin. Thus $\mathbf{C}$, whose tail is at
(0, 0) and whose head is at ($C_x$, $C_y$), accomplishes in a single step the same
process of location which is accomplished sequentially by $\mathbf{A}$, with its tail at
(0, 0) and its head at ($A_x$, $A_y$), and by $\mathbf{B}$, with its tail at ($A_x$, $A_y$) and its head
at ($C_x$, $C_y$). This process is called vector addition.

What is the justification for giving the name "addition" to the vector
process sketched in Fig. 3-9, as is done in Eq. (3-15)? The justification is
easiest to see if the process is described in equivalent algebraic terms. In
Fig. 3-10a and b, the vector $\mathbf{A}$ has x and y components ($A_x$, $A_y$), and the
vector $\mathbf{B}$ has components ($B_x$, $B_y$). (Note that the same vector $\mathbf{B}$ is shown in
both figures.) The vector $\mathbf{C}$, which is the vector sum of $\mathbf{A}$ and $\mathbf{B}$, has
components ($C_x$, $C_y$). From inspection of the figures you can see that if
$\mathbf{C} = \mathbf{A} + \mathbf{B}$, then $C_x = A_x + B_x$ and $C_y = A_y + B_y$. That is, the sum of two
vectors is a vector whose components are the sums of the corresponding
components. We are thus justified in writing

$\mathbf{C} = \mathbf{A} + \mathbf{B}$

as the complete equivalent of the pair of equations

$C_x = A_x + B_x$  (3-16a)

and

$C_y = A_y + B_y$  (3-16b)

Fig. 3-10 (a) The x and y components of $\mathbf{A}$ and $\mathbf{C}$ are shown. (b) The vector
$\mathbf{B}$ is shown separately, moved so that its tail coincides with the origin. Its x and
y components are shown. Compare both parts of this figure with Fig. 3-9 to see
that $B_x$ is the difference between $C_x$ and $A_x$, and that $B_y$ is the difference
between $C_y$ and $A_y$. Thus $C_x = A_x + B_x$ and $C_y = A_y + B_y$. Consequently, the
vector from the origin to ($A_x + B_x$, $A_y + B_y$) is the vector $\mathbf{C}$.

Equations (3-16) amount to an algebraic method of finding the sum of any two
vectors whose components are known. The name "vector addition" is thus
justified both in the sense that the algebraic process for carrying it out
involves algebraic additions of the scalar components, and in the sense that
there is a strong analogy between the process of vector addition (whether it
is carried out algebraically or geometrically) and the process of algebraic
addition.

Figure 3-11 depicts two vectors $\mathbf{A}$ and $\mathbf{B}$, neither of which lies with its
tail at the origin. Nor do the two vectors describe, as they are shown in the
figure, a single pathway from one point to another. Nevertheless, the two
vectors can be added geometrically. This is done by reducing the situation
to the simpler one shown in Fig. 3-9, exploiting the fact that the
significance of a vector is independent of its location. It is necessary only to move
$\mathbf{A}$, without changing its magnitude or its direction, until its tail lies at the
origin and then to move $\mathbf{B}$ until its tail coincides with the head of $\mathbf{A}$. Thus
Eq. (3-15), $\mathbf{C} = \mathbf{A} + \mathbf{B}$, holds as well for Fig. 3-11 as it does for Fig. 3-9.

Fig. 3-11 In this case, the vectors $\mathbf{A}$ and $\mathbf{B}$ (shown by solid lines) are not
conveniently located with the tail of $\mathbf{A}$ at the origin and the tail of $\mathbf{B}$ coinciding
with the head of $\mathbf{A}$. But since a vector is not changed by moving it parallel to
itself, $\mathbf{A}$ and $\mathbf{B}$ can always be moved to the locations shown by dashed lines. The
vector summation can then be carried out as described in Fig. 3-9 or Fig. 3-10.


EXAMPLE 3-4


EXAMPLE 3-4

Vector A has magnitude A = 4.00 m and is directed at an angle $\phi_A = -45.0^\circ$ from
the positive x axis of a particular reference frame, with positive angles measured
counterclockwise. The magnitude of vector B is B = 2.00 m, and its direction from
the x axis is given by the angle $\phi_B = +120.0^\circ$.




a. Evaluate the magnitude and direction of their vector sum, C = A + B, by
using a geometrical method based directly on the definition of vector addition
specified by Eq. (3-15) and Figs. 3-9 and 3-11.

b. Then evaluate the magnitude and direction of C using an algebraic method
based on the component addition process specified by Eqs. (3-16) and Fig. 3-10a.
Compare your results with those obtained in part a.

■ a. Working as carefully as you can, you use a compass, protractor, and graph
paper to lay off the magnitudes and directions of vectors A and B to a convenient
scale on a set of xy coordinates. In Fig. 3-12a this is done with the tails of both
vectors at the origin of coordinates. Then use the drawing instruments to move
vector B, without changing its length or direction, so that its tail is at the head of
vector A, as in Fig. 3-12b. Now you can connect the tail of vector A to the head of
vector B with a vector labeled C. By the definition of vector addition, C = A + B.
Measuring its length and direction, you obtain C = 2.13 m and $\phi_C = -31.1^\circ$.
However, the accuracy of the last digit in both numbers is doubtful.


Fig. 3-12 Illustration for Example 3-4.




If you use this method again, you will likely want to skip making a construction
similar to Fig. 3-12a and go instead directly to a construction similar to Fig. 3-12b.
An alternative procedure begins with a construction like that of Fig. 3-12a. You
then draw a line parallel to B that goes through the head of A, and also a line
parallel to A that goes through the head of B. You will now have a parallelogram. Next
draw a vector whose tail is at the origin and whose head is at the intersection of the
two parallel lines you have constructed. This vector is C = A + B. Carry out this
procedure by drawing the parallel lines on Fig. 3-12a, and then explain why it
works.

b.  The  algebraic  method  requires  that  you  first  use  Eqs.  (3-10)  to  determine 
the  x and  y components  of  vectors  A and  B.  For  A you  have 

$A_x = A\cos\phi_A = 4.00\ \text{m} \times \cos(-45.0^\circ) = +2.83\ \text{m}$

and

$A_y = A\sin\phi_A = 4.00\ \text{m} \times \sin(-45.0^\circ) = -2.83\ \text{m}$


For  B you  have 

$B_x = B\cos\phi_B = 2.00\ \text{m} \times \cos(120.0^\circ) = -1.00\ \text{m}$

and

$B_y = B\sin\phi_B = 2.00\ \text{m} \times \sin(120.0^\circ) = +1.73\ \text{m}$


You sum these components algebraically to find the components $C_x$ and $C_y$, using
Eqs. (3-16a) and (3-16b). You obtain

$C_x = A_x + B_x = +2.83\ \text{m} - 1.00\ \text{m} = +1.83\ \text{m}$

and

$C_y = A_y + B_y = -2.83\ \text{m} + 1.73\ \text{m} = -1.10\ \text{m}$


To find the magnitude C and direction $\phi_C$ of the vector C, you employ Eqs.
(3-9). The magnitude is

$C = \sqrt{C_x^2 + C_y^2} = \sqrt{(1.83\ \text{m})^2 + (-1.10\ \text{m})^2} = 2.14\ \text{m}$


The direction is

$\phi_C = \tan^{-1}\left(\dfrac{C_y}{C_x}\right) = \tan^{-1}\left(\dfrac{-1.10\ \text{m}}{+1.83\ \text{m}}\right) = -31.0^\circ$

These  results  compare  well  with  those  found  by  the  geometrical  method.  Of 
the  two  methods,  the  algebraic  one  is  probably  faster,  and  certainly  more 
accurate  — particularly  if  you  use  a calculator  intended  for  scientific  work.  If  you 
use  a programmable  calculator,  you  can  program  it  to  use  the  algebraic  method  and 
add  any  number  of  vectors  automatically. 
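
The remark about programming the algebraic method can be illustrated with a short Python sketch. It is only one possible arrangement: the function names `polar_to_components` and `add_vectors` are invented here, and the two-argument `math.atan2` is used in place of the single-argument inverse tangent so that the quadrant of the result is handled automatically. That choice is an assumption of this sketch, not something the text prescribes.

```python
import math

def polar_to_components(magnitude, angle_deg):
    """Return the (x, y) components of a vector from its magnitude and its
    angle in degrees, measured counterclockwise from the positive x axis."""
    angle = math.radians(angle_deg)
    return magnitude * math.cos(angle), magnitude * math.sin(angle)

def add_vectors(vectors):
    """Add any number of (magnitude, angle_deg) vectors by summing their
    components. Returns (magnitude, angle_deg) of the vector sum."""
    cx = sum(polar_to_components(m, a)[0] for m, a in vectors)
    cy = sum(polar_to_components(m, a)[1] for m, a in vectors)
    return math.hypot(cx, cy), math.degrees(math.atan2(cy, cx))

# Vectors A and B of Example 3-4
C, phi_C = add_vectors([(4.00, -45.0), (2.00, 120.0)])
print(f"C = {C:.2f} m, phi_C = {phi_C:.1f} deg")
```

Because the components are not rounded to three figures before they are combined, the last digit of each result can differ by one unit from the hand calculation above (about 2.13 m and -30.9°, against 2.14 m and -31.0°).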


What  has  been  said  above  for  the  summation  of  two  vectors  applies  to 
any  number  of  vectors.  The  vectors  must  be  placed  head  to  tail.  Their  sum 
is  defined  as  the  vector  joining  the  tail  of  the  first  with  the  head  of  the  last. 
See  Fig.  3-13.  Can  you  write  a set  of  equations  to  describe  the  equivalent 
algebraic  addition  process? 




Fig. 3-13 Any number of vectors can be added by placing them head to
tail, and then joining the tail of the first to the head of the last. In the
figure, Z = A + B + C + D + E + F.




Fig. 3-14 Geometric justification for the commutative rule A + B = B + A
for vector addition. (a) The vector C is constructed by adding A and B in
both possible orders. Note that the addition process does not depend upon
the choice of any particular frame of reference, as long as the scale of length
is understood. (b) The vector C is uniquely specified as the directed
diagonal of the parallelogram of directed sides A and B. One pair of sides
represents the sum A + B, and the other pair the sum B + A.


You  are  familiar  with  the  fact  that  the  algebraic  sum  of  two  numbers 
does  not  depend  on  the  order  of  addition.  That  is,  a + b = b + a is  always 
true.  This  property  of  addition  is  called  commutativity.  Vector  sums  are 
also  commutative.  To  see  this,  note  that  the  components  of  a vector,  being 
themselves  scalars,  add  commutatively: 

$C_x = A_x + B_x = B_x + A_x$ and $C_y = A_y + B_y = B_y + A_y$

Therefore the coordinates $(B_x + A_x, B_y + A_y)$, which locate the head of
the vector B + A when its tail is at the origin, are identical with the
coordinates $(A_x + B_x, A_y + B_y)$, which locate the head of the vector A + B when
its tail is at the origin. That is, the two vectors are the same vector:

A + B = B + A (3-17) 

This rule applies as well to the sum of any number of vectors, which can be
added in any order. Figure 3-14 justifies the commutative rule from a
geometrical point of view.

The  idea  of  vector  addition  leads  to  a definition  for  the  negative  of  a 
vector.  For  any  vector  A there  is  always  a vector  A'  which  has  the  property 
that 


A + A'  = 0 (3-18) 

Such a pair of vectors is illustrated in Fig. 3-15. You can see from a
geometrical point of view that A' satisfies Eq. (3-18) by imagining A' to be moved
so  that  its  tail  coincides  with  the  head  of  A.  Then  observe  that  the  two 
vectors  add  to  “nothing.”  The  symbol  0 denotes  the  zero,  or  null,  vector. 
Since  the  null  vector  has  no  particular  direction,  we  will  often  ignore  its 
vectorial  nature  and  write  it  as  the  scalar  0. 

The  vector  A'  is  called  the  negative  of  A.  That  is,  by  definition 

A'  = -A 

Stated in words, the negative of any vector is a vector of equal magnitude and
opposite direction. In view of this definition, the negative of a vector can always
be  constructed  by  “turning  the  vector  around,”  that  is,  by  moving  its  head 
to  the  opposite  end. 

The  operation  of  vector  subtraction  is  defined  in  a way  which  follows 
directly  from  the  definition  of  the  negative  of  a vector.  In  complete  analogy 
with  ordinary  algebraic  subtraction,  we  define 

$\mathbf{A} - \mathbf{B} = \mathbf{A} + (-\mathbf{B})$ (3-19)




Fig. 3-15 The vector A' has the same magnitude as the vector A, but its
direction is opposite that of A. As explained in the text, the
vector sum A + A' = 0, and A' is defined to
be the negative of A; $\mathbf{A}' = -\mathbf{A}$.




Fig. 3-16 Vector subtraction by the geometric method. (a) The vector
$-\mathbf{B}$ is constructed by “turning B around.” (b) The vector sum
$\mathbf{D} = \mathbf{A} + (-\mathbf{B})$ is the desired vector difference
$\mathbf{A} - \mathbf{B}$.


Figure 3-16 depicts this operation. Given the two vectors A and B, the
negative $-\mathbf{B}$ of B is first constructed. Then $-\mathbf{B}$ is added to A to obtain the
vector sum $\mathbf{D} = \mathbf{A} + (-\mathbf{B})$. According to Eq. (3-19), this is identical to the
vector difference

$\mathbf{D} = \mathbf{A} - \mathbf{B}$

Can you show geometrically that $\mathbf{B} - \mathbf{A} = -(\mathbf{A} - \mathbf{B}) = -\mathbf{D}$?


EXAMPLE  3-5 

Use the algebraic method to find the difference $\mathbf{D} = \mathbf{A} - \mathbf{B}$ of the two vectors
whose sum C = A + B you found in Example 3-4.

■ First you must extend the algebraic method of summing the components of
two vectors to obtain the components of their vector sum, so that it can be used to
obtain the components of their vector difference. This is easy to do. Employing the
unit vectors $\hat{\mathbf{x}}$ and $\hat{\mathbf{y}}$, you write

$\mathbf{A} = A_x\hat{\mathbf{x}} + A_y\hat{\mathbf{y}}$

and

$\mathbf{B} = B_x\hat{\mathbf{x}} + B_y\hat{\mathbf{y}}$

Then  subtract  corresponding  sides  of  the  second  equation  from  those  of  the  first. 
You  obtain 


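$\mathbf{A} - \mathbf{B} = (A_x - B_x)\hat{\mathbf{x}} + (A_y - B_y)\hat{\mathbf{y}}$

Now write the vector difference $\mathbf{D} = \mathbf{A} - \mathbf{B}$ in terms of its components and the unit
vectors. You have

$\mathbf{A} - \mathbf{B} = \mathbf{D} = D_x\hat{\mathbf{x}} + D_y\hat{\mathbf{y}}$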

Comparison  with  the  equation  displayed  immediately  above  shows  that 


$D_x = A_x - B_x$ (3-20a)

and

$D_y = A_y - B_y$ (3-20b)


You  can  apply  this  method  immediately  to  the  problem  at  hand  since  Ax,  Ay, 
Bx,  and  By  were  evaluated  in  Example  3-4.  Using  these  values,  you  have 




$D_x = A_x - B_x = +2.83\ \text{m} - (-1.00\ \text{m}) = +3.83\ \text{m}$

and

$D_y = A_y - B_y = -2.83\ \text{m} - (+1.73\ \text{m}) = -4.56\ \text{m}$


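The magnitude D and direction $\phi_D$ of the vector D are given by

$D = \sqrt{D_x^2 + D_y^2} = \sqrt{(+3.83\ \text{m})^2 + (-4.56\ \text{m})^2} = 5.96\ \text{m}$

and

$\phi_D = \tan^{-1}\left(\dfrac{D_y}{D_x}\right) = \tan^{-1}\left(\dfrac{-4.56\ \text{m}}{3.83\ \text{m}}\right) = -50.0^\circ$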


You  should  check  these  results  by  using  the  geometrical  method  for  obtaining  a 
vector  difference. 
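
As a numerical cross-check alongside the geometrical one, the subtraction can be sketched in Python by following Eq. (3-19) literally: negate B, then add. The helper names (`components`, `negate`, `add`, `subtract`) are illustrative choices made for this sketch, and `math.atan2` is again used so that the quadrant of the angle is handled automatically.

```python
import math

def components(magnitude, angle_deg):
    """(x, y) components from a magnitude and a counterclockwise angle in degrees."""
    angle = math.radians(angle_deg)
    return (magnitude * math.cos(angle), magnitude * math.sin(angle))

def negate(v):
    """Negative of a vector: negate every component."""
    return (-v[0], -v[1])

def add(u, v):
    """Vector sum by components, Eqs. (3-16)."""
    return (u[0] + v[0], u[1] + v[1])

def subtract(u, v):
    """Vector difference defined as u + (-v), Eq. (3-19)."""
    return add(u, negate(v))

A = components(4.00, -45.0)
B = components(2.00, 120.0)
D = subtract(A, B)
D_mag = math.hypot(D[0], D[1])
phi_D = math.degrees(math.atan2(D[1], D[0]))
print(f"D = {D_mag:.2f} m, phi_D = {phi_D:.1f} deg")
```

Working with unrounded components gives a magnitude of about 5.95 m, one unit in the last digit below the hand result, and the same direction to three figures.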


An important conclusion obtained in Example 3-5 is that the components
of the negative of a vector are equal to the negatives of the corresponding components
of the vector.

The  multiplication  of  a vector  by  a scalar  is  an  extension  of  the  idea 
of  ordinary  scalar  multiplication.  Consider  the  vector  P given  by 


$\mathbf{P} = c\mathbf{A}$ (3-21a)


By  definition,  the  magnitude  of  P is  the  magnitude  of  the  scalar  c multiplied  by  the 
magnitude  of  A.  That  is, 


$P = |c|A$ (3-21b)


where |c| is the absolute value of the scalar c. Also by definition, the direction
of P is the same as that of A if c is a positive number; it is the same as that of $-\mathbf{A}$ if c
is a negative number. We can express both the magnitude and the direction
of P by means of the vector equation

$\mathbf{P} = cA\,\hat{\mathbf{A}}$ (3-22)


In  the  case  c > 0,  the  quantity  cA  is  positive  since  A is  always  positive;  thus  P 
has  the  same  direction  as  A.  In  the  case  c < 0,  the  quantity  cA  is  negative; 
thus the direction of P is opposite that of A. If |c| > 1, then P is longer than A.
If  |c|  < 1,  P is  shorter  than  A.  This  definition  is  illustrated  in  Fig.  3-17.  Can 
you  see  a connection  between  the  definition  of  the  negative  of  a vector 
and  the  definition  of  multiplication  of  a vector  by  a negative  scalar?  Are 
these  independent  definitions? 

The definition of multiplication of a vector by a scalar can be extended
so as to define the operation of division of a vector by a scalar. We define

$\dfrac{\mathbf{A}}{c} = \dfrac{1}{c}\,\mathbf{A}$ (3-23)




That is, the vector A divided by the scalar c is defined to be the product of the vector
A with the scalar 1/c. Division of a vector or a scalar by a vector has no
meaning.
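
In terms of components, multiplying a vector by a scalar amounts to multiplying each component by that scalar. The text defines the operation through magnitude and direction, so the component form used in the following Python sketch is an assumption of the sketch, though it agrees with Eq. (3-21b). The function names and the sample vector are invented for illustration.

```python
import math

def scale(c, v):
    """Product of the scalar c and the vector v, component by component."""
    return tuple(c * comp for comp in v)

def divide(v, c):
    """Vector divided by a scalar, defined as the product with 1/c (Eq. 3-23)."""
    return scale(1.0 / c, v)

def magnitude(v):
    """Length of a vector from the pythagorean theorem."""
    return math.sqrt(sum(comp * comp for comp in v))

A = (3.0, 4.0)        # illustrative vector of magnitude 5
P = scale(-2.0, A)    # twice as long as A and opposite in direction
Q = divide(A, 2.0)    # half as long as A, same direction

print(P, magnitude(P))
print(Q, magnitude(Q))
```

Setting c = -1 in `scale` reproduces the negative of a vector, which bears on the question raised above about whether the two definitions are independent.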

The ideas just developed for vectors in two dimensions can be readily
extended into three-dimensional space. Figure 3-18 is a perspective
representation of a frame of reference in three-dimensional space. It is specified
by the mutually perpendicular x, y, and z axes with their (identical) scales.
By using this frame of reference, any point in space can be uniquely
specified by an ordered triplet of scalars $(A_x, A_y, A_z)$. (Just as in two-dimensional
space, the number of separate scalars required to do this is equal to the
number of dimensions.) One such point, (7.0 m, 4.0 m, 5.0 m), is
depicted in Fig. 3-18.

In complete parallelism to the geometrical representation of a
two-dimensional vector, a three-dimensional vector can be visualized as a
directed line (or arrow) extending from the origin (0, 0, 0) to the point
$(A_x, A_y, A_z)$, as shown in the figure. This vector A has the three constituent
vectors $\mathbf{A}_x$, $\mathbf{A}_y$, and $\mathbf{A}_z$, whose magnitudes $A_x$, $A_y$, and $A_z$ are the three
components of A. In mathematical language, we can express this
statement in the form

$\mathbf{A} = \mathbf{A}_x + \mathbf{A}_y + \mathbf{A}_z$ (3-24)

and

$\mathbf{A} = A_x\hat{\mathbf{x}} + A_y\hat{\mathbf{y}} + A_z\hat{\mathbf{z}}$ (3-25)

where $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$ are the unit vectors in the directions of the positive x, y,
and z axes, respectively.




Fig. 3-18 A perspective view of the three-dimensional vector A,
whose components are $A_x = 7$, $A_y = 4$, and $A_z = 5$. The
constituent vectors $\mathbf{A}_x$, $\mathbf{A}_y$, and $\mathbf{A}_z$ are shown, as is the vector B
which is the projection of A on the xy plane. The magnitude A of A
is evaluated in Example 3-6. The direction of A can be specified by
means of the two angles $\theta_A$ and $\phi_A$ shown in the figure. The angle
$\theta_A$ is that between A and the z axis, while $\phi_A$ is the angle between the
x axis and B, the base of the vertical triangle of sides B, $A_z$, and A.




EXAMPLE 3-6

Find the magnitude of the vector A in Fig. 3-18, whose tail lies at the origin and
whose head lies at the point (7.0 m, 4.0 m, 5.0 m).

■ Just as in two dimensions, this example requires the application of the
pythagorean theorem. However, it must be applied twice.

First,  note  that  the  vector  A and  its  constituent  vector  Az  are  the  hypotenuse  and 
one  side  of  a right  triangle  whose  plane  is  perpendicular  to  the  xy  plane  (the  plane 
defined  by  the  x and  y axes).  As  shown  in  Fig.  3-18,  the  other  side  of  this  triangle  is 
the  vector  B,  which  lies  along  the  line  of  intersection  of  the  two  planes.  This  vector 
is  the  projection  of  A on  the  xy  plane;  you  may  imagine  it  as  the  “shadow”  cast  by  A 
on  the  xy  plane  when  the  “sun”  is  located  very  far  away  on  the  z axis. 

Using  the  pythagorean  theorem,  you  have 

$A^2 = B^2 + A_z^2$

But the vector B is itself the hypotenuse of the right triangle in the xy plane whose
sides are the constituent vectors $\mathbf{A}_x$ and $\mathbf{A}_y$. Thus you can use the pythagorean
theorem again to obtain

$B^2 = A_x^2 + A_y^2$

Combining  the  two  equations  immediately  above  gives  you 

$A^2 = A_x^2 + A_y^2 + A_z^2$

or

$A = \sqrt{A_x^2 + A_y^2 + A_z^2}$ (3-26)

This  is  the  form  taken  by  the  pythagorean  theorem  in  three  dimensions. 

You  can  now  use  Eq.  (3-26)  to  find  the  magnitude  of  the  vector  A whose  head 
lies  at  (7.0  m,  4.0  m,  5.0  m).  You  have 

$A = \sqrt{(7.0\ \text{m})^2 + (4.0\ \text{m})^2 + (5.0\ \text{m})^2} = 9.5\ \text{m}$
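
The two-step use of the pythagorean theorem can be mirrored directly in a few lines of Python. This is a minimal sketch; the function name `magnitude_3d` is invented here, and the intermediate variable `b` plays the role of the projection B on the xy plane.

```python
import math

def magnitude_3d(ax, ay, az):
    """Magnitude of a three-dimensional vector from its components,
    applying the pythagorean theorem twice as in Example 3-6."""
    b = math.sqrt(ax**2 + ay**2)      # length of the projection on the xy plane
    return math.sqrt(b**2 + az**2)    # hypotenuse of the vertical triangle

A = magnitude_3d(7.0, 4.0, 5.0)
print(f"A = {A:.1f} m")
```

The result agrees with the single-step form of Eq. (3-26), since the sum under the square root is 49 + 16 + 25 = 90 in metres squared.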


Like a two-dimensional vector, a three-dimensional vector is
completely specified by its magnitude and direction. It can therefore be moved
parallel to itself without being changed.

The geometrical and algebraic rules for addition and subtraction of
three-dimensional vectors, and for multiplication and division of a
three-dimensional vector by a scalar, are completely analogous to the
corresponding operations in two dimensions. Because of the difficulty of
drawing accurate three-dimensional representations, however, the
geometrical method is rarely used. The algebraic method of vector addition is a
direct extension of that given by Eqs. (3-16) for two-dimensional vectors.
There are now three components instead of two, and the vector sum
C = A + B is equivalent to the component equations

$C_x = A_x + B_x$ (3-27a)

$C_y = A_y + B_y$ (3-27b)

$C_z = A_z + B_z$ (3-27c)


The  other  vector  operations  already  discussed  for  two-dimensional  vectors 
are  extended  similarly. 




3-3 POSITION, VELOCITY, AND ACCELERATION VECTORS


At any instant of time the position of a moving body relative to the origin of
a coordinate system can be described by the position vector extending from
the origin to the body. (Body is a technical term for an object whose motion
is being considered.) Since the velocity of the body is defined as the rate of
change of its position, the fact that its position is described by a vector
suggests that its velocity also may be described by a vector. And since the
acceleration of the body is defined as the rate of change of its velocity, the
same line of thought suggests that the acceleration may be a vector
quantity, too. In this section we show that both suggestions are correct by
making use of the concept called a differential. Furthermore, we develop
some very useful relations among the position, velocity, and acceleration
vectors.

In  Fig.  3-19,  the  vector  r specifies  the  position  of  a moving  body  with 
respect  to  the  origin  of  a particular  frame  of  reference  at  a certain  instant 
of  time  t.  The  velocity  of  the  body  is  defined  to  be  the  instantaneous  rate  of 
change  of  r with  respect  to  t.  Stated  mathematically,  we  define  the  velocity 
to  be 


$\dfrac{d\mathbf{r}}{dt}$


Fig. 3-19 The vector r specifies the position of a point with respect to the
origin of a particular frame of reference.

Except for the fact that in two or three dimensions position must be treated
as a vector, this quantity is exactly the same in concept as the quantity

$\dfrac{dx}{dt}$

which is used to define velocity when the motion is one-dimensional.

In both cases the derivative with respect to time is defined to be the
limiting value of a sequence of fractions

(change of position) / (time interval over which change of position occurs)

The sequence of fractions is obtained by making the time interval smaller
and smaller. In the multidimensional case, each fraction in this sequence
consists of a numerator $\Delta\mathbf{r}$ and a corresponding denominator $\Delta t$, both of
which have certain particular values.

The symbol $d\mathbf{r}$ denotes an infinitesimal change in r. It is called the
differential of r. Physicists conventionally use $d\mathbf{r}$ to represent a value of $\Delta\mathbf{r}$
which is the numerator of a fraction belonging to the sequence when the
value of the fraction is very close to the limiting value of the sequence. In
other words, physicists use the symbol $d\mathbf{r}$ to represent an extremely
small, or infinitesimally small, $\Delta\mathbf{r}$. In like manner, they use the symbol
dt to represent the corresponding infinitesimally small value of the
denominator $\Delta t$. The word “corresponding” means that $d\mathbf{r}$ and dt represent the $\Delta\mathbf{r}$
and the $\Delta t$ of the same fraction. While both $d\mathbf{r}$ and dt are infinitesimally
small, the value of the fraction $d\mathbf{r}/dt$ formed by division of $d\mathbf{r}$ by dt is just a
value of $\Delta\mathbf{r}/\Delta t$ (not necessarily small) which is very near the limiting value.
Hence this fraction is essentially the same as the derivative of r with respect to
t. In other words, the derivative of r with respect to t can be treated as a
fraction whose numerator is $d\mathbf{r}$ and whose denominator is dt. In fact,
treating the derivative of any function of any independent variable as a
fraction is legitimate for all the cases normally encountered in describing
physical measurements.
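
The sequence of fractions described above can be watched converging in a short numerical experiment. The Python sketch below assumes a made-up position function, r(t) = (2t, t²) in metres, chosen only because its derivative (2, 2t) is easy to check by hand; neither the function nor the names `r` and `difference_quotient` come from the text.

```python
def r(t):
    """Hypothetical position vector (x, y) in metres at time t in seconds."""
    return (2.0 * t, t ** 2)

def difference_quotient(t, dt):
    """The fraction (change of position) / (time interval), evaluated
    component by component over the interval from t to t + dt."""
    r_start, r_end = r(t), r(t + dt)
    return tuple((end - start) / dt for start, end in zip(r_start, r_end))

# Make the time interval smaller and smaller, starting from t = 1 s.
for dt in (1.0, 0.1, 0.01, 0.001):
    print(dt, difference_quotient(1.0, dt))
```

For this particular function the fractions are (2, 2 + Δt) apart from rounding, so they settle toward the limiting value (2, 2) m/s as the interval shrinks, even though numerator and denominator each become very small.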




It is possible to concoct mathematical functions which behave in peculiar
fashion as the limit is approached, and there can then be difficulties with treating
the derivative of the function as a fraction. But these difficulties arise with
discontinuous functions not normally used in describing the continuous behavior with
which this book is concerned.

Mathematicians can legitimately take exception to this approach, since the
nature of their work leads them to take particular interest in the abnormal cases.
Physicists, on the other hand, are interested in useful calculational tools and are
not so concerned with what they call “pathological” functions. In any case, the
disharmony between the intuitive notion of the physicist and the rigorous approach
of the mathematician has come to be reconciled in recent years through work in
the field of mathematics called nonstandard analysis, in which the intuitive idea
of differentials is reformulated on a rigorous basis.

The fact that the velocity $d\mathbf{r}/dt$ is a vector quantity follows immediately
from two things. One is that we can treat $d\mathbf{r}/dt$ as the vector quantity $d\mathbf{r}$
divided by the scalar quantity dt. The other is that a vector divided by a scalar
is a vector; see Eq. (3-23). Hence we are justified in using vector notation to
designate the velocity v, and in writing the equation defining it as

$\mathbf{v} = \dfrac{d\mathbf{r}}{dt}$ (3-28)

The velocity vector v is the derivative of the position vector r with respect to the time
t.


The definition of the differential, taken together with treatment of the
derivative as a fraction, makes possible some useful algebraic
manipulations which have direct physical meaning. Multiplying Eq. (3-28) on both
sides by the differential quantity dt gives

$d\mathbf{r} = \mathbf{v}\,dt$ (3-29a)




Fig. 3-20 If the point located in Fig. 3-19 moves with respect to the origin,
the vector r changes with time. Over an infinitesimal time interval dt, the change
in position is given by the vector $d\mathbf{r} = \mathbf{r} - \mathbf{r}_i$, where $\mathbf{r}$ and $\mathbf{r}_i$ are respectively
the final and initial position vectors. The length of the infinitesimal vector $d\mathbf{r}$
must of course be exaggerated in the figure. The easiest way to verify that the
three vectors in the figure are arranged in such a way as to satisfy the equation
$d\mathbf{r} = \mathbf{r} - \mathbf{r}_i$ is to rewrite it in the form $\mathbf{r} = \mathbf{r}_i + d\mathbf{r}$. Then note that $\mathbf{r}$ extends
from the tail of $\mathbf{r}_i$ to the head of $d\mathbf{r}$, in agreement with the rule for vector
addition.


This equation tells us that a body moving with velocity v for the
infinitesimal time interval dt will change its position by the infinitesimal amount $d\mathbf{r}$
given by the product of v and dt. The vector $d\mathbf{r}$ can also be expressed as

$d\mathbf{r} = \mathbf{r} - \mathbf{r}_i$

where $\mathbf{r}_i$ is the initial position vector of the body (at the beginning of the
infinitesimal time interval) and $\mathbf{r}$ is its position vector at the end of the
interval. The three vectors are shown in Fig. 3-20. The vectors $\mathbf{r}$ and $\mathbf{r}_i$ are
supposed to be only slightly different in magnitude and direction.

The utility of Eq. (3-29a) comes from the fact that even if v varies with
time, each time interval dt is so short that v may be considered constant
over any particular time interval. Equation (3-29a) is very similar to the
relation $\Delta\mathbf{r} = \mathbf{v}\,\Delta t$ which involves finite quantities. But the latter is valid only
for constant v. The possibility of treating a variable as an “instantaneous
constant” is essential to the development of integral calculus, as you will
see in Chap. 7.
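
The idea of an “instantaneous constant” can be tried out numerically by chopping a finite time interval into many short steps and applying $\Delta\mathbf{r} = \mathbf{v}\,\Delta t$ to each step in turn. The following Python sketch is an illustration under stated assumptions: the velocity function v(t) = (2, 2t) m/s is made up for the purpose, and the names `v` and `accumulate_position` are invented here. With this velocity and a start at the origin, the exact position at t = 1 s is (2, 1) m.

```python
def v(t):
    """Hypothetical velocity vector (vx, vy) in m/s at time t in seconds."""
    return (2.0, 2.0 * t)

def accumulate_position(r_start, t_start, t_end, n_steps):
    """Add up the small changes of position v*dt over n_steps short
    intervals, treating v as constant within each interval."""
    dt = (t_end - t_start) / n_steps
    x, y = r_start
    for k in range(n_steps):
        t = t_start + k * dt
        vx, vy = v(t)          # velocity at the beginning of this interval
        x += vx * dt
        y += vy * dt
    return (x, y)

for n in (10, 100, 1000):
    print(n, accumulate_position((0.0, 0.0), 0.0, 1.0, n))
```

As the steps are made shorter the accumulated y coordinate creeps up toward 1 m (it equals 1 - 1/n for this velocity), which shows why the approximation of constant v improves as the interval shrinks.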

The vectors $\mathbf{r}_i$ and $\mathbf{r}$ depend on the frame of reference from which
they are measured. In Fig. 3-21, for instance, a second (“primed”) frame of
reference is chosen with axes x' and y'. These axes have fixed locations with
respect to the axes x and y in the first (“unprimed”) frame of reference. The
initial and final position vectors $\mathbf{r}_i'$ and $\mathbf{r}'$ in the primed frame of reference



Fig. 3-21 The moving point of Fig. 3-20 is located with respect to a
second, “primed” frame of reference which is itself fixed with respect to
the “unprimed” frame. The position vectors at the beginning and end of
the infinitesimal time interval dt are $\mathbf{r}_i'$ and $\mathbf{r}'$, which are not the same as
the corresponding vectors $\mathbf{r}_i$ and $\mathbf{r}$. But the figure shows that
$\mathbf{r} - \mathbf{r}_i = \mathbf{r}' - \mathbf{r}_i'$, or $d\mathbf{r} = d\mathbf{r}'$. Thus the change in position over the
interval dt is the same in both frames of reference.


are not the same as the corresponding position vectors $\mathbf{r}_i$ and $\mathbf{r}$ in the
unprimed frame of reference. But the vector $d\mathbf{r}$ denoting the change in position
with respect to the unprimed frame is identical to the vector $d\mathbf{r}'$ denoting
the change in position with respect to the primed frame.
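
A small numerical check makes the point concrete. The Python sketch below assumes that the primed axes are parallel to the unprimed ones and differ only in the location of the origin, which may or may not match Fig. 3-21 exactly. All the numbers are made up, and a finite rather than infinitesimal change of position is used so that it can be printed.

```python
def subtract(u, v):
    """Vector difference by components."""
    return (u[0] - v[0], u[1] - v[1])

# Hypothetical location of the primed origin, measured in the unprimed frame.
origin_offset = (3.0, -1.0)

# Hypothetical initial and final positions in the unprimed frame (metres).
r_i = (5.0, 2.0)
r_f = (5.25, 2.5)

# The same two positions measured from the primed origin.
r_i_primed = subtract(r_i, origin_offset)
r_f_primed = subtract(r_f, origin_offset)

dr = subtract(r_f, r_i)
dr_primed = subtract(r_f_primed, r_i_primed)

print("position vectors differ:", r_i, r_i_primed)
print("changes in position agree:", dr, dr_primed)
```

The position vectors themselves depend on where the origin is placed, but the offset cancels when the initial position is subtracted from the final one.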

In order to make this important fact clear, we introduce a new name
and a new symbol for the infinitesimal “change in position $d\mathbf{r}$.” We call it
the infinitesimal displacement $d\mathbf{s}$. Hence displacement means “change in
position,” and

$d\mathbf{s} = d\mathbf{r}$



Fig. 3-22 The displacement vector $d\mathbf{s}$ is shown in an arbitrary frame of
reference to emphasize its independence of the choice of the frame in which
the position vectors $\mathbf{r}$ and $\mathbf{r}_i$ are specified. The vector $d\mathbf{s}$ is identical to the
vector $d\mathbf{r}$ in Fig. 3-21. (In actuality, both $d\mathbf{r}$ and $d\mathbf{s}$ are infinitesimal. For
purposes of illustration, they must necessarily be shown as finite.)


In terms of the displacement, we can describe what happens to the moving
body in the infinitesimal time interval dt without having to specify the
position vectors $\mathbf{r}_i$ and $\mathbf{r}$ (or $\mathbf{r}_i'$ and $\mathbf{r}'$). In Fig. 3-22, the same displacement
vector $d\mathbf{s}$ is shown in a frame of reference whose origin is chosen quite
arbitrarily. Indeed, it is often possible to draw a displacement vector without
bothering to draw the frame of reference at all.

Since the displacement $d\mathbf{s}$ is another name for the change in position
$d\mathbf{r}$, Eq. (3-29a), $d\mathbf{r} = \mathbf{v}\,dt$, can be written equally well

$d\mathbf{s} = \mathbf{v}\,dt$ (3-29b)

Similarly, the definition of velocity given by Eq. (3-28), $\mathbf{v} = d\mathbf{r}/dt$, can be
written equally well

$\mathbf{v} = \dfrac{d\mathbf{s}}{dt}$ (3-30)


Thus velocity is the infinitesimal displacement vector $d\mathbf{s}$ divided by the
corresponding infinitesimal time interval dt. Whether v is related to an infinitesimal
displacement vector $d\mathbf{s}$, as in Eqs. (3-29b) and (3-30), or to an infinitesimal
change $d\mathbf{r}$ in a position vector r, as in Eqs. (3-28) and (3-29a), depends only
on the emphasis desired in a particular situation.



Fig. 3-23 At the beginning of the infinitesimal time interval dt, an object
moves with instantaneous velocity $\mathbf{v}_i$. During the time interval, it experiences
an acceleration a whose direction is the same as that of $\mathbf{v}_i$. At the end of the time
interval, its instantaneous velocity is $\mathbf{v} = \mathbf{v}_i + \mathbf{a}\,dt$. The magnitude of the
velocity is changed, but the direction of motion is unchanged.


The acceleration vector a is the derivative of the velocity vector v with respect
to the time t. Expressed in mathematical notation, this definition is

$\mathbf{a} = \dfrac{d\mathbf{v}}{dt}$ (3-31)


Acceleration is a vector because an infinitesimal change $d\mathbf{v}$ in a velocity is a
vector, because the corresponding infinitesimal interval dt of time is a
scalar, and because a vector divided by a scalar is a vector. Except for the
fact that it takes into account the vectorial nature of acceleration in two or
three dimensions, Eq. (3-31) does not differ from Eq. (2-19),

$a = \dfrac{dv}{dt}$

which defined acceleration for one-dimensional motion.

We can substitute the definition $\mathbf{v} = d\mathbf{r}/dt$ into Eq. (3-31) to obtain

$\mathbf{a} = \dfrac{d}{dt}\left(\dfrac{d\mathbf{r}}{dt}\right) = \dfrac{d^2\mathbf{r}}{dt^2}$ (3-32)


That  is,  the  acceleration  vector  a is  the  second  derivative  of  the  position  vector  r 
with  respect  to  time. 

The identity $d\mathbf{s} = d\mathbf{r}$ can be used to write Eq. (3-32) in the equivalent
form

$\mathbf{a} = \dfrac{d^2\mathbf{s}}{dt^2}$ (3-33)


Again,  it  is  a matter  of  the  emphasis  desired  as  to  whether  Eq.  (3-32)  or 
Eq.  (3-33)  is  used. 


Now we will make use of the vector definition of acceleration, Eq.
(3-31), to gain new insight into the relation between acceleration and
velocity. Multiplying both sides of the equation by dt yields the relation

$d\mathbf{v} = \mathbf{a}\,dt$ (3-34)

That is, an object having acceleration a during the infinitesimal time
interval dt will change its velocity by the infinitesimal amount $d\mathbf{v}$.

The simplest case to which the differential relation of Eq. (3-34) can be
applied is the one in which an object accelerates in the direction in which it
is already moving. This case is illustrated in Fig. 3-23. The object, whose
initial velocity is $\mathbf{v}_i$, experiences an acceleration a in the same direction as
$\mathbf{v}_i$. During the infinitesimal time interval dt, the acceleration is essentially
constant, and according to Eq. (3-34) there is an infinitesimal change
$d\mathbf{v} = \mathbf{a}\,dt$ in the velocity. The velocity at the end of the interval is thus

$\mathbf{v} = \mathbf{v}_i + d\mathbf{v}$

or

$\mathbf{v} = \mathbf{v}_i + \mathbf{a}\,dt$ (3-35)

As you can see from Fig. 3-23, the acceleration has changed the
magnitude of the velocity, but not its direction, in this simple case.

But Eq. (3-35) is not restricted to cases where the acceleration is
parallel to the initial velocity. Consider the situation represented in Fig. 3-24.
Here again an acceleration a is applied for an infinitesimal time dt. But this




Fig. 3-24 At the beginning of the infinitesimal time interval dt, an object
moves with instantaneous velocity $\mathbf{v}_i$. During the time interval, it experiences
an acceleration a whose direction is perpendicular to that of $\mathbf{v}_i$. At the end of
the time interval, its instantaneous velocity is $\mathbf{v} = \mathbf{v}_i + \mathbf{a}\,dt$, just as in the parallel
case illustrated in Fig. 3-23. In the present case, however, the magnitude of
the velocity is unchanged, as explained in the text. But the direction of the
velocity is changed.


time a is perpendicular to the initial velocity $\mathbf{v}_i$. While Eq. (3-35) still
applies, we must now consider explicitly the fact that the addition on the right
side is vectorial. At first glance, the result of the acceleration may appear to
be a change in both the magnitude and the direction of the velocity. But the
appearance is misleading, since the figure necessarily exaggerates the
length of the infinitesimal vector $d\mathbf{v}$. It is shown in the small-print section
immediately below that when a is perpendicular to $\mathbf{v}_i$, the result is actually a
change in the direction but not in the magnitude of the velocity, if the time
interval dt is infinitesimal.


In Fig. 3-24, the infinitesimal angle dθ between v_i and v is given by the expression

$$ \tan d\theta = \frac{dv}{v_i} $$

where dθ is expressed in radians and dv = a dt. Since dθ is small and is expressed
in radians, we can use the approximation tan dθ ≈ dθ, where dθ ≪ 1 rad. (You
can check the validity of this approximation on your pocket calculator, using successively
smaller values for the angle such as 0.1 rad, 0.01 rad, 0.001 rad, and so
forth.) The equation displayed above then simplifies to the form

$$ d\theta = \frac{dv}{v_i} $$

Thus dθ, the infinitesimal change in the direction of v_i over the infinitesimal time
interval dt, is proportional to dv/v_i.

Now let us consider the change in the magnitude of v_i produced by the same
dv. By the pythagorean theorem we have

$$ v^2 = v_i^2 + (dv)^2 $$

Dividing both sides of this equation by v_i² and taking the square root, we obtain

$$ \frac{v}{v_i} = \left[ 1 + \left( \frac{dv}{v_i} \right)^2 \right]^{1/2} $$

We now use the approximation

$$ (1 + z)^{1/2} \simeq 1 + \tfrac{1}{2} z \qquad \text{where } z \ll 1 \tag{3-36} $$


which is valid for any number z whose magnitude is much smaller than 1. (Again, you
can evaluate the accuracy of this approximation on your pocket calculator by trying
successively smaller values of z, for example, 0.1, 0.01, 0.001, and so forth.) The
approximation can be applied to the last equation by writing the dimensionless
quantity (dv/v_i)² for the number z. So doing yields

$$ \frac{v}{v_i} \simeq 1 + \frac{1}{2} \left( \frac{dv}{v_i} \right)^2 $$

or

$$ \frac{v - v_i}{v_i} \simeq \frac{1}{2} \left( \frac{dv}{v_i} \right)^2 $$

This tells us that the fractional change in the magnitude of v is proportional to
the square of the infinitesimally small quantity dv/v_i. But if a quantity is small,
its square is very much smaller. Thus the fractional change in magnitude of v resulting
from the perpendicular acceleration a, being equal to ½ times the square of
an infinitesimal, is negligible compared to the change in direction, which is equal
to the first power of the same infinitesimal. This proves the assertion that an acceleration
a acting perpendicular to the velocity v of a moving object changes the
direction, but not the magnitude, of the velocity.
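
The pocket-calculator check suggested above can also be carried out with a few lines of Python. The sketch below is only an illustration: the function name perpendicular_kick and the sample values of dv/v_i are choices made here, not part of the text. It adds an increment dv at right angles to v_i and compares the change in direction with the fractional change in magnitude.

```python
import math

def perpendicular_kick(v_i, dv):
    """Return (direction change in rad, fractional magnitude change)
    when a velocity increment dv is added at right angles to v_i."""
    v = math.hypot(v_i, dv)           # pythagorean theorem
    d_theta = math.atan2(dv, v_i)     # exact angle between v_i and v
    frac_change = (v - v_i) / v_i
    return d_theta, frac_change

v_i = 1.0
for ratio in (0.1, 0.01, 0.001):
    d_theta, frac = perpendicular_kick(v_i, ratio * v_i)
    print(f"dv/v_i = {ratio:<6} d_theta = {d_theta:.3e}  "
          f"(v - v_i)/v_i = {frac:.3e}  half-square = {0.5 * ratio**2:.3e}")
```

As dv/v_i is made smaller, the direction change shrinks in proportion to it, while the fractional magnitude change shrinks as its square and approaches ½(dv/v_i)².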



Fig. 3-25 At the beginning of an
infinitesimal time interval dt, an object
moves with instantaneous velocity v_i.
During the time interval, it experiences
an acceleration a whose direction is
arbitrary.


Fig. 3-26 Directions are chosen parallel
and perpendicular to the direction of
motion in Fig. 3-25. The acceleration
vector a is drawn, and the constituent
vectors a∥ and a⊥ constructed. Each of
these constituent vectors can be treated
as one of the special cases of Figs. 3-23
and 3-24.


Fig. 3-27 The vectors a, a∥, a⊥, and v_i of
Figs. 3-25 and 3-26 indicated in relation
to the path of motion. The vectors v_i and
a∥ are tangent to the path, while a⊥ is
perpendicular to the path. In Sec. 3-5 it
will be shown that a⊥ “points to the inside
of the curve.”


We have now discussed two special cases: (1) where a is parallel to v
and (2) where a is perpendicular to v. An example of the first case is vertical
fall. In vertical fall there is an acceleration a = dv/dt because the magnitude
v of the vector v changes. But its direction v̂ remains constant. The second case is
exemplified by a satellite in a circular orbit, discussed in Sec. 3-5. For such a
case, an acceleration a = dv/dt exists because the direction v̂ of the velocity
v changes, while its magnitude v remains constant.

What about the general case where the angle between a and v is arbitrary,
as in Fig. 3-25? In this case, of which projectile motion is an example,
we can always pick a set of axes so that one axis is parallel to, and the other
perpendicular to, the instantaneous direction of motion, that is, the direction
of v_i. This has been done in Fig. 3-26. We can then “resolve” the vector
a into two constituent vectors a∥ and a⊥. In other words, we consider a to be
the sum of a∥ and a⊥:

$$ \mathbf{a} = \mathbf{a}_\parallel + \mathbf{a}_\perp \tag{3-37} $$

Each of the constituent vectors a∥ and a⊥ can be dealt with as one of the two
special cases discussed above.

The vector a∥ is called the tangential acceleration, since it is directed
along the tangent of the curve describing the path of the object. For the
vector a⊥ Newton coined the name centripetal (that is, center-seeking)
acceleration since, as you will see in Sec. 3-5, it is directed inward along the
local radius of curvature of the path. The relation of the tangential and
centripetal accelerations to the path of motion is shown in Fig. 3-27.
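
The resolution of a into tangential and centripetal parts can be sketched numerically. The example below is an illustration under stated assumptions: the function name resolve_acceleration and the sample projectile velocity are hypothetical, and the projection onto the direction of v is one standard way of carrying out the construction of Fig. 3-26.

```python
import math

def resolve_acceleration(v, a):
    """Split acceleration a into parts parallel and perpendicular to velocity v.

    v and a are (x, y) tuples; v must be nonzero.
    Returns (a_par, a_perp), each an (x, y) tuple with a_par + a_perp == a.
    """
    vx, vy = v
    ax, ay = a
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        raise ValueError("direction of motion is undefined when v is zero")
    # Projection of a onto the direction of v gives the tangential part.
    scale = (ax * vx + ay * vy) / speed_sq
    a_par = (scale * vx, scale * vy)
    # Whatever is left over is perpendicular to v.
    a_perp = (ax - a_par[0], ay - a_par[1])
    return a_par, a_perp

# Projectile moving up and to the right; gravity points straight down.
v_i = (30.0, 40.0)        # m/s (assumed sample value)
g_vec = (0.0, -9.8)       # m/s^2
a_par, a_perp = resolve_acceleration(v_i, g_vec)
print("tangential :", a_par)
print("centripetal:", a_perp)
```

For a rising projectile the tangential part points backward along the path (the speed is decreasing), while the centripetal part points to the side toward which the path is bending.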

To  summarize,  velocity  is  a vector  quantity  and  is  therefore  specified  by 
two attributes, magnitude and direction. Whenever either (or both) of these attributes
is changing, the velocity is changing and there will be an acceleration
equal  to  the  rate  of  change  of  that  velocity. 


Before  ending  this  discussion  of  directed  quantities,  it  is  important  to 
explain  the  relation  between  the  notation  introduced  in  the  present 
chapter  for  such  quantities  and  the  notation  introduced  in  the  preceding 
chapter.  In  two-  or  three-dimensional  situations  the  vector  symbolism  of 
this  chapter  must  be  used  for  a directed  quantity,  such  as  a velocity.  But  in 
one-dimensional  situations  either  the  vector  symbolism  can  be  used,  or  the 
symbolism  of  Chap.  2 can  be  used.  These  two  symbolisms  arise  from  the 
fact  that  in  one  dimension  there  are  two  ways  to  treat  a quantity  which  has 
both  magnitude  and  direction.  Both  ways  are  indicated  in  Fig.  3-28.  The 
first  way,  shown  in  Fig.  3-28a  and  b,  is  to  regard  the  quantity  as  a vector 
whose  possible  directions  are  restricted  to  just  two  opposite  directions.  The 
second  way,  shown  in  Fig.  3-28c  and  d,  is  to  regard  the  quantity  as  a scalar, 
and  hence  a quantity  whose  numerical  value  can  be  either  positive  or  nega- 
tive. The  direction  of  the  quantity  is  then  designated  by  the  sign  of  the  nu- 
merical value.  This  second  procedure  is  the  one  used  in  Chap.  2.  (It  cannot 
be  applied  to  situations  involving  more  than  one  dimension,  simply  because 
two  signs,  + and  — , are  not  adequate  to  describe  all  possible  directions.) 
When the second procedure is used, the italic symbol v represents a velocity
(whose direction may be positive or negative), and the italic symbol in absolute
value signs, |v|, represents its magnitude, a speed. When using this
symbolism, the value of the quantity v can be either positive or negative, according
to its direction. In contrast, the value of the quantity v can be only
positive  when  it  represents  the  magnitude  of  the  vector  v.  Despite  the  pos- 


[Fig. 3-28 panels: (a) vector measured with a ruler, v = 9; (b) reversed vector measured with the ruler inverted, v = 9; (c) signed scalar on a number line, v = +9; (d) signed scalar on a number line, v = −9]


Fig. 3-28 Two different ways of representing
a vector in one dimension. (a)
Like all vectors, the vector v has a
magnitude given by a positive number
(whose units in this particular case are
m/s). Its direction is positive. (b) Here
the vector v has the same magnitude as
in a, but its direction is negative. The
ruler shown being used to measure the
“length” (magnitude) v of v reads the
same as in a. But the ruler has to be
inverted in order to orient it so as to
make the measurement possible. (c)
Here the same one-dimensional vector is
represented by the signed scalar v. The
magnitude or absolute value |v| of v is the
“length” (magnitude) of the vector. The
positive sign of v signifies the direction.
(d) Here the signed scalar −v has the
same magnitude |v| as does +v in c. But
its direction, signified by the minus sign,
is negative. Thus the direction of −v is
opposite to that of v.


sibility  of  confusion  arising  from  the  two  possible  meanings  of  an  italic 
symbol,  we  will  use  both  symbolisms  because  one  is  more  convenient  in 
some  one-dimensional  situations  and  the  other  is  more  convenient  in 
others.  But  we  will  always  specify  clearly  which  symbolism  is  intended  if 
there  is  any  ambiguity.  This  will  be  done  by  stating  whether  we  treat  a 
directed  quantity  as  a vector  or  as  a signed  scalar.  We  use  the  term  signed 
scalar  to  refer  to  a quantity  whose  value  can  be  either  positive  or  negative. 
(The  term  is  redundant  since  by  definition  a scalar  can  be  of  either  sign. 
But  it  helps  emphasize  the  distinction  being  made  between  such  a quantity 
and  a quantity  which  is  a magnitude — that  is,  one  which  is  necessarily  posi- 
tive.) 

In  two-dimensional  situations  a vector  can  be  represented  either  by  two 
components  or  by  a length  and  an  angle.  Both  components  are  always 
signed  scalars.  The  same  is  true  of  the  angle.  But  the  length  is  always  a 
magnitude.  How  would  you  extend  these  statements  to  three-dimensional 
situations? 
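
The two-dimensional statements above can be illustrated with a short sketch. The function names to_length_angle and to_components and the sample components are hypothetical choices made here. Whatever the signs of the components, the length comes out non-negative, while the angle carries a sign.

```python
import math

def to_length_angle(vx, vy):
    """Convert signed components to (length, angle in degrees)."""
    length = math.hypot(vx, vy)                 # never negative
    angle = math.degrees(math.atan2(vy, vx))    # signed, between -180 and +180
    return length, angle

def to_components(length, angle_deg):
    """Convert (length, angle in degrees) back to signed components."""
    angle = math.radians(angle_deg)
    return length * math.cos(angle), length * math.sin(angle)

for vx, vy in [(3.0, 4.0), (3.0, -4.0), (-3.0, 4.0)]:
    length, angle = to_length_angle(vx, vy)
    print(f"components ({vx:+.1f}, {vy:+.1f}) -> length {length:.1f}, angle {angle:+.1f} deg")
```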


3-4 THE PARABOLIC TRAJECTORY

We will now rephrase the conclusion of Sec. 3-1 concerning the motion of a
projectile in precise vectorial terms and then continue with the discussion
of projectile motion in those terms.

How  does  a projectile  move,  in  the  absence  of  air  resistance,  when  it  is 
given an arbitrary initial velocity v_i? We already have a sort of answer in
Eq. (3-6). At any moment, the relation between the x and y coordinates of
the projectile is

$$ y = \frac{v_{yi}}{v_x}\,x - \frac{g}{2v_x^2}\,x^2 $$


The quantity v_x is the horizontal component of the projectile velocity. Its
value is a constant equal to its initial value, since the horizontal motion is



one of constant velocity. The quantity v_yi is the initial value of the vertical
component of the projectile velocity. And g is the magnitude of the gravitational
acceleration. Thus v_x and v_yi, the two components of v_i, specify the
value of the constant factor in each of the terms on the right side of the
equation. But this equation is not in the most convenient form, since we
usually know the magnitude and direction v_i and θ of the initial projectile
velocity, and not its x and y components v_x and v_yi. However, we can use
Eqs. (3-10), in the form

$$ v_x = v \cos\theta \qquad \text{and} \qquad v_y = v \sin\theta $$

to rewrite Eq. (3-6) in terms of v_i and θ (we simplify the notation by using θ
instead of the subscripted φ of Eqs. (3-10)). Setting v = v_i and v_y = v_yi, and
substituting the values of v_x and v_yi thus obtained into Eq. (3-6), we have the
following relation between r_y and r_x, the components of the instantaneous
position vector r of the projectile:


$$ r_y = \frac{v_i \sin\theta}{v_i \cos\theta}\,r_x - \frac{g}{2v_i^2 \cos^2\theta}\,r_x^2 $$

or

$$ r_y = (\tan\theta)\,r_x - \frac{g}{2(v_i \cos\theta)^2}\,r_x^2 \tag{3-38} $$

This equation is less complicated than it looks. Suppose, for the moment,
that we could “switch off” gravity, that is, that we could set g = 0. Equation
(3-38) would then become simply r_y = (tan θ)r_x. But tan θ = v_yi/v_x,
and so we would have r_y/r_x = v_yi/v_x. Thus the ratio of the distance traveled
in the y direction to that traveled in the x direction would simply be the
ratio of the components v_yi and v_x of the velocity in those directions. The
reason is that both components of the velocity (and not just v_x) would remain
constant at their initial values if g were equal to zero.

Now let us “turn on” gravity again and consider the second term on
the right side of the equation. It is of the same general form as the right
side of Eq. (3-3),

$$ y = -\frac{g}{2v_i^2}\,x^2 $$

which describes the trajectory of a horizontally launched projectile. Indeed,
the second term of Eq. (3-38) reduces to −(g/2v_i²)r_x² when θ = 0. In
Eq. (3-38), however, v_i cos θ, the horizontal component of the initial velocity,
takes the place of v_i in the simple case of horizontal launching.

With this general qualitative picture in mind, let us proceed to a quantitative
picture of the parabolic trajectory. In hitting a baseball or firing a
gun, we usually have control over the initial elevation angle θ and the magnitude
v_i of the initial velocity. Examples 3-7 and 3-8 are studies of what
happens when θ is varied while v_i is kept constant and when v_i is varied
while θ is kept constant.

The  calculations  necessary  for  Examples  3-7  and  3-8  can  be  performed 
(given  some  patience)  on  any  pocket  calculator.  However,  they  are  more 
conveniently  done  on  a programmable  pocket  calculator  or  a small  com- 
puter. A trajectory  plotting  program  is  listed  in  the  Numerical  Calculation 
Supplement. 




Fig. 3-29 Plots of the trajectories calculated for projectiles whose initial speed is v_i = 100 m/s,
shot from a gun whose elevation angle is successively θ = 10°, 30°, 45°, 60°, and 80°. Note that
maximum range is attained when θ = 45°. Also shown is the trajectory for a projectile whose
initial speed is v_i′ = 141.4 m/s, or a factor √2 greater than v_i, with elevation angle θ′ = 45°. The range
is twice the maximum attained by the slower projectile.


EXAMPLE 3-7

Use Eq. (3-38) to plot the trajectories of a projectile having an initial velocity of magnitude
v_i = 100 m/s, for elevation angles θ = 10°, 30°, 45°, 60°, and 80°.

■ As in Example 2-5, you must keep track of the proper units, since a calculator
deals with numbers only. You therefore rewrite Eq. (3-38) in the completely equivalent
form

$$ r_y = (\tan\theta)\,r_x - \frac{9.8}{2 \times (100)^2 \cos^2\theta}\,r_x^2 \qquad (r_x \text{ and } r_y \text{ in meters}) $$

The family of plots shown in Fig. 3-29 was obtained by drawing smooth curves
through points gotten by calculating values of r_y for successive values of r_x which
were taken 50 m apart. You will find it worthwhile to calculate and plot one such
trajectory yourself, choosing any values you like for v_i and θ and thus specifying the vector v_i.

As you would expect, increasing θ increases the maximum height y_max to which
the projectile rises. At first the range R (the horizontal distance from the origin to
the point where the projectile returns to y = 0) increases with increasing θ, but then
it decreases. The maximum range appears to be attained with θ = 45°. You can
check this point by running some more calculations yourself, choosing angles close
to 45° and keeping v_i constant.
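
If a small computer is at hand, the tabulation described in this example can be sketched in Python. This is not the program of the Numerical Calculation Supplement; the function names height and trajectory are hypothetical, and the sketch simply evaluates Eq. (3-38) at points 50 m apart, stopping when the projectile would be below the ground.

```python
import math

G = 9.8  # m/s^2, magnitude of the gravitational acceleration used in Example 3-7

def height(r_x, v_i, theta_deg):
    """Eq. (3-38): height r_y (m) at horizontal distance r_x (m)."""
    theta = math.radians(theta_deg)
    return math.tan(theta) * r_x - G / (2.0 * (v_i * math.cos(theta)) ** 2) * r_x ** 2

def trajectory(v_i, theta_deg, step=50.0):
    """Points (r_x, r_y) taken `step` meters apart while the projectile is above ground."""
    points = []
    r_x = 0.0
    while True:
        r_y = height(r_x, v_i, theta_deg)
        if r_y < 0.0:
            break
        points.append((r_x, r_y))
        r_x += step
    return points

for theta_deg in (10, 30, 45, 60, 80):
    pts = trajectory(100.0, theta_deg)
    highest = max(r_y for _, r_y in pts)
    print(f"theta = {theta_deg:2d} deg: last point above ground at r_x = {pts[-1][0]:6.0f} m, "
          f"highest tabulated r_y = {highest:6.1f} m")
```

The lists of points returned by trajectory can be plotted by hand or passed to any plotting tool. Because the points are 50 m apart, the last tabulated point only brackets the range to within one step.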


All  the  trajectories  of  Fig.  3-29  other  than  the  45°  trajectory  fall  into 
pairs  having  equal  range.  The  members  of  each  pair  have  elevation  angles 
lying  at  equal  angles  above  and  below  45°,  as  can  be  seen  from  Fig.  3-29. 
This  remarkable  fact  leads  us  to  look  for  a qualitative  reason  why  it  should 
be so. With fixed v_i, a large angle θ means that the projectile will rise high
and hence will take a relatively long time to reach the ground again. However,
the component v_x = v_i cos θ will be small, since cos θ is small for large
angles. If T is the total time the projectile spends aloft, R = v_x T will not be
large. On the other hand, if θ is small, v_x will be large but the projectile will
not remain above the ground very long. Somewhere in the middle (the calculation
suggests θ = 45°) the range is maximized.


EXAMPLE 3-8

Repeat the calculation of Example 3-7 with elevation angle θ′ = 45°, but increase v_i
by a factor √2 so that v_i′ = 141.4 m/s.




■ When  you  repeat  the  calculation  with  these  new  values,  you  obtain  a plot  like 
the  longest  trajectory  in  Fig.  3-29.  Comparing  the  range  with  that  obtained  for  a 45° 
elevation  angle  in  Example  3-7,  you  see  that  the  range  has  doubled.  Since  the  initial 
speed is increased by a factor √2, this suggests that the range is proportional to v_i².


We  will  now  generalize  the  ideas  arising  from  the  numerical  calcula- 
tions of  Examples  3-7  and  3-8,  and  derive  an  analytical  expression  for  the 
range R in terms of v_i and θ, the magnitude and direction of the initial
velocity vector v_i. We have set up the problem so that the projectile starts at
the  coordinates  x = 0,  y = 0.  When  it  returns  to  the  ground  (which  is  as- 
sumed to  be  level),  it  does  so  at  the  specific  coordinates  x = R,  y = 0.  We 
can  therefore  rewrite  Eq.  (3-38)  in  this  special  case  by  substituting  the  value 
R for  x and  the  value  0 for  y.  We  find 

$$ 0 = (\tan\theta)\,R - \frac{g}{2v_i^2 \cos^2\theta}\,R^2 $$

Since R ≠ 0, we can divide this quadratic equation through by R. Doing so
reduces it immediately to a linear equation which can be solved for R to
yield

$$ R = \frac{v_i^2}{g}\,2\tan\theta\,\cos^2\theta \tag{3-39} $$

This expression can be simplified by noting that

$$ 2\tan\theta\,\cos^2\theta = 2\,\frac{\sin\theta}{\cos\theta}\,\cos^2\theta = 2\sin\theta\cos\theta $$

Using the standard trigonometric identity 2 sin θ cos θ = sin 2θ, and substituting
into Eq. (3-39), we have

$$ R = \frac{v_i^2}{g}\,\sin 2\theta \tag{3-40} $$

As suggested by the numerical solutions, the range depends on the square
of the initial speed v_i.

The range R depends on the sine of twice the elevation angle θ. Furthermore,
R is a maximum when sin 2θ has its maximum value of 1. This
occurs when θ = 45°, a result which is probably in rough but reasonable
agreement  with  your  experience  in  throwing  balls,  in  spite  of  the  fact  that 
we  have  ignored  air  resistance.  In  any  case,  the  analytical  result  certainly 
agrees  with  the  numerical  solutions  plotted  in  Fig.  3-29. 

The symmetry observed in Fig. 3-29 for elevation angles symmetric
about 45° finds its analytical expression in the symmetry of the function
sin 2θ about the angle θ = 45°. That is, since sin[2(45° − α)] =
sin[2(45° + α)], the range will be the same for any two elevation angles
θ = 45° ± α which are equal amounts greater than and less than 45°.
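
Equation (3-40) is easy to check against the numerical results of Examples 3-7 and 3-8. The sketch below is an illustration only: the function name projectile_range is hypothetical, and g = 9.8 m/s² is taken from Example 3-7. It evaluates the range for the two initial speeds at 45° and for two pairs of angles lying equal amounts above and below 45°.

```python
import math

G = 9.8  # m/s^2

def projectile_range(v_i, theta_deg):
    """Eq. (3-40): range on level ground, air resistance ignored."""
    return v_i ** 2 * math.sin(math.radians(2.0 * theta_deg)) / G

# Examples 3-7 and 3-8: raising v_i by sqrt(2) doubles the 45-degree range.
r_slow = projectile_range(100.0, 45.0)
r_fast = projectile_range(100.0 * math.sqrt(2.0), 45.0)
print(f"R(100 m/s, 45 deg)   = {r_slow:.1f} m")
print(f"R(141.4 m/s, 45 deg) = {r_fast:.1f} m, ratio = {r_fast / r_slow:.3f}")

# Elevation angles 45 - alpha and 45 + alpha give the same range.
for alpha in (15.0, 35.0):
    low = projectile_range(100.0, 45.0 - alpha)
    high = projectile_range(100.0, 45.0 + alpha)
    print(f"alpha = {alpha:4.1f} deg: R = {low:.1f} m and {high:.1f} m")
```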

The  small  angle  of  each  pair  gives  a flat  trajectory,  and  the  large  angle 
a high  trajectory.  Air  resistance  tends  to  affect  the  high  trajectory  more, 
since  it  is  longer.  In  baseball,  the  high  trajectory  is  called  a pop  fly.  Such  a 
fly  is  easy  to  catch,  since  the  time  of  flight  is  so  long  that  the  fielder  has 
plenty  of  time  to  get  into  position.  The  low  trajectory  is  called  a line  drive. 
It  is  much  harder  to  catch  because  it  takes  much  less  time  than  the  pop  fly 
to  reach  the  same  point  in  the  field. 
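
The difference in flight time between the two members of a pair can be made quantitative. Since R = v_x T with v_x = v_i cos θ, Eq. (3-40) gives T = R/(v_i cos θ) = 2v_i sin θ/g. The sketch below uses this relation; the function name time_of_flight, the 30 m/s speed, and the 20° and 70° angles are illustrative assumptions, not values from the text.

```python
import math

G = 9.8  # m/s^2

def time_of_flight(v_i, theta_deg):
    """Time aloft on level ground: T = R / v_x = 2 v_i sin(theta) / g."""
    return 2.0 * v_i * math.sin(math.radians(theta_deg)) / G

v_i = 30.0  # m/s, an illustrative batted-ball speed (assumed, not from the text)
for theta_deg in (20.0, 70.0):
    r = v_i ** 2 * math.sin(math.radians(2.0 * theta_deg)) / G
    print(f"theta = {theta_deg:.0f} deg: range = {r:.1f} m, "
          f"time aloft = {time_of_flight(v_i, theta_deg):.2f} s")
```

Both balls land at the same point, but with air resistance ignored the high one stays aloft longer by the factor sin 70°/sin 20°, which is between 2 and 3.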




Fig.  3-30  Strobe  photo  of  an  air-table  puck  in  uni- 
form circular  motion.  A string  attached  to  the  puck  is 
threaded  over  a low-friction  pulley  through  a hole  in 
the  table.  A weight  hanging  at  the  other  end  of  the 
string  (not  seen)  maintains  a constant  tension  in  the 
string.  By  means  of  repeated  trials,  the  puck  is  set  into 
motion  in  such  a way  that  it  moves  in  a circle.  The 
speed  of  the  puck  is  constant,  as  can  be  seen  from  the 
uniform  spacing  between  successive  puck  images.  The 
clock  hand  makes  one  revolution  per  second.  It  is 
illuminated  by  the  strobe  light  to  show  the  strobe  flash 
interval. 


3-5 UNIFORM CIRCULAR MOTION AND CENTRIPETAL ACCELERATION


The  case  of  uniform  circular  motion  is  the  most  important  application  of 
the  general  idea  of  centripetal  acceleration,  which  we  developed  in  Sec.  3-3. 
It  is  also  the  simplest.  Out  of  this  application  will  come  a quantitative  result 
which  has  direct  bearing  on  the  motion  of  such  things  as  artificial  and  natu- 
ral satellites,  of  nuclear  particles  in  accelerators,  of  bodies  whirling  at  the 
ends  of  strings,  and  of  flywheels  spinning  on  shafts. 

The  air  table  experiment  illustrated  in  the  strobe  photo  of  Fig.  3-30 
provides  an  example  of  uniform  circular  motion.  A puck  is  connected  to  a 
string.  The  string  is  threaded  through  a hole  in  the  center  of  the  air  table, 
over  a pulley  which  can  swivel  freely  with  very  little  friction.  A small  weight 
hanging  from  the  end  of  the  string  beneath  the  air  table  maintains  a con- 
stant tension  in  the  string.  If  you  look  carefully  at  the  photograph,  you  can 
see  the  string  stretching  between  the  pulley  and  the  puck. 

To  begin  the  experiment,  the  puck  is  carefully  set  into  circular  motion 
by  projecting  it  at  a certain  speed  and  distance  from  the  center,  with  a 
direction  of  motion  perpendicular  to  the  string.  The  proper  speed  for  pro- 
ducing circular  motion  at  that  distance  is  found  by  trial  and  error.  The  fact 
that  the  orbit  is  circular  ensures  that  the  velocity  of  the  puck  always  remains 
perpendicular  to  the  string,  since  the  tangent  to  a circle  is  perpendicular  to 
the  radius  at  the  point  of  tangency.  Inspection  of  the  figure  will  satisfy  you 
that  the  orbit  is  indeed  circular  and  that  the  speed  is  constant. 

The  magnitude  of  the  velocity  vector  — the  speed  of  the  puck — is  not 
changing.  The  puck  is  nevertheless  accelerating,  since  the  direction  of  the 
velocity  vector  is  continually  changing.  We  will  determine  the  acceleration 
of  the  puck  by  measurements  on  the  photograph.  Then  we  will  use  the 
ideas  suggested  by  this  experiment  to  derive  a mathematical  expression  for 
the  centripetal  acceleration. 

Figure 3-31 is a copy of Fig. 3-30, in which vectors are added to represent
the change in position of the puck between flash 1 and flash 2 of the
strobe  light,  and  its  change  in  position  between  flash  2 and  flash  3.  Each 
vector  is  drawn  with  its  tail  at  the  center  of  a puck  image  and  its  head  at  the 
center  of  the  next  image.  Thus  each  vector  shows  the  change  in  the  posi- 




Fig.  3-31  The  vectors,  drawn  in  over  a copy  of  Fig. 
3-30,  represent  the  displacements  of  the  puck  during 
the  time  intervals  between  successive  flashes  of  the 
strobe  light.  If  we  choose  the  unit  of  time  to  be  one 
flash  interval,  the  vectors  represent  the  displacements 
per  unit  time  interval,  which  are  the  average  velocities 
during  those  intervals. 


tion  of  the  puck  during  a time  interval  between  consecutive  flashes  of  the 
strobe  light.  It  shows  this  directly,  without  referring  to  the  change  in  a vector 
describing  the  position  of  the  puck  with  respect  to  some  particular  refer- 
ence frame.  Hence  the  vector  is  a displacement  vector.  The  word  is  used  in 
complete  analogy  to  the  way  it  is  used  in  the  preceding  section.  The  only 
difference  is  that  here  the  displacement  vectors  are  not  of  infinitesimal 
magnitude,  because  here  the  time  intervals  during  which  the  displacements 
occur  are  not  of  infinitesimal  duration.  The  average  velocity  of  the  puck  in 
the  time  interval  between  any  consecutive  pair  of  flashes  is  found  by  di- 
viding the  vector  depicting  its  displacement  from  one  flash  to  the  next  by 
the  duration  of  the  time  interval  between  the  flashes.  This  is  analogous  to 
the  statement,  made  in  Sec.  3-4,  that  instantaneous  velocity  is  found  by  di- 
viding infinitesimal  displacement  by  the  corresponding  infinitesimal  time 
interval.  Since  the  time  interval  between  any  two  successive  flashes  of  the 
strobe  light  is  always  the  same,  the  displacement  vectors  are  proportional  to 
the  average  velocities  during  the  corresponding  time  intervals.  In  fact,  we 
will  use  this  fixed  time  interval  as  the  unit  of  time.  If  we  do  so,  the  nu- 
merical value  of  the  time  interval  is  1 unit,  and  so  the  vectors  shown  in  Fig. 
3-31  are  the  average  velocities.  In  other  words,  we  will  measure  velocity  in 
terms  of  displacement  per  strobe  flash  interval. 

The  puck  experiences  an  average  acceleration  as  its  average  velocity 
changes  from  the  value  it  has  over  the  time  interval  between  flash  1 and 
flash  2 to  the  value  it  has  over  the  time  interval  between  flash  2 and  flash  3. 
This  average  acceleration  can  be  constructed  by  using  the  fact  that  the  final 
average  velocity  vector  is  the  sum  of  the  initial  average  velocity  vector  and 
the  vector  specifying  the  change  in  velocity.  With  this  in  mind,  we  move  the 
final  velocity  vector  so  that  its  tail  coincides  with  the  tail  of  the  initial  veloc- 
ity vector,  as  in  Fig.  3-32.  In  the  process,  the  direction  and  length  of  the 
final  vector  are  not  changed.  As  you  can  see  from  Fig.  3-32,  the  solid  vector 
(in  white)  is  the  change  in  average  velocity  over  one  flash  interval,  because 
the  initial  average  velocity  vector  (dashed  white)  plus  the  change  in  velocity 
(solid  white)  equals  the  final  average  velocity  (dashed  gray).  Since  we  are 




Fig.  3-32  Measurement  of  the  average  acceleration 
between  flash  intervals.  The  second  velocity  vector 
(dashed  gray)  is  moved  parallel  to  itself  so  that  its  tail 
coincides  with  the  tail  of  the  first  velocity  vector 
(dashed  white).  The  geometric  procedure  for  sub- 
tracting vectors,  described  in  Sec.  3-2,  is  then  carried 
out.  The  solid  white  vector  is  the  difference  between 
the  second  average  velocity  vector  and  the  first,  and 
is  thus  the  change  in  average  velocity  between  the 
first  flash  interval  and  the  second.  This  change  in 
average  velocity  per  flash  interval  is  the  average 
acceleration. 


taking  the  strobe  flash  interval  to  be  the  unit  of  time,  the  solid  white  vector 
is  the  change  in  average  velocity  per  unit  time;  that  is,  it  is  the  average 
acceleration.  You  should  note  that  the  acceleration  vector  points  toward 
the  center  of  the  circular  path  of  the  puck. 
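
The graphical construction of Figs. 3-31 and 3-32 can be imitated with idealized data. The sketch below does not use measurements from the photograph. It assumes a puck moving on a circle of unit radius that advances 20° per flash (both values arbitrary), and the function names strobe_positions and subtract are hypothetical. Taking one flash interval as the unit of time, it forms the two displacement vectors, subtracts them, and tests whether the difference points from the puck's position at flash 2 toward the center.

```python
import math

def strobe_positions(radius, angle_per_flash, n_flashes):
    """Positions of a puck in uniform circular motion about the origin,
    sampled at equally spaced flashes (angle in radians)."""
    return [(radius * math.cos(k * angle_per_flash),
             radius * math.sin(k * angle_per_flash)) for k in range(n_flashes)]

def subtract(p, q):
    return (p[0] - q[0], p[1] - q[1])

p1, p2, p3 = strobe_positions(radius=1.0, angle_per_flash=math.radians(20.0), n_flashes=3)

# With one flash interval as the unit of time, displacements are average velocities.
v_first = subtract(p2, p1)
v_second = subtract(p3, p2)
# The change in average velocity per flash interval is the average acceleration.
a_avg = subtract(v_second, v_first)

# Compare the direction of a_avg with the direction from flash 2 to the center.
to_center = (-p2[0], -p2[1])
cross = a_avg[0] * to_center[1] - a_avg[1] * to_center[0]
dot = a_avg[0] * to_center[0] + a_avg[1] * to_center[1]
print("average velocities:", v_first, v_second)
print("average acceleration:", a_avg)
print("parallel to inward radius at flash 2:", abs(cross) < 1e-12 and dot > 0)
```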

As  was  implied  at  the  beginning  of  the  discussion,  the  puck  can  be 
started  in  different  ways,  and  will  then  move  in  different  paths.  In  the  ex- 
periment depicted  by  the  strobe  photo  of  Fig.  3-33,  the  puck  is  simply  re- 
leased from  rest  at  some  distance  from  the  center.  Not  surprisingly,  the 
puck  starts  to  move  on  a straight  line  toward  the  center.  In  fact,  it  acceler- 
ates toward  the  center  under  the  influence  of  the  tension  in  the  string,  as 
you  can  see  from  the  increasing  magnitudes  of  the  displacements  of  the 
puck  between  successive  flashes.  For  this  second  experiment,  the  tension  in 


Fig.  3-33  In  this  strobe  photograph,  the  experimental  setup  is 
identical  with  that  of  Fig.  3-30.  Here,  however,  the  air  puck  has  been 
released  from  rest. 



Fig.  3-34  The  vectors,  drawn  in  over  a copy  of  Fig.  3-33, 
represent  the  displacements  of  the  puck  during  the  time  intervals 
between  successive  flashes  of  the  strobe  light.  As  in  Fig.  3-31,  the 
vectors  denote  the  average  velocities  during  these  time  intervals,  if 
the  unit  of  time  is  chosen  to  be  one  flash  interval. 


Fig.  3-35  Measurement  of  the  average  acceleration  between  two 
successive  flash  intervals.  The  procedure  followed  is  the  same  as 
that  described  in  the  caption  to  Fig.  3-32. 


the string has been adjusted to the same value as in the first experiment
with the puck in a circular orbit. The same flash interval (or time unit) and
the  same  distance  unit  are  used  in  the  second  experiment,  as  shown  by  the 
clock  and  the  distance  scale. 

We  want  to  measure  the  average  acceleration  in  this  second  experi- 
ment. In  Fig.  3-34  the  average  velocity  vectors  are  constructed  for  two  suc- 
cessive time  intervals.  In  Fig.  3-35,  the  difference  of  these  two  velocities  is 
used  to  determine  the  average  acceleration.  As  before,  the  dashed  white 
vector  is  moved  without  altering  its  length  or  direction,  so  that  the  tails  of 
the  two  velocity  vectors  coincide.  The  difference,  which  is  the  average 
acceleration,  is  the  solid  white  vector.  Note  its  magnitude  and  direction  and 
then  compare  those  values  with  the  magnitude  and  direction  of  the  average 
acceleration  found  for  the  circular-motion  experiment  in  Fig.  3-32. 

The  comparison  shows  that  the  accelerations  are  of  the  same  magni- 
tude, within  experimental  accuracy,  and  that  both  point  toward  the  center. 
This  is  true  even  though  the  different  initial  conditions  have  led  to  quite 
different  motions. 


In  view  of  the  connection  between  force  and  acceleration  established 
in  Sec.  1-3,  this  result  should  not  be  a mystery.  If  the  tension  in  the  string  is 
the  same  in  both  experiments,  then  in  both  the  force  acting  on  the  puck  is 
the  same — the  force  has  a certain  fixed  magnitude  and  is  always  directed 
inward  toward  the  center  of  the  air  table.  Thus  in  both  experiments  the 
puck  should  have  an  acceleration  which  is  the  same  in  magnitude  and  is 
always directed toward the center of the air table. In the second experiment
(Figs. 3-33 through 3-35) the acceleration is of the more obvious kind
illustrated in Fig. 3-28a, where a is parallel to v, so that the magnitude of
the velocity changes but its direction remains constant. In the first experiment
(Figs. 3-30 through 3-32) a is perpendicular to v, as in Fig. 3-28b.
Consequently, the direction of the velocity changes, but its magnitude remains
constant. But in both cases there is an acceleration resulting from a
change in a property of the velocity vector.


3-5  Uniform  Circular  Motion  and  Centripetal  Acceleration  87 




Fig.  3-36  Idealized  representation  of 
the  experimental  analysis  of  Fig.  3-32.  A 
body  moves  in  a circular  path  of  radius  r, 
shown  by  the  dashed  curve.  The  center 
of  the  circle  is  at  O.  In  an  unspecified 
unit  time  interval,  the  body  passes  from 
position  1 to  position  2,  and  in  an  equal 
time  interval  it  passes  from  position  2 to 
position  3.  The  vectors  of  magnitude  v 
from  1 to  2 and  from  2 to  3 are  the 
average  velocities  (displacements  per 
unit  time  interval)  over  those  intervals. 
The vector of magnitude a (drawn from
2 to P) represents the average acceleration
of the body from the first time
interval to the second. The angle θ is
a base angle both of the isosceles triangle
12P and of the isosceles triangle
O12. The triangles are thus similar, and
a/v = v/r, or a = v²/r.


So  far  we  have  been  concerned  with  experimental  measurements  on  a 
body  moving  uniformly  in  a circle.  Now  we  will  devise  a theoretical  ac- 
count of  this  motion.  The  diagram  in  Fig.  3-36  follows  very  closely  the  con- 
struction in  Fig.  3-32.  It  shows  successive  positions,  labeled  1,  2,  3,  of  a body 
moving  uniformly  in  a circular  path  centered  on  the  point  O.  The  time  in- 
terval between  positions  1 and  2 is  the  same  as  that  between  positions  2 and 
3.  These  positions  are  connected  to  each  other,  and  to  the  center  of  the 
circle,  by  straight  lines.  As  before,  the  length  of  the  line  from  1 to  2 denotes 
the  magnitude  v of  the  average  velocity  vector  over  the  first  time  interval. 
(We  continue  to  use  the  duration  of  the  time  interval  as  a convenient  unit  to 
define  velocities  and  accelerations.)  The  magnitude  of  the  average  velocity 
over  the  second  time  interval  has  the  same  value  v,  although  the  direction 
of  this  average  velocity  is  different.  The  second  average  velocity  vector  is 
also  shown  after  having  been  moved  parallel  to  itself  so  as  to  bring  its  tail  to 
point 1. The figure shows that its head is then at point P, a point located on
the radial line O2. (It can be shown on the basis of symmetry considerations
that P actually lies on that line, and you should do this yourself later.)

The  average  acceleration  between  the  two  successive  time  intervals  is 
the  vector  from  point  2 to  point  P,  and  its  magnitude  a is  the  length  of  that 
vector.  We  want  to  determine  how  a is  related  to  v and  to  the  radius  r of  the 
circle. This relation depends on the fact that the triangle 12P is similar to
the triangle O12. [That is, both triangles are of the same shape and differ
only in size. This is true because both triangles are isosceles (each triangle
has two sides of equal length) and they have a common base angle: the
angle 12P or 12O. Two isosceles triangles with the same base angle are similar.]

The ratios of corresponding sides of similar triangles are equal. Thus
a/v for triangle 12P is equal to v/r for triangle O12. That is,

$$ \frac{a}{v} = \frac{v}{r} $$

or

$$ a = \frac{v^2}{r} \tag{3-41a} $$
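The similar-triangle result a = v²/r can be checked numerically. The sketch below is only an illustration: the function name and the values of r, omega, and dt are hypothetical choices, not taken from the text. It places three strobe positions on a circle, forms the two average velocities and their difference, and compares the resulting average acceleration with v²/r.

```python
import math

def average_acceleration_check(r, omega, dt):
    """Return (a, v**2 / r) for three equally timed positions on a circle."""
    # Positions 1, 2, 3 of a body moving uniformly on a circle of radius r.
    pts = [(r * math.cos(omega * k * dt), r * math.sin(omega * k * dt))
           for k in range(3)]
    # Average velocities over the two intervals (displacement / dt).
    v12 = ((pts[1][0] - pts[0][0]) / dt, (pts[1][1] - pts[0][1]) / dt)
    v23 = ((pts[2][0] - pts[1][0]) / dt, (pts[2][1] - pts[1][1]) / dt)
    v = math.hypot(*v12)  # magnitude of the average velocity
    # Average acceleration: change in average velocity per time interval.
    a = math.hypot(v23[0] - v12[0], v23[1] - v12[1]) / dt
    return a, v * v / r

# Hypothetical radius (m), angular speed (rad/s), and flash intervals (s).
for dt in (0.5, 0.1, 0.001):
    a, predicted = average_acceleration_check(r=2.0, omega=1.5, dt=dt)
    print(f"dt = {dt}: a = {a:.6f}, v^2/r = {predicted:.6f}")
```

The two columns should agree for every choice of dt, large or small, which is what the geometric argument implies.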


The symbols a and v in this relation represent the average values of the
magnitudes of the acceleration and the velocity. However, the argument
leading to Eq. (3-41a) imposes no special conditions on the (equal) time intervals
required for the body of Fig. 3-36 to pass from position 1 to position
2 and from position 2 to position 3. While practical difficulties would prevent
the reduction of the strobe flash interval in the experiment of Figs. 3-30
through 3-32 to an arbitrarily small value, no such difficulty stands in the
way of the theoretical analysis of Fig. 3-36. Reducing the time interval to an
infinitesimally small value leads to an infinitesimal separation of points 1, 2,
and 3 in the figure, so that there is no longer a distinction to be made
between the average and instantaneous magnitudes of the velocity and of
the acceleration. But reducing the time interval makes no change at all in
the argument leading to Eq. (3-41a). Thus Eq. (3-41a) is an exact relation
between  the  instantaneous  magnitudes  a and  v,  and  the  value  of  r.  Further- 
more, for  any  time  interval  the  direction  of  the  average  acceleration  is 
always  toward  the  center  of  the  circle.  Hence  the  same  is  true  of  the  direc- 
tion of  the  instantaneous  acceleration  obtained  from  the  argument  when 


88  Kinematics  in  Two  and  Three  Dimensions 


the time interval is infinitesimal. Thus we can write a vector expression for
the centripetal acceleration a_c in the form

$$ \mathbf{a}_c = -\frac{v^2}{r}\,\hat{\mathbf{r}} \tag{3-41b} $$

where r̂ is a unit vector pointing from the center of the circle to the instantaneous
location of the moving body. We rederive this important equation
in Chap. 9, using a very different and more formal argument.

The derivation given above is only slightly modified from the original form in
which it was given by Christian Huygens (1629-1695). You will see in Chap. 11
the important role which it played in Newton’s development of mechanics.
Huygens’ life overlapped those of both Galileo (1564-1642) and Newton
(1642-1727), and his contributions to physics and technology were prolific and
wide-ranging. Although he was Dutch, he spent much of his working life in Paris,
and he was a principal contributor to the establishment of modern physics in
France.

Example 3-9 is an application of Eq. (3-41b).

EXAMPLE 3-9

A child whirls a stone, tied to a string, around her head at a speed of 6.0 m/s. The
stone describes a horizontal circle of radius 0.75 m. What is the centripetal acceleration
of the stone?

■ Use Eq. (3-41b) to obtain

$$ \mathbf{a}_c = -\frac{v^2}{r}\,\hat{\mathbf{r}} = -\frac{(6.0\ \text{m/s})^2}{0.75\ \text{m}}\,\hat{\mathbf{r}} = -48\ \text{m/s}^2 \times \hat{\mathbf{r}} $$

That is, the magnitude of the acceleration is 48 m/s², and its direction is inward.
The magnitude a_c is relatively large, being 4.9 times the magnitude of the gravitational
acceleration g = 9.8 m/s².

3-6 THE MINIMUM-ORBIT EARTH SATELLITE

The  idea  of  launching  a satellite  into  orbit  around  the  earth  originated 
with  Newton,  although  nearly  three  centuries  had  to  elapse  before  the 
actuality  came  within  range  of  available  technology.  Newton’s  illustration 
of  satellite  motion  is  reproduced  in  Fig.  3-37.  The  trajectories  of  cannon- 
balls, launched  parallel  to  the  surface  of  the  earth  from  a mountaintop,  are 
shown  for  a set  of  initial  velocities  of  increasing  magnitude.  The  illustration 
speaks  for  itself! 


Fig.  3-37  A hypothetical  method  for  launching 
an  earth  satellite,  as  illustrated  by  a figure  in 
Newton’s  Principia  Mathematica.  A cannon  on  a 
mountaintop  fires  a series  of  cannonballs  hori- 
zontally, with  successively  increasing  muzzle 
speeds.  The  shortest  trajectory  is  close  to  a parab- 
ola (it  would  be  exactly  a parabola  if  the  earth  were 
flat  or  if  the  range  were  short  enough  that  the 
curvature  of  the  earth  did  not  need  to  be  taken  into 
consideration).  If  the  muzzle  speed  is  great 
enough,  the  cannonball  clears  the  earth  and 
returns  to  its  starting  point.  Two  orbits  are  shown 
for  satellites  launched  from  points  higher  above 
the  earth  than  the  mountaintop.  ( Courtesy  of  the  New 
York  Public  Library.) 




An  earthbound  object,  like  the  puck  on  an  air  table  in  Fig.  1-5,  main- 
tains a constant  velocity  if  no  net  force  acts  on  it.  The  same  is  true  of  a satel- 
lite. It  would  move  in  a straight  line  with  constant  speed — that  is,  it  would 
move  with  constant  velocity — if  there  were  no  net  force  acting  on  it.  But 
there  is  a net  force — the  gravitational  force — exerted  on  the  satellite  by  the 
earth  in  the  direction  toward  the  center  of  the  earth.  So  the  satellite  must 
accelerate  toward  the  center  of  the  earth.  If  it  did  not  have  a tangential 
velocity,  the  satellite  would  do  this  by  falling  directly  toward  the  center  of 
the  earth,  in  analogy  to  the  motion  of  the  air  table  puck  in  Fig.  3-33.  The 
magnitude  of  its  velocity  would  thus  change,  but  not  the  direction.  Instead, 
the satellite accelerates toward the center of the earth (despite the fact that
its distance from the center of the earth is constant) because it has a tangential
velocity which is continually changing in direction, but not in magnitude.
The satellite’s motion is analogous to that of the air table puck in Fig. 3-30.


The  easiest  satellite  orbit  to  achieve  is  an  approximately  circular  orbit 
just  barely  high  enough  above  the  earth’s  surface  to  avoid  excessive  friction 
from  air  resistance.  In  this  minimum  earth  orbit,  at  an  altitude  of  about 
160  km,  air  resistance  is  sufficiently  small  to  allow  the  satellite  a lifetime  of 
a few  weeks. 

What  is  the  speed  of  such  a satellite?  What  is  its  period  of  revolution 
(the  time  required  for  it  to  make  one  trip  around  its  orbit)?  In  the 
minimum-orbit  case,  the  satellite  is  so  close  to  the  surface  of  the  earth,  in 
comparison  to  the  earth’s  radius,  that  it  is  a good  approximation  to  equate 
the magnitude a_c of its acceleration to g, the magnitude of the gravitational
acceleration it would have immediately above the earth’s surface. Thus we
can set a_c = g in Eq. (3-41a), a_c = v²/r, and write the magnitude of the
acceleration as

$$ g = \frac{v^2}{r} $$

where r is the radius of the satellite’s circular orbit. Solving for its speed v,
we obtain

$$ v = \sqrt{rg} \tag{3-42} $$


To the same degree of approximation, we can set r equal to the earth's
radius. The original definition of the meter tells us immediately that
2πr/4 = 10⁷ m, so that r = 6.4 × 10⁶ m. We thus have

$$ v = (6.4 \times 10^6\ \text{m} \times 9.8\ \text{m/s}^2)^{1/2} $$

or

$$ v = 7.9 \times 10^3\ \text{m/s} = 7.9\ \text{km/s} $$


In order to find the period of revolution T, note that it is just the distance
once around the orbit divided by the speed of the satellite. Thus we
have

$$ T = \frac{2\pi r}{v} \tag{3-43} $$



Substituting numerical values into this equation gives

$$ T = \frac{2\pi \times 6.4 \times 10^6\ \text{m}}{7.9 \times 10^3\ \text{m/s}} = 5.1 \times 10^3\ \text{s} = 85\ \text{min} $$

or  just  a little  under  an  hour  and  a half.  The  actual  initial  period  of  Sputnik 
1 (launched  October  4,  1957)  was  96  min.  The  discrepancy  is  due  to  the 
fact  that  the  orbit  was  not  exactly  circular. 

Sputnik 10, launched about 3½ years later, had a much more closely
circular orbit. Its maximum and minimum altitudes were 247 km and
175 km, so that the departure from circularity was only about 1 percent. Its
orbit period was 5142 s (85 min 42 s), in close agreement with the prediction
of Eq. (3-43).
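The minimum-orbit estimate can be reproduced in a few lines. This is a minimal sketch using the rounded values quoted in the text (r = 6.4 × 10⁶ m, g = 9.8 m/s²); the function and constant names are illustrative choices, not part of the original.

```python
import math

G_SURFACE = 9.8   # m/s^2, gravitational acceleration used in the text
R_EARTH = 6.4e6   # m, earth radius obtained from 2*pi*r/4 = 1e7 m

def orbit_speed(r, g=G_SURFACE):
    """Eq. (3-42): v = sqrt(r g)."""
    return math.sqrt(r * g)

def orbit_period(r, g=G_SURFACE):
    """Eq. (3-43): T = 2 pi r / v, with v taken from Eq. (3-42)."""
    return 2.0 * math.pi * r / orbit_speed(r, g)

v = orbit_speed(R_EARTH)
T = orbit_period(R_EARTH)
print(f"v = {v / 1e3:.1f} km/s")
print(f"T = {T / 60:.0f} min")
```

With these inputs the speed comes out near 7.9 km/s and the period near 85 min, matching the hand calculation to the two significant figures carried there.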


The period T can be expressed directly in terms of r and g by substituting
Eq. (3-42) into Eq. (3-43). This gives

$$ T = \frac{2\pi r}{\sqrt{rg}} $$

or

$$ T = 2\pi\sqrt{\frac{r}{g}} \tag{3-44} $$


The magnitude a_c of the centripetal acceleration of an earth satellite in
a circular orbit of any radius can be expressed directly in terms of r and T.
From Eq. (3-43) we have

$$ v^2 = \frac{4\pi^2 r^2}{T^2} \tag{3-45} $$

Substituting this value into Eq. (3-41a), a_c = v²/r, we obtain

$$ a_c = \frac{4\pi^2 r}{T^2} \tag{3-46} $$
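As a consistency check on the relation a_c = 4π²r/T², the sketch below feeds it the minimum-orbit radius together with the period from T = 2π√(r/g). The function name is a hypothetical choice, and the numbers are the rounded values used in the text. The acceleration that comes back should be the g that went in.

```python
import math

def centripetal_acceleration(r, T):
    """Eq. (3-46): a_c = 4 pi^2 r / T^2."""
    return 4.0 * math.pi ** 2 * r / T ** 2

g = 9.8                                   # m/s^2
r = 6.4e6                                 # m, minimum-orbit radius
T = 2.0 * math.pi * math.sqrt(r / g)      # s, period from Eq. (3-44)

a_c = centripetal_acceleration(r, T)
print(f"a_c = {a_c:.2f} m/s^2")
```

Unlike Eq. (3-44), this expression does not assume a_c = g, so the same function can be applied to a circular orbit of any radius once r and T are known.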


Example  3-10  applies  Eq.  (3-46)  to  the  very  important  practical  case  of  a 
communications  satellite. 


EXAMPLE  3-10 

The  period  of  the  synchronous  satellites  used  for  long-distance  communication  is 
23  h 56  min  (h  stands  for  hours).  This  is  the  time  required  for  the  earth  to  make 
one  rotation  with  respect  to  the  “fixed  stars.”  The  orbit  of  the  satellite  is  circular.  If 
the  orbit  lies  in  the  same  plane  as  the  earth’s  equator,  and  the  direction  of  revolu- 
tion of  the  satellite  around  the  earth  is  the  same  as  that  of  the  rotation  of  the  earth 
about  its  own  axis,  the  satellite  appears  to  be  permanently  suspended  above  some 
point  on  the  equator.  (This  greatly  facilitates  the  aiming  of  antennas  on  the  earth  at 
the  satellite  and  of  antennas  on  the  satellite  at  fixed  earth  stations.)  The  altitude  of 
such  a satellite  is  about  35,800  km.  Find  the  acceleration  of  gravity  at  this  altitude. 

■ Equation (3-46) will yield the desired magnitude a_c of the acceleration due to
gravity. The period T is given, but you must determine the value of the orbit radius
r. You add the distance from the center of the earth to its surface, 6.4 × 10⁶ m, to
the altitude, or distance from the earth’s surface to the satellite orbit, 35.8 × 10⁶ m.
This gives you r = 42.2 × 10⁶ m. You then convert 23 h 56 min to seconds, and obtain
the value T = 8.62 × 10⁴ s for the period of the satellite. Using these figures in
Eq. (3-46) gives you the acceleration

$$ a_c = \frac{4\pi^2 \times 42.2 \times 10^6\ \text{m}}{(8.62 \times 10^4\ \text{s})^2} = 0.224\ \text{m/s}^2 $$

This is smaller than the acceleration of gravity near the earth's surface, g = 9.80
m/s², by a factor of 2.29 × 10⁻².

You will see in Chap. 11 how Newton used a natural earth satellite (the moon)
in an approximately circular orbit of radius 4 × 10⁸ m to evaluate a_c at that distance
from the earth, and the crucial role this evaluation played in Newton’s development
of the “law of gravity.”

3-7 THE CONICAL PENDULUM AND THE BANKING OF CURVES

In  Example  3-9  we  considered  the  motion  of  a stone  on  the  end  of  a string, 
being  whirled  in  a circular  orbit.  Let  us  now  consider  the  entire 
system  — the  string  as  well  as  the  stone. 

Imagine  what  happens  as  the  child  puts  her  stone  into  motion.  At  first 
the  stone  hangs  vertically.  As  she  begins  to  swing  it,  it  describes  a circular 
path,  so  that  the  string  itself  describes  a cone.  As  the  stone  moves  faster  and 
faster,  it  describes  a wider  and  wider  circle,  so  that  the  string  describes  a 
broader  and  broader  cone. 

Let  us  consider  the  stone  when  it  has  been  spun  up  to  a particular 
speed.  If  the  child  then  holds  her  end  of  the  string  still,  the  stone  will  con- 
tinue indefinitely  along  the  same  circular  pathway,  neglecting  friction. 

This system is called a conical pendulum. It is shown in perspective in
Fig. 3-38a and in side view at a particular instant of time in Fig. 3-38b. The
length of the string is l, and it makes an angle θ with the vertical. The vector
r locates the stone, which we call the pendulum bob, with respect to the
center of its circular path.

If the string were suddenly cut, the bob would instantaneously acquire
a downward acceleration g. If, on the other hand, gravity could suddenly
be “turned off,” the tension in the string at that instant would result in an
instantaneous acceleration a_s of the bob in the direction along the string;
a_s is shown in Fig. 3-38b.

Since the bob neither ascends nor descends, its actual vertical acceleration
must be zero. Thus a_sy, the vertical constituent vector of a_s, must be


Fig. 3-38 (a) A conical pendulum. The bob whirls in a horizontal circle. The string, which is
supported at point P, describes a cone. (b) A “side view” of the system at a particular instant. The
horizontal circular orbit of the bob has radius r; its instantaneous location with respect to the
center of the orbit is specified by the vector r. The string, of length l, makes an angle θ with the
vertical. In the absence of the string tension, the bob would fall with vertical acceleration g. If
gravity were suddenly to disappear, the tension in the string would result in an instantaneous
acceleration of the bob given by the vector a_s, whose direction is along the string. (c) The vectors
a_s and g are shown, together with the horizontal and vertical constituent vectors of a_s, which are,
respectively, a_sx and a_sy.




equal in magnitude to g and oppositely directed, so that the vector sum a_sy + g = 0. See
Fig. 3-38c, where we have constructed the horizontal and vertical constituent
vectors a_sx and a_sy of a_s. The condition that the two vertical accelerations
add to zero gives the following relation between a_sy, the magnitude of
the vector a_sy, and g, the magnitude of the vector g:

$$ a_{sy} = g \tag{3-47} $$

But the figure shows that a_sy = a_s cos θ, where a_s is the magnitude of the vector a_s.
Thus we have

$$ a_s \cos\theta = g \tag{3-48} $$

The horizontal constituent vector a_sx of the vector a_s is the centripetal
acceleration associated with the circular motion of the pendulum bob in the
horizontal plane. The vector a_sx has the direction -r̂, and its magnitude is
a_sx = a_s sin θ. If the speed of the bob is v and r is the radius of its circular
orbit, then Eq. (3-41a) requires that

$$ a_{sx} = \frac{v^2}{r} \tag{3-49} $$

or

$$ a_s \sin\theta = \frac{v^2}{r} \tag{3-50} $$

If we divide Eq. (3-50) by Eq. (3-48), we obtain

$$ \frac{a_s \sin\theta}{a_s \cos\theta} = \frac{v^2}{rg} $$

or

$$ \tan\theta = \frac{v^2}{rg} \tag{3-51} $$

This relation will appear again shortly in what will seem at first to be a
totally different situation.

Equation (3-51) can be inconvenient because it contains three independent
variables, v, r, and θ. Note, however, that r = l sin θ. Consequently,
we can write Eq. (3-51) in the form

$$ \tan\theta \sin\theta = \frac{v^2}{gl} \tag{3-52} $$

which contains only two variables, θ and v.

A still more convenient expression of this relation can be written in
terms of the period T (the time required by the bob to complete one orbit)
rather than the speed of the bob v. Substituting v² = 4π²r²/T² [see Eq.
(3-45)] into Eq. (3-51), we obtain

$$ \tan\theta = \frac{4\pi^2 r}{gT^2} $$

or

$$ T = 2\pi\sqrt{\frac{r}{g\tan\theta}} $$




Again making the substitution r = l sin θ, we have

$$ T = 2\pi\sqrt{\frac{l}{g}\cos\theta} \tag{3-53} $$


This equation tells us that the period of the pendulum depends on the
angle θ. The dependence is weak, however, since T depends on the square
root of the cosine of θ, which does not change very rapidly as θ changes.
You will find it worthwhile to check on this behavior qualitatively with a
makeshift pendulum and a watch.
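The weak dependence of the period on the angle can also be seen by tabulating T = 2π√((l/g) cos θ) for several angles. The sketch below assumes a string length of 1.0 m and g = 9.8 m/s²; both values, and the function name, are illustrative choices rather than figures from the text.

```python
import math

def conical_pendulum_period(length, theta_deg, g=9.8):
    """Eq. (3-53): T = 2 pi sqrt((l / g) cos(theta)), theta in degrees."""
    theta = math.radians(theta_deg)
    return 2.0 * math.pi * math.sqrt(length * math.cos(theta) / g)

# Hypothetical 1.0 m string, swung at progressively wider cone angles.
for theta_deg in (5, 15, 30, 45, 60):
    T = conical_pendulum_period(1.0, theta_deg)
    print(f"theta = {theta_deg:2d} deg   T = {T:.3f} s")
```

Between 5° and 30° the period changes by only a few percent; the change becomes appreciable only at wide cone angles, where cos θ falls off quickly.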


The  design  of  highway  curves  may  seem  a far  cry  from  a child 
swinging  a stone  on  a string.  However,  there  is  a close  connection  between 
the  two,  as  far  as  the  analysis  of  the  motion  is  concerned. 

If  a car  is  driven  around  an  unbanked  curve  too  fast,  it  will  skid  out- 
ward. The  reason  for  this  is  easily  understood  in  terms  of  the  centripetal 
acceleration  present  when  the  car  (which  we  assume  to  have  a constant 
speed)  follows  a curved  path.  If  the  radius  of  the  curve  is  r,  there  is  a cen- 
tripetal acceleration  of  magnitude  v2/r. 

But  an  acceleration  implies  the  existence  of  a net  force  acting  in  the  direc- 
tion of  the  acceleration,  as  we  learned  in  Sec.  1-3.  What  can  supply  this  force 
to  the  car,  in  a direction  perpendicular  to  that  of  its  forward  motion?  It 
must  be  the  roadway,  which  exerts  an  inward  radial  force  on  the  tires  (in 
addition  to  the  vertical  force  it  always  exerts  on  the  tires  in  order  to  hold 
the  car  up).  If  the  car  is  going  too  fast  for  the  curve,  the  necessary  centripe- 
tal acceleration  will  require  an  inward  radial  force  in  excess  of  the  max- 
imum frictional  force  which  can  be  exerted  by  the  road  on  the  tires.  The 
car  then  skids.  That  is,  the  curve  it  follows  is  gentler  (has  a larger  radius) 
than  the  curve  of  the  road.  Clearly,  the  result  can  be  disastrous. 

Even  if  the  car  does  not  skid,  going  around  unbanked  curves  can  be  an 
uncomfortable  process  for  passengers  (and  a potentially  damaging  one  for 
cargo).  Just  as  the  roadway  must  exert  a force  on  the  car  to  accelerate  it  in- 
ward, the  car  must  apply  a force  to  the  passengers  in  order  to  accelerate 
them  inward,  mostly  through  the  seats  of  their  pants.  If  various  parts  of  a 
passenger’s  body  are  to  follow,  a similar  force  must  be  applied  to  them  by 
means  of  appropriate  muscle  tensions.  But  the  muscles  of  the  back  are  not 
very  well  adapted  to  pulling  sideways  on  the  shoulders  and  head.  There  is  a 
distinct  improvement  in  comfort  (to  say  nothing  of  safety  and  economy)  if 
the  direction  of  the  force  can  be  altered  to  lie  parallel  to  the  spine. 

The general method for producing this alteration is to tilt the road inward,
so that the road presses against the car in a nonvertical direction.
This inclination of the roadway, called banking, is shown in Fig. 3-39a.


We  could  analyze  the  situation  from  scratch,  but  it  is  possible  to  save 
repetitive  work  by  making  a simple  observation.  The  road  surface  serves 
the  same  purpose  here  as  does  the  string  of  the  conical  pendulum — it  pre- 
vents the  object  in  question  (in  this  case  the  car  and  its  contents)  from 
moving  in  the  vertical  direction.  Indeed,  let  us  perform  the  following 
operation  in  our  mind’s  eye:  From  a point  P directly  above  the  center  of 
curvature of the road, extend a cable downward to the body of the car, adjusting
the height of the cable at point P so that it is exactly perpendicular
to the properly banked road, as in Fig. 3-39b. The cable then makes an
angle θ with the vertical, just as the road does with the horizontal. Increase




Fig. 3-39 (a) An automobile on a
banked roadway. In the absence of the
support of the roadway, the automobile
would fall with vertical acceleration g. If
gravity were suddenly to disappear, the
pressure exerted by the roadway on the
automobile would result in an acceleration
of the automobile given by the
vector a_s, whose direction is in the plane
of the page and perpendicular to that of
the roadway. (b) In imagination, a cable
is attached to the automobile and supported
from a point P located above the
center of curvature of the roadway, at
such a height that the cable is perpendicular
to the roadway. The tension in
the cable is increased until it entirely supports
the automobile, and the roadway is
removed. The automobile thus becomes
the bob of a conical pendulum.


the tension in the cable until the road exerts zero force on the car; then remove
the road. This thought process converts the car into a conical pendulum,
and the applicable equation is Eq. (3-51),

$$ \tan\theta = \frac{v^2}{rg} \tag{3-54} $$

At this inclination angle θ, the passengers are tilted so that the forces exerted
on them by the seats of the car are parallel to their spines.
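The banking relation tan θ = v²/(rg) is easy to evaluate for a range of design speeds. The following is a minimal sketch; the function name, the 200 m curve radius, and the list of speeds are hypothetical values chosen only to show how steeply the ideal angle grows with speed.

```python
import math

def ideal_bank_angle_deg(v, r, g=9.80):
    """Eq. (3-54): theta = arctan(v^2 / (r g)), returned in degrees."""
    return math.degrees(math.atan2(v * v, r * g))

# Hypothetical curve of radius 200 m taken at several speeds (m/s).
for v in (10.0, 20.0, 30.0):
    theta = ideal_bank_angle_deg(v, r=200.0)
    print(f"v = {v:4.1f} m/s   theta = {theta:4.1f} deg")
```

Because the angle depends on v², a curve banked correctly for one speed is underbanked for any faster car, which must then rely on friction for the remaining inward force.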


EXAMPLE  3-11 

Find the magnitude of the acceleration a_s in Fig. 3-39a. What will be the magnitude
and direction of a_s if r = 100 m and v = 15.0 m/s (about 34 mi/h)?

■ Since a car moving on a properly banked road is equivalent to the bob of a conical
pendulum, the conditions governing the vector a_s in Fig. 3-39a are the same as
those governing the equivalent vector a_s for the conical pendulum, shown in Fig.
3-38b. That is, the horizontal and vertical components of a_s must be those given by
Eqs. (3-49) and (3-47),

$$ a_{sx} = \frac{v^2}{r} \quad\text{and}\quad a_{sy} = g $$

To find the magnitude of the vector a_s, you use the Pythagorean theorem:

$$ a_s = \left(a_{sx}^2 + a_{sy}^2\right)^{1/2} = \left(\frac{v^4}{r^2} + g^2\right)^{1/2} \tag{3-55} $$

For the numbers given, you have the magnitude

$$ a_s = \left[\frac{(15.0\ \text{m/s})^4}{(100\ \text{m})^2} + (9.80\ \text{m/s}^2)^2\right]^{1/2} = 10.1\ \text{m/s}^2 $$

or about 3 percent more than the ordinary acceleration of gravity.

You can solve Eq. (3-54) for θ and use the result to calculate the ideal banking
angle. You have

$$ \theta = \tan^{-1}\frac{v^2}{rg} $$

or

$$ \theta = \tan^{-1}\frac{(15.0\ \text{m/s})^2}{100\ \text{m} \times 9.80\ \text{m/s}^2} = 12.9^\circ $$

Thus the direction of a_s is about 13° from the vertical. This is a relatively steep angle
of bank under ordinary roadway conditions.

3-8 THE GALILEAN TRANSFORMATIONS

There  are  many  occasions  when  two  persons  who  are  moving  with  respect 
to  each  other  observe  the  same  phenomenon  and  then  use  the  laws  of  phys- 
ics to  analyze  their  observations.  In  order  to  compare  their  results,  the  two 
observers  must  be  able  to  describe  their  observations  in  mutually  intel- 
ligible terms.  The  rules  which  they  must  employ  in  doing  so  depend  on 
how  they  are  moving  with  respect  to  each  other.  In  this  section  we  consider 
the  simplest  and  most  important  case,  in  which  one  observer  moves  at  con- 
stant velocity  with  respect  to  the  other.  In  such  a case  these  rules  are  called 
the  Galilean  transformations. 

For  a specific  example,  let  observer  O be  stationed  at  the  side  of  a 
straight  road.  Observer  O'  is  in  a car  moving  at  constant  speed  along  the 
road.  The  phenomenon  they  both  observe  is  the  motion  of  a car  C,  which  is 
traveling  along  the  road  in  the  same  direction  as  O'  but  at  a higher  speed. 
Both  O'  and  C pass  O at  the  same  instant.  Figure  3-40a  shows  the  situation, 
as  seen  by  O.  She  measures  the  position  of  O'  with  respect  to  the  origin  of  her 
reference  frame  at  instants  separated  by  equal  time  intervals.  Since  the  situ- 
ation is  one-dimensional,  she  is  free  to  describe  the  results  by  using  either 
signed  scalar  quantities  or  vector  quantities.  She  chooses  signed  scalars. 
Thus she describes the position of O' by means of the coordinate X, taking
the values of X to be positive when the direction from her coordinate origin
to O' is the same as the direction in which O' is moving. She finds that the
displacement of O' during each of the equal time intervals Δt has the same
value ΔX. So she concludes that the velocity of O' with respect to herself has
the constant value V = ΔX/Δt.

While this is happening, O also observes car C. She finds that the displacement
of C has the same value Δx for each time interval Δt. Thus she
concludes that the velocity of C with respect to herself has the constant
value v = Δx/Δt.

What result does O' obtain when he observes the velocity of C with
respect to himself? Figure 3-41 depicts the motion of C, as seen by O'. He
describes the position of C in terms of its coordinate x' measured from his
coordinate origin, taking the positive direction to be the same as that
chosen by O. Using the same time interval Δt which O uses, he finds the displacement
of car C to be Δx' during each time interval. He therefore finds
the velocity of C with respect to himself to have the constant value v' =
Δx'/Δt.



Fig. 3-40 (a) Two cars, one containing observer O' and the other known as car C, move parallel
to the positive x axis of the reference frame of observer O. They pass O at the same instant. She
measures their positions at the ends of equal time intervals Δt after they pass her. She finds that
O' has a constant velocity V = ΔX/Δt and that C has a constant velocity v = Δx/Δt. (b) The
positions of O' and C at the end of the first time interval Δt, showing that Δx = ΔX + Δx' so that
Δx' = Δx - ΔX.




Fig. 3-41 Observer O' measures the positions of car C along his positive x' axis at the ends of
equal time intervals after it passes him. The time intervals Δt are the same as those used by O. He
finds that C has a constant velocity v' = Δx'/Δt.


Figure 3-40b shows how O can deduce from her own measurements
what value of v' will be measured by O'. She notes that

$$ \Delta x' = \Delta x - \Delta X $$

Dividing both sides of this equation by Δt, she has

$$ \frac{\Delta x'}{\Delta t} = \frac{\Delta x}{\Delta t} - \frac{\Delta X}{\Delta t} $$

Since all the velocities involved are constant, she can write the last equation
in the form

$$ v' = v - V $$

Thus she concludes that v' (the velocity of C as measured by O') is the difference
between v (the velocity of C as she measures it) and V (the velocity
of O' as she measures it). This relation is a special case of the Galilean velocity
transformation. She uses it to predict the value of v' and then communicates
the result to O'. He compares the prediction with the result of
his own direct measurement of v' and finds the prediction agrees with the
measurement.

We will now obtain general expressions for the Galilean transformations
from a consideration of Fig. 3-42. This figure shows the unprimed
reference frame whose origin is O. At an arbitrary fixed location in this
frame is an observer O (not shown). The figure also shows the primed
reference frame whose origin is O'. Fixed at an arbitrary location in this frame
is another observer O' (also not shown). (It is customary to denote the
origin of a frame of reference by the same symbol used for the observer fixed
to that frame, even though the observer is not necessarily stationed at the
origin.)

As long as two conditions are satisfied, observer O' will move at
constant velocity V with respect to observer O. The first of these conditions is
that the origin O' of the primed reference frame must move at constant
velocity V with respect to the origin O of the unprimed frame. The second
condition is that neither of the reference frames may rotate about its own
origin, as seen from the other. (If there were such rotation, the observers
could not be moving at constant velocity with respect to each other unless
they happened to be at the origins of their respective frames.) The figure is
drawn for a case in which the axes of the primed frame are parallel to the
corresponding axes of the unprimed frame. However, this is not essential
to the argument which follows. Indeed, it is not even necessary to define
specific axes. We assume, for convenience, that the origins of the two
reference frames coincide precisely at a certain instant, which we call t = 0. That
is, at the time t = 0 (but only then) any vector r which locates a body B with
respect to the origin O is identical with the vector r' which locates B with
respect to the origin O'.

Fig. 3-42 A pair of three-dimensional frames of reference moving
relative to each other at constant velocity. An observer in the
unprimed frame measures the velocity of the primed frame and
finds it to have the arbitrary but constant magnitude and direction
which are expressed by the velocity vector V (not shown). If the two
reference frames coincide at the instant t = 0, the position of the
origin O' is given relative to that of origin O at any time t by means of
the position vector Vt. At that time the observers in the two frames
simultaneously measure the position of body B relative to the origins
of their frames, expressing the results by the position vectors r and
r'. The figure shows that r = r' + Vt, so that r' = r − Vt.

At any other time t, the position of origin O' relative to origin O is
given by the vector Vt shown in the figure. This is so since O' is moving with
respect to O at the velocity V. At time t the observers stationed in their
reference frames simultaneously measure the position of body B. The
observer in the unprimed frame will represent that position by the vector r
extending from O to B, while the one in the primed frame will represent it
by the vector r' extending from O' to B. Inspection of the figure (which is
just a three-dimensional generalization of Fig. 3-40b) shows that the
relation between the two position vectors is

$$\mathbf{r}' = \mathbf{r} - \mathbf{V}t \tag{3-56}$$

The  comparison  of  position  made  possible  by  this  equation  is  called  the  Gal- 
ilean position  transformation,  a name  often  given  to  the  equation  itself. 

The rules for differentiating vector quantities are the same as those for
differentiating scalar quantities. We can differentiate each term in Eq.
(3-56) with respect to time to obtain

$$\frac{d\mathbf{r}'}{dt} = \frac{d\mathbf{r}}{dt} - \frac{d(\mathbf{V}t)}{dt}$$

But we have stipulated that the velocity V of O' with respect to O be a
constant. Hence we can use the rules for differentiation given by Eq. (2-13),
and by Eq. (2-16), to obtain

$$\frac{d\mathbf{r}'}{dt} = \frac{d\mathbf{r}}{dt} - \mathbf{V}\,\frac{dt}{dt}$$

and then

$$\frac{d\mathbf{r}'}{dt} = \frac{d\mathbf{r}}{dt} - \mathbf{V}$$

Employing the definitions v = dr/dt and v' = dr'/dt, we have

$$\mathbf{v}' = \mathbf{v} - \mathbf{V} \tag{3-57}$$

The  comparison  of  velocity  made  possible  by  this  equation  is  called  the  Gal- 
ilean velocity  transformation.  Stated  in  words,  it  says  the  velocity  of  a body 
with  respect  to  a primed  frame  is  the  vector  difference  between  its  velocity  with  respect 
to  an  unprimed  frame  and  the  constant  velocity  of  the  primed  frame  with  respect  to 
the  unprimed  frame. 
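
The position and velocity transformations of Eqs. (3-56) and (3-57) can be tried out numerically. The following is only a minimal sketch in Python: the function names, the tuple representation of vectors, and all numerical values are assumptions chosen for illustration and do not come from the text. It also checks that, because the two origins coincide at t = 0, the primed observer can locate the body from its t = 0 position and the transformed velocity v'.

```python
from typing import Tuple

Vector = Tuple[float, float, float]


def add(a: Vector, b: Vector) -> Vector:
    """Component-by-component vector sum."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def scale(a: Vector, s: float) -> Vector:
    """Multiply each component of a vector by the scalar s."""
    return (a[0] * s, a[1] * s, a[2] * s)


def galilean_position(r: Vector, V: Vector, t: float) -> Vector:
    """Eq. (3-56): r' = r - V t."""
    return add(r, scale(V, -t))


def galilean_velocity(v: Vector, V: Vector) -> Vector:
    """Eq. (3-57): v' = v - V."""
    return add(v, scale(V, -1.0))


# Illustrative values only (assumed, not taken from the text), in SI units.
V = (3.0, 0.0, 0.0)        # velocity of the primed frame relative to the unprimed frame
r_at_0 = (1.0, 2.0, 0.0)   # position of body B at t = 0 (the same in both frames)
v = (5.0, 1.0, 0.0)        # constant velocity of B in the unprimed frame
t = 2.0

r = add(r_at_0, scale(v, t))            # position of B in the unprimed frame at time t
r_primed = galilean_position(r, V, t)   # position of B in the primed frame at time t
v_primed = galilean_velocity(v, V)      # velocity of B in the primed frame

# Consistency check: the primed observer gets the same r' from v' directly.
r_primed_check = add(r_at_0, scale(v_primed, t))

print(r_primed, v_primed, r_primed_check)
```

Working with components in this way mirrors the component-by-component vector addition of Sec. 3-2.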

An important special consequence of Eq. (3-57) is that when v is
constant, then v' is also constant. That is, if a primed reference frame moves at
constant velocity with respect to an unprimed frame, then a body observed to move
relative to the unprimed frame at constant velocity will be observed to move relative to the
primed frame with a different, but still constant, velocity.

In general, the observer in the unprimed frame of reference will not
find the velocity v of a body B to be constant. Rather, the value of v will
depend on when it is measured. Thus the observer will find body B in general
to have a nonzero acceleration a. Similarly, the observer in the primed
frame will find from measurements, made simultaneously with those of the
observer in the unprimed frame, that body B has acceleration a'. The
connection between a and a' can be determined by differentiating Eq. (3-57)
with respect to time. This gives

$$\frac{d\mathbf{v}'}{dt} = \frac{d\mathbf{v}}{dt} - \frac{d\mathbf{V}}{dt}$$

Since V is constant, dV/dt = 0. Thus dv'/dt = dv/dt. Using the definitions
a = dv/dt and a' = dv'/dt, we obtain

$$\mathbf{a}' = \mathbf{a} \tag{3-58}$$


3-8  The  Galilean  Transformations  99 


In  words,  if  a primed  reference  frame  moves  at  constant  velocity  with  respect  to  an 
unprimed  frame,  then  the  acceleration  of  a body  will  be  observed  to  have  the  same 
value  with  respect  to  either  frame.  This  equality  of  accelerations  is  called  the 
Galilean  acceleration  transformation. 
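
The equality of accelerations can also be checked numerically. The sketch below is illustrative only: it assumes one-dimensional motion with constant acceleration, and the numerical values and function names are hypothetical. It samples the position of a body in both frames and estimates the acceleration in each from a second difference of the sampled positions.

```python
import math

# Illustrative one-dimensional values (assumed, not from the text), in SI units.
X0 = 0.5       # position of the body at t = 0
V0 = 1.0       # velocity of the body at t = 0 in the unprimed frame
A = 2.0        # constant acceleration of the body in the unprimed frame
V_FRAME = 3.0  # constant velocity of the primed frame relative to the unprimed frame


def position_unprimed(t: float) -> float:
    """Position x(t) seen by the observer in the unprimed frame."""
    return X0 + V0 * t + 0.5 * A * t * t


def position_primed(t: float) -> float:
    """Position x'(t) = x(t) - V t, the one-dimensional form of Eq. (3-56)."""
    return position_unprimed(t) - V_FRAME * t


def second_difference(f, t: float, h: float) -> float:
    """Estimate the second time derivative of f at t from three samples."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)


a_unprimed = second_difference(position_unprimed, 3.0, 0.5)
a_primed = second_difference(position_primed, 3.0, 0.5)

print(a_unprimed, a_primed)
print(math.isclose(a_unprimed, a_primed, rel_tol=1e-9))
```

The term −Vt is linear in t, so it drops out of the second difference just as dV/dt drops out of the derivation above.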

It  was  Einstein  who  first  named  the  transformations  given  by  Eqs. 
(3-56),  (3-57),  and  (3-58)  in  honor  of  Galileo.  They  have  an  intimate  con- 
nection with  Newton’s  laws  of  motion.  This  connection  is  discussed  in 
Chap.  4. 

Examples  3-12  and  3-13  illustrate  the  application  of  the  Galilean  trans- 
formation equations. 


EXAMPLE  3-12 


Observer O drops a stone from the top of a skyscraper. Observer O', riding in an
elevator, starts down from the top of the skyscraper at the instant when the stone is
dropped. The elevator accelerates very quickly to a downward velocity of
magnitude V = 5.0 m/s and then maintains that velocity steadily. At the time t = 3.0 s
after the stone is dropped, find the position, the velocity, and the acceleration of the
stone relative to O. Then find the position, the velocity, and the acceleration of the
stone relative to O'.

■ Since the problem is essentially one-dimensional, you can use the signed scalar
representation of one-dimensional vectors, as discussed in Sec. 3-2. You then
represent the vectors r, v, and a by the scalars x, v, and a, and use Eq. (2-30),

$$x = x_i + v_i t + \frac{at^2}{2}$$

to find the position x. Taking x = 0 at the top of the skyscraper and taking the
downward direction as the positive x direction, you have x_i = 0, v_i = 0,
a = +g = +9.8 m/s², and thus

$$x = 0 + 0 + \frac{9.8\ \text{m/s}^2 \times (3.0\ \text{s})^2}{2} = +44\ \text{m}$$

The positive value of x denotes the downward direction.

Using Eq. (2-29),

$$v = v_i + at$$

you find the velocity of the stone with respect to O to be

$$v = 0 + 9.8\ \text{m/s}^2 \times 3.0\ \text{s} = +29\ \text{m/s}$$

Again, the positive value indicates a downward direction.

The acceleration of a freely falling body, as seen by the observer O who is
stationary with respect to the earth, is known to be the constant gravitational
acceleration. (Indeed, this underlies the validity of the two calculations immediately above.)
You thus have

$$a = +g = +9.8\ \text{m/s}^2$$

Here the positive sign means the acceleration is downward.

You can now find x' by using Eq. (3-56) and the value of x just calculated.
Substituting the signed scalar V for the vector V, you have

$$x' = x - Vt$$

or

$$x' = 44\ \text{m} - 5.0\ \text{m/s} \times 3.0\ \text{s} = +29\ \text{m}$$




That is, the stone is located 29 m below observer O' at the end of 3.0 s.
Substituting signed scalars for vectors in Eq. (3-57), you have

$$v' = v - V$$

Inserting the numerical values gives you

$$v' = 29\ \text{m/s} - 5.0\ \text{m/s} = +24\ \text{m/s}$$

Thus O' sees the velocity of the stone to be 24 m/s downward.

From Eq. (3-58), a' = a, you have

$$a' = +g = +9.8\ \text{m/s}^2$$

Observer O' sees the stone to have the same downward acceleration as that seen
by O.
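
The arithmetic of Example 3-12 can be reproduced in a few lines of Python. This is only a sketch: the variable names are chosen here for illustration, and the downward direction is taken as positive, as in the example.

```python
# Data from Example 3-12 (downward is the positive x direction).
g = 9.8   # m/s^2, acceleration of the stone as seen by O
V = 5.0   # m/s, velocity of the elevator (observer O') as seen by O
t = 3.0   # s, time after the stone is dropped

# Quantities measured by O, from Eqs. (2-30) and (2-29) with x_i = 0 and v_i = 0.
x = 0.0 + 0.0 * t + 0.5 * g * t**2
v = 0.0 + g * t
a = g

# Quantities measured by O', from Eqs. (3-56), (3-57), and (3-58).
x_primed = x - V * t
v_primed = v - V
a_primed = a

print(f"relative to O:  x  = {x:.0f} m, v  = {v:.0f} m/s, a  = {a:.1f} m/s^2")
print(f"relative to O': x' = {x_primed:.0f} m, v' = {v_primed:.0f} m/s, a' = {a_primed:.1f} m/s^2")
```

Carrying unrounded intermediate values gives x' = 29.1 m and v' = 24.4 m/s, which round to the two-figure results quoted in the example.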


EXAMPLE  3-13 


Fig. 3-43 (Axes: x eastward, y northward.) At what heading and airspeed v',
given by the pilot’s instruments, should an airplane be flown
through a wind having velocity V if it is
to make good a course and groundspeed
given by v?


In order to reach his destination on schedule, an airline pilot wishes to fly over the
ground at a “ground speed” of 250 m/s along a “course made good” which is in a
northeasterly direction (in other words, 45.0° north of east). The weather bureau
tells him that the wind is blowing due eastward with a speed of 30 m/s. Find the
required “airspeed” of the airplane (its speed relative to the air through which it flies,
read by the pilot on his airspeed indicator) and the required “heading” (the
geographical direction in which the nose of the plane should be pointed, read by the
pilot on his compass), if the plane is to reach the right place at the right time.

■ Although it is possible to think of the pilot as both the moving observer and the
observed moving body, you may find it easier to analyze the problem if you make a
separation as follows. Imagine an observer O' in a balloon which floats along with
the wind. His velocity with respect to an observer O on the ground is V = 30 m/s
(eastward). Since the airplane is also carried along by the wind as it moves through
the air, O' will observe the proper airspeed and heading which comprise the velocity
v'. Can you give a more detailed explanation of why this is true? At the same time O
observes the desired ground speed and course made good, v = 250 m/s
(northeastward). You should make a sketch like that of Fig. 3-43 to depict the relationship
among the velocities v, v', and V. The resulting vector diagram is the pictorial
equivalent of Eq. (3-57), v' = v − V. If you draw the diagram carefully to scale, you
can measure the desired value of v' with a ruler and a protractor.

However, it is usually both easier and more accurate to obtain the desired result
by algebraic calculation. As you saw in Sec. 3-2, the sum of the vectors v and −V
required to solve Eq. (3-57) for v' can be found by adding the vectors component by
component. Since the problem is essentially two-dimensional, you need only the x
and y components, and Eq. (3-57) can be written in the equivalent component form
as

$$v'_x = v_x - V_x \tag{3-59a}$$

and

$$v'_y = v_y - V_y \tag{3-59b}$$

To save some work, take the positive x direction to be eastward, that is, the direction
in which O sees O' to be moving. The components of V then become

$$V_x = V = 30\ \text{m/s} \qquad \text{and} \qquad V_y = 0$$

Now calculate the x and y components of the desired ground velocity v. You
have

$$v_x = v \cos 45.0^\circ = 250\ \text{m/s} \times \cos 45.0^\circ = 177\ \text{m/s}$$

and

$$v_y = v \sin 45.0^\circ = 250\ \text{m/s} \times \sin 45.0^\circ = 177\ \text{m/s}$$




Inserting the numerical values of v_x and V_x into Eq. (3-59a), you obtain

$$v'_x = 177\ \text{m/s} - 30\ \text{m/s} = 147\ \text{m/s}$$

And the numerical values of v_y and V_y inserted into Eq. (3-59b) give you

$$v'_y = 177\ \text{m/s} - 0 = 177\ \text{m/s}$$


You can now calculate the magnitude and direction of the velocity v' of the
plane as seen by O' in the balloon (and also of the plane with respect to the air as
measured by the pilot’s instruments). You have for the magnitude

$$v' = [(v'_x)^2 + (v'_y)^2]^{1/2} = [(147\ \text{m/s})^2 + (177\ \text{m/s})^2]^{1/2} = 230\ \text{m/s}$$

For the direction you have

$$\theta' = \tan^{-1}\frac{v'_y}{v'_x} = \tan^{-1}\frac{177\ \text{m/s}}{147\ \text{m/s}} = 50.3^\circ$$


Since the positive x direction has been taken to be eastward, this angle signifies a
direction 50.3° north of east, and the pilot should fly the plane with a velocity

$$\mathbf{v}' = v'\,\hat{\mathbf{v}}' = (230\ \text{m/s})(50.3^\circ\ \text{north of east})$$

through the air.
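
The component calculation of Example 3-13 can be sketched in Python as follows. The variable names are chosen here for illustration; x is taken eastward and y northward, as in the example, and the standard-library functions `math.hypot` and `math.atan2` are used for the magnitude and direction.

```python
import math

# Data from Example 3-13 (x eastward, y northward).
ground_speed = 250.0   # m/s, magnitude of v
course_deg = 45.0      # degrees north of east, direction of v
wind = (30.0, 0.0)     # m/s, components of V (wind blowing due eastward)

# Components of the desired ground velocity v.
v_x = ground_speed * math.cos(math.radians(course_deg))
v_y = ground_speed * math.sin(math.radians(course_deg))

# Eqs. (3-59a) and (3-59b): components of the velocity v' relative to the air.
v_x_primed = v_x - wind[0]
v_y_primed = v_y - wind[1]

# Magnitude (airspeed) and direction (heading, in degrees north of east) of v'.
airspeed = math.hypot(v_x_primed, v_y_primed)
heading_deg = math.degrees(math.atan2(v_y_primed, v_x_primed))

print(f"airspeed = {airspeed:.0f} m/s")
print(f"heading  = {heading_deg:.1f} degrees north of east")
```

Using `atan2` rather than a plain arctangent keeps the heading in the correct quadrant if the wind or course data are changed.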



EXERCISES 

Group  A 

3-1.  Determining  g.  A simple  laboratory  apparatus  for 
determining  g is  shown  in  Fig.  3E-1.  A ball  is  traveling 
horizontally  with  a known  speed  as  it  leaves  the  end  of  a 
curved  incline  at  the  edge  of  a table.  The  table  height  AB 
is  measured,  and  so  is  the  distance  BC  from  the  base  of  the 
table  to  point  C,  where  the  ball  strikes  the  floor. 


a.  If  the  ball’s  speed  as  it  passes  point  A is  3.0  m/s 
and  the  distance  BC  is  found  to  be  1.5  m,  how  long  was 
the  ball  in  flight? 

b. Let the table height AB be 1.0 m. Use the information
in part a to find the value of g.

3-2.  Marking  the  spot.  An  aerial  search  party  locates 
the  spot  where  a ship  has  sunk.  The  fliers  plan  to  mark  the 
spot  by  dropping  a buoy.  They  are  flying  horizontally  at 
speed  v and  altitude  h. 

a.  Neglecting  air  resistance,  how  far  will  the  buoy 
travel  horizontally  before  splash-down? 

b. As the plane makes a direct approach over the site,
the crew member in charge of releasing the buoy monitors
the line of sight to the “target.” What angle should the
line of sight make with the vertical at the proper time to
drop the buoy?

c.  Evaluate  your  results  for  v = 150  m/s  and  h = 
3000  m. 

d.  Where  will  the  plane  be  when  the  buoy  strikes 
the  water? 

3-3.  A game  of  catch.  Two  youngsters  are  playing 
catch.  They  stand  2.0  m apart.  On  each  throw,  the  ball 
rises  and  falls  1.5  m. 

a.  What  is  the  time  of  rise?  The  time  of  fall?  The  time 
of  flight? 

b.  What  is  the  horizontal  component  of  the  initial 
velocity?  The  vertical  component? 

c.  What  angle  does  the  initial  velocity  make  with  the 
horizontal?  What  is  the  initial  speed  of  the  ball? 

d.  What  is  the  speed  of  the  ball  when  it  is  caught? 

3-4.  Water  fun.  Three  tilted  hose  nozzles  A,  B,  and  C 
are  fixed  on  the  ground.  The  nozzles  are  inclined  to  the 
horizontal  at  30°,  45°,  and  60°,  respectively.  Streams  of 
water  issue  from  the  three  hoses  at  identical  speeds. 

a.  What  is  the  ratio  of  the  maximum  heights  of  rise? 

b.  What  is  the  ratio  of  the  ranges? 

3-5.  Out  in  left  field.  A baseball  outfielder  has  a 
throwing  range  of  80  m when  he  throws  the  ball  at  30° 
above  the  horizontal.  Assuming  the  same  initial  speed, 
how  high  would  the  ball  rise  if  the  player  threw  it  verti- 
cally upward? 




3-6.  Grasshopping.  A typical  adult  grasshopper  has  a 
jumping  range  of  0.75  m.  Assume  that  the  launch  angle  is 
45°. 

a.  What  is  the  horizontal  component  of  the  grasshop- 
per’s velocity? 

b. How long is the grasshopper in flight? Neglect air
resistance and aerodynamic lift.

3-7. Pigskin’s progress. A football player is carrying the
ball at a speed of 7.5 m/s. His path makes an angle of 35°
with the sidelines. At what rate is he approaching the goal
line?

3-8.  Add  ’em  up.  Figure  3E-8  shows  six  vectors  of  the 
indicated  magnitudes,  each  at  an  angle  of  60°  with  the  ad- 
jacent vectors.  What  is  the  magnitude  of  the  resultant 
vector?  What  is  its  direction  with  respect  to  vector  A? 


Fig.  3E-8 


3-9. Graphical representation of vectors.

a.  Construct  a pair  of  mutually  perpendicular  coordi- 
nate axes.  Draw  a vector,  A,  from  the  origin  to  the  point 
(3,  4).  Draw  a vector,  B,  from  the  origin  to  the  point  (4,  3). 

b. Are the vectors A and B equal?

c. Are the magnitudes A and B equal?

3-10. Vector addition. Figure 3E-10 shows a square
whose sides are each 1 unit in length. Consider each side
to be a vector, as indicated. Keeping in mind that a vector
can be moved parallel to itself without changing it, find
the magnitudes and directions of the following vectors.

Fig. 3E-10

a.  A + B 

b.  A + C 

c.  A + D 

d.  A - B 

e.  A - C 

f . A - D 


g.  A + B+  C + D 

h.  A + B-  C-  D 

i.  A + 2B 

3-11.  Vector  practice.  Vectors  A and  B lie  in  the  xy 
plane.  As  is  customary,  the  positive  y direction  makes  an 
angle  of  +90°  with  the  positive  x direction.  Vector  A is 
5.0  cm  long  and  makes  an  angle  of  +60°  with  the  x axis. 
Vector  B is  5.0  cm  long  and  makes  an  angle  of  —60°  with 
the  x axis. 

a.  Construct  a diagram  in  which  A and  B are  repre- 
sented. 

b.  Find  the  magnitude  and  direction  of  A + B; 

A - B;  B - A. 

3-12.  Net  displacement.  The  following  five  vectors  rep- 
resent displacements  on  the  earth’s  surface: 

(1)  1 m north 

(2)  2 m 30°  east  of  north 

(3)  3 m 60°  east  of  north 

(4)  4 m east 

(5)  5 m southeast 

a.  Construct  an  xy  diagram  in  which  the  positive  x 
direction  represents  east  and  the  positive  y direction  rep- 
resents north.  Choose  an  appropriate  scale  and  carefully 
represent  each  vector.  Position  the  five  vectors  in  such  a 
way  that  you  can  immediately  find  their  vector  sum  from 
the  diagram.  Use  a ruler  and  protractor  to  determine  the 
magnitude  and  direction  of  their  vector  sum. 

b.  Using  the  x and  y axes  defined  in  part  a,  find  the  x 
and  y components  of  each  of  the  five  given  vectors.  Use 
these  to  find  the  components  of  their  vector  sum. 

c.  Verify  that  the  results  of  parts  a and  b are  consist- 
ent. 


3-13.  Unscheduled  stop.  On  a calm  day,  an  airplane 
flying  at  400  km/h  flies  east  for  1 h,  then  south  fori  h.  It 
then  flies  toward  its  starting  point  for  i h before  making  a 
forced  landing.  How  far  and  in  what  direction  from  its 
starting  point  does  the  plane  land? 


3-14.  Trading  far  for  high.  Sally’s  maximum  range  in 
throwing  a baseball  is  Rmax. 

a. Show that if she throws the baseball vertically upward
with the same initial speed, it will attain a height
equal to ½Rmax.

b.  How  does  the  height  found  in  part  a compare  with 
the  height  attained  when  Sally  is  throwing  for  maximum 
range? 


3-15. How much string? A hose nozzle N is strapped at
point A to a fixed rod, ABCD. See Fig. 3E-15. Strings are
tied to the rod at equal distances so that AB = BC = CD.
The lengths of the strings are adjusted so that the lower
ends just touch the curved stream of water issuing from
the nozzle. If the length of the string at B is 5.0 cm, what
must be the string lengths at C and D?

Fig. 3E-15

3-16.  Equalizing  height  and  range.  What  is  the  angle  of 
elevation  of  a launcher  which  throws  a projectile  to  a 
height  equal  to  the  range? 

3-17.  Interpreting  a trajectory  diagram.  Figure  3E-17 
shows  a portion  of  the  path  of  a body  moving  in  the  xy 
plane.  If  the  body  moves  with  constant  speed,  at  which 
point  along  its  trajectory  is  the  magnitude  of  its  accelera- 
tion the  greatest?  Explain  your  choice. 

Fig. 3E-17


3-18.  High-speed  centrifuges.  High-speed  centrifuges 
have  been  successfully  operated  at  60,000  revolutions  per 
minute. 

a. If the radius of the centrifuge is 20 cm, what is the
magnitude of the acceleration at the circumference?
Express your answer in meters per second per second.

b.  What  is  the  ratio  of  this  value  to  g,  the  acceleration 
due  to  gravity? 

3-19.  Uniform  circular  motion.  A body  moves  at  con- 
stant speed  in  a circular  path  whose  circumference  is 
60  m.  It  completes  one  revolution  every  12  s. 

a.  What  is  the  body’s  speed? 

b.  What  is  its  average  velocity  over  one  complete  rev- 
olution? 

c.  At  any  given  instant,  what  is  the  magnitude  of  the 
body’s  acceleration?  What  is  the  direction  of  its  accelera- 
tion? 


3-20.  Windy  day.  A steamer  is  sailing  west  at  25  knots 
(12.9  m/s).  A steady  wind  is  blowing  over  the  ocean 
from  the  south  at  10  knots  (5.1  m/s).  What  wind  speed  and 
wind  direction  are  indicated  by  an  anemometer  and  a 
wind  vane  mounted  on  the  ship? 

3-21.  Airspeed  versus  ground  speed.  An  airplane  is 
headed  20°  east  of  north  at  an  airspeed  of  200  km/h.  A 
wind  is  blowing  from  the  east  at  50  km/h. 

a.  What  is  the  ground  speed  of  the  plane? 

b.  What  is  the  plane’s  course  made  good?  (That  is,  in 
what  direction  is  the  plane  moving  relative  to  the 
ground?) 

Group  B 

3-22. Muzzle velocity. The term “muzzle velocity”
frequently is used for the speed at which a projectile leaves a
gun. In order to determine the muzzle velocity of a
particular type of bullet fired from a rifle, the rifle is mounted
and carefully leveled. Then it is fired at a target at a
known distance L. The vertical distance d from the aim
point to the actual impact point is measured.

a.  Derive  an  expression  for  the  muzzle  velocity  v in 
terms  of  L,  d,  and  the  acceleration  due  to  gravity  g. 

b. For a target distance L = 3.00 × 10² m, the
measured drop is d = 1.30 m. What is the muzzle velocity?

c.  If  the  rifle  is  fired  horizontally  from  a height  of 
1.70  m,  where  does  the  bullet  strike  the  ground  if  it  misses 
the  target? 

3-23.  Raindrops.  During  a rainstorm,  raindrops  are 
observed  to  be  striking  the  ground  at  an  angle  of  35°  with 
the  vertical.  The  wind  speed  is  4.5  m/s  (10  mi/h).  Assum- 
ing that  the  horizontal  velocity  component  of  the  rain- 
drops is  the  same  as  the  speed  of  the  air,  what  is  the  ver- 
tical velocity  component  of  the  raindrops?  What  is  their 
speed? (As is discussed in Chap. 4, air resistance operates
in such a way that the raindrops do not accelerate, but fall
at a constant speed called the terminal speed.)

3-24.  How  far  away  and  how  high?  An  explorer  on  a 
plain  carefully  sights  on  a distant  mountain.  He  finds  that 
the  mountain  is  located  20°  east  of  north.  After  traveling 
10  km  due  northward,  he  makes  new  measurements.  He 
finds  that  the  mountain  now  lies  25°  east  of  north. 

a.  What  is  the  distance  from  the  second  sighting 
point  to  the  mountain? 

b.  As  viewed  from  the  second  sighting  point,  the 
peak  of  the  mountain  is  elevated  8.0°  above  the  horizon. 
How  high  above  the  plain  does  the  mountain  extend? 

3-25. An acute case of constituent vectors. A vector can be
resolved into two constituent vectors not necessarily at
right angles. It is useful to do this for the conical
pendulum treated in the text (Sec. 3-7). Resolve the vector g
into constituents along the string and in the horizontal
direction. Then obtain Eq. (3-51), tan θ = v²/rg.




3-26. What's up, Doc? The motion picture comedy
What’s  Up,  Doc?  features  a wild  automobile  chase  which 
ends  with  several  cars  hurtling  into  San  Francisco  Bay 
from  the  end  of  a dock.  Photographically  speeding  up  the 
action  is  a common  technique  in  slapstick  cinema,  and 
anyone  who  watches  the  film  must  hope  (for  the  sake  of 
the  stunt  drivers)  that  the  action  did  not  really  happen 
that  fast.  The  cars  appear  to  be  traveling  at  least  22  m/s 
(50  mi/h)  as  they  leave  the  dock. 

Based  on  several  viewings  of  the  sequence,  reasonable 
estimates  for  three  properties  of  the  trajectories  are: 

Vertical  drop:  6 m 

Horizontal  distance  traveled  before  splash-down: 
14  m 

Angle  between  velocity  vector  and  water  surface  at 
splash-down:  35° 

a.  Use  the  vertical  drop  and  horizontal  distance  esti- 
mates to  calculate  the  initial  speed  of  the  automobiles. 
Express  your  result  in  meters  per  second. 

b.  Use  the  vertical  drop  and  impact  angle  estimates  to 
calculate  the  initial  speed. 

c.  By  how  much  do  the  results  of  parts  a and  b differ? 
By  how  much  does  the  average  of  your  two  calculated 
speeds  differ  from  22  m/s? 

d.  Suppose  an  object  in  free  fall  were  photographed 
and  then  the  action  were  speeded  up  greatly.  In  what  way 
would  the  film  sequence  appear  “unnatural”? 

(Note:  The  data  given  above  are  estimates  based  on  a 
visual  recollection.  The  authors  have  no  documentary  evi- 
dence that  the  film  employed  speeded-up  sequences.) 

3-27. All the way or by relay? A baseball outfielder
wishes to get the ball from his position to home plate as
soon as possible. An infielder is ready to act as relay man if
the outfielder decides not to throw the ball all the way to
the plate. Both the relay man and the outfielder have the
same maximum throwing speed v0. The relay man
requires a time interval Δt to catch the ball, turn, and throw
it again. The field is soggy, so using bounces is not a
sensible strategy.

a. If the outfielder is at a distance Rmax so far from
home plate that he can just barely get the ball there on the
fly, how long will the ball be in flight if he throws it all the
way?

b.  If  the  outfielder  throws  instead  to  the  relay  man 
who  is  standing  halfway  to  home  plate,  how  long  will  it 
take  the  ball  to  arrive  at  home? 

c.  Evaluate  your  results  for  v0  = 35  m/s  and  At  = 
0.5  s.  Which  method  is  quicker? 

3-28.  Snow  fun.  Hugh  and  Lou  are  having  a snowball 
fight.  They  are  standing  40  m apart,  and  Hugh  decides  to 
throw  two  snowballs  at  the  same  initial  speed  of  30  m/s, 
but  at  different  times  and  elevation  angles,  so  that  they 
will  hit  Lou  simultaneously. 

a.  What  are  the  two  elevation  angles  that  Hugh  must 

use? 


b.  How  long  after  the  first  snowball  is  thrown  must 
Hugh  throw  the  second  one?  How  long  after  that  will  both 
snowballs  land? 

3-29.  Effective  gravity  on  a rotating  earth.  At  the  equa- 
tor, the  effective  value  of  g is  smaller  than  at  the  poles. 
One  reason  for  this  is  the  centripetal  acceleration  due  to 
the  earth's  rotation.  The  magnitude  of  the  centripetal 
acceleration  must  be  subtracted  from  the  magnitude  of 
the  acceleration  due  purely  to  gravity  in  order  to  obtain 
the  effective  value  of  g. 

a.  Calculate  the  fractional  diminution  of  g at  the 
equator  as  a result  of  the  earth’s  rotation.  Express  your  re- 
sult as  a percentage. 

b.  How  short  would  the  earth’s  period  of  rotation 
have  to  be  in  order  for  objects  at  the  equator  to  be 
“weightless”  (that  is,  in  order  for  the  effective  value  of  g to 
be  zero)? 

c.  How  would  the  period  found  in  part  b compare 
with  that  of  a satellite  skimming  the  surface  of  an  airless 
earth? 

3-30. Flying in circles. As indicated in Fig. 3E-30, a
plane flying at constant speed is banked at angle θ in order
to fly in a horizontal circle of radius r. Its motion can be
analyzed by analogy with the conical pendulum of the
text. The aerodynamic lift force acts generally upward at
right angles to the plane’s wings and fuselage. This lift
force corresponds to the tension provided by the string in
the conical pendulum.

a. Obtain the equation for the required banking
angle θ in terms of v, r, and g.

b. What is the required angle for v = 60 m/s
(216 km/h) and r = 1.0 km?

c.  When  a plane  that  has  been  flying  straight  enters  a 
turn,  it  is  necessary  to  increase  the  engine  power  to  main- 
tain constant  speed  and  altitude.  Can  you  explain  why? 

3-31.  Row,  row,  row.  A rowboat  is  pointing  perpendic- 
ularly to  the  bank  of  a river.  The  rower  can  propel  the 
boat  with  a speed  of  3.0  m/s  with  respect  to  the  water.  The 
river  has  a current  of  4.0  m/s. 

a.  Construct  a diagram  in  which  the  two  velocities  are 
represented  as  vectors. 

b.  Find  the  vector  which  represents  the  boat’s  velocity 
with  respect  to  the  shore. 

c.  At  what  angle  is  this  vector  inclined  to  the  direc- 
tion in  which  the  boat  is  pointing?  What  is  the  boat’s  speed 
with  respect  to  the  launch  point? 




Fig.  3E-35 


d.  If  the  river  is  100  m wide,  how  far  downstream  of 
the  launch  point  is  the  rowboat  when  it  reaches  the  oppo- 
site bank? 

3-32. The open road. A truck is traveling due north
and descending a 10 percent grade (angle of slope =
tan⁻¹ 0.10 = 5.7°) at a constant speed of 90 km/h. At the
base of the hill there is a gentle curve, and beyond that the
road is level and heads 30° east of north. A southbound
police car with a radar unit is traveling at 80 km/h along
the level road at the base of the hill, approaching the
truck. What is the velocity vector of the truck with respect
to the police car?

Group  C 

3-33. Out of the park. Strongarm Sam, a baseball
player, decides to throw a ball out of the stadium. The stadium
seats slope upward at angle α, as shown in Fig.
3E-33.


a. Find the horizontal distance R_x a baseball will
travel before landing in the seats if the ball is thrown with
speed v_0 at an elevation angle θ above the horizontal.

b. Assuming that Sam’s maximum throwing speed is
independent of direction, what elevation angle θ_max
should he use to maximize R_x?

c. What is the value of θ_max if α = 30°?

d. Suppose the stands are 100 m deep and 58 m high
(corresponding to α = 30°). What minimum throwing
speed is needed to get the ball out of the park? Note: A
speed of 40 m/s corresponds to a very respectable major
league fastball.

3-34. Throwing down a mountain. A mountain climber
stands on top of a peak whose straight sides slope down at
an angle β with the horizontal. The climber wishes to
throw a rock as far as possible down the slope. Adapt the
result of Exercise 3-33a to determine the elevation angle
for maximum range.

3-35. Music on the subway. A subway rider is playing a
record on a battery-powered record player. The record
has radius r and rim speed v_0. The train is traveling due
north on level tracks at speed u_0. Use the coordinate directions
indicated in Fig. 3E-35, where the positive x direction
is east and the positive y direction is north. As shown
in the figure, the position of any point on the rim of the
record can be specified by the angle θ, whose vertex is at
the center of the record. The angle θ is measured clockwise
from north to the rim point. (Records turn clockwise as
they are played.)

a. What is the velocity vector w of some point on the
rim of the record with respect to the tracks? Express w in
terms of u_0, v_0, and the angle θ.

b. Evaluate your result for a 45-rpm record if u_0 =
5.0 m/s (18 km/h). The radius of a 45-rpm record is
0.087 m. What is the maximum speed of any point along
the rim, with respect to the tracks? What is the maximum
angle between w and the positive y direction (due north)?
For what value of θ does this angle occur?

3-36.  Swimming  across.  A swimmer  can  swim  at  a 
speed  of  0.70  m/s  with  respect  to  the  water.  She  wants  to 
cross  a river  which  is  50  m wide  and  has  a current  of 
0.50  m/s. 

a.  If  she  wishes  to  land  on  the  other  bank  at  a point 
directly  across  the  river  from  her  starting  point,  in  what 
direction will she need to swim? How rapidly will she increase
her distance from the near bank? How long will it
take  her  to  cross? 

b. If, rather, she decides to cross in the shortest possible
time, in what direction must she swim? How rapidly
will  she  increase  her  distance  from  the  near  bank?  How 
long  will  it  take  her  to  cross?  How  far  downstream  will  she 
be  when  she  lands? 

Numerical 

3-37. Parabolic trajectory I. Use the trajectory plotting
program in the Numerical Calculation Supplement to plot
the trajectory of a projectile having an initial velocity of
magnitude v_i = 70.7 m/s (this is 100 m/s divided by √2)
for an elevation angle of θ = 45°. Compare the range of
the projectile with the range obtained in Example 3-7 for
v_i = 100 m/s and θ = 45°.

3-38. Parabolic trajectory II. Using v_i = 100 m/s and
several values of θ in the neighborhood of 45°, show that
the maximum range of the projectile considered in Example
3-7 occurs when θ = 45°. Do this by running the
trajectory plotting program in the Numerical Calculation
Supplement.

3-39. Parabolic trajectory III. When the x and y values
provided by the trajectory plotting program in the Numerical
Calculation Supplement are plotted, the graph resembles
a strobe photo of a projectile. It is not very difficult
to identify the reason for the resemblance: The program
evaluates y for uniformly spaced values of x. Since
the x component of the projectile’s velocity is constant, the
projectile’s x coordinate increases uniformly with time.
Thus the plotted x and y values are those occurring at
equal intervals of time, just as in a strobe photo.

a. Run the program to obtain a set of points showing
the positions of the projectile on a trajectory for v_i =
100 m/s and θ = 70°, with successive values of x taken to
be 50 m apart. Construct a plot of these points, but do not
connect the points with a smooth curve: that obscures the
points, as can be seen in Fig. 3-29.

b. Making use of the fact that the speed is proportional
to the separation between adjacent points on the
plot, determine the relative values of the speed of the projectile
in various parts of its trajectory. Explain why the
speed behaves as it does.

c. Explain why it is not possible to use this procedure
to compare the speeds of projectiles in two different trajectories.

3-40.  A strobed  trajectory  program. 

a. Overcome the limitation described in Exercise
3-39c. That is, write a calculator or computer program
that will produce a strobe-photo-like set of points on
any projectile trajectory in such a way that it is possible to
compare the speeds of projectiles on different trajectories.
Do this by having the program use Eqs. (3-4) and (3-5) to
evaluate x and y at the same set of uniformly separated values
of t for all trajectories.

b. Use your program to obtain sets of points on the
trajectories for v_i = 100 m/s, θ = 35° and v_i = 100 m/s,
θ = 55°, with successive values of t taken 1.0 s apart. Compare
the speeds of the projectile in the two different trajectories.

c.  Compare  the  total  times  of  flight  of  the  projectile 
in  the  two  different  trajectories,  by  simply  counting  the 
number  of  points  on  each  trajectory.  Explain  why  the 
times  of  flight  are  different  although  the  ranges  are  the 
same. 
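
One possible starting point for Exercise 3-40 is sketched below in Python. It is an illustration under stated assumptions, not the program of the Numerical Calculation Supplement. It assumes that Eqs. (3-4) and (3-5) are the usual constant-acceleration expressions x = v_i cos θ t and y = v_i sin θ t − gt²/2, that the launch point is the origin on level ground, and that g = 9.80 m/s². The function names are hypothetical.

```python
import math

G = 9.80  # m/s^2; assumed value of the gravitational acceleration


def strobed_trajectory(v_i, theta_deg, dt=1.0, g=G):
    """Return (t, x, y) points at equal time steps while the projectile is aloft.

    Assumes launch from the origin over level ground, with
    x = v_i cos(theta) t and y = v_i sin(theta) t - g t^2 / 2.
    """
    theta = math.radians(theta_deg)
    vx = v_i * math.cos(theta)
    vy = v_i * math.sin(theta)
    points = []
    n = 0
    while True:
        t = n * dt
        y = vy * t - 0.5 * g * t * t
        if y < 0:          # the projectile has landed; stop "flashing"
            break
        points.append((t, vx * t, y))
        n += 1
    return points


def interval_speeds(points, dt=1.0):
    """Average speed over each flash interval: point separation divided by dt."""
    return [
        math.hypot(x2 - x1, y2 - y1) / dt
        for (_, x1, y1), (_, x2, y2) in zip(points, points[1:])
    ]


low = strobed_trajectory(100.0, 35.0)
high = strobed_trajectory(100.0, 55.0)
print("points on 35 deg trajectory:", len(low))
print("points on 55 deg trajectory:", len(high))
print("interval speeds, 35 deg:", [round(s, 1) for s in interval_speeds(low)])
```

Because every trajectory is sampled at the same values of t, the spacing between neighboring points measures speed on a common scale, which is what part a asks for.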

3-41.  Adding  vectors  by  machine. 

a. Write a program that instructs a calculator or computer
to evaluate the sum of any number of vectors and to
present the result in component form and as a magnitude
and direction. Use the algebraic procedure of Example
3-4b. Test the program by repeating the calculation done
there.

b. Use your program to evaluate the sum of the following
set of vectors: (1) r_1 = 1.57 m, φ_1 = 28.3°; (2) r_2 =
6.03 m, φ_2 = 258.6°; (3) r_3 = 4.67 m, φ_3 = −105.6°; (4)
r_4 = 3.71 m, φ_4 = 96.2°.
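
A minimal Python sketch of such a program follows. It assumes that the algebraic procedure of Example 3-4b is the usual one: resolve each vector into x and y components, add the components, and convert the sums back to a magnitude and a direction. It also assumes the angles are measured counterclockwise from the positive x axis. The function name is hypothetical.

```python
import math


def add_polar_vectors(vectors):
    """Sum 2-D vectors given as (magnitude, angle in degrees).

    Returns (x component, y component, magnitude, direction in degrees),
    with the direction reduced to the range 0 to 360 degrees.
    """
    rx = sum(r * math.cos(math.radians(phi)) for r, phi in vectors)
    ry = sum(r * math.sin(math.radians(phi)) for r, phi in vectors)
    magnitude = math.hypot(rx, ry)
    # atan2 picks the correct quadrant from the signs of both components.
    direction = math.degrees(math.atan2(ry, rx)) % 360.0
    return rx, ry, magnitude, direction


# The four vectors listed in part b, as (r in meters, angle in degrees).
vectors_b = [(1.57, 28.3), (6.03, 258.6), (4.67, -105.6), (3.71, 96.2)]
rx, ry, mag, ang = add_polar_vectors(vectors_b)
print(f"components: ({rx:.2f} m, {ry:.2f} m)")
print(f"magnitude {mag:.2f} m at {ang:.1f} degrees")
```

Using atan2 rather than a plain arctangent of the ratio avoids the quadrant ambiguity that arises when the x component is negative.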




4 

Newton's 
Laws  of  Motion 


4-1 NEWTON’S FIRST LAW AND INERTIAL REFERENCE FRAMES

In Chap. 1 we began our discussion of newtonian mechanics with the basic
question: What is the relation between force and motion? The experiments
this question led us to analyze are summarized in Fig. 4-1, which is a reproduction
of Fig. 1-7. From the analysis we concluded that an answer is: The
net force acting on an object is related to its acceleration. But since we had
not even defined acceleration precisely in Chap. 1, the relation had to remain
qualitative.

Our  task  in  this  chapter  is  to  obtain  a quantitative  relation  between 
force  and  motion.  In  Sec.  4-2  we  go  through  preliminary  considerations 
that  use  little  more  than  an  intuitive  understanding  of  the  crucial  quantities 
force  and  mass.  The  relation  among  force,  mass,  and  acceleration  that  will 
emerge  from  this  treatment  is  a form  of  Newton’s  second  law  of  motion.  The 
form  is  applicable  to  most  of  the  systems  studied  in  newtonian  mechanics, 
but  not  to  them  all.  In  order  to  get  the  general  form  of  the  law  needed  in 
some  of  our  work  with  newtonian  mechanics,  and  to  develop  a thorough 
understanding  of  force  and  mass,  we  reconsider  Newton’s  second  law  in 
Secs. 4-3 and 4-4 from a rigorous and quite different approach. This involves
analyzing strobe photos of collisions between pucks on an air table.
As  an  additional  advantage,  the  rigorous  approach  will  make  it  possible  for 
us  to  derive  in  Sec.  4-5  Newton’s  third  law  of  motion,  which  relates  the  forces 
that  objects  exert  on  each  other  when  they  interact.  Newton’s  third  law  also 
enters  into  the  preliminary  considerations  of  Sec.  4-2,  but  there  it  is  given 
only  an  intuitive  justification. 

In this section we are concerned with Newton’s first law of motion. It has
to do with the motion of a single body on which no net force acts. Figure 4-1
will remind you that as long as there is no net force applied to an air table
puck, there is no change in its velocity, as observed by a stroboscopic
camera system supported from the ground. Since the camera is performing
the function of an observer, the situation can be described as follows: If no
net force is applied to a body, it maintains a constant velocity with respect to an observer
fixed to the earth’s surface. This statement is Galileo’s form of the law of inertia.
The name is appropriate, since the word “inertia” means the tendency to
avoid changing a state of motion. The law of inertia was modified by
Newton, in a way that is explained later, to become his first law of motion.

Fig. 4-1 A summary of experiments
considered in Chap. 1, which led to the
conclusion that the net force applied to
an object is related to its acceleration.
[Figure labels: Gravity, Support, Hand, Friction; Direction of motion;
Direction of acceleration; Direction of net force; No acceleration; No net force.]

The  law  of  inertia  played  a key  role  in  Galileo’s  1632  Dialogue  Concerning  the 
Two  Chief  World  Systems,  his  magnificent  argument  in  defense  of  the  Copernican 
view  that  the  earth  revolves  around  the  sun,  not  vice  versa.  In  the  Aristotelian 
view  almost  universally  held  at  the  time,  a fundamental  distinction  was  made 
between  celestial  matter,  which  moved  of  its  own  accord,  and  terrestrial  matter, 
which  moved  only  under  the  influence  of  a continually  applied  force. 




In  a series  of  arguments,  Galileo  destroyed  the  distinction  between  celestial 
and  terrestrial  matter.  It  was  thus  important  for  him  to  demonstrate  that  if  celestial 
matter  could  move  forever  without  any  force  being  applied,  the  same  was  true  of 
terrestrial  matter  as  well.  Galileo  perceived  that  it  was  possible  and  desirable  to 
neglect friction, which had always been regarded as an essential aspect of the motion
of bodies. This was far from obvious in a day when the common experience of
motion was with such things as oxcarts on rough roads.

Nevertheless, Galileo was able to imagine how bodies would move in the absence
of friction centuries before the invention of the air table. His argument goes
as follows: Imagine a ball rolling with negligible friction. If it rolls down an inclined
plane, it will accelerate. If it rolls up an inclined plane, it will decelerate. Now imagine
the ball rolling on a level plane. It will neither accelerate nor decelerate,
neglecting friction, and so it must move at constant velocity.

Isaac Newton (1642-1727) was born the year after Galileo’s death. In 1666
Cambridge University was closed because of a plague epidemic, and Newton
spent the time at home in what must be the most productive single year of scientific
endeavor in history. During 1666 Newton developed the major part of his mechanics
and (among other things) invented calculus.

Much later, under pressure from friends, Newton prepared a systematic,
formal account of his work in mechanics. It was published in Latin in 1686 under
the title Philosophiae Naturalis Principia Mathematica, or Mathematical Principles
of Natural Philosophy. The organization of the work is that of a classical
euclidean geometry text, with definitions, axioms or laws, and theorems. Newton
adapted Galileo’s law of inertia into his first axiom, that is, his first law of motion.
After his book had won wide acclaim, Newton acknowledged his debts to Galileo
and other predecessors in the charming statement “If I have seen farther than
others, it is because I have stood on the shoulders of giants.”

In Chap. 1 we replaced Galileo’s imaginary experiment justifying the
law of inertia with a real experiment made possible by the air table and
stroboscopic photography. Figure 4-2 repeats the experiment. It depicts
a strobe photo of a puck set into motion across the nearly frictionless
horizontal top of an air table by a launching apparatus at the upper right of
the photograph. Figure 4-3 demonstrates the constancy of the puck’s subsequent
velocity. We use the duration of the strobe flash interval as the time
unit, just as in Sec. 3-5. Then the dashed white vector is not only the displacement
of the puck during a time interval at the beginning of its motion
but also its average velocity during that time interval. Similarly, the dashed
gray vector is the average velocity of the puck during a time interval at the
end of its motion. These initial and final average velocity vectors are
moved, without changing their directions or lengths, to form the solid
white and gray vectors. This is done to make it easier to compare the initial
and final average velocity vectors. Such a comparison shows that they are
essentially the same, so that the average velocity is constant. In fact, it is not
necessary to make a distinction between average and instantaneous quantities
when we speak of the initial and final velocities. Thus we can say that
the velocity of the puck is observed to be essentially constant during its motion.
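
The comparison of initial and final velocity vectors can be sketched numerically. The Python fragment below is only an illustration: the puck positions are hypothetical values invented for the example, not measurements taken from Fig. 4-2, and the function names are likewise hypothetical. With the flash interval as the time unit, each displacement between successive images is also the average velocity over that interval.

```python
import math


def interval_velocities(positions):
    """Displacement between successive strobe images.

    With the flash interval taken as the time unit, each displacement
    is also the average velocity during that interval.
    """
    return [
        (x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(positions, positions[1:])
    ]


def change_in_velocity(positions):
    """Magnitude of (final average velocity) minus (initial average velocity)."""
    velocities = interval_velocities(positions)
    (vx_first, vy_first), (vx_last, vy_last) = velocities[0], velocities[-1]
    return math.hypot(vx_last - vx_first, vy_last - vy_first)


# Hypothetical image positions (cm) for a puck launched from the upper right.
puck = [(50.0 - 4.0 * n, 40.0 - 3.0 * n) for n in range(8)]

print("initial average velocity:", interval_velocities(puck)[0])
print("final average velocity:  ", interval_velocities(puck)[-1])
print("magnitude of the change: ", change_in_velocity(puck))
```

For real photographs the change would not be exactly zero; the question is whether it is small compared with the measurement uncertainty.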

The  downward  gravitational  force  exerted  on  the  puck  by  the  earth  is 
exactly compensated by the upward force exerted on it by the air film
under  the  puck.  And  the  puck  does  not  experience  significant  frictional 
force  acting  in  the  horizontal  direction.  So  after  it  leaves  the  launcher,  the 
puck  is  a body  experiencing  essentially  no  net  applied  force.  Furthermore, 
it  is  viewed  by  an  observer  (the  camera)  fixed  to  the  surface  of  the  earth. 
Thus  Galileo’s  law  of  inertia  should  describe  the  behavior  of  the  puck,  and 
it  does. 




Fig.  4-2  Strobe  photo  of  a puck  moving  across  the 
horizontal  top  of  an  air  table. 


Fig.  4-3  An  analysis,  explained  in  the  text,  which 
verifies  the  conclusion  that  the  velocity  of  the  puck  is 
essentially  constant,  as  observed  by  the  camera,  fixed 
to  the  surface  of  the  earth,  that  took  the  photograph. 


Very  accurate  measurements  would  show  that  the  velocity  of  the  puck 
in  Fig.  4-2  is  not  quite  constant.  Part  of  the  change  in  its  velocity  is  due  to 
the  fact  that  a small  amount  of  friction  still  acts  on  the  puck,  even  though  it 
is  on  an  air  table.  More  interesting  for  our  purpose  here  is  the  part  that  is 
due  to  the  small  accelerations  of  the  observer  fixed  to  the  surface  of  the 
earth. If a puck were launched with a velocity of 1.0 m/s due north across a
completely frictionless air table located at a latitude of 45° North, an observer
standing next to the table would see that after 1.0 s the puck had
gained a velocity of 1.0 × 10⁻⁴ m/s to the east, provided the observer
had equipment sensitive enough to measure such a velocity gain. The
reason for the change in the velocity of the puck with respect to the observer
is that the observer standing on the surface of the earth is accelerating.
Almost all of this acceleration is due to the (daily) rotation of the earth
about its axis. But there are other sources of acceleration, too. In order of
decreasing magnitude, these arise from the (annual) revolution of the
earth about the sun, the revolution of the sun about the center of our galaxy
(one revolution takes approximately 2 × 10⁸ years), and the motions, as
yet not completely known, of our galaxy relative to the universe as a whole.
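
The size of these effects can be estimated with a short calculation. The Python sketch below rests on assumptions that are not taken from this section: it uses the standard rotating-frame expression 2Ωv sin(latitude) for the horizontal acceleration of a moving body seen from the rotating earth (the text defers this topic to Sec. 5-4), the expression ω²r for centripetal acceleration in uniform circular motion, and rounded reference values for the earth’s rotation period, radius, and orbit. The function names are hypothetical.

```python
import math

# Rounded reference values, assumed for this estimate.
SIDEREAL_DAY = 86164.0          # s, one rotation of the earth about its axis
YEAR = 3.156e7                  # s, one revolution of the earth about the sun
EARTH_RADIUS = 6.37e6           # m
EARTH_SUN_DISTANCE = 1.496e11   # m


def angular_speed(period):
    """Angular speed (rad/s) of uniform circular motion with the given period."""
    return 2.0 * math.pi / period


def centripetal_acceleration(period, radius):
    """Acceleration omega^2 r of a point in uniform circular motion."""
    return angular_speed(period) ** 2 * radius


def horizontal_deflection_acceleration(speed, latitude_deg):
    """Sideways acceleration 2 * Omega * v * sin(latitude) seen from the earth."""
    omega = angular_speed(SIDEREAL_DAY)
    return 2.0 * omega * speed * math.sin(math.radians(latitude_deg))


# Puck moving north at 1.0 m/s at latitude 45 degrees, watched for 1.0 s.
dv_east = horizontal_deflection_acceleration(1.0, 45.0) * 1.0

# Accelerations of the earth-bound observer from the two largest sources.
a_rotation = centripetal_acceleration(
    SIDEREAL_DAY, EARTH_RADIUS * math.cos(math.radians(45.0))
)
a_orbit = centripetal_acceleration(YEAR, EARTH_SUN_DISTANCE)

print(f"eastward velocity gained in 1.0 s: {dv_east:.2e} m/s")
print(f"acceleration from daily rotation:  {a_rotation:.2e} m/s^2")
print(f"acceleration from annual orbit:    {a_orbit:.2e} m/s^2")
```

Under these assumptions the eastward velocity gain comes out close to the 1.0 × 10⁻⁴ m/s quoted in the text, and the acceleration due to the daily rotation exceeds that due to the annual revolution, in agreement with the ordering given above.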

If measurements are made on a body experiencing no net force from a
reference frame which does not partake of the rotation of the earth about
its axis, the observed deviation of the motion from constant-velocity motion
is much reduced. (Experimental evidence justifying this statement is presented
in Sec. 5-4.) Following this line of thought, we are led to consider a
reference frame that also does not revolve about the sun, then one that in
addition does not revolve about the center of the galaxy, and finally one
that does not even move with the galaxy. We believe that from such a frame
of reference, completely motionless with respect to the universe as a whole,
the velocity of a body would be observed to be exactly constant when it had
no net force at all acting on it. Such a reference frame is called an inertial
reference frame.

Newton modified Galileo’s law of inertia by specifying that it describes
the constant velocity of a body experiencing no net force as seen by an observer
in an inertial reference frame, not as seen by an observer fixed on
the earth’s surface. (Actually, at the time of Newton the motions of stars
through galaxies, and of galaxies through the universe, were not known.
So Newton considered an inertial frame to be one that is motionless with
respect to the “fixed stars.” But there is no question that he would find the
modern definition of an inertial frame completely acceptable.) The modified
version of the law of inertia is: If no net force is applied to a body, it maintains
a constant velocity with respect to an observer fixed to an inertial frame. (Included
is the case in which the magnitude of the constant velocity is zero, so
that the body remains at rest with respect to the observer.) This statement
was incorporated by Newton into his theory of mechanics as its first axiom.
Consequently, it is also called Newton’s first law of motion.

A reference  frame  that  is  motionless  with  respect  to  the  universe  as  a 
whole  is  an  inertial  frame.  Newton’s  first  law  says  that  a body  not  acted  on 
by  a net  force  maintains  a constant  velocity  with  respect  to  an  inertial 
frame.  So  the  first  law  says  that  a body  which  has  no  net  force  acting  on  it 
maintains  a constant  velocity  with  respect  to  the  universe  as  a whole.  But 
the  first  law  uses  the  concept  of  an  inertial  frame  to  make  the  statement 
indirectly. The reason is that the expression of Newton’s second law of motion
also involves the concept of an inertial frame. The first law serves to introduce
the concept and define it in context. Thus the first law has two
functions.  One  is  to  tell  you  that  inertial  frames  exist.  The  other  is  to  tell 
you  how  to  determine  whether  you  are  stationed  in  such  a reference  frame 
without  the  need  of  knowing  what  the  entire  universe  is  doing  in  order  to 
know  your  state  of  motion  with  respect  to  the  universe.  What  you  do  to 
make  the  determination  is  to  find  a body,  like  an  air  table  puck,  that  you 
have good reason to believe is not acted on by a net force. Then you observe
it and determine whether its velocity relative to you remains constant.
If  this  velocity  is  constant,  you  are  in  an  inertial  frame. 

You will never be in a precisely inertial reference frame, even if you
become an astronaut, because you will never be able to free yourself completely
from the motions of the sun and of our galaxy. But when stationed
on the earth’s surface, you are in a good approximation to an inertial reference
frame. It may or may not be a sufficiently good approximation for
your  purposes,  depending  on  the  system  you  are  studying  and  the  accuracy 
of  your  studies.  If  too  much  error  would  arise  from  ignoring  the  fact  that  a 
reference  frame  fixed  to  the  earth’s  surface  is  not  exactly  an  inertial  frame, 
there  is  a procedure  which  can  be  used  to  correct  accurately  for  this  fact. 
(The  procedure  is  developed  in  Sec.  5-4.)  For  most  practical  studies  in 
physics,  and  for  nearly  every  application  of  physics  to  engineering,  you  are 
completely  justified  in  treating  a reference  frame  fixed  to  the  surface  of  the 
earth  as  an  inertial  frame. 

Any reference frame which moves at constant velocity relative to an inertial
frame is also an inertial frame. This is a consequence of Newton’s first law and
of a conclusion obtained in Sec. 3-8. There two reference frames were considered.
One was the x, y, z frame and the other the x', y', z' frame which
moved at an arbitrary constant velocity relative to the unprimed frame. We
wrote the Galilean position transformation equation relating the position
vector in the primed frame to the position vector in the unprimed frame.
We then calculated the time derivative of each term in the equation and obtained
the Galilean velocity transformation equation. From this equation we
concluded that a body observed to move relative to the unprimed frame at
constant velocity will be observed to move relative to the primed frame with
a different, but still constant, velocity. Now let the unprimed frame be an
inertial frame. Then Newton’s first law requires that there be no net force
acting on the body. To find out whether the primed frame is also an inertial
frame, we note that a body on which no net force acts is observed from that
frame to move with constant velocity. So the test of an inertial frame given
by the first law is satisfied, and the primed frame therefore is also an inertial
frame. This proves the italicized statement.
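
The argument can be checked numerically with a small Python sketch. It assumes the Galilean transformation of Sec. 3-8 has the form r' = r − ut, where u is the constant velocity of the primed frame relative to the unprimed frame and the two origins coincide at t = 0; that sign convention, the function names, and the sample numbers are assumptions made for illustration.

```python
def primed_positions(r0, v, u, times):
    """Positions r' = r - u t of a body whose unprimed position is r = r0 + v t."""
    return [
        tuple(r0_i + v_i * t - u_i * t for r0_i, v_i, u_i in zip(r0, v, u))
        for t in times
    ]


def finite_difference_velocities(positions, dt):
    """Average velocity over each interval between successive positions."""
    return [
        tuple((b - a) / dt for a, b in zip(p, q))
        for p, q in zip(positions, positions[1:])
    ]


def galilean_velocity(v, u):
    """Galilean velocity transformation: v' = v - u."""
    return tuple(v_i - u_i for v_i, u_i in zip(v, u))


# Sample values (m and m/s), chosen only for illustration.
r0 = (1.0, 2.0, 0.0)    # position of the body at t = 0
v = (3.0, 0.0, 0.0)     # constant velocity of the body in the unprimed frame
u = (1.0, -2.0, 0.0)    # constant velocity of the primed frame
times = [0.0, 1.0, 2.0, 3.0]

v_primed = finite_difference_velocities(primed_positions(r0, v, u, times), 1.0)
print("velocities seen from the primed frame:", v_primed)
print("prediction v - u:", galilean_velocity(v, u))
```

Every interval yields the same primed-frame velocity, equal to v − u. A force-free body therefore passes the constant-velocity test in the primed frame as well, which is the content of the italicized statement.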


4-2 NEWTON’S SECOND AND THIRD LAWS

We begin the preliminary consideration of Newton’s second and third laws
of motion, with which this section is concerned, by drawing an implication
from Newton’s first law of motion. The first law implies that if a body does
not maintain a constant velocity with respect to an inertial frame of reference,
then there is a net force applied to it. Since a body with a changing
velocity is accelerating, the implication is that a body with a net force applied
to it is accelerating with respect to an inertial frame. Newton’s second
law of motion gives the precise connection. It asserts that the relation between
the net force and the acceleration is a direct proportionality.

Everyone  has  at  least  an  intuitive  feeling  for  the  fact  that  the  net  force 
applied  to  an  object  is  directly  proportional  to  its  acceleration,  as  observed 
from the approximately inertial reference frame of the earth’s surface. For
instance, the drag racer knows that he doubles the acceleration of his vehicle
by doubling the propulsive force produced by its engine, providing
the  tires  maintain  their  grip. 

Figure 4-4 indicates how a quantitative measurement could be made to
verify the proportionality between the acceleration of an object, observed
from the surface of the earth, and the strength of the net force applied to
it. An experimenter pulls a block across the frictionless top of an air table
by applying a force to it through a spring scale. As is discussed in detail
near the end of this chapter, the extension of the spring is proportional to
the force acting on it. Consequently, the scale reading provides a measure
of the strength of the force transmitted through the scale to the block. The
experimenter applies a constant force to the block. Meanwhile a strobe
photo is made of its motion. Then the experiment is performed a second
time, with the experimenter pulling harder so that the measured strength
of the applied force is doubled. Analysis of the photographs will show that
the acceleration is doubled in the second experiment.

Fig. 4-4 An experiment which shows the proportionality
between the acceleration of an object and the strength of the net
force applied to it. In the upper part of the figure, an experimenter
applies a constant horizontal force to the object
consisting of the block plus the spring scale. The constancy of the
force is verified by the constant reading on the scale. If friction is
negligible because the block is on an air table, a strobe photo
taken by a camera fixed to the earth will show that the object
experiences a constant acceleration, within experimental accuracy.
In the lower part of the figure, the experimenter doubles
the force applied to the object. A strobe photo will show that its
acceleration is doubled.
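
The analysis of the two strobe photos can be sketched in Python. The sketch uses synthetic data rather than real photographs: the block positions are generated on the assumption that a constant force gives a constant acceleration proportional to that force, so the code illustrates how the photographs would be analyzed and is not an independent test of the proportionality. The block mass, flash interval, forces, and function names are all hypothetical.

```python
def strobe_positions(force, mass, n_flashes, dt):
    """Synthetic positions (m) of a block starting from rest.

    Assumes constant acceleration force/mass, so x = (force/mass) t^2 / 2.
    """
    a = force / mass
    return [0.5 * a * (k * dt) ** 2 for k in range(n_flashes)]


def accelerations_from_strobe(xs, dt):
    """Estimate acceleration from second differences of successive positions."""
    return [
        (xs[k + 1] - 2.0 * xs[k] + xs[k - 1]) / dt ** 2
        for k in range(1, len(xs) - 1)
    ]


MASS = 0.50   # kg, hypothetical block plus spring scale
DT = 0.10     # s, hypothetical flash interval

a_first = accelerations_from_strobe(strobe_positions(1.0, MASS, 6, DT), DT)
a_second = accelerations_from_strobe(strobe_positions(2.0, MASS, 6, DT), DT)

print("accelerations, first pull: ", [round(a, 3) for a in a_first])
print("accelerations, doubled pull:", [round(a, 3) for a in a_second])
```

The second difference of three successive positions, divided by the square of the flash interval, estimates the acceleration during that part of the motion. Equal values along one photo indicate constant acceleration, and comparing the two photos gives the ratio of the accelerations.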

Many  measurements  such  as  these  show  that  the  net  force  F applied  to 
a body  is  proportional  to  its  acceleration  a with  respect  to  an  inertial  frame 
of  reference.  We  write  the  relation  as 

F ∝ a

where the symbol ∝ means “proportional to.” It is usually more convenient
to treat equations than proportionalities. We therefore introduce a proportionality
constant m and convert the proportionality into the equation

F = ma  (4-1) 

This  is  Newton’s  second  law  of  motion. 

What is the physical meaning of m? From experience we know that not
all bodies are equally easy to set into motion. It appears that the difficulty of
accelerating them increases with the amount of matter they contain. This
“amount of matter” is called mass. All material bodies in the universe have
mass. Qualitatively, the mass of a body measures its inertia; in other
words, its reluctance to accelerate. That is exactly what Eq. (4-1) says quantitatively.

The value of m in Eq. (4-1) specifies how difficult it is to make a body of
that mass accelerate. The equation says that as viewed from an inertial reference
frame the strength F of the force required to produce an acceleration
of magnitude a will be proportional to the mass m of the body it acts on.
An experiment verifying the proportionality of F to m, for a fixed value of
a, can be performed along the lines suggested in Fig. 4-5. If the mass of the
spring scale is negligible, the experimenter finds that for the double block it
takes a doubling of the applied force (determined by the spring scale
reading) to produce the same given acceleration (determined by a strobe
photo). And since the total mass of two identical blocks must certainly be
twice the mass of each, the proportionality of F to m is verified.




Fig.  4-5  An  experiment  which  shows  the  proportionality 
between  the  mass  of  an  object  and  the  net  force  which  must  be 
applied  to  give  the  object  a certain  value  of  acceleration.  In  the 
upper part of the figure, the experimenter applies a force to a
spring  scale  and  a block  supported  by  an  air  table.  A strobe  photo 
taken  by  a camera  fixed  to  the  earth  will  show  that  the  block  has  a 
certain  acceleration.  In  the  lower  part  of  the  figure,  a second 
identical  block  is  attached  to  the  first  one.  If  the  mass  of  the  spring 
scale  can  be  neglected,  this  doubles  the  mass  of  the  object  to 
which  the  force  is  applied.  A strobe  photo  will  now  show  that, 
within  experimental  accuracy,  giving  the  object  of  doubled  mass 
the  same  acceleration  as  before  requires  applying  a force  of 
doubled  strength. 




An  important  feature  of  Eq.  (4-1)  is  that  the  symbol  for  force  is  written 
as  a vector  and  the  symbol  for  mass  is  written  as  a (positive)  scalar.  Thus  the 
equation  states  that  the  force  F is  a vector  having  the  same  direction  as  the 
acceleration a which it produces. Intuition certainly indicates that this statement
is correct. Can you give an example drawn from everyday experience?

Newton’s second law of motion is the cornerstone of newtonian mechanics
and is perhaps the most important relation in all physics. To help
emphasize  the  law,  we  write  again  the  equation  representing  it,  Eq.  (4-1), 

F = ma 

And  remembering  that  the  experiments  used  to  establish  Newton’s  second 
law  are  carried  out  in  an  approximation  to  an  inertial  frame,  we  express  the 
law  in  words:  The  net  force  F acting  on  a body  equals  the  product  of  the  mass  m of 
the  body  and  the  acceleration  a which  it  gives  to  that  body,  if  the  motion  is  observed 
from an inertial frame. The net force acting on a body is the vector sum of all
the  forces  applied  to  the  body. 

If  we  are  to  make  Eq.  (4-1)  useful,  we  must  define  a standard  unit  of 
mass.  This  unit  is  the  kilogram  (abbreviated  kg).  It  is  the  mass  of  a certain 
block of platinum-iridium alloy, kept for safety in a vault in Sèvres, a
suburb of Paris. Its mass was originally chosen to be as close as possible to that
of the amount of pure water, at a pressure of 1 atmosphere (atm) and a
temperature of 4°C, which occupies a volume of 10⁻³ m³ (that is, 1 L). In
fact,  the  makers  of  the  block  came  fairly  close  to  this  goal.  But  the  standard 
is  the  block,  and  not  the  water.  Once  the  standard  had  been  established, 
many  copies  were  made  for  use  elsewhere.  A subsidiary  unit  of  mass  is  the 
gram  (abbreviated  g,  or  often  gm  in  the  older  literature).  It  is  defined  to  be 
exactly 

1 g = 0.001  kg 

The mass of this book is about 1.5 kg. A strong athlete can lift over his
head a barbell with a mass of about 200 kg.

The unit of mass for the system of units still frequently used in
engineering practice in the United States is named the slug. To three decimal
places, its value is

1 slug  = 14.594  kg 

4-2  Newton’s  Second  and  Third  Laws  115 


The  SI  units  of  length  and  time  have  been  redefined  in  terms  of  the  properties 
of  atoms,  but  this  has  not  yet  been  done  for  the  unit  of  mass.  The  relation  between 
atomic  masses  and  the  mass  defined  by  the  standard  platinum-iridium  block  is 
known. Specifically, 5.01848 × 10²⁵ atoms of carbon-12 have a mass of 1 kg. But it
is  not  yet  possible  to  obtain  this  number  to  more  significant  figures  because  of  the 
difficulty  in  determining  the  masses  of  atoms.  Since  the  much  larger  masses  we 
deal  with  in  most  circumstances  can  be  compared  to  that  of  the  standard  block  in 
measurements  accurate  to  several  more  significant  figures  than  there  are  in  the 
number  just  quoted,  the  block  continues  to  be  used  to  define  the  kilogram. 

Now that we have defined the unit of mass, we can use Newton’s
second law to define the unit of force. In terms of magnitudes, the law is
F = ma. It tells us that the net force required to give a unit mass (1 kg) a
unit acceleration (1 m/s²) is 1 kg × 1 m/s² = 1 kg·m/s². This quantity is
the unit of force. The force unit is important enough to warrant its own
name. It is called the newton (N). That is, the force unit is

1 N = 1 kg·m/s²

If  you  hold  this  book  in  the  palm  of  your  hand,  it  presses  down  with  a force 
of about 15 N. The barbell supported by the athlete is exerting a
downward force on his arms of about 2000 N.

Note  that  it  is  not  necessary  to  construct  a standard  newton.  The  force  unit  is 
specified in terms of the three fundamental units: length, mass, and time.
Consequently, the force unit is known as a derived unit. All the other units used in
mechanics are also derived units. For instance, the unit of velocity, 1 m/s, is
specified in terms of two of the three fundamental units. The special role of the units for
length,  mass,  and  time  was  made  apparent  by  the  name  for  the  immediate  ancestor 
of  SI.  It  was  called  the  mks  system,  the  letters  standing  for  meters,  kilograms,  and 
seconds. 
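The idea of a derived unit can be made concrete with a little bookkeeping. The sketch below is only an illustration, not part of the text's development: it represents the dimensions of a quantity by the exponents of mass, length, and time, and builds up the dimensions of force from those of velocity and acceleration. The names Dim, multiply, and divide are hypothetical helpers chosen for this sketch.

```python
from collections import namedtuple

# Exponents of the three fundamental dimensions: mass (M), length (L), time (T).
# "Dim" and the helper functions below are illustrative names, not standard ones.
Dim = namedtuple("Dim", ["M", "L", "T"])

def multiply(a, b):
    """Dimensions of a product: the exponents add."""
    return Dim(a.M + b.M, a.L + b.L, a.T + b.T)

def divide(a, b):
    """Dimensions of a quotient: the exponents subtract."""
    return Dim(a.M - b.M, a.L - b.L, a.T - b.T)

MASS = Dim(1, 0, 0)     # kilogram
LENGTH = Dim(0, 1, 0)   # meter
TIME = Dim(0, 0, 1)     # second

velocity = divide(LENGTH, TIME)        # m/s
acceleration = divide(velocity, TIME)  # m/s^2
force = multiply(MASS, acceleration)   # kg*m/s^2, the newton

print("velocity:", velocity)
print("acceleration:", acceleration)
print("force:", force)
```

The exponents found for force, one power of mass, one of length, and minus two of time, are the same ones that appear in 1 N = 1 kg·m/s².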

Example  4-1  illustrates  a simple  application  of  Newton’s  second  law  of 
motion  to  an  accelerating  object. 

EXAMPLE 4-1

The  speed  of  a 1000-kg  automobile  on  a straight  road  increases  uniformly  from 
0 m/s  to  30.0  m/s  in  10.0  s.  What  net  force  is  acting  on  the  automobile? 

■ This  is  a problem  in  one  dimension,  so  you  can  express  directed  quantities  by 
using  either  the  signed  scalar  symbols  of  Chap.  2 or  the  vector  symbols  of  Chap.  3. 
If you choose vector symbolism to gain experience with it, you can evaluate the
automobile’s constant acceleration a in terms of the final velocity v by writing the
relation for constant acceleration of Eq. (2-29) in vector form:

v = v_i + at

Setting the initial velocity v_i equal to zero and solving for a, you obtain

a = v/t

Since  t is  a positive  scalar,  it  is  evident  that  a has  the  same  direction  as  v.  Newton's 
second  law  states  that  the  net  force  F acting  on  the  automobile  of  mass  m is 

F = ma 


Substituting for a, you have

F = mv/t




The direction of this net force is the same as that of the automobile's acceleration a,
and thus of its final velocity v; it is the direction of the automobile’s motion. The
magnitude F of the net force is

F = mv/t

where v is the final speed of the automobile. Inserting the numerical values given
for m, v, and t, you obtain

F = (1000 kg × 30.0 m/s) / (10.0 s) = 3000 kg·m/s²

or

F = 3000 N
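If you would like to check the arithmetic of Example 4-1 by machine, a minimal sketch in Python follows. The function name net_force_from_rest is a hypothetical choice made for this illustration; the calculation assumes, as the example does, that the acceleration is uniform and that the automobile starts from rest.

```python
def net_force_from_rest(mass_kg, final_speed_m_s, elapsed_s):
    """Magnitude of the constant net force, in newtons, on a body that
    speeds up uniformly from rest to final_speed_m_s in elapsed_s."""
    acceleration = final_speed_m_s / elapsed_s  # a = v/t, since v_i = 0
    return mass_kg * acceleration               # F = ma

# Values given in Example 4-1.
force_n = net_force_from_rest(mass_kg=1000.0, final_speed_m_s=30.0, elapsed_s=10.0)
print(f"F = {force_n:.0f} N")
```

Writing the calculation as a function of m, v, and t makes it easy to see how the required force scales, for example, when the same change in speed must be produced in half the time.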


How  have  we  gotten  to  where  we  are  now?  We  began  with  the  familiar 
intuitive  concept  of  force.  Combining  this  concept  with  the  defined  concept 
of  acceleration,  we  arrived  at  the  less  familiar  concept  of  mass.  But 
although mass was not a familiar concept, there was no difficulty in
defining a standard mass, the kilogram. Once this was done, we backtracked
and put the intuitive concept of force on a sounder logical basis through
the definition of the unit of force in terms of the units of mass and
acceleration.

While  mass  may  not  be  a concept  in  everyday  use,  it  has  an  intuitive 
connection with the familiar concept of weight. This is demonstrated in
Example 4-2.


EXAMPLE 4-2

A 0.70-kg billiard ball and a 7.0-kg bowling ball both fall toward the ground with the
same downward acceleration g of magnitude 9.8 m/s². Find the force, symbolized
as W, exerted by gravity on each ball.

■ Figure  4-6  shows  the  two  falling  balls.  For  each  ball  the  only  force  acting  on  it  is 
the  gravitational  force  W.  Newton’s  second  law  says  that  this  net  force  must  be  in 
the  direction  of  its  acceleration  g,  namely  downward.  The  second  law  also  says  that 
the  magnitude  W of  the  net  force  acting  on  the  ball  must  equal  its  mass  m times  the 
magnitude  g of  its  acceleration.  So  you  have  for  the  billiard  ball 

W = mg = 0.70 kg × 9.8 m/s² = 6.9 N




Fig.  4-6  The  gravitational  forces  W exerted  by  the  earth  on  two  balls 
of  different  mass  m and  their  accelerations  g when  falling  toward  the 
earth.  Observation  shows  that  both  accelerations  have  the  same 
magnitude,  providing  that  air  resistance  is  negligible. 




For the bowling ball,

W = mg = 7.0 kg × 9.8 m/s² = 69 N


Although  the  masses  of  the  two  balls  considered  in  Example  4-2  are 
different,  experiment  shows  that  near  the  surface  of  the  earth  they  fall  with 
accelerations of the same magnitude g (neglecting air resistance).
According to Newton’s second law as applied in the example, this can be true
only  if  it  is  also  true  that  the  magnitude  W of  the  gravitational  force  exerted 
on  each  ball  is  directly  proportional  to  its  mass  m,  with  the  proportionality 
constant  being  g.  That  is, 

W = mg  (4-2) 

This equation allows us to understand why the two balls fall with the same
acceleration, despite the difference in their masses. The bowling ball,
having 10 times as much mass as the billiard ball, feels a gravitational force
which is 10 times as strong. But because its mass is 10 times that of the
billiard ball, Newton's second law says it takes the 10 times stronger force to
give the bowling ball the same acceleration as the billiard ball. In other
words, since force equals mass times acceleration, it follows that
acceleration equals force divided by mass. And for each ball the gravitational force
mg acting on it, divided by its mass m, yields the same acceleration g.
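The cancellation of mass described above can be seen numerically. The following is a minimal sketch using the two masses of Example 4-2; the function names weight and free_fall_acceleration are hypothetical, and air resistance is neglected as in the text.

```python
G = 9.8  # magnitude of the gravitational acceleration near the earth's surface, m/s^2

def weight(mass_kg, g=G):
    """Gravitational role of mass: W = mg, in newtons."""
    return mass_kg * g

def free_fall_acceleration(mass_kg, g=G):
    """Inertial role of mass: a = W/m. The mass cancels, leaving g."""
    return weight(mass_kg, g) / mass_kg

for name, mass_kg in [("billiard ball", 0.70), ("bowling ball", 7.0)]:
    print(f"{name}: W = {weight(mass_kg):.1f} N, "
          f"a = {free_fall_acceleration(mass_kg):.1f} m/s^2")
```

The weights differ by a factor of 10, but the quotient W/m is the same for both balls.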

Frequent  use  will  be  made  of  Eq.  (4-2)  since  it  applies  to  all  bodies  near 
the  surface  of  the  earth,  where  the  magnitude  of  their  acceleration  when 
falling  with  negligible  air  resistance  is  essentially  equal  to  the  standard 
value g. (In Chap. 11 you will see that in this context “near” means that the
separation between a body and the earth's surface is small compared to the
earth’s radius.) We call the magnitude W of the gravitational force W acting
on a body near the earth’s surface its weight. With this terminology, Eq.
(4-2) says that the weight of a body is its mass multiplied by the magnitude of the
gravitational acceleration. The gravitational force exerted on the body is a
manifestation of the gravitational attraction between the earth and the body. Its
direction  is  always  toward  the  center  of  the  earth,  that  is,  in  the  direction 
called  downward. 

Example  4-3  will  show  you  how  Newton’s  second  law  can  be  applied  to 
stationary bodies and will demonstrate one method for finding the weight
of  a body. 


EXAMPLE  4-3 

The  billiard  ball  and  the  bowling  ball  of  Example  4-2  are  suspended  from  spring 
scales  and  hang  motionless.  See  Fig.  4-7.  Find  the  magnitude  S of  the  force  exerted 
by  the  scale  holding  up  each  ball.  In  other  words,  find  the  force  registered  on  each 
spring  scale. 

■ You  see  from  the  figure  that  acting  on  each  ball  is  a force  W exerted  by  gravity 
and  a force  S exerted  by  the  scale.  The  net  force  F acting  on  the  ball  is  the  vector 
sum  of  the  two: 

F = W + S 

Since  the  ball  is  not  accelerating,  a = 0 and  so  F = ma  = 0.  Thus  you  have  in  each 
case 

W + S = 0 
or 

S = -W (4-3)



Fig. 4-7 The balls of Fig. 4-6 hanging
motionless from cords connected to
spring scales. Although the earth still
exerts downward gravitational forces
W = mg on them, the balls are
prevented from falling by the upward
forces S exerted on them by the scales
from which they are suspended.


Since the force exerted on the ball by gravity is downward, Eq. (4-3) shows that the
force exerted on the ball by the scale is upward, in agreement with Fig. 4-7. As for
magnitudes, the equation shows that

S = W

Using the values of W obtained from Example 4-2, you have for the billiard ball

S = 6.9 N

For the bowling ball you have

S = 69 N

A spring scale is calibrated so that the magnitude S of the force it exerts can be
read from the markings on the scale. Because S = W, the scale reading gives the
weight W of the ball it supports.


A spring scale measures the weight W of an object, not its mass m. That is, the
scale measures how much gravitational force is exerted on the object by the earth,
not how much matter the object contains. So although we do not measure mass
directly in everyday life, we can infer it. For most purposes, the acceleration of
gravity can be taken to have the same magnitude everywhere on the surface of the
earth. Using the value g = 9.8 m/s², we infer the mass to be m = W/g. It is, in fact,
the masses of things, rather than their weights, which usually interest us. When
we buy an expensive steak, for example, it is the quantity of matter, the
mass, which we care about, not the gravitational force exerted on the steak by the
earth.

The proportionality between mass and weight under everyday conditions has
given rise to a universal but confusing practice. Since it is the mass that is really of
interest, the result of the weighing process is almost always expressed directly in
terms of the corresponding mass. Thus we speak of an object as “weighing”
5.0 kg. What we mean, precisely speaking, is that the object is supported against
the gravitational force acting on it by a scale exerting a force of magnitude
S = W = mg = 5.0 kg × 9.8 m/s² = 49 N. This is just the force that we expect
to be exerted to support an object of 5.0-kg mass located near the earth’s surface.
So we can correctly infer that the object has a mass of 5.0 kg. Some spring scales
are calibrated to read mass directly. This makes them convenient to use, in normal
circumstances. How useful would they be if taken to the moon?

In Example 4-4 the mass of a body results in a gravitational force being
exerted on it, and the mass also results in the body having inertia. These
two different aspects of mass also are involved in treating a falling body, as
in Example 4-2. But in the situation considered next it is easier to
distinguish between the gravitational role of mass and its inertial role.
Example 4-4 is the first to involve more than one dimension, and therefore the
first in which the true vector nature of force and acceleration must be taken
into account.


EXAMPLE 4-4

A block slides down a long, frictionless plane which is supported from the earth at
an angle of inclination θ = 37° with respect to the horizontal. Find the acceleration
of the block and the distance the block has traveled 3.0 s after it starts from rest.

■ Your first step is to make a sketch of the block and plane and to indicate on the
sketch  an  appropriate  set  of  coordinates.  A good  choice  for  the  coordinates  is 
shown  in  Fig.  4-8.  The  x axis  lies  in  the  inclined  plane,  with  its  positive  direction  in 
the  direction  of  the  block’s  motion.  The  y axis  is  constructed  perpendicular  to  the 




Fig. 4-8 The forces acting on a block of mass m sliding down a frictionless plane
inclined to the horizontal at angle θ. Constructed parallel to the plane in the
direction of motion, and normal to the plane in the generally upward direction, are
x and y axes. The gravitational force W exerted by the earth has components
W_x = +mg sin θ and W_y = -mg cos θ along the positive directions of these axes.
The force N exerted by the plane has only a single nonzero component,
N_y = +N. The ticks, labelled +mg sin θ and -mg cos θ, on the axes drawn
parallel to the x and y axes illustrate the scheme used here and subsequently to
depict the components of a vector. The tick on the axis which is parallel to the x axis
marks a distance from the origin of the former whose magnitude is proportional
to the magnitude of the x component of the vector. The distance is measured in
the positive x direction if the x component has a positive value (as it does in this
case), and in the negative x direction otherwise. The same scheme is used to
represent pictorially the y component of the vector (which has a negative value in
this case). This scheme is compatible with the scheme that is used to depict the
vector itself by starting at the origin and measuring a distance, in the direction
of the vector, whose magnitude is proportional to the magnitude of the vector.


x axis.  Furthermore,  the  y axis  is  normal  to  the  inclined  plane.  (The  word  normal 
means that any plane containing the y axis, not just the plane of the page, is
perpendicular to the inclined plane.) The positive direction of the y axis is generally
upward. These rectangular coordinates are better to choose than ones in which the x
and y axes are horizontal and vertical because they make the block’s acceleration
vector have only a single nonzero component. With this choice, the vector
describing the gravitational force acting on the block has two nonzero components. But
only  the  one  along  the  direction  of  the  block’s  motion  is  important  since  it  is  the  only 
one  which  leads  to  acceleration  of  the  block. 

If the mass of the block is m, the gravitational force acting on it has the
magnitude W = mg. The figure shows that the x and y components of this force are
W_x = mg sin θ and W_y = -mg cos θ. There is also a supporting force exerted on the
block by the inclined plane. The plane is assumed to be frictionless, and a
frictionless plane cannot exert any force in a direction parallel to itself. So this force is
directed normal to the inclined plane, just as the force exerted on a puck by an air
table is normal to the tabletop. Representing the magnitude of the force by N, you
have for its components N_x = 0 and N_y = N. As for the acceleration of the block,
you know that whatever it may be, it has no component in the y direction. This is
true since the block neither rises off the plane nor descends through it, so the block
does not accelerate in the y direction. Also, the positive x direction is in the direction
of the block’s acceleration since that is the direction of its motion. Thus if the
magnitude of the acceleration is a, then a_x = a and, as just argued, a_y = 0.

The vector equation expressing Newton’s second law, F = ma, is equivalent to
two scalar equations: F_x = ma_x and F_y = ma_y. (Can you explain why?) Consider
the one involving x components. The x component F_x of the net force acting on the
block is just equal to W_x since N_x = 0. Thus you have

W_x = ma_x

Evaluating W_x and a_x, you obtain

mg sin θ = ma (4-4a)

Canceling the m appearing on both sides of this equality, and then solving for a,
gives you

a = g sin θ (4-4b)

For the values specified, you find

a = 9.8 m/s² × sin 37° = 5.9 m/s²



Knowing a, you can find the distance x traveled in 3.0 s from Eq. (2-30), x = x_i +
v_i t + at²/2. With x_i = v_i = 0, that equation gives

x = at²/2

or

x = 5.9 m/s² × (3.0 s)² / 2 = 27 m
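The two steps just carried out, a = g sin θ followed by x = at²/2, can be sketched in a few lines of Python. This is only an illustrative check under the assumptions of the example (frictionless plane, block starting from rest); the function names are hypothetical.

```python
import math

G = 9.8  # magnitude of the gravitational acceleration, m/s^2

def incline_acceleration(theta_deg, g=G):
    """Acceleration down a frictionless incline: a = g sin(theta)."""
    return g * math.sin(math.radians(theta_deg))

def distance_from_rest(acceleration, elapsed_s):
    """Distance traveled from rest at constant acceleration: x = a t^2 / 2."""
    return 0.5 * acceleration * elapsed_s ** 2

a = incline_acceleration(37.0)
x = distance_from_rest(a, 3.0)
print(f"a = {a:.1f} m/s^2, x = {x:.0f} m")
```

Notice that the mass of the block appears nowhere in the code, just as it canceled in passing from Eq. (4-4a) to Eq. (4-4b).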


If you also wanted to determine the magnitude N of the force exerted on the
block by the plane, you could find it by considering the equation F_y = ma_y. Since
a_y = 0, it must be that F_y, the y component of the net force acting on the block, is
also zero. Thus you have

F_y = W_y + N_y = 0

Evaluating W_y and N_y, you obtain

-mg cos θ + N = 0

or

N = mg cos θ (4-5)

Inserting the particular values of m and θ, you will find N.

In more general terms, note that for a level plane (θ = 0) this equation yields
N = mg = W, while for a vertical plane (θ = 90°) it yields N = 0. In the first case the
frictionless plane fully supports the block, while in the second it gives the block no
support at all. These results certainly conform to the results predicted by Eq. (4-4b)
for the magnitude a of the acceleration down the plane. With θ = 0 (a level plane)
the block is fully supported, and so it will not accelerate, in agreement with the
a = 0 predicted for this angle. With θ = 90° (a vertical plane) the plane might as
well  not  be  there  at  all  since  it  gives  no  support  at  all  to  the  block.  The  block  will  fall 
freely,  in  agreement  with  the  predicted  value  a = g. 
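The limiting cases discussed above are easy to tabulate. The sketch below, offered only as an illustration, evaluates the ratios N/W = cos θ and a/g = sin θ that follow from the expressions for N and a, for a level plane, for the 37° plane of the example, and for a vertical plane. The function names are hypothetical.

```python
import math

def normal_force_fraction(theta_deg):
    """N/W = cos(theta): the fraction of the weight supported by the plane."""
    return math.cos(math.radians(theta_deg))

def acceleration_fraction(theta_deg):
    """a/g = sin(theta): the acceleration as a fraction of free fall."""
    return math.sin(math.radians(theta_deg))

for theta_deg in (0.0, 37.0, 90.0):
    print(f"theta = {theta_deg:4.0f} deg: "
          f"N/W = {normal_force_fraction(theta_deg):.2f}, "
          f"a/g = {acceleration_fraction(theta_deg):.2f}")
```

As the angle increases from 0 to 90°, the support supplied by the plane falls from the full weight to nothing, while the acceleration rises from zero to the free-fall value.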


In the preceding example the inclined plane partially supports the
block against gravity. The strength of the “effective gravitational force”
acting on the block is the x component, mg sin θ. This is less than the weight
mg of the block, so the inclined plane has “diluted” the effect of gravity. But
the mass which the force mg sin θ must accelerate is the full mass of the
block. This is why the magnitude a of the acceleration of the block is less
than the value g that would be found if the block were falling freely.

Note that the value of the mass m in the left side of Eq. (4-4a),
mg sin θ = ma, determines the strength of the effective gravitational force
acting on the block, while the value of the mass m in the right side
determines its inertia. In Example 4-5 the role of mass in determining the inertia
of a body is even easier to distinguish from its role in determining the
gravitational force acting on the body. This is because two bodies are involved. For
one of them all that matters is its inertia; for the other all that matters is the
gravitational force acting on it.


EXAMPLE  4-5 

Figure  4-9a  is  a reproduction  of  Fig.  3-30.  It  is  a strobe  photo  showing  a puck 
moving  under  the  influence  of  a force  exerted  on  it  by  one  end  of  a string.  The 
string  extends  from  the  puck  to  a swiveling  pulley  at  the  center  of  the  table,  over 
the  pulley,  and  through  a hole  in  the  table  down  to  a large  washer  hanging  from  the 
other end of the string, as in Fig. 4-9b. The gravitational force acting on the washer
is  transmitted  by  the  string  (whose  mass  is  negligible)  to  the  puck.  The  puck  has 




Fig. 4-9 (a) Strobe photo of a puck in a circular orbit
on an air table. (b) A sketch of the apparatus. Shown
acting on the stationary washer of mass m is a
downward force of strength mg applied to it by the
gravitational pull of the earth and an upward force of
equal strength applied to the washer by the string. The
other end of the string applies an inward force of the
same strength mg to the puck of mass M. This force
results in an inward acceleration of the puck. The
acceleration has magnitude a = v²/r, where v is the
speed of the puck and r is its orbit radius. The force of
gravity exerted by the earth on the puck, and the
canceling normal force exerted on it by the air table
top are not shown because they are not of interest. (c)
The forces acting on the two ends of the string.


been  set  into  motion  in  a circular  orbit  of  radius  r,  and  it  moves  around  this  orbit  at 
speed  v.  Measurements  made  on  Fig.  3-30  showed  that  r = 0.44  m and  v = 0.54 
m/s.  Additional  measurements  showed  the  mass  of  the  puck  to  be  M = 0.33  kg. 
What  is  the  mass  m of  the  washer  hanging  from  the  string? 

■ As you learned in Sec. 3-5, the acceleration a of the puck moving at constant
speed around a circular orbit is always directed from the puck to the center of its
orbit. Since F = Ma, where M is the puck’s mass, the net force F acting on it is also
directed to the orbit’s center. Any set of cartesian (x, y) coordinates would be
difficult for you to use in this case, since F and a would have time-varying components.
But polar (r, θ) coordinates with an origin at the center of the orbit are already
being used, since the dimensions of the orbit are given in terms of the radial
coordinate r. In these coordinates the acceleration of the puck has a component only along
the inward radial direction, and the same is therefore true of the net force acting on
it. According to Eq. (3-41a), the magnitude of this centripetal acceleration of the
puck is

a = v²/r


Newton’s  second  law  tells  you  that  the  magnitude  F of  the  net  force  which  produces 
the  acceleration  is 

F = Ma

or

F = Mv²/r




The net force acting on the puck is directed inward along the string and is
exerted by the string connected to the puck. The magnitude of F is the weight mg of
the washer hanging from the lower end of the string. Thus you have

F = mg

Combining the last two equations, you obtain

mg = Mv²/r (4-6a)

or

m = Mv²/(gr) (4-6b)

The numerical value of the mass of the washer is

m = (0.33 kg × (0.54 m/s)²) / (9.8 m/s² × 0.44 m) = 0.022 kg = 22 g
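The numerical work of Example 4-5 can be checked with the short sketch below. It is an illustration only; the function name washer_mass is hypothetical, and the calculation assumes, as the example does, that the string is massless and that the washer hangs motionless.

```python
G = 9.8  # magnitude of the gravitational acceleration, m/s^2

def washer_mass(puck_mass_kg, speed_m_s, radius_m, g=G):
    """Mass m of the hanging washer whose weight mg supplies the
    centripetal force M v^2 / r on the puck (Eq. 4-6b)."""
    centripetal_force = puck_mass_kg * speed_m_s ** 2 / radius_m  # inertial role of M
    return centripetal_force / g                                  # gravitational role of m

# Values measured in Example 4-5.
m = washer_mass(puck_mass_kg=0.33, speed_m_s=0.54, radius_m=0.44)
print(f"m = {m:.3f} kg = {m * 1000:.0f} g")
```

The two lines inside the function keep the two roles of mass apart: the first uses only the puck's inertia, the second only the gravitational force on the washer.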


The  mass  m in  the  left  side  of  Eq.  (4-6a)  determines  the  force  that 
gravity  exerts  on  the  washer  suspended  from  one  end  of  the  string.  Since 
for  the  situation  considered  in  the  example  the  washer  never  accelerates 
anyway,  its  inertia  is  of  no  real  consequence.  In  contrast,  the  mass  M in  the 
right  side  of  the  equation  determines  the  inertia  of  the  puck  connected  to 
the other end of the string. Its inertia is significant because it is
accelerating. But since the puck is supported by the air table, whatever force is
exerted downward on it by gravity is exactly canceled by the upward force
exerted on it by the air film on which the puck rides. So the force of gravity on
the  puck  is  of  no  consequence.  You  can  see  now  why  it  is  particularly  easy 
to  distinguish  the  gravitational  function  of  mass  from  its  inertial  function, 
for  the  system  analyzed  in  Example  4-5.  Can  you  invent  another  system 
which  allows  this  to  be  done? 

All  the  systems  we  have  studied  to  illustrate  Newton’s  second  law  also 
contain  many  examples  of  Newton’s  third  law  of  motion.  For  instance,  the 
washer  at  the  lower  end  of  the  string  in  Example  4-5  applies  a downward 
force  to  that  end  of  the  string.  (It  is  the  force,  shown  in  Fig.  4-9c,  pulling  on 
the  lower  end  of  the  string.)  At  the  same  time  the  lower  end  of  the  string 
applies  a force  to  the  washer  of  equal  magnitude  but  directed  upward.  (It  is 
the force, shown in Fig. 4-9b, supporting the washer.) The string and the
washer  interact  because  they  are  connected.  The  interaction  between  the 
two  involves  a pair  of  forces:  the  force  which  the  string  exerts  on  the  washer 
and  the  force  which  the  washer  exerts  on  the  string.  These  two  forces  have 
equal  magnitude  but  opposite  direction. 

Whenever a force is applied to an object, something else must be
interacting with this object in some way. Furthermore, if body 1 is interacting
with body 2, then body 2 necessarily must be interacting with body 1. (That
is  what  we  mean  by  interaction.)  So  if  a force  is  applied  to  body  1 as  a result 
of  its  interaction  with  body  2,  there  must  also  be  a force  applied  to  body  2 as 
a result  of  its  interaction  with  body  1.  Forces  come  in  pairs.  The  two  forces 
in  each  pair  are  related  by  Newton’s  third  law  of  motion:  If  there  is  a force 
exerted  on  body  1 by  body  2,  there  is  also  a force  exerted  on  body  2 by  body  1.  The  two 
forces  have  equal  magnitudes  but  opposite  directions.  Expressed  in  symbols,  the 
third  law  is 

F(on 1 by 2) = -F(on 2 by 1) (4-7)




The pair of forces is called an action-reaction pair. (Which force is the
action and which the reaction is arbitrary; it depends on which member of
the  pair  you  think  about  first.) 

Note  that  the  action  force  is  exerted  on  one  of  the  interacting  bodies 
and  the  reaction  force  on  the  other.  The  two  forces  are  never  exerted  on  the 
same  body.  So  you  should  never  add  them  to  reach  the  false  conclusion  that 
they  produce  a zero  net  force  on  the  same  body  because  they  cancel. 

Note  also  that  Newton’s  third  law  is  not  restricted  to  inertial  reference 
frames.  Motions  are  not  well  defined  unless  there  is  a specification  of  the 
reference  frame  from  which  they  are  observed.  But  this  is  not  true  of 
forces.  Despite  the  fact  that  it  is  called  the  third  law  of  motion,  the  law 
refers  to  forces,  not  to  motions.  In  fact,  the  third  law  applies  no  matter 
what  reference  frame  is  used  to  observe  the  interacting  bodies  that  exert 
forces  on  one  another. 

In illustrating the third law, Newton wrote: Whatever draws or presses
another is as much drawn or pressed by that other. If you press a stone with your
finger, the finger is also pressed by the stone. If a horse draws a stone tied to a rope,
the horse, if I may say so, will be equally drawn back towards the stone; for the
distended rope, by the same endeavour to relax or unbend itself, will draw the horse
as  much  towards  the  stone  as  it  does  the  stone  towards  the  horse,  and  will  obstruct 
the  progress  of  the  one  as  much  as  it  advances  that  of  the  other. 

Example 4-5 provides yet another illustration of an action-reaction
pair. The puck's end of the string applies a radially inward force to the
puck. (This centripetal force produces the puck’s centripetal acceleration;
it is shown in Fig. 4-9b.) The puck applies a radially outward force of the
same magnitude to its end of the string. (This centrifugal force is the force
pulling on the puck’s end of the string, shown in Fig. 4-9c, which prevents
the string from running over the pulley in response to the force pulling on
the washer’s end.) What other instances of action-reaction pairs can you
find in the systems considered in Examples 4-1 through 4-5?

You  should  not  let  the  cases  of  action-reaction  pairs  that  have  been 
cited give you the impression that actual contact between the two
interacting bodies is necessary. Example 4-6 concerns a pair of forces exerted
between  two  bodies  which  interact  while  separated. 


EXAMPLE 4-6

The  7.0-kg  bowling  ball  of  Example  4-2  is  falling  toward  the  earth  because  of  the 
gravitational  force  exerted  on  it  by  the  earth.  The  two  interacting  bodies  are  sepa- 
rated. But  the  separation  is  assumed  to  be  small  compared  to  the  radius  of  the 
earth.  Hence,  the  gravitational  acceleration  of  the  bowling  ball  can  be  assumed  to 
have  the  magnitude  measured  near  the  surface  of  the  earth;  that  is, 
$a = g = 9.8$ m/s$^2$. What is the force exerted on the earth by the bowling ball? If this
force were the only force acting on the earth, what would be the earth's acceleration?
Measurements and calculations described in Chap. 11 show that the mass of
the earth is $6.0 \times 10^{24}$ kg.

■ According to Newton’s third law, you know that

$\mathbf{F}_{\text{on earth by ball}} = -\mathbf{F}_{\text{on ball by earth}}$




Fig.  4-10  A schematic  illustration  of  a bowling  ball  and  the  earth,  ignoring  the  presence  of  all 
other  bodies.  The  size  of  the  ball  has  been  much  exaggerated  for  the  sake  of  clarity.  The  same  is 
true  of  its  separation  from  the  earth's  surface;  the  actual  separation  is  small  compared  to  the 
earth’s  radius.  The  magnitude  of  the  acceleration  of  the  ball  toward  the  earth  has  the  standard 
value  of  the  gravitational  acceleration,  and  the  magnitude  of  the  force  exerted  on  the  ball  by  the 
earth  is  the  ball's  mass  times  that  value.  The  force  exerted  on  the  earth  by  the  ball  is  also  shown. 
According  to  Newton's  third  law,  the  two  forces  have  equal  magnitude  but  opposite  direction. 
The  force  exerted  on  the  earth  causes  it  to  have  an  acceleration  toward  the  ball.  But  since  this 
force  has  the  same  magnitude  as  the  force  exerted  on  the  ball,  Newton’s  second  law  requires 
that  the  mass  of  the  earth  times  the  magnitude  of  the  earth’s  acceleration  be  equal  to  the  mass  of 
the  ball  times  the  magnitude  of  the  ball’s  acceleration.  This  means  the  acceleration  of  the  earth 
has  an  extremely  small  value  compared  to  the  acceleration  of  the  ball,  because  the  mass  of  the 
earth  is  extremely  large  compared  to  the  mass  of  the  ball.  If  the  separation  between  the  ball  and 
the  earth  is  comparable  to,  or  larger  than,  the  earth’s  radius,  then  the  magnitude  of  the 
gravitational  force  exerted  on  the  ball  by  the  earth  is  reduced.  But  Newton's  third  law  still 
applies,  so  there  is  still  a force  of  equal  magnitude  but  opposite  direction  exerted  on  the  earth  by 
the  ball.  Although  no  bowling  balls  are  known  to  be  out  there,  an  example  of  such  a situation  is 
provided  by  the  moon  and  the  earth.  Each  exerts  a gravitational  force  on  the  other,  and  the 
forces  have  equal  magnitudes  but  opposite  directions.  The  action  of  the  earth  on  the  moon 
provides  the  force  required  to  keep  it  in  its  approximately  circular  orbit  about  the  earth.  The 
reaction  force  of  the  moon  on  the  earth  is  best  known  through  the  ocean  tides,  of  which  it  is  the 
principal  underlying  cause. 


The magnitude of the force on the right side of this equation is just the weight $W$ of
the bowling ball. Its value is

$W = mg = 7.0\ \text{kg} \times 9.8\ \text{m/s}^2 = 69\ \text{N}$

Thus you have for the magnitude of the force on the left side

$F_{\text{on earth by ball}} = 69\ \text{N}$

The direction of the force is opposite to the direction of the force exerted on the
bowling ball by the earth. That is, the force exerted on the earth by the bowling ball
is in the direction from the earth’s center to the bowling ball’s center. See Fig. 4-10.

If the only force acting on the earth were the one exerted by the bowling ball,
then it would be the net force acting on the earth. You can determine the magnitude
of the acceleration it would give to the earth by using Newton’s second law:

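$a_{\text{of earth}} = \dfrac{F_{\text{on earth}}}{m_{\text{of earth}}}$

Setting $F_{\text{on earth}} = 69$ N and using the quoted value $m_{\text{of earth}} = 6.0 \times 10^{24}$ kg, you
obtain

$a_{\text{of earth}} = \dfrac{69\ \text{N}}{6.0 \times 10^{24}\ \text{kg}} = 1.2 \times 10^{-23}\ \text{m/s}^2$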

The  direction  of  this  acceleration  is  also  from  the  center  of  the  earth  to  the  center  of 
the  bowling  ball. 

A force  of  magnitude  69  N is  a very  significant  one  in  affecting  the  motion  of 
a body  having  the  inertia  of  the  7.0-kg  ball.  It  gives  the  body  an  acceleration  of 
9.8 m/s$^2$. But such a force is completely negligible, as judged by the effect it would
produce on the motion of a body having the inertia of the earth. The acceleration
would be only $1.2 \times 10^{-23}$ m/s$^2$. Strictly speaking, a bowling ball does not accelerate
toward  an  immovable  earth  when  it  falls.  If  these  were  the  only  bodies  present,  the 
earth  would  accelerate  toward  the  bowling  ball.  But  practically  speaking,  the  accel- 
eration of  the  earth  could  be  neglected  because  it  would  be  immeasurably  small. 
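
The arithmetic of Example 4-6 can be sketched in a few lines of Python. This is only an illustrative check, not part of the original example; the function name `third_law_pair` is a hypothetical choice, and the numerical values are the ones quoted in the example.

```python
# Values quoted in Example 4-6.
M_BALL = 7.0       # mass of the bowling ball, kg
M_EARTH = 6.0e24   # mass of the earth, kg
G_ACCEL = 9.8      # gravitational acceleration near the surface, m/s^2


def third_law_pair(m_ball, m_earth, g):
    """Return (force magnitude in N, earth's acceleration in m/s^2).

    The force on the earth by the ball has the same magnitude as the
    ball's weight (third law); the earth's acceleration then follows
    from the second law, assuming this is the only force on the earth.
    """
    force = m_ball * g          # weight of the ball
    a_earth = force / m_earth   # a = F / m applied to the earth
    return force, a_earth


force, a_earth = third_law_pair(M_BALL, M_EARTH, G_ACCEL)
print(f"force on earth by ball: {force:.1f} N")
print(f"acceleration of earth:  {a_earth:.2e} m/s^2")
```

The example rounds the weight to 69 N before dividing, so its quoted acceleration can differ in the last digit from the unrounded quotient computed here. Either way the acceleration is of order $10^{-23}$ m/s$^2$.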




Newton’s second law in the special form, $\mathbf{F} = m\mathbf{a}$, and Newton’s third
law, $\mathbf{F}_{\text{on 1 by 2}} = -\mathbf{F}_{\text{on 2 by 1}}$, make it possible to analyze successfully the great
majority of systems considered in newtonian mechanics. Many examples
involving their use have been presented in this section. Many more are presented
in the next chapter. But there is a completely general form of the second
law. This general form must be used to study some very important
systems. It can be obtained by following a more fundamental, and sounder,
logical approach than we have followed in this section. The same approach
will clarify the origin of the third law. This approach, presented in the next
three sections, is based on the experimental law of momentum conservation.


4-3 MASS AND MOMENTUM CONSERVATION


We  obtained  a clearer  understanding  of  what  happens  when  a bowling  ball 
falls  toward  the  earth  in  Example  4-6  than  we  did  in  Example  4-2.  The 
reason  is  that  we  considered  the  behavior  of  both  the  earth  and  the  bowling 
ball  in  Example  4-6,  whereas  the  earth  was  ignored  in  Example  4-2.  This 
situation  is  typical.  We  can  go  only  so  far  by  treating  a single  body  with  a 
force  acting  on  it.  After  all,  that  force  must  be  exerted  by  something — that 
is,  by  another  body.  To  analyze  matters  thoroughly,  we  must  at  the  very 
least consider the reaction force exerted on this other body by the first one.

In  this  section  we  study  the  simplest  possible  system  of  bodies  in  which 
Newton’s  laws  can  be  applied  in  a complete  fashion.  The  system  consists  of 
two  bodies  that  interact  with  one  another  but  in  effect  do  not  interact  with 
anything  else.  The  two  bodies  are  effectively  isolated  from  their  environ- 
ment because  no  net  force  is  exerted  on  either  of  them  by  their  environ- 
ment. They  form  what  is  called  an  isolated  system.  We  will  view  the  system 
from  a reference  frame  that  we  can  consider  to  be  an  inertial  frame.  Spe- 
cifically, we  will  investigate  a system  of  two  moving  air  table  pucks  which 
interact  with  each  other  if  and  when  they  collide.  But  in  effect  neither  of 
them  interacts  with  the  nearby  earth.  This  is  because  they  are  on  the  top  of 
the  air  table,  and  the  gravitational  force  exerted  on  each  puck  by  the  earth 
is canceled by the force exerted on it by the supporting air film. Since,
moreover,  the  top  of  the  air  table  is  almost  friction-free,  there  is  no  appre- 
ciable frictional  force  acting  on  the  pucks.  So  essentially  no  net  force  is  ap- 
plied to  either  puck  from  outside  the  system  of  two  pucks,  and  they  can  be 
considered  to  form  an  isolated  system.  We  will  observe  the  pucks  with  a 
strobe camera supported from the earth. Thus the system will be viewed
from  a reference  frame  that  can  be  considered  to  be  an  inertial  frame. 

We  could  accept  the  largely  intuitive  justification  given  for  Newton’s 
second  and  third  laws  in  the  preceding  section  and  then  apply  the  laws  to 
predict  the  behavior  of  this  system.  But  we  will  not  do  so.  Instead,  we  will 
start  afresh  and  will  obtain  Newton’s  second  and  third  laws  from  an  analysis 
of  experiments  performed  on  the  system.  The  analysis  will  be  carried  out 
in a series of steps. The first step is to use the experimental observations to
give  a rigorous  definition  of  the  basic  concept  of  mass.  At  the  same  time  we 
will  be  led  to  consider  the  important  quantity  momentum,  which  is  the  prod- 
uct of  mass  and  velocity,  and  will  find  the  fundamental  law  of  momentum  con- 
servation: the  total  momentum  of  an  isolated  system  viewed  from  an  inertial 
frame  remains  constant.  In  subsequent  steps,  which  we  carry  out  in  Secs. 
4-4  and  4-5,  the  observations  are  used  to  formulate  Newton’s  second  and 
third laws in a logically consistent way. (We do not need to reconsider
Newton’s first law. We already have a satisfactory understanding of this
law,  which  has  to  do  with  the  much  simpler  situation  of  an  isolated  system 
containing  only  one  body.) 

The  analysis  we  will  go  through  in  these  three  sections  involves  only 
the  inertial  properties  of  mass,  not  the  gravitational  properties.  The  relation 
between  mass  and  inertia  will  enter  the  analysis  in  a very  significant  way  be- 
cause the  velocity  of  each  puck  will  change  when  the  pucks  collide.  But  just 
as  it  was  for  the  puck  in  orbit  on  the  top  of  an  air  table  in  Example  4-5,  the 
gravitational  force  is  of  no  consequence  here  in  analyzing  the  behavior  of 
the  pucks.  Whatever  the  force  of  gravity  acting  on  an  air  table  puck  may  be, 
it  is  automatically  canceled  by  the  force  provided  by  the  air  film  supporting 
the  puck.  Therefore  the  relation  between  mass  and  gravitational  force  will 
not  enter  the  analysis.  The  experiments  considered  in  Sec.  4-2  have  taught 
us  already  most  of  what  we  need  to  know  about  the  gravitational  role  of 
mass.  The  remainder  of  what  we  need  to  know  we  will  learn  about  from 
experiments  discussed  in  Sec.  11-2. 

The  system  in  Fig.  4-11  consists  of  two  identical  pucks,  supported  on  an 
air  table.  The  pucks  are  made  out  of  a relatively  hard  plastic.  Puck  1 is  set 
into  motion  by  a launcher  at  the  upper  right.  It  strikes  puck  2,  which  is  ini- 
tially at  rest  near  the  center  of  the  table.  After  puck  1 leaves  the  launcher, 
the  system  of  two  pucks  is  effectively  isolated  from  its  environment. 

The two pucks interact with each other when puck 1 strikes puck 2.
The  actual  interaction  takes  place  over  a very  short  time.  But  while  the  col- 
lision time  is  short,  the  collision  has  a great  effect  on  the  motion  of  both 
pucks.  After  the  collision  occurs,  puck  2 moves  toward  the  bottom  of  the 
strobe  photo,  and  puck  1 moves  in  a changed  direction.  You  can  see  that  the 
speed  of  puck  1 has  also  been  changed  because  it  travels  a shorter  distance 
in the constant strobe flash interval. Can you see any relation between the
initial  velocity  of  puck  1 and  the  final  velocities  of  pucks  1 and  2? 


Fig.  4-11  Strobe  photo  of  a collision  between  two 
identical  plastic  pucks  on  an  air  table.  Puck  1 is 
incident  from  the  upper  right  on  puck  2,  which  is 
initially  at  rest  near  the  center  of  the  table. 




Fig.  4-12  Analysis  of  the  initial  and  final  velocities  of 
the  collision  between  identical  plastic  pucks  shown  in 
Fig.  4-11.  Defining  the  time  unit  as  the  duration  of  the 
interval  between  successive  strobe  light  flashes,  as  in 
Fig.  4-3,  allows  the  velocity  vectors  to  be  constructed 
simply  by  connecting  adjacent  positions  of  the  puck 
centers. 


Figure 4-12 demonstrates that there is such a relation. Initial and final
velocity vectors are constructed in exactly the same way as in Fig. 4-3. The
strobe flash interval is taken to be the unit of time. As a consequence of this
choice, vectors connecting adjacent puck positions are velocity vectors because
they give the displacements per unit time. The initial velocity of puck
1 is given by the dashed white vector labeled $\mathbf{v}_{1i}$. Its final velocity is the
dashed gray vector $\mathbf{v}_{1f}$. Puck 2 has zero initial velocity. Its final velocity is
the dashed gray vector $\mathbf{v}_{2f}$. The relation between the initial and final velocities
is seen by moving $\mathbf{v}_{1f}$ and $\mathbf{v}_{2f}$ together to construct their sum $\mathbf{v}_{1f} + \mathbf{v}_{2f}$.
Then this vector and the vector $\mathbf{v}_{1i}$ are moved together so that they can be
compared. The comparison is shown by the solid gray and white vectors.
Within the accuracy of the experiment, these two are equal. That is,

$\mathbf{v}_{1f} + \mathbf{v}_{2f} = \mathbf{v}_{1i}$

We use the symbol $\mathbf{v}_{2i}$ for the initial velocity of puck 2 (which has zero
value), and write this experimental relation between the initial and final velocities
of the identical pucks in the form

$\mathbf{v}_{1f} + \mathbf{v}_{2f} = \mathbf{v}_{1i} + \mathbf{v}_{2i}$  where  $\mathbf{v}_{2i} = 0$


We would like to see whether this relation might be valid for the more
general case where $\mathbf{v}_{2i} \neq 0$, that is, where both pucks are moving before the
collision. The experiment recorded in Fig. 4-13 is a collision between the
same two identical plastic pucks, which tests the hypothesis. This time
puck 1 is launched from the upper right of the photograph with velocity $\mathbf{v}_{1i}$
to collide with puck 2 as it is moving from the left with velocity $\mathbf{v}_{2i}$. After
the collision, puck 1 recoils to the bottom of the picture with velocity $\mathbf{v}_{1f}$,
and puck 2 recoils to the lower left with velocity $\mathbf{v}_{2f}$.

Fig. 4-13 Another collision between the identical
plastic pucks. In this case both pucks 1 and 2 are
moving initially; 1 comes from the upper right, and 2
comes from the left.


The analysis of the velocity vectors is like that in Fig. 4-12. The two initial
velocities are added. The same is done for the two final velocities. Then
the sums of the initial and final velocities are compared. Again, we find
these sums are equal within experimental accuracy. So the rule for this
more general collision between the identical plastic pucks is

$\mathbf{v}_{1f} + \mathbf{v}_{2f} = \mathbf{v}_{1i} + \mathbf{v}_{2i}$  (4-8)

The collision does not change the sum of the velocities of the identical pucks
in this isolated system viewed from an inertial frame.

If  we  investigate  any  other  collision  between  identical  plastic  pucks,  we 
find  that  the  same  relation  holds.  It  is  also  valid  for  collisions  between  iden- 
tical pucks  which  interact  in  ways  that  are  quite  different  from  the  way 
plastic  pucks  interact  when  they  collide.  We  show  this  by  studying  a colli- 
sion between  two  pucks  with  magnets  mounted  in  them  so  that  they  repel 
each  other.  Thus  they  “collide”  without  actually  touching.  They  have  a 
magnetic  interaction  instead  of  the  contact  interaction  that  takes  place  between 
plastic  pucks  when  touching. 

The  magnetic  puck  collision  is  of  the  kind  called  elastic.  An  elastic  colli- 
sion is  defined  as  a collision  in  which  the  speed  of  one  body  relative  to  the 
other  as  they  move  apart  after  the  collision  has  the  same  value  as  the  speed 
of  one  relative  to  the  other  as  the  two  bodies  move  together  before  the  colli- 
sion. That  is,  the  condition  for  an  elastic  collision  is 

$|\mathbf{v}_{1f} - \mathbf{v}_{2f}| = |\mathbf{v}_{1i} - \mathbf{v}_{2i}|$  (4-9a)

The quantity $\mathbf{v}_{1f} - \mathbf{v}_{2f}$ is the final velocity of puck 1 relative to puck 2, since
it is the velocity of puck 1 as seen by an (imaginary) observer moving with
puck 2. You can verify this by letting $\mathbf{v}_{2f} = \mathbf{V}$ in the Galilean velocity transformation,
Eq. (3-57). The magnitude of this quantity, $|\mathbf{v}_{1f} - \mathbf{v}_{2f}|$, is therefore
the final relative speed of the two pucks. Similarly, $|\mathbf{v}_{1i} - \mathbf{v}_{2i}|$ is their
initial relative speed. The plastic puck collisions are said to be inelastic since

$|\mathbf{v}_{1f} - \mathbf{v}_{2f}| < |\mathbf{v}_{1i} - \mathbf{v}_{2i}|$  (4-9b)

(You can easily show the inelastic character of a plastic puck collision by
going back to Fig. 4-12. Construct graphically the velocity at which puck 1
recedes from puck 2 after the collision by taking the vector difference
$\mathbf{v}_{1f} - \mathbf{v}_{2f}$ between their two velocities. Then compare its magnitude to the
magnitude of $\mathbf{v}_{1i}$, the velocity at which puck 1 approaches the initially stationary
puck 2 before the collision.)


Fig. 4-14 A “collision” between two identical magnetic
pucks. Puck 1 comes from the upper right, and
puck 2 is initially at rest near the center of the air table.
Note that there is never any actual contact between the
pucks.
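
The graphical test described in the parenthetical note can also be carried out numerically. The sketch below compares the relative speeds before and after a collision, following the conditions labeled (4-9a) and (4-9b). The function names, the tolerance `tol`, and the sample velocity components are all illustrative assumptions; they are not measurements taken from the strobe photos.

```python
import math


def relative_speed(v1, v2):
    """Magnitude of the vector difference v1 - v2 (2-D velocities)."""
    return math.hypot(v1[0] - v2[0], v1[1] - v2[1])


def classify_collision(v1i, v2i, v1f, v2f, tol=1e-9):
    """Compare relative speeds before and after a collision.

    Returns "elastic" if they agree within tol, "inelastic" if the
    final relative speed is smaller, and "other" otherwise.
    The tolerance is a hypothetical stand-in for experimental accuracy.
    """
    before = relative_speed(v1i, v2i)
    after = relative_speed(v1f, v2f)
    if math.isclose(before, after, rel_tol=tol, abs_tol=tol):
        return "elastic"
    return "inelastic" if after < before else "other"


# Made-up velocities for identical pucks, puck 2 initially at rest.
# Both sets satisfy v1f + v2f = v1i + v2i.
elastic_case = classify_collision((2, 0), (0, 0), (1, 1), (1, -1))
inelastic_case = classify_collision((2, 0), (0, 0), (1, 0.5), (1, -0.5))
print(elastic_case, inelastic_case)
```

With real strobe data the tolerance would have to be set to the estimated measurement error, since two measured relative speeds will never agree exactly.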

In Fig. 4-14 puck 1 is a magnetic puck launched with velocity $\mathbf{v}_{1i}$ from
the upper right of the photograph. It collides (without actually touching)
with puck 2, an identical initially stationary magnetic puck. After the collision,
puck 2 moves with velocity $\mathbf{v}_{2f}$ toward the bottom of the picture, while
puck 1 moves with velocity $\mathbf{v}_{1f}$ toward the left. Note that the trajectories of
the pucks after the collision make a 90° angle. This is a hallmark of elastic
collisions at nonrelativistic speeds between identical bodies, one of which is
initially stationary. If you look back at Fig. 4-12, you will see that there the
angle between the trajectories is less than 90°. The angle is smaller because
the pucks do not move apart as rapidly after an inelastic collision. [You can
show that the collision in Fig. 4-14 is elastic by following the instructions
given in parentheses below Eq. (4-9b).]
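
The 90° angle can be seen to follow from the two relations already stated. The short derivation below is a sketch under two assumptions: the pucks are identical with puck 2 initially at rest, and the scalar (dot) product of vectors, which this section does not itself use, is available.

```latex
% Identical pucks, puck 2 initially at rest (v_{2i} = 0).
\begin{align*}
\mathbf{v}_{1f} + \mathbf{v}_{2f} &= \mathbf{v}_{1i}
  && \text{Eq. (4-8) with } \mathbf{v}_{2i} = 0 \\
|\mathbf{v}_{1f} - \mathbf{v}_{2f}|^2 &= |\mathbf{v}_{1i}|^2
  = |\mathbf{v}_{1f} + \mathbf{v}_{2f}|^2
  && \text{Eq. (4-9a), squared} \\
v_{1f}^2 - 2\,\mathbf{v}_{1f}\cdot\mathbf{v}_{2f} + v_{2f}^2
  &= v_{1f}^2 + 2\,\mathbf{v}_{1f}\cdot\mathbf{v}_{2f} + v_{2f}^2 \\
\mathbf{v}_{1f}\cdot\mathbf{v}_{2f} &= 0
\end{align*}
```

A vanishing scalar product means the two final velocities are perpendicular, provided neither is zero. If the collision is inelastic, the equality in the second line becomes "less than," the scalar product comes out positive, and the angle between the trajectories is smaller than 90°, in agreement with Fig. 4-12.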

The manipulations of the velocity vectors in the elastic collision of Fig.
4-14 show you that Eq. (4-8), $\mathbf{v}_{1f} + \mathbf{v}_{2f} = \mathbf{v}_{1i} + \mathbf{v}_{2i}$, is valid. Thus the equation
applies to this elastic collision between identical pucks that interact
magnetically, just as it applies to inelastic collisions between identical pucks
that interact on contact.

So  far  we  have  considered  only  collisions  between  identical  pucks.  Now 
we  will  investigate  what  happens  when  the  pucks  are  not  identical.  We  will 
find  that  the  velocity  relation  for  identical  pucks  does  not  hold.  But  we  will 
also find a completely general relation, only slightly more complicated, which
is  satisfied,  and  which  includes  the  velocity  relation  as  a special  case. 

On the basis of common experience and the discussion in Sec. 4-2, it is
easy to guess that the relation for the collision of nonidentical pucks will involve
their masses. But in the discussion which follows we will abandon the
definition of mass given in Sec. 4-2, and begin anew. While that definition is
perfectly satisfactory for most practical purposes, it has a serious difficulty.
We gave the name “mass” to the proportionality constant $m$ which appears
in Newton’s second law, $\mathbf{F} = m\mathbf{a}$. But the “force” $\mathbf{F}$ had not been very adequately
defined. Indeed, we later defined it in terms of mass; the procedure
smacks of being a logically circular argument. Out of the following
discussion will come a resolution of the difficulty.

Before  the  next  experiment  was  performed,  three  identical  magnetic 
pucks  were  procured.  Two  of  them  were  glued  together,  one  on  top  of 
the  other,  to  form  a double  puck,  called  puck  2.  The  collision  investigated 
took  place  between  the  double  puck  and  the  remaining  single  puck,  called 
puck  1. 

Figure 4-15 shows the collision process. Puck 1 is launched with velocity
$\mathbf{v}_{1i}$ at the initially stationary puck 2. After they collide, puck 1 has velocity
$\mathbf{v}_{1f}$, and puck 2 has velocity $\mathbf{v}_{2f}$. You can tell immediately that things are
different  from  the  experiment  of  Fig.  4-14,  which  used  two  identical  mag- 
netic pucks.  Because  the  pucks  are  magnetic  in  the  present  experiment,  the 
collision  is  elastic.  This  can  be  verified  by  constructing  the  vector  dif- 
ferences of  Eq.  (4-9).  But  the  angle  between  the  two  final  trajectories  is  no 
longer  90°.  The  comparison  in  Fig.  4-15  of  the  sums  of  the  initial  and  final 
velocities  shows  that  things  are  indeed  different  in  this  case  of  colliding 
nonidentical  pucks.  Contrary  to  all  the  identical-puck  cases,  the  sum  of  the 
velocities  after  the  collision  is  not  the  same  as  the  sum  of  the  velocities  be- 
fore the  collision.  Can  we  find  a quantity  that  is  unchanged  by  the  collision? 

We  may  guess  that  what  is  missing  from  the  “after  = before”  equation 
is  the  factor  mass.  Whatever  mass  may  be,  puck  2,  the  double  puck,  has 
twice as much of it as puck 1, the single puck. That is, $m_2/m_1 = 2$. Therefore
it seems reasonable to try multiplying the vector $\mathbf{v}_{2f}$ of the double puck
by 2 before adding it to the vector $\mathbf{v}_{1f}$ of the single puck. This is done in Fig.
4-16, and it works quite well!


Fig.  4-15  A collision  between  a single  and  double 
magnetic  puck.  The  single  puck,  called  puck  1,  comes 
from  the  upper  right,  and  the  double  puck,  called 
puck  2,  is  initially  at  rest  near  the  center  of  the  air 
table. 




Fig.  4-16  An  analysis  of  the  collision  between  the 
single  and  double  magnetic  pucks,  shown  in  Fig.  4-15, 
in  which  the  velocity  vector  of  the  double  puck  is 
doubled  in  magnitude. 


The guess is tried again in the experiment shown in Fig. 4-17. Puck 1,
the single magnetic puck, comes in initially from the upper right with
velocity $\mathbf{v}_{1i}$. It recoils after the collision to the upper left with velocity $\mathbf{v}_{1f}$.
Puck 2, the double magnetic puck, comes in from the left with velocity $\mathbf{v}_{2i}$
and recoils to the lower right with velocity $\mathbf{v}_{2f}$. The initial and final velocity
vectors  for  the  double  puck  are  doubled  to  account  for  the  double  mass, 
and  then  each  is  added  to  the  corresponding  velocity  vector  for  the  single 
puck.  Here  again  the  sum  of  the  initial  vectors  is  equal  to  the  sum  of  the 
final  vectors,  within  the  accuracy  of  the  measurement. 


Fig.  4-17  A collision  between  a single  and  a double 
magnetic  puck.  In  this  case  both  puck  1,  the  single 
puck, and puck 2, the double puck, are moving
initially;  1 comes  from  the  upper  right,  and  2 comes 
from  the  left. 




Thus  the  results  of  both  of  the  nonidentical  puck  experiments  that  we 
have  studied  are  in  agreement  with  the  equation 

$\mathbf{v}_{1f} + \dfrac{m_2}{m_1}\mathbf{v}_{2f} = \mathbf{v}_{1i} + \dfrac{m_2}{m_1}\mathbf{v}_{2i}$  (4-10)

We  therefore  take  a bold  step  and  define  the  mass  ratio  by  Eq.  (4-10).  In 
other  words,  we  define  mass  ratio  as  follows:  Make  bodies  1 and  2 collide  while 
effectively  isolated  from  their  external  environment,  and  use  an  inertial  frame  to 
measure their initial and final velocities. Then find the value of $m_2/m_1$ for which
$\mathbf{v}_{1f} + (m_2/m_1)\mathbf{v}_{2f}$ equals $\mathbf{v}_{1i} + (m_2/m_1)\mathbf{v}_{2i}$. This value is the ratio of the mass of
body  2 to  the  mass  of  body  1.  The  definition  of  mass  ratio  that  we  have  given  is 
founded  on  just  two  things:  (1)  the  definition  of  velocity  (which  itself  de- 
pends only  on  the  definitions  of  distance  and  time)  and  (2)  a vast  amount  of 
experimental  evidence  exemplified  by  the  two  experiments  we  have  dis- 
cussed. 

Once  a way  of  defining  mass  ratio  is  available,  the  procedure  for  de- 
fining mass  itself  is  just  a matter  of  agreeing  on  the  unit  of  mass.  The  unit 
could  be  the  mass  of  a standard  air  table  puck.  Instead  it  usually  is  the  mass 
of  a standard  kilogram.  Thus  we  have  the  following  definition  for  mass: 
Perform a collision experiment between bodies 1 and 2 in which body 1 has a mass $m_1$
equal to 1 kg, and determine the mass ratio $m_2/m_1$ from the preceding definition. The
mass $m_2$ of body 2 is $m_2/m_1$ kg. With this procedure for defining mass we
avoid  the  circular  reasoning  involved  in  the  more  intuitive  definition  of 
Sec.  4-2. 
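
The operational definition of mass ratio can be sketched as a small calculation. Rearranging the defining relation gives $\mathbf{v}_{1f} - \mathbf{v}_{1i} = -(m_2/m_1)(\mathbf{v}_{2f} - \mathbf{v}_{2i})$, so the ratio is the factor relating the two velocity changes. The code below projects one change onto the other to extract that factor, which is one reasonable choice when measured vectors are not exactly antiparallel; it is not a procedure prescribed by the text. The function name and the sample velocities are hypothetical.

```python
def mass_ratio(v1i, v2i, v1f, v2f):
    """Estimate m2/m1 from initial and final 2-D velocities.

    Uses dv1 = -(m2/m1) * dv2, where dv is each body's change in
    velocity, and solves for the ratio by projecting dv1 onto dv2.
    """
    dv1 = (v1f[0] - v1i[0], v1f[1] - v1i[1])
    dv2 = (v2f[0] - v2i[0], v2f[1] - v2i[1])
    dv2_squared = dv2[0] ** 2 + dv2[1] ** 2
    if dv2_squared == 0:
        raise ValueError("body 2 did not change velocity; no collision")
    dot = dv1[0] * dv2[0] + dv1[1] * dv2[1]
    return -dot / dv2_squared


# Made-up velocities (m/s) constructed for a double puck hit by a
# single puck; they are not read from Fig. 4-15.
ratio = mass_ratio(v1i=(3, 0), v2i=(0, 0), v1f=(1, -1), v2f=(1, 0.5))
print(ratio)

# If body 1 is the 1-kg standard, the mass of body 2 in kg equals the ratio.
m2_kg = ratio * 1.0
```

Note that only velocities enter the calculation, which is the point of the definition: no prior notion of force or mass is needed.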


Multiplying Eq. (4-10) through by the mass $m_1$, we have

$m_1\mathbf{v}_{1f} + m_2\mathbf{v}_{2f} = m_1\mathbf{v}_{1i} + m_2\mathbf{v}_{2i}$  (4-11)

It  is  convenient  to  have  a name  for  the  quantity  mass  times  velocity.  So  we 
give  it  the  name  momentum  and  the  symbol  p.  That  is,  by  definition  the  mo- 
mentum p of  a body  equals  its  mass  m times  its  velocity  v: 

$\mathbf{p} = m\mathbf{v}$  (4-12)

The  momentum  is  a vector  since  it  is  the  product  of  a scalar  (the  mass)  and 
a vector  (the  velocity). 

The  fact  that  the  sum  of  the  momenta  of  a system  is  not  affected  by  a 
collision  of  its  constituent  parts  does  not  depend  on  the  collision  being 
elastic.  In  order  to  demonstrate  this,  we  do  an  experiment  similar  to  the  last 
one,  but  using  plastic  pucks  which  collide  inelastically.  Three  identical 
plastic  pucks  are  obtained.  One  of  them  is  used  as  puck  1,  and  the  other 
two are glued together to form puck 2. Figure 4-18 shows puck 1 coming
from the upper right with momentum $\mathbf{p}_{1i}$ and colliding with puck 2 coming
from the left with momentum $\mathbf{p}_{2i}$. For convenience, these momentum
vectors  are  drawn  to  such  a scale  that  the  length  of  the  vector  showing  the 
initial  momentum  of  puck  1 is  the  same  as  the  length  of  the  vector  that 
would  be  drawn  in  the  manner  of  the  preceding  figures  to  show  its  initial 
velocity.  With  this  scale,  the  initial  momentum  vector  of  puck  2 is  twice  as 
long  as  its  initial  velocity  vector  would  be.  This  is  because  puck  2 has  twice 
as  much  mass  as  puck  1.  After  the  collision  pucks  1 and  2 move  away  from 
each other with the momenta labeled $\mathbf{p}_{1f}$ and $\mathbf{p}_{2f}$, respectively. The momentum
vector analysis is shown in the figure. As in all experimental situations,
the results are not perfect. But they are certainly good enough to give
one more verification of the relation obtained by using Eq. (4-12) in Eq.
(4-11):

$\mathbf{p}_{1f} + \mathbf{p}_{2f} = \mathbf{p}_{1i} + \mathbf{p}_{2i}$

or

$(\mathbf{p}_{\text{total}})_{\text{final}} = (\mathbf{p}_{\text{total}})_{\text{initial}}$  (4-13)


Fig. 4-18 The momenta in a collision between a
single and a double plastic puck. Puck 1, the single
puck, comes from the upper right, and puck 2, the
double puck, comes from the left. The pucks appear
to collide without quite touching. But this is only
because the strobe light did not happen to flash at the
instant of collision. A click heard by the experimenter
at the moment of collision made it clear that there was
actual contact.

We  have  found  the  law  of  momentum  conservation.  This  fundamental 
law  of  physics  can  be  stated  as  follows:  An  observer  in  an  inertial  frame  sees  a 
system  experiencing  no  net  interaction  with  its  environment  maintain  a constant  total 
momentum. An equivalent statement is: The total momentum is conserved in an
isolated system observed from an inertial frame.

The momentum conservation law certainly is in agreement with the results
of all the identical-puck experiments. To see this, write Eq. (4-11) explicitly
for the case $m_1 = m_2$. Then cancel the masses, and you will obtain
Eq.  (4-8).  In  fact,  there  is  a tremendous  amount  of  experimental  evidence 
taken  from  many  different  fields  of  physics  which  shows  that  the  law  of  mo- 
mentum conservation  holds  in  any  system  of  two  or  more  particles  of  any 
type  that  interact  with  one  another  in  any  way,  provided  the  system  is  effec- 
tively isolated  from  all  external  influences  and  is  viewed  from  an  inertial 
reference  frame.  Our  search  for  a rigorous  definition  of  mass  has  been 
very  successful.  Not  only  have  we  found  the  desired  definition,  but  also  we 
have  found  the  most  broadly  applicable  conservation  law  in  mechanics. 

The  next  example  illustrates  the  use  of  momentum  conservation  in  a 
simple  collision  problem. 


EXAMPLE 4-7

A puck of mass $m_1 = 2$ kg moves with a speed $v_{1i} = 3$ m/s in the positive $x$ direction.
It hits “head-on” a second puck of mass $m_2 = 4$ kg, which is initially at rest.
There is a drop of instant-setting glue on the second puck, so that the two stick
together in a totally inelastic collision. Find the final velocities of the two pucks.

■ Puck 2 is initially stationary. Thus you have $\mathbf{v}_{2i} = 0$, and Eq. (4-11) reads

$m_1\mathbf{v}_{1f} + m_2\mathbf{v}_{2f} = m_1\mathbf{v}_{1i}$

Since the two pucks are joined after they collide, they have the same final velocity,
so that

$\mathbf{v}_{1f} = \mathbf{v}_{2f}$

Using this equation in the equation above and then factoring, you obtain

$(m_1 + m_2)\mathbf{v}_{1f} = m_1\mathbf{v}_{1i}$

You can see that the direction of $\mathbf{v}_{1f}$ must be the same as that of $\mathbf{v}_{1i}$. As for magnitudes,
you have

$(m_1 + m_2)v_{1f} = m_1 v_{1i}$

or

$v_{1f} = \dfrac{m_1}{m_1 + m_2}\, v_{1i}$

Inserting the numerical values, you obtain

$v_{1f} = \dfrac{2\ \text{kg}}{2\ \text{kg} + 4\ \text{kg}} \times 3\ \text{m/s} = 1\ \text{m/s}$
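
The same calculation can be sketched in code for any pair of bodies that stick together. The function name `stuck_together_velocity` is a hypothetical choice; the numbers are those of Example 4-7, with velocities written as (x, y) components.

```python
def stuck_together_velocity(m1, v1i, m2, v2i):
    """Common final velocity after a totally inelastic collision.

    Momentum conservation with v1f = v2f = v gives
    (m1 + m2) * v = m1 * v1i + m2 * v2i, component by component.
    """
    total_mass = m1 + m2
    return tuple((m1 * a + m2 * b) / total_mass for a, b in zip(v1i, v2i))


# Example 4-7: 2-kg puck at 3 m/s along +x hits a 4-kg puck at rest.
v_final = stuck_together_velocity(2.0, (3.0, 0.0), 4.0, (0.0, 0.0))
print(v_final)  # (1.0, 0.0), i.e. 1 m/s in the +x direction

# Check that total momentum is the same before and after (kg*m/s).
p_before = (2.0 * 3.0 + 4.0 * 0.0, 0.0)
p_after = tuple((2.0 + 4.0) * c for c in v_final)
```

Unlike the formula in the example, this form does not assume that puck 2 starts at rest; setting `v2i` to zero recovers the result above.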


Because  physicists  are  convinced  that  the  law  of  momentum  conserva- 
tion is  universally  applicable,  they  can  use  it  in  nuclear  and  elementary-par- 
ticle physics  to  determine  experimentally  the  mass  of  a newly  discovered 


Fig.  4-19  Bubble-chamber  photographs  of  a collision  between  a pion  and  a proton.  The  pion  enters  the  chamber,  which  contains  liquid 
hydrogen,  from  the  left.  As  it  passes  near  the  hydrogen  atoms,  its  electric  charge  results  in  ionization  of  some  of  them  — that  is,  removal  of 
their  electrons.  Under  proper  conditions  this  leads  to  boiling  of  the  liquid  in  the  vicinity  of  the  ions,  and  the  track  consists  of  the  tiny 
bubbles  visible  in  the  photo.  Because  the  plane  containing  the  track  of  the  entering  pion  and  the  tracks  of  the  scattered  pion  and  proton  is 
arbitrary,  a pair  of  photos  taken  simultaneously  from  two  different  angles  is  required  to  make  quantitative  measurements.  In  this  pair,  the 
lower  photo  is  a view  looking  down  on  the  collision  through  the  flat  top  of  the  glass  chamber.  The  upper  photo  was  taken  through  the  flat 
left  side  of  the  chamber,  as  seen  by  an  observer  looking  along  the  path  of  the  incident  pion.  (The  upper  photo  may  be  regarded  as  having 
been  “folded  upward.”  That  is,  the  bottom  edge  of  the  left  side  of  the  chamber  is  seen  at  the  top  of  the  photo.)  Liquid  hydrogen  consists 
entirely  of  protons  and  electrons,  the  latter  having  negligible  mass  compared  to  the  pion.  After  traveling  through  about  four-fifths  of  the 
10-cm  length  of  the  bubble  chamber  without  having  come  very  close  to  any  of  the  relatively  massive  but  tiny  protons,  the  pion  by  chance 
collides  with  a proton  whose  initial  velocity  is  essentially  zero  and  whose  mass  is  an  order  of  magnitude  greater  than  that  of  the  pion.  The 
pion  is  deflected  through  a large  angle;  it  moves  downward  (as  seen  in  the  upper  photo),  to  the  right  (as  seen  in  the  lower  photo)  and  back 
the  way  it  came  (as  seen  in  both  photos).  It  leaves  the  bubble  chamber  through  the  bottom  surface.  The  proton  recoils  through  a relatively 
short  distance  before  coming  to  rest  as  a result  of  repeated  collisions  with  hydrogen  atoms.  A few  large  bubbles  of  hydrogen  are  visible 
in  the  photos,  as  well  as  the  tracks  of  energetic  electrically  charged  particles  not  involved  in  the  pion-proton  collision.  The  faint  grid  lines 
are  1 cm  apart.  These  historic  1956  photographs  were  made  in  the  course  of  the  first  experiment  ever  performed  using  a liquid  hydrogen 
bubble  chamber.  (Courtesy  Roger  H.  Hildebrand,  Enrico  Fermi  Institute,  University  of  Chicago.) 




particle.  This  is  done  by  first  measuring  the  initial  and  final  velocities  in  a 
collision  between  it  and  a particle  of  known  mass  and  then  determining  the 
mass  ratio  which  will  satisfy  momentum  conservation.  Such  a collision 
between  microscopic  particles  is  shown  in  the  bubble  chamber  picture  of 
Fig.  4-19.  Example  4-8  carries  through  the  analysis  for  a collision  between 
macroscopic  particles. 


EXAMPLE  4-8 

Figure 4-20 shows a collision between an incident puck of unknown mass approaching
from the upper left and an initially stationary puck which it strikes.
Using  the  mass  of  the  struck  puck  to  define  a unit  mass  and  employing  momentum 
conservation,  determine  the  mass  of  the  incident  puck  in  terms  of  this  unit. 

■ First, take advantage of your intuition to guess whether the incident puck is
more or less massive than the struck puck. Imagine a collision between a moving billiard
ball (a body of small mass) and a stationary bowling ball (a body of large mass).
Then imagine a collision between a moving bowling ball and a stationary billiard
ball. In which case would you expect that both balls move off after the collision in
the same general direction as the direction of motion of the incident ball?

Now  that  you  have  guessed  that  the  incident  puck  is  more  massive  than  the 
struck  puck,  try  different  values  for  the  unknown  mass  of  the  incident  puck  until 
you  find  a value  which  satisfies  the  condition  that  the  total  momentum  of  the  system 
be  conserved. 

In Fig. 4-21 the momentum conservation analysis is carried out using $m_{\text{str}} = 1$
mass unit for the mass of the struck puck and the three trial values $m_{\text{inc}} = 2$ mass
units, 3 mass units, 4 mass units for the mass of the incident puck. For each value a
construction compares the final momenta of the two pucks (with the vectors arranged
so that their sum is apparent) and the initial momentum of the incident
puck (the only one that has an initial momentum). This is done, as in Fig. 4-18, by
drawing the momentum vectors to a scale in which the length of the momentum
vector of the struck puck is the same as the length of its velocity vector. The constructions
show that the total final momentum of the system equals its total initial
momentum, within experimental accuracy, only for the choice $m_{\text{inc}} = 3$ mass units.
That is, (3 mass units)$\mathbf{v}_{\text{inc},f}$ + (1 mass unit)$\mathbf{v}_{\text{str},f}$ = (3 mass units)$\mathbf{v}_{\text{inc},i}$. Thus the


Fig.  4-20  A collision  between  a puck  of  unknown 
mass  incident  from  the  upper  left  on  a puck  of  unit 
mass  that  is  initially  at  rest  near  the  center  of  the  air 
table.  The  incident  puck  is  labeled  "Incident"  and  the 
struck  puck  is  labeled  “Struck.” 




Fig.  4-21  Analysis  of  the  initial  and  final  momenta  in 
the  puck  collision  shown  in  Fig.  4-20.  The  dots  used  to 
construct  the  velocity  vectors  are  at  the  centers  of  the 
puck  images  in  that  figure. 


mass  of  the  incident  puck  is  determined  to  be  approximately  3 mass  units,  where 
the mass unit is the mass of the struck puck. Of course, more accurate velocity measuring
techniques would lead to a more accurate determination of the mass.

You  may  wish  to  analyze  the  initial  and  final  relative  speeds  to  determine 
whether  the  collision  was  elastic  or  inelastic.  But  it  makes  no  difference  as  far  as  the 
mass  determination  is  concerned  because  momentum  conservation  always  holds, 
independent  of  the  type  of  interaction  involved  in  the  collision. 
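The trial-and-error search in this example can also be done numerically. The sketch below replaces the graphic construction with a least-squares projection, which is a substitute technique and not the method of Fig. 4-21. The velocity components are hypothetical numbers invented for illustration; they were not measured from Fig. 4-20.

```python
def incident_mass(v_inc_i, v_inc_f, v_str_f, m_str=1.0):
    """Mass of the incident puck in units of the struck puck's mass.

    With the struck puck initially at rest, momentum conservation gives
        m_inc * (v_inc_i - v_inc_f) = m_str * v_str_f
    Projecting both sides onto the incident puck's velocity change
    yields the least-squares value of m_inc for imperfect data.
    """
    dvx = v_inc_i[0] - v_inc_f[0]
    dvy = v_inc_i[1] - v_inc_f[1]
    return m_str * (v_str_f[0] * dvx + v_str_f[1] * dvy) / (dvx ** 2 + dvy ** 2)


# HYPOTHETICAL velocity vectors (x, y) in arbitrary units.
v_inc_i = (3.0, -1.0)   # incident puck before the collision
v_inc_f = (2.0, -1.5)   # incident puck after the collision
v_str_f = (3.0, 1.5)    # struck puck after the collision

m_inc = incident_mass(v_inc_i, v_inc_f, v_str_f)
print(m_inc)
```

With real strobe data the two sides of the momentum equation will not be exactly parallel, and the projection then gives the single mass value that comes closest to satisfying conservation.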


The  procedure  used  for  a particular  case  in  Example  4-8,  and  stated  for  the 
general case below Eq. (4-10), constitutes an operational definition of mass. It defines
the quantity mass, in terms of an agreed-upon unit of mass, by specifying a
set of operations for measuring the quantity. As you continue your study of physics,
you will find that operational definitions are used quite frequently.

Operational  definitions  deliberately  avoid  answering  such  questions  as:  What 
is  mass,  really?  They  may  therefore  strike  you  at  first  as  unsatisfactory  skirtings  of 
the  really  interesting  questions  in  physics.  But  the  answer  to  such  a question  as 
What  is  mass,  really?  is  something  like:  “Mass  is  the  amount  of  matter.”  This 
answer  only  convinces  you  that  you  understand  something,  when  you  have 
merely  swept  the  issue  under  the  rug  by  defining  the  quantity  in  question  in  terms 
of still another undefined quantity. It does not help to say that “Mass is the product
of density and volume.” Since density is mass divided by volume, such a definition
commits the logical error of defining a quantity in terms of itself. In contrast,
the operational definition is logically consistent, and it tells you what you
really  need  to  know  to  do  physics:  How  to  measure  the  quantity  defined. 

A question  of  a quite  different  nature  is:  What  is  the  physical  effect  of  mass? 
This  question  has  two  satisfactory  answers.  The  first  one  is:  “The  mass  of  a body  is 
a measure  of  its  inertia,  that  is,  how  much  it  resists  changes  in  its  velocity.”  In  an 
interaction  between  two  bodies  of  the  same  mass,  any  change  in  the  velocity  of 
one  is  exactly  compensated  for  by  an  equal  but  oppositely  directed  change  in  the 
velocity of the other. Equation (4-11) shows that when bodies of unequal mass interact,
the ratio of the changes in their velocities is the negative of the reciprocal of
their  mass  ratio.  The  more  massive  body  experiences  the  smaller  velocity  change 
because  it  has  more  inertia. 
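The statement about velocity changes can be written out explicitly. The lines below are a short sketch, assuming Eq. (4-11) has the two-body form used in Example 4-7, and writing $\Delta\mathbf{v}$ for final minus initial velocity (a notation introduced here only for illustration).

```latex
m_1\mathbf{v}_{1f} + m_2\mathbf{v}_{2f} = m_1\mathbf{v}_{1i} + m_2\mathbf{v}_{2i}
\quad\Longrightarrow\quad
m_1\,\Delta\mathbf{v}_1 = -\,m_2\,\Delta\mathbf{v}_2
\quad\Longrightarrow\quad
\Delta\mathbf{v}_1 = -\frac{m_2}{m_1}\,\Delta\mathbf{v}_2
```

For the pucks of Example 4-7, $\Delta v_1 = 1 - 3 = -2$ m/s and $\Delta v_2 = 1 - 0 = 1$ m/s, and indeed $-2 = -(4/2)(1)$.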

The  second  answer  to  the  question  about  the  physical  effect  of  mass  is:  “The 
mass  of  a body  is  a measure  of  the  strength  of  the  gravitational  force  that  some 




other particular body will exert on it when the two bodies have a certain separation.”
For instance, Eq. (4-2) shows that the mass of a body determines how much
gravitational  force  the  earth  exerts  on  it  when  its  separation  from  the  earth  is  small 
compared  to  the  earth’s  radius.  Experiments  of  extremely  high  precision  show 
that  the  mass  of  a body  measured  by  its  gravitational  effect  always  equals  the  mass 
measured by its inertial effect. Thus the gravitational effect of mass provides a second
way of defining mass operationally. As an example, the mass of the earth is
measured  by  (in  other  words,  defined  in  terms  of)  the  strength  of  the  gravitational 
force  that  the  earth  exerts  on  some  other  body  of  known  mass  when  the  two  bodies 
have  a certain  separation.  We  have  more  to  say  about  these  matters  in  Chap.  11. 

To  summarize,  the  procedure  stated  below  Eq.  (4-10)  gives  us  a rigorous  way 
of  defining  the  basic  quantity  mass  in  terms  of  its  inertial  effect.  This  definition  is 
of  great  theoretical  significance  because  it  puts  newtonian  mechanics  on  a firm 
logical  foundation  and,  as  you  will  see  in  Chap.  15,  because  it  plays  the  central 
role  in  the  development  of  relativistic  mechanics.  In  the  macroscopic  world  the 
definition  is  not  of  great  practical  significance  since  it  is  rarely  used  to  measure 
the mass of an object. Instead, the mass of an object of macroscopic size (say a billiard
ball) is almost always measured by some technique that involves its gravitational
effect. It is usually much easier to measure accurately the mass of such a
body by weighing it than by studying how it behaves in a collision with some
other body. But in the microscopic world the way we have found of defining mass
in terms of its inertial effect (or some related way) must be used to measure the
mass of a body. The reason is that for a microscopic body (say an electron) the gravitational
effect of its mass is so minute that it is impractical to measure.


4-4 FORCE AND NEWTON’S SECOND LAW

Let us consider again two interacting, but otherwise isolated, bodies viewed
from an inertial frame: two pucks on an air table. The law of momentum
conservation applies to the total momentum of the two bodies. But it does
not apply to the individual momenta of each of these bodies since momentum
is transferred between them when they interact. We investigate
this fact by fixing our particular attention on one of the bodies. That is, we
respecify the system of interest so that it now contains only one puck. If we
follow that puck through a collision, we note that its momentum does not
remain constant, even though we observe it from an inertial frame. The
momentum of the puck that we are concerned with changes because it is
not isolated from its environment. The system containing one puck is not
an isolated system because something external to the system acts on the
body it contains and causes its momentum to change. An agency acting on a
body which leads to a change in its momentum is called a force. Thus we
say that the momentum of a body changes when it is acted on by a force.


Figure  4-22  is  an  enlarged  strobe  photo  depicting  a collision  between 
two  magnetic  pucks.  In  order  to  show  what  happens  when  they  interact  in 
the collision, the camera shutter was opened for only a short period spanning
the collision, and a very short strobe flash interval was used. To prevent
confusion from multiple overlapping puck images, the pucks were
painted black except for a white dot at the center of each. Taking the puck
in the upper part of the photograph to be the one of interest, you can see
that it initially moved toward the bottom of the photograph with approximately
constant momentum. Then it slowed and also changed direction.
Next it increased its speed while continuing to change direction. Finally it
moved off to the right with an approximately constant momentum, different
from its original momentum in both direction and magnitude. The




Fig.  4-22  A short-flash-interval  strobe  photo  of  a 
collision between two magnetic pucks. Multiple overlapping
images were avoided by using pucks which
were black except for a central white dot. The technique
used caused more light to be reflected from the
air  table  into  the  camera  than  in  the  other  strobe 
photos. 


change  in  momentum  developed  gradually  while  the  puck  was  in  proximity 
to  the  other  puck.  In  fact,  careful  inspection  will  show  you  that  the  greatest 
momentum  change  in  one  strobe  flash  interval  occurred  when  the  puck 
was  closest  to  the  other  puck.  The  reason  is  that  the  force  exerted  on  the 
puck  of  interest  by  the  other  puck  was  strongest  when  the  two  pucks  were 
closest. 

A quantitative  measure  of  the  strength  and  direction  of  the  force 
acting  on  the  puck  of  interest  is  obtained  by  measuring  the  rate  at  which  it 
caused  the  momentum  of  that  puck  to  change.  That  is,  the  net  force  F 
acting  on  a body  at  some  instant  is  specified  in  magnitude  and  direction  by 
the  time  derivative  dp/dt  of  the  momentum  p of  the  body: 


$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \tag{4-14}$$


This  is  Newton’s  second  law  of  motion,  in  its  most  basic  form:  Net  force 
equals  rate  of  change  of  momentum.  Remember  that  it  assumes  the  observer  is 
in  an  inertial  frame  of  reference. 
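The idea of reading a force off a strobe photo can be sketched in code. In the example below the dot positions, the flash interval, and the puck mass are all hypothetical values chosen for illustration; they are not taken from Fig. 4-22. The derivative dp/dt is approximated by the change in momentum over one flash interval.

```python
def average_forces(positions, dt, mass):
    """Estimate the net force F = dp/dt from equally spaced strobe positions.

    positions: list of (x, y) in metres, one per flash
    dt:        flash interval in seconds
    mass:      mass of the body in kilograms
    Returns one (Fx, Fy) estimate in newtons per pair of adjacent intervals.
    """
    velocities = [((x2 - x1) / dt, (y2 - y1) / dt)
                  for (x1, y1), (x2, y2) in zip(positions, positions[1:])]
    momenta = [(mass * vx, mass * vy) for vx, vy in velocities]
    return [((p2x - p1x) / dt, (p2y - p1y) / dt)
            for (p1x, p1y), (p2x, p2y) in zip(momenta, momenta[1:])]


# HYPOTHETICAL strobe data: the puck moves down, is deflected, then moves right.
positions = [(0.0, 0.4), (0.0, 0.3), (0.0, 0.2),
             (0.05, 0.15), (0.15, 0.15), (0.25, 0.15)]
forces = average_forces(positions, dt=0.1, mass=0.5)
for fx, fy in forces:
    print(round(fx, 3), round(fy, 3))
```

The estimates are close to zero where the momentum is constant and nonzero only during the deflection, which mirrors the qualitative reading of the photo. A shorter flash interval would bring the finite differences closer to the true derivative.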

In the approach we are taking, this fundamental “law” is actually a definition.
Newton’s second law, F = dp/dt, is the definition of force. The definition
follows from, but is distinct from, the definitions of mass and momentum.
Note that the development leads necessarily to the conclusion that force is a
vector quantity and therefore must satisfy the rules for the addition of
vectors. This is the case since force is defined as the limiting value of a vector
(the momentum change) divided by a scalar (the time change) and so
must be a vector. Figure 4-23, and the analysis explained in the caption,
lends conviction to the consistency of the definition by presenting experimental
evidence for the vectorial nature of force. It shows that the net force
acting on a body is the vector sum of all the forces acting on it.




Fig. 4-23 A demonstration of the vectorial nature of force. (a) Bodies whose weights are
in  the  ratio  2 to  3 to  4 exert  forces  with  magnitudes  equal  to  their  weights  on  the  lower  ends 
of  three  strings.  The  strings  pass  over  pulleys  at  the  rim  of  the  circular  table,  and  transmit  these 
forces  to  the  ring  to  which  they  are  tied.  The  locations  of  the  pulleys  are  adjusted  until  the 
ring remains stationary at the center of the table with the three forces acting on it. (b) A top
view  with  vectors  depicting  the  forces.  Their  magnitudes  are  in  the  ratio  2 to  3 to  4.  (c)  A graphic 
construction  which  sums  the  three  force  vectors  to  give  the  net  force  exerted  on  the  ring.  To 
within  the  accuracy  of  the  technique,  the  net  force  is  seen  to  have  the  value  F = 0.  This  value 
agrees with the prediction of Newton’s second law, F = dp/dt, since the momentum of the
ring  maintains  the  constant  value  p = 0 and  so  dp/dt  = 0.  The  agreement  confirms  the 
vectorial  nature  of  force  because  the  graphic  construction  is  based  on  treating  forces  as  vectors 
when  summing  them.  It  also  shows  that  the  net  force  acting  on  a body  is  the  vector  sum  of 
all  the  forces  acting  on  it. 
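The force-table construction can be reproduced numerically. The following sketch takes the magnitudes 2, 3, and 4 from the caption, places the first force along the +x axis (an arbitrary choice made here), and uses the law of cosines to find the directions the other two must have for the vector sum to vanish. The function name is illustrative only.

```python
import math


def balancing_angles(a, b, c):
    """Directions (radians) of forces b and c that balance force a along +x.

    For the sum to vanish, c must equal -(a + b) as vectors, so
    c**2 = a**2 + b**2 + 2*a*b*cos(theta_b)   (law of cosines).
    """
    theta_b = math.acos((c ** 2 - a ** 2 - b ** 2) / (2 * a * b))
    # Components of the force that cancels the sum of the other two.
    cx = -(a + b * math.cos(theta_b))
    cy = -b * math.sin(theta_b)
    theta_c = math.atan2(cy, cx)
    return theta_b, theta_c


a, b, c = 2.0, 3.0, 4.0  # magnitudes in the ratio 2 to 3 to 4
theta_b, theta_c = balancing_angles(a, b, c)

forces = [
    (a, 0.0),
    (b * math.cos(theta_b), b * math.sin(theta_b)),
    (c * math.cos(theta_c), c * math.sin(theta_c)),
]
net = (sum(f[0] for f in forces), sum(f[1] for f in forces))
print(math.degrees(theta_b), math.degrees(theta_c), net)
```

Adding the forces as plain numbers would give 9, not 0. Only component-by-component (vector) addition reproduces the observed equilibrium of the ring.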




For  a definition  to  be  worthwhile,  it  must  be  not  only  consistent  but 
also  convenient.  Equation  (4-14)  passes  the  second  test  too.  It  is  convenient 
because  the  forces  that  occur  in  nature  have  relatively  simple  physical  and 
mathematical  descriptions  when  that  equation  is  taken  to  be  the  definition 
of force. For example, consider a body located anywhere in a region extending
upward from the earth’s surface through a distance small compared
to the earth's radius, and extending along the surface through a distance
that also is small compared to the radius. The gravitational force
acting  on  the  body  is  constant  in  both  magnitude  and  direction.  You  could 
not  ask  a force  to  have  a simpler  description  than  that.  Another  example  is 
the force exerted on a body by a stretched spring to which the body is connected.
The force is constant in direction, the direction lying along the
spring  axis.  The  magnitude  of  the  force  varies,  but  the  variation  is  simply 
in  direct  proportion  to  the  stretch  of  the  spring.  As  we  proceed  in  our  study 
of  various  fields  of  physics,  we  will  come  across  numerous  other  examples 
of  the  simplicity  of  natural  forces,  when  force  is  defined  by  the  equation 
F = dp/dt. 

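The relation between that equation and F = ma is easy to obtain. Since
we have defined p as

$$\mathbf{p} = m\mathbf{v}$$

it is apparent that

$$\mathbf{F} = \frac{d(m\mathbf{v})}{dt}$$

If, and only if, the mass m is a constant, this immediately yields

$$\mathbf{F} = m\frac{d\mathbf{v}}{dt} = m\mathbf{a}$$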

The case of constant mass is the one usually encountered in newtonian mechanics.
The great majority of mechanical systems occurring in practical
studies certainly comprise bodies with constant mass. But there are some
important exceptions. For instance, an engineer studying the motion of
a rocket burning its fuel finds it expedient to treat the rocket as a system
which is losing mass. In relativistic mechanics mass cannot be treated as a
constant in any situation, and F = ma is not correct. But F = dp/dt remains
valid in relativity, as we will see in Chap. 15.

In  Sec.  4-2  we  considered  a number  of  examples  of  the  application  of 
F = dp/dt for constant m, where the relation can be expressed as F = ma.
Example  4-9  considers  a case  where  m is  not  constant  so  that  F is  not  equal 
to  ma. 




EXAMPLE  4-9 


Figure  4-24  shows  crushed  ore  from  a mine  dropping  onto  a very  long  conveyor 
belt  at  a rate  of  300  kg/s.  The  belt  moves  at  a speed  of  2.00  m/s.  Find  the  net  force 
that  must  be  acting  on  the  belt  to  keep  it  moving  at  this  constant  speed,  neglecting 
friction  in  the  rollers  supporting  the  belt. 

■ If you take the belt plus the ore lying on it to be the system, then you have a
simple example of a system with varying mass to which you can apply F = dp/dt.
The mass at time t is m, and the momentum at that instant is mv, where v is the
velocity of the belt. At time t + dt the mass is m + dm, and the momentum is thus
(m + dm)v. Therefore the change in momentum in time dt is

$$d\mathbf{p} = (m + dm)\mathbf{v} - m\mathbf{v}$$

or

$$d\mathbf{p} = \mathbf{v}\,dm$$

Dividing by dt gives you

$$\frac{d\mathbf{p}}{dt} = \mathbf{v}\frac{dm}{dt}$$

Equation (4-14) tells you that F = dp/dt. That is, the rate of change of the system’s
momentum (due to its increasing mass) equals the force applied to it (to keep its
velocity constant). So, equating dp/dt to the necessary force F, you obtain

$$\mathbf{F} = \mathbf{v}\frac{dm}{dt} \tag{4-15}$$

This result shows that the force is in the direction of the belt’s velocity, and its magnitude
is the product of the speed of the belt and the rate at which mass is added
to it.

Since the speed v of the belt is 2.00 m/s and the rate dm/dt at which mass is
added to it by the falling ore is 300 kg/s, the magnitude of the force, F = v dm/dt, is

F = 2.00 m/s × 300 kg/s = 600 kg·m/s² = 600 N
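The result of this example can be checked in two ways, as in the sketch below. The first function evaluates F = v dm/dt directly. The second repeats the momentum bookkeeping over a finite time interval. The starting mass of 1000 kg and the interval of 0.5 s are arbitrary assumptions made here, and the answer does not depend on them.

```python
def belt_force(speed, mass_rate):
    """Force (N) that keeps a belt at constant speed (m/s) while
    mass lands on it at mass_rate (kg/s): F = v * dm/dt."""
    return speed * mass_rate


def belt_force_from_momentum(speed, mass_rate, mass_now, dt):
    """Same force obtained from the change in momentum over a time dt."""
    dm = mass_rate * dt
    dp = (mass_now + dm) * speed - mass_now * speed
    return dp / dt


# Values from the example.
force = belt_force(speed=2.00, mass_rate=300.0)

# ASSUMED values for illustration only: 1000 kg already on the belt, dt = 0.5 s.
force_check = belt_force_from_momentum(2.00, 300.0, mass_now=1000.0, dt=0.5)
print(force, force_check)
```

Both numbers agree because the belt's speed is constant, so the only change in momentum comes from the added mass.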


In  Example  4-9,  v is  constant.  But  this  need  not  always  be  the  case.  To 
obtain  an  expression  for  F that  applies  more  generally,  we  consider 
Newton’s  second  law 


$$\mathbf{F} = \frac{d\mathbf{p}}{dt} = \frac{d(m\mathbf{v})}{dt}$$

when  neither  m nor  v is  constant.  Then  we  apply  Eq.  (2-15),  the  rule  for 
differentiating  a product  of  two  variables.  The  result  obtained  immediately 
is 


Fig.  4-24  Material  dropping  onto  a 
very  long  conveyor  belt.  The  belt  plus 
the  material  on  it  form  a system  which  is 
gaining mass. The equation F = ma does
not  apply  to  such  a system. 




$$\mathbf{F} = m\frac{d\mathbf{v}}{dt} + \mathbf{v}\frac{dm}{dt} \tag{4-16}$$

Note that the second term on the right side of this equation is the only term
present on the right side of Eq. (4-15), the expression for F developed in
the example. The reason is that dv/dt = 0 there. But in the great majority
of systems treated in newtonian mechanics dm/dt = 0, and only the first
term on the right side is present in Eq. (4-16). Then it becomes
F = m dv/dt = ma.


4-5 MOMENTUM CONSERVATION AND NEWTON’S THIRD LAW


When  we  previously  studied  the  strobe  photo  of  a collision  between  two 
magnetic  pucks  shown  in  Fig.  4-22,  we  singled  out  the  puck  in  the  upper  part 
of  the  photograph  to  be  the  one  of  interest.  Then  we  discussed  the  relation 
between  its  rate  of  change  of  momentum  and  the  force  acting  on  it.  But  we 
could  just  as  well  have  focused  our  attention  on  the  other  puck.  That  puck 
also  experiences  a change  in  momentum,  and  therefore  a force  equal  to  the 
rate  of  change  of  its  momentum  is  acting  on  the  puck.  It  is  apparent  from 
the  photograph  that  there  is  some  symmetry  between  what  happens  to  one 
puck and what happens to the other. The first puck exerts a repulsive force
on the second, which changes its momentum; at the same time the second
puck exerts a repulsive force on the first, thereby changing its momentum.
We know from momentum conservation measurements that we have made
on many such collisions that there is, in fact, a perfect symmetry between
the two momentum changes. Since the two pucks form an effectively isolated
system when considered together, each momentum change must
equal the negative of the other because the sum of their two momenta
must remain constant. An intimately related manifestation of this same
symmetry is that each force is the negative of the other: the pair of forces
obeys Newton's third law.

It  is  easy  to  derive  Newton’s  third  law  mathematically  by  combining  the 
experimental  law  of  momentum  conservation  with  the  definition  of  force 
given by Newton’s second law. Consider two bodies, 1 and 2, which interact
with each other in any way but which have no net interactions with anything
else. In an inertial reference frame, their momenta obey the momentum
conservation law of Eq. (4-13):

$$\mathbf{p}_{1f} + \mathbf{p}_{2f} = \mathbf{p}_{1i} + \mathbf{p}_{2i}$$

or

$$\mathbf{p}_1 + \mathbf{p}_2 = \text{constant}$$

Take the time derivative of all terms in this equation, to obtain

$$\frac{d\mathbf{p}_1}{dt} + \frac{d\mathbf{p}_2}{dt} = 0 \tag{4-17a}$$

or

$$\frac{d\mathbf{p}_1}{dt} = -\frac{d\mathbf{p}_2}{dt} \tag{4-17b}$$


The  rate  at  which  the  momentum  of  body  1 is  changing  must  be  the  exact 
opposite  of  the  rate  at  which  the  momentum  of  body  2 is  changing  because 
the  momentum  of  the  entire  isolated  system  remains  constant. 

Since there is a rate of change $d\mathbf{p}_1/dt$ in the momentum of body 1, as
viewed from an inertial frame, the definition of Eq. (4-14) says there is a net
force acting on the body. This force is $\mathbf{F}_{\text{on 1 by 2}}$, the force exerted on it by
body 2. Applying the definition gives $\mathbf{F}_{\text{on 1 by 2}} = d\mathbf{p}_1/dt$. Similarly, the force




exerted on body 2 by body 1 is determined by the definition of force to be
$\mathbf{F}_{\text{on 2 by 1}} = d\mathbf{p}_2/dt$. Using these two applications of the definition in Eq.
(4-17b) produces immediately the result

$$\mathbf{F}_{\text{on 1 by 2}} = -\mathbf{F}_{\text{on 2 by 1}} \tag{4-18}$$

The  forces  which  the  bodies  exert  on  each  other  have  equal  magnitude  but 
opposite  direction.  Equation  (4-18)  is  Newton’s  third  law  of  motion.  Thus 
Newton’s  third  law  does  not  describe  a separate  property  of  nature;  it  is 
contained  in  the  experimental  fact  of  momentum  conservation  and  in  the 
way  force  is  defined. 
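A numerical illustration may help connect the third law to momentum conservation. The sketch below reuses the masses and velocities of Example 4-7 and estimates the average force on each puck as its momentum change divided by the duration of the collision. The contact time of 0.01 s is an assumption made here; the text gives no such value.

```python
# Data from Example 4-7 (motion along the x axis).
m1, m2 = 2.0, 4.0      # kg
v1i, v2i = 3.0, 0.0    # m/s, before the collision
vf = 1.0               # m/s, common velocity after the collision

contact_time = 0.01    # s, ASSUMED duration of the collision

dp1 = m1 * (vf - v1i)  # momentum change of puck 1, kg*m/s
dp2 = m2 * (vf - v2i)  # momentum change of puck 2, kg*m/s

# Average forces from F = dp/dt applied to each puck separately.
f_on_1_by_2 = dp1 / contact_time
f_on_2_by_1 = dp2 / contact_time
print(dp1, dp2, f_on_1_by_2, f_on_2_by_1)
```

Whatever contact time is assumed, the two average forces come out equal in magnitude and opposite in sign, because the two momentum changes are.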

Interesting questions arise concerning the applicability of the third law to situations
in which objects interact at an appreciable distance, if one object or the
other has a characteristic important to the interaction that is changing abruptly in
time. An example is the interaction between electrons in two separated radio antennas.
When the electrons in the first antenna are briefly put into oscillation, the
electrons  in  the  second  will  experience  a corresponding  pulse  of  oscillation,  but 
only  after  a certain  time  delay.  In  turn,  the  pulsed  oscillation  of  the  electrons  in 
the  second  antenna  will  induce  a pulsed  oscillation  of  the  electrons  in  the  first, 
after  an  additional  time  delay.  Thus  it  appears  that  the  electrons  in  each  antenna 
exert  forces  on  those  in  the  other  antenna,  but  only  after  a certain  time  delay.  Such 
forces cannot satisfy Eq. (4-18) because they are not of equal magnitude but opposite
direction at any instant.

When  we  study  the  electromagnetic  force  we  will  see  that  electrons  in  the  two 
antennas  do  not  interact  directly  with  each  other.  Instead  they  interact  indirectly. 
As  a result  of  their  oscillatory  motion,  the  electrons  in  one  antenna  emit  what  is 
called  electromagnetic  radiation.  The  radiation  moves  away  from  the  antenna  at 
the  speed  of  light.  Part  of  it  reaches  the  other  antenna,  after  the  time  it  takes  for  the 
radiation  to  travel  between  the  two.  There  it  is  absorbed  by  the  electrons  in  that 
antenna  and  causes  them  to  oscillate.  The  radiation  does  convey  forces  from  one 
set  of  electrons  to  the  other  because  it  carries  momentum  which  goes  from  the 
emitting  electrons  to  the  absorbing  electrons.  But  each  of  the  steps  of  the  process, 
emission  and  absorption,  must  be  considered  separately.  Then  Newton’s  third  law 
is satisfied. The electrons emitting the radiation exert forces on it, and the radiation
exerts reaction forces back on them. This interaction takes place during a particular
interval of time and in a particular region of space, just like a collision
between two pucks. And in the pair of forces involved in the interaction the two
forces have equal magnitude but opposite direction. At a later time and in a different
region, there is another interaction, the one between the radiation being
absorbed  and  the  electrons  absorbing  it.  Newton’s  third  law  is  also  satisfied  in  this 
step  of  the  process. 

Thus  there  is  actually  no  difficulty  with  the  third  law,  even  in  this  apparently 
difficult case. For the cases generally treated in newtonian mechanics, these questions
never even arise.


4-6 FORCES IN MECHANICAL SYSTEMS


Now  we  will  make  a preliminary  inquiry  into  the  properties  of  some  of  the 
forces commonly involved in the behavior of mechanical systems. Our purpose
is to develop enough information about these forces so that in the
next chapter we can use them in Newton’s laws to study the motion of mechanical
systems. The approach we use in this section is mostly empirical.
That is, we concentrate on the experimentally observed properties of certain
types of forces, putting aside until later in the book any serious attempt
to understand the origins of these properties in a fundamental way. Nevertheless,
we will learn enough about the forces to be able to solve many practical
problems involving the behavior of systems governed by them.




Fig.  4-25  A spring  hanging  from  a 
beam has a length $l_0$. It is then stretched
to  a length  l by  suspending  a body  of 
mass  m from  its  lower  end.  The  forces 
shown  are  the  important  ones  acting 
when  the  spring  is  stretched. 


As  an  example,  consider  the  force  whose  magnitude  is  called  weight. 
At  this  juncture  we  have  not  really  addressed  the  question  of  why  a body  has 
weight.  We  have  merely  indicated  it  is  a manifestation  of  the  gravitational 
interaction  between  the  earth  and  the  body.  We  inquire  more  deeply  into 
gravitational  force  in  Chap.  11.  In  the  meantime,  however,  we  can  use  to 
advantage the experimental fact considered in Example 4-2 and the associated
discussion: All bodies near the surface of the earth fall with acceleration
of the same magnitude g (when air resistance is negligible). This fact,
taken together with the relation among force, mass, and acceleration imposed
by Newton’s second law, led us to a now familiar conclusion: The
gravitational force exerted by the earth on a body of mass m has a magnitude
(the weight of the body) given by

W = mg  (4-19) 

if the separation between the body and the earth’s surface is small compared
to the radius of the earth. The direction of the force is downward,
that  is,  toward  the  center  of  the  earth.  We  have  already  used  this  empirical 
description  of  the  gravitational  force  to  explain  the  behavior  of  mechanical 
systems.  We  continue  to  use  it  in  subsequent  chapters. 

Another  force  which  has  a simple,  and  very  useful,  empirical  descrip- 
tion is  the  force  exerted  at  either  end  of  a compressed  or  stretched  spring 
on  whatever  is  connected  to  the  end.  Consider  a coil  spring  whose  coils  are 
not  in  contact  with  each  other  when  the  spring  is  relaxed.  The  spring  can 
be  compressed  by  pushing  on  both  ends,  or  it  can  be  stretched  by  pulling 
on  both  ends.  A convenient  way  to  stretch  the  spring  is  to  attach  one  end  to 
a rigid  beam  and  hang  a body  of  mass  m from  the  other  end,  as  in  Fig.  4-25. 
When  the  body  hangs  at  rest  from  the  extended  spring,  the  magnitude  Sbot 
of  the  force  that  the  bottom  end  of  the  spring  applies  to  the  body  must  just 
equal  the  magnitude  mg  of  the  gravitational  force  acting  on  the  body.  This 
equality,  Sbot  = mg,  must  hold  because  the  body  is  motionless,  and  there- 
fore Newton’s  second  law  says  there  is  no  net  force  acting  on  it. 

If  we  assume  the  mass  of  the  spring  itself  to  be  negligible,  the  total  gravi- 
tational force  exerted  on  the  system  body-plus-spring  also  has  the  magni- 
tude mg.  Since  this  system  is  motionless,  the  second  law  requires  the  magni- 
tude B of  the  force  exerted  on  it  by  the  beam  to  have  the  value  B = mg.  But 
the force of magnitude $S_{top}$ applied to the beam by the top end of the
spring and the force of magnitude $B$ applied to the top end of the spring by
the beam form an action-reaction pair. Therefore Newton’s third law requires
that $S_{top} = B$. Thus $S_{top} = mg$. Since in the preceding paragraph we
showed that $S_{bot} = mg$, we conclude that the two ends of the spring exert
forces  of  the  same  magnitude  on  the  objects  to  which  they  are  connected. 
Note  that  these  forces  exerted  by  the  two  ends  of  the  extended  spring  are 
both  directed  inward  along  the  axis  of  the  spring. 

The  common  magnitude  S of  the  spring  forces  is  related  to  the 
amount  of  extension  of  the  spring.  The  relation  can  be  studied  experi- 
mentally by  hanging  bodies  of  different  mass  m from  the  spring,  thereby 
varying the value of $S$. For each value of $S$, the length $l$ of the spring is
measured. Then the amount of extension is found by subtracting from $l$ the
length $l_0$ of the spring when it is relaxed. Results of measurements on a
typical spring look like those in Table 4-1.

144  Newton’s  Laws  of  Motion


Table 4-1

The Magnitude $S$ of the Forces Exerted by a Spring at Each End When the Extension of the Spring Is $l - l_0$

Measurement    Magnitude $S$ of spring force (in N)    Extension $l - l_0$ of spring (in m)
1              10                                      0.008
2              20                                      0.016
3              30                                      0.024
4              40                                      0.032
5              100                                     0.080
6              200                                     0.146
7              20                                      0.016

From the data we can conclude that the magnitude $S$ of the forces produced
by the extended spring is directly proportional to its extension $l - l_0$,
through measurement 5. For the large extension obtained in measurement 6 the
rule seems to fail because there is a twofold increase in $S$ from
measurement 5 to measurement 6 but there is not a corresponding twofold
increase in $l - l_0$. Measurement 7 serves the purpose of checking the
reproducibility of the measurements and assures us that stretching the
spring in measurement 6 did not permanently alter its characteristics.

The data obtained through measurement 5 can be summarized by an
equation relating the magnitude $S$ of the forces exerted by the ends of the
extended spring to the extension $l - l_0$ of the spring. The equation is

$$S = k(l - l_0) \tag{4-20a}$$

where $k$ is a constant. Its value can be determined by requiring the equation
to conform to the results obtained in measurement 5. Solving for $k$ and
using these results, we obtain $k = S/(l - l_0) = 100\ \text{N}/0.080\ \text{m} = 1.2 \times 10^3$ N/m
for the particular spring used in the measurements.
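
As a rough numerical check of the proportionality just described, the following Python sketch computes the ratio $S/(l - l_0)$ for every row of Table 4-1. The names `measurements`, `force_constant`, and `ratios` are illustrative choices and do not come from the text.

```python
# Data from Table 4-1: (spring force S in N, extension l - l0 in m)
measurements = [
    (10, 0.008), (20, 0.016), (30, 0.024), (40, 0.032),
    (100, 0.080), (200, 0.146), (20, 0.016),
]


def force_constant(force_n, extension_m):
    """Return the ratio k = S / (l - l0) in N/m for one measurement."""
    return force_n / extension_m


# One ratio per measurement; a constant ratio means S is proportional to l - l0.
ratios = [force_constant(s, x) for s, x in measurements]

for number, k in enumerate(ratios, start=1):
    print(f"measurement {number}: S/(l - l0) = {k:.0f} N/m")
```

Measurements 1 through 5 and measurement 7 all give a ratio of 1250 N/m, which the text quotes to two significant figures as $1.2 \times 10^3$ N/m. Measurement 6 gives a noticeably larger ratio, which is the departure from proportionality noted above.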

What  makes  these  results  interesting  is  that  practically  all  springs  be- 
have this  way,  regardless  of  the  details  of  their  construction,  provided  they 
are not stretched too far. Furthermore, when a spring is compressed to a
length $l$ shorter than its relaxed length $l_0$ but not compressed enough to
make its coils touch one another, then the magnitude $S$ of the forces
exerted by the spring at its end is related to the compression $l_0 - l$ by an
equation having almost the same form as Eq. (4-20a). The equation is

$$S = k(l_0 - l) \tag{4-20b}$$

The value of $k$ in Eq. (4-20b) for a certain spring is the same as the value of
$k$ in Eq. (4-20a) for that spring. The forces that a compressed spring exerts at
its ends are directed outward along the spring axis.

A single equation can be used to describe the magnitude $S$ of the
spring force. This is the magnitude of the force exerted on whatever is
connected to an end of a spring of relaxed length $l_0$ when it is compressed
or extended to length $l$. The equation has the form

$$S = k|l - l_0| \tag{4-21}$$

The quantity $|l - l_0|$ is the distortion of the spring, the magnitude of the
change in its length from the length when it is relaxed. The constant $k$ is




called  the  force  constant  (or  sometimes  the  spring  constant).  It  is  a prop- 
erty of  the  spring  which  specifies  how  stiff  the  spring  is.  The  relation  is 
known  as  Hooke’s  law,  after  Robert  Hooke,  a contemporary — and  long- 
time adversary — of  Newton.  Since  the  forces  produced  by  a spring  are 
directed  inward  at  each  end  if  the  spring  is  extended,  and  outward  if  it  is 
compressed,  in  all  circumstances  the  forces  act  on  the  objects  attached  to 
the  ends  of  the  spring  in  directions  which  tend  to  make  them  move  so  as  to 
restore  the  spring  to  its  relaxed  length.  Because  of  their  directional  proper- 
ties, the  forces  produced  by  springs  are  often  called  restoring  forces.  Can 
you  write  a vector  form  of  Hooke’s  law,  giving  both  the  magnitude  and 
direction  of  the  restoring  force  exerted  by  the  bottom  end  of  the  spring  in 
Fig.  4-25? 
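
One possible answer to the question just posed, offered as a sketch and not as the authors' own solution, is to choose a $y$ axis pointing downward along the spring of Fig. 4-25 and write the force exerted by the bottom end of the spring on the body as $F_y = -k(l - l_0)$. The axis direction, the function name, and the relaxed length of 0.50 m used below are all assumptions made for illustration.

```python
def spring_force_on_body(k, length, relaxed_length):
    """Signed force (in N) exerted by the bottom end of the spring on the body.

    Assumed convention: the y axis points downward along the spring axis,
    so a negative result is an upward force and a positive one is downward.
    """
    return -k * (length - relaxed_length)


k = 1.2e3   # N/m, the force constant found from Table 4-1
l0 = 0.50   # m, hypothetical relaxed length chosen only for illustration

stretched = spring_force_on_body(k, 0.55, l0)    # extended spring pulls upward
compressed = spring_force_on_body(k, 0.45, l0)   # compressed spring pushes downward

print(stretched, compressed)
```

The sign changes with the sense of the distortion, so the force always points toward the relaxed length, while its absolute value reproduces Eq. (4-21).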

The  proportionality  between  restoring  force  and  distortion  expressed 
in  Eq.  (4-21)  is  not  restricted  to  springs.  It  is  a common  property  of  many 
kinds  of  mechanical  objects,  particularly  those  composed  of  crystalline 
solids.  We  will  gain  insight  into  the  fundamental  cause  of  this  behavior  in 
Chap.  8.  But  until  then  Hooke’s  law  can  be  considered  to  be  an  empirical 
relation. 

Example  4-10  gives  you  some  experience  applying  Hooke’s  law. 


EXAMPLE  4-10 


Fig.  4-26  Forces  acting  when  two 
identical  springs,  connected  end  to  end, 
are  stretched. 


Two identical coil springs, each having force constant $k$, are connected end to end.
Find the effective force constant $k'$ of the spring pair. That is, find the value of $k'$ in
the equation $S = k'|l - l_0|$ relating the magnitude $S$ of the restoring forces exerted
by the spring pair at its two ends and the distortion $|l - l_0|$, the magnitude of the
change in length of the spring pair from its relaxed length. Then let
$k = 1.2 \times 10^3$ N/m, the value found for the spring in Table 4-1, and evaluate $k'$.

■ You  should  make  a sketch,  as  in  Fig.  4-26,  showing  the  spring  pair  when  it  is 
distorted  from  its  relaxed  length  so  that  forces  are  being  exerted.  The  distortion 
can  be  either  an  extension  or  a compression.  Suppose  it  is  an  extension. 

The  figure  shows  four  restoring  forces,  one  exerted  at  each  end  of  each  indi- 
vidual spring.  Two  of  these  are  the  restoring  forces  exerted  by  the  spring  pair. 
They  are  the  force  exerted  to  the  right  at  its  left  end  and  the  force  exerted  to  the 
left  at  its  right  end.  Also  shown  are  a force  directed  to  the  right  applied  to  the  right 
end  of  the  spring  pair  and  a force  directed  to  the  left  applied  to  its  left  end.  These 
two  are  the  forces  producing  the  distortion.  All  the  forces  in  the  figure  have  the 
same  magnitude  S.  You  can  prove  this  by  an  argument  similar  to  the  one  used  to 
prove that $S_{top} = S_{bot}$ in Fig. 4-25.

The  restoring  forces  exerted  by  the  spring  pair  have  the  same  magnitude  S as 
those  exerted  by  each  individual  spring  of  the  pair.  But  the  change  in  length  of  each 
individual  spring  is  only  one-half  the  change  in  length  of  the  spring  pair.  So  when 
$|l - l_0|$ is the distortion of the spring pair, the distortion of each individual spring is
$|l - l_0|/2$.

Applying Hooke’s law to an individual spring, you have

$$\frac{|l - l_0|}{2} = \frac{S}{k}$$

or

$$|l - l_0| = \frac{2S}{k}$$

Applying the law to the entire spring pair gives you

$$k' = \frac{S}{|l - l_0|}$$

Using the evaluation of $|l - l_0|$ just obtained, you find

$$k' = \frac{S}{2S/k}$$

or

$$k' = \frac{1}{2/k} = \frac{k}{2}$$

This  calculation  shows  that  two  springs  connected  in  series  comprise  a weaker 
spring  than  either  spring  alone.  The  reason  is  that  each  contributes  its  own  exten- 
sion, or  compression,  to  the  total  extension,  or  compression,  of  the  pair,  yet  each 
exerts  forces  of  the  same  magnitude  as  the  forces  exerted  by  the  pair.  In  particular, 
if $k = 1.2 \times 10^3$ N/m, you will have $k' = 0.60 \times 10^3$ N/m.

The  results  of  the  example  suggest  that  any  spring  made  by  using  part  of  a 
longer spring will be stiffer than the longer spring. You can easily verify this by
stretching  a coil  spring  first  from  both  ends  and  then  from  one  end  and  the  middle. 
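
The argument of Example 4-10 can be sketched in a few lines of Python. Each spring in the chain carries the same force $S$ and contributes a distortion $S/k$, so the distortions per unit force add. The function name `series_force_constant` is hypothetical, and allowing unequal springs is a generalization of the example that rests on the same reasoning; the text itself treats only two identical springs.

```python
def series_force_constant(constants):
    """Effective force constant (N/m) of springs connected end to end.

    Every spring carries the same force S, and the distortions S/k add,
    so the effective constant is the reciprocal of the sum of 1/k.
    """
    distortion_per_unit_force = sum(1.0 / k for k in constants)
    return 1.0 / distortion_per_unit_force


k = 1.2e3                                   # N/m, from Table 4-1
k_pair = series_force_constant([k, k])      # the two-spring pair of Example 4-10

print(f"k' = {k_pair:.0f} N/m")
```

For two identical springs this reduces to $k' = k/2$, in agreement with the result of the example.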


Now  let  us  consider frictional  forces.  Seen  from  the  point  of  view  of  their 
effects,  frictional  forces  acting  between  two  objects  always  do  the  same 
thing — they  resist  any  attempt  to  put  one  object  into  motion  relative  to  the 
other,  and  tend  to  slow  the  motion  once  the  objects  are  moving  relative  to 
each other. Thus they are always directed “backward.” The causes of
friction  are  varied,  and  almost  all  of  them  are  extremely  complicated  when 
studied  in  detail.  Nevertheless,  it  is  possible  to  give  a quantitative  account  of 
the  effects  of  friction  in  several  cases  of  great  practical  importance.  It  turns 
out  that  simple  empirical  equations  often  give  an  accurate  enough  descrip- 
tion of  these  effects  for  practical  purposes. 

There  are  two  broad  categories  of  friction.  One  is  called  contact  fric- 
tion, and  the  other  is  called  fluid  friction.  Contact  friction  is  present 
when  an  attempt  is  made  to  set  one  solid  object  into  motion  across  the  sur- 
face of  another,  and  also  when  such  motion  takes  place.  Fluid  friction  is 
present  when  a solid  object  moves  through  a fluid.  Frictional  forces  are  the 
first  ones  we  have  encountered  which  are  velocity-dependent.  For  both 
categories,  the  direction  of  the  frictional  force  acting  on  an  object  depends 
on  the  direction  of  the  velocity  it  would  have  if  it  moved,  or  actually  has  if  it 
is  moving,  since  friction  always  opposes  relative  motion.  And  for  fluid  fric- 
tion there  is  also  a dependence  on  the  magnitude  of  the  object’s  velocity,  as 
you  will  soon  see. 

We  begin  by  considering  contact  friction.  If  you  apply  a weak  force  in  a 
horizontal  direction  to  a heavy  box  resting  on  the  floor,  the  box  does  not 
move.  According  to  Newton’s  second  law,  the  net  force  on  the  box  must  be 
zero  while  it  remains  stationary.  Apparently  the  floor  is  able  to  apply  a fric- 
tional force  to  the  box,  equal  in  strength  and  opposite  in  direction  to  what- 
ever horizontal  force  you  apply  to  it,  providing  the  force  you  apply  is  weak. 
Now  gradually  increase  the  strength  of  the  force  you  apply  to  the  box. 
Nothing  happens  until  the  force  you  apply  to  the  box  exceeds  the  limiting 
strength  of  the  frictional  force  that  the  floor  can  apply  to  the  box.  When  it 
does  so,  the  box  “breaks  free”  and  begins  to  move. 

If  the  surfaces  of  two  objects  in  contact  are  not  so  rough  that  there  is  a 
gross  interlocking  of  projections  on  one  with  depressions  on  the  other,  the 
limiting  strength  of  the  contact  friction  force  that  can  be  developed 
between  the  surfaces  depends  mainly  on  two  factors.  The  first  factor  is  the 




Fig.  4-27  Showing  all  the  forces  acting 
on  a box  of  mass  m resting  on  the  floor, 
when  the  strength  A of  a force  applied 
to  it  in  some  direction  parallel  to  the 
plane  of  the  floor  just  equals  the  limiting 
strength  Cs  of  the  force  that  the  floor 
can  apply  to  the  box  in  the  opposite 
direction.  In  these  circumstances  the 
box  is  on  the  verge  of  moving.  In  all 
circumstances  the  floor  also  applies  to 
the  box  a force  of  strength  N in  the 
direction  normal  to  its  plane,  and  oppo- 
site to  that  of  the  gravitational  force  of 
equal  strength  mg. 


magnitude  of  the  forces  which  the  objects  exert  on  each  other  because  they 
are  pressed  together.  These  forces  act  in  opposite  directions  and  are  of  the 
same  magnitude,  in  agreement  with  Newton’s  third  law.  Each  of  the  direc- 
tions is  normal  to  the  plane  containing  the  surfaces  in  contact.  So  each 
force  is  called  a normal  force,  and  its  magnitude  is  represented  by  the 
symbol  N.  In  the  case  of  the  box  resting  on  the  floor,  illustrated  in  Fig.  4-27, 
the  magnitude  N of  the  normal  force  applied  to  it  by  the  floor  equals  the 
weight  mg  of  the  box.  The  box  will  be  just  at  the  point  of  moving  over  the 
floor  when  the  magnitude  A of  the  force  you  apply  to  it  in  some  direction 
parallel  to  the  plane  of  contact  equals  the  limiting  magnitude  of  the  oppo- 
sitely directed  force  which  contact  friction  can  apply  to  the  box  to  prevent 
its  motion  over  the  plane.  This  limiting  frictional  force  is  called  the  static 
contact  friction  force,  and  we  designate  its  magnitude  as  Cs.  Experiment 
shows  that  over  a wide  range  of  conditions 

$$C_s \propto N \tag{4-22a}$$

The  static  contact  friction  force  is  proportional  to  the  normal  force.  In 
other  words,  the  greatest  force  that  can  be  applied  to  one  object  without 
causing  it  to  slide  over  another  is  proportional  to  the  force  pressing  the  ob- 
jects together.  Perhaps  surprisingly,  Cs  does  not  depend  at  all  strongly  on 
the  area  of  the  surfaces  in  contact.  You  might  guess  that  the  more  frictional 
surface  there  is,  the  more  friction  there  should  be.  But  this  is  not  so,  for 
reasons  soon  to  be  discussed. 

The  other  major  factor  determining  how  much  force  can  be  applied  to 
one  object  before  it  starts  sliding  over  another  is  the  specific  nature  and 
condition  of  the  surfaces  in  contact.  An  approximate  representation  of  the 
degree  of  “stickiness”  of  a pair  of  surfaces  is  given  by  their  coefficient  of 
static friction $\mu_s$. (The symbol $\mu$ is the Greek letter mu.) This quantity is
the proportionality constant which will convert the relation between $C_s$ and
$N$ into an equality. That is, the relation among the magnitude $C_s$ of the
static contact friction force, the magnitude $N$ of the normal force, and the
coefficient of static friction $\mu_s$ is

$$C_s = \mu_s N \tag{4-22b}$$

The direction of the static contact friction force is always opposite to the
direction of the motion it opposes. Since $\mu_s = C_s/N$ is the ratio of two
forces,  it  is  a dimensionless  number.  Some  representative  values  are  shown 
in  Table  4-2. 

Table 4-2

Coefficients of Static and Kinetic Friction

Surfaces in contact      Approximate value of $\mu_s$    Approximate value of $\mu_k$
Copper and cast iron     1.1                             0.3
Steel and steel          0.7                             0.5
Steel and wood           0.4                             0.2
Steel and Teflon         0.04                            0.04

When you set an object into motion, it is often possible to feel it “break
loose,” as we remarked earlier. In any case, it is usually easier to keep an
object in motion against contact friction than it is to start the motion.
Experiment shows that the magnitude $C_k$ of the kinetic contact friction force
acting on an object that is moving does not depend on its speed, if the
speed is not too large or too small. But $C_k$ does depend on the magnitude $N$
of the normal force applied to the object by the object in contact with it, and
on the coefficient of kinetic friction $\mu_k$ for the surfaces in contact,
according to the relation

$$C_k = \mu_k N \tag{4-23}$$

As is always true for a frictional force, the direction of the kinetic contact
friction force exerted by one object on another is opposite to the direction
of the relative motion it opposes. The value of $\mu_k$ for a given pair of
surfaces is usually less than the value of $\mu_s$ for that pair. You can see
this in the values quoted in Table 4-2.
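
The two regimes of contact friction can be combined in a short Python sketch for the box on a horizontal floor. It is only an illustration of Eqs. (4-22b) and (4-23): the function name `contact_friction`, its `sliding` flag, and the 50-kg box are assumptions, while the coefficients are the steel-and-wood values from Table 4-2.

```python
def contact_friction(applied, normal, mu_s, mu_k, sliding=False):
    """Magnitude (in N) of the contact friction force on the box.

    applied: magnitude A of the horizontal force applied to the box
    normal:  magnitude N of the normal force
    """
    if sliding:
        return mu_k * normal      # kinetic friction, Eq. (4-23)
    limit = mu_s * normal         # limiting static friction, Eq. (4-22b)
    if applied <= limit:
        return applied            # static friction just balances the push
    return mu_k * normal          # the box breaks free and slides


mass = 50.0                       # kg, hypothetical box
g = 9.8                           # m/s^2
normal = mass * g                 # N = mg on a horizontal floor

weak_push = contact_friction(100.0, normal, mu_s=0.4, mu_k=0.2)
strong_push = contact_friction(250.0, normal, mu_s=0.4, mu_k=0.2)

print(weak_push, strong_push)
```

With these numbers the limiting static force is about 196 N, so the 100-N push is balanced exactly, while the 250-N push sets the box sliding against a kinetic friction force of about 98 N.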


EXAMPLE  4-11 

A block of mass $m$ slides down a plane supported from the earth at an incline. The
angle $\theta_k$ between the plane and the horizontal is adjusted until the block slides with
constant speed. Find the coefficient of kinetic friction $\mu_k$. If $\theta_k$ is measured to be
35°, what is the value of $\mu_k$?

■ Since the block is moving, the applicable equation is $C_k = \mu_k N$. In order to use
this equation, you must first determine $N$. To do so, draw a diagram like Fig. 4-28
and define $x$ and $y$ coordinate axes. Just as in Fig. 4-8, it is convenient for you to
make these axes parallel to and normal to the plane. The forces acting on the block
are the gravitational force of magnitude $W = mg$ applied to it by the earth, the
normal force of magnitude $N$ applied to it by the plane, and the frictional force of
magnitude $C_k$ that the plane also applies to the block. The figure shows that the
gravitational force has two components, $W_x = mg \sin\theta_k$ and $W_y = -mg \cos\theta_k$. The
normal force has only a $y$ component, $N_y = N$. The only component of the
frictional force is an $x$ component, $C_{kx} = -C_k$.

Since the block does not accelerate in the $y$ direction, the sum of the $y$
components of the forces acting on it must be zero. Thus you have

$$W_y + N_y = 0$$

or

$$-mg \cos\theta_k + N = 0$$

So you have

$$N = mg \cos\theta_k$$

The block does not accelerate in the $x$ direction either, since the inclination of
the plane is adjusted so that its speed sliding along the plane is constant. Thus the
sum of the $x$ components of the forces acting on it must be zero, too. Therefore you
have

$$W_x + C_{kx} = 0$$

or, since $C_{kx} = -C_k$, with $C_k = \mu_k N$ and $N = mg \cos\theta_k$,

$$mg \sin\theta_k - \mu_k mg \cos\theta_k = 0$$

Solving for $\mu_k$, you obtain

$$\mu_k = \frac{\sin\theta_k}{\cos\theta_k}$$

or

$$\mu_k = \tan\theta_k \tag{4-24}$$

For the particular case $\theta_k = 35°$, you can immediately calculate

$$\mu_k = \tan 35° = 0.70$$


Example  4-11  suggests  a convenient  way  of  measuring  approximately 
the  coefficient  of  kinetic  friction  for  a pair  of  materials.  You  use  one  for  the 
surface  of  a plane  and  the  other  for  the  surface  of  a block,  and  then  you 
measure the angle $\theta_k$ for which the block slides down the plane at constant
speed. This angle is called the critical angle for kinetic friction. Then Eq.
(4-24) is used to determine the coefficient of kinetic friction $\mu_k$. Try it,
using  a coin  and  the  cover  of  this  book  for  an  inclined  plane. 

A similar method can be used to determine the coefficient of static
friction. The block is placed on the plane, and the inclination of the plane is
increased until the angle reaches the critical value $\theta_s$ just before motion
commences. At this point the component of the gravitational force pulling the
block down the inclined plane just equals the magnitude of the static
contact friction force opposing the component, and you have

$$\mu_s = \tan\theta_s \tag{4-25}$$
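
Equations (4-24) and (4-25) are easy to evaluate numerically. The Python sketch below converts a measured critical angle into a friction coefficient and back again; the two function names are hypothetical, and the only inputs taken from the text are the 35° angle of Example 4-11 and the steel-and-steel value of the static coefficient from Table 4-2.

```python
import math


def friction_coefficient_from_angle(angle_deg):
    """Coefficient of friction from a critical angle in degrees, mu = tan(theta)."""
    return math.tan(math.radians(angle_deg))


def critical_angle_deg(mu):
    """Critical angle in degrees for a given coefficient, theta = arctan(mu)."""
    return math.degrees(math.atan(mu))


mu_k = friction_coefficient_from_angle(35.0)   # Example 4-11
theta_s = critical_angle_deg(0.7)              # steel on steel, static, Table 4-2

print(f"mu_k = {mu_k:.2f}, theta_s = {theta_s:.1f} degrees")
```

The first result reproduces the value 0.70 found in Example 4-11, and the second shows that a static coefficient of 0.7 corresponds to a critical angle of very nearly 35°.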


It  is  easy  to  understand  the  origin  of  contact  friction,  at  least  in  a gen- 
eral way.  Imagine  that  two  blocks  of  metal  (say  steel  with  surfaces  of  ordi- 
nary cleanliness  and  smoothness)  are  put  together.  Most  surfaces,  even 
those  we  call  smooth  in  the  usual  sense,  are  extremely  rough  on  a 
molecular-size  scale;  see  Fig.  4-29.  Hence  the  “peaks”  of  the  two  surfaces 
which  first  come  into  mutual  contact  comprise  a very  small  part  of  the  total 
surface  area.  If  one  block  of  steel  is  gently  lowered  onto  the  other,  this  very 
small  initial  contact  area  cannot  possibly  support  the  weight  of  the  upper 
block;  the  pressure  is  well  past  the  yield  strength  of  steel.  The  peaks  thus 
deform,  allowing  a larger  and  larger  (but  still  very  small)  portion  of  the 
total  area  of  one  block  to  contact  the  adjacent  peaks  of  the  other. 

Under  the  still  high  pressure  present  at  the  areas  of  contact  between 
the deformed peaks, metals (especially similar metals) tend to “cold-weld”
together  unless  the  surfaces  are  quite  dirty.  That  is,  the  two  surfaces  are 
joined  by  a multitude  of  tiny  welds,  in  which  attractive  electric  forces  have 
bonded  molecules  from  one  surface  to  molecules  in  the  other.  These  welds 
must  be  broken  if  sliding  is  to  take  place.  This  is  one  of  the  sources  of  static 
friction.  In  addition,  there  is  always  some  interlocking  of  the  peaks  and 
valleys  of  the  surfaces.  If  sliding  is  to  begin,  these  protuberances  must  be 
deformed. 




Fig.  4-29  (a)  Photograph  of  a single-edged  razor  blade  (x  10).  (b)  Photograph  of  one  of  the 

surfaces  forming  the  edge  (x  200).  (Courtesy  General  Electric  Research  Center,  Schenectady,  N.Y.) 


With  this  picture  in  mind,  we  can  understand  why  the  measured  fric- 
tional force  is  independent  of  the  total  surface  area  in  contact  friction.  The 
reason is that the frictional force depends on the area that is actually in
contact. If the block in Fig. 4-30a is turned so that it rests on a side of greater
area, as in Fig. 4-30b, the total area of the surface peaks actually in contact
with  those  of  the  plate  it  rests  on  remains  constant.  The  inserts  in  the  fig- 
ures indicate  how  the  deformations  in  the  two  cases  automatically  arrange 
for  this  to  be  true.  Pressing  down  on  the  block  increases  the  actual  contact 
area  with  the  plate,  and  thus  the  frictional  force.  So  the  frictional  force  is 
proportional  to  the  normal  force.  When  the  block  has  been  set  into  motion 
across  the  plate,  cold  welds  are  constantly  being  broken  and  reformed.  But 
the  amount  of  cold  welding  present  at  any  time  is  reduced  below  the  static 
value,  so  the  coefficient  of  kinetic  friction  is  smaller  than  the  coefficient  of 
static  friction.  Finally,  the  presence  of  oil  or  grease  at  the  surfaces  in  contact 
prevents  welding  by  coating  them  with  an  inert  material.  This  reduces  both 
coefficients of friction. (Note in Table 4-2 that $\mu_s$ and $\mu_k$ are equal, and
very  small,  for  steel  in  contact  with  Teflon.  Teflon  is  extremely  inert  chemi- 
cally and  bonds  very  weakly  to  other  substances.  Thus  there  is  very  little 
cold  welding.) 

When  two  objects  press  together,  the  force  which  one  exerts  on  the 
other  has  a component  in  a direction  parallel  to  the  plane  of  contact — the 
frictional  force — if  the  objects  are  sliding  or  attempting  to  slide  past  each 



Fig.  4-30  (a)  The  smaller  surface  of  a 

block  rests  on  a plate.  Below  is  a micro- 
scopic view  showing  large  deformations 
of  the  peaks  of  both  surfaces  which 
are  in  contact.  For  each  square  centi- 
meter of  the  block's  surface,  the  area  in 
actual  contact  is  relatively  large  (though 
it  is  usually  a small  fraction  of  1 cm2). 
(b) The larger surface of the block rests
on  the  plate.  The  microscopic  view 
below  shows  that  the  deformations  of 
the  peaks  in  contact  are  smaller.  So  a 
relatively  smaller  area  is  in  actual  contact 
for  each  square  centimeter  of  the  block’s 
surface.  But  the  block's  surface  has  more 
square  centimeters.  In  both  cases  illus- 
trated in  parts  (a)  and  ( b ) the  total  area 
in  actual  contact  is  essentially  the  same. 


other.  There  is  also  a component  of  the  force  in  a direction  normal  to  the 
plane of contact (the normal force), which is always present. The picture
that  provides  some  understanding  of  the  properties  of  the  frictional  force 
also  helps  in  understanding  those  of  the  normal  force. 

When  you  apply  a force  to  an  object  by  pushing  on  it  in  a direction 
normal  to  its  surface,  the  reaction  force  you  feel  is  the  sum  of  the  repulsive 
electric  forces  exerted  on  very  many  molecules  at  the  surface  of  your  hand 
by  the  very  many  adjacent  molecules  at  the  surface  of  the  object.  Newton’s 
third  law  requires  that  the  sum  of  these  forces  always  be  just  enough  to  have 
the  same  magnitude  as  the  magnitude  of  the  force  you  apply.  How  does 
this  come  about?  It  is  the  same  mechanism  of  yielding  of  the  protuberances 
that  does  the  trick.  The  two  surfaces  in  contact  deform  until  the  actual  con- 
tact area  is  sufficient  to  produce  the  required  force.  The  forces  which  the 
two  objects  exert  on  each  other  when  they  are  in  contact,  and  which  act  in 
directions  normal  to  the  plane  of  contact,  are  often  called  contact  forces  in- 
stead of  normal  forces. 

Finally,  we  give  a brief  empirical  description  of  the  properties  of fluid 
friction,  which  exists  when  a solid  object  moves  through  a fluid.  A treatment 
of  its  origin  is  presented  in  Chap.  16.  The  fluid  friction  force,  also  called 
drag,  exerted  by  a liquid  or  a gas  on  a body  moving  through  it  always  has  a 
direction  opposite  to  the  direction  of  the  body’s  velocity  relative  to  the 
fluid,  and  a magnitude  D which  depends  on  the  magnitude  of  the  velocity. 
Experiment  shows  that  for  small  objects  moving  slowly  (like  a marble  drop- 
ping through  oil)  the  value  of  D is  proportional  to  the  value  of  v,  the  speed 
of  the  body  through  the  fluid.  For  spherical  objects,  the  measured  values  of 
D are  also  proportional  to  their  radii  r.  In  fact,  the  measured  magnitude  D 
of  the  fluid  friction  force  can  be  described  by  an  equation  known  as  Stokes’ 
law: 

$$D = 6\pi\eta r v \tag{4-26}$$

The proportionality constant $\eta$ (the Greek letter eta) is called the
coefficient of viscosity. The SI unit for $\eta$ is newton seconds per meter squared
(N·s/m²). Since 1 N/m² is the unit of pressure, which is called the pascal
(Pa), the unit for $\eta$ is often called the pascal-second and written as Pa·s.
Some typical values of $\eta$ are given in Table 4-3. Note the enormous range
of viscosities encountered in practice. An application of Stokes’ law is given
in Example 4-12.


Table 4-3

Viscosity of Liquids and Gases

Substance      Temperature (in °C)    $\eta$ (in N·s/m², or Pa·s)

Liquids
Acetone        0                      $4.0 \times 10^{-4}$
Glycerine      0                      $1.2 \times 10^{1}$
Helium         -272                   0
Pitch          0                      $6 \times 10^{10}$
Water          20                     $1.0 \times 10^{-3}$

Gases
Air            20                     $1.8 \times 10^{-5}$
Hydrogen       0                      $8.3 \times 10^{-6}$
Steam          100                    $1.2 \times 10^{-5}$



EXAMPLE  4-12 


The water droplets in a certain cloud in the atmosphere have radii $r = 5.0 \times 10^{-5}$ m.
How  fast  can  such  droplets  fall  through  the  atmosphere?  Assume  the  atmospheric 
temperature  to  be  20°C. 

■ According  to  Stokes’  law,  the  magnitude  D of  the  fluid  friction  force  acting  on  a 
falling  droplet  is  proportional  to  its  speed  v.  The  direction  of  this  force  is  upward 
since  the  velocity  of  the  droplet  is  downward.  As  the  droplet  falls,  its  speed  builds  up 
until  it  is  large  enough  that  the  value  of  D becomes  equal  to  the  magnitude  mg  of  the 
downward-directed  gravitational  force  acting  on  the  droplet,  whose  mass  is  m.  The 
net  force  acting  on  the  droplet  is  then  zero,  and  it  will  not  fall  any  faster  because 
Newton’s  second  law  then  requires  its  acceleration  to  be  zero.  When  this  is  the  case, 
you have $D = mg$ and, using Stokes’ law to evaluate $D$,

$$6\pi\eta r v = mg \tag{4-27}$$

For a spherical water droplet of radius $r$ and density $\rho$ (the Greek letter rho) the
mass is $\rho$ times its volume $\frac{4}{3}\pi r^3$. So

$$mg = \rho\,\frac{4\pi r^3}{3}\,g$$

Combining this with Eq. (4-27), you obtain

$$6\pi\eta r v = \frac{4\pi\rho r^3 g}{3}$$

or

$$v = \frac{2\rho r^2 g}{9\eta} \tag{4-28}$$

Table 4-3 gives the viscosity $\eta$ of air at 20°C to be $1.8 \times 10^{-5}$ N·s/m². The
density $\rho$ of water at that temperature is $1.0 \times 10^3$ kg/m³. Using these values, the
value $5.0 \times 10^{-5}$ m given for $r$, and the standard value of $g$, you obtain

$$v = \frac{2 \times 1.0 \times 10^3\ \text{kg/m}^3 \times (5.0 \times 10^{-5}\ \text{m})^2 \times 9.8\ \text{m/s}^2}{9 \times 1.8 \times 10^{-5}\ \text{N}\cdot\text{s/m}^2} = 3.0 \times 10^{-1}\ \text{m/s}$$


The  water  droplet  will  acquire  this  low  terminal  speed  quickly. 

Since  the  terminal  speed  v is  small,  a cloud  made  of  droplets  this  size  will  not 
descend  if  there  is  much  of  an  updraft  because  the  droplets  will  be  moving  more 
slowly  downward  with  respect  to  the  air  than  the  air  is  moving  upward  with  respect 
to  the  ground.  This  is  why  there  must  be  coalescence  of  cloud  droplets  before  an 
appreciable  amount  of  rain  can  fall. 
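
As a numerical check on Eq. (4-28), here is a minimal Python sketch that evaluates the terminal speed for the droplet of this example. The function name `stokes_terminal_speed` and its argument names are illustrative choices, not notation from the text. The inputs are the values quoted above, and the result is meaningful only while the flow stays smooth enough for Stokes' law to apply.

```python
def stokes_terminal_speed(radius, density, viscosity, g=9.8):
    """Terminal speed from Eq. (4-28), v = 2*rho*r^2*g / (9*eta), in SI units.

    radius    -- droplet radius r, in m
    density   -- density rho of the falling sphere, in kg/m^3
    viscosity -- viscosity eta of the surrounding fluid, in N*s/m^2
    g         -- magnitude of the gravitational acceleration, in m/s^2
    """
    return 2.0 * density * radius**2 * g / (9.0 * viscosity)


# Values from the example: water droplet falling through air at 20 C.
v_droplet = stokes_terminal_speed(radius=5.0e-5, density=1.0e3, viscosity=1.8e-5)
print(f"terminal speed of the droplet: {v_droplet:.2f} m/s")

# The speed scales as r^2, so a droplet of twice the radius falls four times as fast.
v_doubled = stokes_terminal_speed(radius=1.0e-4, density=1.0e3, viscosity=1.8e-5)
print(f"ratio for doubled radius: {v_doubled / v_droplet:.1f}")
```

Because the terminal speed is proportional to $r^2$, larger droplets fall much faster, which is consistent with the remark that cloud droplets must coalesce before appreciable rain can fall.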


Stokes’  law  is  valid  only  when  a small  enough  body  is  moving  slowly 
enough  that  the  fluid  through  which  it  moves  flows  past  the  body  in  a 
smooth,  orderly  way.  As  the  size  and/or  speed  of  the  body  increases,  in  due 
course  the  flow  of  fluid  past  the  body  becomes  disorderly  and  turbulent. 
For  example,  the  flow  of  air  past  an  automobile  moving  at  100  km/h  is 
quite  turbulent.  Turbulence  leads  to  a much  larger  fluid  friction  force,  in 
other words, a much larger drag. In fact, experiment shows that when
turbulence is present, the magnitude $D$ of the force is approximately
proportional to $v^2$, the square of the body’s speed. The experimental results are
described to a good approximation by the empirical law for turbulent flow:

$$D = \tfrac{1}{2}\rho_f A \delta v^2 \tag{4-29}$$

Here $\rho_f$ is the density of the fluid, $A$ is the cross-sectional area which the
body presents to the fluid, and $\delta$ is its coefficient of drag. The quantity $\delta$




Table 4-4

Coefficient of Drag for Various Objects

Shape of object                          Approximate value of $\delta$
Circular disk (broadside to stream)      1.2
Sphere                                   0.4
Streamlined airplane body                0.06


(lowercase Greek delta) is a dimensionless, empirical constant which
depends on the shape of the body, but is reasonably independent of $A$, $\rho_f$, and
$v$ over a fairly large range of these parameters. Some values of $\delta$ are given
in Table 4-4. At the end of Chap. 5 we go through a detailed analysis of the
motion of a skydiver falling through the air and experiencing a fluid
friction force given by Eq. (4-29).
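
To give a feeling for the sizes involved, the sketch below evaluates the turbulent-flow drag for the three shapes of Table 4-4. It assumes the conventional form of the law with a factor of one-half, $D = \tfrac{1}{2}\rho_f A \delta v^2$. The frontal area of 1.0 m², the air density of 1.2 kg/m³, and the speed of 100 km/h are illustrative inputs chosen here, and the names `turbulent_drag` and `DRAG_COEFFICIENTS` are hypothetical, not from the text.

```python
# Approximate drag coefficients, copied from Table 4-4.
DRAG_COEFFICIENTS = {
    "circular disk (broadside)": 1.2,
    "sphere": 0.4,
    "streamlined airplane body": 0.06,
}


def turbulent_drag(fluid_density, area, drag_coefficient, speed):
    """Drag magnitude D = (1/2) * rho_f * A * delta * v^2, in SI units.

    The factor 1/2 is the conventional one and is an assumption of this sketch.
    """
    return 0.5 * fluid_density * area * drag_coefficient * speed**2


AIR_DENSITY = 1.2                # kg/m^3, approximate sea-level value (assumed)
AREA = 1.0                       # m^2, illustrative cross-sectional area (assumed)
SPEED = 100.0 * 1000.0 / 3600.0  # 100 km/h converted to m/s

drag_by_shape = {
    shape: turbulent_drag(AIR_DENSITY, AREA, delta, SPEED)
    for shape, delta in DRAG_COEFFICIENTS.items()
}

for shape, drag in drag_by_shape.items():
    print(f"{shape:28s} D = {drag:6.1f} N")
```

With area, fluid density, and speed held fixed, the drag is simply proportional to $\delta$, so the broadside disk experiences about 20 times the drag of the streamlined body. Doubling the speed multiplies every entry by four.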


EXERCISES 

Group  A 

4-1. Mass versus weight; slugs versus pounds. In the
British engineering system of units, 1 slug is defined as the
mass of a body which experiences an acceleration of
1 ft/s² when a force of 1 lb acts on it. Neglecting air
resistance, a body falls under gravity with an acceleration of
magnitude g = 32 ft/s².

a.  A standard  loaf  of  bread  weighs  1 lb.  What  is  its 
mass  in  slugs? 

b.  If  a person  weighs  130  lb,  what  is  her  mass  in 
slugs? 

c.  Describe  how  to  calculate  the  mass  in  slugs  of  an 
object  of  known  weight. 

d.  What  is  the  weight  of  an  object  whose  mass  is  1 
slug? 

e. If a 1-lb loaf of bread were transferred to the
moon, its weight would be about 1/6 lb. What would be its
mass in slugs? Explain your answer.

4-2. Poetry in motion. Figure 4E-2 shows a piece of
equipment commonly used in lecture demonstrations.
The cart has a funnel containing a compressed spring. A
ball is set on top of the spring in the funnel. The cart moves
along a straight track with constant velocity. When the cart
passes over an upward projection on the track, a trigger
releases the spring and the ball is shot upward. Meanwhile
the cart continues along the track at the same speed.
Where will the ball land? Explain your answer.

Fig.  4E-2 


4-3.  Force,  mass,  and  acceleration.  A constant  force  of 
magnitude  1 N acts  on  a body  of  mass  1 kg  for  1 s.  If  the 
body  is  initially  at  rest,  how  far  has  it  moved  at  the  end  of 
the  second? 

4-4.  Cart  on  a string.  A toy  cart  of  mass  100  g lies  on  a 
variable-speed  turntable.  It  is  tied  to  the  central  shaft  of 
the  turntable  by  a string  whose  breaking  strength  is  5.0  N. 
The  distance  from  the  center  of  the  shaft  to  the  center  of 
the  cart  is  10  cm. 

a.  What  is  the  greatest  speed  with  which  the  center  of 
the  cart  can  move  as  the  turntable  turns,  if  the  string  is  not 
to  break?  (Rolling  friction  is  negligible.) 

b.  What  rate  of  turning  (in  revolutions  per  minute) 
would  produce  the  speed  obtained  in  part  a ? 

4-5.  On  the  head.  A carpenter  swings  a 3-kg  hammer 
so  that  its  speed  is  5 m/s  just  before  it  strikes  a nail.  The 
nail  is  driven  6 mm  into  a block  of  wood. 

a. Assuming that a constant force resisted the motion
of the nail, what must have been the magnitude of that
force?

b.  Compare  the  force  to  the  weight  of  the  hammer. 

4-6. Using signed scalars. Repeat the calculation of
Example 4-3 to find the magnitude of the force exerted
by a spring scale from which a ball of a given weight is
suspended, using signed scalar symbolism instead of vectors.
Be sure to define an appropriate coordinate axis, take
proper account of signs, and make clear distinctions
between directed quantities and their magnitudes.

4-7. Mass versus weight. Near the surface of the moon
an object falls under the influence of the moon's
gravitational attraction with an acceleration of magnitude 1.62
m/s². The mass of an astronaut standing on the moon is
115 kg, including his life-support equipment. How much




force does the moon exert on the astronaut and the
equipment? How much would the astronaut and equipment
weigh on earth?

4-8. Evaluating normal force. Determine the
magnitude of the force exerted on the block by the smooth
inclined plane of Example 4-4, if the mass of the block is
2.0 kg.

4-9. All aboard! A train, including its locomotive, has
a mass of 2000 metric tons = $2000 \times 10^{3}$ kg. Starting
from rest, the train acquires a speed of 2.0 m/s in 5.0 s.

a.  Assuming  that  a constant  net  force  acts  on  the 
train,  what  is  the  magnitude  of  that  force? 

b. Describe the forces that act on the locomotive
itself. Describe the forces that act on one of the other cars of
the train.

c.  Compare  the  net  force  in  part  a with  the  weight 
of  the  train. 

4-10. Investigating a collision, I. Show that the puck
collision  in  Fig.  4-12  is  inelastic  by  making  the  appropriate 
graphical  subtraction  of  velocity  vectors. 

4-11. Investigating a collision, II. Show that the puck
collision  in  Fig.  4-14  is  elastic  by  making  the  appropriate 
graphical  subtraction  of  velocity  vectors. 

4-12.  Investigating  a collision,  III.  Use  graphical 
methods  to  determine  whether  the  puck  collision  in  Fig. 
4-20  is  elastic  or  inelastic. 

4-13. Snowslide. A gondola (open freight car) of mass
20,000 kg is coasting on a siding at a speed of 10 m/s.
There is negligible rolling friction. All of a sudden
10,000 kg of snow falls from a snowbank into the car.
What  is  the  speed  of  the  gondola  after  this  snowslide? 

4-14. Muzzle “velocity.” A rifle is mounted vertically
with a large wooden ball balanced on its muzzle (the end of
the barrel). When the gun is fired, the bullet embeds itself
in  the  ball,  which  rises  1.00  m.  If  the  mass  of  the  wooden 
ball  is  1.00  kg  and  that  of  the  bullet  is  10.0  g,  with  what 
speed  did  the  bullet  leave  the  muzzle? 

4-15.  Momentum  conservation,  I. 

a. A 3.0-kg rifle is suspended to hang freely with its
barrel supported in a horizontal position. With what
speed does it recoil if it shoots an 11-g bullet with a speed
of 300 m/s?

b. Why should a rifle be held firmly against the
shoulder when it is fired?

4-16. Momentum conservation, II. A 1.0-kg car and a
car of unknown mass M are pushed together to compress
a spring between them (Fig. 4E-16). The cars are then
released. A movie is made of the experiment. It is
determined from the movie that the speed of the second car
immediately after release is one-fourth that of the 1.0-kg
car. What is the mass M?

Fig. 4E-16

4-17. Momentum nonconservation? A tennis ball is
dropped. Before it strikes the ground the value of its
momentum mv is negative, if the upward direction is taken
as positive. After it rebounds, it is moving upward so that
its momentum has a positive value. Its momentum is not
constant. Explain why this does not violate the law of
conservation of momentum.

4-18.  When  bat  meets  ball.  A baseball  thrown  with  a 
speed  of  40  m/s  is  hit  by  the  batter.  It  leaves  the  bat  with  a 
speed  of  70  m/s.  If  it  remains  in  contact  with  the  bat  for 
0.025  s,  what  is  the  average  force  exerted  by  the  bat  on  the 
ball?  The  mass  of  the  baseball  is  145  g. 

4-19.  Blasting  off.  A rocket  of  mass  10,000  kg  is  set 
for  takeoff.  The  speed  of  its  exhaust  gas  is  1500  m/s. 

a.  At  what  rate  must  burned  gas  be  ejected  to  give  the 
rocket  an  upward  acceleration  of  magnitude  g? 

b. Although the exhaust speed and ejection rate
remain constant, the acceleration of the rocket during a
later part of the burning is greater than its initial value.
Why?

4-20.  Force  and  momentum  change. 

a. What are the magnitude and direction of a
constant force that changes the momentum of a body from
10 kg·m/s east to 10 kg·m/s north in 2.0 s?

b.  If  the  same  force  continues  to  act,  what  will  be  the 
momentum  of  the  body  after  an  additional  time  interval 
of  2.0  s? 

4-21.  Ice  follies.  Two  young  boys  on  ice  skates  are  on  a 
frozen  pond.  One  boy  has  a rope  around  his  waist.  The 
second boy, initially at distance L from the first boy, holds
the  rope  taut  and  pulls  it  continually  hand  over  hand. 

a.  If  the  two  boys  have  the  same  mass,  what  happens? 

b.  When  the  two  boys  meet,  how  much  rope  will  have 
passed  through  the  second  boy’s  hands? 

4-22.  Stopping  distance.  A car  is  moving  with  speed  v. 

a. Prove that the minimum distance for stopping the
car without skidding is given by the expression

$$s_{\min} = \frac{v^2}{2\mu_s g}$$

where $\mu_s$ is the coefficient of static friction between the
tires and the road.

b. Find $s_{\min}$ if v = 25 m/s and $\mu_s$ = 0.75.

c.  Calculate  the  time  required  to  bring  the  car  to  rest. 

4-23.  Those  sudden  stops.  Explain  why  an  unsecured 
block  of  stone  will  shift  forward  on  a flatbed  trailer  if  the 
tractor-trailer  undergoes  gradual  starts  and  sudden  stops. 

4-24.  Terminal  speed.  A stainless-steel  ball  bearing  of 
diameter  1.0  mm  is  dropped  into  a deep  tank  of  glycerine 
whose  temperature  is  0°C.  The  density  of  stainless  steel  is 




$7.8 \times 10^{3}$ kg/m³. Assuming that viscous friction obeys
Stokes’  law,  use  Eq.  (4-28)  to  predict  the  speed  of  the  ball 
bearing  when  it  hits  the  bottom  of  the  tank. 


Group  B 

4-25. Collision: before and after. A freely moving ball
strikes a stationary ball of equal mass. The initial velocities
of the balls are $\mathbf{v}_{1i}$ and $\mathbf{v}_{2i} = 0$, respectively. After the
collision, the velocities are $\mathbf{v}_{1f}$ and $\mathbf{v}_{2f}$, respectively.

Fig.  4E-25 


a. Use the law of momentum conservation to obtain
an equation relating $\mathbf{v}_{1i}$, $\mathbf{v}_{1f}$, and $\mathbf{v}_{2f}$.

b. Show that $\mathbf{v}_{1i}$, $\mathbf{v}_{1f}$, and $\mathbf{v}_{2f}$ are confined to a single
plane.

c. Figure 4E-25 shows the initial and final velocities
of ball 1, $\mathbf{v}_{1i}$ and $\mathbf{v}_{1f}$. Copy this figure and use it to construct
graphically the vector $\mathbf{v}_{2f}$. Also construct the
parallelogram determined by $\mathbf{v}_{1f}$ and $\mathbf{v}_{2f}$.

d. What is the geometrical significance of $\mathbf{v}_{1i}$ in the
parallelogram which you have constructed?

e. What is the geometrical significance of the vector
$\mathbf{v}_{1f} - \mathbf{v}_{2f}$ in the parallelogram?

f. If the collision is elastic, then $|\mathbf{v}_{1f} - \mathbf{v}_{2f}| = |\mathbf{v}_{1i} - \mathbf{v}_{2i}|$.
What restriction does this place on the shape of the
parallelogram? What does the restriction imply about $\mathbf{v}_{2f}$?

4-26. Cruising. A jet plane is cruising at a steady
speed of 200 m/s. It is burning 3.0 kg/s of fuel. This
requires the intake of 80 kg/s of air. The products of
combustion are exhausted with a speed of 500 m/s relative to
the plane.

a.  What  is  the  thrust  (propulsive  force)  provided  by 
the  jet  engines? 

b.  How  large  is  the  drag  force  acting  on  the  plane? 

4-27.  Pitching  and  catching.  A standard  baseball  has  a 
mass  of  145  g. 

a.  A typical  major  league  pitcher  can  throw  a baseball 
at  a speed  of  about  40  m/s.  During  the  final  part  of  the 
pitching  motion  (the  delivery  stage),  the  ball  is  accelerated 
from  very  low  speed  to  its  final  speed  in  about  0.25  s. 
What  is  the  average  force  exerted  on  the  ball  during  the 
delivery  stage? 

b. Assume a catcher is going to stop the ball by
decelerating it with the application of a constant retarding
force of strength 200 N. How much time would be
required to bring the ball to rest? How far would the ball
travel before coming to rest? In your opinion, is the
assumed value of 200 N reasonable?

4-28. Great leap upward. A physicist of mass m finds
that by crouching and then springing upward he can
(temporarily) elevate himself a distance h above his tiptoe
standing height. The launch time (from the beginning of
the spring until his feet leave the floor) is $\Delta t$.

a.  What  must  be  the  velocity  of  the  “springer”  at  the 
instant  his  toes  leave  the  floor? 

b.  What  is  the  average  net  force  on  the  springer 
during  launch?  What  average  force  does  the  floor  exert  on 
the  springer  during  launch?  Express  these  forces  in  SI 
units  and  also  as  multiples  of  the  physicist’s  weight. 

c. A not-very-springy adult (whose mass m is 70 kg)
finds that his spring height is 0.40 m and his launch time is
0.10 s. Obtain numerical values for the quantities obtained
in parts a and b.

4-29. Dueling spheres. Two steel spheres are
suspended on very long cords from a common point. The
ratio of their masses is 1:3.

a. The smaller sphere is pulled to the left and
released. When it strikes the larger sphere, the small sphere
has velocity $v_0\hat{\mathbf{x}}$, and the line joining the centers of the
spheres is horizontal. Collisions between steel spheres are
almost perfectly elastic. With what velocities do the two
spheres rebound from the collision?

b. Suppose the larger sphere were pulled aside and
allowed to strike the (stationary) small sphere with velocity
$-v_0\hat{\mathbf{x}}$. What would the velocities of the spheres be
immediately after the collision?


4-30. Head-on and elastic. An object of mass $m_1$ strikes
a stationary object of mass $m_2$. The initial velocity of object
1 is $\mathbf{v}_{1i}$. The collision is perfectly elastic and is also a
head-on collision, so that the initial and final paths of the
objects all lie along a single line.

a. Prove that the final velocities $\mathbf{v}_{1f}$ and $\mathbf{v}_{2f}$ are given
by

$$\mathbf{v}_{1f} = \frac{m_1 - m_2}{m_1 + m_2}\,\mathbf{v}_{1i}$$

and

$$\mathbf{v}_{2f} = \frac{2m_1}{m_1 + m_2}\,\mathbf{v}_{1i}$$

b. Evaluate $\mathbf{v}_{1f}$ and $\mathbf{v}_{2f}$ for the case $m_1 = m_2$.

c. If $m_1 < m_2$, how does the direction of $\mathbf{v}_{1f}$ compare
with that of $\mathbf{v}_{1i}$?

d. If $m_1 \ll m_2$, what are the final velocities?

e. If $m_1 > m_2$, how do the directions of $\mathbf{v}_{1f}$ and $\mathbf{v}_{2f}$
compare with that of $\mathbf{v}_{1i}$?

4-31. Head-on collision. A massive object M moving
with speed $V_i$ encounters a small object of mass $m \ll M$
moving in the opposite direction with speed $v_i$. The
collision is perfectly elastic, and after the collision both objects
move in the original direction of M. Prove that the final
speed of the small object exceeds its initial speed by $2V_i$.

4-32. Puck collision, I. Dots 1 and 2 in Fig. 4E-32
represent the positions of the centers of air table pucks 1 and 2
in a strobe photo of a collision between the pucks. Puck 1
was incident from the left, striking puck 2, which was
initially stationary.

a.  Use  your  intuition  to  guess  whether  the  mass  of 
puck  1 is  larger  than  or  smaller  than  the  mass  of  puck  2. 




Fig.  4E-32 


b. Determine, as accurately as you can from a
graphical analysis, the ratio of the puck masses.

c.  Is  the  collision  elastic  or  inelastic? 



Fig.  4E-33 


4-33.  Puck  collision,  II.  Using  Fig.  4E-33,  carry  out  the 
analyses  called  for  in  Exercise  4-32. 




4-34. Measuring mass without a balance. A ball C is
allowed to roll freely down a curved incline, as in Fig.
4E-34. It strikes the floor at B. The point A is vertically
below the point where the ball leaves the incline. The line
AB therefore gives the direction of the velocity of the ball
when it leaves the incline. Another ball, D, is placed at rest
at the end of the incline directly above A, and ball C is
again released from the top of the incline. It strikes ball D
slightly off center. Ball C now lands on the floor at C', and
D lands at D'. The perpendicular distances C'C" and
D'D" from the straight line AD"C"B are measured. The
ratio C'C" : D'D" is found to be 2:1. What is the ratio of
the mass of C to D? Justify your result. (This experiment
exemplifies a method for determining mass from its
operational definition.)

4-35.  The  bouncing  ball.  A 0.10-kg  ball  falls  from  a 
height  of  1.0  m onto  a disk  equipped  with  a microswitch 
which  activates  an  electronic  timer.  The  ball  bounces  back 
up  from  the  disk,  reaching  a height  of  0.50  m.  The  timer 
indicates that the ball was in contact with the disk for 2.0
milliseconds = $2 \times 10^{-3}$ s.

a.  With  what  speed  did  the  ball  strike  the  disk? 

b.  With  what  speed  did  the  ball  leave  the  disk? 

c.  What  was  the  average  acceleration  of  the  ball 
during  its  contact  with  the  disk? 

d.  What  average  force  did  the  disk  exert  on  the  ball? 

e.  What  is  the  ratio  of  the  force  found  in  part  d to  the 
weight  of  the  ball? 

4-36. Springs end to end. A spring of force
constant $k_1$ and one of different force constant $k_2$ are
connected end to end. Find an expression for the effective
force constant $k'$ of the spring pair. Let $S$ be the
magnitude of the total restoring force exerted by the pair. Then
you have $S = k'|l - l_0|$, where $|l - l_0|$ is the magnitude of
its total change in length.




4-37.  Springs  side  by  side.  The  ends  of  two  identical 
springs,  each  having  force  constant  k,  are  joined  by  two 
rigid  bars,  as  illustrated  in  Fig.  4E-37.  Find  an  expression 
for  the  effective  force  constant  k"  of  the  spring  pair, 
as  suggested  in  Exercise  4-36. 


Fig.  4E-37 


4-38. An unusual sundeck. An eccentric astronomer
plans to use the dome of his observatory as a sun deck. Let
$\mu_s$ be the coefficient of static friction between the dome
surface and the blanket he plans to use.


Fig.  4E-38 


a. How far from the top of the dome can he place his
blanket without sliding off the dome? Give your answer as
the maximum angular distance $\theta$ from the “pole” of the
dome. See Fig. 4E-38.

b. Evaluate your result for $\mu_s$ = 0.30.

4-39. On the ramp. A ramp is constructed with a
parabolic shape such that the height y of any point on its
surface is given in terms of the point’s horizontal distance x
from the bottom of the ramp by $y = x^2/2L$. A block of
granite is to be set on the ramp; the coefficient of static
friction is $\mu_s$.

a. What is the maximum x coordinate $x_M$ at which the
block can be placed on the ramp and remain at rest? What
is the corresponding height $y_M$?

b. Evaluate your answers for the case L = 10 m and
$\mu_s$ = 0.80.

4-40.  Terminal  speed  when  friction  involves  turbulence. 
Derive  an  equation  for  the  terminal  speed  of  a body 
moving  through  a fluid  under  circumstances  in  which 
viscous  friction  obeys  the  law  for  turbulent  flow.  Compare 
your  results  with  Eq.  (4-28). 

4-41. Geronimo! Use the equation derived in Exercise
4-40 to estimate the speed at which a parachutist hits the
earth at a location where the altitude is approximately
that of sea level. Take the mass of the parachutist plus the
parachute to be 100 kg, the diameter of the parachute to
be 10 m, and the coefficient of drag of the parachute to
have the approximate value 1.2 quoted in Table 4-4 for a
circular disk. The density of air at sea level is about 1.2
kg/m³.

Group  C 

4-42. Trapped! In Fig. 4E-42 the vertical block rests
on a wedge and is held in equilibrium between two guides
G and G', with the help of a horizontal push P on the
wedge. All surfaces are highly lubricated, so frictional
forces can be neglected. The magnitude of the force W
is the weight of the vertical block, and the wedge has
opening angle $\theta$, as shown.


Fig.  4E-42 


a. Find the required force P in terms of W and $\theta$.

b.  What  prevents  the  wedge  from  accelerating  to  the 
right? 

c.  What  prevents  the  vertical  block  from  accelerating 
to  the  right? 

4-43.  Cart-walking,  I.  A 50-kg  girl  stands  at  the  left 
end  of  the  platform  of  a 100-kg  cart  with  small,  well- 
lubricated  wheels.  The  cart  is  initially  motionless,  but 
rolling  friction  between  the  wheels  and  the  ground  is 
negligible. 

a. The girl begins walking steadily toward the right
end  of  the  cart.  Her  velocity  relative  to  the  cart  is  2.0  m/s. 
What  is  the  velocity  of  the  cart  with  respect  to  the  ground? 

b.  What  is  the  velocity  of  the  girl  with  respect  to  the 
ground? 

c. How long will it take her to reach the right end of
the  cart,  which  is  4.0  m long? 

d.  How  far  will  she  have  moved  with  respect  to  the 
ground? 

e.  How  far  will  the  cart  have  moved  with  respect  to 
the  ground? 

f.  If  the  girl  stops  walking  when  she  reaches  the  right 
end  of  the  cart,  what  will  be  the  final  velocity  of  the  cart 
(and  girl)  with  respect  to  the  ground? 

4-44. Cart-walking, II. Consider again the girl and
cart described in Exercise 4-43. The cart is initially
stationary, and the girl approaches the cart from the left,
walking steadily along the ground at a speed of 2.0 m/s.
When she reaches the cart, she steps up without
hesitating. She stands at the left end of the platform very
briefly and then begins walking to the right at a speed of 2.0




m/s  with  respect  to  the  cart  surface.  When  she  reaches  the 
right  end,  she  steps  off  without  hesitating  and  continues 
walking  away. 

Carefully  analyze  the  motion  of  the  girl  and  the  cart 
in  order  to  answer  the  questions  below.  As  you  work,  state 
your  assumptions  clearly.  Comment  on  any  differences 
between  your  results  here  and  the  corresponding  results 
for  Exercise  4-43. 

a. Just after the girl steps up, but before she starts
walking across the cart, what is the velocity of the cart
relative to the ground?

b.  While  the  girl  is  walking  across  the  cart,  what  is  the 
cart’s  velocity  relative  to  the  ground? 

c.  While  she  is  walking  across  the  cart,  what  is  the 
girl’s  velocity  relative  to  the  ground? 

d.  How  long  does  it  take  for  the  girl  to  reach  the  right 
end  of  the  cart? 

e.  Relative  to  the  ground,  how  far  does  the  girl  move 
during  her  walk  across  the  cart? 

f.  Relative  to  the  ground,  how  far  does  the  cart  move 
while  the  girl  is  on  it? 

g. After the girl has stepped down, what is the
velocity of the cart relative to the ground?

4-45. Collision course. Bodies A and B of mass $m_A$ and
$m_B$, respectively, approach each other and collide. The
initial velocities are given by $\mathbf{v}_{Ai}$ and $\mathbf{v}_{Bi}$, respectively. The
final velocities are $\mathbf{v}_{Af}$ and $\mathbf{v}_{Bf}$. All velocities are confined
to the xy plane, and velocity measurements indicate the
following speeds and directions. The angles $\phi$ are the
angles between the positive x axis and the vectors, with
the positive sense being toward the positive y axis.

$\mathbf{v}_{Ai}$: $v_{Ai}$ = 10.0 m/s; $\phi_{Ai}$ = 110.0°
$\mathbf{v}_{Bi}$: $v_{Bi}$ = 20.0 m/s; $\phi_{Bi}$ = 50.0°
$\mathbf{v}_{Af}$: $v_{Af}$ = 10.0 m/s; $\phi_{Af}$ = 70.0°
$\mathbf{v}_{Bf}$: $v_{Bf}$ = 16.0 m/s; $\phi_{Bf}$ = 54.6°

a. Evaluate $|\mathbf{v}_{Bf} - \mathbf{v}_{Af}|$ and compare it with
$|\mathbf{v}_{Bi} - \mathbf{v}_{Ai}|$. Was the collision elastic?

b. What is the mass ratio $m_B/m_A$?

c. Based on your result for part b, what would have
been the common final velocity if bodies A and B had
experienced a completely inelastic collision, starting from
the initial velocities given?

4-46. Recoil in a cloud chamber. Particle A in Fig. 4E-46
is a fast alpha particle (helium nucleus of 4 atomic mass
units) that is traveling through a “cloud chamber,” which
records the trajectories of electrically charged particles. At
point P, particle A collides with B, a particle of unknown
mass. As shown in the figure, the recoiling particles A and
B produce tracks PA' and PB' at angles with the original
direction of motion of A. The speeds $v_{Bf}$ and $v_{Af}$ of the
particles after collision, as inferred from the appearance
of their tracks, have the ratio $v_{Bf}/v_{Af}$ = 0438|. Determine
the mass of particle B in atomic mass units. Can you guess
the nature of the particle from your result?

Fig. 4E-46

4-47. Hanging by a thread. A sphere of mass M is
suspended by a thread. A second thread is attached to the
bottom of the sphere. Two experiments are performed.
First, the lower thread is pulled downward with a very
gradually increasing force, so that the sphere has
negligible acceleration. It is observed that the upper thread
breaks first. Then the sphere is suspended just as before.
Now the lower thread is jerked downward, so that if the
lower thread did not break, the sphere would have a large
downward acceleration. However, it is found that the
lower thread breaks.

a. Write an equation relating the acceleration of the
sphere to the forces acting on it. Let $T_u$ and $T_l$ represent
the tensions in the upper and lower threads, respectively.

b. Use this equation to explain the differing
outcomes of the two experiments.

4-48. The sand is running out. Figure 4E-48 shows an
equal-arm balance suspended from a knife edge. The
upper part of the vessel on the right contains sand which
is pouring out of the opening at A at a steady rate c. (The
dimensions of the constant c are mass per unit time.)
Before the stopcock at A was opened, the balance was leveled
by placing the appropriate weights in the left pan. The
balance was then clamped, and the stopcock was opened.
Once the sand was flowing at a steady rate, the balance was
unclamped. It was observed that the balance remained
level, as shown.

Fig. 4E-48

a. How long does it take each grain of the sand to fall
from A to B, a distance h?

b. What is the mass of the falling stream of sand
between A and B?

c. What is the weight of the stream?

d. What is the velocity of the falling sand as it reaches
point B?

e. What momentum per unit time is given by the
sand stream to the bottom of the vessel at B?




f.  What  is  the  force  exerted  on  the  bottom  of  the 
vessel  by  the  sand  stream  as  the  sand  grains  come  to  rest? 

g.  Why  does  the  balance  remain  level? 

h.  Describe  qualitatively  the  motion  of  the  balance 
as  the  sand  runs  out. 

4-49. Where will the slippage be? Three numbered
blocks (1 is a rectangular slab, and 2 and 3 are triangular
wedges) are stacked as shown in Fig. 4E-49. The blocks
have masses $m_1$, $m_2$, and $m_3$, respectively. Interface A
(between block 1 and block 2) is characterized by
coefficient of static friction $\mu_A$, and interface B (between block 2
and block 3) is characterized by $\mu_B$. The surfaces
comprising interface B are inclined at an angle $\theta_B$ with the
horizontal.


a. Block 3 is pulled to the right with a small
acceleration of magnitude a. No slippage occurs along either
surface. What is the magnitude of the frictional force along
interface A? What is the magnitude of the frictional force
along surface B? For each interface, find the ratio of the
actual frictional force to the maximum available force of
static friction.

b.  Block  3 is  now  pulled  to  the  right  with  a gradually 
increasing  acceleration,  until  slippage  occurs  along  one  of 
the  two  interfaces.  Where  does  the  slippage  occur?  That 
is,  under  what  conditions  does  the  slippage  first  occur 
along  interface  A , and  under  what  conditions  does  it  occur 
along  interface  B? 

c. Starting once more from rest, block 3 is pushed to
the left with gradually increasing acceleration until
slippage occurs. Describe slippage conditions in this case.

d. Suppose $\mu_A$ = 0.50 and $\mu_B$ = 0.80. For what
values of $\theta_B$ does slippage first occur along interface B
when block 3 is pulled to the right and along interface A
when block 3 is pushed to the left?

4-50. Down the incline. A line of n identical
rectangular wooden blocks (each of mass m) is sitting on a smooth
incline at an angle $\alpha$ with the horizontal. (See Fig. 4E-50.)
The evenly spaced blocks are of length d and are
separated by gaps of length $l$, which is also the distance from
the nth block to the bottom of the ramp. The coefficient of
static friction between each block and the incline is
$\mu_s > \tan\alpha$, but the coefficient of kinetic friction is
$\mu_k < \tan\alpha$.


a. Suppose the top block (block 1) is given a small
initial velocity to start it down the incline. What is its
acceleration during the time interval before it strikes block 2? How
much time elapses before it strikes block 2? With what
speed does it strike block 2?

b.  If  the  collisions  between  blocks  are  completely 
elastic,  what  is  the  speed  of  the  jth  block  just  after  it  has 
been  struck,  and  again  just  before  it  strikes  block  j + 1? 
How  much  time  elapses  between  the  start  of  block  1 and 
the  instant  when  block  n reaches  the  end  of  the  ramp? 

c.  Compare  and  contrast  your  results  for  part  b with 
the  motion  of  a solitary  block  started  down  an  empty  ramp. 
In  which  case  does  a block  reach  the  bottom  end  of  the 
ramp  in  a shorter  time?  In  which  case  is  the  speed  of  the 
block  greater  as  its  front  end  reaches  the  bottom  of  the 
ramp? 

d.  Suppose  the  collisions  between  blocks  are  totally 
inelastic,  so  that  after  block  j is  struck,  blocks  1 through  j 
move as a unit. What is the velocity of block j just after it is
struck,  and  again  just  before  it  strikes  block  j +1?  How 
much  time  elapses  before  the  front  end  of  the  nth  block 
reaches  the  bottom  of  the  ramp? 

e.  Compare  and  contrast  the  results  of  part  d with  the 
results  for  parts  b and  c. 

4-51.  Unscheduled  stop.  A flatbed  truck  driver  is 
hauling  a large  block  of  granite.  He  is  driving  along  on 
level  ground  at  speed  v0  when  he  rounds  a curve  and  sees 
a disabled  car  a distance  S0  ahead  blocking  the  road. 

a.  Can  the  trucker  stop  without  causing  the  load  to 
shift  forward?  The  coefficient  of  static  friction  between 
the granite and the flatbed is $\mu_s$.

b.  Evaluate your results for $v_0 = 30$ m/s, $S_0 = 100$ m,
and $\mu_s = 0.50$.

c.  Suppose the trucker requires reaction time $\Delta t$ before the
brakes are applied. Modify the result of part a to allow for this.

d.  Evaluate your results for the values of $v_0$, $S_0$, and $\mu_s$
given in part b if $\Delta t = 0.50$ s.




5

Applications  of 
Newton's  Laws 


5-1  THE FREE-BODY DIAGRAM

In this chapter we apply Newton’s laws, developed in Chap. 4, to the
analysis of the motion of systems of practical interest. The techniques
to be introduced here are widely useful in almost every field of physics
and engineering. As in Chaps. 2 and 3, we restrict our attention to
systems consisting of bodies whose motion can be studied in terms of
changes in location only. The study of motion which must be described
in terms of changes in orientation is deferred until Chaps. 9 and 10.

Specifically,  in  this  chapter  we  apply  Newton's  second  law,  usually  in 
the  form 


F = ma  (5-1) 

to a study of the motion of bodies in a variety of systems. In doing so,
it is always necessary to have the answers to two questions clearly in
mind: (1) A system may consist of several parts, each of which has a
certain mass and acceleration. Equation (5-1) must be applied separately
to each part. Precisely what are these parts? (2) The quantity F in
Eq. (5-1) is the net force acting on each of the parts of the system
which have been precisely defined in answering question 1. Thus for each
part F is the vector sum of all the forces acting on the part. What are
the magnitudes and directions of all these forces?

In this section, we develop a systematic way of answering these
questions. To begin with, consider the stationary system consisting of
body 1 and body 2 in Fig. 5-1. Body 1 rests on body 2, which in turn
rests on a tabletop. The mass of body 1 is $m_1$, and the mass of body 2
is $m_2$. Even in this relatively simple case there is the possibility
of confusion if we are not careful. All the forces having to do with the
system are shown, and there are eight of them. First, there is the force
$\mathbf{W}_1$ exerted on body 1 by the gravitational




Fig. 5-1  A system consisting of two bodies rests on a table. The
gravitational force exerted on body 1 by the earth is $\mathbf{W}_1$;
its reaction force is $-\mathbf{W}_1$, which is exerted on the earth by
body 1. The corresponding forces $\mathbf{W}_2$ and $-\mathbf{W}_2$ for
body 2 are also shown. Exerted on body 2 by body 1, due to its weight,
is a downward force $\mathbf{D}_{\text{on 2 by 1}}$; the reaction force
is the upward support force $\mathbf{U}_{\text{on 1 by 2}}$. Exerted on
the table by body 2 is a downward force $\mathbf{D}_{\text{on T by 2}}$;
the reaction force is the upward support force $\mathbf{T}$ exerted by
the table. The dashed boxes indicate the division of the system into two
parts for separate treatment, as shown in Fig. 5-2.


attraction of the earth. The reaction force to $\mathbf{W}_1$ is the
force $-\mathbf{W}_1$ acting on the earth. Body 1 is supported by
body 2, which exerts on it an upward force
$\mathbf{U}_{\text{on 1 by 2}}$, and there is a downward reaction force
$\mathbf{D}_{\text{on 2 by 1}}$ exerted on body 2 by body 1. The other
forces shown in the diagram, and defined in the figure caption, can be
accounted for similarly.

In  order  to  determine  precisely  which  forces  are  acting  on  what  body, 
we  must  answer  question  1 posed  above:  What  are  the  parts  of  the  system? 
One  possible  division  into  parts  is  indicated  by  the  dashed  boxes  in  Fig.  5-1, 
where  we  choose  to  consider  each  of  the  two  bodies  separately. 

Once this choice has been made, it is useful to emphasize it by
separating the bodies which make up the entire system and drawing an
individual force diagram for each. This is done in Fig. 5-2, which thus
provides an answer for question 2 above. The two diagrams are called
free-body diagrams. Each of the bodies can be idealized as a point,
since only its mass, not its size or shape, is significant in applying
Newton’s second law to treat the change in location of the body.
Figure 5-2a shows all the forces acting on body 1. It does not show the
reaction forces associated with these forces, since they are not
relevant to the motion (or, in this special case, the nonmotion) of
body 1. There are just two forces of interest for body 1. These are the
gravitational force $\mathbf{W}_1 = m_1\mathbf{g}$, which acts downward,
and the upward support force $\mathbf{U}_{\text{on 1 by 2}}$ exerted by
body 2.

Figure 5-2b is a similar picture of all the forces acting on body 2.
There are three. They are the gravitational force
$\mathbf{W}_2 = m_2\mathbf{g}$, the upward support force $\mathbf{T}$
exerted by the table, and the downward force
$\mathbf{D}_{\text{on 2 by 1}}$ exerted by body 1. (Note that while
$\mathbf{U}_{\text{on 1 by 2}}$ and $\mathbf{D}_{\text{on 2 by 1}}$ are
an action-reaction pair, only one of the pair is significant in each
free-body diagram.)

Fig. 5-2  Separate free-body diagrams for the two bodies comprising
the system of Fig. 5-1: (a) body 1, (b) body 2. The construction of the
diagrams is explained in the text.

Once the free-body diagram or diagrams have been constructed for a
system, it is possible to use Newton’s second law to analyze the motion
of the system. In the present case this is relatively simple, since the
entire system is at rest. We can therefore write $\mathbf{a} = 0$ for
either body. The only remaining task, then, is to find the magnitudes of
the unknown forces $\mathbf{U}_{\text{on 1 by 2}}$,
$\mathbf{D}_{\text{on 2 by 1}}$, and $\mathbf{T}$. Applying Newton’s
second law to body 1 gives

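$$\mathbf{F}_1 = m_1\mathbf{a} = 0$$

for the net force $\mathbf{F}_1$ acting on body 1. Evaluating
$\mathbf{F}_1$ from the free-body diagram of Fig. 5-2a leads to the
equation

$$\mathbf{U}_{\text{on 1 by 2}} + \mathbf{W}_1 = 0$$

or

$$\mathbf{U}_{\text{on 1 by 2}} = -\mathbf{W}_1 = -m_1\mathbf{g}$$

Likewise, the net force $\mathbf{F}_2$ acting on body 2 is
$\mathbf{F}_2 = m_2\mathbf{a} = 0$. Thus the free-body diagram of
Fig. 5-2b yields
$\mathbf{T} + \mathbf{W}_2 + \mathbf{D}_{\text{on 2 by 1}} = 0$, or

$$\mathbf{T} = -\mathbf{W}_2 - \mathbf{D}_{\text{on 2 by 1}}$$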

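The two bodies, considered in terms of their free-body diagrams, are
independent. However, Newton’s third law provides a link between them
through the fact that $\mathbf{U}_{\text{on 1 by 2}}$ and
$\mathbf{D}_{\text{on 2 by 1}}$ comprise an action-reaction pair.
We thus have

$$\mathbf{D}_{\text{on 2 by 1}} = -\mathbf{U}_{\text{on 1 by 2}}$$

or

$$\mathbf{D}_{\text{on 2 by 1}} = +\mathbf{W}_1$$

The force $\mathbf{T}$ exerted by the table can thus be written

$$\mathbf{T} = -\mathbf{W}_2 - \mathbf{W}_1 = -m_2\mathbf{g} - m_1\mathbf{g}$$

or

$$\mathbf{T} = -(m_2 + m_1)\mathbf{g}$$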

The results thus obtained in this simple case are consistent with
intuition. The magnitude of the upward force
$\mathbf{U}_{\text{on 1 by 2}}$ is equal to the weight $m_1 g$ of
body 1, which is supported by this force. Body 2 is subject to a
downward force $\mathbf{D}_{\text{on 2 by 1}}$ whose magnitude is also
equal to the weight of body 1. And the table exerts an upward force
$\mathbf{T}$ on body 2 whose magnitude is equal to the combined weights
$(m_2 + m_1)g$ of the two bodies. Beginning with Fig. 5-1, how can you
draw a free-body diagram so as to obtain this result in a single step?
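
The bookkeeping for the two stacked bodies can also be traced
numerically. The following is a minimal sketch, not part of the original
text. It assumes signed scalars with the upward direction positive, a
value g = 9.8 m/s² for the magnitude of the gravitational acceleration,
and illustrative masses of 3.0 kg and 5.0 kg; the function name
`stacked_forces` is hypothetical.

```python
G = 9.8  # m/s^2, assumed magnitude of the gravitational acceleration


def stacked_forces(m1, m2, g=G):
    """Forces for body 1 resting on body 2, which rests on a table.

    Signed scalars, upward positive. Returns the tuple
    (U_on_1_by_2, D_on_2_by_1, T).
    """
    w1 = -m1 * g                  # weight of body 1 (downward)
    w2 = -m2 * g                  # weight of body 2 (downward)
    u_on_1_by_2 = -w1             # body 1 at rest: U + W1 = 0
    d_on_2_by_1 = -u_on_1_by_2    # Newton's third law: D = -U
    t = -w2 - d_on_2_by_1         # body 2 at rest: T + W2 + D = 0
    return u_on_1_by_2, d_on_2_by_1, t


# Illustrative masses (assumed values, in kg)
u, d, t = stacked_forces(3.0, 5.0)
print(f"U = {u:.1f} N, D = {d:.1f} N, T = {t:.1f} N")
```

The support force T comes out equal in magnitude to the combined weight
of the two bodies, as found above.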


EXAMPLE 5-1

The two bodies discussed above are laid in contact with each other on an
air table, as shown in Fig. 5-3a. The experimenter pushes them to the
right by exerting on body 2 a contact force $\mathbf{C}$ of magnitude
7.0 N. If the mass of body 1 is $m_1 = 3.0$ kg and the mass of body 2 is
$m_2 = 5.0$ kg, find the force $\mathbf{R}_{\text{on 1 by 2}}$ exerted
to the right on body 1 by body 2 and the acceleration $\mathbf{a}$ of
the two bodies.

■ You begin by separating the system in imagination into the two
separate bodies, and you construct a free-body diagram for each, as in
Fig. 5-3b and c. In these diagrams, the gravitational forces exerted on
each body and the support forces exerted by the air table on each are
shown as dashed vectors. The two bodies remain always on the air table
surface, and thus do not accelerate in the vertical direction. Both
intuition and the discussion immediately preceding this example tell you
that the vertical forces on each body add to zero. Since the forces of
interest in this example are all horizontal, you need not consider the
vertical forces further.


Fig. 5-3  (a) Illustration of the system discussed in Example 5-1.
(b) Free-body diagram for body 2. (c) Free-body diagram for body 1.


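Just as in the discussion of the stationary system immediately preceding
this example, you next apply Newton’s second law to body 1. In this
case, however, the acceleration is not zero, and you have

$$\mathbf{F}_1 = \mathbf{R}_{\text{on 1 by 2}} = m_1\mathbf{a}$$

for the net force $\mathbf{F}_1$ acting on body 1. Similarly, the
free-body diagram for body 2 gives you

$$\mathbf{F}_2 = \mathbf{C} + \mathbf{L}_{\text{on 2 by 1}} = m_2\mathbf{a}$$

for the net force $\mathbf{F}_2$ acting on body 2, where
$\mathbf{L}_{\text{on 2 by 1}}$ is the force exerted to the left on
body 2 by body 1. You next link the equations obtained from the two
free-body diagrams by noting that $\mathbf{L}_{\text{on 2 by 1}}$ and
$\mathbf{R}_{\text{on 1 by 2}}$ comprise an action-reaction pair, so
that

$$\mathbf{L}_{\text{on 2 by 1}} = -\mathbf{R}_{\text{on 1 by 2}} = -m_1\mathbf{a}$$

Thus you have for $\mathbf{F}_2$ the equation

$$\mathbf{F}_2 = \mathbf{C} - m_1\mathbf{a} = m_2\mathbf{a}$$

This can be solved for the acceleration $\mathbf{a}$, giving

$$\mathbf{a} = \frac{\mathbf{C}}{m_2 + m_1} \tag{5-2}$$

Inserting the given numerical values into Eq. (5-2), you obtain

$$\mathbf{a} = \frac{7.0\ \text{N}}{5.0\ \text{kg} + 3.0\ \text{kg}}\,\hat{\mathbf{C}} = 0.88\ \text{m/s}^2 \text{ in the direction of } \mathbf{C}$$

You can now work backward and evaluate $\mathbf{R}_{\text{on 1 by 2}}$,
the force which body 2, the directly pushed body, exerts on body 1 to
accelerate it. You have

$$\mathbf{R}_{\text{on 1 by 2}} = m_1\mathbf{a} = 3.0\ \text{kg} \times 0.88\ \text{m/s}^2 \times \hat{\mathbf{C}} = 2.6\ \text{N in the direction of } \mathbf{C}$$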
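
The arithmetic of Example 5-1 can be reproduced with a few lines of
code. This is a minimal sketch, not part of the original text; it treats
all quantities as signed scalars along the direction of the push, and
the function name `pushed_pair` is hypothetical.

```python
def pushed_pair(c, m1, m2):
    """Two bodies in contact on a frictionless surface, pushed on body 2.

    c  : magnitude of the applied contact force on body 2 (N)
    m1 : mass of body 1, the body pushed by body 2 (kg)
    m2 : mass of body 2, the directly pushed body (kg)
    Returns (a, r_on_1_by_2), both measured along the direction of c.
    """
    a = c / (m1 + m2)        # Eq. (5-2)
    r_on_1_by_2 = m1 * a     # Newton's second law applied to body 1
    return a, r_on_1_by_2


# Values given in Example 5-1
a, r = pushed_pair(c=7.0, m1=3.0, m2=5.0)
print(f"a = {a:.3f} m/s^2, R_on_1_by_2 = {r:.3f} N")

# Consistency check on body 2: C - R must equal m2 * a
assert abs((7.0 - r) - 5.0 * a) < 1e-12
```

Carrying unrounded values through gives 0.875 m/s² and 2.625 N, which
round to the two-figure results quoted in the example.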


In  the  following  two  sections,  the  method  of  the  free-body  diagram  is 
applied  to  more  complicated  systems,  where  its  usefulness  becomes  still 
more  apparent. 


5-2  ATWOOD’S MACHINE AND SIMILAR SYSTEMS

Many important physical systems consist of several parts, each of which
is acted on by one or more external forces. However, the parts of such a
system are linked so that they cannot move independently of one another.
In this section, we begin the study of such systems by considering the
relatively simple device illustrated in Fig. 5-4. This device is called
Atwood’s machine in honor of its inventor. (In the nomenclature of the
time, the word “machine” was applied to any mechanical device, whether
or not it performed a useful task.)




Fig.  5-4  Atwood’s  machine.  Ideally, 
the  system  is  frictionless,  the  string  is 
inextensible,  and  the  string  and  pulley 
are  both  massless. 



Fig.  5-5  Free-body  diagrams  for  the 
bodies  of  mass  m1  and  m2  which  are  part 
of  the  Atwood  machine  of  Fig.  5-4. 


The machine consists of two bodies of mass $m_1$ and $m_2$, attached to
the ends of a limp but inextensible string whose mass is negligible
compared to theirs. The string runs over a pulley. It is assumed that
the pulley also has negligible mass and that the friction of the bearing
on which it turns is negligible. (These ideal conditions can be
approximated reasonably well in actual systems.)

If $m_1$ and $m_2$ are not equal, the system will begin to move as soon
as it is released. The essential problem is to find the accelerations of
the two bodies. Once the accelerations are known, the rules of
kinematics developed in Chaps. 2 and 3 can be applied to find the
velocity and position of the bodies at any time.

We begin the analysis of the motion of the Atwood machine, as viewed in
an inertial reference frame, by drawing the necessary free-body
diagrams, as in Fig. 5-5. Since the two bodies will move in different
directions, it is best to treat each as a separate free body. In the
ideal Atwood machine, the string and the pulley are massless, and we
need not draw free-body diagrams for them. In Fig. 5-5, $\mathbf{S}_1$
and $\mathbf{S}_2$ are the forces exerted on $m_1$ and $m_2$,
respectively, by the string. The gravitational forces exerted on the two
bodies are also shown; they are, respectively, $m_1\mathbf{g}$ and
$m_2\mathbf{g}$. (At this point we do not know the magnitudes of the
various forces, so only the directions of the vectors should be taken
literally.)

We can now write Newton’s second law, Eq. (5-1), for each of the two
bodies separately. Assigning the symbol $\mathbf{a}_1$ to the
acceleration of $m_1$ and the symbol $\mathbf{a}_2$ to the acceleration
of $m_2$, we have

$$\mathbf{S}_1 + m_1\mathbf{g} = m_1\mathbf{a}_1 \tag{5-3a}$$

$$\mathbf{S}_2 + m_2\mathbf{g} = m_2\mathbf{a}_2 \tag{5-3b}$$

These two equations contain four unknowns, $\mathbf{S}_1$,
$\mathbf{S}_2$, $\mathbf{a}_1$, and $\mathbf{a}_2$. We thus need two
more equations to solve the problem of finding the accelerations. These
are provided by the physical situation. First, the length of the string
remains constant as the system moves. The string and the pulley taken
together ensure that if one body has an upward acceleration, the other
body must have a downward acceleration of equal magnitude. Thus we have

$$\mathbf{a}_2 = -\mathbf{a}_1 \tag{5-4}$$

Second, the forces $\mathbf{S}_1$ and $\mathbf{S}_2$ exerted by the ends
of the massless string passing over the massless, frictionless pulley
are equal in magnitude and have the same direction, so that

$$\mathbf{S}_1 = \mathbf{S}_2 \tag{5-5}$$

This  equation  depends  directly  on  the  assumption  that  the  string  and  the 
pulley  have  zero  (or  negligible)  mass.  The  string  is  not  a source  of  gravitational 
“driving”  force  for  the  Atwood  machine  even  when  one  body  is  lower  than  the 
other,  so  that  different  lengths  of  string  hang  from  the  two  sides  of  the  pulley.  And 
since  the  string  and  the  pulley  have  no  inertia,  none  of  the  driving  force  is  “used 
up”  in  accelerating  them.  (To  see  exactly  what  is  meant  by  the  vivid  but  imprecise 
term  “used  up,”  consider  what  would  happen  if  a massless  string  were  replaced 
by  a chain,  each  of  whose  links  possesses  mass.  Suppose  a force  exerted  on  the 
chain  at  its  right  end  makes  it  accelerate  toward  the  right.  Each  link  in  the  chain 




exerts  a force  on  the  link  to  its  left.  But  since  that  link  is  accelerating,  the  net  force 
acting  on  it  is  not  zero.  Thus  the  force  it  exerts  on  the  link  to  its  left  is  less  than  the 
force  exerted  on  it  by  the  link  to  its  right.  This  is  the  case  throughout  the  chain. 
Hence  the  chain  exerts  a smaller  force  on  some  object  attached  to  its  left  end  than 
that  exerted  on  it  at  its  right  end.) 
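
The chain argument can be made quantitative with a small model. The
following is a minimal sketch under assumptions that are not in the
original text: the chain lies on a frictionless horizontal surface, all
links have the same mass, a load of known mass is attached to the left
end, and the applied force acts on the rightmost link. The function name
`chain_forces` and all numerical values are hypothetical.

```python
def chain_forces(f_applied, n_links, m_link, m_load):
    """Force transmitted along a chain that is pulled to the right.

    f_applied : force applied to the right end of the chain (N)
    n_links   : number of links in the chain
    m_link    : mass of each link (kg)
    m_load    : mass of the object attached to the left end (kg)
    Returns (a, forces), where forces[0] is the applied force and
    forces[k] is the force passed on to the left by the k-th link,
    counted from the right. forces[-1] acts on the load.
    """
    a = f_applied / (m_load + n_links * m_link)   # common acceleration
    forces = [f_applied]
    for _ in range(n_links):
        # Each accelerating link keeps a net force m_link * a for itself.
        forces.append(forces[-1] - m_link * a)
    return a, forces


a, forces = chain_forces(f_applied=10.0, n_links=5, m_link=0.2, m_load=1.0)
print("massive chain: ", [round(f, 3) for f in forces])

a0, forces0 = chain_forces(f_applied=10.0, n_links=5, m_link=0.0, m_load=1.0)
print("massless chain:", [round(f, 3) for f in forces0])
```

With massive links the transmitted force drops link by link, so the load
feels less than the applied force. With massless links the full applied
force reaches the load, which is the situation expressed by Eq. (5-5).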


Equations (5-3a), (5-3b), (5-4), and (5-5) comprise a set of four
simultaneous equations in four unknowns. We first use Eqs. (5-4) and
(5-5) to eliminate $\mathbf{a}_2$ and $\mathbf{S}_2$ from Eq. (5-3b),
which thus becomes

$$\mathbf{S}_1 + m_2\mathbf{g} = -m_2\mathbf{a}_1$$

We next eliminate the unknown quantity $\mathbf{S}_1$ by subtracting
this equation from Eq. (5-3a),

$$\mathbf{S}_1 + m_1\mathbf{g} = m_1\mathbf{a}_1$$

Doing so gives

$$(m_1 - m_2)\mathbf{g} = (m_1 + m_2)\mathbf{a}_1 \tag{5-6}$$

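The acceleration $\mathbf{a}_1$ of $m_1$ is thus given by the equation

$$\mathbf{a}_1 = \frac{m_1 - m_2}{m_1 + m_2}\,\mathbf{g} \tag{5-7a}$$

Again using Eq. (5-4), $\mathbf{a}_2 = -\mathbf{a}_1$, we find the
acceleration $\mathbf{a}_2$ of $m_2$ to be

$$\mathbf{a}_2 = -\frac{m_1 - m_2}{m_1 + m_2}\,\mathbf{g} \tag{5-7b}$$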


The accelerations $\mathbf{a}_1$ and $\mathbf{a}_2$ have opposite
directions. But the magnitudes of the accelerations of both bodies in
Atwood’s machine have a common value equal to the magnitude of the
acceleration of gravity, multiplied by a fraction whose absolute value
is always less than or equal to 1. Whether $m_1$ accelerates downward or
upward depends on whether $m_1$ is greater than or less than $m_2$; this
determines the sign of the numerator of the fraction in Eqs. (5-7a) and
(5-7b).

It is particularly easy to compare the predictions of these equations
with what we expect intuitively in the two extreme cases $m_2 = 0$ and
$m_2 = m_1$. If $m_2 = 0$, Eq. (5-7a) yields
$\mathbf{a}_1 = \mathbf{g}$. That is, in the absence of the retarding
force exerted on $m_1$ because $m_2$ must be accelerated upward, the
body of mass $m_1$ descends in free fall. If $m_2 = m_1$, the system is
balanced, as evidenced by the fact that Eqs. (5-7a) and (5-7b) yield
$\mathbf{a}_1 = \mathbf{a}_2 = 0$.
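
Equations (5-7a) and (5-7b), together with the two limiting cases just
discussed, can be checked numerically. This is a minimal sketch, not
part of the original text. It assumes signed scalars with the upward
direction positive, so that g = -9.8 m/s²; the function name
`atwood_accelerations` is hypothetical.

```python
G = -9.8  # m/s^2; upward is taken as positive, so g is negative (assumed)


def atwood_accelerations(m1, m2, g=G):
    """Signed accelerations (a1, a2) of the two bodies of an ideal
    Atwood machine, from Eqs. (5-7a) and (5-4)."""
    a1 = (m1 - m2) / (m1 + m2) * g   # Eq. (5-7a)
    a2 = -a1                         # Eq. (5-4)
    return a1, a2


# Limiting case m2 = 0: body 1 is in free fall, a1 = g
print(atwood_accelerations(1.0, 0.0))

# Limiting case m2 = m1: the system is balanced, a1 = a2 = 0
print(atwood_accelerations(1.0, 1.0))

# Unequal masses: the heavier body accelerates downward (negative sign)
print(atwood_accelerations(3.0, 1.0))
```

For any pair of masses the magnitude of the acceleration never exceeds
the magnitude of g, in agreement with the discussion above.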


Let us return to Eq. (5-6) and write the magnitudes so as to show
clearly that it is a special case of Newton’s second law:

$$(m_1 - m_2)g = (m_1 + m_2)a_1 \tag{5-8a}$$

$$F = Ma \tag{5-8b}$$

We call Eq. (5-8a) the equation of motion of the entire system,
consisting of the two bodies and the connecting string. We obtained it
by using the constraints of the system, Eqs. (5-4) and (5-5), to conjoin
the two separate equations of motion for the parts of the system,
Eqs. (5-3a) and (5-3b). As comparison of Eqs. (5-8a) and (5-8b) makes
clear, M is the total mass of the system, while F is the magnitude of
the net force on the system. The two connected bodies move in unison
with acceleration of magnitude a as a result of the application of F to
M. As far as the system is concerned, the forces $\mathbf{S}_1$ and
$\mathbf{S}_2$ exerted on the individual bodies by the ends of the
string are internal forces. They therefore do not appear in the equation
of motion of the system. But they do appear in the equations of motion
for the individual bodies, for which they are external forces. The
magnitude of the net force acting on the entire system is thus the
difference in weight of the two bodies. Because of the configuration of
the system, with the string bent around the pulley, the gravitational
forces acting on the two bodies pull against each other.

Equations (5-8a) and (5-8b) also make apparent the distinction between
the gravitational and inertial aspects of mass. On the left, or “F,”
side of these equations, it is the gravitational aspect which is
significant, since it is from the masses of the bodies in their
gravitational role, taken together with the presence of the earth, that
there arises the force which propels the system. On the right, or “Ma,”
side of the equations, it is the inertial masses which provide the
resistance of the system to acceleration. Note especially, in this
connection, that it is the sum of the masses (that is, the total mass of
the system) which appears on the right side of the equation, and the
difference of the masses appears on the left side.

Atwood  taught  physics  at  Cambridge  (Newton’s  university).  He  invented  his 
machine  as  a classroom  demonstration,  and  it  has  been  so  used  ever  since. 
Atwood’s  machine  seems  to  have  been  the  very  first  laboratory  demonstration  of 
the validity of Newton’s laws. Yet it was not invented until 1784,
almost a century after the publication of the Principia, and was not even then
regarded  as  an  important  experimental  confirmation  of  newtonian  mechanics.  The 
reason  lies  in  the  absolutely  central  role  which  Newton’s  laws  assume,  directly  or 
indirectly,  in  the  description  of  every  natural  phenomenon  involving  motion  of 
bodies  whose  speeds  are  small  compared  to  the  speed  of  light  and  whose  sizes  are 
large  compared  to  those  of  atoms.  By  Atwood's  time,  Newton  and  others  had  piled 
up  such  a vast  fund  of  evidence  and  examples  (such  as  planetary  motion)  that 
Atwood's  machine  became  an  interesting  demonstration  rather  than  a significant 
experimental  proof. 


EXAMPLE  5-2 


Find  the  force  exerted  by  the  string  on  either  of  the  bodies  in  Atwood's  machine. 

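■ According to Eq. (5-5), the forces $\mathbf{S}_1$ and $\mathbf{S}_2$
exerted by the two ends of the string are equal. So you may as well
evaluate $\mathbf{S}_1$. One of several ways you can do this is to begin
with Eq. (5-3a), which you can rewrite in the form

$$\mathbf{S}_1 = -m_1\mathbf{g} + m_1\mathbf{a}_1 = m_1(\mathbf{a}_1 - \mathbf{g})$$

If you substitute into this equation the value of $\mathbf{a}_1$ given
by Eq. (5-7a), you get

$$\mathbf{S}_1 = m_1\left(\frac{m_1 - m_2}{m_1 + m_2}\,\mathbf{g} - \mathbf{g}\right)$$

Factoring the quantity $\mathbf{g}$ out of the term in parentheses in
this equation gives you

$$\mathbf{S}_1 = m_1\mathbf{g}\left(\frac{m_1 - m_2}{m_1 + m_2} - 1\right)$$

Note that the fraction $(m_1 + m_2)/(m_1 + m_2)$ can be substituted for
1 in this equation, which may thus be rewritten

$$\mathbf{S}_1 = m_1\mathbf{g}\left(\frac{m_1 - m_2}{m_1 + m_2} - \frac{m_1 + m_2}{m_1 + m_2}\right)$$

or

$$\mathbf{S}_1 = -m_1\mathbf{g}\left(\frac{2m_2}{m_1 + m_2}\right) \tag{5-9a}$$

According to Eq. (5-5), this is also the value of $\mathbf{S}_2$.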


What is the physical meaning of Eq. (5-9a), which gives the force
$\mathbf{S}_1$ exerted by the string on $m_1$? Suppose that
$m_1 > m_2$. The magnitude $S_1$ is the strength of the force exerted by
the string on either body. The equation shows that it is equal to the
weight $m_1 g$ of the body of greater mass multiplied by the quantity in
parentheses. This quantity is a fraction smaller than 1. Thus the string
does not exert a force strong enough to hold up the body of greater mass
$m_1$, and it accelerates downward.

On the other hand, since $\mathbf{S}_1 = \mathbf{S}_2$, the force
exerted by the other end of the string on the body of smaller mass $m_2$
also has the value given by the right side of Eq. (5-9a). Rearranging
the terms of that equation slightly, and setting
$\mathbf{S}_1 = \mathbf{S}_2$, thus gives

$$\mathbf{S}_2 = -\frac{m_1\mathbf{g}\,2m_2}{m_1 + m_2}$$

or

$$\mathbf{S}_2 = -m_2\mathbf{g}\left(\frac{2m_1}{m_1 + m_2}\right) \tag{5-9b}$$


Equation (5-9b) tells you that the force exerted by the string on the
body of smaller mass $m_2$ has a magnitude equal to the weight $m_2 g$
of that body multiplied by the quantity in parentheses, which is greater
than 1. Thus the force exerted by the string on the body of smaller mass
is more than sufficient to support it, and the body accelerates upward.

Equations (5-9a) and (5-9b) both tell you that, for a given total mass
$m_1 + m_2$, the force exerted by the string on either body is strongest
when the masses are equal. In this case, each body can exert a force on
the string sufficient to “hold the other one up.” On the other hand, the
force exerted by the string on one of the bodies will be zero if the
mass of the other body is zero.
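
The statements made about Eqs. (5-9a) and (5-9b) can be checked
numerically. This is a minimal sketch, not part of the original text. It
works with magnitudes only, assumes g = 9.8 m/s², and holds the total
mass fixed at an illustrative 4.0 kg; the function names are
hypothetical.

```python
G = 9.8  # m/s^2, assumed magnitude of the gravitational acceleration


def string_force_5_9a(m1, m2, g=G):
    """Magnitude of the string force in the form of Eq. (5-9a)."""
    return m1 * g * (2.0 * m2 / (m1 + m2))


def string_force_5_9b(m1, m2, g=G):
    """Magnitude of the string force in the form of Eq. (5-9b)."""
    return m2 * g * (2.0 * m1 / (m1 + m2))


# Split an assumed total mass of 4.0 kg in several ways
total = 4.0
for m1 in (4.0, 3.5, 3.0, 2.5, 2.0):
    m2 = total - m1
    s = string_force_5_9a(m1, m2)
    # The two forms are the same quantity written in two ways
    assert abs(s - string_force_5_9b(m1, m2)) < 1e-12
    print(f"m1 = {m1:.1f} kg, m2 = {m2:.1f} kg, S = {s:.2f} N")
```

As m1 and m2 approach each other the printed force rises toward the
weight of either body, and it falls to zero when m2 is zero.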

How would the discussion following Example 5-2 proceed if you assumed
that $m_2 > m_1$?


In  Example  5-3,  the  accelerations  of  the  bodies  in  an  Atwood  machine 
are  calculated.  This  information  is  then  used,  together  with  the  rules  of 
kinematics,  to  develop  information  concerning  position,  time,  and  velocity. 


EXAMPLE  5-3 

In a certain Atwood machine the masses of the two bodies have the values
$m_1 = 2.10$ kg and $m_2 = 2.00$ kg. The string and the pulley have
negligible mass, and the friction of the system is negligible as well.
If the two bodies are initially at rest at the same level, how long will
it be before the vertical separation between them is 1.5 m? How fast
will they then be going?

■ It is most convenient here to work in terms of signed scalar
quantities, rather than the vector quantities we used to derive the
equations of motion for the Atwood machine. In these terms, the
accelerations of the two bodies bear the relation

$$a_2 = -a_1$$

which is the signed scalar form of Eq. (5-4). So you may as well focus
your attention on $m_1$. Choose the upward direction as that of the
positive x axis, with the origin at the starting level of the two
bodies, as shown in Fig. 5-6. The initial position of $m_1$ is then
given by $x_1 = x_{1i} = 0$. Since the system starts at rest, you also
have the initial velocity $v_1 = v_{1i} = 0$ for $m_1$. Equation (2-30),
written in the notation of the present discussion, then reduces to the
simple form

$$x_1 = \tfrac{1}{2}a_1 t^2 \tag{5-10}$$


In order to use this equation, you must first use Eq. (5-7a) to find the
acceleration $a_1$ of $m_1$. In signed scalar terms, this gives you

$$a_1 = \frac{m_1 - m_2}{m_1 + m_2}\,g$$

Since g is directed downward, it has the value $g = -9.8$ m/s². You thus
have

$$a_1 = \frac{2.10\ \text{kg} - 2.00\ \text{kg}}{2.10\ \text{kg} + 2.00\ \text{kg}}\,(-9.8\ \text{m/s}^2)$$

or

$$a_1 = -0.24\ \text{m/s}^2$$


The minus sign tells you that $m_1$ accelerates downward, as you would
expect for the more massive of the two bodies. (You immediately have
$a_2 = -a_1 = +0.24$ m/s² as well.) Note that by making the difference
between the masses small you can reduce the acceleration to a value
which is not difficult to measure.

When the position of $m_1$ is given by $x_1$, the vertical separation of
the two bodies is $2|x_1|$. Thus when the two bodies are 1.5 m apart,
you have $x_1 = -0.75$ m. (The sign of $x_1$ is evident from Fig. 5-6.)
You solve Eq. (5-10) for the elapsed time t to obtain

$$t = \sqrt{\frac{2x_1}{a_1}}$$

Then you find its numerical value by setting $a_1 = -0.24$ m/s² and
$x_1 = -0.75$ m:

$$t = \sqrt{\frac{2 \times (-0.75\ \text{m})}{-0.24\ \text{m/s}^2}} = 2.5\ \text{s}$$


Since the acceleration $a_1$ is constant, you can use Eq. (2-29),
$v_1 = v_{1i} + a_1 t$, to find the velocity $v_1$ of $m_1$ at the
instant when the two bodies are 1.5 m apart. Since $v_{1i} = 0$, you
have

$$v_1 = a_1 t = -0.24\ \text{m/s}^2 \times 2.5\ \text{s} = -0.60\ \text{m/s}$$

Again the negative sign signifies the downward direction of $v_1$. The
velocity $v_2$ of $m_2$ is immediately given by

$$v_2 = a_2 t = -a_1 t = +0.60\ \text{m/s}$$

(But you already knew that $v_2 = -v_1$.)
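
The steps of Example 5-3 can be collected into a short calculation.
This is a minimal sketch, not part of the original text. It uses the
same sign convention as the example (upward positive, g = -9.8 m/s²) and
assumes the two masses are unequal; the function name
`atwood_separation` is hypothetical.

```python
import math

G = -9.8  # m/s^2; upward is positive, as in Example 5-3


def atwood_separation(m1, m2, separation, g=G):
    """Acceleration of m1, elapsed time, and velocity of m1 when two
    bodies released from rest at the same level are `separation` apart.

    Requires m1 != m2; with equal masses the bodies never separate.
    """
    a1 = (m1 - m2) / (m1 + m2) * g             # Eq. (5-7a), signed scalar
    x1 = math.copysign(separation / 2.0, a1)   # each body moves half the gap
    t = math.sqrt(2.0 * x1 / a1)               # Eq. (5-10) solved for t
    v1 = a1 * t                                # Eq. (2-29) with v1i = 0
    return a1, t, v1


a1, t, v1 = atwood_separation(m1=2.10, m2=2.00, separation=1.5)
print(f"a1 = {a1:.3f} m/s^2, t = {t:.3f} s, v1 = {v1:.3f} m/s, v2 = {-v1:.3f} m/s")
```

Because no intermediate rounding is done here, the last digits may
differ slightly from those obtained in the example with the rounded
value of the acceleration.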


Example  5-4  concerns  a system  which  amounts  to  an  Atwood  machine 
modified  by  the  introduction  of  an  inclined  plane. 


EXAMPLE  5-4 

In Fig. 5-7a, the body of mass $m_2$ can slide without friction on the
inclined plane, which makes an angle $\theta$ with the horizontal. A
light, inextensible string is attached to $m_2$ and passes over a light,
frictionless pulley. A body of mass $m_1$ hangs from the other end of
the string. Find an algebraic expression for the acceleration of the
system. Solve in particular for the case $m_1 = 7.0$ kg,
$m_2 = 14.0$ kg, $\theta = 28°$.




■ Here again it is more convenient to work in terms of signed scalar
quantities rather than vectors. While the system as a whole is
two-dimensional, the motion of each of the bodies is restricted to one
dimension. Moreover, the displacements of the two bodies are connected
by the constraint imposed by the string. If $m_1$ moves through a
certain distance vertically downward, $m_2$ must move through the same
distance upward along the plane. Assume for the sake of argument that
the sense of the motion will be clockwise; that is, $m_1$ will descend
and $m_2$ will be pulled up the plane. If you then represent all
quantities directed clockwise as positive scalars, and quantities
directed counterclockwise as negative scalars, your results will be
consistent. (If the actual motion turns out to be counterclockwise, the
results will tell you this by yielding a negative acceleration.)

The significant difference between the system in Fig. 5-7a and the
Atwood machine is that the weight of $m_2$ is partially supported by the
inclined plane. You have already dealt with a simpler but similar
situation in Example 4-4, in which a body slid freely down the plane.

You begin by drawing free-body diagrams for the two bodies, as shown in
Fig. 5-7b and c. The latter diagram, which represents the hanging body
of mass $m_1$, is one-dimensional. But the former, representing the body
of mass $m_2$ on the inclined plane, is not. The vectors $\mathbf{S}_2$,
$\mathbf{W}$, and $\mathbf{N}$ are, respectively, the force exerted on
$m_2$ by the string, the gravitational force on $m_2$, and the normal
force exerted on $m_2$ by the plane. They require two dimensions to
represent them. However, as explained in Example 4-4, the normal force
cancels the normal component of the gravitational force. Hence the
free-body diagram of $m_2$ reduces to the simpler one-dimensional form
of Fig. 5-7d. The quantity $W_\parallel$ is the component of the
gravitational force along the x axis drawn parallel to the plane. If you
take the sign convention into consideration, you see that Fig. 5-7d
shows that $W_\parallel$ has the value

$$W_\parallel = -m_2 g \sin\theta \tag{5-11}$$


In terms of the same sign convention, the accelerations of the two bodies are related
by the expression

$$a_2 = a_1 = a \tag{5-12a}$$

And the forces exerted on the two bodies by the string bear the relation

$$S_2 = -S_1 \tag{5-12b}$$

Fig. 5-7 (a) Sketch of the system discussed in Example 5-4, showing the
directions chosen as positive for the motion of the two bodies. (b) Free-body
diagram for the body on the inclined plane. (c) Free-body diagram for the
hanging body. (d) Free-body diagram for the body on the inclined plane,
simplified to one dimension by the method developed in Example 4-4.

Newton's second law gives you the equations of motion for the two bodies. For
m1 you have

$$m_1 g + S_1 = m_1 a_1 \tag{5-13a}$$

and for m2 you have

$$S_2 + W_\parallel = m_2 a_2 \tag{5-13b}$$

Substituting the values of W∥, a1, a2, and S2 from Eqs. (5-11), (5-12a), and
(5-12b), you obtain the equations of motion in the form

$$m_1 g + S_1 = m_1 a$$

and

$$-S_1 - m_2 g \sin\theta = m_2 a$$

Adding these two equations gives you

$$(m_1 - m_2 \sin\theta)\,g = (m_1 + m_2)\,a$$

Solving for the acceleration a of either of the bodies, you have

$$a = g\,\frac{m_1 - m_2 \sin\theta}{m_1 + m_2} \tag{5-14}$$




This  result  is  very  much  like  the  solution  for  the  Atwood  machine  given  by  Eq. 
(b-la). The main difference is that the term in the numerator containing m2 is diminished
by the factor sin θ, which is always less than 1. Note that the system will
move clockwise if m1 > m2 sin θ and counterclockwise if the reverse is true. Why
does the factor sin θ not appear in the denominator of Eq. (5-14)?

Now you can use Eq. (5-14) to find the specific numerical solution. In terms of
the sign convention adopted, g = +9.8 m/s². Thus you have

$$a = 9.8\ \text{m/s}^2 \times \frac{7.0\ \text{kg} - 14.0\ \text{kg} \times \sin 28^\circ}{7.0\ \text{kg} + 14.0\ \text{kg}}$$

or

$$a = 0.20\ \text{m/s}^2$$

The body of mass m1 descends even though m1 is smaller than m2. This is the reason
why  inclined  planes  are  useful  devices;  they  make  possible  the  raising  of  large 
weights  with  relatively  small  forces. 
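
As a quick check of the arithmetic, Eq. (5-14) can be evaluated numerically. The following is only a minimal sketch: the function name and the choice to compute the string force from the equation of motion of m1 are illustrative assumptions, not part of the original example.

```python
import math

def incline_atwood_acceleration(m1, m2, theta_deg, g=9.8):
    """Eq. (5-14): signed acceleration, positive when m1 descends (clockwise)."""
    theta = math.radians(theta_deg)
    return g * (m1 - m2 * math.sin(theta)) / (m1 + m2)

def string_force_magnitude(m1, a, g=9.8):
    """Magnitude of the string force, from m1*g + S1 = m1*a, so |S1| = m1*(g - a)."""
    return m1 * (g - a)

# Values of Example 5-4
m1, m2, theta_deg = 7.0, 14.0, 28.0
a = incline_atwood_acceleration(m1, m2, theta_deg)
S = string_force_magnitude(m1, a)

# Consistency check with the equation of motion of m2: S - m2*g*sin(theta) = m2*a
residual = S - m2 * 9.8 * math.sin(math.radians(theta_deg)) - m2 * a

print(f"a = {a:.3f} m/s^2, S = {S:.1f} N, residual = {residual:.2e}")
```

Setting theta_deg to 90 makes sin θ = 1, and the same function then describes an ordinary Atwood machine with both bodies hanging freely.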

Example  5-5  applies  the  Atwood  machine  analysis  to  a system  which 
has  been  discussed  twice  before,  in  Sec.  3-5  and  in  Example  4-5.  In  our  ex- 
perimental approach  to  centripetal  acceleration  in  Sec.  3-5,  we  began  by 
placing  an  air  puck  in  a circular  orbit  on  the  horizontal  top  of  an  air  table. 
The  necessary  centripetal  force  was  supplied  by  the  weight  of  a washer 
hanging  from  a string  which  passed  through  a small  hole  in  the  center  of 
the  air  table,  as  shown  in  Fig.  5-8.  A crucial  part  of  the  quantitative  argument 
involved  the  fact  that  the  centripetal  acceleration  of  the  puck  (which  was 
measured  using  strobe  photos)  was  equal  to  the  radially  inward  acceleration 
of  the  same  puck  when  it  was  released  from  rest  and  was  pulled  toward  the 
hole  by  the  string.  The  tension  in  the  string  was  the  same  in  both  cases.  In 
Example  5-5,  you  will  see  that  keeping  the  tension  the  same  involves  using 
washers  of  slightly  different  weights  in  the  two  cases.  You  will  determine 
the  difference  and  thus  achieve  a complete  analysis  of  the  system. 


EXAMPLE  5-5 


In Example 4-5, a puck of mass M = 0.33 kg was made to move at constant speed in
a circular orbit. The necessary centripetal force was supplied by a string, from the
other end of which was hung a washer of mass m = 0.022 kg. The washer did not
move.  Next,  the  same  puck  was  released  from  rest,  so  that  the  washer  as  well  as  the 
puck  experienced  acceleration.  If  the  force  exerted  by  the  string  on  the  puck  is  to 
have  the  same  magnitude  S in  the  two  experiments,  what  must  be  the  mass  m'  of 
the  washer  in  the  second  experiment? 


Fig.  5-8  Sketch  of  the  air  table  experiment  discussed  in  detail  in  Sec. 
3-5.  When  the  puck  moves  in  a circular  orbit,  the  weight  of  the  washer 
supplies  the  necessary  centripetal  force.  When  the  puck  is  released 
from  rest,  the  weight  of  the  washer  accelerates  the  system  consisting  of 
washer,  string,  and  puck.  This  system  is  very  similar  to  the  system  of 
Fig.  5-7a. 




5-3  MOTION  WITH 
CONTACT  FRICTION 


■ When you apply Newton's second law to the puck in the two experiments, you
obtain S = M a_c and S = M a_r, respectively, where a_c and a_r are the magnitudes of
the centripetal and radial accelerations of the puck. Hence you have a_c = a_r; this
was verified by actual measurement of the accelerations.

The net force exerted on the washer is the sum of the force S exerted by the
string and the gravitational force W. When the puck is whirling in a circle, the
washer does not move. Hence the net force on the washer must be F = S + W = 0.
In terms of magnitudes, you thus have S = W = mg. This gives you M a_c = mg, or

$$a_c = \frac{S}{M} = \frac{m}{M}\,g \tag{5-15}$$


When the puck is released from rest and moves radially inward, the system is
the same as that just considered in Example 5-4, with the simplification that the
inclination angle of the plane is now θ = 0°. You can therefore apply Eq. (5-14).
Since here sin θ = 0, m1 = m', m2 = M, and a = a_r, that equation assumes the
particularly simple form

$$a_r = \frac{m'}{m' + M}\,g \tag{5-16}$$


Compare this with Eq. (5-15). You can see that the condition of equal string tensions,
which led to the equality of a_c and a_r, requires that m' ≠ m. (Since m ≪ M,
however, the difference between m and m' will turn out not to be very large.)
Equating the right sides of Eqs. (5-15) and (5-16), you have

$$\frac{m}{M} = \frac{m'}{m' + M}$$

Multiplying both sides of this equation by M(m' + M), in order to clear fractions,
gives you

$$M m' = m m' + m M$$

Collecting all terms containing m' on one side of the equation and factoring, you
obtain

$$m'(M - m) = mM$$

You thus find for m' the result

$$m' = \frac{mM}{M - m} \tag{5-17}$$


Using the known values of the puck mass M and the mass m of the washer used
in the circular-orbit experiment, you can now find the mass of the washer required
for the radial-acceleration experiment. You have

$$m' = \frac{0.022\ \text{kg} \times 0.33\ \text{kg}}{0.33\ \text{kg} - 0.022\ \text{kg}} = 0.024\ \text{kg}$$

The difference between m' and m is thus 0.002 kg = 2 g. The proportional difference
is 2 parts in 24, or about 8 percent.
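
The result of Eq. (5-17) can be checked by confirming that the two accelerations of Eqs. (5-15) and (5-16) really do come out equal. This is a minimal sketch; the function name is a hypothetical one chosen for illustration.

```python
def washer_mass_for_equal_tension(m, M):
    """Eq. (5-17): washer mass m' giving the same string force on the puck."""
    if M <= m:
        raise ValueError("Eq. (5-17) requires M > m")
    return m * M / (M - m)

g = 9.8            # m/s^2
M, m = 0.33, 0.022  # kg, values of Example 5-5

m_prime = washer_mass_for_equal_tension(m, M)
a_c = (m / M) * g                      # Eq. (5-15), circular-orbit experiment
a_r = (m_prime / (m_prime + M)) * g    # Eq. (5-16), radial-release experiment

print(f"m' = {m_prime:.4f} kg, a_c = {a_c:.4f} m/s^2, a_r = {a_r:.4f} m/s^2")
```

Note that the unrounded difference m' − m is somewhat smaller than the 2 g obtained from the rounded values.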


In the last section, we deliberately ignored the effects of friction. In a situation
such as Example 5-4, where a block is pulled up an inclined plane, this
is not usually realistic. We are now ready to incorporate the effects of frictional
forces into the analysis of such systems. All we need do in principle is
to include the frictional force in the free-body diagram of the block on the
plane. The method of finding the magnitude of that force is the one used
in Example 4-11, where a block slid down an inclined plane under the
influence of gravitation and friction alone. Example 5-6 combines the essential
features of Example 4-11 with those of Example 5-4.


EXAMPLE  5-6 


Fig.  5-9  Illustration  for  Example  5-6. 


In the system of Fig. 5-7a the body of mass m2 now slides on a well-greased plane;
the coefficient of kinetic friction is μ_k = 0.030. As in Example 5-4, let m1 = 7.0 kg,
m2 = 14.0 kg, and θ = 28°. Find the acceleration of the system. Compare the
result with that found in Example 5-4, where friction was neglected.

■ To begin with, assume that the frictional force is not so large as to keep the
system from moving altogether. A frictional force can never change the direction of
motion (why not?), so you can assume that the acceleration is clockwise, as it was in
Example 5-4, if it is not zero. The frictional force C_k must therefore act downward
to the left, in the same counterclockwise sense as W∥. See Fig. 5-9, which is simply
the free-body diagram of Fig. 5-7d with the frictional force added.

From Example 4-11 you have

$$C_k = -\mu_k m_2 g \cos\theta$$

The negative sign is needed to conform to the convention of taking the clockwise
sense to be positive, which is the same here as in Example 5-4. Just as in that example,
you can write the equations of motion for the two bodies:

$$m_1 g + S_1 = m_1 a_1 \tag{5-18a}$$

$$S_2 + W_\parallel + C_k = m_2 a_2 \tag{5-18b}$$


These are the same as Eqs. (5-13a) and (5-13b) except that the contact friction force
C_k has been included in the net force acting on m2. You proceed to solve these equations
just as in Example 5-4, and you obtain

$$g\,[m_1 - m_2(\sin\theta + \mu_k \cos\theta)] = a\,(m_1 + m_2)$$

or

$$a = g\,\frac{m_1 - m_2(\sin\theta + \mu_k \cos\theta)}{m_1 + m_2} \tag{5-19}$$

which you should compare with Eq. (5-14).

When you insert the numerical values, you find the result

$$a = 9.8\ \text{m/s}^2 \times \frac{7.0\ \text{kg} - 14.0\ \text{kg} \times (\sin 28^\circ + 0.030 \cos 28^\circ)}{7.0\ \text{kg} + 14.0\ \text{kg}}$$

$$a = 0.026\ \text{m/s}^2 \tag{5-20}$$


This is only about one-eighth as great as the acceleration in the frictionless case. The
large effect of a quite small frictional force in this example is due to the fact that the
system was nearly balanced without friction. If C_k had been a little bigger, the
system would not have moved at all. However, Eq. (5-19) does not give you this information
automatically, and it is important to check your initial assumption that a
is not zero. The general method for doing this is discussed immediately following
this example. However, it is easy to check a particular numerical result such as Eq.
(5-20), provided you are careful to keep your signs straight. You assumed at the
outset that a was positive. If you had obtained a negative numerical result, you
would have to see whether a would be negative even in the absence of friction. To
do this, repeat the calculation with μ_k = 0. (In fact, this was done in Example 5-4.)
If a turns out to be positive without friction, although it was negative with friction,
the true result is that the system does not move, or will come to rest if it is started
with a push. Once at rest, the system is under the influence of the even larger static
coefficient of friction μ_s, and Eq. (5-19) does not correctly describe the motion. If
the kinetic frictional force was large enough to bring the system to rest (assuming it
was moving in the first place), the static friction will certainly provide a force sufficient
to balance the other forces on the body and keep it at rest.
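
The sign check described in this example can be sketched in a few lines of code. The function below is an illustrative assumption, not a prescription from the example: it supposes that the system is already moving in the sense it would take without friction, applies kinetic friction against that sense, and reports zero acceleration when friction would reverse the sign.

```python
import math

def acceleration_with_friction(m1, m2, theta_deg, mu_k, g=9.8):
    """Signed acceleration (clockwise positive) of a system already in motion.

    Kinetic friction is taken to oppose the friction-free sense of motion.
    Returns 0.0 when friction is large enough to bring the system to rest.
    """
    theta = math.radians(theta_deg)
    s, c = math.sin(theta), math.cos(theta)
    a_free = g * (m1 - m2 * s) / (m1 + m2)      # Eq. (5-14), no friction
    if a_free == 0.0:
        return 0.0                               # balanced without friction
    sense = 1.0 if a_free > 0.0 else -1.0
    # For sense = +1 this is Eq. (5-19); for sense = -1 friction acts the other way.
    a = g * (m1 - m2 * (s + sense * mu_k * c)) / (m1 + m2)
    return a if a * sense > 0.0 else 0.0

# Values of Example 5-6
a = acceleration_with_friction(7.0, 14.0, 28.0, 0.030)
a_free = acceleration_with_friction(7.0, 14.0, 28.0, 0.0)
print(f"with friction: {a:.4f} m/s^2, without: {a_free:.4f} m/s^2, ratio: {a / a_free:.2f}")
```

Raising mu_k slightly (for instance to 0.05, a hypothetical value) makes the function return zero, which corresponds to the remark that a somewhat larger frictional force would keep the system from moving.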


A system of the sort treated in Example 5-6 may involve arbitrary
masses m1 and m2, an arbitrary inclination angle θ, and arbitrary coefficients
of friction μ_k and/or μ_s. Given the values of these quantities, the
system may fall into one of two general classes, each of which has three subclasses.
First you must determine the general class: whether the system
tends to move clockwise or counterclockwise. This is determined by what
the system would do in the absence of friction, as given by Eq. (5-14). Once
the direction of friction-free motion has been found, taking friction into
consideration leads to one of the three following subclasses: (a) the system
may move under all circumstances; (b) it may move only if it is initially in
motion or if it is given a push to start it; or (c) it may come to a stop even if
it is initially in motion. The various possibilities, enumerated in Table 5-1,
are the subject of an exercise at the end of this chapter.

Table 5-1  Summary of Cases for the System of Example 5-6

Case                                        Condition

I.  Clockwise motion                        m2 g sin θ < m1 g
    a.  System always moves                 m2 g (sin θ + μ_s cos θ) < m1 g
    b.  System moves if started             m2 g (sin θ + μ_k cos θ) < m1 g < m2 g (sin θ + μ_s cos θ)
    c.  System comes to rest if moving      m2 g sin θ < m1 g < m2 g (sin θ + μ_k cos θ)
System balanced in absence of friction      m1 g = m2 g sin θ
II. Counterclockwise motion                 m1 g < m2 g sin θ
    c'. System comes to rest if moving      m2 g (sin θ − μ_k cos θ) < m1 g < m2 g sin θ
    b'. System moves if started             m2 g (sin θ − μ_s cos θ) < m1 g < m2 g (sin θ − μ_k cos θ)
    a'. System always moves                 m1 g < m2 g (sin θ − μ_s cos θ)
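
The conditions of Table 5-1 translate directly into a chain of comparisons. The sketch below is one possible way to organize them; the function name, the returned labels, and the value μ_s = 0.05 used in the usage line are assumptions made for illustration (the example itself specifies only μ_k). The factor g is common to every condition and has been cancelled, and a system lying exactly on a boundary between subclasses is assigned to the less mobile one.

```python
import math

def classify_system(m1, m2, theta_deg, mu_k, mu_s):
    """Return the Table 5-1 case label for given masses, angle, and friction."""
    if mu_s < mu_k:
        raise ValueError("expected mu_s >= mu_k")
    theta = math.radians(theta_deg)
    s, c = math.sin(theta), math.cos(theta)
    if m1 == m2 * s:
        return "balanced"
    if m1 > m2 * s:                          # class I: tends to move clockwise
        if m1 > m2 * (s + mu_s * c):
            return "Ia"                      # always moves
        if m1 > m2 * (s + mu_k * c):
            return "Ib"                      # moves if started
        return "Ic"                          # comes to rest if moving
    # class II: tends to move counterclockwise
    if m1 < m2 * (s - mu_s * c):
        return "IIa'"                        # always moves
    if m1 < m2 * (s - mu_k * c):
        return "IIb'"                        # moves if started
    return "IIc'"                            # comes to rest if moving

# Example 5-6 values, with a hypothetical static coefficient mu_s = 0.05
label = classify_system(7.0, 14.0, 28.0, mu_k=0.030, mu_s=0.05)
print(label)
```

With these assumed coefficients the system of Example 5-6 would fall in the subclass that moves only if started, which is consistent with the remark that a slightly larger frictional force would have kept it from moving.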


5-4  FICTITIOUS FORCES

There is a kind of force familiar to everyone, to which we have so far given
only the briefest mention. Everyone knows that on rounding a curve the
passengers in a car feel an outward force, called the centrifugal (that is,
center-fleeing) force. No such force appeared in our analysis of circular
motion in Examples 4-5 and 5-5; the only radial force dealt with was the inward
centripetal force.

This apparently paradoxical situation arises from one of the most fundamental
and subtle points in newtonian mechanics. We could settle the
matter by simply saying that the passengers in the car are not in an inertial
reference frame, so that they cannot apply Newton's laws. (Remember that
Newton's laws of motion were developed in Chap. 4 to account for the observations
of observers stationed in inertial reference frames. They cannot validly
be used by observers in noninertial frames.) What we wish to do now, however,
is broaden our point of view so that any observer can predict what any
other observer will see, regardless of the reference frames in which the observers
are located. To put it another way, we want to see how Newton's laws need to
be supplemented or modified so that any observer can apply them to what
she or he sees.

Consider a car traveling around a circular curve of radius r. If you
stand by the roadside, it is clear that the car is undergoing a centripetal
acceleration. At an instant when the velocity of the car is V and the radius
of the curve is r, the instantaneous acceleration is given by Eq. (3-4 1 A),

$$\mathbf{a}_c = -\frac{V^2}{r}\,\hat{\mathbf{r}}$$

where r̂ is the unit vector directed from the center of the curve to the car.
The minus sign in the equation specifies that the direction of a_c is inward.

A passenger  O'  in  the  car  will  agree  with  you  that  the  radius  of  the 
curve  is  r.  He  will  even  agree  with  you  that  the  velocity  of  the  car  is  V, 
although  that  is  not  the  raw  product  of  his  observation.  What  he  actually 
sees  is  the  ground  moving  past  him  with  velocity  —V,  but  he  has  learned 
from  experience  to  make  the  transformation  of  velocity  to  your  reference 
frame,  which  makes  it  possible  for  him  to  speak  to  you  in  your  own  lan- 
guage. 

As  far  as  casual  observation  is  concerned,  this  tends  to  be  the  limit  of 
the  passenger’s  broadmindedness.  His  position  relative  to  the  car  in  which 
he  is  sitting  does  not  change,  and  he  therefore  finds  it  difficult  to  think  of 
himself  as  being  accelerated.  Instead,  he  takes  a more  natural  but  more 
narrow-minded point of view. He insists that he is experiencing no acceleration.
But if he tries to verify Newton's laws by means of an experiment, he
will find that they fail. For example, suppose that he has a small air table
and is holding a puck at a certain position on it. At a particular moment, as
the car is rounding the curve, he releases the puck. As soon as he does so,
there can be no further net force acting on the puck. First consider what
you, observer O, see from your observation point at the roadside. As shown
in Fig. 5-10a, b, and c, you see the puck continue moving at constant velocity
after it is released, tangent to the curve at the point of release. And you
see that the car, passenger, and air table are accelerated out from under the
puck with a centripetal acceleration

$$\mathbf{a}_c = -\frac{V^2}{r}\,\hat{\mathbf{r}}$$

(This acceleration is produced by the sidewise frictional force exerted by the
road on the tires.)

But the passenger, observer O', observes things from his own point of
view. As suggested by Fig. 5-10d, he must explain why it is that he sees the
puck accelerating outward with instantaneous acceleration

$$\mathbf{a}' = +\frac{V^2}{r}\,\hat{\mathbf{r}} \tag{5-22a}$$

He decides to apply Newton's laws, ignoring the fact that he is not justified
in doing so because he is not in an inertial frame. Knowing that an acceleration
implies the existence of a force, and knowing the mass m of the puck,
he postulates the existence of a centrifugal force F' given by

$$\mathbf{F}' = m\mathbf{a}' = +\frac{mV^2}{r}\,\hat{\mathbf{r}} \tag{5-22b}$$


It is nothing new to invent a force in order to account for an acceleration;
we have done precisely this in accounting for the gravitational acceleration
of falling bodies by means of a gravitational force. But we believe that
we can assign a tangible source or cause to the gravitational force, namely
the presence of the earth, whereas there is no such explanation at hand
for the centrifugal force. It is purely an artifice, invented by the passenger
in the car who wishes to use Newton's laws in a noninertial frame. Such a
force is called a fictitious force. [It is also sometimes called a d'Alembert
force, after the French physicist Jean le Rond d'Alembert (1717-1783)
who first systematically clarified the rules for application of the laws of
physics by an accelerated observer.] Note that only the observer in the noninertial
frame must use the fictitious force to account for what he observes.

Fig. 5-10 (a) Cutaway rear view of a car having mass M (including its contents) which is
rounding a curve of radius r at speed V. The centripetal force exerted by the roadway on the
tires results in a centripetal acceleration of magnitude a_c = V²/r. (b) Top view of the same car.
Observer O', seated behind an air table, releases a puck. The subsequent paths of the puck and
the car are shown. (c) Detailed view of the air table and puck as seen by observer O standing at
the side of the road. The system is shown at two instants. The first is the instant when O' releases
the puck, and the second is a short time later. Observer O describes the initial and subsequent
positions of the puck by means of the vectors r_i and r, and she finds the displacement of the puck
to be Δr. By making further measurements, she can find that the motion of the puck is inertial.
That is, it moves in a straight line at constant speed. She also finds that the car has an inward
acceleration of constant magnitude a_c = V²/r. (d) Observer O', seated at the air table, observes
the table to be motionless. However, he observes a displacement Δr' of the puck during the
same time interval over which O observes the displacement Δr. (Why does O' see Δr' as having a
small forward component as well as the main outward component?)
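
The air-table experiment in the car can be mimicked numerically. In the sketch below, which rests on several assumptions made only for illustration (the values V = 20 m/s and r = 50 m, the function name, and the use of nonrotating axes attached to the car and aligned with the outward and forward directions at the instant of release), the puck moves in a straight line in the roadside frame while the car follows the circle. The displacement of the puck relative to the car is then compared with the outward acceleration V²/r.

```python
import math

def puck_relative_to_car(V, r, t):
    """Displacement of the released puck relative to the car at time t.

    Roadside (inertial) frame: the center of the curve is the origin, the car
    is at (r, 0) at release and moves counterclockwise at speed V.
    Returns (outward, forward) components along axes fixed at release.
    """
    phi = V * t / r                              # angle swept by the car
    car = (r * math.cos(phi), r * math.sin(phi))
    puck = (r, V * t)                            # straight line at constant velocity
    return puck[0] - car[0], puck[1] - car[1]

V, r, t = 20.0, 50.0, 0.05    # hypothetical speed (m/s), radius (m), short time (s)
outward, forward = puck_relative_to_car(V, r, t)

# For a short time, outward = (1/2) a' t^2, so a' is approximately 2*outward/t^2
a_outward = 2.0 * outward / t**2
print(f"apparent outward acceleration: {a_outward:.3f} m/s^2, V^2/r = {V**2 / r:.3f} m/s^2")
print(f"forward component: {forward:.2e} m (small and positive)")
```

The outward component grows as t², as expected for a constant acceleration, while the much smaller forward component grows as t³.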

With these ideas in mind, let us derive the rules for comparison of observations
made by an observer in a noninertial frame with simultaneous
observations made in an inertial frame where we know Newton's laws to be
correct. Observer O in the inertial frame is doing some of the air table-puck
experiments we described in Chap. 4, in order to verify that she is indeed
in an inertial frame. She finds that pucks accelerate only when they interact
with other bodies. Otherwise, they maintain constant velocity. She thus
knows that the principle of momentum conservation and Newton's laws are
applicable to the observations she will make subsequently.

As O works, she is observed by observer O', who rides on a magic
carpet. In contrast to the situation described in Sec. 3-8, where O' made his
observations from a car moving at constant velocity, the magic carpet is accelerated
with a constant acceleration A relative to observer O. That is, A is the
acceleration of the frame of reference O', as measured by observer O.
Using suitable telescopes and so forth, observer O' makes for himself every
measurement made by O. For every velocity v determined by O at a certain
time, O' simultaneously determines a velocity v'.

In order to compare his results with those of O, observer O' must find
the proper velocity transformation equation, which relates to v the value of
v' that he observes from his nonrotating but noninertial reference frame.
He begins by making the sketch of Fig. 5-11a. The relative positions of
the two reference frames are shown at a certain arbitrary time t. In frame
O, the position of the origin of frame O' is described by the vector R. At this
instant, observers O and O' simultaneously measure the position of an arbitrary
body B. The results of their respective measurements are described
by the vectors r and r'.

It is evident from inspection of Fig. 5-11a that the vectors r' and r are
related by the equation

$$\mathbf{r}' = \mathbf{r} - \mathbf{R} \tag{5-23a}$$

The desired velocity transformation equation can be obtained by differentiating
both sides of this equation with respect to time. This gives

$$\frac{d\mathbf{r}'}{dt} = \frac{d\mathbf{r}}{dt} - \frac{d\mathbf{R}}{dt}$$

Employing the definitions v' = dr'/dt, v = dr/dt, and V = dR/dt, we have

$$\mathbf{v}' = \mathbf{v} - \mathbf{V} \tag{5-23b}$$

This equation appears identical to Eq. (3-57), the Galilean velocity transformation
which applies when V is constant. Now, however, the velocity V
of the reference frame O' as observed by O is not constant. The value of V
which must be used in Eq. (5-23b) is its value at the same instant at which O
measures the velocity v and O' measures the velocity v' for the same body.


Fig. 5-11 (a) At a certain time t, the position of the origin O' of a noninertial
frame of reference is described with respect to the inertial frame O by
means of the vector R. At the same instant, the position of body B is described
by the vector r in the inertial frame O and by the vector r' in the noninertial
frame O'. (b) A representation of the same two frames of reference in velocity
space. Frame O' has constant acceleration A relative to inertial frame O. The two
frames coincide at t = 0; that is, the instantaneous velocity of O' with respect
to O is V = At. At some arbitrary instant t, the velocity of body B is measured
simultaneously by observers in both frames of reference. Its value is v as
measured from O and v' as measured from O'. The vector diagram shows that
v' = v − At. You should compare this velocity-space representation with the
position-space representation of two inertial frames in Fig. 3-42.


Equation (5-23b) can be differentiated with respect to time. [This
amounts to taking the second derivative of Eq. (5-23a).] We obtain

$$\frac{d\mathbf{v}'}{dt} = \frac{d\mathbf{v}}{dt} - \frac{d\mathbf{V}}{dt}$$

Using the definitions a' = dv'/dt, a = dv/dt, and A = dV/dt, we have

$$\mathbf{a}' = \mathbf{a} - \mathbf{A} \tag{5-23c}$$

This result is not the same as that for the Galilean acceleration transformation,
where a' = a.

We assume for convenience that the two reference frames are instantaneously
at rest with respect to each other at a certain instant which we call
t = 0. That is, at the time t = 0 (but only then) any vector v which describes
the velocity of a body B with respect to frame O is identical to the vector v'
which describes its velocity with respect to frame O'.

At any other time t, the velocity V of frame O' relative to frame O is
given by the vector At, as shown in Fig. 5-11b. Inspection of the figure
shows that the relation between v' and v is

$$\mathbf{v}' = \mathbf{v} - \mathbf{A}t \tag{5-24}$$

This can be seen in algebraic terms as well, by substituting V = At into Eq.
(5-23b).


Let us suppose that the two observers have determined the mass m of
body B. (This might have been done at an earlier time when they were at
rest with respect to each other and the body and could easily measure its
weight mg.) Once this has been done, both observers can use their observations
of the velocity of body B at a certain instant to determine its momentum
at that instant. Observer O finds the momentum to be

$$\mathbf{p} = m\mathbf{v}$$

Observer O' finds the momentum to have the different value

$$\mathbf{p}' = m\mathbf{v}'$$

Using Eq. (5-24), observer O' can reexpress the momentum which he
determines by measurement on body B in terms of the quantities v and A
measured by observer O. He has

$$\mathbf{p}' = m\mathbf{v}' = m(\mathbf{v} - \mathbf{A}t) \tag{5-25}$$


In her inertial frame, observer O defines the net force acting on body B
by means of Eq. (4-14), according to which this force is the rate of change
of the momentum p of the body. With a slight change in notation, this
equation can be written

$$\mathbf{f} \equiv \frac{d\mathbf{p}}{dt} \tag{5-26a}$$

Given the assumption that body B has a constant mass m, the force can be
expressed in terms of the acceleration a of body B as seen by O:

$$\mathbf{f} = \frac{d}{dt}(m\mathbf{v}) = m\left(\frac{d\mathbf{v}}{dt}\right) = m\mathbf{a} \tag{5-26b}$$

If O observes body B to be accelerating with acceleration a, she attributes
this to a force f which is applied to body B by some external agent (such as a
spring or a contact with another body or the gravitational attraction of the
earth).

In like manner, observer O' defines the force acting on body B in terms
of the rate of change of momentum which he observes:

$$\mathbf{f}' \equiv \frac{d\mathbf{p}'}{dt} \tag{5-27}$$

In order to express this force in terms of the force f which he knows O has
calculated, O' first uses Eq. (5-25) to express p' in terms of the quantities O
measures. He inserts this value of p' into Eq. (5-27) to obtain

$$\mathbf{f}' = \frac{d\mathbf{p}'}{dt} = \frac{d}{dt}\left[m(\mathbf{v} - \mathbf{A}t)\right]$$

Carrying out the differentiation, he obtains

$$\mathbf{f}' = m\frac{d\mathbf{v}}{dt} - m\frac{d}{dt}(\mathbf{A}t)$$

And since A is constant, this yields

$$\mathbf{f}' = m\mathbf{a} - m\mathbf{A} \tag{5-28a}$$

which can also be written

$$\mathbf{f}' = m(\mathbf{a} - \mathbf{A}) \tag{5-28b}$$


Let us consider the latter form of this equation first. According to Eq.
(5-23c), the vector difference a − A is equal to a', the acceleration of body
B as seen by O'. We make this substitution, and Eq. (5-28b) becomes

$$\mathbf{f}' = m\mathbf{a}' \tag{5-28c}$$

This equation has the form of Newton's second law. That is, Newton's second
law is valid even though observer O' is not in an inertial frame. How
can this be, when Newton's laws of motion are valid only in inertial frames?
The answer is that the force f' acting on body B, as determined by O', is not the
same as that observed by O in her inertial frame.

To see this, consider again the same equation written in the form of
Eq. (5-28a), f' = ma − mA. The term ma is simply the force f which O finds
to be acting on body B, as indicated by Eq. (5-26b). It is the product of the
mass of B and its observed acceleration. The extra term −mA also has the
form of a force, since it is the product of a mass and an acceleration. We use
it to define the fictitious force F by means of the equation

$$\mathbf{F} = m(-\mathbf{A}) \tag{5-29}$$

The fictitious force acting on body B is a peculiar force indeed. It is the
mass of the body, multiplied not by its own acceleration as observed either
by O or by O', but by the negative of the acceleration of O' as observed by O!
From a purely formal point of view F is the algebraic term needed to make
the connection between the force f' observed by O' and the force f observed
by O. Using F, we find this connection to be

$$\mathbf{f}' = \mathbf{f} + \mathbf{F} \tag{5-30}$$

The fictitious force F makes it possible to adjust Newton's laws of motion so that they
can be used in a noninertial frame. As we have seen in the example of the air
table in the car (Fig. 5-10), O' invents the fictitious force to account for his
being "pulled out from under" the system he observes. However, as mathematical
formalisms and fictitious entities go, the force F has a very convincing
perceptual existence, as anyone who has ever ridden in a car around a
sharp curve can attest!
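
The chain of reasoning from Eq. (5-24) to Eq. (5-30) can be checked numerically. In the sketch below, all numerical values and helper names are hypothetical choices made for illustration: a body of mass m is acted on by a constant real force f in the inertial frame O, the velocity seen from the accelerated frame O' is obtained from Eq. (5-24), and the force f' that O' infers from the rate of change of momentum is compared with f + F.

```python
def sub(u, w):
    """Componentwise difference of two vectors given as tuples."""
    return tuple(a - b for a, b in zip(u, w))

def scale(k, u):
    """Vector u multiplied by the scalar k."""
    return tuple(k * a for a in u)

m = 2.0              # kg (hypothetical)
f = (3.0, 0.0)       # real force seen by O, in N (hypothetical)
A = (1.0, -0.5)      # acceleration of frame O' relative to O, m/s^2 (hypothetical)
v0 = (0.5, 0.2)      # velocity of the body at t = 0, m/s (hypothetical)

def v_inertial(t):
    """Velocity seen by O: constant acceleration f/m."""
    return tuple(v0_i + (f_i / m) * t for v0_i, f_i in zip(v0, f))

def v_accelerated(t):
    """Velocity seen by O', from Eq. (5-24): v' = v - A t."""
    return sub(v_inertial(t), scale(t, A))

# O' estimates a' = dv'/dt by a central difference, then f' = m a' (Eq. 5-28c)
t, dt = 2.0, 1e-3
a_prime = scale(1.0 / (2.0 * dt), sub(v_accelerated(t + dt), v_accelerated(t - dt)))
f_prime = scale(m, a_prime)

F = scale(-m, A)                                           # Eq. (5-29)
f_plus_F = tuple(f_i + F_i for f_i, F_i in zip(f, F))      # Eq. (5-30)

print("f' from O' measurements:", f_prime)
print("f + F:                  ", f_plus_F)
```

Because the velocities here are linear in time, the central difference reproduces the acceleration essentially exactly, so the two printed vectors agree to within rounding error.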

In developing the concept of fictitious force, we restricted our consideration
to a noninertial frame of reference O' which does not rotate with respect to the
inertial frame O. This was done for simplicity. Every point fixed to the noninertial
frame has the same acceleration A as the origin O', as seen by an observer in O.
This is not the case for a rotating frame. In such a frame, the acceleration of a point
depends on its position (x', y', z') as well as on the acceleration A of the origin O'
with respect to O. (We will develop vector methods for dealing with such rotation
in Chaps. 9 and 10.)




Fig. 5-12 Observer O' stands on the surface of the rotating earth at a location
having latitude λ. The radius vector from the center of the earth to O' is R. The
vector along the perpendicular from the axis of the earth to O' is r. From the point
of view of an observer in an inertial frame, O' experiences a centripetal
acceleration A whose direction is that of -r.


Nevertheless,  the  concept  of  fictitious  force  applies  to  rotating  reference  frames 
as  well  as  to  nonrotating  ones.  Although  it  may  be  more  complicated  to  determine 
the  acceleration  a of  a body  as  seen  by  O and  the  corresponding  acceleration  a'  as 
seen  by  O',  there  is  still  associated  with  each  of  these  accelerations  a force.  If  the 
mass  of  the  body  is  m,  those  forces  are,  respectively,  f = ma  and  f'  = ma'.  And 
Eq. (5-23c), a' = a - A, still yields a value of -A which is related to the fictitious
force F by the equation F = -mA, as in Eq. (5-29).

The surface of the earth is a noninertial frame. Rotating as it does on
its axis, the earth continually accelerates us centripetally with acceleration
$A = -(v^2/r)\,\hat{r}$, where v is the surface speed of the earth at the latitude λ (the
Greek letter lambda) of the observer, r is the vector from the axis of the
earth to the observer, along the perpendicular to the axis, shown in Fig.
5-12, and $\hat{r}$ is the unit vector along r.


EXAMPLE 5-7

Find the centrifugal acceleration experienced by an observer O' at the equator (λ =
0°), the observed magnitude g' of the gravitational acceleration, and the observed
weight of a standard 2-kg mass. Take the value of g, which would be observed if the
earth were not rotating, to be 9.832 m/s² and the radius of the earth as 6380 km.

■ An observer in an inertial frame (say, in space) sees an observer O' standing at
the equator to have a centripetal acceleration $A = -(v^2/R)\,\hat{r}$, where $\hat{r}$ is the outward
unit vector along the perpendicular from the axis of the earth to O' (see Fig. 5-12),
R is the radius of the earth, and v is the speed of O' as he is carried around by the
rotation of the earth. For an observer at the equator, the vectors R and r coincide,
so R = r. The centrifugal acceleration A' is the negative of A, so you have

$A' = -A = +\frac{v^2}{R}\,\hat{r}$

A point on the equator goes around the circumference of the earth with a period
T = 23 h 56 min (see Example 3-10) and a speed

$v = \frac{2\pi R}{T}$

Its centrifugal acceleration is

$A' = \frac{v^2}{R} = \frac{4\pi^2 R^2}{T^2}\,\frac{1}{R} = \frac{4\pi^2 R}{T^2}$

directed outward. Thus you have

$A' = \frac{4\pi^2 \times 6.380 \times 10^6\ \mathrm{m}}{(23\ \mathrm{h} \times 3600\ \mathrm{s/h} + 56\ \mathrm{min} \times 60\ \mathrm{s/min})^2} = 3.393 \times 10^{-2}\ \mathrm{m/s^2}$

or  about  0.35  percent  of  the  magnitude  of  the  gravitational  acceleration.  Since  the 
direction  of  A'  is  opposite  to  that  of  g,  you  have 

g' = g - A'
   = 9.832 m/s² - 0.034 m/s² = 9.798 m/s²

A 2.000-kg mass, which would weigh mg = 19.66 N in the absence of rotation, will
weigh 19.60 N at the equator. An observer standing on the earth and noting this
slight reduction in weight can attribute it to the fictitious centrifugal force F =
mA', of magnitude about 0.06 N and directed outward along R, acting on it. But you
can see why, for many practical purposes, we can
consider ourselves to be situated in an inertial reference frame.
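
The arithmetic of Example 5-7 can be retraced with a short Python sketch. It uses only the values given in the example (R, T, g, and the 2.000-kg mass); the variable names are illustrative choices, not notation from the text.

```python
import math

# Values taken from Example 5-7
R = 6.380e6                  # radius of the earth, m
T = 23 * 3600 + 56 * 60      # rotation period of 23 h 56 min, in s
g = 9.832                    # m/s^2, value for a nonrotating earth
m = 2.000                    # kg

v = 2 * math.pi * R / T      # speed of a point on the equator, m/s
A_prime = v**2 / R           # centrifugal acceleration, equal to 4 pi^2 R / T^2
g_prime = g - A_prime        # observed gravitational acceleration at the equator

weight_no_rotation = m * g   # N
weight_equator = m * g_prime # N

print(f"A' = {A_prime:.4e} m/s^2 ({100 * A_prime / g:.2f} percent of g)")
print(f"g' = {g_prime:.3f} m/s^2")
print(f"weight without rotation = {weight_no_rotation:.2f} N")
print(f"weight at the equator   = {weight_equator:.2f} N")
```

Carrying more digits than the example does shows that the weight reduction m A' is a little under 0.07 N; the rounded weights quoted above differ by 0.06 N.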


Even  though  weight  reduction  due  to  the  earth’s  rotation  is  quite 
small,  there  are  important  consequences  of  the  fact  that  the  surface  of  the 
earth  is  a noninertial  frame.  It  is  well  known  that  ocean  currents  and 
moving  air  masses  tend  to  curve  in  a clockwise  direction  in  the  Northern 
Hemisphere  and  in  a counterclockwise  direction  in  the  Southern  Hemi- 
sphere. We  will  here  give  a qualitative  account  of  this  phenomenon  in 
terms  of  fictitious  forces. 

Consider  a quantity  of  water  in  the  ocean  at  the  equator.  It  participates 
in  the  daily  rotation  of  the  earth  as  a whole,  moving  eastward  with  a speed 
equal to that of the surface of the earth at the equator, as shown in Fig.
5-13a. Now suppose that for some reason (say because of the upwelling of
water  beneath  it)  this  water  moves  northward.  As  it  does  so,  it  remains  on 
the  surface  of  the  earth,  and  consequently  finds  itself  closer  to  the  axis  of 
the  earth  than  before.  From  the  point  of  view  of  an  observer  in  space,  in  an 
inertial  frame,  the  law  of  inertia  must  apply.  That  is,  the  water  tends  to 
continue to move eastward with the same speed as when it was at the equator.
But the surface speed of the earth in the new latitude is less than that at
the  equator,  so  that  the  water  tends  to  outpace  the  earth.  That  is,  it  moves 
eastward  with  respect  to  the  surface  of  the  earth. 
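
To get a feeling for the speeds involved, the following Python sketch estimates the surface speed of the earth at a given latitude. It assumes the radius and rotation period used in Example 5-7 and the geometric relation r = R cos λ between the distance r from the axis and the latitude λ (see Fig. 5-12). The latitude of 30° is a hypothetical choice, and the function name is illustrative only.

```python
import math

R = 6.380e6               # radius of the earth, m (Example 5-7)
T = 23 * 3600 + 56 * 60   # rotation period, s (Example 5-7)

def surface_speed(latitude_deg):
    """Eastward speed of a point fixed to the earth's surface at this latitude."""
    r = R * math.cos(math.radians(latitude_deg))  # distance from the earth's axis
    return 2 * math.pi * r / T

v_eq = surface_speed(0.0)     # speed at the equator
v_lat = surface_speed(30.0)   # speed at a hypothetical latitude of 30 degrees

# In the simplified argument of the text, water that kept its full equatorial
# speed would move eastward relative to the ground at this rate.
relative_eastward = v_eq - v_lat

print(f"v_eq  = {v_eq:.1f} m/s")
print(f"v_lat = {v_lat:.1f} m/s")
print(f"relative eastward speed = {relative_eastward:.1f} m/s")
```

The difference is large because the sketch takes the qualitative argument literally; it is meant only to show the direction and rough scale of the tendency, not to predict actual current speeds.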

Now  consider  the  reverse  situation.  A quantity  of  water  somewhere 
north  of  the  equator  is  moving  at  the  surface  speed  of  the  earth  at  this 
latitude — that  is,  it  is  at  rest  with  respect  to  the  earth's  surface.  Suppose 
that  for  some  reason  it  moves  southward,  toward  the  equator.  Again,  an 
observer  in  an  inertial  frame  will  use  the  law  of  inertia  to  predict  that  the 
water  will  tend  to  move  eastward  at  its  original  speed.  But  as  it  moves 
southward,  it  moves  farther  from  the  earth’s  axis,  into  a region  where  the 
surface speed is higher. Therefore, the water tends to lag, or to move westward
with respect to the surface of the earth. Taken together, these tendencies
result in a clockwise rotation of water masses, as shown in Fig. 5-13b.

Fig. 5-13 (a) Water moving northward from the equator tends to maintain its surface speed
v_eq, which is greater than v_λ, the characteristic surface speed at latitude λ. An observer on the
surface of the earth who does not perceive the rotation of the earth attributes the tendency of
the water to move relatively eastward to the fictitious Coriolis force. The same argument in
reverse applies to water moving southward toward the equator. (b) The eastward velocity v_eq - v_λ
relative to the earth of water which has moved northward and the westward velocity v_λ - v_eq
relative to the earth of water which has moved southward in the Northern Hemisphere. The
overall effect is a tendency for moving water to circulate clockwise in the Northern Hemisphere.

But now let us look at the situation from the point of view of an observer
O' on the surface of the earth. This observer is not (or chooses not to
be)  conscious  of  the  rotation  of  the  earth.  Nevertheless,  he  must  account 
for  the  tendency  of  the  ocean  currents  to  rotate  clockwise.  He  therefore 
“invents”  a fictitious  force  to  explain  the  motion  of  the  water.  This  force  is 
peculiar  in  that  it  acts  only  on  moving  matter  (here  water),  and  in  that  it  acts 
at  right  angles  to  the  direction  of  motion  (eastward  on  northward-moving 
water  and  westward  on  southward-moving  water).  This  fictitious  force  is 
called  the  Coriolis  force,  after  the  French  physicist  who  first  studied  it  in 
detail. 

Using  an  argument  similar  to  that  above,  you  can  show  that  the 
Coriolis force in the Southern Hemisphere acts so as to produce counterclockwise
motion of ocean currents. (Contrary to popular belief, the
Coriolis  force  is  far  too  small  to  account  for  the  vortex  sometimes  seen  in 
draining  bathtubs  and  in  similar  situations.  It  is  not  true  that  the  direction 
of  rotation  of  this  vortex  is  characteristic  of  the  hemisphere  in  which  the 
tub  is  located.) 


Fig. 5-14 (a) Free-body diagram for a plumb bob of mass m, hanging from a string fixed near
the surface of a nonrotating earth. (b) The same bob located on the equator of a rotating earth.
An observer standing next to the bob attributes the decrease in the magnitude of the force S
exerted by the string to the fictitious centrifugal force F (much exaggerated). The vector -r
points toward the axis of the earth, and the vector -R points toward the center of the earth
(see Fig. 5-12). (c) The same plumb bob located at latitude λ in the Northern Hemisphere. (Why
is the magnitude F of the centrifugal force less than that shown in part b?) (d) An identical
plumb bob moving smoothly and steadily eastward in a submarine does not hang along the
vertical established by the stationary plumb bob in part c, but makes a larger angle with -R.
(e) The downward force -S* on the moving plumb bob is replaced by its constituent vectors
along the horizontal and vertical established by the stationary plumb bob. The
southward-directed horizontal constituent of -S* is the Coriolis force.

It is a little harder to see that there is also a Coriolis force acting on a body
which moves relative to the surface of the earth in an east-west direction. In order
to understand its origin, consider first the simple case of a plumb bob (a body
hanging at rest from a string) on a nonrotating earth. Figure 5-14a is a free-body
diagram of the bob, whose mass is m. It is at rest because the upward force S exerted
by the string is exactly equal in magnitude to the downward gravitational
force  mg.  Next,  let  the  earth  rotate  and  consider  the  free-body  diagram,  Fig.  5-14b, 
of  a plumb  bob  hanging  at  the  equator.  The  gravitational  force  mg  is  unchanged  in 
magnitude  and  still  has  the  direction  — R toward  the  center  of  the  earth  (as  it 
always  does).  But  as  you  saw  in  Example  5-7,  the  magnitude  of  the  force  exerted 
by  the  string  is  slightly  reduced  from  that  in  the  case  of  the  nonrotating  earth.  An 
observer  O in  space  sees  this  as  a consequence  of  the  centripetal  acceleration  of 
the  bob  as  it  rotates  with  the  earth.  But  an  observer  O'  standing  on  the  surface  of 
the  earth  is  in  a noninertial  reference  frame  and  attributes  the  observed  reduction 
in  S to  the  presence  of  the  fictitious  centrifugal  force  F shown  in  the  figure.  The 
direction  of  F is  that  of  r,  the  unit  vector  along  the  perpendicular  to  the  axis  of  the 
earth  to  the  plumb  bob  (see  Fig.  5-12).  For  a point  on  the  equator,  r coincides  with 
R,  the  unit  vector  along  the  earth’s  radius.  The  force  exerted  by  the  string  is  given 
by  the  condition 


S + mg  + F = 0 


or 


S = - (mg  + F) 

Now let the plumb bob hang just above the surface of the earth at a location
having latitude λ. (We will suppose that this location is in the Northern
Hemisphere.) The free-body diagram of the bob is shown in Fig. 5-14c. The gravitational
force mg again has direction -R, and the centrifugal force again has direction
r. But as you can see from Fig. 5-12, r and R are no longer coincident. Instead
they make an angle λ with each other. This angle is shown in Fig. 5-14c. In order
to keep the plumb bob motionless, the string must again exert a force S = -(mg +
F), as the figure shows. Since plumb bobs are normally used to establish the direction
called “vertical,” we call the line along which S lies the “plumb bob” vertical.
(Note that the direction of S differs from that of R, which can be determined by
means of astronomical measurements.)

We  are  now  ready  to  consider  the  Coriolis  force  which  is  exerted  (from  the 
point  of  view  of  observer  O')  on  bodies  moving  with  respect  to  the  surface  of  the 
earth. Suppose that a second, identical plumb bob is mounted in a submerged submarine,
which can move very smoothly. The submarine heads eastward at constant
speed at a constant depth just great enough to avoid wave action. The observer
O, in an inertial frame in space, notes that the plumb bob is now circling the
earth’s axis faster than O', who remains stationary with respect to the surface of
the earth. She therefore expects that the centripetal acceleration of the bob is increased
and that the new magnitude S* of the force exerted by the string is slightly
less than S. This is indeed the case.

Let  us  consider  the  situation  in  more  detail  from  the  point  of  view  of  O'.  (We 
must  assume  that  he  has  some  means  of  observing  the  plumb  bob  in  the  moving 
submarine from his fixed position on the surface of the earth.) His new observations
are shown in Fig. 5-14d. His own plumb bob still hangs along the plumb bob
vertical, which is shown as a dashed line. But the bob in the submarine hangs
at a different angle. Observer O' argues that the centrifugal force exerted on the
bob in the submarine is greater in magnitude than that exerted on his own, because
of the greater speed at which the bob in the submarine circles the earth’s
axis. He calls the new force F*. Its direction is still r; that is, it is parallel to F. The
force exerted by the string on the moving bob, which holds it motionless with
respect to the submarine, is given by

S* = -(mg + F*)


as  shown  in  Fig.  5-14d. 

We  now  show  that  F*  tends  to  produce  southward  motion.  In  Fig.  5-14e,  the 
“plumb  bob”  horizontal  is  constructed  by  drawing  the  perpendicular  to  the  plumb 
bob vertical determined in Fig. 5-14c. If the ocean water through which the submarine
is traveling is stationary with respect to the surface of the earth, the ocean
surface will conform to the plumb bob horizontal. (Not only does the ocean adjust
to this horizontal, but so does the “solid” earth. This is the source of the equatorial
bulge of the earth, whose radius is some 21 km greater at the equator than at the
poles.) The downward vector -S* specifies the direction in which the moving
bob hangs. It is not parallel to -S. We can replace -S* by its constituent vectors
along -S, the plumb bob vertical, and perpendicular to -S, along the plumb bob
horizontal. The horizontal constituent vector of -S* lies along the surface in the
southward direction. The plumb bob is held in place by the string and merely inclines
toward this direction. But an unconstrained body, such as a moving mass of
water in an ocean current or a long-distance artillery shell, will be accelerated
southward by this force, which is also given the name Coriolis force. Can you
explain the direction of this acceleration from the point of view of inertial observer
O, who does not “invent” a fictitious force?

A similar  argument  shows  that  a westward-moving  body  in  the  Northern 
Hemisphere  is  likewise  deflected  to  the  right,  that  is,  northward.  A quantitative 
discussion  is  beyond  the  scope  of  this  book,  but  it  turns  out  that  the  magnitude  of 
the  horizontal  component  of  the  Coriolis  force  is  independent  of  the  direction  of 
motion  of  the  body  on  which  it  is  exerted,  as  long  as  the  body  moves  parallel  to  the 
surface  of  the  earth. 




Fig.  5-15  The  rotation  of  the  path 
described  by  a Foucault  pendulum  in 
successive  swings,  resulting  from  the 
Coriolis  force.  The  path  of  the  bob, 
located  in  the  Northern  Hemisphere,  is 
seen  as  it  would  be  by  a viewer  looking 
down on it. The path always has clockwise
curvature. (The curvature is grossly
exaggerated. In actuality, the path
during a single swing is not distinguishable
to the unaided eye from a
straight  line.) 


A spectacular demonstration of the Coriolis effect is the Foucault pendulum,
named after the distinguished French physicist who was the first to
set one up in 1851. A very long pendulum with a heavy bob will swing for a
long time once it is started and will not be much disturbed by drafts and
similar effects. The pendulum describes a vertical plane as it oscillates. As
time passes, an observer standing near the pendulum (and thus rotating
with the earth) observes that the plane of oscillation rotates slowly about a
vertical axis. That is, if the pendulum is at first swinging in an east-west
direction, it will later swing between southeast and northwest (in the
Northern Hemisphere) and later south-north, and so on. See Fig. 5-15,
where the effect is grossly exaggerated. This phenomenon is often explained
by saying that “the earth rotates under the pendulum,” but this is
not a complete explanation except at the earth’s poles. What does happen is
that the Coriolis force exerts a small but consistent force toward the right
(in the Northern Hemisphere) no matter in what direction the pendulum is
swinging. This leads to the described effect. It turns out that the rotation
rate of the plane of the pendulum is

$\frac{2\pi}{24}\,\sin\lambda\ \mathrm{rad/h} = 15^\circ \sin\lambda\ \mathrm{h^{-1}}$

where λ is the latitude. You can see that this expression gives the expected
result at the earth’s poles (λ = 90°) where the earth does indeed “turn
beneath” the pendulum and the plane of the pendulum takes 24 h to rotate
with respect to the earth. At the equator (λ = 0°) the centrifugal force is
vertically upward. Thus there is no horizontal component, and the pendulum
plane does not rotate at all, as the expression predicts.


5-5 ROCKETS


Now that people have begun the exploration of space, rockets have become
a subject of general public interest. The central feature of the rocket,
which underlies its unique capabilities, is its reaction engine. The rocket
carries its own supply of fuel and oxidizer, which the engine ejects backward
as hot gas. (Fuel and oxidizer together are frequently referred to as
“fuel.”)  Unlike  a jet  plane,  a rocket  has  no  dependence  on  an  external  supply 
of  air. 

As  the  rocket  engine  operates,  the  mass  of  the  rocket  changes  rapidly. 
Consequently, Newton’s second law in the general form F = dp/dt does not
lead to F = ma. We must therefore begin the analysis of the motion of a
rocket  from  the  basic  principle  of  momentum  conservation. 

We  consider  a rocket  in  deep  space,  far  from  any  gravitating  body.  It 
moves  in  a straight  line,  subject  only  to  the  influence  of  its  main  engine. 
Figure 5-16 shows two consecutive views of the rocket, as seen from an inertial
frame of reference. In Fig. 5-16a, the rocket is shown at some arbitrary
time t, when its mass is m. It is most convenient to describe the motion
of the system in terms of signed scalars. An inertial observer sees the rocket
moving with instantaneous velocity V, the forward direction of its motion
being used to define the positive direction. The engine ejects a continuous
stream of gas backward from the tail of the rocket. As a result, the
mass of the rocket changes at a constant rate dm/dt. This quantity has a negative
value, because the mass is decreasing.

The velocity at which gas is ejected from the rocket is v'_g. But this is the
velocity of the gas relative to the rocket, and not relative to the inertial frame.
In order to find the velocity v_g of the gas relative to the inertial frame, we
use the velocity transformation, Eq. (5-23b). In signed scalar notation this is
written v' = v - V, or v = V + v'. For the present case, v' is the velocity of
the gas (the body B of Sec. 5-4) as seen from the rocket, so that v' = v'_g.
And v is the velocity of the gas as seen from the inertial frame of reference,
so that v = v_g. The velocity transformation of Eq. (5-23b) thus gives

v_g = V + v'_g

This result is shown in Fig. 5-16b. (Unlike v'_g, which always has a negative
value since gas is always ejected backward from the point of view of an observer
on the rocket, v_g can be either positive or negative depending on
how fast the rocket is going. All observers will see the rocket leaving its exhaust
gas behind, but not all observers will see the rocket and the exhaust
gas moving in opposite directions!)


We now consider the entire rocket-plus-gas system from the inertial
frame. As an isolated system seen from an inertial frame, its momentum
will remain constant as time passes. We can thus write

(momentum of rocket at time t) = (momentum of rocket at time t + dt)
    + (momentum of gas ejected during time interval dt)


In  order  to  make  use  of  this  equation,  we  next  calculate  the  various  mo- 
menta. 

At time t, shown in Fig. 5-16a, the total mass of the system is the mass
of the rocket, m, and its velocity is V. Its momentum is thus mV. At the infinitesimally
later time t + dt, shown in Fig. 5-16b, the mass of the rocket has
changed by an amount given by the product of the rate of change dm/dt
and the time dt which has elapsed. Thus the rocket has mass

$m + \frac{dm}{dt}\,dt = m + dm$

(The  change  in  mass  dm  has  a negative  value,  since  dm/dt  has  a negative 
value.) 

During  the  same  time  interval  dt,  the  velocity  of  the  rocket  has  changed 
(increased)  by  an  infinitesimal  amount  dV,  and  is  thus  V + dV.  The  new 
momentum  of  the  rocket  is 


(m  + dm)(V  + dV) 

The ejected gas has mass -dm (since any change in the mass of the
rocket must correspond to a change of the same magnitude but opposite
sign in the mass outside the rocket). As seen from the inertial frame, the
gas has emerged from the engine at velocities changing uniformly from
V + v'_g to V + dV + v'_g. The average velocity of the gas ejected during the
time dt is thus V + v'_g + dV/2. The momentum of the ejected gas is consequently

$-dm\left(V + v'_g + \frac{dV}{2}\right)$


Fig. 5-16 A rocket shown at two instants separated by an infinitesimal
time interval dt. The engine is operating continuously, but the gas expelled
before the time t is not shown in either view. (a) The rocket at time t. Its
velocity as seen from an inertial frame is V, and its mass is m. (b) The rocket at
time t + dt. Its velocity is V + dV. The engine has expelled gas having a mass
-dm during the interval dt, and the mass of the rocket is now m + dm. As
seen from the inertial frame, the gas expelled at the beginning of the interval
dt has velocity V + v'_g. In the case illustrated, the rocket has not yet
attained a very large velocity V. Consequently, the quantity v_g = V + v'_g has
a negative value. That is, an observer at rest sees the ejected gas to be moving
backward. The analysis in the text shows that the direction of v_g is
immaterial.


Equating  the  total  momentum  at  the  beginning  of  the  time  interval 
with  the  total  momentum  at  its  end,  we  have 

$mV = (m + dm)(V + dV) + (-dm)\left(V + v'_g + \frac{dV}{2}\right)$

Expanding the terms on the right side of this equation gives

$mV = mV + m\,dV + dm\,V + dm\,dV - dm\,V - dm\,v'_g - dm\,\frac{dV}{2}$

which reduces to

$0 = m\,dV + dm\,dV - dm\,v'_g - dm\,\frac{dV}{2}$

All the terms on the right side of this equation are infinitesimal. The first
and the third are products of an infinitesimal quantity with a finite quantity.
But the second and fourth terms are each the product of two infinitesimals.
They are therefore negligibly small compared to the other two. We
thus neglect them and have


$0 = m\,dV - dm\,v'_g$
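
The claim that the two second-order terms are negligible can be checked numerically. The Python sketch below solves the momentum balance for dV twice, once keeping all four terms and once keeping only the two first-order terms, and compares the results as the ejected mass is made smaller. The rocket mass and exhaust velocity are hypothetical values chosen for illustration, and the function names are not from the text.

```python
def dv_exact(m, dm, v_rel):
    """Solve 0 = m dV + dm dV - dm v_rel - dm dV/2 for dV (all four terms kept)."""
    return dm * v_rel / (m + dm / 2.0)

def dv_first_order(m, dm, v_rel):
    """Solve 0 = m dV - dm v_rel for dV (products of two small quantities dropped)."""
    return dm * v_rel / m

m = 1000.0        # rocket mass in kg (hypothetical)
v_rel = -3000.0   # exhaust velocity relative to the rocket in m/s (hypothetical, negative)

relative_errors = []
for fraction in (1e-1, 1e-3, 1e-6):
    dm = -fraction * m                    # change in rocket mass (negative)
    exact = dv_exact(m, dm, v_rel)
    approx = dv_first_order(m, dm, v_rel)
    rel_err = abs(exact - approx) / abs(exact)
    relative_errors.append(rel_err)
    print(f"|dm|/m = {fraction:.0e}: dV exact = {exact:.6e} m/s, "
          f"first-order = {approx:.6e} m/s, relative difference = {rel_err:.2e}")
```

The relative difference shrinks in proportion to |dm|/m, which is the sense in which the neglected terms vanish in the limit of an infinitesimal time interval.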


Dividing  through  by  dt,  we  obtain 


$0 = m\,\frac{dV}{dt} - v'_g\,\frac{dm}{dt}$

or

$m\,\frac{dV}{dt} = v'_g\,\frac{dm}{dt}$    (5-31a)


Since  V is  the  velocity  of  the  rocket,  dV/dt  is  its  acceleration  A.  So  we  obtain 
the  result 


$mA = v'_g\,\frac{dm}{dt}$    (5-31b)

(Note  that  A has  a positive  value,  as  expected,  since  the  right  side  of  this 
equation  is  the  product  of  two  terms  with  negative  value.) 


In Eq. (5-31a) we have expressed the product of the mass m of the
rocket and its acceleration A as the quantity v'_g dm/dt. Since the equation is
in  the  form  of  Newton's  second  law 


mA  = F 


we  can  identify  v'g  dm/dt  as  a force.  That  is,  we  now  take  as  a system  the  rocket 
alone,  instead  of  the  rocket  plus  the  ejected  gas.  This  new  system  is  not  an 
isolated  system.  Rather,  there  is  an  external  force 


$F = v'_g\,\frac{dm}{dt}$    (5-32)


acting  on  it,  which  causes  it  to  accelerate.  This  force  is  called  the  thrust  of 
the  rocket  engine.  It  comprises  one  member  of  a Newton’s  third  law 
action-reaction  pair,  and  is  the  force  exerted  on  the  rocket  engine  by  the 
ejected  gas.  Hence  the  name  “reaction  engine.”  (The  other  member  of  the 
pair  is  the  force  exerted  on  the  ejected  gas  by  the  engine.  It  has  no  direct 
effect  on  the  system  of  interest,  which  is  the  rocket.)  It  is  reasonable  that 
the magnitude of the thrust be proportional to the speed |v'_g| at which the
engine ejects gas, and also to |dm/dt|, the mass of gas ejected per unit time,
called  the  fuel  consumption  rate. 

Equation  (5-32)  can  be  written  in  the  form 


$v'_g = \frac{F}{dm/dt}$    (5-33)


The term on the right is the thrust of the rocket, divided by the fuel consumption
rate required to obtain it. This is an important figure of merit in
the  design  of  rocket  systems;  the  rocket  engineer  would  like  to  have  the 
maximum  possible  thrust  for  the  minimum  possible  fuel  consumption  rate. 
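
Equations (5-32) and (5-33) can be illustrated with a few lines of Python. The exhaust velocity and the rate of change of mass used here are hypothetical round numbers, not data for any particular engine, and the function name is an illustrative choice.

```python
def thrust(v_rel, dm_dt):
    """Thrust F = v'_g dm/dt of Eq. (5-32), in signed scalar form."""
    return v_rel * dm_dt

v_rel = -3000.0   # exhaust velocity relative to the rocket, m/s (hypothetical, backward)
dm_dt = -50.0     # rate of change of rocket mass, kg/s (hypothetical, decreasing)

F = thrust(v_rel, dm_dt)         # positive: the thrust acts in the forward direction
thrust_per_rate = F / dm_dt      # Eq. (5-33): recovers v'_g

print(f"thrust F = {F:.0f} N")
print(f"F/(dm/dt) = {thrust_per_rate:.0f} N·s/kg, of magnitude {abs(thrust_per_rate):.0f}")
```

Both factors are negative, so the thrust comes out positive, and the ratio of thrust to the rate of change of mass returns the exhaust velocity, as Eq. (5-33) states.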


The  quantity  F/(dm/dt)  on  the  right  side  of  Eq.  (5-33)  is  called  the  specific 
thrust  or,  more  commonly,  the  specific  impulse.  Its  SI  units  are  usually  quoted  in 
the form newton-seconds per kilogram (N·s/kg). Equation (5-33) shows that specific
impulse is equal to the velocity, relative to the rocket, at which the engine can
eject the gas. It turns out that for well-designed engines this is determined mainly
by the chemical energy per unit mass stored in the fuel. The largest value of specific
impulse theoretically possible for chemical fuels is about 3300 N·s/kg, using
a hydrogen-fluorine  mixture.  This  mixture  is  rarely  (if  ever)  used  because  of  the 
great  difficulty  and  danger  of  handling  fluorine  and  the  extreme  noxiousness  of 
the  principal  exhaust  product,  hydrogen  fluoride.  Fortunately  the  hydrogen- 
oxygen  mixture  has  a theoretical  specific  impulse  only  a few  percent  smaller  than 
hydrogen-fluorine,  and  the  exhaust  is  mainly  steam.  The  rockets  used  as  the  main 
engines  in  launching  space  vehicles  nearly  all  use  hydrogen  and  oxygen,  which 
are  stored  in  their  liquid  form. 


Even if v'_g and dm/dt are constant while the rocket engine is burning, so
that  the  thrust  F is  constant,  the  acceleration  A = F/m  of  the  rocket  is  not 
constant.  The  acceleration  increases  with  time  because  the  mass  m of  the 
rocket  decreases.  This  decrease  is  due  to  the  fact  that  the  engine  burns  fuel 
to  produce  the  ejected  gas,  so  that  the  value  of  dm/dt  is  negative.  The  mass 
decrease  is  by  no  means  a small  effect;  more  than  80  percent  of  the  launch 
mass  of  a rocket  intended  for  space  flight  is  fuel. 
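
How strongly the acceleration grows can be seen from a small numerical sketch of A = F/m with constant thrust and a steadily decreasing mass. All the numbers below are hypothetical; the burn time is chosen so that exactly 80 percent of the launch mass is consumed, in line with the figure just quoted. The rocket is taken to be in deep space, as in the derivation above, so that the thrust is the only force.

```python
m0 = 1.0e5        # launch mass, kg (hypothetical)
v_rel = -3000.0   # exhaust velocity relative to the rocket, m/s (hypothetical)
dm_dt = -500.0    # rate of change of mass, kg/s (hypothetical, constant)
burn_time = 160.0 # s, chosen so that 80 percent of the launch mass is burned

F = v_rel * dm_dt # constant thrust, Eq. (5-32)

def mass(t):
    """Rocket mass at time t after ignition, for constant dm/dt."""
    return m0 + dm_dt * t

def acceleration(t):
    """Acceleration A = F/m at time t; grows as the mass decreases."""
    return F / mass(t)

for t in (0.0, 40.0, 80.0, 120.0, 160.0):
    print(f"t = {t:5.0f} s   m = {mass(t):8.0f} kg   A = {acceleration(t):6.2f} m/s^2")
```

With these numbers the acceleration at burnout is five times its value at ignition, even though the thrust never changes.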




To  determine  quantitatively  the  motion  of  the  rocket — that  is,  to  find 
its  velocity  and  position  from  its  acceleration  — involves  complications  we 
have  not  had  to  deal  with  in  previous  cases,  where  the  acceleration  was 
always  constant.  One  procedure  for  handling  systems  with  nonconstant 
acceleration  is  developed  in  Sec.  5-6. 


5-6 THE SKYDIVER

The rocket discussed in Sec. 5-5 is an important, though not typical, example
of a physical system in which a body moves with nonconstant acceleration.
While its engine is burning, the acceleration of a rocket increases
even though the force acting on it remains constant, because its mass is decreasing.
Much more commonly, varying acceleration is the result of a
varying force acting on a body whose mass remains constant. But regardless
of the cause of variation, if the acceleration of a body varies, its velocity and
position cannot be found by making direct use of the familiar kinematic equations of
Sec. 2-7, since they were derived for the case of constant acceleration.

In this section we develop a method for determining the velocity and position of a body which moves with nonconstant acceleration. The method involves a sequence of numerical calculations. We will use the method to study an object falling vertically downward through the air near the surface of the earth, making the physically realistic assumption that the strength of the retarding drag force due to air resistance is proportional to the square of the velocity. That is, we will assume that the air flow around the relatively large, rapidly moving object is the turbulent flow discussed in Sec. 4-6. In particular, the method will be applied to study the motion of a skydiver, a person who jumps from an airplane to enjoy the sensation of fall for some time before opening a parachute. See Fig. 5-17.

In terms of the signed scalar representation, with the downward direction taken as positive, the net force acting on the skydiver is positive. But as the velocity of this daring athlete builds up during the fall, the negative drag force also builds up and the net force decreases. The acceleration a = F/m therefore decreases, so the constant-acceleration kinematic equations we have frequently used up to this point cannot be applied in a direct manner. However, you will soon see that they can be applied in a restricted way which nevertheless yields useful results.

Fig. 5-17 A skydiver.

190 Applications of Newton’s Laws

As a preliminary, let us reconsider a one-dimensional situation in which the acceleration a of a body is constant. Suppose that you know the value of a and also know the velocity $v_0$ and the position $x_0$ of the body at some initial time $t_0 = 0$. One way that you can determine its velocity and position at subsequent times is to select a time increment $\Delta t$ and then carry out the following set of calculations:

Use the constant value of a to calculate

\[ v_{1/2} = v_0 + a\,\frac{\Delta t}{2} \tag{5-34a} \]

from the given value of $v_0$. Use the result to calculate

\[ x_1 = x_0 + v_{1/2}\,\Delta t \tag{5-34b} \]

from the given value of $x_0$. Then set $t_1 = \Delta t$.

Next use a, and the value of $v_{1/2}$ just obtained, to calculate

\[ v_{3/2} = v_{1/2} + a\,\Delta t \tag{5-34c} \]

Use the result, and the value of $x_1$ just obtained, to calculate

\[ x_2 = x_1 + v_{3/2}\,\Delta t \tag{5-34d} \]

Then set $t_2 = 2\,\Delta t$.

Next use a, and the value of $v_{3/2}$ just obtained, to calculate

\[ v_{5/2} = v_{3/2} + a\,\Delta t \tag{5-34e} \]

Use the result, and the value of $x_2$ just obtained, to calculate

\[ x_3 = x_2 + v_{5/2}\,\Delta t \tag{5-34f} \]

Then set $t_3 = 3\,\Delta t$.

Continue these calculations until t reaches whatever value is required.

Equations (5-34) are exact for the case of a constant acceleration a, regardless of the size of the chosen time increment $\Delta t$. The value of $\Delta t$ you actually choose is determined by how far apart in time you want to know the successive values of velocity and position.
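The cycle of calculations in Eqs. (5-34) can be sketched as a short loop. The following Python fragment is only an illustration, not the program referred to in the text; the function name `step_constant_acceleration` and its argument names are assumptions made for this sketch. It takes the first half step of Eq. (5-34a) once, then alternates position and velocity updates, and prints the result beside the familiar analytical value for motion from rest.

```python
def step_constant_acceleration(a, v0, x0, dt, n_steps):
    """Follow the pattern of Eqs. (5-34) for a constant acceleration a.

    Returns two lists: the times t_n = n*dt and the positions x_n.
    (Hypothetical helper written for this illustration.)
    """
    t, x = 0.0, x0
    v_half = v0 + a * dt / 2          # Eq. (5-34a): velocity at t = dt/2
    times, positions = [t], [x]
    for _ in range(n_steps):
        x = x + v_half * dt           # Eqs. (5-34b), (5-34d), (5-34f)
        t = t + dt                    # update the time
        times.append(t)
        positions.append(x)
        v_half = v_half + a * dt      # Eqs. (5-34c), (5-34e)
    return times, positions


# Free fall from rest with a = 9.8 m/s^2 and a time increment of 0.5 s.
times, positions = step_constant_acceleration(a=9.8, v0=0.0, x0=0.0, dt=0.5, n_steps=4)
for t, x in zip(times, positions):
    exact = 0.5 * 9.8 * t**2          # analytical result for fall from rest
    print(f"t = {t:.1f} s   x = {x:.4f} m   analytical = {exact:.4f} m")
```

Because the acceleration is constant, the two columns should agree to within rounding for any choice of the time increment.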

Let us consider in detail how the calculations produce the desired results. First you find $v_{1/2}$, which is the velocity of the body at a time $\tfrac{1}{2}\Delta t$ later than the initial time. To do this, you add to the initial velocity $v_0$ the product of the acceleration a and the time $\tfrac{1}{2}\Delta t$ during which that acceleration has acted. The equation used, Eq. (5-34a), is just a reexpression of the definition of acceleration for the case of constant acceleration.

Now that you have found $v_{1/2}$, you use it to find $x_1$, which is the position at a time $\Delta t$ later than the initial time. The quantity $x_1$ is obtained from Eq. (5-34b) by adding to the initial value $x_0$ the product of the velocity $v_{1/2}$ and the time increment $\Delta t$. The validity of doing so hinges on two facts: (1) Since the acceleration is constant, the velocity changes uniformly during the time interval starting at t = 0 and ending at $t = \Delta t$. Therefore its average value $\langle v \rangle$ over that interval is equal to its instantaneous value halfway through the interval, which is $v_{1/2}$. (See Fig. 2-22a.) (2) Since total change in position is the average velocity multiplied by the elapsed time, it follows that the change in position during the time interval is $\langle v \rangle\,\Delta t = v_{1/2}\,\Delta t$.

To complete the first cycle of calculations, you set $t_1 = \Delta t$ in order to update the time.

In the next cycle of calculations, you first evaluate the velocity $v_{3/2}$ at time $\tfrac{3}{2}\Delta t$, using Eq. (5-34c) to do so. This equation is very much the same as Eq. (5-34a). The only differences are in the subscripts (which specify the particular values of v involved) and in the fact that the time increment used is now $\Delta t$. Then you use Eq. (5-34d) to obtain $x_2$ from $x_1$ by employing the just-obtained value of $v_{3/2}$. The equation is identical to Eq. (5-34b) except for the subscripts. Then you again update time by setting $t_2 = 2\,\Delta t$.

The next cycle of calculations yields first $v_{5/2}$ and then $x_3$ in exactly the same way that the two preceding values $v_{3/2}$ and $x_2$ were obtained, and then $t_3$ is evaluated.

You continue these sequential calculation cycles as long as is required. They will yield values of v and x over any desired range of time following the initial time. The method works for a time increment $\Delta t$ of any size. If you choose $\Delta t$ sufficiently small, you will, for all practical purposes, obtain v and x for any desired value of t. That is, you will determine the functions v(t) and x(t) for the given values of a, $v_0$, and $x_0$. (These functions can be displayed in either tabular or graphical form.)

For most cases of interest, the method involves a large number of repetitive numerical calculations. Consequently, it is not very practical if the calculations must be carried out by hand, even though each calculation is itself quite simple. However, programmable pocket calculators are perfectly suited to performing repetitive numerical calculations, as are electronic computers in general. With such devices the numerical method becomes quite feasible, as you will see in Example 5-8. Even so, the method is not competitive with the analytical method for treating the special case of constant acceleration, because the analytical method is so easy to use in this case.

Nevertheless, the numerical method has an extremely valuable attribute. Namely, it can easily be modified into a method for treating situations in which the acceleration a is not constant. Acceleration can vary because it depends explicitly on any of or all the quantities v, x, and t.

The method used to treat varying acceleration is almost the same as that used to treat constant acceleration. To be specific, this is what you do. Always employing the equation that expresses a in terms of the quantities on which it depends, you select a small time increment $\Delta t$ and then carry out the following set of calculations:

Determine the value of a from the values at $t_0 = 0$ of the quantities on which it depends, and use it to calculate

\[ v_{1/2} = v_0 + a\,\frac{\Delta t}{2} \tag{5-35a} \]

from the given value of $v_0$. Use the result to calculate

\[ x_1 = x_0 + v_{1/2}\,\Delta t \tag{5-35b} \]

from the given value of $x_0$. Then set $t_1 = \Delta t$.

Next determine the value of a from the new values of the quantities on which it depends. Use it, and the value of $v_{1/2}$ just obtained, to calculate

\[ v_{3/2} = v_{1/2} + a\,\Delta t \tag{5-35c} \]

Use the result, and the value of $x_1$ just obtained, to calculate

\[ x_2 = x_1 + v_{3/2}\,\Delta t \tag{5-35d} \]

Then set $t_2 = 2\,\Delta t$.

Next determine the value of a from the new values of the quantities on which it depends. Use it, and the value of $v_{3/2}$ just obtained, to calculate

\[ v_{5/2} = v_{3/2} + a\,\Delta t \tag{5-35e} \]

Use the result, and the value of $x_2$ just obtained, to calculate

\[ x_3 = x_2 + v_{5/2}\,\Delta t \tag{5-35f} \]

Then set $t_3 = 3\,\Delta t$.

Continue these calculations until t reaches whatever value is required.


Fig. 5-18 Free-body diagram for a body of mass m when it has fallen through air a distance x, having started at rest. The forces acting on the body are the gravitational force mg and the drag force due to air resistance $D = -rv^2$. [Diagram labels: x = 0, x, D < 0, m, mg > 0.]


The idea is as follows: If the time increment $\Delta t$ is small, the change in the acceleration a over each time interval will be small. Thus, although a varies, it is possible to approximate the effect of the acceleration on the motion by using in each time interval a fixed value of a. The value is changed from one time interval to the next. For a particular interval, the value of a to be used is calculated from the latest available values of v, x, and t. That value of a is then used to calculate the change in v over the time interval. The new value of v is then used to calculate the new value of x. This completes a particular cycle of calculations. Then a new value of a, appropriate for the next cycle of calculations, is calculated from these updated values of v, x, and t, and that cycle is then carried out in the same manner as the preceding one. Thus, by following sequentially the instructions given by Eqs. (5-35), you obtain a sequence of values of v and x which describes the actual physical situation to good approximation. That is, they approximate the actual velocity and position corresponding to the actual varying acceleration, and to the given initial values $v_0$ and $x_0$. The approximation becomes more accurate the smaller you choose the time increment $\Delta t$. This is because the shorter the time increment, the smaller the range of actual values of a in a particular interval of time and the smaller the error introduced by using a single constant value of a to approximate these values.
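The procedure of Eqs. (5-35) differs from the constant-acceleration loop only in that a is recomputed at the start of every cycle. A minimal Python sketch follows; the names `step_variable_acceleration` and `accel` are assumptions made for this illustration, and the sample acceleration a = -4x (a body on a spring, with a made-up constant) is a hypothetical test case chosen only because its exact solution, x = cos 2t for a start from rest at x = 1, is easy to compare against.

```python
import math


def step_variable_acceleration(accel, v0, x0, dt, n_steps):
    """Follow the pattern of Eqs. (5-35).

    accel(v, x, t) returns the acceleration from the latest available
    values of v, x, and t. Returns a list of (t, x) pairs.
    (Hypothetical helper written for this illustration.)
    """
    t, x = 0.0, x0
    v_half = v0 + accel(v0, x0, t) * dt / 2      # Eq. (5-35a)
    rows = [(t, x)]
    for _ in range(n_steps):
        x = x + v_half * dt                      # Eqs. (5-35b), (5-35d), (5-35f)
        t = t + dt
        rows.append((t, x))
        # New value of a from the updated quantities, then the next velocity.
        v_half = v_half + accel(v_half, x, t) * dt   # Eqs. (5-35c), (5-35e)
    return rows


# Hypothetical test case: a = -4x, starting at rest from x = 1.
rows = step_variable_acceleration(lambda v, x, t: -4.0 * x, v0=0.0, x0=1.0, dt=0.01, n_steps=100)
t_end, x_end = rows[-1]
print(f"t = {t_end:.2f}   numerical x = {x_end:.5f}   exact x = {math.cos(2 * t_end):.5f}")
```

Repeating the run with a larger time increment is a simple way to see the approximation degrade, as the paragraph above describes.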

Let us try this method on the fall of a skydiver of mass m. The free-body diagram of Fig. 5-18 shows the forces acting on her at any moment. Again we will use signed scalars and take the downward direction as positive. Then the constant gravitational force will have the positive value mg. There is also a varying drag force with a negative value. As noted at the beginning of this section, the size and speed of a skydiver are such that the flow of air is turbulent. According to Eq. (4-29), the value of the drag force D is therefore given by

\[ D = -\frac{\rho A \delta}{2}\,v^2 \tag{5-36a} \]

In this equation, $\rho$ is the density of the air, A is the frontal cross-sectional area presented to the air by the skydiver, $\delta$ is the coefficient of drag, which depends on her shape, and v is her velocity. It will be convenient to write Eq. (5-36a) in the compact form

\[ D = -rv^2 \tag{5-36b} \]

Here the constant r is the product of all the constants on the right side of Eq. (5-36a). That is,

\[ r = \frac{\rho A \delta}{2} \tag{5-36c} \]

At a moment when the skydiver’s velocity is v, the net force acting on her is

\[ F = mg + D = mg - rv^2 \]

Her acceleration is given by Newton’s second law,

\[ a = \frac{F}{m} = \frac{mg - rv^2}{m} \tag{5-37} \]

This is more conveniently written in the form

\[ a = g - \gamma v^2 \tag{5-38a} \]

where the constant $\gamma$ (lowercase Greek gamma) is defined to be

\[ \gamma = \frac{r}{m} = \frac{\rho A \delta}{2m} \tag{5-38b} \]


In order to calculate the numerical values of a needed to solve Eqs. (5-35), we must have a numerical value of the constant $\gamma$ on which all the values of a depend. A value of $\gamma$ for a skydiver can be measured in a “wind tunnel,” or estimated from Eq. (5-38b). We will do the latter. The equation shows that $\gamma$ depends on both the frontal area A and the coefficient of drag $\delta$, as well as on the mass m of a particular skydiver. But a skydiver does not usually maintain a constant orientation as she falls. Consequently, both A and $\delta$ will vary. Let us take rough average values and assume that the skydiver does not engage in acrobatics while falling. Suppose that m = 60 kg and A = 0.4 m². Since a skydiver is less streamlined than a sphere but more streamlined than a circular disk facing broadside, we take for her coefficient of drag the average of the values given for those two shapes in Table 4-4. This gives $\delta = 0.8$. Although the density of air varies with altitude, we will use in this rough calculation the sea-level value $\rho = 1.2$ kg/m³. Inserting these numerical values into Eq. (5-38b), we obtain a one-significant-figure estimate for $\gamma$:

\[ \gamma = \frac{1.2\ \text{kg/m}^3 \times 0.4\ \text{m}^2 \times 0.8}{2 \times 60\ \text{kg}} = 0.003\ \text{m}^{-1} \]

Thus Eq. (5-38a) becomes

\[ a = 9.8\ \text{m/s}^2 - (0.003\ \text{m}^{-1})\,v^2 \tag{5-39a} \]

The equivalent expression, suitable for use in numerical calculations, is

\[ a = 9.8 - 0.003\,v^2 \qquad (a\ \text{in m/s}^2,\ v\ \text{in m/s}) \tag{5-39b} \]
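The arithmetic behind the estimate of the air resistance constant can be checked in a few lines. This Python fragment is a sketch only; the variable names are assumptions made here, and the numbers are the rough values quoted in the text. It evaluates Eq. (5-38b), rounds the result to one significant figure, and defines the acceleration of Eq. (5-39b).

```python
# Rough values quoted in the text.
rho = 1.2      # sea-level density of air, kg/m^3
A = 0.4        # frontal cross-sectional area, m^2
delta = 0.8    # coefficient of drag (dimensionless)
m = 60.0       # mass of the skydiver, kg
g = 9.8        # gravitational acceleration, m/s^2

gamma = rho * A * delta / (2 * m)          # Eq. (5-38b), in 1/m
gamma_rounded = float(f"{gamma:.1g}")      # keep one significant figure


def acceleration(v):
    """Eq. (5-39b): a in m/s^2 for a downward velocity v in m/s."""
    return g - gamma_rounded * v**2


print(f"gamma = {gamma:.4f} 1/m, rounded to {gamma_rounded} 1/m")
print(f"a at v = 0:  {acceleration(0.0):.2f} m/s^2")
print(f"a at v = 30: {acceleration(30.0):.2f} m/s^2")
```

The unrounded product is 0.0032 per meter; the text keeps only one significant figure because A and the drag coefficient are themselves rough averages.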

Having an equation which expresses the acceleration of the skydiver in terms of her velocity, you are now ready to use the numerical method of Eqs. (5-35) to find her position x and velocity v at subsequent values of the time t, for given values of her position $x_0$ and velocity $v_0$ at the initial time $t_0 = 0$.

Use of the numerical method is explored in Examples 5-8 and 5-9. The skydiver program, which will direct a programmable calculator or computer through the required sequence of computations, is found in the Numerical Calculation Supplement. In Example 5-8 the program is run with the value of $\gamma$ set to zero. This means that there is no air resistance and the skydiver is falling freely. The purpose is to check the numerical method and its implementation on the calculating device. With $\gamma = 0$, you have a = g = constant. Thus you are dealing with a case in which the approximate equations, Eqs. (5-35), reduce to the exact equations, Eqs. (5-34). Consequently, the results of the numerical method should be in complete agreement with those obtained by the analytical method for the familiar case of free fall from rest:

\[ x = \frac{1}{2}\,g t^2 \tag{5-40} \]


EXAMPLE 5-8

Run the skydiver program with the following set of initial conditions on $x_0$, $v_0$, and $t_0$ and values for $\Delta t$, $\gamma$, and g:

$x_0 = 0$, $v_0 = 0$, $t_0 = 0$, $\Delta t = 0.5$ (in s), $\gamma = 0$, g = 9.8 (in m/s²)

In this run the skydiver falls from rest at t = 0 and falls without air resistance.

■ You enter the listed values in the programmed calculating device and then start the run. The device interprets each value you have entered as being precise. For instance, it interprets the numerical value of g to be 9.80000..., to as many decimal places as it carries.

The results produced for the skydiver’s position x are plotted versus the elapsed time t as dots in Fig. 5-19. The crosses are the values of x obtained from the analytical method by evaluating Eq. (5-40) for a few values of t. The numerical and analytical results agree beautifully.
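The check described in Example 5-8 can be imitated with a short stand-in for the skydiver program. The Python sketch below is not the program from the Numerical Calculation Supplement; the function name `skydiver` and its arguments are assumptions made for this illustration. It applies Eqs. (5-35) with the acceleration of Eq. (5-38a) and compares the positions with Eq. (5-40) when the air resistance constant is zero.

```python
def skydiver(x0, v0, dt, gamma, g, n_steps):
    """Stand-in for the skydiver program: Eqs. (5-35) with a = g - gamma*v^2.

    Returns a list of (t, x) pairs at t = 0, dt, 2*dt, ...
    (Hypothetical helper written for this illustration.)
    """
    t, x = 0.0, x0
    v_half = v0 + (g - gamma * v0**2) * dt / 2    # first half step, Eq. (5-35a)
    rows = [(t, x)]
    for _ in range(n_steps):
        x += v_half * dt                           # position update
        t += dt                                    # time update
        rows.append((t, x))
        v_half += (g - gamma * v_half**2) * dt     # velocity update with new a
    return rows


# Values of Example 5-8: no air resistance, 32 steps of 0.5 s (16 s of fall).
rows = skydiver(x0=0.0, v0=0.0, dt=0.5, gamma=0.0, g=9.8, n_steps=32)
max_diff = max(abs(x - 0.5 * 9.8 * t**2) for t, x in rows)
print(f"largest difference from Eq. (5-40): {max_diff:.2e} m")
```

With the air resistance constant set to zero, any difference that remains comes only from the rounding of the machine arithmetic.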


Fig. 5-19 Numerical and analytical results for the position x of a body falling freely from rest versus the time t. Here $\gamma = 0$ and g = 9.8 m/s². [Plot axes: position x versus time t (in s).]



In Example 5-9, the constant $\gamma$ specifying the amount of air resistance is set equal to 0.003 m⁻¹, the value estimated for a skydiver.


EXAMPLE 5-9

Run the skydiver program with the following set of initial conditions on $x_0$, $v_0$, and $t_0$ and values for $\Delta t$, $\gamma$, and g:

$x_0 = 0$, $v_0 = 0$, $t_0 = 0$, $\Delta t = 0.5$ (in s), $\gamma = 0.003$ (in m⁻¹), g = 9.8 (in m/s²)

Here the skydiver again starts from rest. But now the effect of air resistance on her fall is taken into account.




Table 5-2 Numbers Appearing in Skydiver Program Calculations with $x_0 = 0$, $v_0 = 0$, $t_0 = 0$, $\Delta t = 0.5$ (in s), $\gamma = 0.003$ (in m⁻¹), g = 9.8 (in m/s²)

$v_{1/2} = 2.45 = 0 + [9.8 - 0.003(0)^2](0.5/2)$
$x_1 = 1.23 = 0 + (2.45)(0.5)$
$v_{3/2} = 7.34 = 2.45 + [9.8 - 0.003(2.45)^2](0.5)$
$x_2 = 4.90 = 1.23 + (7.34)(0.5)$
$v_{5/2} = 12.16 = 7.34 + [9.8 - 0.003(7.34)^2](0.5)$
$x_3 = 10.98 = 4.90 + (12.16)(0.5)$

Note: The calculating device treats the input numbers as exact, to as many decimal places as it carries, and also produces values of x (in m) and v (in m/s) to that many decimal places. The values of x and v listed here have been rounded off to two decimal places.


■ Entering the listed values, and running the program, you will obtain the results plotted versus time in Fig. 5-20. In this figure the values of the position are shown as dots.

Table 5-2 shows the numbers which the calculator or computer used in the first three cycles of the calculation. The values of all the terms in the first six of Eqs. (5-35) are given in the table. In ordinary operation, the device displays only the consecutive values of v and x. Table 5-2 should make clear that there is nothing magical in what the calculator or computer does. In principle, you could do the same thing yourself with pencil and paper, if you had enough patience! In practice, you would soon tire. And you would surely soon make a mistake. In such a sequential procedure, a mistake made anywhere invalidates everything that follows. Nevertheless, before the advent of programmable calculators and computers, professional scientists and engineers often devoted large amounts of time to doing such tedious calculations by hand, because for some important physical problems they provide the only way to obtain solutions. Now that programmable calculators and computers are commonplace, numerical methods are widely used in everyday professional work. They are used for solving many types of problems, including those involving variable acceleration.
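The first three cycles listed in Table 5-2 can be reproduced with a few lines of Python. This is a sketch for checking the table, not the program from the Numerical Calculation Supplement; the variable names are assumptions made here. Each pass through the loop performs one position update and one velocity update of Eqs. (5-35), with the acceleration taken from Eq. (5-39b).

```python
g, gamma, dt = 9.8, 0.003, 0.5      # values of Example 5-9
x = 0.0                             # initial position x0
v_half = 0.0 + (g - gamma * 0.0**2) * dt / 2   # v_{1/2}, Eq. (5-35a)

cycles = []                         # (half-step velocity, new position) per cycle
for n in range(1, 4):
    x = x + v_half * dt             # x_n from x_{n-1} and the half-step velocity
    cycles.append((v_half, x))
    v_half = v_half + (g - gamma * v_half**2) * dt   # next half-step velocity

for n, (v, pos) in enumerate(cycles, start=1):
    print(f"cycle {n}: v = {v:.4f} m/s, x = {pos:.4f} m")
```

The printed values carry more decimal places than the table, which rounds each entry to two places before it is reused on the next line.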


The crosses in Fig. 5-20 were obtained by evaluating

\[ x = \frac{1}{\gamma}\,\ln\!\left[\frac{e^{\sqrt{g\gamma}\,t} + e^{-\sqrt{g\gamma}\,t}}{2}\right] \tag{5-41} \]

This exact formula is found by means of a sophisticated and quite complicated mathematical analysis of the problem of the skydiver. It is quoted (without proof since the proof involves mathematical techniques above the level assumed in this book) because it is useful in testing the accuracy of the numerical method. [In case you are unfamiliar with the symbols ln and e in Eq. (5-41), e = 2.718... is a number whose properties are developed in Chap. 7, where it is put to actual use; the symbol ln means the logarithm to the base e.]

The numerical results are slightly larger than the corresponding analytical ones. For the conditions of this example, the error in the numerical result is 1.7 percent at t = 5 s and 1.3 percent at t = 15 s. But the error can be reduced by using a smaller value of $\Delta t$. For instance, with $\Delta t = 0.25$ s, the error drops to 0.87 percent at t = 5 s.
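The dependence of the error on the time increment can be explored with the sketch below, which compares the numerical position with Eq. (5-41). It is an illustration under stated assumptions: the function names are made up for this example, and the stepping loop is a stand-in for the skydiver program, so its percentages need not match the quoted figures to the last digit.

```python
import math


def numerical_position(t_end, dt, gamma=0.003, g=9.8):
    """Position at t_end from Eqs. (5-35), starting from rest at x = 0."""
    n_steps = round(t_end / dt)
    x = 0.0
    v_half = g * dt / 2                          # Eq. (5-35a) with v0 = 0
    for _ in range(n_steps):
        x += v_half * dt                         # position update
        v_half += (g - gamma * v_half**2) * dt   # velocity update with new a
    return x


def exact_position(t, gamma=0.003, g=9.8):
    """Eq. (5-41), written with cosh u = (e^u + e^-u)/2."""
    return math.log(math.cosh(math.sqrt(g * gamma) * t)) / gamma


def percent_error(t_end, dt):
    exact = exact_position(t_end)
    return 100.0 * (numerical_position(t_end, dt) - exact) / exact


for dt in (0.5, 0.25, 0.125):
    print(f"dt = {dt:5.3f} s: error at t = 5 s is {percent_error(5.0, dt):.2f} percent")
```

A positive percentage means the numerical position is larger than the analytical one; reducing the time increment should shrink it.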

The  squares  in  Fig.  5-20  represent  the  skydiver’s  velocity  v. 


Fig. 5-20 Numerical solutions for the position x (dots) and velocity v (squares) of a body falling from rest with air resistance proportional to the square of its speed. The crosses display exact analytical solutions for the position of the same body, obtained by using Eq. (5-41). The value of the air resistance constant used is $\gamma = 0.003$ m⁻¹, a typical value for a skydiver, and g = 9.8 m/s². [Plot: horizontal axis t (in s).]


Note how the skydiver’s velocity builds up rapidly for the first few seconds of the jump, but soon approaches a constant value a little less than 60 m/s. This velocity is called the terminal velocity. (Its magnitude is the terminal speed, used in Sec. 4-6.) The skydiver can adjust the terminal velocity to some extent by extending or retracting her arms and legs, and by turning with respect to the direction of fall, and to a very considerable extent (fortunately!) by opening the parachute. All these actions change $\gamma$ by changing both the frontal cross-sectional area A and the coefficient of drag $\delta$. Skilled skydivers deliberately use such changes to carry out spectacular maneuvers.




Again referring to Fig. 5-20, note that the x versus t curve looks approximately parabolic (as in Fig. 5-19) while the velocity is relatively small, but becomes linear with the approach to terminal velocity. The physical reasons for this behavior are as follows. In the first second or two of falling from rest, the velocity is so small that the resulting air resistance is small compared to mg. Thus the only significant force in action is the gravitational force. The acceleration at the beginning of the jump is almost equal to g, and the distance traveled increases in approximate proportion to the square of the elapsed time, as is exactly the case in free fall. But with continued acceleration, the velocity increases rapidly. The drag force of air resistance builds up even more rapidly, because it is proportional to the square of the velocity. Since this force opposes the gravitational force, the net force acting on the skydiver becomes smaller and smaller. The effect is to reduce the acceleration, which is the rate of change of velocity. The skydiver’s velocity stops changing when it is essentially equal to just that value for which the frictional force is equal in magnitude to the gravitational force. According to Eq. (5-37), this happens when


\[ mg = rv^2 \]

Solving for the terminal velocity v, we have

\[ v = \sqrt{\frac{mg}{r}} = \sqrt{\frac{g}{\gamma}} \tag{5-42} \]

For the numerical values used in Example 5-9, this yields

\[ v = \sqrt{\frac{9.8\ \text{m/s}^2}{0.003\ \text{m}^{-1}}} = 57\ \text{m/s} \]

While v is still increasing slowly at the last time recorded in Fig. 5-20, its value at that time, v = 56.8 m/s, is in good agreement with the value just calculated. This value of terminal velocity lies within the range of values actually observed for skydivers of various masses falling in various attitudes. The assumptions made in estimating $\gamma$ for a skydiver are thus justified.
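The approach to terminal velocity can be examined numerically. The Python sketch below is an illustration only, with variable names assumed here: it evaluates the terminal velocity as the square root of g divided by the air resistance constant, then advances the half-step velocities of Eqs. (5-35) through 32 cycles of 0.5 s (about 16 s of fall) and compares the two.

```python
import math

g, gamma, dt = 9.8, 0.003, 0.5       # values of Example 5-9

v_terminal = math.sqrt(g / gamma)    # velocity at which g - gamma*v^2 = 0

# Advance the half-step velocity of Eqs. (5-35), starting from rest.
v_half = g * dt / 2                  # v_{1/2}
for _ in range(32):
    v_half += (g - gamma * v_half**2) * dt

print(f"terminal velocity:      {v_terminal:.2f} m/s")
print(f"velocity after ~16 s:   {v_half:.2f} m/s")
print(f"remaining acceleration: {g - gamma * v_half**2:.3f} m/s^2")
```

The stepped velocity should lie just below the terminal value, with a small positive acceleration remaining, which is the behavior the text describes for the last time recorded in Fig. 5-20.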

Terminal velocity is attained quite rapidly when a body such as a stone falls through a relatively resistive medium such as water. Its magnitude depends on the mass of the body, among other things. It is on the basis of this observation that physicists before Galileo (and many people to this day) came to believe that an object in free fall moves with a constant velocity proportional to its weight.

The numerical method which we have just developed has extremely broad applicability. It almost always allows you to determine the motion of any body whose acceleration varies. It does not matter whether its acceleration varies because the forces acting on the body depend on its velocity (as for a skydiver), or on its position (as for a pendulum bob), or on the time (as for a child on a swing being pushed by another child), or on all these things. And by means of a generalization which amounts to nothing more than a change of symbols, the same method can be applied to electrical systems, or systems of waves, or even quantum mechanical systems. We will use this method for these purposes throughout the remainder of this book.




EXERCISES 

Group  A 

5-1. Force, mass, and acceleration. What is the magnitude of the acceleration of a body if the magnitude of the net force acting on the body is equal to its weight?

5-2. Determining inertial mass. A spring is compressed by squeezing it between a standard kilogram and a body of unknown mass. The system is placed on an air table and released. At a certain instant the acceleration of the standard mass is 2 m/s², whereas the acceleration of the body of unknown mass is -1 m/s². What is the magnitude of the unknown mass? Assume that the mass of the spring is negligible.

5-3. Going up? An elevator car whose mass is 1 metric ton = $10^3$ kg is pulled upward with a force of 11,000 N. What is its acceleration?

5-4. Going down? The elevator car described in Exercise 5-3 descends with an acceleration of 2.00 m/s². What is the tension in its supporting cable?

5-5. Pulleys, bodies, and string. Two systems are shown in Fig. 5E-5. Assume the systems to be frictionless and the pulleys to be massless.

a. What is the acceleration of the 2.00-kg body in Fig. 5E-5a?

b. What is the acceleration of the 2.00-kg body in Fig. 5E-5b?

c. What is the tension in the string in part a?

d. What is the tension in the string in part b?

Fig. 5E-5 (a) (b)

5-6. Down the incline. An object of mass m is placed on a smooth incline. Starting from rest, the object moves 2.00 m in 2.00 s. What is the angle of inclination of the plane?

5-7. Sliding to a halt. A block of mass 1.00 kg slides over the rough surface of a table. From the point where its speed is 1.00 m/s, it slides 0.20 m before coming to rest.

a. What is the frictional force (assumed constant)?

b. What is the coefficient of kinetic friction between the block and the table?

5-8. Looping the loop. A stunt flier in an open-cockpit biplane loops the loop in a vertical circle of radius 400 m. What must her minimum speed be at the top of the loop if a loose pencil in the cockpit does not fall out?

5-9. Merry-go-round. A child on a merry-go-round holds a ball on the floor. When the merry-go-round is moving counterclockwise at constant speed, he releases the ball. Neglect any friction between the ball and the floor.

a. Describe qualitatively the motion of the ball as seen by an observer standing on the ground near the merry-go-round.

b. Describe qualitatively the motion of the ball as seen by the child.

5-10. Rounding a corner. An automobile rounds a corner in a circle of radius R at constant speed v. The roadway is horizontal.

a. What must be the minimum value of the coefficient of friction, if the car is not to skid?

b. Which coefficient of friction is relevant here, the static coefficient or the kinetic coefficient? Explain your choice.

5-11. On the railroad, I. A train is moving along a straight and level section of track. The conductor stands in the aisle of the train, dangling his ticket punch from a chain in his hand. The chain is inclined backward, making an angle of 5° with the vertical. What are the magnitude and direction of the train’s acceleration?

5-12. On the railroad, II.

a. Suppose that you are in a windowless train and are using a watch and watch chain as a makeshift plumb bob. You observe that the chain is inclined toward the rear of the train. This inclination could be due to an acceleration of the train, an upward slope of the railroad tracks, or a combination of the two. Can you describe a way to distinguish among those possibilities without using any evidence other than the inclination of the watch chain?

b. Suppose that you are in the situation described in part a, but that you also have access to an accurate pan balance and a set of standard masses. Can you distinguish among the various possible reasons for the bob’s inclination?

c. Suppose that you are in the situation described in part a, but that you have an accurate spring scale and a set of standard masses in addition to the plumb bob. What could you determine about the cause of the bob’s inclination?

5-13. Oh, how I love to go up in a swing! Estimate the magnitude of the fictitious force experienced by a child on a playground swing as she passes through the bottom of the swing’s arc. This fictitious force is perceived by the child as an increase in her weight. Compare this increase to her actual weight.

5-14. Ski jumping.

a. As a ski jumper passes the low point on the approach ramp, his speed is $v_0$. The radius of curvature of the ramp is R. What is the upward acceleration of the skier?

b. What upward force must the ramp exert on the skier, whose mass is m? What multiple of his normal weight is this?

c. Evaluate your results for $v_0 = 22$ m/s (approximately 50 mi/h), R = 15 m, and m = 70 kg.

5-15. A rocket in action. A rocket has a takeoff mass of 13,000 kg. It burns fuel at the rate of 125 kg/s, with an exhaust velocity of 1800 m/s.

a. What is the thrust of the rocket?

b. What is its initial acceleration in a vertical takeoff?


Group B

5-16. Tension, tension, tension. A locomotive pulls three freight cars with an acceleration of 2.0 m/s². Each car has mass $1.0 \times 10^4$ kg.

a. What is the tension in the drawbar between the locomotive and the first car?

b. Between the first car and the second?

c. Between the second car and the third?

5-17. Hoisting a chain. A chain of mass 10 kg hangs vertically. It is pulled upward with an acceleration of 3.0 m/s².

a. What is the tension at the top of the chain?

b. What is the tension in the middle of the chain?

5-18. Atwood’s heirs, I. An Atwood machine is suspended from a spring scale. It is motionless, with two 1-kg standard masses on each side.

a. Find the reading of the spring scale.

b. One of the 1-kg masses is transferred from the right side to the left side of the system. The Atwood machine proceeds to move. What is the reading of the spring scale now?

5-19. Atwood’s heirs, II. Starting from rest, the system illustrated in Fig. 5E-5a is accelerated clockwise by a force F applied downward to the body on the right. The mass of the pulley is negligible. Find the acceleration of each of the two bodies if

a. F = 29.4 N

b. F = 49.0 N

5-20. Downhill run. A man stands on a board of mass m. The board rests on frictionless, massless rollers, which in turn lie on an inclined plane making an angle $\theta$ with the horizontal. If the man’s mass is M, with what acceleration must he move if he wishes the board to remain stationary?

5-21. Bead on a wire. This problem was devised by Galileo Galilei. From a point A, which is the highest point on a vertical circle, three wires are stretched to B, B', and B''. AB is a diameter, while AB' and AB'' are arbitrary chords of the circle. On each wire is a bead which can slide with negligible friction. The beads are held at A and then released simultaneously. Prove that the three beads arrive at the circumference of the circle simultaneously in a time $t = \sqrt{2d/g}$, where d is the diameter of the circle.

5-22. Block on a table top. As shown in Fig. 5E-22, a 2.0-kg body is attached by a massless string to a 400-g body. The pulley is frictionless and massless. The system is initially motionless.

Fig. 5E-22


a.  If  the  table  surface  is  frictionless,  find  the  accelera- 
tion of  the  2.0-kg  body. 

b. Suppose the table is reasonably smooth but unlubricated, so that the
coefficient of static friction is μs = 0.25 and the coefficient of
kinetic friction is μk = 0.17. Will the system begin to move? If so, find
the acceleration of the 2.0-kg body after it has started moving.

c. Suppose that the table top is lubricated, so that μs = 0.15 and
μk = 0.10. Will the system begin to move? If so, find the acceleration of
the system once it has started moving.


5-23. Up and back. A block is given an initial speed vi upward along a
very long incline. The incline makes an angle θ with the horizontal. The
coefficient of static friction between the block and the surface of the
incline is μs, and the coefficient of kinetic friction is μk.

a. What is the condition that determines whether the block will simply
come to rest on the plane or begin to slide back down again?

b. Assuming that the block does slide back down again, what is its speed
vf when it returns to the point from which it started? Find an expression
for the ratio vf/vi in terms of tan θ and μk.

c. Find vf if μk = 0.20, θ = 35°, and vi = 25 m/s.

Fig. 5E-29

5-24.  Down  and  out.  Starting  from  rest  at  a height  h 
above  the  base  of  a frictionless  incline  which  curves  gently 
toward  the  bottom  until  it  is  horizontal,  a block  slides  down 
the incline and onto a level surface where the coefficient of kinetic
friction is μk. Prove that μk = h/s, where s is the distance traveled on
the level surface by the block before it comes to rest.

5-25. How will it go? Figure 5-7a depicts two blocks
connected  by  a string  passing  over  a pulley.  One  block 
hangs  from  the  string  while  the  other  lies  on  an  inclined 
plane.  It  is  pointed  out  in  the  discussion  following  Ex- 
ample 5-6  that  the  behavior  of  the  system  depends  on  the 
masses  of  the  blocks,  the  angle  of  the  inclined  plane,  and 
the  coefficient  of  friction  between  the  sliding  block  and 
the  plane.  Table  5-1  summarizes  the  conditions  which 
lead  to  all  possible  cases  of  behavior  of  the  system.  Verify 
these  conditions. 

5-26. Blocked shot. A bullet with mass m = 10 g is fired with speed
v = 500 m/s into a wooden block of mass M = 1000 g that is at rest on the
top of a very long table. The bullet buries itself in the block, which
slides a distance s = 5.0 m before coming to rest. What is the
coefficient of kinetic friction between the block and the table?

5-27.  Spinning  like  a top.  What  would  be  the  length  of 
the  day  if  the  earth  were  rotating  fast  enough  for  objects 
at  the  equator  to  appear  weightless?  Take  the  radius  of 
the  earth  to  be  6400  km  and  the  acceleration  due  to  grav- 
ity to  be  9.8  m/s2. 

5-28. Trick cycling. A trick motorcyclist rides in a horizontal circle
around the vertical walls of a cylindrical pit of radius R.

a. What is the minimum speed with which she must ride if the coefficient
of static friction between the tires and the wall is μs?

b. Evaluate this speed if R = 5.0 m and μs = 0.90.

5-29.  Round  and  round.  A string  is  threaded  through  a 
smooth  glass  tube.  Bodies  of  mass  M and  m are  tied  to  its 
ends,  with  M > m.  As  shown  in  Fig.  5E-29,  the  body  m 
is  made  to  revolve  around  the  tube  in  a horizontal  circle, 
so  that  body  M neither  rises  nor  descends.  The  period 
of  the  circular  motion  is  T. 


a.  What  is  the  angle  between  the  string  and  the  tube? 

b. Express the free string length L in terms of T, m, M, and g.

c. Express T in terms of g and h, where h is the vertical distance from
the top of the tube to the body m.

5-30. Sudden stop. A motorist’s brakes fail and his automobile strikes a
brick wall at speed v0. He is wearing a seat belt which keeps his body
fixed with respect to the car. The front end of the car is compressed by
an amount d as the car comes to rest.

a.  Assuming  a constant  deceleration  during  the  im- 
pact, find  an  expression  for  the  acceleration  a of  the 
driver.  Express  a in  terms  of  v0  and  d. 

b. In his (highly noninertial) personal frame of reference, the driver
perceives that he is acted on by two balanced forces: the “real”
restraining force supplied by the seat belt and the “fictitious” force,
equal in magnitude and oppositely directed. Express the magnitude of
these forces as a multiple of the driver’s weight W. This ratio is the
“number of g’s” of horizontal acceleration to which the driver is
subjected.

c. Evaluate the result of part b for v0 = 9.0 m/s (≈ 20 mi/h) and
d = 0.30 m.

d.  During  the  deceleration,  the  driver  will  perceive 
that  his  head  has  an  “added  weight”  that  is  directed  for- 
ward. Estimate  this  apparent  extra  head  weight. 

5-31.  The  bigger  they  come,  the  harder  they  fall.  Two 
spheres  are  made  of  the  same  material,  but  one  has  twice 
the  diameter  of  the  other.  They  are  dropped  from  an  air- 
plane, and  each  sphere  ultimately  reaches  its  terminal 
velocity.  What  is  the  ratio  of  the  two  terminal  velocities, 
assuming  that  the  resistance  of  the  air  is  proportional  to 
the  square  of  the  velocity,  as  in  Eq.  (5-36o)? 

5-32. Togetherness. Two blocks of mass m1 and m2 tied together with a
string are sliding down the incline shown in Fig. 5E-32. The coefficient
of kinetic friction between m2 and the plane is μ2 = 0.10. The
coefficient of kinetic friction between m1 and the plane is μ1 = 0.20.

a.  Calculate  the  acceleration  of  the  blocks. 

b.  Calculate  the  tension  in  the  string  connecting  the 
blocks. 




Fig.  5E-32 


5-33. Table for two. As shown in Fig. 5E-33, a body of mass m1 is
supported by the level, frictionless surface of a table. The pulleys have
negligible mass, and the system is initially motionless.


Fig.  5E-33 


a. As the system begins to move, find the relationship that must exist
between the distances d1 and d2 traveled by m1 and m2.

b. If m1 = 500 g and m2 = 100 g, find the magnitudes a1 and a2 of their
accelerations.

c. For the masses given in part b, find the tension in the string.

Group  C 

5-34. Slipping and sliding. A block of mass m1 is placed on an incline
whose mass is m2, as shown in Fig. 5E-34. The incline rests on a level
surface, and the system is initially motionless. All surfaces are
frictionless, so both block and incline are free to move.

Fig. 5E-34

a. Show that the x component of the acceleration of the block is

$$a_{1x} = \frac{m_2 g \tan\theta}{m_2 \sec^2\theta + m_1 \tan^2\theta}$$

b. Show that the x component of the acceleration of the incline (its
only component) is

$$a_{2x} = -\frac{m_1 g \tan\theta}{m_2 \sec^2\theta + m_1 \tan^2\theta}$$

c. Show that the y component of the acceleration of the block is

$$a_{1y} = -\frac{(m_1 + m_2)\, g \tan^2\theta}{m_2 \sec^2\theta + m_1 \tan^2\theta}$$

d. Find the magnitude NB of the normal force between the block and the
incline.

e. Find the magnitude NT of the normal force between the incline and the
supporting surface.

5-35. Monkey business. A rope is hanging over a frictionless pulley. One
end of the rope is attached to a bunch of bananas, and the other end is
held by a monkey at a lower level. The weight of the bananas is equal to
the weight of the monkey. Describe what happens when the monkey climbs
the rope to reach the bananas.

5-36. Atwood’s heirs, III. As shown in Fig. 5E-36, body 1 of mass m1 is
suspended from a pulley, while body 2 of mass m2 is attached to the end
of the string. The pulleys have negligible mass, and the system is
initially motionless.

Fig.  5E-36 


a. Find the relationship that must exist between the distances d1 and d2
traveled by body 1 and body 2 as the system begins to move.

b. If m1 = 0.30 kg and m2 = 0.50 kg, find the accelerations a1 and a2 of
the bodies.

c.  For  the  masses  given  in  part  b,  find  the  tension  in 
the  string. 

5-37. Atwood’s heirs, IV. The system shown in Fig. 5E-37 is initially
motionless, and the pulleys are of negligible mass. The masses of bodies
1 and 2 are 1.0 and 2.0 kg, respectively. The mass m3 of body 3 is
unspecified. When the system is released, it is found that body 1 remains
stationary.

a. What is the tension in the string supporting body 1?

b. What are the magnitude and direction of the acceleration of body 2?
Of pulley A? Of body 3?

c. What is the tension in the string supporting pulley A?

d. What is the mass m3 of body 3?

Fig. 5E-37

Fig. 5E-40

5-38. Turntable. A body of mass m rests on a turntable. The coefficient
of static friction between the body and the table is μs. Another body of
mass M is attached to a string that passes through a hole in the center
of the turntable and over a massless, frictionless pulley and is then
attached to m.

a. What are the shortest and longest periods of revolution of the
turntable for which m can remain fixed on the turntable at a distance r
from its center?

b. If M = 2m and μs = 0.50, find the ratio of the longest period to the
shortest.

5-39. Which way will they go? The coefficient of static
friction  between  the  blocks  and  the  planes  shown  in  Fig. 
5E-39  is  0.30.  The  coefficient  of  kinetic  friction  is  0.25. 
The  pulley  is  frictionless  and  massless.  The  triangular 
base  is  fixed,  and  the  system  is  initially  motionless. 

Fig.  5E-39 


a.  For  the  masses  and  angles  shown,  will  the  system 
begin  to  move? 

b.  If  the  system  does  move,  in  which  direction  will  it 
move  and  with  what  acceleration? 


5-40. Bob was framed. As shown in Fig. 5E-40, a frame is sliding down a
frictionless inclined plane. A plumb bob suspended from the frame has
settled to a steady direction.

a. Which way does the plumb bob incline with respect to the vertical?

b. Let α be the angle between the steady direction of the plumb bob and
the vertical. Prove that α = θ.

5-41. Caught by the camera. A long inclined plane is fixed on a level
tabletop. A partially filled aquarium tank is sliding down the incline.
The water in the tank has stopped sloshing and settled down to a plane
surface. A snapshot is taken of the tabletop, incline, and the sliding
tank. Describe how measurements on the single snapshot can be used to
determine the coefficient of kinetic friction between the tank and the
incline.

5-42. Effective weight on a rotating earth. As discussed in Example 5-7,
a person standing at the earth’s equator has a slightly smaller effective
weight than he or she would have if the earth did not rotate. This is due
to the fictitious centrifugal force; at the equator, this force acts in a
direction opposite to that of the true gravitational force.

a. Show that the fictitious centrifugal force Fc observed by a person of
mass M standing at latitude λ has magnitude (4π²MR/T²) cos λ, and that it
points outward in a direction perpendicular to the earth’s axis of
rotation. Here R is the earth’s radius, and T is the earth’s period of
rotation.

b. Construct and label a diagram that shows the true gravitational force
W = Mg and the centrifugal force Fc that act on a person at latitude λ.

c. Find an expression for effective weight W' of a person standing at
latitude λ. Remember that W' = W + Fc.

Numerical

5-43. Skydiver, I. Run the skydiver program with the same values of x0,
v0, t0, γ, and g as in Example 5-9, but with Δt = 0.125 s, and obtain
values for x at t = 5 s and t = 15 s. Determine the error in these values
by using Eq. (5-41). Compare these errors with the ones in the values
obtained in Example 5-9, and with the ones quoted below Eq. (5-41). What
relation do you find between the error in the numerical method for
treating skydiver motion and the size of Δt?


5-44. Skydiver, II. Run the skydiver program with the same values of x0,
v0, t0, Δt, and g as in Example 5-9, but with γ = 0.001 m⁻¹. Plot x and
v. Compare with the plots obtained in Example 5-9, and comment on the
similarities and differences. Also use Eq. (5-42) to predict the terminal
velocity, and use it to check the value you obtained from running the
program.


5-45. Skydiver, III. A skydiver falls approximately 500 m with arms and
legs bent so that γ = 0.0025. He then extends his arms and legs,
increasing the frictional coefficient to γ = 0.0035, and continues
falling. When the total distance fallen is 750 m, he opens his parachute.
Determine the time dependence of his velocity v and distance fallen x
through the period ending when the parachute opens. Also determine the
elapsed time when the parachute is opened. Do this by running the program
with the initial value of γ until x has a value near 500 m, stopping the
calculating device, changing the stored value of γ, and then continuing.
Briefly discuss your results.


5-46. Skydiver, IV. A skydiver falls for 25 s with her body adjusted so
as to make γ = 0.0015 m⁻¹. She then adjusts it to make γ = 0.003 m⁻¹.
Run the skydiver program, using Δt = 0.5 s, to obtain a plot of x and v
for the first 40 s of her fall.


5-47. Skydiver, V. Change the skydiver program to make the magnitude of
the frictional drag force proportional to the first power of the velocity
by deleting the step in the program in which the velocity is squared.
Then find a value of γ that will lead to the same terminal velocity found
in Example 5-9, from an equation obtained by an analytical argument
analogous to the one leading to Eq. (5-42). Run your modified program
with this γ, plot x and v, and compare with the plot in Example 5-9.
Discuss the
similarities  and  differences. 

5-48. Skydiver, VI. Change the skydiver program as
in Exercise 5-47, but this time modify the program so that
the  frictional  drag  force  is  proportional  to  the  cube  of  the 
velocity.  This  is  of  practical  importance  in  certain  cases, 
but  there  is  no  way  to  handle  the  calculation  other  than 
numerically  since  the  equation  determining  the  motion  of 
the  falling  body  has  no  analytical  solution. 
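
The program changes asked for in the skydiver exercises can be sketched in one short routine. The Python sketch below is an assumption about what such a program does, not a copy of the program of Example 5-9: it takes the downward acceleration to be g − γvⁿ (n = 2 for quadratic drag, n = 1 or 3 for the modified versions) and advances the velocity with a simple Euler step. The names `skydiver`, `terminal_velocity`, and `power` are illustrative only.

```python
import math


def skydiver(gamma, power=2, g=9.8, dt=0.125, t_end=60.0, x0=0.0, v0=0.0):
    """Integrate dv/dt = g - gamma * v**power with downward taken as positive.

    Returns a list of (t, x, v) tuples, one per time step.
    Assumes v stays non-negative, as it does for a body dropped from rest.
    """
    n_steps = round(t_end / dt)
    x, v = x0, v0
    history = [(0.0, x, v)]
    for i in range(n_steps):
        a = g - gamma * v**power          # net acceleration at start of step
        v_new = v + a * dt                # Euler update of the velocity
        x += 0.5 * (v + v_new) * dt       # distance from the average velocity
        v = v_new
        history.append(((i + 1) * dt, x, v))
    return history


def terminal_velocity(gamma, power=2, g=9.8):
    """Velocity at which the drag term balances g, so that a = 0."""
    return (g / gamma) ** (1.0 / power)


if __name__ == "__main__":
    run = skydiver(gamma=0.001, t_end=60.0)
    t, x, v = run[-1]
    print(f"t = {t:.1f} s, x = {x:.1f} m, v = {v:.2f} m/s")
    print(f"predicted terminal velocity = {terminal_velocity(0.001):.2f} m/s")
```

Setting the acceleration to zero gives the terminal velocity (g/γ)^(1/n). For n = 2 and γ = 0.001 m⁻¹ this is √(9.8/0.001), about 99 m/s, and the velocity from a long run should approach that value.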

5-49.  Model  rocket,  I.  Modify  the  skydiver  program  so 
that  it  can  predict  the  vertical  upward  flight  of  a model 
rocket  launched  from  the  ground.  The  engine  of  a model 
rocket  produces  an  upward  thrust  force  on  the  rocket 
which,  to  a good  approximation,  is  constant  during  the 
burn  period  and  zero  afterward.  Furthermore,  the  de- 
crease in  the  mass  of  the  model  during  engine  burn  is 
small  compared  to  its  total  mass,  so  it  is  a good  approxi- 
mation to  take  the  total  mass  of  the  model  during  engine 
burn  to  be  a constant  equal  to  the  average  value  of  the 
mass  during  this  time.  After  engine  burn,  the  model  con- 
tinues to  coast  vertically  upward,  with  constant  mass,  until 
it  comes  to  the  top  of  its  trajectory.  At  all  times  gravity 
exerts  a downward  force  on  the  model,  of  magnitude 
equal  to  its  mass  multiplied  by  g (use  average  mass  during 
engine  burn),  and  air  resistance  exerts  a downward  force 
of  magnitude  equal  to  the  drag  constant  r multiplied  by 
the  square  of  its  velocity. 


5-50.  Model  rocket,  II.  Run  the  model  rocket  program 
obtained  in  Exercise  5-49  with  the  following  set  of  values: 
Engine  thrust  = 4.17  N.  Engine  burn  time  = 1.2  s. 
Average  mass  of  model  during  engine  burn  = 0.0792  kg. 
Final  mass  of  model  = 0.0751  kg.  Drag  constant  = 
0.000646  kg/m.  Time  increment  = 0.05  s.  Watch  the  dis- 
played values  of  the  rocket’s  velocity,  without  plotting, 
until  the  first  negative  value  obtained  signals  that  the 
rocket  has  passed  the  top  of  its  trajectory.  Then  obtain  its 
altitude  above  ground  level  and  the  time  required  for  it  to 
reach  that  altitude. 
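
One possible form of the model rocket program is sketched below in Python. It is an illustration under stated assumptions rather than the book's program: the thrust is constant during the burn and zero afterward, the mass switches from its average burn value to its final value at burnout, drag is the drag constant times v² and acts downward during the ascent, and the velocity is advanced by a simple Euler step. The function name `model_rocket` and its parameter names are hypothetical.

```python
def model_rocket(thrust, burn_time, m_burn, m_final, drag, dt, g=9.8):
    """Follow a vertical ascent until the velocity first turns negative.

    Returns (altitude, time) at the last step with non-negative velocity,
    which approximates the top of the trajectory to within one time step.
    """
    n_burn = round(burn_time / dt)        # whole number of steps with thrust
    x, v = 0.0, 0.0
    step = 0
    while True:
        burning = step < n_burn
        m = m_burn if burning else m_final
        # Upward thrust, downward weight, downward drag while v >= 0.
        force = (thrust if burning else 0.0) - m * g - drag * v * abs(v)
        v_new = v + (force / m) * dt
        if v_new < 0.0:                   # first negative velocity: past the top
            return x, step * dt
        x += 0.5 * (v + v_new) * dt
        v = v_new
        step += 1


if __name__ == "__main__":
    peak, t_peak = model_rocket(thrust=4.17, burn_time=1.2, m_burn=0.0792,
                                m_final=0.0751, drag=0.000646, dt=0.05)
    print(f"peak altitude ~ {peak:.1f} m at t ~ {t_peak:.2f} s")
```

Counting steps with an integer, instead of comparing an accumulated time with the burn time, avoids rounding trouble at the instant of burnout. Setting `drag=0.0` gives a case that can be checked by hand with the constant-acceleration formulas.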

5-51.  Optimizing  peak  altitude.  A certain  model  rocket 
has  specified  total  mass,  fuel  mass,  and  exhaust  speed,  but 
the  burn  rate  can  be  set  to  any  desired  value  prior  to 
launch. 

a.  If  the  rocket  is  to  be  fired  in  an  airless  environment 
(such  as  from  the  lunar  surface),  how  does  its  peak  alti- 
tude depend  on  the  burn  rate  that  is  used?  What  burn  rate 
provides  maximum  altitude? 

b.  If  there  is  a drag  force  that  acts  on  the  rocket,  how 
is  your  argument  in  part  a affected? 

c.  Run  the  model  rocket  program  of  Exercise  5-49 
with  the  following  sets  of  values  for  the  thrust,  burn  time, 
and  burn  rate. 


Set   Engine thrust   Burn time   Burn rate
      (in N)          (in s)      (in 10⁻³ kg/s)

1.    4.17            1.2         6.83
2.    8.34            0.6         13.66
3.    2.085           2.4         3.415

Use the values given in Exercise 5-50 for the other quantities. (Note
that set 1 exactly reproduces the conditions of Exercise 5-50.)

d. Based on your results in part c, is the optimum burn rate faster or
slower than 6.83 × 10⁻³ kg/s?

5-52. Rockets away, I. At t = 0 s, a rocket with initial mass
m0 = 1 × 10⁴ kg is fired vertically. The speed of the exhaust gases
(relative to the rocket) is vg = 2500 m/s. The fuel is burned at a steady
rate such that 0.05m0 is ejected each second until the fuel is exhausted,
when m = m_final = 0.1m0. For the purposes of this exercise, ignore the
pull of gravity.

a. For how long does the fuel burn?

b. What is the value of m_{1/2}, the mass of the rocket at t = ½ Δt?

c. Write an expression for the velocity increment Δv during a small time
increment Δt, in terms of dm/dt, vg, Δt, and the average mass m during
the time increment.

d. In the present exercise, find Δv1, the velocity increase during the
first time increment. Use the mass value m_{1/2} in your calculation.

e. Find v1, the velocity at the end of the first time increment. Find
the ratio v1/vg.

f. What is the value of m_{3/2}, the rocket mass at t = (3/2) Δt?

g. Find Δv2, the velocity change between t = 1 Δt and t = 2 Δt.


h. Find v2, the velocity at t = 2 Δt. Find v2/vg.

i. Using the general expressions

$$\frac{\Delta v_{j+1}}{v_g} = \frac{|dm/dt|\,\Delta t}{m_{j+1/2}}$$

and

$$\frac{v_{j+1}}{v_g} = \frac{v_j}{v_g} + \frac{\Delta v_{j+1}}{v_g}$$

write a program which enables a calculating device to calculate the
velocity at the end of each time increment from t = 0 s up to burnout.
Use an increment Δt = 1 s in your calculations.

j. Construct a graph of vj/vg versus tj.

k. Use the graph of velocity versus time to determine the distance
traveled by the rocket before burnout.

5-53.  Rockets  away,  II.  Repeat  the  analysis  of  Exercise 
5-52,  taking  gravity  into  account.  Assume  a constant  value 
of  9.8  m/s2  for  g. 
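
The step-by-step rocket calculation of Exercises 5-52 and 5-53 can be sketched as follows. This Python sketch assumes the velocity increment over one step is vg |dm/dt| Δt / m, evaluated with the mass at the middle of the step, minus g Δt when gravity is included. The function name `rocket_velocities` and its parameter names are illustrative, and the default values are those stated in Exercise 5-52.

```python
import math


def rocket_velocities(m0=1.0e4, v_g=2500.0, burn_fraction_per_s=0.05,
                      final_fraction=0.1, dt=1.0, g=0.0):
    """Return (times, speeds) at the end of each time increment up to burnout.

    g = 0.0 ignores gravity (Exercise 5-52); g = 9.8 includes it (Exercise 5-53).
    """
    dm_dt = burn_fraction_per_s * m0               # magnitude of dm/dt in kg/s
    burn_time = (1.0 - final_fraction) * m0 / dm_dt
    n_steps = round(burn_time / dt)
    times, speeds = [0.0], [0.0]
    for j in range(n_steps):
        m_mid = m0 - dm_dt * (j + 0.5) * dt        # mass at the middle of step j+1
        dv = v_g * dm_dt * dt / m_mid - g * dt     # velocity increment for the step
        times.append((j + 1) * dt)
        speeds.append(speeds[-1] + dv)
    return times, speeds


if __name__ == "__main__":
    times, speeds = rocket_velocities()
    for t, v in zip(times, speeds):
        print(f"t = {t:4.1f} s   v/vg = {v / 2500.0:.4f}")
    # Exact gravity-free result for comparison: v/vg = ln(m0 / m_final)
    print(f"ln(m0/m_final) = {math.log(10.0):.4f}")
```

Without gravity the exact burnout speed is vg ln(m0/m_final), so the last value of v/vg should come out close to, and slightly below, ln 10. The gap shrinks as Δt is reduced.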




Oscillatory  Motion 


6-1 STABLE EQUILIBRIUM AND OSCILLATORY MOTION

You are surrounded by things whose motion is oscillatory. Some obvious
examples are the pendulum in a grandfather clock and a guitar string
when it is plucked. Less obvious, but at least as important, are such
microscopic examples as the oscillating air molecules carrying the sound
wave produced by the guitar. Furthermore, many very important
nonmechanical phenomena involve oscillations. A set of related examples
is found in the alternating-current electric power that runs a television
receiver, the electromagnetic signal that it receives, the currents in
the electric circuits which decode the signal to produce the pictures,
and the light waves by which you see these pictures. In this chapter you
will study the oscillatory motion of several simple mechanical systems.
Almost all that you learn will be applied later in the book to more
complex mechanical, and nonmechanical, forms of oscillatory motion.

Any  object  which  has  a position  of  stable  equilibrium  is  capable  of  performing 
oscillations  about  that  position.  Figure  6-1  will  be  used  first  to  explain  the 
meaning  of  the  expression  “position  of  stable  equilibrium”  and  then  used 
again  to  discuss  the  oscillation  of  an  object  about  a position  of  stable  equilib- 
rium. The  spring  in  the  figure  is  of  negligible  mass,  and  its  top  end  is  at- 
tached to  a rigid  support.  From  its  bottom  end  hangs  a body  of  mass  m.  We 
employ  signed  scalar  symbols  in  discussing  the  directed  quantities,  such  as 
forces,  which  enter  into  the  analysis  of  this  one-dimensional  system.  The 
downward  direction  is  taken  to  be  positive. 

In Fig. 6-1a, the body is shown at the position where the spring is
stretched from its completely relaxed length by such an amount that the
negative (upward) force S exerted on the body by the spring just cancels
the positive (downward) force mg exerted on the body by the earth’s
gravity. At this position the net force F = mg + S acting on the body is
zero. When a body is at such a position, where zero net force acts on it,
it is said to be at a position of equilibrium. A body at rest at an
equilibrium position will remain at rest because there is no net force
acting to change its velocity from zero and thereby set it into motion.

Fig. 6-1 A body of mass m is connected to the bottom end of a spring
whose top end is attached to a rigid support. The two forces acting on
the body are shown. One is the force, labeled mg, which is produced by
the gravitational attraction of the earth. This force is always directed
downward. The other is the force, labeled S, which is produced by the
spring. In each part of the figure it is assumed that the length of the
spring is greater than its length when completely relaxed. So in each the
spring force is directed upward. The coordinate x of the body is measured
from its position in part a, in which the body hangs motionless with the
gravitational force just canceling the spring force. The downward
direction is taken as positive.

The body shown in Fig. 6-1a is at a position of equilibrium of a special
kind called a position of stable equilibrium. That is, if the body is
displaced a small distance from the position in Fig. 6-1a in either
direction, it will experience a net force in a direction which tends to
return the body to that position. We now show that this is so. Let us
introduce the coordinate x to specify the position of the body, with the
origin, x = 0, at the body’s equilibrium position. In Fig. 6-1b the body
is at a position where x is positive, so that the spring is stretched
more than in Fig. 6-1a. The additional stretch increases the magnitude of
the negative force S that the spring exerts on the body. But there is no
change in the positive force mg which the earth exerts on the body. Thus
the net force F = mg + S acting on the body has a negative value, and so
acts upward. In Fig. 6-1c the body is at a position where x is negative,
and the spring is stretched less than in Fig. 6-1a. Consequently the
negative force S exerted on the body by the spring is of reduced
magnitude and the net force F = mg + S acting on the body has a positive
value, and so acts downward. Thus if the body is displaced in either
direction from the equilibrium position, it experiences a net force which
acts in the direction that tends to move the body back to the equilibrium
position. This makes the position of equilibrium a position of stable
equilibrium.

Figure  6-2  is  a qualitative  plot  of  the  net  force  F,  which  acts  on  the 
body  at  the  end  of  the  spring,  versus  the  coordinate  x specifying  the  posi- 
tion of  the  body.  Since  x = 0 has  been  chosen  to  be  at  a position  of  equilib- 
rium, the  value  of  F at  that  point  must  be  zero,  as  shown.  In  other  words, 
the  F(x)  curve  passes  through  the  x axis  at  a position  of  equilibrium.  In  the  immedi- 
ate vicinity  of  that  position,  F is  negative  where  x is  positive  and  F is  positive 
where  x is  negative,  as  must  be  the  case  in  the  vicinity  of  a position  of  stable 
equilibrium. In other words, the slope of the F(x) curve is negative in the vicinity
of  a position  of  stable  equilibrium. 
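
The slope criterion just stated can be tried numerically. The Python sketch below estimates the slope of F(x) at an equilibrium position with a central difference and reports the kind of equilibrium. The function name `classify_equilibrium`, the step `h`, and the sample force laws are assumptions made for illustration and do not come from the text.

```python
def classify_equilibrium(force, x_eq, h=1.0e-6):
    """Classify an equilibrium position x_eq by the sign of dF/dx there.

    force is a function F(x); the slope is estimated by a central difference.
    A zero slope leaves the question undecided at this level of analysis.
    """
    slope = (force(x_eq + h) - force(x_eq - h)) / (2.0 * h)
    if slope < 0.0:
        return "stable"
    if slope > 0.0:
        return "unstable"
    return "undetermined"


if __name__ == "__main__":
    # A restoring force such as F = -k x (with k = 20 N/m assumed) is stable.
    print(classify_equilibrium(lambda x: -20.0 * x, 0.0))
    # A force that pushes the body farther away is unstable.
    print(classify_equilibrium(lambda x: 20.0 * x, 0.0))
```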

In  this  particular  case,  the  F(x)  curve  is  a straight  line  in  the  region 
near  x = 0.  This  is  because  the  spring  obeys  Hooke’s  law,  Eq.  (4-21),  pro- 
viding its  length  is  not  too  different  from  its  relaxed  length.  That  is,  the 
magnitude of the spring force S depends linearly on x since x specifies the
distortion  of  the  spring,  and  so  the  magnitude  of  the  net  force  F = mg  + S 
is  proportional  to  x. 
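
The step from Hooke's law to a net force proportional to x can be written out. As a sketch, assume Hooke's law in the form S = −k(s₀ + x), where the spring constant k and the stretch s₀ of the spring at equilibrium are symbols introduced here for illustration (Eq. (4-21) itself is not reproduced in this section). With the downward direction positive:

```latex
\begin{aligned}
S &= -k\,(s_0 + x), \\
0 &= mg - k s_0 \qquad \text{(zero net force at } x = 0\text{)}, \\
F &= mg + S = mg - k s_0 - kx = -kx .
\end{aligned}
```

Under these assumptions the constant gravitational force is canceled by the equilibrium stretch, and what remains is a net force whose sign is opposite to that of x and whose magnitude is k|x|.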

Now let us discuss oscillations about a position of stable equilibrium.
If you connect a body which you are holding to the lower end of an
unstretched spring whose upper end is fixed, and then gradually remove
the supporting force you apply to the body, the body will settle into the
position of stable equilibrium shown in Fig. 6-1a. If you again take hold
of the body, pull it to the position shown in Fig. 6-1b, and then release
it, it will begin oscillating about its stable equilibrium position.
Immediately after you release the body in the position illustrated in
Fig. 6-1b, it feels a net force tending to make it move back to the
stable equilibrium position shown in Fig. 6-1a. But the body does not
simply move back to that position and then stop. All the while it is
moving toward the stable equilibrium position it is acted on by a net
force directed toward that position. Thus the




Fig. 6-2 The net force F = mg + S acting on the body in Fig. 6-1 is plotted versus the
coordinate x specifying its position. Since x = 0 is defined to be the position at which mg + S = 0,
the net force has the value F = 0 for x = 0. A positive value of F means that the force acts in the
direction of positive x; that is, in the downward direction in which it tends to move the body so
as to extend the spring. A negative value of F means that the force acts in the opposite
direction. Over the range of x where the spring obeys Hooke’s law, the plot is linear. This range
includes  values  of  x for  which  the  length  of  the  spring  is  less  than  its  length  when  completely 
relaxed,  although  such  a situation  is  not  illustrated  in  Fig.  6-1.  Since  the  net  force  acting  on 
the  body  is  always  in  the  direction  in  which  it  tends  to  move  the  body  to  the  position  x = 0, 
the  sign  of  F is  opposite  to  the  sign  of  x.  This  requires  that  the  slope  be  negative  when  x is  in 
the  linear  range.  But  if  x becomes  too  negative,  the  coils  of  the  spring  touch  and  F very  rapidly 
becomes  more  positive  to  prevent  them  from  penetrating  each  other.  If  x becomes  too  positive, 
Hooke’s  law  is  no  longer  satisfied.  At  even  more  positive  values  of  x,  the  coils  are  stretched 
so  much  that  they  begin  to  yield  to  permanent  deformation.  Ultimately,  the  spring  breaks, 
and  the  net  force  acting  on  the  body  becomes  the  constant  gravitational  force,  which  is  positive 
because  it  is  directed  downward.  For  values  of  x in  the  region  beyond  the  point  where  per- 
manent deformation  begins,  the  curve  is  not  really  a plot  of  a function,  F(x).  The  reason  is 
that  it  describes  the  dependence  of  F on  x in  this  region  only  as  x becomes  more  positive  and 
only  for  the  first  time  x enters  the  region.  So  in  this  region  the  value  of  F depends  not  only  on 
the  value  of  x but  also  on  what  happened  before  x reached  the  value. 


body picks up speed while moving toward the stable equilibrium position.
As it approaches, the net force gradually diminishes, until at the stable
equilibrium position the net force is zero. The body keeps moving,
however, because it has mass and cannot change its speed when no net force
acts on it. So it moves past the position of stable equilibrium. As the body
continues moving, a net force directed back toward the equilibrium
position gradually develops. This slows the body until it comes momentarily to
rest at the position shown in Fig. 6-1c. The first half-cycle of an oscillation
has now been completed. The body immediately starts the next half-cycle
of oscillation. This half-cycle is just the reverse of the first half-cycle. In the
absence of friction the oscillation would continue indefinitely.

All  other  examples  of  mechanical  oscillation  involve  the  same  interplay 
between  the  same  two  factors:  (1)  a force,  acting  on  a body,  that  always  is 
directed  toward  a certain  position — the  position  of  stable  equilibrium  — 
and  (2)  the  inertia  of  the  body  on  which  the  force  acts.  The  force  always 
pushes  the  body  toward  the  stable  equilibrium  position,  and  its  inertia 
always  makes  it  “overshoot.” 

A system which is particularly interesting in regard to stable
equilibrium and oscillatory motion is shown in Fig. 6-3. There are two equilibrium
positions for the body of the system which has appreciable mass. One is a
position  of  stable  equilibrium,  and  the  other  is  not.  The  body  will  execute 
oscillations  about  the  position  of  stable  equilibrium,  but  not  about  the  other 
equilibrium  position. 




Fig. 6-3 A body of mass m is connected to one end of a light rod
whose other end is attached to an axle in horizontal bearings. The two
forces acting on the body are shown. One is the gravitational force
mg exerted in the downward direction by the earth. The other is the
force R exerted by the rod in a direction along its length. Parts g and
h show the angular coordinate φ of the body. It is measured from the
downward vertical, with the counterclockwise direction positive. Also
shown in g and h is the tangential component of the net force acting
on the body. This quantity has the value F_t = -mg sin φ. The sign
correctly expresses the fact that, in part g, sin φ is positive and F_t is
negative, while, in part h, sin φ is negative and F_t is positive.




The  system  consists  of  a compact  body  of  mass  m at  one  end  of  a rod 
having length l and negligible mass. The other end of the rod is connected
to  a horizontal  axle  which  is  supported  by  bearings  of  negligible  friction. 
Thus  the  body  can  move  through  a circle  of  radius  l in  a vertical  plane. 

Your  intuition  will  probably  tell  you  that  the  body  has  two  equilibrium 
positions,  one  when  the  rod  is  oriented  vertically  downward  and  the  other 
with  the  rod  vertically  upward.  In  both  cases,  the  rod  can  exert  a force  on 
the  body  which  exactly  cancels  the  downward  gravitational  force  acting  on 
it. But only the lower position is one of stable equilibrium. If the body is
displaced slightly from the upper position and released, it will not experience
a net  force  which  makes  it  tend  to  return  to  that  position,  as  would  be  the 
case  for  the  lower  position. 


We  now  discuss  the  forces  acting  on  the  body  in  detail.  Whatever  the 
position  of  the  body,  there  are  two  forces  acting  on  it.  One  is  the 
downward-directed  gravitational  force  mg  exerted  on  it  by  the  earth.  The 
other  is  the  force  R exerted  on  the  body  by  the  rod.  The  direction  of  R is 
either  outward  or  inward  along  the  rod,  depending  on  the  body’s  position 
and  speed.  The  magnitude  of  R also  depends  on  the  position  and  speed  of 
the  body. 

The  speed  dependence  of  R results  from  the  fact  that  when  the  body  is 
moving,  its  acceleration  has  a component  along  the  direction  of  the  rod.  This  is 
the  inward-directed  centripetal  acceleration  whose  magnitude  depends  on  the 
speed  of  the  body.  The  centripetal  acceleration  is  the  result  of  a centripetal  force 
exerted  on  the  body.  This  force  is  supplied  in  part  by  the  rod  and  in  part  by  the 




Fig. 6-4 The tangential component F_t
of the net force acting on the body in Fig.
6-3 is plotted versus the coordinate φ
specifying its position. This component
of the net force is the one which tends to
make the body move along its circular
path, and so it is the one that determines
the body’s positions of equilibrium.
There are two such positions, at which
F_t = 0. The first is where φ = 0, and the
second is where φ = ±π. In the first
position the equilibrium is stable, and in the
second it is unstable. Does this statement
agree with the physical intuition you
have for the system shown in Fig. 6-3?


gravitational  force  acting  on  the  body,  which  has  a component  along  the  direction 
of  the  rod.  In  each  part  of  the  figure  the  force  R is  drawn  for  the  simple  case  in 
which  there  is  no  centripetal  acceleration  because  the  body  is  motionless  at  the 
position shown. It is the force that would be exerted on the body by the rod
immediately after you moved the body to that position and then released it.

In  considering  the  stability  of  the  equilibrium  positions  of  the  body  (and  also 
its  oscillations  about  a position  of  stable  equilibrium)  the  speed  dependence  of  R 
does  not  matter.  If  the  body  is  placed  at  rest  near  a position  of  stable  equilibrium 
and  then  released,  it  will  begin  to  move  toward  that  position.  In  other  words,  the 
magnitude  of  its  velocity  will  be  changed.  But  we  saw  in  Sec.  3-4  that  the  only 
component  of  the  net  force  acting  on  a body  which  is  effective  in  changing  the 
magnitude  of  its  velocity  is  the  component  along  the  direction  of  its  path.  Since 
the  force  R never  has  a component  along  this  direction,  we  are  not  concerned  with 
the  value  of  R in  our  present  considerations. 

The most convenient way to specify the position of the body at the end
of the rotatable rod is to use the value of the angular coordinate φ shown in
Fig. 6-3g and h. It is the angle, measured in radians, from the downward
vertical to the rod. The value of φ is chosen to be positive when the rod is
rotated counterclockwise from the downward vertical. You can see from
the figure that the component F_t of the net force acting on the body along a
direction tangent to its path is given by the equation F_t = -mg sin φ. The
minus sign takes into account the fact that when the value of sin φ is positive
(the rod is rotated not more than half a revolution in the counterclockwise
direction), the value of F_t is negative (the tangential force acts on the
body in the clockwise direction). And when the value of sin φ is negative,
the value of F_t is positive.

The relation F_t = -mg sin φ is plotted in Fig. 6-4. The F_t(φ) curve
passes through the φ axis at φ = 0, consistent with the fact that φ = 0
specifies an equilibrium position. Furthermore, the slope of the curve is
negative as it passes through the axis since φ = 0 is a position of stable
equilibrium. This conclusion certainly makes sense. If the body is directly below
the axle, as in Fig. 6-3a, there is no net force acting to make it move along
its path, and it is at an equilibrium position. If the body is not far from this
position in either direction, as in Fig. 6-3b or c, then there is a net force
exerted on the body with a component in a direction tangent to its path. In
either case this force is directed so as to tend to move the body toward the
equilibrium position. Thus the position φ = 0 is indeed a position of stable
equilibrium.

The body also has another equilibrium position directly above the axle.
This position corresponds to the coordinate value φ = π or, equally well, to
the value φ = -π (since the two values describe the same position). It is an
equilibrium position because the F_t(φ) curve passes through the φ axis at
these values. So there is no tangential force component tending to move
the body away from its position when that position is directly above the
axle. You can see this is true by inspecting Fig. 6-3f. But this equilibrium
position is not one of stable equilibrium. If the body is at the positions
shown in Fig. 6-3d or e, there is a net force component tangent to the path
of the body acting on the body in the direction which tends to move it away
from, not toward, the equilibrium position directly above the axle. This
is true not only of the positions shown, but of any position close to φ = ±π.
That equilibrium position is thus a position of unstable equilibrium. With
great care (and if there is a little friction in the axle bearings), the body can




be “balanced” directly above the axle. But the least disturbance that moves
it slightly away will put it at a position where a force acts to move it farther
away. There it will feel a stronger tangential force that acts to move it even
farther, and so forth. The behavior of the F_t(φ) curve at the position of
unstable equilibrium is shown in Fig. 6-4 at φ = π or φ = -π. The curve
passes through the φ axis with a positive slope, in contrast to the negative
slope characteristic of stable equilibrium.
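
The slope criterion just described can be checked numerically. The following is a minimal sketch, not part of the original text: the function names, the sample values m = 1 kg and g = 9.8 m/s², and the use of a central difference to estimate the slope are all assumptions made for illustration.

```python
import math

def tangential_force(phi, m=1.0, g=9.8):
    """F_t = -m g sin(phi) for the body on the rod (m and g are sample values)."""
    return -m * g * math.sin(phi)

def classify(phi_eq, h=1e-6):
    """Classify an equilibrium by the sign of dF_t/dphi (central difference)."""
    slope = (tangential_force(phi_eq + h) - tangential_force(phi_eq - h)) / (2 * h)
    return "stable" if slope < 0 else "unstable"

for name, phi_eq in [("below the axle", 0.0), ("above the axle", math.pi)]:
    force = tangential_force(phi_eq)
    print(f"phi = {phi_eq:.4f} rad ({name}): F_t = {force:+.1e} N, {classify(phi_eq)}")
```

A negative slope means that a small displacement produces a force pointing back toward the equilibrium position, and a positive slope means the force points away from it, which is the distinction drawn in the text.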

The body at the end of the spring shown in Fig. 6-1 also has a
position of unstable equilibrium. But this is not as evident as it is for the body at
the end of a rotatable rod. Use Fig. 6-2 to identify the unstable equilibrium
position. Can you explain on physical grounds why the equilibrium is
unstable?

If  you  put  the  body  at  the  end  of  the  rotatable  rod  near  its  position  of 
stable  equilibrium  and  then  release  it,  it  will  oscillate  about  that  position. 
The  explanation  involves  the  same  interplay  between  force  and  inertia  as  in 
the  body  and  spring  case.  Here  also  there  is  a force  acting  on  the  body 
always  in  the  direction  toward  the  position  of  stable  equilibrium  and  a 
body, having inertia, on which the force acts. You should explain to
yourself in detail what happens through one cycle of an oscillation, patterning
the  explanation  on  the  one  given  earlier  for  the  body  at  the  end  of  the 
spring. 

In  contrast,  if  you  put  the  body  at  the  end  of  the  rotatable  rod  near  its 
position  of  unstable  equilibrium  and  then  release  it,  it  will  not  oscillate 
about  that  position.  Why  not?  What  will  it  do? 

In an oscillating system the force involved is neither constant in
direction nor constant in magnitude. So the acceleration of the body on which it
acts is not constant. However, at the end of Chap. 5 we developed a quite
general numerical method for treating systems with nonconstant
accelerations. In the next two sections we will apply this method to analyze in detail
oscillatory motion in the two systems we have discussed qualitatively here: a
body at the end of a spring and a body at the end of a rotatable rod. There
is also an analytical method for treating oscillatory motion which we will use
later in this chapter to reanalyze the oscillations of a body at the end of a
spring. In some cases the analytical method does not work. (It cannot be
used for the body at the end of the rod if the rod rotates through an
appreciable angle when the body oscillates.) But when the analytical method
works, it has important advantages over the numerical method. The two
methods complement each other in that the strong points of each generally
correspond to the weak points of the other.


6-2 THE BODY AT THE END OF A SPRING

If a body of mass m hangs from a spring whose top is fixed to a rigid
support, as in Fig. 6-1, then two forces act on the body. One is the force
exerted by the spring, and the other is the force exerted by gravity. But the
constant gravitational force does not play a role in the oscillatory motion of
the body. It does serve to orient the system along the vertical direction. It
also determines the location of the equilibrium position of the body
because the weight of the body stretches the spring. But when the body is
pulled from the position of stable equilibrium and then released, it is the
variable net force acting on it that produces its oscillatory motion. This
variable net force arises from the variation in the spring force from its value at




Fig.  6-5  A body  of  mass  m is  connected 
to  the  movable  end  of  a spring  of  force 
constant  k.  The  body  is  supported  by  an 
upward-acting  force  which  cancels  the 
downward-acting  force  of  gravity.  The 
supporting  force  is  supplied  by  an  air 
track,  not  shown  in  the  figure,  on  which 
the  body  slides.  When  the  body  is  moved 
from  its  position  of  stable  equilibrium  in 
which  the  spring  has  its  relaxed  length 
and  then  released,  it  oscillates  along  a 
horizontal  line.  The  figure  shows  the 
body  at  an  instant  in  its  oscillation  cycle 
when  it  happens  to  be  to  the  right  of  its 
stable  equilibrium  position.  In  these 
circumstances,  its  position  coordinate  x, 
measured  from  the  stable  equilibrium 
position,  is  defined  to  have  a positive 
value.  The  spring  is  extended  and  so 
exerts  a horizontal  force  S on  the  body. 
This  force  acts  to  the  left,  and  therefore 
its value is negative. The relation
between S and x is given by Hooke’s law:
S = -kx, where k is a positive constant.


the  stable  equilibrium  position.  Thus  the  oscillation  of  the  body  is  due  to 
the  variable  force  exerted  on  it  by  the  spring. 
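
The statement that gravity only fixes the equilibrium position can be verified in a few lines. The sketch below assumes a coordinate y, not used in the text, that measures the extension of the hanging spring from its relaxed length with the downward direction positive, and it writes y_0 for the extension at equilibrium. Both symbols are introduced here only for illustration, and the spring is assumed to stay within its Hooke's law range.

$$
\begin{aligned}
F &= mg + S = mg - ky \\
0 &= mg - ky_0 \quad\Rightarrow\quad y_0 = \frac{mg}{k} \\
x &\equiv y - y_0 \quad\Rightarrow\quad F = mg - k(y_0 + x) = -kx
\end{aligned}
$$

Under these assumptions the constant mg cancels against the spring force at equilibrium, and the net force depends only on the displacement x from the equilibrium position.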

To  focus  attention  on  what  is  important,  we  will  analyze  the  oscillatory 
motion  of  the  body  in  the  system  shown  in  Fig.  6-5.  It  is  connected  to  one 
end  of  a horizontally  oriented  spring  of  negligible  mass  whose  other  end  is 
fixed.  We  can  support  the  body  against  gravity  by  placing  it  on  a frictionless 
track. (This ideal condition can be well approximated by using a
one-dimensional air table called an air track. It is made from a rail pierced with
many tiny holes through which air is blown to form a film of air that
supports the body.) Here we are concerned only with the horizontally directed
force,  shown  in  the  figure,  which  is  exerted  on  the  body  by  the  spring  when 
the  spring  is  longer  or  shorter  than  its  relaxed  length.  This  is  the  force 
which  makes  the  body  oscillate. 

We specify the position of the body by the coordinate x, shown in the
figure. This coordinate is measured along a horizontal axis, with the
positive direction to the right. The origin of the axis is taken to be the body’s
position of stable equilibrium. In other words, when x = 0, the spring is
relaxed. The force exerted on the body by the spring is indicated in the
figure, at an instant when the spring is stretched. We represent this spring
force by the signed scalar S. Provided that the maximum stretch, or
compression, of the spring occurring during the oscillation is not too large, both
the magnitude and the direction of S at any position x of the body can be
expressed by the equation

$$S = -kx \quad \text{for } |x| \text{ not too large} \tag{6-1}$$

This  is  just  a form  of  Hooke’s  law  pertaining  to  the  present  situation.  Since 
the  magnitude  of  x gives  the  magnitude  of  the  change  in  the  length  of  the 
spring  from  its  relaxed  length  (the  distortion  of  the  spring),  Eq.  (6-1)  says 
that  the  magnitude  of  the  spring  force  is  proportional  to  the  distortion  of 
the  spring.  Hooke’s  law  in  the  form  of  Eq.  (4-21)  says  the  same  thing.  The 
positive  proportionality  constant  in  Eq.  (6-1)  is  the  force  constant  k.  It  is 
identical  to  the  k in  Eq.  (4-21),  and  it  specifies  the  stiffness  of  the  spring.  In 
contrast  to  Eq.  (4-21),  the  direction  of  the  spring  force  is  also  given  by  Eq. 
(6-1).  For  a positive  value  of  x the  spring  is  extended  because  the  body  is  to 
the  right  of  its  equilibrium  position.  Thanks  to  the  minus  sign,  Eq.  (6-1) 
predicts correctly that in such a case the value of S is negative, so that the
force exerted on the body by the spring acts to the left. And when x is
negative because the body is to the left of its equilibrium position, the equation
correctly  predicts  that  the  force  which  the  compressed  spring  exerts  on  the 
body  acts  to  the  right  since  its  sign  is  positive. 

When Hooke’s law was presented in Sec. 4-6, it was emphasized that
the law applies not only to springs but to a wide variety of mechanical
objects. When distorted moderately, almost anything composed of
crystalline solids produces a force that is proportional to the distortion, in
agreement with Eq. (6-1). Thus the body-and-spring system is really a prototype
of  a very  large  class  of  mechanical  systems.  They  all  act  in  essentially  the 
same  way  as  far  as  their  oscillatory  motion  is  concerned,  because  in  all  of 
them  the  force  producing  the  motion  is  essentially  that  of  Eq.  (6-1). 


We  study  this  oscillatory  motion  by  combining  Newton’s  second  law 
with Hooke’s law and then finding solutions to the resulting equation.
According to Newton’s second law, the net force F acting on the body at the




end  of  the  spring  is  related  to  the  mass  m of  the  body  and  its  acceleration  a 
by  the  equation 

$$F = ma \tag{6-2}$$

But the net force F is just the spring force S. So F = S = -kx, and we have

$$-kx = ma$$

Solving for the acceleration, we obtain

$$a = -\frac{k}{m}\,x$$

We write this as

$$a = -\alpha x \tag{6-3}$$

where we define

$$\alpha \equiv \frac{k}{m}$$

The quantity α is a parameter specifying the mechanical properties of the
system. In other words, α is a quantity whose value is constant for a given
system, and whose value determines the behavior of the system. In
particular, α is the constant k specifying the stiffness of the spring divided by the
constant m specifying the mass of the body connected to its movable end.
Its SI units are newtons per meter-kilogram, that is, N/(m·kg).

We will employ the numerical method of Sec. 5-6 to find solutions to
Eq. (6-3) for several different sets of initial conditions, and for various
values of α. That is, we will adapt the method to handle this case where the
nonconstant acceleration a of a body has the mathematical form a = -αx.
Then we will make numerical calculations which determine how the body
moves when it is started in several different ways, and how the motion
depends on the stiffness of the spring and the mass of the body. The
relations of the numerical method, Eqs. (5-35), apply immediately to this case,
providing that Eq. (6-3) is used at each step to evaluate a. So all that is
required is to make the appropriate changes in the part of the calculator or
computer program of Sec. 5-6 where a is evaluated. The resulting
body-and-spring program is listed in the Numerical Calculation Supplement.
Some solutions to Eq. (6-3) obtained by using it are given in the following
examples.
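
The book's own program lives in the Numerical Calculation Supplement and is not reproduced here. As a stand-in, the following minimal sketch steps Eq. (6-3) forward in time with a leapfrog (velocity Verlet) scheme. That scheme, the function name `simulate`, and the choice of Python are assumptions made for illustration and may differ from Eqs. (5-35). The only problem-specific line is the one that evaluates a = -αx.

```python
def simulate(x0, v0, alpha, dt, t_end, t0=0.0):
    """Integrate a = -alpha * x with a leapfrog (velocity Verlet) scheme.

    Returns a list of (t, x, v) samples, one per time step.
    NOTE: the stepping scheme is an assumption, not the book's Eqs. (5-35).
    """
    n_steps = int(round((t_end - t0) / dt))
    t, x, v = t0, x0, v0
    a = -alpha * x                      # Eq. (6-3), evaluated at the start
    samples = [(t, x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * a       # velocity at the middle of the step
        x = x + dt * v_half             # position at the end of the step
        a = -alpha * x                  # Eq. (6-3) again, at the new position
        v = v_half + 0.5 * dt * a       # velocity at the end of the step
        t = t + dt
        samples.append((t, x, v))
    return samples

# Initial conditions of Example 6-1: x0 = 0.25 m, v0 = 0, dt = 0.2 s, alpha = 1
run = simulate(x0=0.25, v0=0.0, alpha=1.0, dt=0.2, t_end=8.0)
for t, x, v in run[::5]:
    print(f"t = {t:4.1f} s   x = {x:+.3f} m   v = {v:+.3f} m/s")
```

Changing the force law would mean changing only the two lines that compute `a`, which mirrors the remark above that only the part of the program where a is evaluated needs to be adapted.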


EXAMPLES  6-1  AND  6-2  

Run  the  body-and-spring  program  with  the  following  two  sets  of  initial  conditions 
and  parameters: 

x0 = 0.25 (in m); v0 = 0; t0 = 0; Δt = 0.2 (in s); α = 1 [in N/(m·kg)]
x0 = 1.5 (in m); v0 = 0; t0 = 0; Δt = 0.2 (in s); α = 1 [in N/(m·kg)]

■ The results obtained for both sets are plotted in Fig. 6-6. In Example 6-1 the
body is initially displaced to precisely x = 0.25 m and then let go from rest. The
plot shows that it starts to move slowly toward x = 0, picks up speed, and before
long passes through x = 0. Then it slows down as it approaches x = -0.25 m.
There it turns around and repeats this motion in reverse, until it returns to
x = 0.25 m. It has completed the first cycle of its oscillation. But it immediately starts
the next identical cycle.




Fig. 6-6 The position coordinate x versus the time t for small- and large-amplitude oscillations
of a body of mass m at the end of a spring of force constant k. The parameter α = k/m has the
value 1 N/(m·kg) for both oscillations.


In Example 6-2 the body was initially displaced to precisely x = 1.5 m and
released with zero initial velocity. Its subsequent oscillations look very much like those
in Example 6-1, except that all the values of x are scaled up in magnitude by a factor
of 1.5/0.25 = 6.


The comparison of the results obtained in Examples 6-1 and 6-2,
displayed in Fig. 6-6, shows that the oscillations of a body acted on by a
Hooke’s law force have a scaling property. That is, the general “shape” of a
plot of the coordinate x versus the time t does not depend on the maximum
magnitude assumed by x during a cycle of oscillation. This maximum
magnitude of x is called the amplitude of the oscillation. It is 0.25 m for the
oscillation treated in Example 6-1 and 1.5 m for the one treated in
Example 6-2.
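
The scaling property can be tested directly by stepping both examples with the same time step and comparing the two position records point by point. This is a minimal sketch under the same assumption as any stand-in for the book's program: the leapfrog stepping and the function name `positions` are illustrative choices, not the method of Eqs. (5-35). Because Eq. (6-3) is linear in x, multiplying x0 by 6 should multiply every later value of x by 6.

```python
def positions(x0, v0, alpha, dt, n_steps):
    """Return x at each step for a = -alpha * x (assumed leapfrog stepping)."""
    x, v = x0, v0
    a = -alpha * x
    record = [x]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * a
        x += dt * v_half
        a = -alpha * x
        v = v_half + 0.5 * dt * a
        record.append(x)
    return record

small = positions(0.25, 0.0, 1.0, 0.2, 60)   # initial conditions of Example 6-1
large = positions(1.5, 0.0, 1.0, 0.2, 60)    # initial conditions of Example 6-2
scale = 1.5 / 0.25
worst = max(abs(xl - scale * xs) for xs, xl in zip(small, large))
print(f"scale factor = {scale}, largest mismatch = {worst:.2e} m")
```

Any mismatch that appears should be at the level of floating-point rounding, since both runs perform the same arithmetic on proportional numbers.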

As a result of the scaling property, two important characteristics of the
oscillation are independent of its amplitude. These are its period and its
frequency. The period T of an oscillation is the time required for the
oscillation to go through one full cycle. For instance, it is the time required for
the body in Example 6-2 to go from x = +1.5 m in one cycle to x =
+1.5 m in the next. In more general terms, T is the time elapsed from the
passage of the body in a certain direction through any position to its next
passage in the same direction through the same position. The frequency ν
of an oscillation is the reciprocal of its period, that is, the number of
passages per unit time through a given position in a given direction. Thus T is
the number of seconds per cycle, ν is the number of cycles per second, and

$$\nu = \frac{1}{T}$$

The unit for T is the second. The unit for ν is called the hertz (Hz). A
frequency of, say, ν = 5 Hz means that a body completes 5 full cycles of
oscillation each second.

You can measure the period T for the two oscillations plotted in Fig.
6-6. Careful inspection will show that for both T = 6.28 s, despite their
considerable difference in amplitude. The corresponding value of the
frequency is ν = (1/6.28) Hz = 0.159 Hz. Note that the parameter α specifying
the mechanical properties of the system has the same value α = 1
N/(m·kg) in both examples.
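
Reading the period off a plot can be imitated in code by timing successive passages through x = 0 in the same direction, which is the general definition of T given above. The sketch below is an illustration only: the leapfrog stepping, the linear interpolation inside a step, and the name `measure_period` are assumptions, not the procedure of the Numerical Calculation Supplement.

```python
def measure_period(x0, v0, alpha, dt, t_end):
    """Estimate the period from successive passages through x = 0 in the +x direction."""
    t, x, v = 0.0, x0, v0
    a = -alpha * x
    crossings = []
    while t < t_end:
        v_half = v + 0.5 * dt * a
        x_new = x + dt * v_half
        if x < 0.0 <= x_new:            # passed x = 0 moving in the +x direction
            frac = -x / (x_new - x)     # linear interpolation inside the step
            crossings.append(t + frac * dt)
        a = -alpha * x_new
        v = v_half + 0.5 * dt * a
        x, t = x_new, t + dt
    gaps = [t2 - t1 for t1, t2 in zip(crossings, crossings[1:])]
    return sum(gaps) / len(gaps)        # needs at least two crossings

for label, x0 in [("Example 6-1", 0.25), ("Example 6-2", 1.5)]:
    T = measure_period(x0, 0.0, alpha=1.0, dt=0.2, t_end=40.0)
    print(f"{label}: T = {T:.2f} s, frequency = {1.0 / T:.3f} Hz")
```

With Δt = 0.2 s the measured period should agree with 6.28 s to within a couple of hundredths of a second for both amplitudes; the small residual comes from the finite time step of the assumed scheme.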

You may be surprised to find that a body of a particular mass at the
end of a particular spring oscillates with the same period and frequency
whether the amplitude of the oscillation is large or small. The body
certainly travels a longer path in one cycle of a larger amplitude oscillation,
and  this  tends  to  increase  the  time  required  to  complete  a cycle.  On  the 
other  hand,  when  the  body  has  completed  a certain  fraction  of  its  cycle  of 
oscillation,  the  force  exerted  on  it  will  be  larger,  the  larger  the  amplitude, 
since  it  will  be  at  a greater  distance  from  its  equilibrium  position  and  hence 
the  deformation  of  the  spring  will  be  greater.  So  with  increasing  amplitude 
the  accelerations  experienced  by  the  body  at  various  points  in  an  oscillation 
cycle  increase.  Thus  the  speed  it  has  at  a certain  fraction  through  the  cycle 
increases  also,  and  this  tends  to  decrease  the  time  required  to  complete  a 
cycle.  It  appears  from  Examples  6-1  and  6-2  that  these  opposing  effects 
cancel,  so  that  the  period  and  frequency  of  the  oscillation  are  independent 
of  its  amplitude,  if  the  force  exerted  on  the  body  obeys  Hooke’s  law.  When 
we  study  the  analytical  treatment  of  the  oscillatory  motion  in  Sec.  6-5,  we 
will  see  exactly  how  this  comes  about. 

Figure 6-7 shows values of the period T for oscillations obtained by
running the program with several different values of the parameter α.
The data were obtained by using a quick procedure explained in the
Numerical Calculation Supplement. The period T decreases as the ratio α of
the force constant to the mass increases. This is what you might expect. A
stiffer spring exerts a larger force. A body with smaller mass has less
inertia. Both effects tend to speed up the oscillation.

When  you  are  trying  to  understand  unfamiliar  data  showing  the  dependence 
of one quantity on another, such as the dependence of T on α, it may be profitable
to  try  to  describe  the  dependence  with  some  simple  mathematical  function.  One 
good  way  of  doing  this  is  to  guess  at  a functional  dependence,  and  then  use  it  to 
replot  the  data  in  such  a way  that,  if  the  guess  is  correct,  the  points  lie  on  a straight 
line. 

Fig. 6-7 The period T of the oscillations of a body at the end of a spring, for several
values of its stiffness-to-mass ratio α. [Axes: T (in s) versus α, in N/(m·kg) = s⁻².]

Fig. 6-8 A plot of T versus 1/√α for the oscillations of a body at
the end of a spring, using the data presented in Fig. 6-7. The way
the points fall on a straight line passing through the origin shows
that T is proportional to 1/√α. The slope is 6.28, which suggests
the relation T = 2π/√α. [Axes: T (in s) versus 1/√α, in m^(1/2)·kg^(1/2)/N^(1/2) = s.]

Often it is possible to guess the functional dependence by a dimensional
analysis argument. It certainly is possible here. The parameter α is defined by the
equation α = k/m, where k is a force divided by a length (the force constant of the
spring) and m is a mass (the mass of the body at the free end of the spring).
According to Newton’s second law, the dimensions of force are the same as those of
mass multiplied by acceleration; in other words, the dimensions of force are mass
multiplied by length divided by time squared. The dimensions of k are therefore
(mass)(length)(time)⁻²/(length), that is, (mass)(time)⁻². The dimensions of α are
consequently (mass)(time)⁻²/(mass). Thus the dimensions of the parameter α are
(time)⁻². Of course the dimensions of the period T are (time). Now, any functional
relation correctly describing the dependence of T on α must be dimensionally
consistent. A simple relation that has this property is one in which T is
proportional to 1/√α, with the proportionality constant being a pure number.
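
The dimensional argument can be laid out compactly. In the sketch below the sans-serif letters M, L, and T stand for the dimensions of mass, length, and time, a notation assumed here so that the time dimension is not confused with the period T.

$$
\begin{aligned}
[k] &= \frac{[\text{force}]}{[\text{length}]} = \frac{\mathsf{M}\,\mathsf{L}\,\mathsf{T}^{-2}}{\mathsf{L}} = \mathsf{M}\,\mathsf{T}^{-2} \\
[\alpha] &= \frac{[k]}{[m]} = \frac{\mathsf{M}\,\mathsf{T}^{-2}}{\mathsf{M}} = \mathsf{T}^{-2} \\
\left[\alpha^{-1/2}\right] &= \mathsf{T} = [T]
\end{aligned}
$$

If a trial form T = Cαⁿ with a pure number C is assumed, matching the dimensions on both sides requires n = -1/2. Dimensional analysis cannot supply the value of C.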

Figure 6-8 shows the same data points as those shown in Fig. 6-7, but with T
plotted versus 1/√α. Since the points do appear to lie on a straight line in Fig.
6-8, the data confirm the guess, based on dimensional analysis, that T is
proportional to 1/√α.

If you go one step further and measure the slope of the straight line to
determine the proportionality constant, you will obtain the value 6.28. Within the
accuracy that can be expected from the numerical work, this value is 2π. Thus it
seems there is an unexplained but intriguing relation T = 2π/√α. A complete
explanation is given in Sec. 6-5.
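
The straight-line test can be repeated numerically. The following sketch measures T for a handful of α values and fits a line through the origin to T versus 1/√α. The α values, the smaller time step Δt = 0.01 s, the leapfrog stepping, and the function name `period` are all assumptions chosen for illustration; they are not the data or the procedure behind Figs. 6-7 and 6-8.

```python
import math

def period(alpha, x0=1.0, dt=0.01):
    """Period of a = -alpha * x from passages through x = 0 in the +x direction."""
    t, x, v = 0.0, x0, 0.0
    a = -alpha * x
    crossings = []
    while len(crossings) < 3:
        v_half = v + 0.5 * dt * a
        x_new = x + dt * v_half
        if x < 0.0 <= x_new:
            crossings.append(t + dt * (-x) / (x_new - x))
        a = -alpha * x_new
        v = v_half + 0.5 * dt * a
        x, t = x_new, t + dt
    return (crossings[-1] - crossings[0]) / (len(crossings) - 1)

alphas = [0.25, 0.5, 1.0, 2.0, 4.0]          # assumed sample values, in s^-2
inv_sqrt = [1.0 / math.sqrt(al) for al in alphas]
periods = [period(al) for al in alphas]

# Least-squares slope of a line through the origin: sum(u*T) / sum(u*u)
slope = sum(u * T for u, T in zip(inv_sqrt, periods)) / sum(u * u for u in inv_sqrt)
print(f"fitted slope = {slope:.3f},  2*pi = {2 * math.pi:.3f}")
```

Forcing the line through the origin builds in the proportionality suggested by the dimensional argument, so the fit has a single free number, the slope, to compare with 2π.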

Example  6-3  investigates  the  motion  of  the  body  at  the  end  of  the 
spring  when  it  is  started  in  a way  quite  different  from  the  way  it  was  started 
before. 


EXAMPLE  6-3 

Run  the  body-and-spring  program  with  the  following  set  of  initial  conditions  and 
parameters: 

x0  = 0;  v0  = 1.5  (in  m/s);  t0  = 0;  A t — 0.2  (in  s);  a = 1 [in  N/(m-kg)] 

■ Here the body at the end of the spring is given an initial velocity of precisely
1.5 m/s in the positive x direction when it is at its equilibrium position. (These initial
conditions could be achieved approximately by starting the body with a sharp blow.)
Its subsequent motion is plotted in Fig. 6-9.

Note that with the parameter α having the same value as in Examples 6-1 and
6-2, the same value T = 6.28 s is found for the period of the oscillation. So again
you find that the period is governed by the value of α, and not by other considerations.

What do you think would happen if you ran the program, using initial conditions
in which neither x₀ nor v₀ were zero? Try it.


Fig. 6-9 An oscillation of a body at the end of a spring in which the initial displacement of the
body is zero and its initial velocity is nonzero. The parameter α = k/m has the same value 1
N/(m·kg) as in the oscillations plotted in Fig. 6-6.


6-3 THE SIMPLE
PENDULUM


Fig. 6-10 A simple pendulum, comprising a compact body of mass m
oscillating at the end of a light rod or cord of length l. The pendulum is shown
at an instant when the angular coordinate φ, measured from the downward
vertical, is positive. This choice of sign agrees with the usual convention of
making counterclockwise rotations positive. At the instant depicted, the
tangential component of the net force acting on the body has the value
F_t = −mg sin φ. But Fig. 6-3 shows that this relation holds at any instant,
that is, no matter what the sign or value of φ. The oscillatory motion results
from the action of the tangential force component.


Now  we  will  analyze  quantitatively  the  oscillatory  motion  of  the  system 
shown  in  Fig.  6-10,  which  we  discussed  qualitatively  in  Sec.  6-1.  One  end  of 
a rod of length l is attached to an axle supported by bearings in such a way
that  the  rod  can  rotate  with  no  appreciable  friction  in  a vertical  plane.  The 
other  end  of  the  rod  is  attached  to  a body  of  mass  m.  The  mass  of  the  rod  is 
negligible  compared  to  the  mass  of  the  body,  and  the  size  of  the  body  is 
negligible compared to the length of the rod. The body’s position is specified
by its angular coordinate φ, measured in radians from the downward
vertical, with counterclockwise rotations of the body corresponding to positive
values of φ. Two forces are acting on the body. They are the force mg
exerted  by  gravity  and  the  force  R exerted  by  the  rod.  The  direction  of  mg 
is  downward.  As  was  discussed  in  Sec.  6-1,  the  direction  of  R is  either  in- 
ward or  outward  along  the  rod.  Which  it  is  depends  in  general  on  both  the 
speed  of  the  body  and  its  position.  But  if  the  magnitude  of  the  angular 
coordinate φ never exceeds π/2 in an oscillation, so that the rod is never inclined
above the horizontal, then R will never be directed outward along
the  rod.  (Use  your  mechanical  intuition  to  justify  this  statement.)  In  such  a 
case  the  rod  can  just  as  well  be  a cord  of  the  same  length.  Both  a cord  and  a 
rod  can  supply  whatever  force  is  required  to  make  the  body  move  in  a cir- 
cular path,  if  that  force  is  always  directed  inward  to  the  center  of  the  path.  A 
cord  can  pull  on  something  just  as  well  as  a rod  can.  But  if  the  force  is 
directed  outward  from  the  center,  it  can  be  supplied  only  by  a rod.  A cord  is 
not  capable  of  pushing  on  something.  In  summary,  we  will  analyze  oscilla- 
tory motion  in  a vertical  plane  of  a friction-free  system  consisting  of  a com- 
pact, massive  body  at  one  end  of  a rod  whose  other  end  rotates  about  a 
fixed  point,  or  at  one  end  of  a cord  if  the  body  is  never  higher  than  the 
fixed  point  about  which  the  other  end  of  the  cord  rotates.  In  either  case, 
the  system  is  called  a simple  pendulum,  and  the  body  is  called  a bob. 

The  oscillatory  motion  of  the  pendulum  bob  back  and  forth  through 
its  curved  path  is  driven  by  the  component  tangent  to  its  path  of  the  net 
force  acting  on  the  bob.  We  know  from  the  considerations  of  Sec.  6-1  that 
this tangential net force component F_t is what always tends to move the bob
toward its equilibrium position, where φ = 0. The net force acting on the
bob  also  has  a component  which  is  perpendicular  to  its  path  (except  at  the 
extremes  of  the  oscillation  where  the  body  is  motionless).  This  component 
produces  the  centripetal  acceleration  arising  from  the  curvature  of  the 
path.  But  here  we  are  not  interested  in  the  centripetal  acceleration,  so  we 
are  not  interested  in  the  perpendicular  component  of  the  net  force. 

By  referring  to  Sec.  6-1,  or  by  inspecting  Fig.  6-10,  we  see  that  the 
tangential  component  of  the  net  force  acting  on  the  pendulum  bob  is  given 
by 

$$F_t = -mg\,\sin\phi \tag{6-6}$$




Fig. 6-11 The infinitesimal displacement of the pendulum bob when there is
an infinitesimal change in its angular coordinate. No matter what units are
used to measure angles, the magnitude ds of the displacement will be
proportional to the magnitude dφ of the angular change and also proportional to
the length l of the pendulum rod or cord. That is, ds ∝ l dφ. But when angles
are measured in radians, the proportionality constant has the convenient
value 1, so that ds = l dφ. This is why we measure angles in radians.


The minus sign makes this expression correctly describe the fact that the
tangential force is in the clockwise direction (F_t is negative) when the bob
lies in the counterclockwise direction from its stable equilibrium position
(sin φ is positive), and vice versa.

To analyze pendulum oscillations, we need to obtain an expression for
a_t, the tangential component of the acceleration of the pendulum bob. This
acceleration component along its path is associated with the change in the
magnitude of its velocity, that is, with the change in its speed. (See Sec. 3-4,
where this fact is developed and contrasted with the fact that the acceleration
component perpendicular to the path is associated with the change in
the direction of the velocity.) In fact, the bob’s rate of change of speed gives
the value of the acceleration along its path, just as the rate of change of
speed of a body moving on a straight line gives the value of the acceleration
of the body along its path. If we use the infinitesimal displacement vector ds
to describe the change in position of the bob during an infinitesimal time
interval dt, we can write its velocity as the vector relation v = ds/dt, in which
both v and ds are vectors. (Again see Sec. 3-4.) The speed v of the bob can then
be written as v = ds/dt, where ds is now the magnitude of the displacement
vector. Figure 6-11 shows that ds can be expressed in terms of
the length l of the pendulum rod or cord and the infinitesimal change dφ in
the coordinate of the pendulum bob during the time dt. Specifically, it
shows that


$$ds = l\,d\phi$$

so

$$\frac{ds}{dt} = \frac{l\,d\phi}{dt} = l\,\frac{d\phi}{dt}$$

The rate of change of speed is

$$\frac{dv}{dt} = \frac{d}{dt}\left(l\,\frac{d\phi}{dt}\right)$$

Since l is a constant, this is

$$\frac{dv}{dt} = l\,\frac{d}{dt}\left(\frac{d\phi}{dt}\right)$$

Expressed in terms of a second derivative, the rate of change of speed is

$$\frac{dv}{dt} = l\,\frac{d^2\phi}{dt^2}$$

Since the rate of change of its speed gives the value of the pendulum bob's
acceleration along its path, and since that quantity is its tangential acceleration
component a_t, we have

$$a_t = l\,\frac{d^2\phi}{dt^2} \tag{6-7}$$


This expression gives the sign of a_t as well as its magnitude. For instance, if
the bob is moving counterclockwise with increasing speed, then dφ/dt is becoming
more positive. So d²φ/dt² is positive and a_t is positive, in agreement
with the fact that the tangential acceleration is directed counterclockwise. If
the bob is moving counterclockwise but the speed is decreasing, then dφ/dt
is becoming less positive, so d²φ/dt² and a_t are negative. This agrees with the
fact that the tangential acceleration is directed clockwise. (You should go
through similar arguments for the two situations in which the bob is moving
clockwise.)


Newton’s  second  law  requires  that  the  tangential  component  of  the  net 
force  acting  on  the  pendulum  bob  equal  its  mass  times  the  tangential  com- 
ponent of  its  acceleration.  That  is, 


$$F_t = m a_t$$


Using  Eqs.  (6-6)  and  (6-7),  we  have 


$$-mg\,\sin\phi = ml\,\frac{d^2\phi}{dt^2} \tag{6-8a}$$

or, canceling and transposing,

$$\frac{d^2\phi}{dt^2} = -\frac{g}{l}\,\sin\phi \tag{6-8b}$$

If we define the parameter α to be

$$\alpha = \frac{g}{l} \tag{6-9}$$


we  can  write  the  pendulum  equation  as 


$$\frac{d^2\phi}{dt^2} = -\alpha\,\sin\phi \tag{6-10}$$


The parameter α contained in the pendulum equation has a value determined
by the gravitational acceleration g and the length l of the pendulum
rod or cord. The equation itself determines how the pendulum bob will
move, once it is started in a particular way. Note that the equation does not
involve the mass of the bob. The m on the left side of Eq. (6-8a), which is a
manifestation of the gravitational role of the body’s mass, cancels the m on
the right side, which is a manifestation of its inertial role. The pendulum is
yet another example of a system where motion does not depend on mass.


In order to compare the pendulum equation with Eq. (6-3), a = −αx,
which determines the motion of a body at the end of a spring, we will
rewrite the latter in calculus notation. Using the definition of acceleration,
a = d²x/dt², the body-and-spring equation becomes

$$\frac{d^2x}{dt^2} = -\alpha x \tag{6-11}$$

where

$$\alpha = \frac{k}{m}$$

Two differences between Eqs. (6-10) and (6-11) are that they involve the
different dependent variables φ and x, and that the meaning of the constant
α is not the same. But these differences are mathematically (though
not physically) unimportant. The substantial difference is the presence of
the sine function in Eq. (6-10). If sin φ were equal to φ (with φ expressed in
radians), then Eq. (6-10) would be mathematically identical to Eq. (6-11).
Of course, it is not true that sin φ = φ. But it is approximately true if φ is
small compared to 1 rad. (See Example 2-7.)
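
To get a feeling for how good the approximation is, you can tabulate sin φ against φ.
The short Python sketch below does so; the function name small_angle_error and the
particular angles chosen are illustrative.

```python
import math

def small_angle_error(phi):
    """Fractional error made by replacing sin(phi) with phi (phi in radians)."""
    return (phi - math.sin(phi)) / math.sin(phi)

for phi in (0.05, 0.25, 0.5, 1.0, 1.5):          # angles in radians
    print(f"phi = {phi:4.2f} rad   sin(phi) = {math.sin(phi):.4f}   "
          f"error of sin(phi) = phi: {100 * small_angle_error(phi):5.1f} %")
```

The error is about 1 percent at 0.25 rad but about 50 percent at 1.5 rad, the two
amplitudes used in the pendulum examples of this section.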



Thus when a pendulum is performing oscillations of sufficiently small
amplitude that sin φ ≈ φ can be used as a good approximation, the pendulum
equation can be written

$$\frac{d^2\phi}{dt^2} = -\alpha\phi \qquad \text{for } \phi \ll 1 \tag{6-12}$$

where

$$\alpha = \frac{g}{l}$$

Providing the restriction on φ is satisfied through the swing of the pendulum,
this is the mathematical equivalent of the body-and-spring equation.
But the dependent variable is now the angular coordinate φ, and the
constant α is now the gravitational acceleration g divided by the length l of
the pendulum rod or cord. In both equations α has the same units. For Eq.
(6-11) the units are N/(m·kg). If we use the definition 1 N = 1 kg·m·s⁻²,
this reduces to s⁻². For Eq. (6-12) the units are m·s⁻²/m, which reduces
immediately to s⁻².

Because  the  equations  determining  their  motion  are  mathematically 
identical,  everything  we  learned  in  Sec.  6-2  about  the  motion  of  the  body  at 
the  end  of  the  spring  carries  over  directly  to  the  motion  of  a pendulum  bob 
undergoing  small  oscillations.  So  we  can  conclude  immediately  that  the 
period T (or frequency ν) of a pendulum does not depend on the amplitude
of its oscillation, providing the amplitude remains small. But T does depend
on g and l, since these quantities determine the value of α. An equation
relating T to α for the body and spring, and therefore for the small
oscillations of a pendulum, is the one found from numerical calculations in
Sec. 6-2: T = 2π/√α. It is obtained in Sec. 6-5 from analytical calculations.

What  about  large  oscillations  of  a pendulum?  It  is  very  easy  to  modify 
the  calculator  or  computer  program  so  that  it  will  solve  Eq.  (6-10),  which 
applies  to  pendulum  oscillations  of  any  amplitude,  instead  of  Eqs.  (6-11)  or 
(6-12),  which  apply  to  the  body  and  spring  or  to  a pendulum  executing 
small  oscillations.  As  you  will  see  by  looking  at  the  pendulum  program 
listed  in  the  Numerical  Calculation  Supplement,  doing  so  is  just  a matter  of 
inserting  an  instruction  to  take  the  sine  of  the  dependent  variable  at  the 
appropriate  place  in  the  program,  and  then  to  use  it  instead  of  the  depen- 
dent variable  itself.  The  calculating  device  is  set  to  run  in  the  radian  mode, 
and the results displayed are interpreted as the angular coordinate φ.
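
The modification described above can be sketched in Python as follows. This is an
assumed reconstruction using a half-step scheme, not the program listed in the
Numerical Calculation Supplement. The name run_pendulum and the switch use_sine are
illustrative; turning the switch off omits the sine, which gives back the
small-oscillation form of the equation. Python's math.sin works in radians, so no
separate mode setting is needed.

```python
import math

def run_pendulum(phi0, dphidt0, alpha, dt, t_end, use_sine=True):
    """Half-step integration of d2phi/dt2 = -alpha*sin(phi).

    With use_sine=False the sine is omitted, giving d2phi/dt2 = -alpha*phi.
    Returns the list of angles (in radians) at whole steps.
    """
    def Q(phi):
        return -alpha * (math.sin(phi) if use_sine else phi)

    phi = phi0
    w = dphidt0 + Q(phi) * dt / 2            # dphi/dt at the first half-step
    phis = [phi]
    for _ in range(round(t_end / dt)):
        phi = phi + w * dt                   # angle at the next whole step
        phis.append(phi)
        w = w + Q(phi) * dt                  # dphi/dt at the next half-step
    return phis

for phi0 in (0.25, 1.5):
    with_sine = run_pendulum(phi0, 0.0, alpha=1.0, dt=0.2, t_end=10.0)
    without = run_pendulum(phi0, 0.0, alpha=1.0, dt=0.2, t_end=10.0, use_sine=False)
    gap = max(abs(a - b) for a, b in zip(with_sine, without))
    print(f"phi0 = {phi0} rad: largest difference between the two runs = {gap:.3f} rad")
```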


EXAMPLES 6-4 AND 6-5

Run  the  pendulum  program  with  the  following  two  sets  of  initial  conditions  and 
parameters: 

φ₀ = 0.25 (in rad); (dφ/dt)₀ = 0; t₀ = 0; Δt = 0.2 (in s); α = 1 (in s⁻²)

φ₀ = 1.5 (in rad); (dφ/dt)₀ = 0; t₀ = 0; Δt = 0.2 (in s); α = 1 (in s⁻²)

■ In both examples the pendulum is given a certain initial displacement φ₀ from
the position of stable equilibrium and then released with no motion, that is, with
(dφ/dt)₀ = 0. The subsequent oscillations are plotted in Fig. 6-12.




Compare the pendulum oscillation in φ of amplitude 0.25 rad,
shown in Fig. 6-12, with the body-and-spring oscillation in x of amplitude
0.25 m shown in Fig. 6-6. Both plots are for the same value α = 1 s⁻². The
comparison  will  make  it  clear  that  for  the  reasonably  small  amplitude 
0.25  rad  (about  14°)  the  plot  describing  the  oscillation  of  a pendulum 
bob  is  indistinguishable  from  the  one  describing  the  oscillation  of  a body  at 
the  end  of  a spring.  This  confirms  our  earlier  conclusion  that  the  behavior 
of  the  two  systems  should  be  essentially  identical,  providing  the  amplitude 
of  the  pendulum  oscillations  is  small. 

Fig. 6-12 The angular coordinate φ versus the time t for small- and large-amplitude
oscillations of a pendulum. The parameter α = g/l, the ratio of the magnitude of the
gravitational acceleration to the length of the pendulum rod or cord, has the value 1 s⁻².


Fig. 6-13 The force acting in the direction of motion on a pendulum bob
and the Hooke’s-law force acting on a body at the end of a spring. For the sake
of comparison, the parameter α is assumed to have the same numerical value
in both systems. Also, the pendulum oscillations are assumed to be confined
to the angular range −π/2 ≤ φ ≤ +π/2, so that the pendulum can be
constructed with a cord as well as with a rod.


But this is not so for a pendulum executing large oscillations. Compare
the plot shown in Fig. 6-12 for a pendulum oscillation of amplitude 1.5 rad
(approximately 86°) with the plot in that figure for the pendulum oscillation
of amplitude 0.25 rad. Both are for the value α = 1 s⁻². But they do
not have the same period T. The period is increased (and so the frequency
is reduced) for the large-amplitude pendulum oscillation. Thus the period
of a pendulum does depend on the amplitude of its oscillation if the amplitude
becomes large. This contrasts sharply with the way the period of oscillation
of a body at the end of a spring remains independent of the amplitude
as long as the spring continues to obey Hooke’s law. If you compare
the force acting in the direction of motion on a pendulum bob with the
Hooke’s-law force acting on a body at the end of a spring, you can understand
qualitatively the reason for the difference in behavior of the two systems.
The comparison is shown in Fig. 6-13 for a case in which α has the
same value in both systems. This means the slope of both force-versus-coordinate
curves will be equal at the position of stable equilibrium where the
value of the coordinate is zero. For small oscillations of the pendulum bob
about the stable equilibrium position, its behavior is indistinguishable from
the behavior of the body at the end of a spring. The reason is that for small
φ, where sin φ = φ to a very good approximation, the dependence of the
force on the coordinate is essentially the same in both systems, and so the
motion produced by the force is essentially the same. But for larger oscillations
the force acting on the pendulum bob becomes weaker than the force
acting on the body at the end of a spring. This makes the pendulum bob
take more time to complete a cycle of its oscillation. In other words, the
period increases. It is a simple matter to run the pendulum program for a
variety of amplitudes, and thereby determine how the period depends on
the amplitude. Results are shown in Fig. 6-14.

The curve in Fig. 6-12 for the larger-amplitude oscillation of a pendulum
is not simply a scaled-up version of the smaller-amplitude curve in
that figure. It cannot be a scaled-up version since the periods for the two
amplitudes are not the same. Thus the scaling property, seen in Fig. 6-6 for
the body-and-spring oscillations, does not apply to pendulum oscillations.
The preceding paragraph gives a qualitative physical explanation of this
fact; a quantitative mathematical explanation is given in Sec. 6-5.


Fig. 6-14 The period T of a pendulum for several values of its amplitude φ_max. The
value of the parameter α = g/l is 1 s⁻². The amplitudes do not exceed π/2 rad, so the
data apply to a pendulum constructed with either a cord or a rod. For larger amplitudes
a rod must be used. But there the dependence of T on φ_max becomes much more
pronounced. You will see this if you use the pendulum program to obtain one or two
data points for φ_max near π rad.
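
The way the period grows with amplitude can be explored with a calculation of the
following kind. The Python sketch is an assumed half-step reconstruction, not the
book's pendulum program; the name pendulum_period, the step size 0.001 s, and the
list of amplitudes are illustrative choices.

```python
import math

def pendulum_period(phi_max, alpha=1.0, dt=0.001):
    """Period of d2phi/dt2 = -alpha*sin(phi) for a bob released from rest at phi_max."""
    phi, t = phi_max, 0.0
    w = (-alpha * math.sin(phi)) * dt / 2      # dphi/dt at the first half-step
    crossings = []                              # times of downward passage through phi = 0
    while len(crossings) < 2:
        phi_new = phi + w * dt
        if phi > 0.0 >= phi_new:
            # linear interpolation for the crossing time
            crossings.append(t + dt * phi / (phi - phi_new))
        phi, t = phi_new, t + dt
        w += (-alpha * math.sin(phi)) * dt      # dphi/dt at the next half-step
    return crossings[1] - crossings[0]

for phi_max in (0.25, 0.5, 0.75, 1.0, 1.25, 1.5):
    print(f"phi_max = {phi_max:4.2f} rad   T = {pendulum_period(phi_max):.3f} s")
```

Calling pendulum_period with an amplitude close to π rad shows the much stronger
dependence mentioned in the caption of Fig. 6-14; that case corresponds to a pendulum
built with a rod.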


6-4  NUMERICAL 
SOLUTION  OF 
DIFFERENTIAL 
EQUATIONS 


The  numerical  method  for  determining  the  motion  of  a body  whose  accel- 
eration varies  according  to  a known  rule,  when  its  position  and  velocity  at 
some  initial  instant  are  given,  was  introduced  in  Sec.  5-6.  There  the  method 
was  used  on  the  skydiver  equation, 

$$a = g - \gamma v^2$$

It was treated as an algebraic equation, and the numerical method for handling
it was explained as a sequence of algebraic calculations. The same
point of view was used in Sec. 6-2 for the numerical treatment of the equation
for a body and spring,

$$a = -\alpha x$$

But in Sec. 6-3 the development of the pendulum equation led naturally to
its expression in terms of a derivative in Eq. (6-10):

$$\frac{d^2\phi}{dt^2} = -\alpha\,\sin\phi$$

We then found it convenient to express the body-and-spring equation in
terms of a derivative in Eq. (6-11):

$$\frac{d^2x}{dt^2} = -\alpha x$$

The skydiver equation can also be written in terms of derivatives. In this
form it is

$$\frac{d^2x}{dt^2} = g - \gamma\left(\frac{dx}{dt}\right)^2 \tag{6-13}$$


Equations (6-10), (6-11), and (6-13) are called differential equations because
they are equations in which the “unknown,” the dependent variable
φ or x, occurs in derivatives. Specifically, they are called second-order,
ordinary differential equations, with “second order” meaning that no
third- or higher-order derivatives are found in the equations and “ordinary”
meaning that each equation involves only a single independent variable,
such as t. Equation (6-11) is said to be a linear differential equation
since each of its terms is proportional to the first power of the dependent
variable x, while Eq. (6-10), or Eq. (6-13), is called a nonlinear equation
since some of the terms are not proportional to the first power of the
dependent variable φ, or x. Thus you have actually been using the numerical
method to obtain numerical solutions of linear and nonlinear
second-order, ordinary differential equations!

The  fundamental  relation  of  newtonian  mechanics  is  the  second- 
order,  ordinary  differential  equation 

$$\frac{d^2x}{dt^2} = \frac{F}{m}$$




It  may  be  linear  or  nonlinear,  depending  on  the  mathematical  form  of  F. 
Furthermore,  a great  variety  of  phenomena  of  the  physical  universe  lead 
through their analysis to second-order, ordinary differential equations.
You  will  see  many  examples  in  this  book  — particularly  in  connection  with 
planetary  and  satellite  motion,  mechanical  and  electromagnetic  wave  mo- 
tion, electric  circuits,  and  quantum  mechanics. 

Even  though  their  analytical  solutions  may  be  difficult  or  impossible, 
all  these  equations,  and  many  more,  can  be  handled  by  applying  a single 
numerical  method.  This  method  is  a generalization  of  the  numerical 
method  used  in  Secs.  5-6,  6-2,  and  6-3.  The  generalized  method  is  applied 
in  the  same  way  to  any  second-order,  ordinary  differential  equation. 

The  numerical  method  for  solving  second-order,  ordinary  differential 
equations  goes  as  follows.  You  manipulate  the  differential  equation  so  as  to 
isolate  the  second  derivative  on  the  left  side  of  the  equality.  If  x and  t are 
the dependent and independent variables, respectively, you then have

$$\frac{d^2x}{dt^2} = Q \tag{6-14}$$

The quantity Q represents whatever remains on the right side of the equality.
Thus Q may involve any functions of any of or all the quantities x, t, and
dx/dt, as well as any constants. For instance, in the body-and-spring equation
Q is −αx. In the skydiver equation Q is g − γ(dx/dt)². The symbol Q is
used, instead of the symbol a, since in general the second derivative need
not be an acceleration. Next, you convert Eqs. (5-35) to more general forms
by writing Q for a and dx/dt for v.

The  new  equations  are  used  in  the  same  way  that  you  used  Eqs.  (5-35). 
To be specific, this is what you do. Always employing the differential equation
to express Q in terms of the quantities on which it depends, you select a
small  time  increment  At  and  then  carry  out  the  following  set  of  calculations: 

Determine the value of Q from the values at t₀ = 0 of the quantities on
which it depends, and use it to calculate

$$\left(\frac{dx}{dt}\right)_{1/2} = \left(\frac{dx}{dt}\right)_0 + Q\,\frac{\Delta t}{2} \tag{6-15a}$$

from the given value of (dx/dt)₀. Use the result to calculate

$$x_1 = x_0 + \left(\frac{dx}{dt}\right)_{1/2}\Delta t \tag{6-15b}$$

from the given value of x₀. Then set t₁ = Δt.

Next determine the value of Q from the new values of the quantities on
which it depends. Use it, and the value of (dx/dt)₁/₂ just obtained, to calculate

$$\left(\frac{dx}{dt}\right)_{3/2} = \left(\frac{dx}{dt}\right)_{1/2} + Q\,\Delta t \tag{6-15c}$$

Use the result, and the value of x₁ just obtained, to calculate

$$x_2 = x_1 + \left(\frac{dx}{dt}\right)_{3/2}\Delta t \tag{6-15d}$$

Then set t₂ = 2Δt.

Next determine the value of Q from the new values of the quantities on
which it depends. Use it, and the value of (dx/dt)₃/₂ just obtained, to calculate

$$\left(\frac{dx}{dt}\right)_{5/2} = \left(\frac{dx}{dt}\right)_{3/2} + Q\,\Delta t \tag{6-15e}$$

Use the result, and the value of x₂ just obtained, to calculate

$$x_3 = x_2 + \left(\frac{dx}{dt}\right)_{5/2}\Delta t \tag{6-15f}$$

Then set t₃ = 3Δt.

Continue these calculations until t reaches whatever value is required.
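
The sequence of calculations just described can be sketched as a single Python
function. This is one possible reading of the procedure, not a listing from the
Numerical Calculation Supplement. The name solve_second_order and the convention that
Q is supplied as a function of (x, t, dx/dt) are assumptions made for illustration.

```python
import math

def solve_second_order(Q, x0, dxdt0, dt, n_steps, t0=0.0):
    """Half-step integration of d2x/dt2 = Q(x, t, dxdt).

    Returns a list of (t, x) pairs at whole steps. The first-derivative value
    handed to Q is the most recent one available: the given initial value on
    the first call, the latest half-step value afterward.
    """
    t, x = t0, x0
    dxdt = dxdt0 + Q(x, t, dxdt0) * dt / 2     # first update uses half an increment
    points = [(t, x)]
    for _ in range(n_steps):
        x = x + dxdt * dt                       # new x from the half-step derivative
        t = t + dt
        points.append((t, x))
        dxdt = dxdt + Q(x, t, dxdt) * dt        # next half-step derivative
    return points

alpha = 1.0
spring = solve_second_order(lambda x, t, dxdt: -alpha * x,
                            x0=0.25, dxdt0=0.0, dt=0.2, n_steps=40)
pendulum = solve_second_order(lambda phi, t, dphidt: -alpha * math.sin(phi),
                              x0=1.5, dxdt0=0.0, dt=0.2, n_steps=40)
print("spring at t = 8 s:  ", spring[-1])
print("pendulum at t = 8 s:", pendulum[-1])
```

Only the function passed as Q changes from one differential equation to the next;
the stepping itself is the same in every case.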


The  justification  of  these  equations  for  a small  value  of  At  is  apparent 
if  you  remember  that  the  first  derivative  is  the  rate  of  change  of  the 
dependent  variable,  that  the  second  derivative  is  the  rate  of  change  of  the 
first  derivative,  and  that  Q equals  the  second  derivative.  Just  read  what  the 
equations say, in words. Equations (6-15a), (6-15c), (6-15e), and so forth say
the new value of the first derivative is approximated by the old value plus
the rate of change of the first derivative, Q, multiplied by the increment in
the independent variable. Equations (6-15b), (6-15d), (6-15f), and so forth
say  that  the  new  value  of  the  dependent  variable  is  approximated  by  the  old 
value  plus  the  rate  of  change  of  the  dependent  variable,  dx/dt,  multiplied 
by  the  increment  in  the  independent  variable. 

If  the  dependent  and/or  independent  variables  are  other  than  x and  t, 
all  you  have  to  do  is  rewrite  these  equations  with  the  appropriate  symbols 
replacing x and/or t. For example, in the pendulum equation the dependent
variable is φ, instead of x, and Q is −α sin φ. The numerical method
is  applicable  to  almost  any  second-order,  ordinary  differential  equation.  Its 
only  general  limitation  is  that  Q must  be  finite  everywhere  within  the  range 
over  which  it  must  be  evaluated. 


It  is  difficult  to  predict  the  accuracy  of  the  method  in  any  particular  applica- 
tion. However,  it  is  almost  always  true  that  (1)  the  accuracy  increases  as  you  de- 
crease the  size  of  the  increment  in  the  independent  variable;  (2)  the  larger  the 
value  of  Q,  the  smaller  the  size  of  the  increment  you  must  use  to  achieve  a certain 
accuracy;  (3)  the  accuracy  decreases  when  Q involves  a first  derivative.  You  can 
understand  the  third  comment  by  noting  that  the  dependent  and  independent 
variables  are  both  evaluated  in  Q at  the  middle  of  the  interval  where  Q is  used.  But 
if the first derivative is present in Q, as is the case in the skydiver equation, then
the  first  derivative  is  evaluated  at  the  beginning  of  the  interval.  This  makes  the 
fixed  value  used  to  represent  Q in  the  interval  a poorer  representation  of  the  range 
of  values  actually  assumed  by  Q through  the  interval.  Fortunately,  it  is  often  true 
that  where  the  first  derivative  is  large,  its  rate  of  change  is  small,  so  that  the  value 
of  Q is  not  too  sensitive  to  where  the  first  derivative  is  evaluated.  The  skydiver 
equation  is  an  example. 
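
The first of these comments is easy to check for the body-and-spring equation, whose
solution for a release from rest is x = x₀ cos(√α t) (the analytical form obtained in
Sec. 6-5). The Python sketch below compares a half-step calculation with that cosine
for several sizes of the increment. The name max_error and the particular increments
are illustrative assumptions.

```python
import math

def max_error(dt, alpha=1.0, x0=0.25, t_end=10.0):
    """Largest |numerical - exact| for d2x/dt2 = -alpha*x released from rest at x0."""
    x = x0
    v = (-alpha * x) * dt / 2                  # dx/dt at the first half-step
    worst = 0.0
    for n in range(1, round(t_end / dt) + 1):
        x += v * dt                            # x at the next whole step
        v += (-alpha * x) * dt                 # dx/dt at the next half-step
        exact = x0 * math.cos(math.sqrt(alpha) * n * dt)
        worst = max(worst, abs(x - exact))
    return worst

for dt in (0.4, 0.2, 0.1, 0.05):
    print(f"dt = {dt:4.2f} s   largest error over 10 s = {max_error(dt):.5f} m")
```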


All  the  numerical  calculations  given  in  Secs.  5-6,  6-2,  and  6-3  are  really 
examples  of  the  numerical  method  for  solving  second-order,  ordinary  dif- 
ferential equations.  We  will  consider  several  more  examples  in  Sec.  6-6  and 
many  others  later  in  the  book.  But  now  we  turn  our  attention  to  an  analyti- 
cal solution  of  the  equation  that  we  solved  numerically  in  Sec.  6-2. 




6-5  ANALYTICAL 
SOLUTION  OF 
THE  HARMONIC 
OSCILLATOR 
EQUATION 


For  many  of  the  differential  equations  that  arise  in  physics,  solutions  can  be 
found  by  carrying  out  mathematical  analyses  that  do  not  involve  numerical 
procedures.  In  some  cases  it  is  quite  easy  to  find  such  analytical  solutions, 
while  in  others  the  analytical  solutions  are  very  difficult  to  obtain.  But 
where  there  are  analytical  solutions,  they  can  be  expressed  in  a very  concise 
form  (often  in  terms  of  the  simple  and  familiar  functions  that  can  be  evalu- 
ated by  pressing  one  or  two  keys  on  a scientific  calculator).  So  for  many 
purposes  analytical  solutions  are  much  more  convenient  to  use  than  the 
graphs  or  tables  which  must  be  employed  to  express  numerical  solutions  of 
differential  equations. 

A major  obstacle  to  obtaining  analytical  solutions  to  differential  equa- 
tions is  that  there  is  no  single  method  that  works  on  all  those  equations  that 
do  have  such  solutions.  Instead,  there  are  many  different  analytical  proce- 
dures, each  suited  only  to  certain  differential  equations.  Almost  all  the 
methods  do,  however,  have  a common  approach:  (1)  You  use  whatever 
intuition  you  have  available  to  guess  at  the  mathematical  form  for  the  solu- 
tions to  the  equation.  If  you  are  wise,  you  will  build  flexibility  into  your 
guess  by  including  in  the  assumed  mathematical  form  constants  whose  val- 
ues can  be  subsequently  adjusted  to  make  the  form  fit  the  specific  require- 
ments of  the  differential  equation.  (2)  Next  you  evaluate  the  appropriate 
derivatives  of  the  solutions  whose  form  you  have  assumed,  and  substitute 
them  into  the  equation.  (3)  Then  you  explore  the  consequences  of  this  sub- 
stitution. That  is,  you  see  if  it  is  possible  to  adjust  the  values  of  one  or  more 
of  the  constants  so  as  to  obtain  results  which  are  mathematically  consistent. 
(You  will  see  immediately  below  an  example  of  what  is  meant  by  “mathe- 
matically consistent.”)  (4)  If  you  succeed,  you  have  found  solutions  to  the 
differential  equation.  (5)  If  you  fail,  because  substituting  your  guess  into 
the  equation  leads  to  an  irreconcilable  inconsistency,  you  must  make  an- 
other guess  aimed  at  removing  the  inconsistency  and  then  try  again. 


Let  us  see  if  we  can  use  this  approach  to  find  analytical  solutions  of  the 
equation  determining  the  motion  of  a body  at  the  end  of  a spring: 


$$\frac{d^2x}{dt^2} = -\alpha x \qquad \text{(6-16)}$$


First  we  will  sharpen  our  intuition  by  looking  back  at  the  results  we  ob- 
tained in  solving  the  equation  numerically.  Figure  6-6  shows  that  the  equa- 
tion has  oscillatory  solutions  which  look  like  cosine  functions  of  t.  Figure 
6-9  shows  it  also  has  solutions  that  look  like  sine  functions  of  t.  Thus  its  so- 
lutions appear  to  be  sinusoids  (a  word  that  includes  both  sine  and  cosine) 
which  depend  on  t.  Figure  6-7  shows  that  the  sinusoid  oscillates  more  or 
less rapidly depending on the value of α. These properties lead us to guess
that  solutions  to  the  differential  equation  might  be  of  the  form 

$$x \propto \cos(\omega t + \delta)$$

The value of the constant δ can be adjusted to make this single form represent any sinusoid that is required. For example, if δ = 0, then the form describes a cosine, as in Fig. 6-6. If δ = −π/2, then it describes a sine, as in Fig. 6-9, since cos(ωt − π/2) = cos(π/2 − ωt) = sin(ωt). The value of the constant ω can be adjusted to make the rapidity of the oscillation fit any requirement. For instance, a large ω will make a small change in t take the sinusoid through a large part of a cycle, and thereby lead to a rapid oscillation.




The  proportionality  can  be  written  as  an  equality, 

$$x = A\cos(\omega t + \delta) \qquad \text{(6-17)}$$

by  introducing  the  adjustable  constant  A.  Note  that  the  form  of  Eq.  (6-17)  is 
consistent  with  what  we  learned  from  Fig.  6-6  about  the  scaling  property  of 
the  solutions  to  the  equation.  That  is,  using  different  values  of  A will  pro- 
duce solutions  which  differ  only  in  their  amplitude.  The  form  in  Eq.  (6-17) 
is  the  one  we  will  assume  for  the  solutions  of  the  differential  equation,  Eq. 
(6-16).  If  we  had  no  prior  mathematical  intuition  about  that  equation,  be- 
cause we  had  not  already  studied  some  of  its  numerical  solutions,  we 
would  have  to  base  our  assumption  on  physical  intuition  obtained  from 
experimental  study  of  the  oscillations  of  a body  at  the  end  of  a spring. 

Now  we  will  prepare  to  test  the  validity  of  the  form  in  Eq.  (6-17)  by 
computing  its  second  derivative.  We  begin  by  computing  the  first  deriva- 
tive: 

$$\frac{dx}{dt} = \frac{d[A\cos(\omega t + \delta)]}{dt}$$

Since  A is  a constant,  this  is 

$$\frac{dx}{dt} = A\,\frac{d[\cos(\omega t + \delta)]}{dt} \qquad \text{(6-18)}$$


To  proceed,  we  must  employ  the  “chain  rule”  of  differential  calculus. 
Given any function f of any variable u, the rule states that

$$\frac{d[f(u)]}{dt} = \frac{d[f(u)]}{du}\,\frac{du}{dt}$$

We set f(u) = cos u. Then the rule says

$$\frac{d(\cos u)}{dt} = \frac{d(\cos u)}{du}\,\frac{du}{dt} \qquad \text{(6-19)}$$


Now d(cos u)/du = −sin u. [If you are not familiar with this relation, you can obtain it by taking ω = 1 and t = u in Eq. (2-18).] So we have

$$\frac{d(\cos u)}{dt} = -\sin u\,\frac{du}{dt}$$

Next we let u = ωt + δ and obtain

$$\frac{d[\cos(\omega t + \delta)]}{dt} = -\sin(\omega t + \delta)\,\frac{d(\omega t + \delta)}{dt}$$

Since ω and δ are constants, the derivative on the right side yields ω. Thus

$$\frac{d[\cos(\omega t + \delta)]}{dt} = -\omega\sin(\omega t + \delta) \qquad \text{(6-20)}$$

Using this relation in Eq. (6-18) produces immediately the result

$$\frac{dx}{dt} = -A\omega\sin(\omega t + \delta) \qquad \text{(6-21a)}$$




In  a very  similar  computation  we  use  this  result  to  find  the  second  deriva- 
tive. We  have 


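$$\begin{aligned}
\frac{d^2x}{dt^2} = \frac{d}{dt}\left(\frac{dx}{dt}\right) &= \frac{d[-A\omega\sin(\omega t + \delta)]}{dt} \\
&= -A\omega\,\frac{d[\sin(\omega t + \delta)]}{dt} \\
&= -A\omega\cos(\omega t + \delta)\,\frac{d(\omega t + \delta)}{dt}
\end{aligned}$$

This gives

$$\frac{d^2x}{dt^2} = -A\omega^2\cos(\omega t + \delta) \qquad \text{(6-21b)}$$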


Next we substitute Eqs. (6-17) and (6-21b) into the differential equation we hope that they satisfy. That is, we substitute the expressions for x and for d²x/dt² into Eq. (6-16), d²x/dt² = −αx. We obtain

$$-A\omega^2\cos(\omega t + \delta) = -\alpha A\cos(\omega t + \delta) \qquad \text{(6-22)}$$

Can  this  equation  be  mathematically  consistent?  Indeed  it  can!  If  we  adjust 
the value of ω so that

$$\omega^2 = \alpha$$

or

$$\omega = \sqrt{\alpha} \qquad \text{(6-23)}$$

then Eq. (6-22) is satisfied identically. That is, both sides of Eq. (6-22) have the same value under all circumstances. This means that the form in Eq. (6-17) will then be a solution to Eq. (6-16). [The positive root is taken in Eq. (6-23) since, by convention, the quantity ω is always positive. This is because ω specifies the rapidity of the oscillation; a negative ω would have no physical meaning.]

Thus  we  have  found  solutions  to  the  differential  equation,  Eq.  (6-16), 

$$\frac{d^2x}{dt^2} = -\alpha x$$

The solutions are of the form given by Eq. (6-17) with the constant ω adjusted so that ω = √α. That is, they are

$$x = A\cos(\sqrt{\alpha}\,t + \delta) \qquad \text{(6-24)}$$
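
The substitution argument can also be checked numerically. The following minimal sketch estimates d²x/dt² for the form of Eq. (6-24) by finite differences and compares it with −αx. The numerical values of A, α, and δ, the step h, and the function names are illustrative assumptions, not taken from the text.

```python
import math

def x(t, A, alpha, delta):
    """Trial solution of Eq. (6-24): x = A cos(sqrt(alpha) t + delta)."""
    return A * math.cos(math.sqrt(alpha) * t + delta)

def second_derivative(f, t, h=1e-4):
    """Central-difference estimate of d2f/dt2 at time t (h is an assumed step)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2

def residual(t, A=1.5, alpha=2.0, delta=-0.3):
    """d2x/dt2 + alpha*x, which should be close to zero if Eq. (6-16) holds."""
    f = lambda s: x(s, A, alpha, delta)
    return second_derivative(f, t) + alpha * f(t)

if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 2.0, 5.0):
        print(f"t = {t:4.1f}   residual = {residual(t):+.2e}")
```

The residuals are not exactly zero because the finite-difference derivative is only an approximation, but they should be very small compared with αA at every time tried.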

The  solutions  are  sinusoidal  functions,  which  are  also  called  harmonic 
functions.  For  this  reason  the  physical  system  whose  motion  is  described  by 
the  solutions — the  system  we  have  exemplified  by  a body  at  the  movable 
end  of  a spring  whose  other  end  is  fixed  — is  called  a harmonic  oscillator. 
And  the  differential  equation  determining  the  motion,  Eq.  (6-16),  is  called 
the  harmonic  oscillator  equation. 

The plural is used in referring to Eq. (6-24) since this actually represents a whole family of different solutions, each corresponding to a particular set of values of the constants A and δ. To determine the significance of these constants, we will evaluate the position and velocity of the oscillating body at the instant t = 0. From Eq. (6-24) we have for the initial position

$$x_0 = A\cos\delta$$




Fig. 6-15 The phase constant δ of a sinusoidal function x(t). Changing δ has no effect on its amplitude A or period T, but slides the function along the t axis. An alternative point of view is to say that changing δ has no effect at all on the function, but it does change the way that t = 0 is defined. Can you draw a sketch which illustrates this point of view?


The initial velocity is given by Eq. (6-21a), with ω = √α and t = 0:

$$\left(\frac{dx}{dt}\right)_0 = -A\sqrt{\alpha}\,\sin\delta$$


Now note that when δ = 0, then x₀ = A and (dx/dt)₀ = 0. When δ = −π/2, then x₀ = 0 and (dx/dt)₀ = A√α. For general values of δ, neither x₀ nor (dx/dt)₀ is zero. As an example, if δ = −π/4, then x₀ = 0.707A and (dx/dt)₀ = 0.707A√α. Figure 6-15 illustrates the oscillations described by Eq. (6-24) for these three values of the constant δ, which is called the phase constant. You can see that changing the phase constant corresponds to taking the same sinusoidal solution and moving it along the t axis. If δ is made more negative, the sinusoid shifts in the direction of increasing t; if δ is made more positive, the sinusoid shifts in the opposite direction.

Figure 6-15 also shows that the constant A is the amplitude of the oscillation because A is the maximum magnitude assumed by x. You can see this directly from Eq. (6-24) since the value of the cosine oscillates between −1 and +1 as t increases. Thus x, which represents the position of the body, will oscillate between −A and +A.

The two constants A and δ allow you to adjust the right side of Eq. (6-24) so that it is the particular solution to the harmonic oscillator equation appropriate to any values of x₀ and (dx/dt)₀. These values are called the initial conditions on the equation. In fact, the general expression for the solutions to any second-order, ordinary differential equation must contain two constants so that the expression can be tailored to fit the values of the dependent variable and its first derivative at some initial value of the independent variable.
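
As an illustration of how A and δ are tailored to the initial conditions, the sketch below inverts the two relations x₀ = A cos δ and (dx/dt)₀ = −A√α sin δ. The inversion formulas and the function name are not given in the text; they are an assumption that follows from squaring and adding the two relations and from taking their ratio.

```python
import math

def amplitude_and_phase(x0, v0, alpha):
    """Return (A, delta) so that x = A cos(sqrt(alpha) t + delta)
    has position x0 and velocity v0 at t = 0.

    Uses x0 = A cos(delta) and v0 = -A sqrt(alpha) sin(delta).
    """
    omega = math.sqrt(alpha)
    A = math.hypot(x0, v0 / omega)        # A^2 = x0^2 + (v0/omega)^2
    delta = math.atan2(-v0 / omega, x0)   # sin(delta) ~ -v0/omega, cos(delta) ~ x0
    return A, delta

if __name__ == "__main__":
    # Released from rest at x0 = 1: expect A = 1 and delta = 0.
    print(amplitude_and_phase(1.0, 0.0, 4.0))
    # Started at the origin moving in the positive direction: expect delta = -pi/2.
    print(amplitude_and_phase(0.0, 2.0, 4.0))
```

Using the two-argument arctangent keeps δ in the correct quadrant, which a plain arctangent of the ratio would not do.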


The period T of the harmonic oscillator, that is, the time for it to complete one full oscillation, is indicated in Fig. 6-15. Its value is determined by the coefficient of t in Eq. (6-24):

$$x = A\cos(\sqrt{\alpha}\,t + \delta)$$

Since (√α t + δ) is the “argument” of a cosine (in other words, a quantity whose cosine is to be calculated), that quantity is an angle. While the angle in radians increases by 2π, the values of the cosine oscillate through one cycle and the time increase is one period. Thus T is such that

$$[\sqrt{\alpha}\,t + \delta] + 2\pi = [\sqrt{\alpha}\,(t + T) + \delta]$$

or

$$\sqrt{\alpha}\,T = 2\pi$$

So

$$T = \frac{2\pi}{\sqrt{\alpha}} \qquad \text{(6-25a)}$$

The frequency ν has the value ν = 1/T, or

$$\nu = \frac{\sqrt{\alpha}}{2\pi} \qquad \text{(6-25b)}$$

The quantity ω that we used earlier is the angular frequency of the oscillation. It measures the rate of increase, in radians per second, of the angle specified by the argument of the cosine. Since the angle increases by 2π in each cycle, and since the frequency is the number of cycles per second, we have the relation

$$\omega = 2\pi\nu \qquad \text{(6-26)}$$

In terms of the frequency ν, the general solution of the differential equation for a harmonic oscillator, Eq. (6-17), takes the form

$$x = A\cos(2\pi\nu t + \delta) \qquad \text{(6-27)}$$

For the harmonic oscillator the value of ω is, from Eqs. (6-25b) and (6-26),

$$\omega = \sqrt{\alpha}$$

as we already know from Eq. (6-23). These results for T, ν, and ω apply to both the body and spring and the small oscillations of a pendulum, providing the proper expression for α is used. For the body and spring it is

$$\alpha = \frac{k}{m}$$

so that

$$\nu = \frac{1}{T} = \frac{1}{2\pi}\sqrt{\frac{k}{m}} \qquad \text{(6-28a)}$$

For the pendulum it is

$$\alpha = \frac{g}{l}$$

so that

$$\nu = \frac{1}{T} = \frac{1}{2\pi}\sqrt{\frac{g}{l}} \quad \text{where } \phi \ll 1 \qquad \text{(6-28b)}$$


Compare the results obtained from the analytical solution of the harmonic oscillator differential equation with those obtained earlier from solving the equation numerically. You will find the following: (1) The sinusoidal forms of the analytical solutions of Eq. (6-24) agree with the forms of the numerical solutions plotted in Figs. 6-6 and 6-9. When appropriate values of the amplitude and phase constants A and δ are used, the solutions agree in every detail. (2) The presence of the multiplicative constant A in Eq. (6-24) agrees with the scaling property seen by reading the numerical values of the two solutions plotted in Fig. 6-6. (3) The relation of Eq. (6-25a) between T, the period of the oscillator, and α, the quantity describing its mechanical properties, agrees quantitatively with the values plotted in Figs. 6-7 and 6-8, as well as with the equation inferred from those values. Of course, it is often much more convenient to have the solutions to the harmonic oscillator differential equation in the one equation provided by the analytical work than it is to have them only in the many graphs provided by the numerical work.

Equation (6-28b) predicts that the period for the small oscillations of a simple pendulum with a certain length l is the same as the period for small rotations of a conical pendulum with the same length l. The latter is given by Eq. (3-53) with θ ≪ 1 so that cos θ ≈ 1. Make a pendulum and test this prediction.
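
The small-oscillation condition attached to Eq. (6-28b) can also be explored numerically. The sketch below assumes the pendulum equation d²φ/dt² = −(g/l) sin φ and steps it forward with a velocity-Verlet scheme, which is a stand-in chosen here and not necessarily the numerical procedure used elsewhere in this chapter. The values g = 9.8 m/s² and l = 1 m, the step dt, and the function names are illustrative assumptions.

```python
import math

def small_angle_period(g=9.8, l=1.0):
    """Period from Eq. (6-28b): T = 1/nu = 2*pi*sqrt(l/g)."""
    return 2.0 * math.pi * math.sqrt(l / g)

def pendulum_period(phi0, g=9.8, l=1.0, dt=1e-4):
    """Estimate the period of a pendulum released from rest at angle phi0 (rad).

    Steps d2phi/dt2 = -(g/l) sin(phi) until phi first crosses zero; by
    symmetry that takes one quarter of a period.
    """
    if not 0.0 < phi0 < math.pi:
        raise ValueError("phi0 must lie between 0 and pi")
    phi, omega, t = phi0, 0.0, 0.0
    acc = -(g / l) * math.sin(phi)
    while True:
        # One velocity-Verlet step.
        phi_new = phi + omega * dt + 0.5 * acc * dt * dt
        acc_new = -(g / l) * math.sin(phi_new)
        omega += 0.5 * (acc + acc_new) * dt
        if phi_new <= 0.0:
            # Linear interpolation for the instant at which phi = 0.
            frac = phi / (phi - phi_new)
            return 4.0 * (t + frac * dt)
        phi, acc, t = phi_new, acc_new, t + dt

if __name__ == "__main__":
    T0 = small_angle_period()
    for phi0 in (0.01, 0.1, 0.5, 1.0):
        print(f"phi0 = {phi0:4.2f} rad   T/T0 = {pendulum_period(phi0) / T0:.4f}")
```

For a very small release angle the ratio should be close to 1, in line with Eq. (6-28b). For larger release angles the numerical period comes out longer than the small-oscillation value.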




EXAMPLE  6-6 


Fig. 6-16 A harmonic oscillator.


a.  An  object  is  connected  to  one  end  of  a horizontal  spring  whose  other  end  is 
fixed,  as  in  Fig.  6-16.  The  object  is  pulled  to  the  right  (in  the  positive  x direction)  by 
an  externally  applied  force  of  magnitude  20.00  N,  causing  the  spring  to  stretch 
1.000  cm.  Determine  the  value  of  the  force  constant. 

■ The force constant is the k in Hooke’s law, Eq. (6-1):

$$F = -kx$$

The force F produced by the spring is F = −20.00 N, where the minus sign means that the force acts to the left (in the negative x direction). Since x = 1.000 × 10⁻² m, you have

$$k = -\frac{F}{x} = -\frac{-20.00\ \text{N}}{1.000\times10^{-2}\ \text{m}} = 2.000\times10^{3}\ \text{N/m}$$


b.  The  mass  of  the  object  is  4.000  kg.  Determine  the  period  with  which  it  oscil- 
lates if  the  applied  force  is  suddenly  removed. 

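■ The ratio α of the force constant to the mass determines the period T, since Eq. (6-25a) shows that T = 2π/√α. According to Eq. (6-4),

$$\alpha = \frac{k}{m} = \frac{2.000\times10^{3}\ \text{N/m}}{4.000\ \text{kg}} = 5.000\times10^{2}\ \text{N/(m}\cdot\text{kg)}$$

Thus the period is

$$T = \frac{2\pi}{\sqrt{5.000\times10^{2}\ \text{N/(m}\cdot\text{kg)}}} = 0.2810\ \text{s}$$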


c.  Determine  the  frequency  of  the  oscillation. 
■ Since ν = 1/T, you have

$$\nu = \frac{1}{0.2810\ \text{s}} = 3.559\ \text{cycles/s} = 3.559\ \text{Hz}$$


(Since  a cycle  is  simply  a count,  it  is  a dimensionless  number  and  can  be  inserted  at 
will  in  the  units  of  the  answer.)  ■ 

d.  Determine  the  angular  frequency  of  the  oscillation. 

■ Applying  Eq.  (6-26),  you  obtain 

$$\omega = 2\pi\nu = 2\pi\ \text{rad/cycle}\times3.559\ \text{cycles/s} = 22.36\ \text{rad/s}$$

■

e.  Determine  the  position  of  the  object  0.7500  s after  it  begins  its  oscillation. 

■ The oscillation is described by Eq. (6-17),

$$x = A\cos(\omega t + \delta)$$

if A and δ are adjusted to fit the initial conditions and if the proper value of ω is used. As you have seen before from Eqs. (6-21a) and (6-24), the initial conditions (dx/dt)₀ = 0 and x₀ = 1.000 × 10⁻² m require that

$$\delta = 0 \quad\text{and}\quad A = 1.000\times10^{-2}\ \text{m}$$

So

$$x = 1.000\times10^{-2}\ \text{m}\times\cos(22.36\ \text{rad/s}\times t)$$

where the value of ω is that found in part d. The position at t = 0.7500 s is

$$\begin{aligned}
x &= 1.000\times10^{-2}\ \text{m}\times\cos(22.36\ \text{rad/s}\times0.7500\ \text{s}) \\
  &= 1.000\times10^{-2}\ \text{m}\times\cos(16.77\ \text{rad}) \\
  &= 1.000\times10^{-2}\ \text{m}\times(-0.4871) = -4.871\times10^{-3}\ \text{m} = -0.4871\ \text{cm}
\end{aligned}$$

The argument of the cosine is larger than 2π rad because the oscillation has passed beyond its first cycle. The minus sign means that the body is to the left of its equilibrium position, and the spring is compressed. Note that for the purpose of evaluating x it is most convenient to express the cosine in terms of the angular frequency ω rather than the frequency ν. ■

f.  Determine  the  velocity  of  the  object  at  t = 0.7500  s. 

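■ Evaluating the terms in Eq. (6-21a), you have

$$\begin{aligned}
\frac{dx}{dt} &= -A\omega\sin(\omega t) \\
  &= -1.000\times10^{-2}\ \text{m}\times22.36\ \text{rad/s}\times\sin(16.77\ \text{rad}) \\
  &= -1.000\times10^{-2}\ \text{m}\times22.36\ \text{rad/s}\times(-0.8733) \\
  &= 0.1953\ \text{m/s} = 19.53\ \text{cm/s}
\end{aligned}$$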


(A  radian  is  a dimensionless  number,  being  defined  as  the  ratio  of  two  lengths,  so  it 
can  be  deleted  at  will  from  the  units.)  The  positive  velocity  means  the  body  is 
moving  to  the  right.  Thus  although  the  spring  is  compressed,  the  compression  is 
decreasing.  ■ 

g.  Determine  the  acceleration  of  the  object  at  t = 0.7500  s. 

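■ From Eq. (6-21b) the acceleration is

$$\begin{aligned}
\frac{d^2x}{dt^2} &= -A\omega^2\cos(\omega t) \\
  &= -1.000\times10^{-2}\ \text{m}\times(22.36\ \text{rad/s})^2\times\cos(16.77\ \text{rad}) \\
  &= -1.000\times10^{-2}\ \text{m}\times(22.36\ \text{rad/s})^2\times(-0.4871) \\
  &= 2.435\ \text{m/s}^2 = 243.5\ \text{cm/s}^2
\end{aligned}$$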


The  body  is  accelerating  to  the  right  since  the  acceleration  is  positive. 

h.  Determine  the  force  exerted  on  the  body  by  the  spring  at  t = 0.7500  s. 
■ One  way  you  can  do  this  is  to  apply  Newton’s  second  law: 


$$F = m\,\frac{d^2x}{dt^2} = 4.000\ \text{kg}\times2.435\ \text{m/s}^2 = 9.740\ \text{N}$$

where  the  value  of  d2x/dt2  is  that  found  in  part  g.  The  force  has  a positive  value,  and 
so  acts  to  the  right  since  the  acceleration  is  to  the  right. 

Another way of finding F is to apply Hooke’s law:

$$F = -kx = -2.000\times10^{3}\ \text{N/m}\times(-4.871\times10^{-3}\ \text{m}) = 9.742\ \text{N}$$

where the values of k and x are those found in parts a and e. The force acts to the right since the spring is compressed when the body is to the left of its equilibrium position. The agreement between these two evaluations of F, to within the rounding of the intermediate values, provides a good check on the numerical work of this example.
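
The arithmetic of Example 6-6 can be reproduced in a few lines. The sketch below is only an illustration, with variable names chosen here. Because it carries full floating-point precision rather than rounding ω to 22.36 rad/s before use, its results may differ from the values quoted above in the last digit or two.

```python
import math

# Data given in Example 6-6.
F_applied = 20.00     # N, magnitude of the externally applied force
stretch = 1.000e-2    # m, resulting stretch of the spring
m = 4.000             # kg, mass of the object
t = 0.7500            # s, instant of interest

k = F_applied / stretch                  # part a: force constant, N/m
alpha = k / m                            # ratio k/m, in s^-2
T = 2.0 * math.pi / math.sqrt(alpha)     # part b: period, Eq. (6-25a)
nu = 1.0 / T                             # part c: frequency
omega = 2.0 * math.pi * nu               # part d: angular frequency, Eq. (6-26)

A, delta = stretch, 0.0                  # released from rest at x0 = stretch
x = A * math.cos(omega * t + delta)                  # part e, Eq. (6-17)
v = -A * omega * math.sin(omega * t + delta)         # part f, Eq. (6-21a)
a = -A * omega**2 * math.cos(omega * t + delta)      # part g, Eq. (6-21b)
F_newton = m * a                         # part h, Newton's second law
F_hooke = -k * x                         # part h, Hooke's law

print(f"k = {k:.1f} N/m, T = {T:.4f} s, nu = {nu:.3f} Hz, omega = {omega:.2f} rad/s")
print(f"x = {x:.4e} m, v = {v:.4f} m/s, a = {a:.3f} m/s^2")
print(f"F (Newton) = {F_newton:.3f} N, F (Hooke) = {F_hooke:.3f} N")
```

Computed this way, the two evaluations of the force agree essentially exactly, since both reduce to −kA cos(ωt) when ω² = k/m.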


Figure 6-17 shows the quantitative time dependence of the position, velocity, and acceleration of the harmonic oscillator treated in Example 6-6. The principal features of these three interrelated curves were discussed before in Sec. 2-6, using the similar curves of Fig. 2-21. But it is worthwhile to reemphasize their most important feature: the acceleration d²x/dt² is always proportional in magnitude but opposite in sign to the position coordinate x. The physical reason for this is that the acceleration is proportional to the force (Newton’s second law) and the force is proportional to the negative of the position coordinate (Hooke’s law). This relation can be seen directly from the harmonic oscillator differential equation, d²x/dt² = −αx.

Fig. 6-17 The position x, velocity dx/dt, and acceleration d²x/dt² of the harmonic oscillator treated in Example 6-6 are plotted as a function of the time t.

A related feature of the quantities x, dx/dt, and d²x/dt² can be seen in Fig. 6-17. Consider any particular relative value on the acceleration curve, say the maximum M. If this value is attained at a time t, the velocity curve will not achieve its maximum value M′ until a later time t′. From the figure you can see that t′ − t = T/4, where T is the period. We say that the velocity lags the acceleration by one-quarter of a cycle. In like manner, the position lags the velocity by one-quarter cycle, as you can see by noting the location of the corresponding maximum M″ on the position curve. Since the acceleration is directly proportional to the force, we can say that there is a phase lag of one-half cycle between the force acting on the body and the response of the body as expressed by its position. This is a result of the body’s inertia.
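
The quarter-cycle lags can also be read off from Eqs. (6-17), (6-21a), and (6-21b). As a short worked illustration, using only the identities cos(θ + π/2) = −sin θ and cos(θ + π) = −cos θ, the three quantities can all be written as cosines:

$$\begin{aligned}
x &= A\cos(\omega t + \delta) \\
\frac{dx}{dt} &= -A\omega\sin(\omega t + \delta) = A\omega\cos\!\left(\omega t + \delta + \frac{\pi}{2}\right) \\
\frac{d^2x}{dt^2} &= -A\omega^2\cos(\omega t + \delta) = A\omega^2\cos(\omega t + \delta + \pi)
\end{aligned}$$

Each differentiation advances the argument of the cosine by π/2, which is one-quarter of the 2π that makes up a full cycle. So the velocity is a quarter cycle ahead of the position, and the acceleration is a quarter cycle ahead of the velocity and a half cycle ahead of the position.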


We close the analytical treatment of the harmonic oscillator differential equation by discussing briefly the bases of the two most important properties of its solutions. These are the oscillatory nature of the solutions and their scaling property. It can be discerned from direct inspection of the equation, d²x/dt² = −αx, that its solutions must be oscillatory functions of t. As explained in Sec. 2-7, d²x/dt² is a measure of the curvature of a plot of x versus t. If d²x/dt² is positive, the curvature is concave upward, and if it is negative, the curvature is concave downward. Since the quantity α is positive, the differential equation says that the sign of the second derivative of x is always opposite to the sign of x itself. Thus if in some region of t the position curve traced out by plotting x versus t lies above the t axis, then x is positive, d²x/dt² is negative, and the curve is concave downward. If the position curve lies below the t axis, the curve is concave upward. Simply put, the position curve is under all circumstances concave toward the t axis. This means that the position must oscillate about the axis, and so x is an oscillatory function of t.

The scaling property of the solutions to the harmonic oscillator equation is a direct consequence of the linearity of the differential equation. The equation, d²x/dt² = −αx, is said to be linear since each of its terms is proportional to the first power of the dependent variable x. As a consequence, if a certain dependence of x on t satisfies the harmonic oscillator equation, then that dependence, with x scaled up or down by any constant factor C, will also be a solution. In other words, if x(t) satisfies the equation, then Cx(t) also satisfies the equation, as we will now prove. Given that x(t) is a solution to the harmonic oscillator equation, we test whether Cx(t) is also a solution by substituting it into the equation, obtaining

$$\frac{d^2[Cx(t)]}{dt^2} = -\alpha[Cx(t)]$$

Since C is a constant, the derivative can be simplified to yield

$$C\,\frac{d^2[x(t)]}{dt^2} = -\alpha Cx(t)$$

Now divide through by C (for any solution of interest, C ≠ 0). The result is

$$\frac{d^2[x(t)]}{dt^2} = -\alpha x(t)$$

Is this result mathematically consistent? Yes it is. Since x(t) is a solution to the harmonic oscillator equation, the left side always equals the right side. So the equation from which we obtained the result is valid, and we have confirmed that Cx(t) is also a solution to the harmonic oscillator equation.

The basic properties of the solutions to the pendulum differential equation,

$$\frac{d^2\phi}{dt^2} = -\frac{g}{l}\sin\phi$$

can be understood from considerations similar to those used in discussing the harmonic oscillator differential equation. The sign of sin φ is the same as the sign of φ, within the range −π < φ < π that φ assumes for a pendulum. So the solutions to the pendulum equation must also be oscillatory functions of t. But the scaling property does not hold for these solutions because one of the terms in the equation is not proportional to the first power of φ. Rather it is proportional to sin φ, and so the equation is nonlinear. The solutions are not sinusoidal, or harmonic, functions. A pendulum is consequently said to be an anharmonic oscillator.

The pendulum equation does not have analytical solutions which can be expressed by a finite number of terms (in contrast to an infinite series), each of which involves only elementary functions (the ones you find on the keys of a scientific calculator). The differential equation is best dealt with by numerical methods. As a rule of thumb, you can expect that the same is true of most nonlinear differential equations. And there are many nonlinear differential equations which can be solved only by numerical methods. There are also some lucky exceptions, like the nonlinear differential equation for the skydiver, Eq. (6-13). It has an analytical solution, quoted in Eq. (5-41), involving a small number of elementary functions. It should also be said that there are many linear differential equations which can be solved only by numerical methods.


6-6 THE DAMPED OSCILLATOR

Although  we  have  ignored  the  fact  until  now,  real  macroscopic  oscillators 
always  experience  at  least  a small  frictional  force  that  tends  to  dissipate  or, 
as  it  is  said,  damp  their  motion.  In  certain  situations  the  frictional  force  is 
large,  and  introduced  on  purpose.  Think  of  the  springing  system  of  an  au- 
tomobile. If  there  were  only  a small  frictional  damping  force,  a single 
bump  in  the  road  would  set  the  automobile  into  a vertical  oscillation  that 
would  persist  long  enough  to  make  its  occupants  quite  uncomfortable.  This 
is  prevented  by  the  shock  absorbers,  which  are  designed  to  produce  a fric- 
tional force  that  damps  out  the  oscillation  as  rapidly  as  possible.  In  this  sec- 
tion we  will  study  the  motion  of  a damped  oscillator  by  using  Newton’s  sec- 
ond law  to  set  up  the  differential  equation  governing  its  motion  and  then 
solving  the  equation.  Both  numerical  and  analytical  methods  will  be  em- 
ployed. 

Consider an oscillator in which the force producing the oscillation obeys Hooke’s law and in which the oscillating body experiences fluid friction. Since oscillating bodies very often are not large enough, or fail to move rapidly enough, to produce much turbulence, we will assume that the damping force of fluid friction has a magnitude proportional to the first power of the speed of the body, as in Stokes’ law, Eq. (4-26). The case in which the magnitude of the damping force is proportional to the first power of the speed is of special interest for another reason. The differential equation governing the motion of the body has exactly the same mathematical form as an important one that will arise when we study alternating current electric circuits in Chap. 26. So everything that we learn here about the properties of its solutions will carry over directly to the study of electric circuits.


To obtain the differential equation governing the motion of the damped oscillator, we write Newton’s second law of motion

$$\frac{d^2x}{dt^2} = \frac{F}{m}$$

and then evaluate the net force F acting on the body. Here the net force is the algebraic sum of the Hooke’s-law force, −kx, and the damping force. The magnitude of the damping force is proportional to the magnitude of the velocity dx/dt and is always directed opposite to the direction of the velocity. Thus the damping force can be written as −r dx/dt. The minus sign gives the damping force the proper direction. The constant r governs the magnitude of the damping force. The value of r depends on the size and shape of the oscillating body and on the coefficient of viscosity of the substance through which the body moves. Its numerical value is the strength of the damping force for unit speed. Thus the net force acting on the body is

$$F = -kx - r\,\frac{dx}{dt}$$

and Newton’s second law yields

$$\frac{d^2x}{dt^2} = \frac{-kx - r\,dx/dt}{m} \qquad \text{(6-29)}$$

We define the constants

$$\alpha = \frac{k}{m} \quad\text{and}\quad \beta = \frac{r}{m} \qquad \text{(6-30)}$$

This allows us to express Eq. (6-29) as

$$\frac{d^2x}{dt^2} = -\alpha x - \beta\,\frac{dx}{dt} \qquad \text{(6-31)}$$

This is the damped oscillator equation. We will write it in the form

$$\frac{d^2x}{dt^2} = Q \quad\text{where}\quad Q = -\alpha x - \beta\,\frac{dx}{dt} \qquad \text{(6-32)}$$

Now  that  the  differential  equation  governing  the  motion  has  been 
written  in  the  form  of  Eq.  (6-14),  the  numerical  method  for  solving  it  ac- 
cording to  Eqs.  (6-15)  can  be  applied  immediately.  Doing  this  involves 
nothing  more  than  adding  to  the  body-and-spring  program  several  steps 
which  will  generate  the  value  of  Q specified  in  Eq.  (6-32)  and  then  running 
the  calculator  or  computer  with  this  new  program.  The  damped  oscillator 
program  is  listed  in  the  Numerical  Calculation  Supplement.  It  is  used  in 
Examples  6-7  and  6-8. 




Fig. 6-18 A lightly damped oscillation. The parameters specifying the mechanical properties of the oscillator have the values α = 1 N/(m·kg) = 1 s⁻² and β = 0.5 N/(m·s⁻¹·kg) = 0.5 s⁻¹.


EXAMPLE 6-7

Run the damped oscillator program with the following set of initial conditions and parameters:

x₀ = 1.5 (in m); (dx/dt)₀ = 0; t₀ = 0; Δt = 0.2 (in s); α = 1 [in N/(m·kg)]; β = 0.5 [in N/(m·s⁻¹·kg)]

■ The oscillator has the same stiffness-to-mass ratio α and the same initial conditions x₀ and (dx/dt)₀ as the undamped oscillator treated in Example 6-2 and plotted in Fig. 6-6. The motion of the damped oscillator is plotted in Fig. 6-18. You can see that the motion is oscillatory, but with an amplitude which decreases, that is to say, “dies down,” with each succeeding cycle. If you compare Figs. 6-18 and 6-6 carefully, you will see that the period of the damped oscillator is slightly longer than the period of the undamped oscillator with the same value of α. The damping inhibits the motion of the oscillator and makes it oscillate a bit less rapidly.


Fig. 6-19 A heavily damped oscillation. Here α = 1 N/(m·kg), as in the lightly damped oscillation of Fig. 6-18. But the damping parameter has the larger value β = 2.5 N/(m·s⁻¹·kg) = 2.5 s⁻¹.


EXAMPLE 6-8

Run the damped oscillator program with the following set of initial conditions and parameters:

x₀ = 1.5 (in m); (dx/dt)₀ = 0; t₀ = 0; Δt = 0.2 (in s); α = 1 [in N/(m·kg)]; β = 2.5 [in N/(m·s⁻¹·kg)]

■ In this example the damping-constant-to-mass ratio β is 5 times larger than it was in Example 6-7. Nothing else is changed. The motion of the oscillator is plotted in Fig. 6-19. The motion is not oscillatory, and so a period cannot be defined. Because of the large value of β there is so much damping that the body never passes through the coordinate origin. Indeed, it appears as if its position x will reach zero only asymptotically.


The  oscillatory  motion  in  Fig.  6-18  is  said  to  be  lightly  damped,  or  un- 
derdamped, because  the  oscillation  persists  through  several  cycles  despite 
the  damping.  The  motion  shown  in  Fig.  6-19  is  said  to  be  heavily  damped, 
or  overdamped,  since  the  damping  has  completely  suppressed  the  oscilla- 
tion. Does  the  behavior  of  a lightly  or  heavily  damped  oscillator,  predicted 
from  obtaining  numerical  solutions  to  the  differential  equation,  agree  with 
the  behavior  your  physical  intuition  would  lead  you  to  expect? 
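
For readers without the program from the Numerical Calculation Supplement, the two runs can be imitated with the short sketch below. It steps Eq. (6-32) forward with a simple midpoint scheme, which is an assumption made here and is not the procedure of Eqs. (6-15). It also uses a smaller time step (0.01 s) than the Δt = 0.2 s of the examples. The function names are hypothetical.

```python
def accel(x, v, alpha, beta):
    """Q of Eq. (6-32): Q = -alpha*x - beta*dx/dt."""
    return -alpha * x - beta * v

def damped_oscillator(x0, v0, alpha, beta, dt=0.01, t_end=20.0):
    """Integrate d2x/dt2 = Q with the midpoint method.

    Returns two lists: the times and the corresponding positions.
    """
    n_steps = int(round(t_end / dt))
    t, x, v = 0.0, x0, v0
    ts, xs = [t], [x]
    for _ in range(n_steps):
        # Estimate position and velocity at the middle of the step.
        x_mid = x + 0.5 * dt * v
        v_mid = v + 0.5 * dt * accel(x, v, alpha, beta)
        # Advance the full step using the midpoint slopes.
        x += dt * v_mid
        v += dt * accel(x_mid, v_mid, alpha, beta)
        t += dt
        ts.append(t)
        xs.append(x)
    return ts, xs

def count_zero_crossings(xs):
    """Number of times successive positions change sign."""
    return sum(1 for a, b in zip(xs, xs[1:]) if a * b < 0.0)

if __name__ == "__main__":
    for beta in (0.5, 2.5):   # the values used in Examples 6-7 and 6-8
        ts, xs = damped_oscillator(x0=1.5, v0=0.0, alpha=1.0, beta=beta)
        print(f"beta = {beta}: zero crossings in 20 s = {count_zero_crossings(xs)}, "
              f"final x = {xs[-1]:.4f}")
```

With β = 0.5 the position changes sign several times as the oscillation dies down. With β = 2.5 it stays on one side of the origin throughout the run.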

Now we will try to obtain analytical solutions to Eq. (6-31), the differential equation for a damped oscillator. As usual in solving a differential equation, the first step is to guess a mathematical form for the analytical solutions. Since we have already obtained numerical solutions, we can use them to supplement our physical intuition and thereby enhance the chance that our guess will prove to be valid. One thing that Figs. 6-18 and 6-19 suggest immediately is that we treat light damping and heavy damping separately, as far as analytical solutions are concerned. Quite different mathematical expressions for the assumed form of the analytical solutions are suggested by the numerical solutions plotted in the figures for those two cases. We will focus our attention on getting solutions for light damping and then indicate briefly what is done to treat heavy damping.

Figure 6-18 also suggests that when the value of β is small enough so
that the damping is light, the solutions to the differential equation can be
written as the product of two functions. The first is an oscillatory function
of time which describes the oscillation. The second is a function which decreases
smoothly with increasing values of time and thus describes the diminution
in amplitude resulting from damping. A reasonable guess at a form
which has these properties is

$$x = A e^{-\mu t} \cos(\omega t + \delta)$$    (6-33)

The factor $e^{-\mu t}$ is a decreasing exponential function of the quantity μt. The
symbol μ (the Greek letter mu) is a positive constant called the damping
coefficient, and t is the variable time. If you are not familiar with exponential
functions or do not know how to evaluate their derivatives, you should
read the following material set in small type.


The exponential function $e^u$ of the variable u is a certain number e with the
exponent u. For integral values of u, the function is just the uth power of e. But $e^u$
is defined for all values of u, not just integral values, by means of an equation involving
an infinite series. Specifically, the definition is


$$e^u = 1 + u + \frac{u^2}{2} + \frac{u^3}{3 \times 2} + \frac{u^4}{4 \times 3 \times 2} + \frac{u^5}{5 \times 4 \times 3 \times 2} + \cdots$$    (6-34)


The motivation behind this definition can be seen by evaluating the derivative of $e^u$
with respect to u. Taking the derivative with respect to u of the term on the left
side of Eq. (6-34), and of every term on the right side, produces


$$\frac{d(e^u)}{du} = 0 + 1 + \frac{2u}{2} + \frac{3u^2}{3 \times 2} + \frac{4u^3}{4 \times 3 \times 2} + \frac{5u^4}{5 \times 4 \times 3 \times 2} + \cdots$$


Cancellation  gives 

$$\frac{d(e^u)}{du} = 1 + u + \frac{u^2}{2} + \frac{u^3}{3 \times 2} + \frac{u^4}{4 \times 3 \times 2} + \cdots$$

Comparison  of  the  right  side  of  this  equation  with  the  right  side  of  Eq.  (6-34) 
shows  that 


$$\frac{d(e^u)}{du} = e^u$$    (6-35)


Thus the function $e^u$ defined by Eq. (6-34) has the property that it equals its own
first derivative with respect to u. Apart from constant multiples of $e^u$, no other
function of u has this very useful property.

The value of the number e is obtained from the definition of Eq. (6-34) by setting
u = 1. This yields

$$e = 1 + 1 + \frac{1}{2} + \frac{1}{3 \times 2} + \frac{1}{4 \times 3 \times 2} + \frac{1}{5 \times 4 \times 3 \times 2} + \cdots$$


The series converges quite rapidly. With a calculator it will be easy for you to evaluate
and sum the first few of its terms. Doing this so as to obtain results to
five-decimal-place accuracy, you will find that

e = 2.71828  (6-36) 

Thus $e^u$ has the value 2.71828 for u = 1. The value of $e^u$ for any other value of u is
obtained by summing the series in Eq. (6-34) for that value of u. Results for a limited
range of positive and negative values of u are plotted in Fig. 6-20a and b.
Accurate tables of values of $e^u$ are available for much wider ranges of u. Using almost
any calculator intended for scientific work, you can, in effect, sum the series
that evaluates the exponential function of any number entered in the calculator by
pressing only one or two keys.
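The summation described above is easy to try on a computer as well as on a calculator. The following is a minimal sketch, assuming only the series of Eq. (6-34); the function name exp_series and the choice of how many terms to keep are illustrative and do not come from the text.

```python
import math


def exp_series(u, n_terms=20):
    """Sum the first n_terms of 1 + u + u^2/2 + u^3/(3*2) + ... (Eq. 6-34)."""
    total = 0.0
    term = 1.0  # the first term of the series
    for k in range(n_terms):
        total += term
        term *= u / (k + 1)  # turn u^k/k! into u^(k+1)/(k+1)!
    return total


# Watch the partial sums for u = 1 settle toward e.
for n in (2, 4, 6, 8, 10):
    print(n, "terms:", round(exp_series(1.0, n), 5))

print("library value of e:", round(math.e, 5))
```

Each new term is obtained from the previous one by a single multiplication and division, so no factorial has to be computed separately.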

With values obtained from Fig. 6-20 or a calculator, you can verify that
$e^{u_1} e^{u_2} = e^{u_1 + u_2}$ and that $e^{-u} = 1/e^u$. Thus the definition of $e^u$ given by Eq. (6-34)
conforms to the familiar properties of exponents.

Fig. 6-20 (a) The function $e^u$ plotted for a limited range of positive values of u. (b)
The same for negative values of u.

For the exponential function occurring in Eq. (6-33), the exponent u is negative,
and its magnitude is written as the product of a positive constant μ and the
variable t, which is always positive since its initial value is t₀ = 0. That is, u =
-μt. To verify that the form given by Eq. (6-33) is a solution to the differential
equation for a lightly damped oscillator, it will be necessary to differentiate the
exponential function with respect to t alone. This is done by using the chain rule,
Eq. (6-19). If we set f(u) = $e^u$, the rule says that

$$\frac{d(e^u)}{dt} = \frac{d(e^u)}{du} \, \frac{du}{dt}$$

According to Eq. (6-35), the first term on the right side has the value

$$\frac{d(e^u)}{du} = e^u = e^{-\mu t}$$

For u = -μt, with μ a constant, the second term gives

$$\frac{du}{dt} = -\mu$$

Therefore

$$\frac{d(e^{-\mu t})}{dt} = -\mu e^{-\mu t}$$    (6-37)

Now we return to considering Eq. (6-33).


In Eq. (6-33), the value of the exponential factor $e^{-\mu t}$ decreases
smoothly from 1 with increasing values of t, the decrease being more or less
rapid depending on the size of the damping coefficient μ. The behavior
of this factor is indicated in the top part of Fig. 6-21 for a typical small value
of μ. The middle part of that figure illustrates the factor cos(ωt + δ) for typical
values of the constants ω and δ, and the lower part shows the product
$A e^{-\mu t} \cos(\omega t + \delta)$ for some value of the amplitude A. Comparison of the
lower part of the figure with Fig. 6-18 gives justification to the guess that
Eq. (6-33),

$$x = A e^{-\mu t} \cos(\omega t + \delta)$$

might represent a family of analytical solutions to Eq. (6-31),

$$\frac{d^2x}{dt^2} = -\alpha x - \beta \frac{dx}{dt}$$

This is the differential equation which produced the numerical solution
shown in Fig. 6-18. To find out whether Eq. (6-33) really does describe the
solutions, we will use it to evaluate dx/dt and d²x/dt² from the form it specifies
for x. Then we will substitute these derivatives and x into Eq. (6-31)
and see whether they satisfy the differential equation. You will see that
while the process is somewhat messy, it is quite straightforward. It is worthwhile
to carry out the process not only to verify the correctness of the solutions,
but also because it leads to values of the constants ω (which gives the
angular frequency of the oscillator) and μ (which gives its rate of damping)
in terms of the mechanical constants α and β. Compare this with the procedure
used to solve the harmonic oscillator equation analytically, which led
to the relation ω = √α.

Fig. 6-21 Illustrating an argument justifying a form assumed for the analytical
solution to the differential equation for a lightly damped oscillator.

First we must evaluate

$$\frac{dx}{dt} = \frac{d}{dt}\left[A e^{-\mu t} \cos(\omega t + \delta)\right]$$

Since A is a constant, this is

$$\frac{dx}{dt} = A \frac{d}{dt}\left[e^{-\mu t} \cos(\omega t + \delta)\right]$$

According to Eq. (2-15), the derivative of the product of two functions
equals the first function times the derivative of the second function plus the
second function times the derivative of the first function. Using this property
and employing Eq. (6-20) to differentiate the cosine and Eq. (6-37) to
differentiate the exponential, we obtain directly

$$\frac{dx}{dt} = -A\omega e^{-\mu t} \sin(\omega t + \delta) - A\mu e^{-\mu t} \cos(\omega t + \delta)$$    (6-38a)

Next we use this result to evaluate

$$\frac{d^2x}{dt^2} = \frac{d}{dt}\left(\frac{dx}{dt}\right) = \frac{d}{dt}\left[-A\omega e^{-\mu t} \sin(\omega t + \delta) - A\mu e^{-\mu t} \cos(\omega t + \delta)\right]$$

The computation is very similar to the one which evaluated dx/dt from x.
But now there are two terms to differentiate, so there are four terms in the
result. The result that we obtain is

$$\frac{d^2x}{dt^2} = -A\omega^2 e^{-\mu t} \cos(\omega t + \delta) + A\mu\omega e^{-\mu t} \sin(\omega t + \delta) + A\mu\omega e^{-\mu t} \sin(\omega t + \delta) + A\mu^2 e^{-\mu t} \cos(\omega t + \delta)$$

Since the second and third terms on the right side are identical, this simplifies
to

$$\frac{d^2x}{dt^2} = A(\mu^2 - \omega^2) e^{-\mu t} \cos(\omega t + \delta) + 2A\mu\omega e^{-\mu t} \sin(\omega t + \delta)$$    (6-38b)


Now we test the validity of the assumed form for x by substituting it,
and the expressions for dx/dt and d²x/dt² obtained from it, into the differential
equation that x is supposed to satisfy, Eq. (6-31). This produces
the equation

$$A(\mu^2 - \omega^2) e^{-\mu t} \cos(\omega t + \delta) + 2A\mu\omega e^{-\mu t} \sin(\omega t + \delta) = -\alpha A e^{-\mu t} \cos(\omega t + \delta) + \beta A\omega e^{-\mu t} \sin(\omega t + \delta) + \beta A\mu e^{-\mu t} \cos(\omega t + \delta)$$

While this equation is lengthy, each term is either $e^{-\mu t} \cos(\omega t + \delta)$ or
$e^{-\mu t} \sin(\omega t + \delta)$ multiplied by a combination of constants. Gathering the
coefficients of the cosine terms, and then of the sine terms, and transposing,
we have

$$A(\mu^2 - \omega^2 + \alpha - \beta\mu) e^{-\mu t} \cos(\omega t + \delta) + A(2\mu\omega - \beta\omega) e^{-\mu t} \sin(\omega t + \delta) = 0$$    (6-39)

If the form for x given by Eq. (6-33) is, in fact, a solution to the differential
equation given by Eq. (6-31), then Eq. (6-39) must be satisfied.
That is, the sum of the first and second terms of Eq. (6-39) must be zero at
all times. But the first term in Eq. (6-39) has a dependence on t which is different
from that of the second term. Therefore the only way that the sum of
the two terms can equal zero for all values of t is for each of the two terms
individually to be equal to zero. Thus for Eq. (6-39) to be mathematically
consistent, two equations must be satisfied. The first is

$$A(\mu^2 - \omega^2 + \alpha - \beta\mu) e^{-\mu t} \cos(\omega t + \delta) = 0$$    (6-40a)

and the second is

$$A(2\mu\omega - \beta\omega) e^{-\mu t} \sin(\omega t + \delta) = 0$$    (6-40b)

We can achieve satisfaction of Eqs. (6-40a) and (6-40b) by properly adjusting
the values of the constants A, μ, ω, and δ. One way is to set A = 0.
But doing so makes Eq. (6-33) yield the solution x = 0 of the differential
equation. That solution is correct but uninteresting, because it describes
the body remaining fixed at its stable equilibrium position.

The other way of satisfying Eqs. (6-40a) and (6-40b) is to adjust the values
of μ and ω so that two relations hold. These are

$$\mu^2 - \omega^2 + \alpha - \beta\mu = 0$$

and

$$2\mu\omega - \beta\omega = 0$$

The second of these relations immediately gives the value of μ, which determines
the rate of damping. It is

$$\mu = \frac{\beta}{2}$$    (6-41)

Substituting this value of μ into the first relation, we have

$$\frac{\beta^2}{4} - \omega^2 + \alpha - \frac{\beta^2}{2} = 0$$

This can now be solved for ω² to give

$$\omega^2 = \alpha - \frac{\beta^2}{4}$$

and we find the angular frequency of the lightly damped oscillator to be

$$\omega = \sqrt{\alpha - \frac{\beta^2}{4}}$$    (6-42)


Substitution into the differential equation, Eq. (6-31), of the assumed
form of its solutions, Eq. (6-33), has shown that this form is correct, providing
that Eqs. (6-41) and (6-42) are obeyed. In other words, if the constants
μ and ω which specify the motion of the damped oscillator are related
by Eqs. (6-41) and (6-42) to the constants α and β that specify its
mechanical properties, then the form we have assumed for the solutions is
correct. Using these values of μ and ω in Eq. (6-33), we therefore have the
solutions describing the motion. They are:

For lightly damped motion

$$x = A e^{-\beta t/2} \cos\left(\sqrt{\alpha - \frac{\beta^2}{4}}\; t + \delta\right) \quad \text{for} \quad \frac{\beta^2}{4} < \alpha$$    (6-43)

As in the case of the solutions to the undamped harmonic oscillator, the
constants A and δ remain to be adjusted so that Eq. (6-43) can be fitted to
any pair of initial conditions, x₀ and (dx/dt)₀. This adjustment of the amplitude
and the phase constant produces a solution which describes the motion
particular to the initial conditions.
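The substitution argument can also be spot-checked numerically. The sketch below assumes that Eq. (6-31) reads d²x/dt² = -αx - β dx/dt and evaluates the left-over quantity d²x/dt² + αx + β dx/dt using the derivative expressions of Eqs. (6-38a) and (6-38b) with μ = β/2 and ω = √(α - β²/4). The function name and the sample values of α, β, A, and δ are arbitrary choices for illustration.

```python
import math


def residual(alpha, beta, A, delta, t):
    """Return d2x/dt2 + alpha*x + beta*dx/dt for the trial solution at time t."""
    mu = beta / 2.0                              # Eq. (6-41)
    omega = math.sqrt(alpha - beta**2 / 4.0)     # Eq. (6-42); needs beta^2/4 < alpha
    decay = math.exp(-mu * t)
    c = math.cos(omega * t + delta)
    s = math.sin(omega * t + delta)
    x = A * decay * c                                        # Eq. (6-33)
    dx = -A * omega * decay * s - A * mu * decay * c         # Eq. (6-38a)
    d2x = (A * (mu**2 - omega**2) * decay * c
           + 2.0 * A * mu * omega * decay * s)               # Eq. (6-38b)
    return d2x + alpha * x + beta * dx


# Sample the residual at 200 times between 0 and 20 (arbitrary test values).
worst = max(abs(residual(1.0, 0.5, 2.0, 0.3, 0.1 * k)) for k in range(200))
print("largest residual:", worst)
```

A residual at the level of floating-point rounding error for every sampled time is what one expects if the assumed form satisfies the equation. Changing mu or omega inside the function to any other value makes the residual clearly nonzero.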


When the damping-to-mass ratio β is zero, Eq. (6-43) reduces, as it
should, to Eq. (6-24) describing the solutions for the undamped harmonic
oscillator. As β increases from zero, the coefficient of t in the decreasing
exponential becomes larger. This damps the oscillation. Also the coefficient
of t in the sinusoid becomes smaller. This reduces the oscillation frequency
and thereby increases the oscillation period. Because of the form of
the dependence of the frequency on β, there is very little change from the
frequency of an undamped oscillator until β² becomes large enough compared
to α to approach the condition

$$\frac{\beta^2}{4} = \alpha$$    (6-44)

For instance, with the values used in the numerical solution of Example
6-7, α = 1.00 N/(m·kg) and β = 0.50 N/(m·s⁻¹·kg), the angular frequency
of the damped oscillator is ω = √(α - β²/4) = 0.97 s⁻¹. This may be compared
to the value ω = √α = 1.00 s⁻¹ that would be obtained for the same
α if there were no damping, so that β = 0. As mentioned earlier, the physical
reason why increased damping leads to a decreased frequency is that
damping inhibits the motion of the oscillator, thereby making it oscillate
less rapidly.
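To see how slowly the frequency responds to damping, one can tabulate Eq. (6-42) for several values of β at fixed α. The sketch below uses α = 1.00 s⁻² from Example 6-7; apart from β = 0.50 s⁻¹, the β values in the list are illustrative choices, as is the function name.

```python
import math


def damped_omega(alpha, beta):
    """Angular frequency from Eq. (6-42); valid only while beta^2/4 < alpha."""
    return math.sqrt(alpha - beta**2 / 4.0)


alpha = 1.00  # s^-2, the value quoted for Example 6-7
for beta in (0.0, 0.10, 0.50, 1.00, 1.50, 1.90):  # s^-1; mostly illustrative values
    omega = damped_omega(alpha, beta)
    period = 2.0 * math.pi / omega
    print(f"beta = {beta:4.2f} 1/s   omega = {omega:5.3f} 1/s   period = {period:6.2f} s")
```

The frequency stays within a few percent of the undamped value for small β and falls steeply only as β approaches 2√α.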


When β is large enough in comparison to α that the condition of Eq.
(6-44) is satisfied, the angular frequency ω calculated from Eq. (6-42) is
zero. When it is even larger, the calculated value of ω is the square root of a
negative number; that is, ω is an “imaginary” number. It is apparent that
there is something special about the condition specified by Eq. (6-44). In
fact, this is the condition for a critically damped “oscillation.” At critical
damping the motion is not oscillatory at all. Nor is it simply a decreasing
exponential. In other words, the pure decreasing exponential that Eq.
(6-43) becomes when β²/4 = α cannot describe the solutions to the
differential equation for every possible pair of initial conditions x₀ and
(dx/dt)₀. You can see this physically by visualizing what the motion is like
when a critically damped oscillator is released with x₀ ≠ 0 and (dx/dt)₀ = 0.
Its motion at the very first looks like the beginning of a cosine curve, as in
the beginning part of Figs. 6-18 and 6-19. This is true, independent of the
degree of damping, since while the velocity is very low (as it will be at the
beginning), the damping is not effective in any case. This is not the behavior
exhibited by a decreasing exponential, such as the one plotted in the top
part of Fig. 6-21.

A separate investigation must be made of the analytical solutions to the
differential equation for critically damped motion, and it must be done
again for heavily damped motion, since these solutions are of quite different
forms from the oscillatory form initially assumed in Eq. (6-33). This
can be done, and the following solutions are obtained:

For critically damped motion

$$x = (A + Bt) e^{-\beta t/2} \quad \text{for} \quad \frac{\beta^2}{4} = \alpha$$    (6-45)

For heavily damped motion

$$x = \left(A e^{\sqrt{\beta^2/4 - \alpha}\; t} + B e^{-\sqrt{\beta^2/4 - \alpha}\; t}\right) e^{-\beta t/2} \quad \text{for} \quad \frac{\beta^2}{4} > \alpha$$    (6-46)

In both expressions, A and B are constants whose values can be adjusted to
describe any particular motion.
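The three solution forms can be collected into one function that picks the regime from the sign of α - β²/4 and fits the two constants to given initial conditions. This is a sketch under stated assumptions: Eq. (6-31) is taken to be d²x/dt² = -αx - β dx/dt, the formulas for A, δ, and B in terms of x₀ and v₀ = (dx/dt)₀ were worked out here by setting t = 0 in each form and its derivative, and the function names are illustrative.

```python
import math


def damped_x(t, alpha, beta, x0, v0):
    """Closed-form x(t) with x(0) = x0 and dx/dt(0) = v0, for all three regimes."""
    mu = beta / 2.0
    k = v0 + mu * x0          # combination that appears in every regime
    disc = alpha - mu**2      # its sign selects the regime
    if math.isclose(disc, 0.0, abs_tol=1e-12):       # critical, Eq. (6-45)
        return (x0 + k * t) * math.exp(-mu * t)
    if disc > 0.0:                                   # light, Eq. (6-43)
        omega = math.sqrt(disc)
        A = math.hypot(x0, k / omega)
        delta = math.atan2(-k / omega, x0)
        return A * math.exp(-mu * t) * math.cos(omega * t + delta)
    g = math.sqrt(-disc)                             # heavy, Eq. (6-46)
    A = 0.5 * (x0 + k / g)
    B = 0.5 * (x0 - k / g)
    return (A * math.exp(g * t) + B * math.exp(-g * t)) * math.exp(-mu * t)


def ode_residual(t, alpha, beta, x0, v0, h=1e-3):
    """Central-difference estimate of d2x/dt2 + beta*dx/dt + alpha*x at time t."""
    xm, xc, xp = (damped_x(t + s, alpha, beta, x0, v0) for s in (-h, 0.0, h))
    return (xp - 2.0 * xc + xm) / h**2 + beta * (xp - xm) / (2.0 * h) + alpha * xc


# alpha = 1 with beta = 0.5, 2.0, 3.0 gives light, critical, and heavy damping.
for beta in (0.5, 2.0, 3.0):
    print(beta, damped_x(1.0, 1.0, beta, 1.0, 0.0), ode_residual(1.0, 1.0, beta, 1.0, 0.0))
```

The residual printed in the last column is a finite-difference estimate, so it is small but not exactly zero even when the closed form is right.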

In this section you have studied both numerical and analytical solutions to a
differential equation typical of those that arise in physics. If you consider the procedures
used and results obtained from the two methods, you will get a good idea
of their comparative advantages and disadvantages, when applied to an equation
that can be solved by either method. As you proceed through this book, you will
find both methods employed from time to time, although not for the same equation.
We will solve a differential equation analytically, if it has an analytical solution,
and if it is not too difficult to obtain the analytical solution. In practice, this
means that we will solve most differential equations numerically.

EXAMPLE  6-9 

You are an engineer working on the design of an automobile springing system.
With springs but no shock absorbers installed on the preliminary model, you give
the automobile body an initial downward push to set it into vertical oscillation. You
observe that the oscillation persists for a large number of cycles and has a period of
T = 0.72 s. Next you install trial shock absorbers, give the body another push, and
find that discernible oscillation still persists for several cycles, enough to allow you
to determine that the period is now T = 0.81 s. Thus the trial shocks are not adequate
to produce the desirable condition of critical damping, at which the automobile
recovers from the effect of a bump in the road as rapidly as possible without oscillating.
You cannot afford to find the proper shocks by continuing to experiment
by trial and error. How can you use your data to determine specifications for shocks
which will render the springing system critically damped?

■ Shock absorbers utilize fluid friction arising from a fluid passing through a
number of narrow orifices connecting two chambers, as shown schematically in Fig.
6-22. The differential equation treated in this section applies quite accurately to an
automobile springing system, since the damping is proportional to the first power
of the speed of vertical motion. Thus you may use the relation ω = 2π/T to write
Eq. (6-42) as

$$\omega = \frac{2\pi}{T} = \sqrt{\alpha - \frac{\beta^2}{4}}$$    (6-47)

Fig. 6-22 A schematic drawing of a spring and shock absorber system. (Labeled
parts: chassis, wheel assembly.)


The almost undamped oscillatory motion for the case without shocks implied that
to a good approximation β = 0 for that case. Thus the value of α characteristic of
the springing system and the mass of the automobile can be found from the value
T = 0.72 s observed for the oscillation without shocks. Setting β = 0 in Eq. (6-47),
solving for √α in terms of T, and then squaring, you find

$$\alpha = \left(\frac{2\pi}{T}\right)^2$$

Inserting the value of T gives you

$$\alpha = \left(\frac{2\pi}{0.72\ \text{s}}\right)^2 = 76\ \text{s}^{-2}$$


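The effectiveness of the trial shocks is characterized by the value of the quantity
β. It can be obtained by solving Eq. (6-47) for β:

$$\alpha - \frac{\beta^2}{4} = \left(\frac{2\pi}{T}\right)^2 \qquad \text{so that} \qquad \beta = 2\sqrt{\alpha - \left(\frac{2\pi}{T}\right)^2}$$

Setting α = 76 s⁻² and T = 0.81 s, you find

$$\beta = 2\sqrt{76\ \text{s}^{-2} - \left(\frac{2\pi}{0.81\ \text{s}}\right)^2} = 8.0\ \text{s}^{-1}$$

This is the value of β for the trial shocks.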

For critical damping, Eq. (6-42) or (6-44) shows you need

$$\frac{\beta^2}{4} = \alpha$$

or

$$\beta_{\text{crit}} = 2\sqrt{\alpha} = 2\sqrt{76\ \text{s}^{-2}} = 17.4\ \text{s}^{-1}$$

Thus to obtain critical damping, you must specify that the car be equipped with
shocks which produce a damping force stronger by a factor of

$$\frac{\beta_{\text{crit}}}{\beta} = \frac{17.4\ \text{s}^{-1}}{8.0\ \text{s}^{-1}} \approx 2.2$$

than is produced by the trial shocks. This can be achieved by reducing the number
of orifices by that factor.
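The arithmetic of this example can be retraced in a few lines. The sketch below uses only the two measured periods and Eqs. (6-44) and (6-47); the variable names are illustrative. Because it carries the unrounded value of α through the calculation, the last digit of a result may differ slightly from the rounded values quoted in the example.

```python
import math

T_no_shocks = 0.72   # s, period measured without shock absorbers
T_trial = 0.81       # s, period measured with the trial shock absorbers

# Eq. (6-47) with beta = 0 gives alpha from the undamped period.
alpha = (2.0 * math.pi / T_no_shocks) ** 2

# Eq. (6-47) solved for beta, using the period with the trial shocks.
beta_trial = 2.0 * math.sqrt(alpha - (2.0 * math.pi / T_trial) ** 2)

# Critical damping, Eq. (6-44): beta^2 / 4 = alpha.
beta_crit = 2.0 * math.sqrt(alpha)

print(f"alpha      = {alpha:.1f} s^-2")
print(f"beta_trial = {beta_trial:.2f} s^-1")
print(f"beta_crit  = {beta_crit:.2f} s^-1")
print(f"required increase in damping: factor of {beta_crit / beta_trial:.2f}")
```

Note that the required factor depends only on the ratio of the two measured periods, since both β values scale with 2π/T for the undamped system.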


EXERCISES 

Group  A 

6-1. An inertial method for determining mass. A body of
mass 1.00 kg is suspended from a spring. When the body
is pulled down from its equilibrium position and then released,
the resulting oscillations have a period of 2.00 s.
When the 1.00-kg body is replaced by a body of unknown
mass, the oscillations have a period of 1.00 s. Assuming
that the mass of the spring is negligible, determine the
mass of the unknown body.


6-2. Body on a vertical spring. A vertically suspended
spring (of negligible mass and force constant k) is
stretched by an amount l when a body of mass m is hung
on it. The body is pulled by hand an additional distance y
(positive direction downward) and then released. Show
that the motion of the body is governed by the equation
a = -ky/m, so that the body executes harmonic motion
about its equilibrium position.

6-3. Bird on a spring. A bird cage hung on a spring extends
the spring by 10.0 cm. What is the frequency of oscillation
of the cage about its equilibrium position?

6-4.  Spring  versus  pendulum.  A body  hung  from  a 
spring  stretches  the  spring  a distance  l. 

a.  What  is  the  period  of  the  oscillations  that  this 
system  can  exhibit? 

b.  Show  that  the  period  found  in  part  a is  the  same  as 
the  period  of  small  oscillations  of  a pendulum  of  length  l. 

6-5. How much stretching? A body hanging from a
spring is set into motion, and the period of oscillation is
found to be 0.50 s. After the body has come to rest, it is removed.
When the spring comes to rest, how much shorter
will it be?

6-6. Springs on both sides. Two anchored springs, each
of force constant k, are attached to opposite sides of a block
of mass m. The springs are initially unstretched. The block
is displaced to the right and then released. There is
negligible friction between the block and the supporting
surface.

a. What is the period of oscillation of the block?

b. What would the period be if each spring were
stretched by an amount s when the block was in equilibrium?

6-7. Harmonic motion, I. An object is executing harmonic
motion with a frequency of 5.00 Hz. At t = 0 s its
displacement is x(0) = 10.0 cm and its velocity is v(0) =
-314 cm/s.

a. Use the given information to obtain an analytical
expression for the object’s displacement x(t), velocity
v(t), acceleration a(t).

b. If you have not already done so, express the displacement
in the form x(t) = A cos(ωt + δ). That is, determine
the values of A and δ which are appropriate for
the given information.

c. Find the maximum value of the object’s displacement
x(t), velocity v(t), acceleration a(t).

6-8.  Harmonic  motion,  II.  A body  executes  harmonic 
motion  with  an  amplitude  of  2.00  cm  and  a frequency 
of  3.00  Hz.  At  t = 0 s,  the  displacement  is  x(0)  = 0 cm 
and  the  velocity  v(0)  is  positive. 

a. Obtain an analytical expression for the displacement
x(t), the velocity v(t), the acceleration a(t).

b. Evaluate the expressions found in part a for t =
5.00 × 10⁻² s.


6-9. A harmonic shadow. A circus bear rides a unicycle at
constant speed v around a circle of radius r (see Fig. 6E-9).
A distant ground-level spotlight casts a shadow of the bear
onto a vertical wall which is perpendicular to the spotlight
beam. Show that the bear’s shadow executes harmonic
motion with angular velocity ω given by ω = v/r.


Group  B 

6-10. A search for equilibrium. The pulleys shown in
Fig. 6E-10 are frictionless, the string is massless and inextensible,
and M = 10m.

a. Do you expect that the body of mass m has an equilibrium
position in which the string from A to B is not
pulled into a straight line? Explain your answer.

b. If you believe that there is a position of equilibrium,
is the equilibrium stable? Explain your answer.


Fig.  6E-10 

6-11. Object on a board. A massive object is placed on
the middle of a thin board supported at both ends, causing
the board to sag by a distance d at its midpoint. Assume
that the board has negligible mass and that it exerts an
upward force proportional to its sag. Find the period of
oscillation of the object.

6-12. Cut in two. An object suspended from a spring
exhibits oscillations of period T. Now the spring is cut in
half, and the two halves are used to support the same object,
as shown in Fig. 6E-12. Show that the new period of
oscillation is T/2.


Fig.  6E-12 


6-13.  A compound  spring.  Springs  A and  B have  spring 
constants  of  2000  N/m  and  1000  N/m,  respectively. 
Spring  A is  hung  from  a rigid  horizontal  beam  and  its 
other  end  is  attached  to  an  end  of  spring  B.  The  pair  of 
springs  is  then  used  to  suspend  a body  of  mass  50  kg  from 
the  lower  end  of  spring  B.  What  is  the  period  of  harmonic 
oscillation  of  the  system? 

6-14. Swinging around an obstacle. A pendulum has a
period T for small oscillations. An obstacle is placed
directly beneath the pivot, so that only the lowest one-quarter
of the string can follow the pendulum bob when it
swings to the left of its resting position. The pendulum is
released from rest at a certain point. How long will it take
to return to that point? In answering this question, you
may assume that the angle between the moving string and
the vertical is a small angle throughout the motion.

6-15. Population growth. When nutrients are not a limiting
factor, the number of bacteria produced in each generation
of growth is proportional to the number present
in that generation.

a. Show that the number N present at any time t satisfies
the differential equation dN/dt = RN, where R is a
positive constant.

b. Find an analytical solution to this differential
equation.

6-16. Radioactive decay. In a sample of radioactive nuclei,
the total number decaying per second is proportional
to the number present.

a. Show that the number N of undecayed nuclei remaining
at any time t satisfies the differential equation
dN/dt = -RN, where R is a positive constant.

b. Find an analytical solution to the differential equation.

6-17. Suspended. As shown in Fig. 6E-10, a body of
mass m is attached to the center of a string which is kept
taut by the weight of a body of mass M ≫ m. There is a
length l of string between the fixed end A and the
pulley B; the mass of the string itself is negligible. The body
of mass m is pushed downward a distance d ≪ l and
then released. Show that it executes a harmonic up-and-down
motion whose period is π√(ml/Mg).

6-18. Body on a string on a spring. A body of mass m is
attached by a string to a suspended spring of spring constant
k. Both the string and the spring have negligible
mass, and the string is inextensible (it has a fixed length).
The body is pulled down a distance A and then released.

a. Assuming that the string remains taut throughout
the motion, find the maximum (downward) acceleration
of the oscillating body.

b. The string will remain taut only as long as it remains
under tension. Determine the largest amplitude
A_max for which the string will remain taut throughout the
motion.

c. Evaluate A_max for m = 0.10 kg and k = 10 N/m.

6-19. Hold on! A massive block resting on a tabletop
is attached to an anchored horizontal spring. There is
negligible friction between the massive block and the tabletop.
The oscillation frequency is ν. A much less massive
block is placed on top of the large block. The coefficient of
static friction between the two blocks is μ_s.

a. What is the largest amplitude of oscillation of the
large block which permits the small block to ride without
slipping?

b. Evaluate your result for ν = 3.0 Hz and μ_s = 0.60.

6-20. The effects of damping. The parameter α for a
damped oscillator has the value 1.00 s⁻². At t = 0 s, the oscillator
is set into motion with x₀ = 0 m and v₀ = 5.00 m/s.
Use the appropriate equation to write an analytical expression
for x(t) in each of the cases listed.

a. No damping, β = 0 s⁻¹

b. Light damping, β = 0.10 s⁻¹, 1.00 s⁻¹

c. Critical damping, β = 2.00 s⁻¹

d. Heavy damping, β = 3.00 s⁻¹, 10.0 s⁻¹

e. Use the expressions obtained in parts a to d to find
the maximum value of x(t) attained in each case. Express
each result as a fraction of the undamped amplitude
v₀/√α = 5.00 m.


Group  C 

6-21. Two bodies, joined by a spring. Body 1 and body 2
of masses m₁ and m₂ respectively are resting on a frictionless
horizontal surface. They are joined by a spring of
spring constant k. Initially the spring is relaxed. The two
bodies are pushed closer together, compressing the spring
by an amount A, and then they are simultaneously released
from rest.

a. Determine the subsequent motion of the system,
including the period of the oscillations, the relative velocities
of bodies 1 and 2, and the amplitudes of motion of
bodies 1 and 2.

b. Show that your results take the correct form for
m₁ ≫ m₂ and for m₁ ≪ m₂.

c. Evaluate your results for m₁ = m₂ = m. Why is the
period shorter than 2π√(m/k)?

6-22. Piggyback. A block of mass M₁ resting on a frictionless
horizontal surface is connected to a spring of
spring constant k that is anchored in a nearby wall. A block
of mass M₂ = αM₁ is placed on top of the first block. The
coefficient of static friction between the two bodies is μ_s.

a. Assuming that the two bodies move as a unit, find
the period of oscillation of the system.

b. What is the maximum oscillation amplitude A_max
that permits the two bodies to move as a unit?

c. Evaluate your results of parts a and b for k =
6.0 N/m, M₁ = 1.0 kg, α = 0.50, and μ_s = 0.40.

d. Describe qualitatively what happens if the two
bodies are released from rest at a coordinate value somewhat
greater than A_max.

6-23. Time flies. A recording pendulum clock is installed
in a rocket, which is then fired vertically with a constant
acceleration a = 3.0g for 15 s, when burnout
occurs. The rocket continues upward with diminishing
speed until its velocity is zero. It then reverses its path,
falling freely to the ground. It is possible to neglect the
variation of the earth's gravity with altitude, and air resistance
is negligible throughout the flight. Although the
clock is smashed upon impact, the record is recovered.

a. What is the actual duration of the flight?

b. What duration is indicated by the record?

6-24. A springy spherical pendulum. As shown in Fig.
6E-24, a body of mass m is suspended from a frictionless
pivot by a massless spring of relaxed length l0 and spring
constant k. Determine the possible horizontal circular
motions of the body.


Fig. 6E-24

6-25. Oscillations of a water column. A U-tube of uniform
cross section A contains a length l of water. Initially
the water is in equilibrium, as shown in Fig. 6E-25a. Then
the tube is tilted to the left until the water attains a new
equilibrium. Finally it is turned upright very quickly. As a
result, the water level on the right is a distance y below its
equilibrium level, as shown in Fig. 6E-25b. At that instant
the fluid is motionless, but evidently it will not remain so.

a. What is the difference D in water level between the
left and right sides of the tube, for the situation of Fig.
6E-25b?


b. The water has uniform density ρ. What is the total
mass Mw in the tube?

c. It is shown in Chap. 16 that a restoring force F =
-ρgAD acts on the water column. Use this fact and your
result for part a to write F in terms of ρ, g, A, and y.

d. If we assume that when the level changes, all the
water moves with speed |dy/dt|, then Newton’s law takes
the form Mw d²y/dt² = F. Use this to determine the period
of oscillation of the water column.

6-26. Buoy, oh buoy! A rectangular block of wood is
floating in a large pool of water. It is shown in Chap. 16
that the water exerts an upward force on the bottom face
of the block whose strength is Adρg, where A is the area of
that face, d is its depth beneath the surface of the water, ρ
is the density of water, and g is gravitational acceleration.
(This is Archimedes’ principle: The loss of weight of a body
immersed partly or completely in a liquid equals the
weight of the liquid displaced by the body.)

a.  The  mass  of  the  block  is  m.  Find  the  value  of  d for 
which  the  block  is  in  equilibrium. 

b. Find an expression that describes the net force
acting on the block for values of d that differ from the
equilibrium value. Use it to show that the equilibrium is
stable.

c. Show that if the block is depressed below its equilibrium
depth (but not beneath the surface of the water)
and then released, it will execute harmonic oscillations.

d. Determine the frequency of the oscillations.

6-27. Critical damping, I. Prove by substitution that
the analytical form given in Eq. (6-45) is a solution to
the damped oscillator equation, Eq. (6-31), for the case
of critically damped motion.

6-28. Heavy damping. Prove by substitution that the
analytical  form  given  in  Eq.  (6-46)  is  a solution  to  the 
damped  oscillator  equation,  Eq.  (6-31),  for  the  case  of 
heavily  damped  motion. 

Numerical 

6-29.  Body-and-spring  program,  I. 

a. Use the analytical result in Eq. (6-17), with the initial
conditions and parameters of Example 6-1, and evaluate
to five decimal places the position x of the body at the
end of the spring when t = 4 s.

b.  Run  the  body-and-spring  program  as  in  Example 
6-1  and  obtain  x to  five  decimal  places. 

c.  Compare  the  results  of  parts  a and  b to  determine 
the  error  in  the  numerical  method. 

6-30. Body-and-spring program, II.

a. Follow the procedure of Exercise 6-29, using a reduced
time increment Δt = 0.1 s in the body-and-spring
program. Determine the error in the numerical method
for this value of Δt.

b. Compare with the error obtained in Exercise 6-29.
What relation do you find between the error in the numerical
method for treating the body and spring and the
size of Δt? In Exercise 5-43 the error in the numerical
method for treating the skydiver was found to be inversely
proportional to Δt. If this is not the case here, can you explain
why?

6-31. Body-and-spring program, III. Run the body-and-spring
program with x0 = 0.5 m, v0 = 0.5 m/s, t0 =
0 s, Δt = 0.2 s, and α = 1 N/(m·kg). Discuss your results.

6-32. Body-and-spring program, IV. Run the body-and-spring
program with the same initial conditions as in
Example 6-1, but with α = 0.125 N/(m·kg), and use the
results to add another point to Figs. 6-7 and 6-8. Should
you use the same value of Δt as in Example 6-1?

6-33. Body-and-spring program, V. Run the body-and-spring
program with the same initial conditions as in Example
6-1, but with α = 8 N/(m·kg), and use the results to
add another point to Figs. 6-7 and 6-8. Should you use the
same value of Δt as in Example 6-1?

6-34. Pendulum program, I. Run the pendulum program
with the same initial conditions and parameters as in
Example 6-4, except that the initial angle is φ0 = 2 rad.
Use the results to add another point to Fig. 6-14.

6-35. Pendulum program, II. Run the pendulum program
with the same initial conditions and parameters as in
Example 6-4, except that the initial angle is φ0 = 3 rad.
Use the results to add another point to Fig. 6-14.

6-36.  Pendulum  versus  harmonic  oscillator. 

a. Run the pendulum program with the same initial
conditions and parameters as in Example 6-4, except that
the initial angle is φ0 = 3 rad. Plot φ for the first
quarter-cycle, and determine the period of the pendulum.

b. Use Eq. (6-25a) to find the value of the parameter
α for a harmonic oscillator that will make it have the same
period as the pendulum motion found in part a. Plot x for
the first quarter-cycle of a harmonic oscillator with that
value of α and with x0 = 3 m and (dx/dt)0 = 0. (The
quickest way to do this is to run the body-and-spring program
with the appropriate initial conditions and parameters.)

c. Compare the sinusoidal function describing the
harmonic oscillator motion with the oscillatory, but nonsinusoidal,
function describing the pendulum motion. Explain
their similarities and their differences.

6-37.  Pendulum  program,  III.  A pendulum  is  released 
from  rest  with  the  cord  horizontal.  The  length  of  the  cord 
is  0.5  m.  Run  the  pendulum  program  to  determine  the 
speed  of  the  pendulum  bob  at  an  instant  when  the  cord  is 
vertical. 

6-38. Reaching the top. As shown in Fig. 6E-38, a pendulum
is hanging at one end of a rod of length 2 m.
The other end of the rod is mounted on a frictionless axle.
The mass of the rod is negligible.

a. The bob is struck sharply, giving it an initial speed
v. Use the pendulum program to determine, by trial and
error, the value of v for which the bob will almost reach
the point X (directly above the axle) before it reverses its
motion.

b. What is the motion of the bob for an initial speed
greater than that found in part a?


Fig. 6E-38


6-39. Critical damping, II. Run the damped oscillator
program with α = 1 N/(m·kg) and various values of β.
Using x0 = 1 m and (dx/dt)0 = 0, find the largest β for
which x(t) makes just one, barely perceptible, negative
swing. Plot x versus t for this so-called critically damped
case, and compare it with the heavily damped case plotted
in Fig. 6-19.

6-40. Near-critical damping. Use a value of β that is 10
percent smaller than the value found in Exercise 6-39,
and measure the period of the damped oscillations. How
does it compare with the undamped oscillator having the
same value of β?

6-41. Anharmonic oscillations, I. Modify the body-and-spring
program to obtain a numerical solution to the equation
of motion of an anharmonic oscillator in which the
force acting on the body is -kx - px³. This describes a
situation where the spring to which the body is connected
becomes more stiff (if p > 0) or less stiff (if p < 0) with increasing
extension or compression. Run an example.
There is no analytical solution to the equation.
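
One possible starting point for Exercise 6-41 is sketched below in Python. It is only an illustration under stated assumptions: the function name, the parameter values, and the choice of the velocity-Verlet stepping scheme are all hypothetical, and the scheme is not necessarily the one used in the text's body-and-spring program.

```python
import math


def simulate_anharmonic(x0, v0, k, p, m, dt, t_end):
    """Integrate m * d2x/dt2 = -k*x - p*x**3 and return (t, x, v) samples.

    Uses the velocity-Verlet scheme, which is an assumption of this sketch
    and not necessarily the method of the text's body-and-spring program.
    """

    def accel(x):
        # Acceleration from Newton's second law with the anharmonic force.
        return (-k * x - p * x**3) / m

    n_steps = int(round(t_end / dt))
    t, x, v = 0.0, x0, v0
    a = accel(x)
    samples = [(t, x, v)]
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt   # advance the position
        a_new = accel(x)                     # acceleration at the new position
        v = v + 0.5 * (a + a_new) * dt       # advance the velocity
        a = a_new
        t = t + dt
        samples.append((t, x, v))
    return samples


if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    for p in (0.0, 0.5):
        t, x, v = simulate_anharmonic(1.0, 0.0, k=1.0, p=p, m=1.0,
                                      dt=0.001, t_end=2.0)[-1]
        print(f"p = {p}: x({t:.1f} s) = {x:.5f} m")
```

Setting p = 0 recovers the harmonic oscillator, so the output for that case can be compared with the analytical cosine solution as a check on the step size before a nonzero p is tried.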

6-42.  Anharmonic  oscillations,  II. 

a. Modify the body-and-spring program to solve
numerically the equation of motion of a body acted on by
the anharmonic force -kx + px², where p > 0. This force
law describes a spring that is stiffer when compressed (x <
0) than when extended (x > 0) by the same amount.
There is no analytical solution to this equation of motion,
which is used to model the oscillations in the center-to-center
separation of the atoms of a diatomic molecule. Insert
into the program steps which evaluate the time
average of x over an entire cycle of the motion.

b. Run the program for several different maximum
values of x. Show that the average value of x increases with
increasing amplitude of the motion. This feature plus the
fact that the amplitude of oscillation of a molecule increases
with temperature makes the model useful in
describing the thermal expansion of solids made up of
such molecules.




7 

Energy  Relations 


7-1 A PREVIEW OF ENERGY RELATIONS

Two of the basic sets of tools of mechanics are now at our disposal. The
first set consists of Newton’s laws of motion. We can use these laws to write
an equation for the acceleration of any body acted on by any system of
forces. The second set of tools consists of the techniques for finding the solutions
to that equation. The solutions determine how the body moves
when it is started in a particular way. We can almost always obtain these solutions
by using numerical methods, and in many cases we can apply analytical
methods. Thus we can now, in principle, study the motion of almost
any mechanical system by direct application of Newton’s laws.

In practice, however, it is often much easier to analyze mechanical
behavior by applying other relations. A very important set of such relations
involves a quantity called energy. As you will see, using energy relations is
really a matter of using Newton’s laws indirectly. The energy relations are
not independent of Newton’s laws. Rather they reexpress these basic laws
in a way which makes it possible to answer certain questions about mechanical
systems very easily.

Consider a pendulum. As straightforward as it might seem to be from a physical
point of view, the motion of a pendulum can be quite complicated mathematically.
If you apply Newton’s laws directly to obtain the equation determining the
motion, you find a very difficult equation, a nonlinear differential equation. To
obtain the position of the pendulum as a function of time, this equation is best
solved by numerical methods on a programmable pocket calculator or a computer. When
it is solved, you can predict all aspects of the behavior of the pendulum. But that
takes some doing.

There are certain aspects of the behavior of the pendulum which can be
treated in a simple way by applying energy relations. Let’s say you give the pendulum
bob some large initial displacement from its stable equilibrium position
and then release it when it is at rest. How fast will the bob be moving when it goes
through the bottom of its swing? The situation is shown in the strobe photo of
Fig. 7-1.

Fig. 7-1 Strobe photo of a pendulum.

One way you can answer the question is to obtain the appropriate numerical
solution to the pendulum differential equation. But there is a much easier way.
You will soon see that energy relations can be used to predict quite directly how
fast the pendulum bob moves through the bottom of its swing if you know its initial
displacement and speed. On the other hand, these relations have limitations.
You cannot use them, for instance, to answer the question: How much time does it
take for the bob to go from its initial position to the bottom of its swing? That can
be answered only by solving the differential equation.
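
The contrast between the two routes can be sketched in Python. This is a minimal illustration, not the text's pendulum program. It assumes the standard energy result that a bob released from rest and descending through a height h arrives with speed √(2gh), together with g = 9.8 m/s², a release angle between 0 and π rad, and a velocity-Verlet stepping scheme. The function names and parameter values are hypothetical.

```python
import math

G = 9.8  # m/s^2, assumed magnitude of the gravitational acceleration


def bottom_speed_from_energy(length, phi0):
    """Speed at the bottom of the swing for release from rest at phi0 (rad).

    The bob descends a height h = length * (1 - cos(phi0)); the assumed
    energy result v = sqrt(2*g*h) then gives the speed directly.
    """
    h = length * (1.0 - math.cos(phi0))
    return math.sqrt(2.0 * G * h)


def bottom_speed_numerical(length, phi0, dt=1e-4):
    """Step d2phi/dt2 = -(g/length)*sin(phi) until the bob reaches phi = 0.

    Requires 0 < phi0 < pi. Uses velocity-Verlet steps (an assumption of
    this sketch) and returns the speed length * |dphi/dt| at the bottom.
    """
    phi, omega = phi0, 0.0
    alpha = -(G / length) * math.sin(phi)
    while phi > 0.0:
        phi = phi + omega * dt + 0.5 * alpha * dt * dt
        alpha_new = -(G / length) * math.sin(phi)
        omega = omega + 0.5 * (alpha + alpha_new) * dt
        alpha = alpha_new
    return length * abs(omega)


if __name__ == "__main__":
    # Hypothetical pendulum: 1.0 m cord released from rest at 1.0 rad.
    print(bottom_speed_from_energy(1.0, 1.0))
    print(bottom_speed_numerical(1.0, 1.0))
```

The energy route is a single line of arithmetic, while the numerical route takes several thousand steps to reach the same speed. Only the numerical route, however, also keeps track of the elapsed time.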

A question  which  can  be  answered  even  more  easily  by  applying  the  energy 
relations  is  the  following:  Given  the  same  initial  conditions  as  above,  what  is  the 
displacement  of  the  pendulum  bob  when  it  is  instantaneously  at  rest  at  the  top  of 
its  swing  on  the  side  opposite  the  initial  side?  By  using  energy  relations  you 
can  say  that  its  final  height  will  be  the  same  as  its  initial  height,  and  hence  its  final 
displacement  will  be  of  the  same  magnitude  as  its  initial  displacement,  although 
of  opposite  sign.  Of  course,  you  can  also  make  that  prediction,  without  knowing 
anything  about  energy,  by  invoking  the  symmetry  of  the  system. 

But consider an interrupted pendulum, as shown in Fig. 7-2. When the bob is
at the bottom of its swing, the cord supporting it is intercepted by a horizontal rod.
The bob continues its swing, with the effective length of the cord reduced. The
question remains the same: Given the initial displacement of the bob, what is its
extreme displacement on the other side? Symmetry will not help you here. However,
the energy relations remain useful. You will see that here again they require
the final displacement to be related to the initial displacement in such a way that
the final height of the bob equals its initial height.

Fig. 7-2 Strobe photo of an interrupted pendulum.

You could also answer the question by joining numerical solutions to the differential
equations for a long pendulum and a short pendulum. But the complications
of such a direct application of Newton’s laws would be great, and are unnecessary
to answer the question posed.


Fig. 7-3 A box being pushed across the
floor. The force applied to produce the
motion is labeled F, but none of the
other forces acting on the box are
shown. The displacement of the box resulting
from the application of the force
is labeled x.


Now that we have given a hint of the utility of the energy relations, we
will begin to develop them. In this section we take a special case in which an
object moves while a force acts on it that is constant in both direction and
magnitude. Taking the force to be constant makes the physics and mathematics
as simple as possible. This will allow us to go through the developments
of energy concepts and their relations quickly. The primary purpose
of this section is to acquaint you with the energy relations and to give you
an idea of how they fit into an overall pattern. This will help you follow the
detailed treatments of general cases presented in subsequent sections.

The concept of work is basic to the energy relations. When you apply a
constant, horizontally directed force to push a heavy box across the floor,
the box moves in the same direction as the direction of the force. See Fig.
7-3. The force you apply is said to do an amount of work on the box equal
to the product of the applied force and the displacement of the object to
which the force is applied. In this one-dimensional situation we can use
signed scalars to specify directed quantities, like force and displacement.
Let the positive x axis extend along the direction of motion of the box, with
its origin at the initial location of the box. Then the displacement of the box
when it has arrived at some subsequent location is given by its coordinate x,
which is positive. The force applied to the box is specified by the value of F.
This quantity is also positive because the force is applied in the positive x
direction. If we use these symbols, and the symbol W for work, the work
that the force does on the box is defined to be

W = Fx  for  constant  force  acting  along  straight  path  (7-1) 

Say the force you apply is F = 100 N and the displacement of the box is
x = 10 m. Then the work that this force does on the box is W = Fx = 100
N × 10 m = 1000 N·m. The work unit, a newton-meter, is used so frequently
that it is given its own name. It is called a joule (abbreviated J), in
honor of the English physicist James Prescott Joule (1818-1889). So

1 J = 1 N·m  (7-2)

and the work done in moving the box is written W = 1000 J.

Most people will agree that this is a reasonable definition of work because it
conforms to the everyday use of the word. If you continue pushing on the box, you
will soon begin to feel the physiological sensation of having performed work. Furthermore,
your degree of fatigue will be proportional, roughly speaking, to the
force applied and also to the distance moved.

But consider a counterexample. If you push vigorously on a rigid wall, and
keep pushing, you will also soon feel fatigue. To use common language, you will
have been working. But according to the physical definition you will have done
no work. Although a force is applied, no displacement occurs. So the x in the
equation W = Fx has the value zero, and therefore W is zero. This example emphasizes
that in physics “work” means the quantity W defined (in the circumstances
considered here) by the equation W = Fx. It should not be confused with
the common use of the word.


The work W = Fx done by the force F you apply to the box to give it a
displacement x across the floor is positive because both F and x have the
same sign. But there are many situations in which a force is applied to an
object in the direction opposite to the direction of the object’s displacement.
An example is found in the force of kinetic contact friction Ck which
the floor applies to the box. This force acts in the direction opposite to the
direction in which the box is displaced. Thus Ck has a negative value because
the displacement x has a positive value. As a consequence, the work
W = Ckx done by the frictional force will be negative. For a specific example,
say that you push the box across the floor at a low and constant
speed. Then the force you apply to the box is of the same strength as the
frictional force that the floor applies to it, except at the very beginning of
the displacement where you apply a bit of excess force to start the box
moving. This must be so since the box does not accelerate, except at the
beginning. The work done by the frictional force thus has the value W =
Ckx = -100 N × 10 m = -1000 J.

When the box is moving across the floor with you pushing on it with a
force in the direction of its displacement, and the floor applying a frictional
force to it of the same magnitude in the direction opposite to its displacement,
no net work is being done on the box. The negative work done on
the box by the frictional force just cancels the positive work done by the
force you apply. Furthermore, since there is no net force acting on the box
in these circumstances, the box has no acceleration and so continues to
move slowly across the floor.

On the other hand, if you push on the box with a force of magnitude
greater than that of the frictional force, then the force you apply will do
more positive work than the frictional force does negative work. In such a
case, net positive work is done on the box because there is a net force acting
on it in the direction of its displacement. In fact, you can show easily that
the net work done is just the constant net force in the direction of the displacement
multiplied by the displacement. Furthermore, with a net force
applied to the box Newton’s second law says it has an acceleration. The
acceleration causes the speed of the box to increase throughout the displacement,
and at the end of the displacement the box will be moving rapidly.
Thus the effect of the positive work done on the box is to increase its
speed.
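
The bookkeeping of positive, negative, and net work can be written out in a few lines of Python. This is only an illustrative sketch: the function name is hypothetical, and the 120 N push is an assumed value that does not appear in the text. The 100 N forces and the 10 m displacement are the values used above.

```python
def work_constant_force(force, displacement):
    """Work W = F*x done by a constant force along a straight path (Eq. 7-1).

    Both arguments are signed scalars measured along the same x axis:
    force in newtons, displacement in meters. The result is in joules.
    """
    return force * displacement


x = 10.0            # displacement of the box, m (value used in the text)
applied = 100.0     # force you apply, N (value used in the text)
friction = -100.0   # kinetic friction force, N (value used in the text)

w_applied = work_constant_force(applied, x)      # positive work
w_friction = work_constant_force(friction, x)    # negative work
w_net = w_applied + w_friction                   # the two cancel

# Hypothetical stronger push, not a value from the text.
stronger = 120.0
w_net_stronger = (work_constant_force(stronger, x)
                  + work_constant_force(friction, x))
# The same net work follows from the net force times the displacement.
w_from_net_force = work_constant_force(stronger + friction, x)

print(w_applied, w_friction, w_net)
print(w_net_stronger, w_from_net_force)
```

With the stronger push, adding the separate works and multiplying the net force by the displacement give the same result, as claimed in the text.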

We will find the relation between the work done on an object and the
change in its speed by considering not a box on a floor, but a puck on the
top of an air table. The advantage of the air table is the absence of friction.
When you apply a horizontally directed force to an air table puck, this force
is the net force acting on the puck in that direction. As is illustrated in Fig.
7-4, we again take the positive x axis along the direction of the force you
apply, and of the displacement it produces. So the acceleration of the puck
is the positive quantity a, because the acceleration is in the direction of the
applied force. We also take the origin of the axis to be at the position of the
puck at the instant t = 0 when you begin to apply the force to the puck, and
we assume that before this instant the puck was at rest. Then its position
coordinate and velocity have the values x = 0 and v = 0 at t = 0. When you
have pushed the puck through a displacement x (that is, when its coordinate
has the positive value x), the positive acceleration a has resulted in an
increase in its velocity to the positive value v.

The work you do in this process can be evaluated by using the relations
that describe the motion resulting from a constant net force. According to
Newton’s second law, the acceleration a resulting from this force is constant.
So we can employ the kinematical equations for motion with constant
acceleration a to find the puck’s velocity v when the displacement of the
puck from its initial position is x. Setting xi = 0 and vi = 0 in Eq. (2-32), we
obtain

$$x = \frac{v^2}{2a} \tag{7-3}$$


The work you do on the puck by applying the constant net force F to
give it the displacement x in the direction of the force is, by definition,

W = Fx 


Fig.  7-4  A puck  being  pushed  across  the  top  of  an  air  table  by  a force 
labeled  F.  This  is  the  only  force  acting  in  a horizontal  direction,  so  the 
puck  accelerates  in  the  direction  of  the  force.  The  quantity  x gives  the 
position  of  the  puck  relative  to  its  initial  position,  x = 0.  In  other  words, 
x is  the  displacement  of  the  puck  from  the  position  it  had  before  the  force 
was  applied.  Forces  acting  in  vertical  directions  are  not  shown. 




Employing Eq. (7-3), we can write this as

$$W = \frac{Fv^2}{2a}$$


Now  we  make  explicit  use  of  Newton’s  second  law  to  express  a in  terms 
of  the  net  force  F applied  to  the  puck  and  the  mass  m of  the  puck.  That  is, 
we  write 


F = ma 


Thus the work done on the puck can be expressed as

$$W = \frac{mav^2}{2a}$$

or

$$W = \frac{mv^2}{2} \tag{7-4}$$


We have expressed the work done by the net force acting on the initially
stationary puck during its displacement in terms of the mass of the
puck and its speed at the end of the displacement. The word “speed” is
used, instead of “velocity,” because the value of v² is independent of the
sign of v, and so the value of W depends on only the magnitude of v. The
work done depends on the magnitude of the velocity, but not on its direction.
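
Eq. (7-4) can be checked numerically with a short Python sketch. The function names and the puck values below are hypothetical, chosen only for illustration. The sketch assumes, as in the derivation above, a puck that starts from rest under a constant net force.

```python
import math


def final_speed(force, mass, displacement):
    """Speed of a body that starts from rest and moves through
    `displacement` under a constant net `force`.

    Uses Newton's second law, a = F/m, and the constant-acceleration
    relation v**2 = 2*a*x for a start from rest.
    """
    a = force / mass
    return math.sqrt(2.0 * a * displacement)


def work_from_speed(mass, speed):
    """Right-hand side of Eq. (7-4): W = m*v**2/2."""
    return 0.5 * mass * speed**2


# Hypothetical values for an air table puck, chosen only for illustration.
F, m, x = 0.30, 0.050, 0.80   # N, kg, m
v = final_speed(F, m, x)

# The definition W = F*x and Eq. (7-4) should agree to within rounding.
print(F * x, work_from_speed(m, v))

# Reversing the direction of motion leaves the result unchanged,
# since only v**2 enters.
print(work_from_speed(m, -v))
```

Because the acceleration cancels in the algebra, any positive choice of F, m, and x gives the same agreement between the two expressions.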


When  the  puck  is  speeding  across  the  top  of  the  air  table  after  you 
have  done  work  on  it,  the  puck  has  an  attribute  that  it  did  not  have  when  it 
was  at  rest.  The  new  attribute  is  its  ability  to  do  work  on  some  other  object, 
if  it  interacts  with  that  object.  For  instance,  if  the  puck  hits  the  head  of  a 
nail  sticking  out  of  a block  of  wood  fastened  to  the  edge  of  the  air  table, 
the  puck  can  drive  the  nail  into  the  block.  By  assuming,  for  simplicity,  that 
the  force  exerted  by  the  puck  on  the  nail  is  constant  while  the  latter  is  being 
driven  into  the  block,  and  that  a drop  of  instant-setting  glue  on  the  nail 
head  prevents  the  puck  from  rebounding,  you  should  be  able  to  calculate 
how  much  work  the  puck  does  on  the  nail.  You  use  Newton’s  third  law  to 
equate  the  negative  of  the  force  exerted  by  the  puck  on  the  nail  to  the  force 
exerted  by  the  nail  on  the  puck.  Then  you  make  a calculation  involving  the 
mass  and  acceleration  of  the  puck  which  is  just  like  the  one  leading  to  Eq. 
(7-4),  except  for  the  sign  and  magnitude  of  the  acceleration.  The  result  is 
identical  to  that  quoted  in  Eq.  (7-4)  because  the  acceleration  cancels  in  the 
equation  immediately  preceding  it.  Thus  you  will  find  that  the  work  done 
by the puck as it comes to rest is just equal to the work done on the puck in
bringing  it  up  to  speed. 

Whenever something has the ability to do work, we say that it has energy.
There is more than one reason why something can have this ability.
For the puck we are considering, the reason is that it is moving. The energy
that a body has by virtue of its motion is called its kinetic energy. The word
“kinetic” comes from the Greek kinetikos, which means “moving.” We will
use the symbol K for kinetic energy. Its value equals the work done by the
net force acting on the body to set it into motion. That is, K equals the
quantity W evaluated in Eq. (7-4). So the expression

$$K = \frac{mv^2}{2} \tag{7-5}$$

gives the value of the kinetic energy K of a body of mass m that is moving with
velocity v. Note again that the sign of v is of no consequence in determining
the value of K. The kinetic energy of a body depends on its speed, but not
on the direction of its motion.


Fig. 7-5 A puck being raised very
slowly from the floor by the application
of an upward force F of the same magnitude
as the downward gravitational
force mg exerted on the puck by the
earth. The quantity x gives the position
of the puck relative to its initial position
on the floor, where x = 0, and also its
displacement from that position.


Now we consider a second experiment in which you apply a constant
force to a puck and it is displaced in the direction of that force. But the
work that the force does on the puck will give the puck a different kind of
energy because the circumstances are different from what they were in the
first puck experiment.

What you do is to raise the puck vertically from the floor to a certain
height and then hold it there. At the beginning of the displacement you
momentarily apply an upward force to the puck which is slightly stronger
than the downward force that the earth applies to it through gravity. This
starts the puck moving upward very slowly. Then you keep the strength of
the force you apply just equal to that of the gravitational force, so that the
puck maintains its very low-speed, upward motion. At the end of the displacement
you stop the puck by momentarily making the force you apply
to it slightly weaker than the gravitational force. The experiment is depicted
in Fig. 7-5.

As is indicated in the figure, we define an x axis whose positive direction
is upward and whose origin is at the level of the floor. Then the displacement
of the puck is just the positive quantity x, the value of its coordinate
at the end of the displacement. The magnitude of x is the height of the
puck above floor level at the end. Except at the very beginning and end of
the displacement, the upward force that you apply to the puck in order to
cancel the downward force of gravity has the positive value F = mg. Here m
is the mass of the puck, and g is the magnitude of the gravitational acceleration.
So the positive work you do on the puck in elevating it is given by the
expression

W = mgx  (7-6) 


The elevated puck has a property that it did not have when resting on
the floor. The new property is the ability of the puck to do work on some
other object in the process of returning to the floor. In other words, the
puck has energy. Because of its energy, the puck can now do as much work
on you as you did on it in elevating it. Reverse the experiment by very
slowly letting the puck push your hand down to the floor. The downward
force that the puck applies to your hand has the value -mg, and the downward
displacement of your hand has the value -x. So the positive work
done by the puck on you is W = (-mg)(-x) = mgx. The work the puck
does on you when it descends just equals the work you do on it when it ascends.
Thus when you lift the puck against the gravitational attraction of
the earth, you give it energy, and it then has the ability to do as much work
as you did on it to lift it. The work can be done on you, as in the experiment
just described, or on some other object.

The energy that a body has by virtue of its position is called its potential
energy. It is said that the puck has gravitational potential energy when
it is at a position above the floor, instead of at its reference position on the
floor. The word “potential” is used because the puck is potentially able to
do work in returning from its position above the floor to its reference position
on the floor.


258  Energy  Relations 


A more  complete  statement  is  that  the  system  of  the  puck  plus  the  earth 
has  gravitational  potential  energy  when  the  puck  is  at  a position  higher 
than  its  reference  position.  The  potential  energy  is  really  stored  in  the  puck- 
plus-earth  system,  not  in  the  puck  alone.  The  reason  is  that  the  force  which 
the  stationary  (or  very  slowly  moving)  puck  can  apply  to  some  other  object 
is  a result  of  the  gravitational  force  which  the  earth  exerts  on  the  puck.  So 
the  work  that  can  be  done  on  an  object  by  the  force  which  the  puck  can 
exert  on  it  is  a consequence  of  two  things:  (1)  The  puck  is  a member  of  a 
system  whose  other  member,  the  earth,  exerts  a force  on  it;  hence  the  puck 
can, in turn, exert a force on an object external to the system. (2) The position
of the puck is such that the force which it exerts on the external object
does  positive  work  as  the  puck  returns  to  its  reference  position. 

The  symbol  that  we  will  use  for  potential  energy  is  U.  Thus  we  say  that 
when  the  puck  is  at  vertical  position  x above  the  floor,  the  puck-plus-earth 
system  has  gravitational  potential  energy  U,  with  reference  to  the  situation 
when x = 0. The value of U just equals the work W done in elevating the
puck  from  position  x = 0 to  position  x.  Using  Eq.  (7-6)  to  evaluate  W,  we 
have 

U = mgx  (7-7) 

In  this  expression  the  floor  plays  the  role  of  a reference  position.  That  is, 
the  vertical  position  x is  measured  relative  to  the  floor-level  position  x = 0, 
and the potential energy U is measured relative to the value U = 0 that it
has  when  x = 0. 

No matter how the puck got to the position x (whether you or someone else
put it there, whether you lifted it directly or overshot and then lowered it back to
that position), the system has the same amount of gravitational potential energy.
This must be so because in all cases the gravitational force that the system produces
will do the same amount of work if the puck returns from position x to the
reference position x = 0. It can be so because the work done by the force that is applied
to the system to displace the puck from position x = 0 to position x is independent
of the path the puck follows between these two positions. Show that this
is true for a case in which you raise the puck vertically upward to a position x'' that
is higher than position x, and then lower it vertically downward to position x. You
should have no difficulty if you take into account the fact that you do negative
work on the puck in the displacement from x'' to x.

Before  continuing  this  preview  of  energy  relations,  we  will  summarize 
the most important features of kinetic and potential energy by saying that
kinetic  energy  is  energy  of  motion  and  potential  energy  is  energy  of  position. 

Now we consider a final experiment which will bring out a very important
relation between potential energy and kinetic energy. Begin with the
puck  resting  on  the  floor.  You  do  work  on  the  puck-plus-earth  system  by 
very  slowly  raising  the  puck  to  position  x above  the  floor,  with  positive  x 
measured  upward.  The  positive  work  you  do  is  mgx,  and  this  equals  the 
gravitational  potential  energy  you  have  given  the  system.  Then  you  release 
the  puck.  The  system  immediately  becomes  an  isolated  one.  This  isolated 
system  has  an  initial  positive  gravitational  potential  energy  U = mgx,  with 
reference  to  the  floor,  and  an  initial  kinetic  energy  K = 0.  As  the  puck  falls, 
the  system  loses  potential  energy  since  the  height  of  the  puck  decreases. 
But  it  gains  kinetic  energy  since  the  speed  of  the  puck  increases. 

Let's evaluate U and K at an instant when the puck falls past a position
x' that is lower than x but higher than the floor. At x' the potential energy
of  the  system  relative  to  the  floor  has  the  smaller,  but  still  positive,  value 

U = mgx'  (7-8) 

The  reason  is  that  you  would  have  given  the  system  that  much  energy  if 
you  had  slowly  raised  the  puck  directly  from  the  floor  to  x'.  In  falling  from 
x to x', the puck has acquired a negative velocity v'. According to Eq. (7-5),
it has a positive kinetic energy

$$K = \frac{m v'^2}{2}$$

In  order  to  compare  these  values  of  U and  K,  we  must  express  them  in 
comparable  terms. 

To do this, we relate the displacement of the puck traveling at constant
acceleration to the square of the velocity it acquires during the displacement.
This can be done, if the effect of air resistance is neglected, by again
making use of Eq. (2-32). If we set the initial position coordinate equal to x,
the final position coordinate equal to x', the initial velocity equal to zero, the
final velocity equal to v', and the acceleration equal to -g, Eq. (2-32) gives
us the relation

$$x' = x + \frac{v'^2}{-2g}$$

We transpose, to obtain

$$x - x' = \frac{v'^2}{2g}$$

Solving for v'^2 produces

$$v'^2 = 2g(x - x')$$


We can therefore express the value of K, in these particular circumstances,
as

$$K = \frac{m v'^2}{2} = \frac{m[2g(x - x')]}{2}$$

or

K = mg(x - x')  (7-9)


Figure 7-6 is a plot of the kinetic and potential energies of the system,
given by Eqs. (7-8) and (7-9), as a function of the instantaneous position x'.
The fall begins with x' = x and ends with x' = 0, just before the puck
strikes the floor. At first (when x' = x) the kinetic energy K is zero because
the puck is just beginning to move, and the potential energy U has the initial
value mgx. When the puck has fallen to some intermediate height x', the
kinetic energy has increased, since the puck has gained speed. But this increase
has been at the expense of a loss of potential energy, since the puck
has lost height. Just before the puck hits the floor (when x' = 0), the potential
energy is essentially zero and the kinetic energy has increased to a value
essentially equal to the initial value of the potential energy. In fact, Fig. 7-6
makes it clear that the decrease in potential energy is exactly compensated
for by an increase in kinetic energy at all points on the fall.

We are therefore led to introduce the total mechanical energy E. This
quantity is defined as the sum of the kinetic and potential energies. That is,

E = K + U  (7-10a)

Fig. 7-6 The kinetic energy K, potential energy U, and total mechanical
energy E of a puck falling from rest at height x to the floor at height 0.
These quantities are plotted versus its height x' at intermediate points.
During the fall the value of x' goes to the left from x' = x to x' = 0.
[Plot: vertical axis, energy (K, U, E), from 0 to mgx; horizontal axis, vertical
position x', from 0 to x; the line K + U = E is constant at mgx.]


As the puck falls, the energy of the system gradually changes from one
form, potential, to another, kinetic. But their sum, the total mechanical energy,
remains constant. In particular, we have from Eqs. (7-8) and (7-9)


E = K + U = mg(x - x') + mgx' = mgx


The quantity mgx is a constant because x is the fixed value specifying the initial
height of the puck. This simple experiment provides an example of the
law of conservation of total mechanical energy of an isolated system in
which there is no friction. The law can be expressed symbolically by the
equation

E_f = E_i  (7-10b)


It  is  one  of  the  most  important  conservation  laws  of  physics. 
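
The bookkeeping in Eqs. (7-8), (7-9), and (7-10a) can be checked numerically. The following is a minimal Python sketch and is not part of the original text. The mass, the release height, and the function name `falling_puck_energies` are illustrative assumptions, and air resistance is neglected, as in the derivation above.

```python
import math

G = 9.80  # m/s^2, magnitude of the gravitational acceleration


def falling_puck_energies(m, x0, t, g=G):
    """Return (K, U, E) at time t after a puck of mass m is released from rest at height x0."""
    v = -g * t                   # velocity; negative because the puck moves downward
    x = x0 - 0.5 * g * t ** 2    # instantaneous height x' above the floor
    K = 0.5 * m * v ** 2         # kinetic energy
    U = m * g * x                # potential energy relative to the floor
    return K, U, K + U


m, x0 = 0.50, 2.00                 # assumed values: a 0.50-kg puck released at 2.00 m
t_floor = math.sqrt(2 * x0 / G)    # time at which the puck reaches the floor
for step in range(5):
    t = t_floor * step / 4
    K, U, E = falling_puck_energies(m, x0, t)
    print(f"t = {t:.3f} s   K = {K:.3f} J   U = {U:.3f} J   E = {E:.3f} J")
```

Every row should show the same total E = mgx while K grows and U shrinks by equal amounts.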

Let  us  recapitulate.  First  the  puck  was  stationary  on  the  floor.  Then 
you  did  positive  work  on  the  puck-plus-earth  system  by  applying  an 
upward-directed  external  force  to  the  puck  in  such  a way  that  it  was  very 
slowly displaced upward. In doing this, you gave the system potential energy
only. Throughout almost all its upward displacement, the external
force  you  applied  to  the  puck  just  canceled  the  internal  force  applied  to  it 
by  the  earth  through  gravity.  So  the  puck  experienced  no  net  force,  did  not 
accelerate,  and  did  not  gain  kinetic  energy  from  the  work  you  did  on  it. 
After  the  puck  was  released  at  its  elevated  position,  the  system  was  isolated. 
Then  the  only  force  acting  on  the  puck  was  the  internal  force  exerted  on  it 
by  the  earth.  The  puck  accelerated  downward  under  the  influence  of  this 
net  force.  As  it  fell  with  increasing  speed,  the  puck  gained  kinetic  energy. 
Therefore  the  system  gained  kinetic  energy.  But  at  the  same  time  the 
system lost potential energy because the elevation of the puck was decreasing.
The gain in kinetic energy was exactly compensated for by the
loss  of  potential  energy,  so  that  the  total  mechanical  energy  of  the 
friction-free system remained constant as long as the system remained isolated.
When the puck hit the floor, the system was no longer isolated.

As another example, suppose you throw a ball vertically upward from
floor level. Immediately after the ball leaves your hand, the ball plus the
earth forms an isolated system with negligible friction that has kinetic energy,
but no gravitational potential energy with reference to the situation
with the ball on the floor. At the top of its path there is potential energy
with reference to the floor, but no kinetic energy. Throughout the upward
part of its flight the total mechanical energy remains constant; in other
words, it is conserved. The total mechanical energy of the system continues
to remain constant for the downward part of the flight of the ball. Just before
the ball hits the floor, its final kinetic energy equals its initial kinetic energy.
(Why?) Thus the final velocity of the ball is equal in magnitude to its
initial velocity. The sign reversal of the velocity has no effect on the kinetic
energy, which depends on the square of the velocity.


Consider the ball-plus-earth system after the ball hits the floor, assuming
it does not rebound. The ball is at rest at floor level. Therefore the
system  has  neither  kinetic  energy  nor  potential  energy.  The  constant  total 
mechanical  energy  which  the  system  had  during  the  flight  of  the  ball  has 
vanished.  What  has  happened  to  the  mechanical  energy? 

This is what happens. When the ball hits the floor, the molecules in the
surface of the ball collide with the molecules in the surface of the floor, setting
the surface molecules in both bodies into vibrational motion. Their vibration
sets adjacent molecules into vibration, and so on. So the vibrational
motion propagates into the two bodies. The molecules are not vibrating in
unison, however. There is still energy present, even though the ball as a
whole is at rest on the floor. But it is an energy of random vibrational motion.

Just before the ball hits the floor the system has mechanical energy,
which is all in the form of kinetic energy. Kinetic energy is an organized energy
of motion. It is organized in the sense that all the molecules of the ball
are moving in unison, in the same direction at the same speed. After the
ball hits the floor there is still motion of molecules, and there is energy associated
with this motion. But it is the disorganized energy of randomly vibrating
molecules. This energy is called thermal energy. So when the ball
hits the floor, the mechanical energy of the system is transformed into
thermal energy.

The  same  sort  of  process  happens  in  a more  gradual  fashion  whenever 
any  form  of friction  is  present  in  a system.  Contact  friction  transfers  part  of 
the  mechanical  energy  of  a system  of  two  objects  sliding  over  each  other 
into  thermal  energy  associated  with  random  motion  of  the  molecules  near 
the  surfaces  in  contact.  In  the  case  of  fluid  friction,  some  of  the  mechanical 
energy  is  lost  to  heating  the  moving  body  and  the  material  through  which  it 
moves. 

Thus  the  law  of  mechanical  energy  conservation  does  not  hold  when 
frictional  effects  are  significant.  But  even  when  they  are,  experiment  shows 
that  the  sum  of  the  mechanical  and  thermal  energies  of  an  isolated  system  is 
constant.  The  lost  mechanical  energy  appears  as  thermal  energy.  It  is  also 
possible for some (but not all) of the thermal energy of a system to be converted
into mechanical energy; devices which do this are called heat
engines.  These  matters  are  discussed  at  length  in  Chaps.  17  through  19. 


We close this preview of the energy relations by using them in Example 7-1.




EXAMPLE  7-1 


a. An elevator car is moving upward at a constant speed of 2.00 m/s. When it is
at an elevation 10.00 m above ground, the lifting cable breaks. Calculate the maximum
elevation attained by the car, ignoring friction.

■ After the cable is broken, the car-plus-earth system is isolated. Therefore,
since the system is assumed to be frictionless, you can apply the law of conservation
of mechanical energy, E_f = E_i. In particular, you can equate the initial total mechanical
energy E_i = K_i + U_i of the system at the instant after the cable breaks to
the final total mechanical energy E_f = K_f + U_f at the instant when the car is stationary
at its maximum elevation. Use an x axis whose origin is at ground level and
whose positive direction is vertically upward, and measure the gravitational potential
energy of the system with reference to the car at ground level. Then the initial
kinetic and potential energies of the system are K_i = m v_i^2/2 and U_i = m g x_i, where v_i
and x_i are the initial velocity and position of the car and m is its mass. The final
kinetic and potential energies are K_f = 0 and U_f = m g x_f. Equating the sum of the
final energies to the sum of the initial energies, you have

E_f = E_i

or

K_f + U_f = K_i + U_i

or

$$0 + m g x_f = \frac{m v_i^2}{2} + m g x_i$$

Solving for x_f gives you

$$x_f = x_i + \frac{v_i^2}{2g} = 10.00\ \text{m} + \frac{(2.00\ \text{m/s})^2}{2 \times 9.80\ \text{m/s}^2} = 10.20\ \text{m}$$


b.  The  safety  brakes  automatically  engage  when  the  downward  speed  reaches 
4.00  m/s.  Determine  the  elevation  at  which  this  happens. 

■ Now  you  take  the  initial  conditions  to  be  those  when  the  car  is  instantaneously 
at  rest  at  its  maximum  elevation,  and  the  final  conditions  to  be  those  at  the  instant 
before  the  safety  brakes  engage.  Energy  conservation  gives 

K_f + U_f = K_i + U_i

or

$$\frac{m v_f^2}{2} + m g x_f = 0 + m g x_i$$

or

$$x_f = x_i - \frac{v_f^2}{2g} = 10.20\ \text{m} - \frac{(4.00\ \text{m/s})^2}{2 \times 9.80\ \text{m/s}^2} = 9.39\ \text{m}$$

This  is  0.81  m below  the  highest  point  reached,  or  0.61  m below  the  point  where  the 
cable  broke.  ■ 
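
Parts a and b use the same energy balance with different initial and final conditions, so they can be checked with one small function. This is a minimal Python sketch, not part of the original example; the function name `final_height` is an illustrative assumption, and friction is neglected as stated in the problem.

```python
G = 9.80  # m/s^2, magnitude of the gravitational acceleration


def final_height(x_i, v_i, v_f, g=G):
    """Solve m*v_f**2/2 + m*g*x_f = m*v_i**2/2 + m*g*x_i for x_f; the mass m cancels."""
    return x_i + (v_i ** 2 - v_f ** 2) / (2 * g)


# Part a: from the instant the cable breaks to the instant the car is at rest.
x_top = final_height(x_i=10.00, v_i=2.00, v_f=0.0)
# Part b: from rest at the top to a downward velocity of 4.00 m/s.
x_brake = final_height(x_i=x_top, v_i=0.0, v_f=-4.00)

print(f"maximum elevation: {x_top:.2f} m")
print(f"brakes engage at:  {x_brake:.2f} m")
```

Only the squares of the velocities enter, so the sign chosen for the downward velocity in part b does not affect the result.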

c. The brakes apply a constant upward frictional force to the car, with a magnitude
3/2 times the weight of the car. Determine the elevation of the car when it stops.

■ Here the mechanical energy is not constant since energy is lost to friction in
the brakes. To say it another way, the car-plus-earth system is no longer isolated,
since the brakes do work by applying an external force to the car. If you take the initial
conditions to be those when the brakes engage, the work done by the brakes on
the car is the product of the force they apply and the displacement of the car as it
comes to a stop under the influence of this force. The upward force has a positive
value 3mg/2, whereas the downward displacement (x_f - x_i) has a negative value. So the
work done by the brakes on the system is the negative quantity (3mg/2)(x_f - x_i). It is
negative because the force acts in the direction opposite to the displacement, and
thus mechanical energy is removed from the system. You can equate the total initial
mechanical energy of the system, plus the negative work W done on it by the brakes,
to its total final mechanical energy. This gives

K_f + U_f = K_i + U_i + W

or

$$0 + m g x_f = \frac{m v_i^2}{2} + m g x_i + \frac{3mg}{2}(x_f - x_i)$$

or

$$-\frac{m g x_f}{2} = -\frac{m g x_i}{2} + \frac{m v_i^2}{2}$$

So

$$x_f = x_i - \frac{v_i^2}{g} = 9.39\ \text{m} - \frac{(4.00\ \text{m/s})^2}{9.80\ \text{m/s}^2} = 7.76\ \text{m}$$

The braking distance is 1.63 m. Can you explain why this is twice the distance the
elevator fell from its maximum height before the brakes engaged?


7-2 WORK DONE BY A VARIABLE FORCE

Now we will begin generalizing the energy relations by extending the definition
of work to a one-dimensional case in which the force doing work is
not of constant magnitude. The force and the displacement of the body to
which it is applied both lie along the x axis, as before. But here the strength
of the force depends on the position of the body, and so it must be written
as F(x). Nevertheless, an equation very much like Eq. (7-1) can still be used
to evaluate approximately the amount of work done by the force in a small
displacement of the body. If the body moves from x_i to x_i + Δx, then the
small amount of work ΔW done by the force applied to it will be

ΔW = F(x_i + Δx/2) Δx

The idea is that in the small displacement from x_i to x_i + Δx none of the
values actually assumed by the force F(x) are very different from the particular
value F(x_i + Δx/2) of the force at the midpoint of the displacement.
Hence we can well approximate F(x) over the displacement by using the
constant value F(x_i + Δx/2). Doing so, we can then apply the basic definition
of Eq. (7-1), work equals (constant) force times displacement, to obtain
a good approximation to the work actually done in the displacement.
Figure 7-7 indicates that the same procedure can be used to estimate the
work done by the force when the body moves through the next displacement.
This one is from x_i + Δx to x_i + 2Δx. Its midpoint is at x_i + 3Δx/2.
The force at the midpoint is F(x_i + 3Δx/2). And the work done by the force
in the displacement is

ΔW = F(x_i + 3Δx/2) Δx

Fig. 7-7 Illustration of a procedure that can be used to evaluate the work
done by a variable force F(x) acting on a body that is displaced in the direction
of the force from x_i to x_f. The total area enclosed by the rectangles gives an
approximate value of the work. The smaller the increment Δx, and therefore
the larger the number of rectangles, the more accurate the approximation. In
the limit as Δx goes to zero, the area enclosed by the rectangles approaches
the area enclosed by the curve F(x), the x axis, and the vertical lines at x_i and x_f.
This “area under the curve” is called the definite integral of F(x) from x_i to x_f. Its
value is precisely equal to the work done.
[Plot: F(x) versus x, with rectangles of width Δx whose heights are the midpoint
values F(x_i + Δx/2), F(x_i + 3Δx/2), F(x_i + 5Δx/2), ..., F(x_f - Δx/2).]

Continuing in this manner, we can obtain an approximation to the
total work W done when the body moves from x_i to x_f by summing the contributions
ΔW obtained for each range of width Δx. Thus

$$W \simeq F(x_i + \Delta x/2)\,\Delta x + F(x_i + 3\Delta x/2)\,\Delta x + F(x_i + 5\Delta x/2)\,\Delta x + \cdots + F(x_f - \Delta x/2)\,\Delta x \tag{7-14}$$

This can be written more concisely as

$$W \simeq \sum_{x_j = x_i + \Delta x/2}^{x_f - \Delta x/2} F(x_j)\,\Delta x \tag{7-15}$$

The large Greek letter Σ (sigma) means the operation of taking a sum of
terms of the quantity to its right, F(x_j) Δx. The quantity x_j represents successively
the points at which F is evaluated, with x_j starting at x_i + Δx/2 and
increasing each time by Δx until the value x_f - Δx/2 is reached. Thus Eq.
(7-15) has exactly the same meaning as Eq. (7-14).
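
The sum in Eq. (7-15) translates almost directly into a short program. What follows is a minimal Python sketch under stated assumptions: the force law F(x) = 3x^2 and the names `work_midpoint_sum` and `example_force` are hypothetical choices made only for illustration and do not come from the text.

```python
def work_midpoint_sum(force, x_i, x_f, n):
    """Approximate the work by the n-term midpoint sum of Eqs. (7-14) and (7-15)."""
    dx = (x_f - x_i) / n  # width of each small displacement
    # The j-th term evaluates the force at the midpoint x_i + (j + 1/2) dx.
    return sum(force(x_i + (j + 0.5) * dx) * dx for j in range(n))


def example_force(x):
    """Hypothetical force law F(x) = 3 x**2, in newtons for x in meters."""
    return 3.0 * x ** 2


for n in (1, 4, 16, 64):
    W = work_midpoint_sum(example_force, 0.0, 2.0, n)
    print(f"n = {n:3d}   dx = {2.0 / n:.5f} m   W = {W:.5f} J")
```

As the number of terms grows and Δx shrinks, the printed values settle toward a single limiting value, which is the behavior the argument above relies on.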

The smaller you take the width of the displacements Δx, the smaller
the difference between the value of F(x) at the midpoint of each displacement
and the values that F(x) actually has within the displacement. Thus
the smaller Δx (and consequently the more terms in the sum), the more accurately
the right side of Eq. (7-15) will approximate the total work actually
done by the force F(x) when the body on which it acts moves from x_i to x_f. In
the limit when Δx becomes infinitesimally small, the right side becomes precisely
equal to the total work done. Therefore

$$W = \lim_{\Delta x \to 0} \sum_{x_j = x_i + \Delta x/2}^{x_f - \Delta x/2} F(x_j)\,\Delta x \tag{7-16}$$

In the notation of integral calculus, this limit of a sum is written as

$$\lim_{\Delta x \to 0} \sum_{x_j = x_i + \Delta x/2}^{x_f - \Delta x/2} F(x_j)\,\Delta x = \int_{x_i}^{x_f} F(x)\,dx \tag{7-17}$$

The distorted letter S is called an integral sign, and the right side of Eq.
(7-17) is called the definite integral over x of the function F(x) from the
lower limit x_i to the upper limit x_f. Thus the work W done by the force
acting on the body over the entire displacement from x_i to x_f is, by definition,
the definite integral

$$W = \int_{x_i}^{x_f} F(x)\,dx \quad \text{for force acting along straight path} \tag{7-18}$$


7-3 INTEGRATION

This  section  develops  the  integral  calculus  that  we  will  need  in  continuing 
our  investigation  of  work  and  energy  and  elsewhere  in  the  book.  If  you  are 
already  familiar  with  integral  calculus,  you  may  wish  to  go  directly  to 
Sec.  7-4. 

First  we  will  obtain  a geometrical  interpretation  of  a definite  integral. 
By definition, a definite integral has the value specified in Eq. (7-17),

$$\int_{x_i}^{x_f} F(x)\,dx = \lim_{\Delta x \to 0} \sum_{x_j = x_i + \Delta x/2}^{x_f - \Delta x/2} F(x_j)\,\Delta x$$

The value of the definite integral is the limiting value of a sum of terms.
Now the value of each of these terms equals the area under the corresponding
rectangle in Fig. 7-7, since the height of each is the midpoint
value of F(x) and the width is Δx. Each of these rectangles is a good approximation
to the area under the part of the F(x) curve passing through the
rectangle. This is because the overestimate that the first half of the rectangle
makes to the actual area under the F(x) curve is largely compensated
for by the underestimate that the second half of the rectangle makes. In the
limit as the width Δx of each rectangle approaches zero, the area under the
entire set of rectangles approaches the actual area under the F(x) curve
between x_i and x_f. Therefore we conclude that the value of the definite integral
of F(x) from x_i to x_f equals the area under the curve described by F(x) between these
limits. This suggests that an approximate evaluation of a particular definite
integral can be obtained by carefully plotting F(x) between x_i and x_f on
graph paper with closely spaced grid lines, and then measuring the area
under the curve by counting the number of grid squares in this area.

A much more accurate and convenient way to evaluate a particular
definite integral directly from the definition is to employ a programmable
pocket calculator, or a small computer. The Numerical Calculation Supplement
contains a numerical integration program which makes such a device
calculate the sum for successively smaller values of Δx. By inspecting the sequence
of sums it is possible to determine when a limiting value has been
reached, to within the required accuracy. This limiting value is the value of
the definite integral, to that accuracy. Example 7-2 uses the numerical integration
program.


EXAMPLE 7-2

Use the numerical integration program to evaluate the definite integral

$$\int_{x_i}^{x_f} F(x)\,dx \quad \text{where } F(x) = x^2, \text{ with } x_i = 1 \text{ and } x_f = 2$$

That is, evaluate the definite integral

$$\int_1^2 x^2\,dx$$

Obtain results for each value of Δx to three-decimal-place accuracy.

■ To do this, you calculate a sequence of values of the summation in Eq. (7-17),
with each value in the sequence obtained by using a successively smaller value of Δx.
Then you determine by inspection the limiting value of the sequence. The program
starts by taking Δx = x_f - x_i, so that there is only one term in the summation. Then it
takes Δx = (x_f - x_i)/2, and evaluates the corresponding two-term summation. Continuing
to reduce Δx successively by halving it, it generates a sequence of increasingly
accurate approximations to the integral, which converges to the actual value of the
integral.
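
The halving procedure just described can be sketched in a few lines of Python. This is not the program from the Numerical Calculation Supplement, which is not reproduced here; it is a stand-in written under the assumption that the program simply evaluates the midpoint sum of Eq. (7-15) with 1, 2, 4, ... terms. The names `midpoint_sum` and `halving_sequence` are hypothetical.

```python
def midpoint_sum(f, x_i, x_f, n):
    """n-term midpoint sum of f between x_i and x_f, as in Eq. (7-15)."""
    dx = (x_f - x_i) / n
    return sum(f(x_i + (j + 0.5) * dx) * dx for j in range(n))


def halving_sequence(f, x_i, x_f, count):
    """Midpoint sums for dx = (x_f - x_i), (x_f - x_i)/2, (x_f - x_i)/4, ..."""
    return [midpoint_sum(f, x_i, x_f, 2 ** k) for k in range(count)]


sums = halving_sequence(lambda x: x ** 2, 1.0, 2.0, 6)
for k, s in enumerate(sums):
    print(f"dx = {1.0 / 2 ** k:.5f}   sum = {s:.4f}")
```

The values are printed to four decimal places because the second sum is exactly 2.3125, which different rounding conventions report as either 2.312 or 2.313 at three places.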

The sequence you obtain on running the program is

$$\sum_{x_j = 1 + \Delta x/2}^{2 - \Delta x/2} x_j^2\,\Delta x = 2.250,\ 2.313,\ 2.328,\ 2.332,\ 2.333,\ 2.333$$

These values evidently converge to the limit 2.333. Thus you can conclude that to
three decimal places the value of the definite integral is

$$\int_1^2 x^2\,dx = 2.333$$

There is an analytical method for evaluating integrals, which has the
usual advantages over numerical methods. It is not always applicable: the
integrals of some not very complicated, but very important, functions can
be evaluated only numerically. But nearly all the integrals that are of interest
in elementary physics can be found by the analytical method. The
method amounts to applying the fundamental theorem of calculus:

$$\int_{u_i}^{u_f} du = u_f - u_i \tag{7-19}$$

Fig. 7-8 A uniformly increasing set of values of a quantity u, ranging from u_i
to u_f, displayed by plotting them along a line.


A proof  of  this  theorem  is  given  in  the  small-print  material  that  follows. 

Consider a quantity u and a uniformly increasing set of its values in the interval
ranging from u_i to u_f. These values are displayed as ticks plotted along a line
in Fig. 7-8, but the line itself has no particular significance. The set of values can
be written

$$u_i,\ u_{i+1},\ u_{i+2},\ u_{i+3},\ \ldots,\ u_{f-3},\ u_{f-2},\ u_{f-1},\ u_f$$

The meaning of the subscript notation is made apparent in the figure. The differences
between adjacent values of the set form another set

$$u_{i+1} - u_i,\ u_{i+2} - u_{i+1},\ u_{i+3} - u_{i+2},\ \ldots,\ u_{f-2} - u_{f-3},\ u_{f-1} - u_{f-2},\ u_f - u_{f-1}$$

But each element of this difference set has the same numerical value, Δu, because
of the uniform separation of the values in the first set. Take the sum of all the elements
in the difference set:

$$\sum_{u_i + \Delta u/2}^{u_f - \Delta u/2} \Delta u = (u_{i+1} - u_i) + (u_{i+2} - u_{i+1}) + (u_{i+3} - u_{i+2}) + \cdots + (u_{f-2} - u_{f-3}) + (u_{f-1} - u_{f-2}) + (u_f - u_{f-1})$$

The limits of the summation are indicated by the values of u at the midpoints of
the first and last terms. Note that each intermediate value of u (such as u_{i+1} or u_{f-1})
occurs twice in the sum, once with a positive sign and once with a negative sign.
Thus the intermediate values cancel out of the sum. Because of these cancellations
the right side of the equality simplifies drastically, and we have

$$\sum_{u_i + \Delta u/2}^{u_f - \Delta u/2} \Delta u = u_f - u_i$$

Now take the limit as Δu → 0 on both sides. This produces

$$\lim_{\Delta u \to 0} \sum_{u_i + \Delta u/2}^{u_f - \Delta u/2} \Delta u = u_f - u_i$$

By the definition of Eq. (7-17), the left side of this equality is the definite integral of
the rudimentary function F(u) = 1 from u_i to u_f. Therefore we have

$$\int_{u_i}^{u_f} 1\,du = u_f - u_i$$

or

$$\int_{u_i}^{u_f} du = u_f - u_i$$

This is the fundamental theorem of calculus, Eq. (7-19).


As is sometimes the case with theorems, the formal notation used to
prove the fundamental theorem of calculus may make the result seem
more complicated than it really is. To appreciate its simplicity, say in words
what Eq. (7-19) means, while looking at Fig. 7-8: "The (limit of the) sum of
the changes in u from one value to the next equals the total change in u
over the interval from u_i to u_f." Example 7-3 shows how to use the theorem
to evaluate definite integrals.


EXAMPLE 7-3

Use the fundamental theorem of calculus to evaluate the definite integral

$$\int_{x_i}^{x_f} F(x)\,dx \quad \text{where } F(x) = x^2, \text{ with } x_i = 1 \text{ and } x_f = 2$$

That is, evaluate the definite integral

$$\int_1^2 x^2\,dx$$

■ To apply the theorem, you must find an expression for u which has the property
that

du = x^2 dx

By trial and error, based on your knowledge of differentiation, you will eventually
find

$$u = \frac{x^3}{3}$$

This is correct since

$$\frac{du}{dx} = \frac{d(x^3/3)}{dx} = \frac{1}{3}\frac{d(x^3)}{dx} = \frac{3x^2}{3} = x^2$$

Multiplying through by dx verifies that

du = x^2 dx

Using this relation, you can write

$$\int_{x_i}^{x_f} x^2\,dx = \int_{u_i}^{u_f} du$$

Here u_i and u_f are the limiting values of u that correspond to the limiting values x_i
and x_f of x. Since u = x^3/3, x_i = 1, and x_f = 2, they are

$$u_i = \frac{x_i^3}{3} = \frac{1}{3} \quad \text{and} \quad u_f = \frac{x_f^3}{3} = \frac{8}{3}$$

The fundamental theorem immediately gives

$$\int_{u_i}^{u_f} du = u_f - u_i = \frac{8}{3} - \frac{1}{3} = \frac{7}{3}$$

So you have

$$\int_1^2 x^2\,dx = \frac{7}{3}$$

in agreement with the results obtained numerically in Example 7-2.


Example  7-3  shows  that  integration  by  use  of  the  fundamental 
theorem  is  a matter  of  finding  a function  which,  when  differentiated,  yields 
the  function  to  be  integrated.  Thus  the  process  of  integration  is  the  inverse  of  the 
process  of  differentiation. 


A generalization of the calculation carried out in Example 7-3 will lead
to the evaluation of an integral that we will find very useful. The generalization is

$$\int_{x_i}^{x_f} x^n\,dx = \frac{x_f^{\,n+1}}{n+1} - \frac{x_i^{\,n+1}}{n+1} \qquad \text{for } n \neq -1$$

Using an abbreviated notation, we can write this as

$$\int x^n\,dx = \frac{x^{n+1}}{n+1} \qquad \text{for } n \neq -1 \tag{7-20}$$
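The following Python sketch compares the right side of Eq. (7-20) with a direct strip sum for several values of $n$. The limits $x_i = 1$ and $x_f = 2$, the sample exponents, and the helper names are assumptions chosen only for illustration.

```python
def power_rule(n, x_i, x_f):
    """Right side of Eq. (7-20), evaluated between the limits x_i and x_f."""
    if n == -1:
        raise ValueError("Eq. (7-20) does not apply for n = -1")
    return (x_f ** (n + 1) - x_i ** (n + 1)) / (n + 1)


def strip_sum(f, x_i, x_f, strips=20000):
    """Midpoint-strip approximation to the integral of f from x_i to x_f."""
    dx = (x_f - x_i) / strips
    return sum(f(x_i + (k + 0.5) * dx) for k in range(strips)) * dx


# Compare the formula with the strip sum for several exponents.
errors = {}
for n in (0, 1, 2, 3.5, -2):
    numeric = strip_sum(lambda x, n=n: x**n, 1.0, 2.0)
    errors[n] = abs(numeric - power_rule(n, 1.0, 2.0))
    print(f"n = {n:4}  formula = {power_rule(n, 1.0, 2.0):.8f}  difference = {errors[n]:.1e}")
```

The excluded case $n = -1$ is rejected explicitly, since the right side of Eq. (7-20) would require division by zero.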


In this notation it is understood that the integral on the left side of the
equation is taken between the lower limit $x_i$ and the upper limit $x_f$, and that
the quantity on the right side is evaluated at the upper limit, after which the
result of evaluating it at the lower limit is subtracted. In
the same notation, some other important integrals are:


$$\int \frac{dx}{x} = \ln |x| \qquad \text{for } x \neq 0 \tag{7-21}$$

$$\int e^x\,dx = e^x \tag{7-22}$$

$$\int \sin x\,dx = -\cos x \tag{7-23}$$

$$\int \cos x\,dx = \sin x \tag{7-24}$$

And for any functions of $x$, $f(x)$ and $g(x)$,

$$\int f(x)\,\frac{dg(x)}{dx}\,dx + \int g(x)\,\frac{df(x)}{dx}\,dx = f(x)\,g(x) \tag{7-25}$$

These integrals were evaluated by applying the fundamental theorem.
Direct application of the definition of an integral proves the following basic
properties of integrals:

$$\int c f(x)\,dx = c \int f(x)\,dx \qquad \text{for constant } c \tag{7-26}$$

$$\int [f(x) + g(x)]\,dx = \int f(x)\,dx + \int g(x)\,dx \tag{7-27}$$

Caution: Note that $\int f(x)g(x)\,dx \neq \int f(x)\,dx \int g(x)\,dx$.


7-4 WORK AND KINETIC ENERGY

We now continue to generalize the concepts introduced in Sec. 7-1. Assuming
that a certain variable force is the only force acting on a body, we will establish
the relation between the work done by that force and the change in
kinetic energy of the body. We will do this for one-dimensional motion,
using signed scalars, and then extend the argument to two or three dimensions.

Consider a body moving on the $x$ axis as a result of the net force $F(x)$
acting on it, the force being directed along the $x$ axis. According to the definition
of Eq. (7-18), when the body moves from $x_i$ to $x_f$, the work done on
the body by the force is

$$W = \int_{x_i}^{x_f} F(x)\,dx$$


The work-kinetic energy relation that we will obtain is founded on Newton's
second law because the first step in obtaining the relation is to apply the law

$$F(x) = m\,\frac{d^2x}{dt^2}$$

in order to evaluate $F(x)$ in terms of the mass $m$ of the body and its acceleration
$d^2x/dt^2$. Using it in the integral, we have

$$W = \int_{x_i}^{x_f} m\,\frac{d^2x}{dt^2}\,dx \tag{7-28}$$


Next, the quantity which must be integrated is manipulated into a form
that will allow the integral to be evaluated immediately from the fundamental
theorem of calculus. This is done as follows:

$$m\,\frac{d^2x}{dt^2}\,dx = m\,\frac{d}{dt}\!\left(\frac{dx}{dt}\right) dx = m\,\frac{dv}{dt}\,dx$$

As usual, $v$ represents the velocity of the body. Now $dx = v\,dt$, so we have

$$m\,\frac{d^2x}{dt^2}\,dx = m\,\frac{dv}{dt}\,v\,dt = mv\,dv$$

But $v\,dv = d(v^2)/2$, as can be seen by noting that $d(v^2)/dt = 2v\,dv/dt$ and
then multiplying through by $dt/2$. Thus

$$m\,\frac{d^2x}{dt^2}\,dx = \frac{m}{2}\,d(v^2) \tag{7-29}$$

Using Eq. (7-29) in Eq. (7-28), we obtain

$$W = \int_{v_i^2}^{v_f^2} \frac{m}{2}\,d(v^2)$$

Here $v_i^2$ is the value of the square of the velocity of the body when it is at the
initial position $x_i$, and $v_f^2$ is the corresponding value at the final position $x_f$.
Since $m/2$ is a constant, the integral becomes

$$W = \frac{m}{2} \int_{v_i^2}^{v_f^2} d(v^2)$$

In this form, the fundamental theorem of Eq. (7-19) can be applied to evaluate
the integral. We set $u = v^2$ so that $du = d(v^2)$, $u_i = v_i^2$, and $u_f = v_f^2$. We
then obtain the result

$$W = \frac{m}{2}\,(v_f^2 - v_i^2) = \frac{mv_f^2}{2} - \frac{mv_i^2}{2}$$

The right side of this equation is the difference between the final and initial
values of a quantity we recognize from Sec. 7-1. It is the kinetic energy

$$K = \frac{mv^2}{2} \tag{7-30}$$

Expressed in these terms, our results can be written

$$W = K_f - K_i \tag{7-31}$$
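The relation $W = K_f - K_i$ can be tested numerically for a force that varies with position. The Python sketch below steps a body forward in time under a hypothetical force $F(x) = 3 - 2x$, accumulates $F\,dx$ along the way, and compares the total with the change in $mv^2/2$. The force law, the mass, the initial conditions, and the time-stepping scheme (a velocity-Verlet update) are all assumptions made for illustration.

```python
def force(x):
    """Hypothetical position-dependent net force, in newtons."""
    return 3.0 - 2.0 * x


m = 1.5           # mass in kg (assumed)
x, v = 0.0, 1.0   # initial position in m and velocity in m/s (assumed)
dt = 1.0e-4       # time step in s

k_initial = 0.5 * m * v**2
work = 0.0

a = force(x) / m  # Newton's second law: a = F/m
for _ in range(20000):
    # Advance position and velocity over one small time step.
    x_new = x + v * dt + 0.5 * a * dt**2
    a_new = force(x_new) / m
    v = v + 0.5 * (a + a_new) * dt
    # Add F dx for this step, using the average force over the step.
    work += 0.5 * (force(x) + force(x_new)) * (x_new - x)
    x, a = x_new, a_new

delta_k = 0.5 * m * v**2 - k_initial
print(f"W       = {work:.6f} J")
print(f"K_f-K_i = {delta_k:.6f} J")
```

The two printed values should agree closely, with any small residual difference attributable to the finite time step rather than to the relation itself.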


We have shown that for one-dimensional motion the work done on a
body by a net force of varying strength acting on it equals the change in its
kinetic energy. If the force acts in the same direction as the displacement,
then positive work is done and the kinetic energy increases in proportion
to the increase in $v^2$, the square of the velocity of the body. (Remember that
here we represent velocity by a signed scalar. So $v^2$ is just the square of the
scalar $v$. Soon, however, we will treat velocity as a vector. Then we will have
to define what we mean by the square of a velocity vector.) Since $v^2$ is never
negative, its value is always the same as the square of the speed of the body.
Thus the kinetic energy of a body increases in proportion to the increase in
the square of its speed when the net force acting on it does positive work.
The situation is just the same as that discussed in Sec. 7-1, where the
strength of the net force was constant. When the variable net force acts in
the direction opposite to that of the displacement, it does negative work
and also reduces the speed of the body, just as a constant net force does.
The reduction in kinetic energy is in proportion to the decrease in the
square of the speed.

Fig. 7-9 The magnitude of a vector $\mathbf{v}$ will be changed by the addition of a
vector $d\mathbf{v}$ only to the extent that $d\mathbf{v}$ has a component along the direction of $\mathbf{v}$.
On the top, a perpendicular velocity increment $d\mathbf{v}$ is added to the velocity $\mathbf{v}$.
If $d\mathbf{v}$ is infinitesimal, the sum $\mathbf{v} + d\mathbf{v}$ has the same magnitude as $\mathbf{v}$. So adding $d\mathbf{v}$
to $\mathbf{v}$ does not change the magnitude of the velocity when $d\mathbf{v}$ is perpendicular to
$\mathbf{v}$. In the center $d\mathbf{v}$ is parallel to $\mathbf{v}$. Here adding $d\mathbf{v}$ to $\mathbf{v}$ increases the magnitude
of the velocity by an amount equal to $dv$, the magnitude of $d\mathbf{v}$. On the bottom $d\mathbf{v}$
is at an angle $\theta$ to $\mathbf{v}$. In this case adding $d\mathbf{v}$ to $\mathbf{v}$ increases the magnitude of the
velocity by an amount equal to $dv \cos\theta$, the component of $d\mathbf{v}$ along the direction
of $\mathbf{v}$.

We would like to generalize Eq. (7-31) to two or three dimensions. A
way to do this can be found by noting that the work done by the force
acting on the body is related to the change in its speed, that is, to the change
in the magnitude of its velocity. In Sec. 3-3 we saw that a velocity vector $\mathbf{v}$
changes its magnitude during the infinitesimal time interval $dt$ only when
the infinitesimal change in velocity $d\mathbf{v}$ has a component in the direction of
$\mathbf{v}$. The point is seen again in Fig. 7-9, which shows that the change in the
magnitude of $\mathbf{v}$ just equals the parallel component of $d\mathbf{v}$. Now, $d\mathbf{v}$ is parallel
to the force $\mathbf{F}$ acting on the body, since $\mathbf{F} = m\,d\mathbf{v}/dt$. So it is only $F_\parallel$, the
component of $\mathbf{F}$ parallel to the velocity $\mathbf{v}$ of the body, which is effective in
changing the speed of the body. But $\mathbf{v}$ is always in the direction of the infinitesimal
displacement $d\mathbf{s}$ of the body during $dt$, since $\mathbf{v} = d\mathbf{s}/dt$. Therefore
$F_\parallel$ is also the component of $\mathbf{F}$ parallel to $d\mathbf{s}$. And therefore the change in the
speed of a body arises from the component $F_\parallel$ of the force acting on it
which is parallel to its displacement, and from that component only. Since
the change in speed is related to the change in kinetic energy, we are led to
conclude that $F_\parallel$ is what does work on the body in the case of two- or
three-dimensional motion.

In one dimension, the work $W$ done by any force $F$ acting along the $x$
axis is

$$W = \int_{x_i}^{x_f} F\,dx$$

when the body it acts on moves in a succession of displacements $dx$ from a
point on that axis with coordinate $x_i$ to another point on the axis with coordinate
$x_f$. The preceding discussion leads us to generalize this definition to
several dimensions, as follows. Suppose that a body acted on by a force $\mathbf{F}$
moves in a succession of displacements $d\mathbf{s}$, from a point on the path of the
body with coordinate $s_i$ to a point elsewhere on the path with coordinate $s_f$.
We take the work $W$ done by the force to be, by definition,

$$W = \int_{s_i}^{s_f} F_\parallel\,ds$$

The symbol $F_\parallel$ represents the component of the force $\mathbf{F}$ that is parallel to
each small displacement $d\mathbf{s}$ along the path of the body. The symbol $ds$ represents
the magnitude of the displacement. And the coordinate $s$ is measured
along the path. This definition is illustrated in Fig. 7-10. If $\theta$ is the angle
between the vectors $\mathbf{F}$ and $d\mathbf{s}$, the same figure shows that

$$F_\parallel = F \cos\theta \tag{7-32}$$

Thus the definition of the work $W$ done by the force in the complete displacement
from $s_i$ to $s_f$ can be written as

$$W = \int_{s_i}^{s_f} F \cos\theta\,ds \tag{7-33a}$$


A definition equivalent to Eq. (7-33a) is given by the equation

$$dW = F \cos\theta\,ds \tag{7-33b}$$

Fig. 7-10 The general definition of work. A body follows a path from an
initial location specified by the coordinate $s_i$ to a final location specified by the
coordinate $s_f$, while a force $\mathbf{F}$ is applied to it. The coordinate $s$ is measured along
the path from some origin $O$. In an infinitesimal part of the path $d\mathbf{s}$, for
which $\mathbf{F}$ and $d\mathbf{s}$ are at the angle $\theta$, the infinitesimal amount of work $dW$ done
by the force on the body has the value $dW = F_\parallel\,ds$. Here $F_\parallel = F \cos\theta$ is the
component of $\mathbf{F}$ parallel to $d\mathbf{s}$, and $ds$ is the magnitude of $d\mathbf{s}$. The total work $W$
done by the force on the body is the sum of these infinitesimal contributions.
That is, the total work is the integral $W = \int_{s_i}^{s_f} F \cos\theta\,ds$.


This says that in an infinitesimal displacement of magnitude $ds$, the infinitesimal
element of work done $dW$ is obtained by taking the product of $ds$
and $F \cos\theta$. The quantity $F \cos\theta$ is used, instead of just $F$, since only the
parallel component of the force vector is wanted. To calculate the total
work $W$ done when the body moves along its path between the points $s_i$ and
$s_f$, the right side of Eq. (7-33b) is integrated between these limits, and the
left side is integrated between the corresponding limits $W_i$ and $W_f$. Doing
so, we have

$$\int_{W_i}^{W_f} dW = \int_{s_i}^{s_f} F \cos\theta\,ds$$

According to the fundamental theorem of calculus, the integral on the left
side of this equality has the value $W_f - W_i$. We write the value as $W$, the
total work done. Hence

$$\int_{W_i}^{W_f} dW = W_f - W_i = W$$

We therefore have

$$W = \int_{s_i}^{s_f} F \cos\theta\,ds$$

in agreement with Eq. (7-33a).

The real justification of Eq. (7-33a) is that it leads to the same relation
between work and change in kinetic energy as is obtained in one dimension.
We will show this shortly. But first we must make a small digression
for the purpose of introducing a vector notation which is very convenient
to use in manipulating the expression $F \cos\theta\,ds$.

The scalar product of two vector quantities $\mathbf{A}$ and $\mathbf{B}$ is a scalar quantity
equal to $A(\cos\theta)B$, which we will write as $A \cos\theta\,B$, where $A$ and $B$ are their
magnitudes and $\theta$ is the angle between their directions. The angle $\theta$ is
always counted as positive, and is always the smaller of the two angles
formed by the directions of $\mathbf{A}$ and $\mathbf{B}$. Thus $\theta$ always lies in the range
$0 \le \theta \le \pi$. Since $A$ and $B$ are positive, the sign of $A \cos\theta\,B$ is determined by
the sign of $\cos\theta$. It is positive in the range $0 \le \theta < \pi/2$ and negative in the
range $\pi/2 < \theta \le \pi$.

The scalar product is written symbolically as $\mathbf{A} \cdot \mathbf{B}$, with the bold dot
being used to indicate the specific mathematical operation involved in the
scalar product. In spoken language the scalar product is expressed as "A
dot B," and consequently the scalar product is often called the dot product.
Thus, by definition

$$\mathbf{A} \cdot \mathbf{B} = A \cos\theta\,B \tag{7-34}$$

It should be emphasized that the complete symbol $\mathbf{A} \cdot \mathbf{B}$ represents a
scalar, even though it contains the two vector symbols $\mathbf{A}$ and $\mathbf{B}$. This is so
because its third part is the symbol $\cdot$ which stands for the operation of using
these two vectors to evaluate the scalar on the right side of Eq. (7-34). Figure
7-11 illustrates $\mathbf{A} \cdot \mathbf{B}$ for a variety of cases. Note that $\mathbf{A} \cdot \mathbf{B} = AB$ if $\mathbf{A}$ is
parallel to $\mathbf{B}$, that $\mathbf{A} \cdot \mathbf{B} = -AB$ if $\mathbf{A}$ is antiparallel (in other words, is oppositely
directed) to $\mathbf{B}$, and that $\mathbf{A} \cdot \mathbf{B} = 0$ if $\mathbf{A}$ is perpendicular to $\mathbf{B}$. The definition
makes it clear that $\mathbf{A} \cdot \mathbf{B} = \mathbf{B} \cdot \mathbf{A}$. That is, the scalar product is
commutative.
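The special cases just listed can be checked with a short Python sketch. It evaluates Eq. (7-34) directly and compares the result with the sum of products of corresponding components. That component formula is a standard identity which is not derived in this section and is used here only as an assumption; the magnitudes and angles are arbitrary sample values.

```python
import math


def dot_from_angle(a_mag, b_mag, theta):
    """Eq. (7-34): A cos(theta) B, with theta in radians."""
    return a_mag * math.cos(theta) * b_mag


def dot_from_components(a, b):
    """Sum of products of corresponding components (assumed identity)."""
    return sum(a_k * b_k for a_k, b_k in zip(a, b))


def vector(mag, angle):
    """Components of a plane vector of given magnitude and direction angle."""
    return (mag * math.cos(angle), mag * math.sin(angle))


A_mag, B_mag = 3.0, 2.0
results = []
for theta_deg in (0, 60, 90, 120, 180):
    theta = math.radians(theta_deg)
    a = vector(A_mag, 0.0)      # A lies along the x axis
    b = vector(B_mag, theta)    # B is turned from A by the angle theta
    results.append((theta_deg,
                    dot_from_angle(A_mag, B_mag, theta),
                    dot_from_components(a, b),
                    dot_from_components(b, a)))

for theta_deg, by_angle, a_dot_b, b_dot_a in results:
    print(f"theta = {theta_deg:3d} deg  A cos(theta) B = {by_angle:+.4f}  A.B = {a_dot_b:+.4f}  B.A = {b_dot_a:+.4f}")
```

The sign of the product follows the sign of $\cos\theta$, and interchanging the two vectors leaves the value unchanged.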


Fig. 7-11 (a) The dot product $\mathbf{A} \cdot \mathbf{B}$ of two vectors $\mathbf{A}$ and $\mathbf{B}$, whose directions
are separated by the angle $\theta$, can be obtained by multiplying the magnitude
of $\mathbf{A}$ and the component of $\mathbf{B}$ along the direction of $\mathbf{A}$. In other
words, the value is $A(\cos\theta\,B)$. (b) Alternatively, $\mathbf{A} \cdot \mathbf{B}$ can be obtained by
multiplying the component of $\mathbf{A}$ along the direction of $\mathbf{B}$ and the magnitude
of $\mathbf{B}$. In other words, the value is $(A \cos\theta)B$. (c) If $\mathbf{A}$ is parallel to $\mathbf{B}$,
then $\theta = 0$ and $\cos\theta = 1$, so $\mathbf{A} \cdot \mathbf{B} = A \cos\theta\,B = AB$. (d) If $\mathbf{A}$ is antiparallel
to $\mathbf{B}$, then $\theta = 180°$ and $\cos\theta = -1$, so $\mathbf{A} \cdot \mathbf{B} = A \cos\theta\,B = -AB$.
(e) If $\mathbf{A}$ is perpendicular to $\mathbf{B}$, then $\theta = 90°$ and $\cos\theta = 0$, so $\mathbf{A} \cdot \mathbf{B} = A \cos\theta\,B = 0$.


The  scalar,  or  dot,  product  has  an  obvious  application  in  the  definition 
of  work,  as  well  as  in  many  other  places  in  physics  where  there  is  a need  to 
generate  the  component  of  one  vector  along  the  direction  of  another.  It 
also  has  useful  applications  in  geometry  and  trigonometry,  as  Example  7-4 
shows. 


EXAMPLE  7-4 

The lengths of sides $A$ and $B$ of a triangle and the angle $\phi$ between these sides are
known. Find the length $C$ of the third side in terms of $A$, $B$, and $\phi$.

■ The triangle is drawn in Fig. 7-12a. You must find a rule for this arbitrary triangle
analogous to the Pythagorean theorem for right triangles. For this purpose, it
makes sense to generate an expression for the square of the length of side $C$. This
can be done by considering Fig. 7-12b, which converts the triangle into a diagram
for the vector addition


C = A + B 

You  can  generate  an  expression  for  C2  by  writing  this  equation  again 

C = A + B 

and  then  setting  the  dot  product  of  the  left  sides  of  these  two  equations  equal  to  the 
dot  product  of  their  right  sides.  You  obtain 

C • C = (A  + B)  • (A  + B) 


Fig.  7-12  A triangle,  and  associated  vector 
addition  diagram,  used  to  derive  the  law  of 
cosines  from  the  properties  of  dot  products. 




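Since $\mathbf{C} \cdot \mathbf{C} = C^2$, you have just what you want on the left side of the equality.
Expanding the expression on the right side you obtain

$$C^2 = \mathbf{A} \cdot \mathbf{A} + \mathbf{B} \cdot \mathbf{B} + \mathbf{A} \cdot \mathbf{B} + \mathbf{B} \cdot \mathbf{A}$$

or, since $\mathbf{A} \cdot \mathbf{B} = \mathbf{B} \cdot \mathbf{A}$,

$$C^2 = A^2 + B^2 + 2\,\mathbf{A} \cdot \mathbf{B}$$

Evaluating the dot product, with $\theta = \pi - \phi$ being the angle between the directions of $\mathbf{A}$ and $\mathbf{B}$, you get

$$\begin{aligned}
C^2 &= A^2 + B^2 + 2A \cos\theta\,B \\
    &= A^2 + B^2 + 2AB \cos(\pi - \phi) \\
    &= A^2 + B^2 - 2AB \cos\phi
\end{aligned}$$

So

$$C = \sqrt{A^2 + B^2 - 2AB \cos\phi}$$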

You  may  remember  that  this  result  is  called  the  law  of  cosines,  and  that  its  proof  by 
the  standard  techniques  of  trigonometry  is  considerably  more  involved  than  its  proof 
by  using  the  dot  product. 
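The result of Example 7-4 can be checked numerically. The Python sketch below builds the vector sum $\mathbf{C} = \mathbf{A} + \mathbf{B}$ from components, with $\mathbf{B}$ turned from $\mathbf{A}$ by $\pi - \phi$, and compares the length of $\mathbf{C}$ with the law of cosines. The function names and sample triangles are assumptions chosen for illustration.

```python
import math


def third_side_by_vectors(a_len, b_len, phi):
    """Length of C = A + B when the interior angle between sides A and B is phi."""
    # A lies along the x axis; the direction of B differs from that of A by pi - phi.
    ax, ay = a_len, 0.0
    bx = b_len * math.cos(math.pi - phi)
    by = b_len * math.sin(math.pi - phi)
    return math.hypot(ax + bx, ay + by)


def third_side_by_law_of_cosines(a_len, b_len, phi):
    """C = sqrt(A^2 + B^2 - 2AB cos(phi))."""
    return math.sqrt(a_len**2 + b_len**2 - 2.0 * a_len * b_len * math.cos(phi))


# (A, B, phi in degrees): a 3-4-5 right triangle, an equilateral triangle, and one more.
cases = [(3.0, 4.0, 90.0), (1.0, 1.0, 60.0), (2.0, 5.0, 135.0)]
for a_len, b_len, phi_deg in cases:
    phi = math.radians(phi_deg)
    c_vec = third_side_by_vectors(a_len, b_len, phi)
    c_law = third_side_by_law_of_cosines(a_len, b_len, phi)
    print(f"A = {a_len}, B = {b_len}, phi = {phi_deg} deg: {c_vec:.6f} vs {c_law:.6f}")
```

For $\phi = 90°$ the cosine term vanishes and the rule reduces to the Pythagorean theorem.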


The most general definition of work is Eq. (7-33a). Written in terms of
a scalar product, the definition is

$$W = \int_{s_i}^{s_f} \mathbf{F} \cdot d\mathbf{s} \tag{7-35}$$


The  work  done  on  a body  by  a force  acting  on  it  is  the  integral  from  its  initial  to  final 
position  of  the  dot  product  of  the  force  and  the  infinitesimal  displacements  of  the  body 
along  its  path. 

Now we will use this definition to show that in two or three dimensions
the work done by the net force acting on a body equals the change in
kinetic energy of the body, just as Eq. (7-31) says is the case for one dimension.
The procedure is analogous to that used to obtain Eq. (7-31). In following
it through you will see how to handle the differential calculus of
scalar products.

Again we start with Newton's second law, thereby making it the basis
for the relation we will obtain. The second law

$$\mathbf{F} = m\,\frac{d^2\mathbf{s}}{dt^2}$$

is used to evaluate $\mathbf{F}$ in the work integral. This gives

$$W = \int_{s_i}^{s_f} m\,\frac{d^2\mathbf{s}}{dt^2} \cdot d\mathbf{s}$$

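in analogy to Eq. (7-28). Next we manipulate the quantity being integrated
in very much the same way as in the calculation leading to Eq. (7-29):

$$m\,\frac{d^2\mathbf{s}}{dt^2} \cdot d\mathbf{s} = m\,\frac{d}{dt}\!\left(\frac{d\mathbf{s}}{dt}\right) \cdot d\mathbf{s} = m\,\frac{d\mathbf{v}}{dt} \cdot d\mathbf{s}$$

Now $d\mathbf{s} = \mathbf{v}\,dt$, so we have

$$m\,\frac{d^2\mathbf{s}}{dt^2} \cdot d\mathbf{s} = m\,\frac{d\mathbf{v}}{dt} \cdot \mathbf{v}\,dt = m\,d\mathbf{v} \cdot \mathbf{v}$$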


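This can be put into a more useful form by using the rule for differentiating
the product of two functions to write

$$\frac{d(\mathbf{v} \cdot \mathbf{v})}{dt} = \mathbf{v} \cdot \frac{d\mathbf{v}}{dt} + \frac{d\mathbf{v}}{dt} \cdot \mathbf{v}$$

But the dot product is commutative, so

$$\frac{d(\mathbf{v} \cdot \mathbf{v})}{dt} = \frac{d\mathbf{v}}{dt} \cdot \mathbf{v} + \frac{d\mathbf{v}}{dt} \cdot \mathbf{v} = 2\,\frac{d\mathbf{v}}{dt} \cdot \mathbf{v}$$

Multiplying through by $dt/2$ yields

$$\frac{d(\mathbf{v} \cdot \mathbf{v})}{2} = d\mathbf{v} \cdot \mathbf{v}$$

Then multiplying through by $m$, and interchanging the sides, we obtain

$$m\,d\mathbf{v} \cdot \mathbf{v} = \frac{m}{2}\,d(\mathbf{v} \cdot \mathbf{v})$$

Substituting this into the expression for $m\,(d^2\mathbf{s}/dt^2) \cdot d\mathbf{s}$ gives

$$m\,\frac{d^2\mathbf{s}}{dt^2} \cdot d\mathbf{s} = \frac{m}{2}\,d(\mathbf{v} \cdot \mathbf{v}) = \frac{m}{2}\,d(v^2)$$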


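From here on the calculation is identical to that leading to Eq. (7-30). We
write the work integral as

$$W = \int_{v_i^2}^{v_f^2} \frac{m}{2}\,d(v^2) = \frac{m}{2} \int_{v_i^2}^{v_f^2} d(v^2)$$

and then use the fundamental theorem to evaluate the integral. We obtain

$$W = \frac{mv_f^2}{2} - \frac{mv_i^2}{2} \tag{7-36}$$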


Just as it did in one dimension, this result prompts us to define the
kinetic energy $K$ of a body of mass $m$ moving with speed $v$ to be

$$K = \frac{mv^2}{2} \tag{7-37a}$$

The kinetic energy of a body is one-half its mass times the square of its speed. The
value of $K$ can be expressed in terms of the velocity $\mathbf{v}$ of a body by using the
relation $\mathbf{v} \cdot \mathbf{v} = v^2$ to write

$$K = \frac{m\,\mathbf{v} \cdot \mathbf{v}}{2} \tag{7-37b}$$


Introducing  the  definition  of  kinetic  energy  into  Eq.  (7-36),  we  obtain 
an  important  relation  between  the  work  W done  by  the  net  force  acting  on 
a body  and  the  initial  and  final  values  of  the  kinetic  energy  of  the  body: 

$$W = K_f - K_i \tag{7-38a}$$

If we write the change in kinetic energy as $\Delta K = K_f - K_i$, the relation becomes

$$W = \Delta K \tag{7-38b}$$


That is, the change in the kinetic energy of a body equals the work done by the net
force acting on it during its motion. This is the work-kinetic energy relation.
It applies in one, two, or three dimensions. But since the relation was obtained
by using Newton's second law, it is valid only in an inertial frame of
reference.

Example  7-5  applies  the  work-kinetic  energy  relation  in  a two- 
dimensional  situation. 


EXAMPLE  7-5 

A projectile of mass 1.25 kg is fired horizontally with an initial speed of 30.0 m/s at
an initial elevation above the ground of 10.0 m. Moving with negligible air resistance
under the influence of gravity, it follows the parabolic trajectory shown in Fig.
7-13 until it strikes the ground.

a.  Calculate  the  work  done  on  the  projectile  by  the  gravitational  force  while  it 
is  in  flight. 

b.  Then  use  the  work-kinetic  energy  relation  to  determine  the  speed  of  the 
projectile  just  before  it  strikes  the  ground. 

■ a. The gravitational force acting on the projectile is $\mathbf{F} = m\mathbf{g}$, where $\mathbf{g}$ is the
gravitational acceleration. The work done by this force on the projectile while it is
in flight is, according to Eq. (7-35),

$$W = \int_{s_i}^{s_f} \mathbf{F} \cdot d\mathbf{s}$$

At any point in the trajectory

$$\mathbf{F} \cdot d\mathbf{s} = m\mathbf{g} \cdot d\mathbf{s} = mg \cos\theta\,ds$$

Here $\theta$ is the angle, illustrated in Fig. 7-13, between the downward gravitational
force $m\mathbf{g}$ and the infinitesimal displacement $d\mathbf{s}$ along the trajectory at that point.
Choosing $x$ and $y$ axes, with positive directions to the right and upward as in the figure,
allows you to relate $ds$ to the infinitesimal change $dy$ in the projectile's $y$ coordinate.
The figure shows that the relation is

$$dy = -\cos\theta\,ds$$

Using this relation in the preceding equation, you have

$$\mathbf{F} \cdot d\mathbf{s} = -mg\,dy$$

Thus you can write the work integral as

$$W = \int_{s_i}^{s_f} \mathbf{F} \cdot d\mathbf{s} = \int_{y_i}^{y_f} -mg\,dy$$

where $y_i$ and $y_f$ are the initial and final values of the coordinate $y$ that correspond to
the initial and final values of the coordinate $s$ measured along the trajectory. Since
$mg$ is a constant, you have

$$W = -mg \int_{y_i}^{y_f} dy$$

Fig. 7-13 A projectile moving without air resistance along
a parabolic trajectory.

The fundamental theorem of calculus then immediately gives you the result

$$W = -mg\,(y_f - y_i)$$

Setting $y_i = 0$, $y_f = -10.0$ m, $m = 1.25$ kg, and $g = 9.80$ m/s², you obtain the
numerical value

$$W = -1.25\ \text{kg} \times 9.80\ \text{m/s}^2 \times (-10.0\ \text{m}) = 123\ \text{kg} \cdot \text{m}^2/\text{s}^2$$

or

$$W = 123\ \text{J}$$

This  is  the  work  done  by  the  gravitational  force  acting  on  the  projectile  while  it  is  in 
flight.  How  does  it  compare  with  the  work  done  if  the  projectile  is  simply  dropped 
from  the  same  height? 

b. Since the only force acting on the projectile while it is in flight is the gravitational
force, this is the net force acting on it. Therefore you can use the work-kinetic
energy relation of Eq. (7-38a) to write

$$K_f - K_i = W$$

According to Eq. (7-36), the initial and final values of the projectile's kinetic energy
are

$$K_i = \frac{mv_i^2}{2} \qquad \text{and} \qquad K_f = \frac{mv_f^2}{2}$$

where $m$ is its mass and $v_i$ and $v_f$ are the initial and final values of its speed. So you
have

$$\frac{mv_f^2}{2} - \frac{mv_i^2}{2} = W$$

Using the expression for $W$ obtained in part a, with $y_i = 0$, gives you

$$\frac{mv_f^2}{2} - \frac{mv_i^2}{2} = -mg\,y_f$$

or

$$v_f^2 = v_i^2 - 2g\,y_f$$

Taking square roots of both sides of this equality, you obtain the result

$$v_f = \sqrt{v_i^2 - 2g\,y_f}$$

Positive  roots  are  used  since  a speed  is  necessarily  positive. 

Setting $v_i = 30.0$ m/s, $g = 9.80$ m/s², and $y_f = -10.0$ m, you find that the final
speed of the projectile is

$$v_f = \sqrt{(30.0\ \text{m/s})^2 - 2 \times 9.80\ \text{m/s}^2 \times (-10.0\ \text{m})}$$

or

$$v_f = 33.1\ \text{m/s}$$

You  can  verify  that  this  value  is  correct  by  obtaining  it  from  the  equations 
developed  in  Sec.  3-1. 
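The numbers in Example 7-5 can be reproduced with a short Python sketch. It adds up $\mathbf{F} \cdot d\mathbf{s}$ over many small displacements along the parabolic trajectory, then applies the work-kinetic energy relation to find the final speed, and finally compares that speed with the one obtained from the velocity components at impact. The component form of the dot product, the number of steps, and the variable names are assumptions made for illustration.

```python
import math

m, g = 1.25, 9.80        # mass in kg and gravitational acceleration in m/s^2
v_i, drop = 30.0, 10.0   # initial horizontal speed in m/s and initial height in m

# Time of flight for a horizontally fired projectile falling through `drop`.
t_flight = math.sqrt(2.0 * drop / g)

# Components of the gravitational force: zero horizontally, -mg vertically.
fx, fy = 0.0, -m * g


def position(t):
    """Projectile coordinates (x, y), with y = 0 at the firing point."""
    return (v_i * t, -0.5 * g * t**2)


# Sum F . ds over many small displacements along the trajectory.
steps = 1000
work = 0.0
x0, y0 = position(0.0)
for k in range(1, steps + 1):
    x1, y1 = position(t_flight * k / steps)
    work += fx * (x1 - x0) + fy * (y1 - y0)
    x0, y0 = x1, y1

# Work-kinetic energy relation: m v_f^2 / 2 - m v_i^2 / 2 = W.
v_f = math.sqrt(v_i**2 + 2.0 * work / m)

# Independent check from the velocity components at impact.
v_f_kinematic = math.hypot(v_i, g * t_flight)

print(f"W   = {work:.1f} J")
print(f"v_f = {v_f:.1f} m/s (work-energy), {v_f_kinematic:.1f} m/s (kinematics)")
```

Because the horizontal component of the force is zero, only the vertical displacements contribute to the sum, which is why the result depends only on $y_f - y_i$.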


7-5 CONSERVATIVE FORCES


We saw in Sec. 7-1 that energy relations are particularly useful in analyzing
systems whose total mechanical energy remains constant or, in other words,
whose total mechanical energy is conserved. When a system is observed
from an inertial reference frame, its total mechanical energy will be conserved
if two conditions are satisfied:

1. The system must be isolated. For present purposes, this means that
either no external forces act on its constituent bodies or every external force
that acts on these bodies does zero work on them in the course of any possible
motions of the bodies.

2. Each force internal to the system (that is, a force exerted on one
body of the system by another body of the system) has the following property:
The force does zero total work when the bodies contained in the
system move from any given arrangement to any other arrangement and
then back to the original arrangement. If this is the case, the bodies comprising
the system complete their "round trips" with zero total work having
been done on them by the forces that act within the system.

It is easy to understand why condition 1 must be satisfied by the forces
acting on a system from the outside if there is to be conservation of a
system's total mechanical energy. If the condition is not satisfied, then work
will be done on the system by the external forces, and so the total mechanical
energy of the system cannot remain constant. But it will take some effort
to understand why condition 2 must be satisfied by the forces acting
from within the system if its total mechanical energy is to be conserved. In
this section we will study the properties of forces which meet condition 2, as
well as the properties of forces which do not. Forces which do meet condition
2 are called conservative forces because of the role they play in the
conservation of total mechanical energy.

A simple  example  of  a system  which  conserves  its  mechanical  energy 
is  depicted  in  Fig.  7-14.  A body  of  mass  m is  connected  to  one  end  of  a 
spring  of  negligible  mass.  The  other  end  is  fixed.  The  body  is  supported 
against  gravity  by  the  frictionless  surface  of  an  air  track.  The  forces  acting 
on  the  body  are  a downward  gravitational  force  of  magnitude  mg  produced 
by  the  attraction  of  the  earth,  an  upward  supporting  force  of  the  same 
magnitude  exerted  by  the  air  track,  and  a Hooke’s-law  force  exerted  by  the 
spring.  Choosing  an  x axis  along  the  length  of  the  air  track,  we  write  the 
spring  force  as  F(x)  = —kx.  The  coordinate  x measures  the  extension  (x  > 
0)  or  compression  (x  < 0)  of  the  spring.  If  the  body  is  pulled  from  its  stable 
equilibrium  position  (x  = 0)  and  then  released,  it  will  oscillate  along  the  x 
axis  about  the  equilibrium  position  in  harmonic  motion. 

Fig. 7-14 A block supported by an air track and connected to one end
of a spring. The other end of the spring is attached to a fixed pin.

We can take the body-plus-spring system as an isolated system containing
a single moving body acted on by the single internal force produced
by the spring. In other words, we consider the essentially massless spring
not as a body in its own right, but as the source of the internal force acting
on the body connected to its movable end. What about the forces exerted
on the moving body by the earth and the air track? The downward force
exerted on the body by the earth and the upward force exerted on it by the
air track have the function of constraining the body to move in the horizontal
plane. They are external to the body-plus-spring system. But since each
is always acting in a direction perpendicular to the displacement of the
body, neither does work on the body. Such forces are called workless constraints.
Another workless constraint is the force exerted on the fixed end
of the spring by the pin in the air track. No work is done by this force since
its point of application is not displaced. So the body-plus-spring system is
isolated, in the sense of condition 1, because the only external forces acting
on it are workless constraints. Another significant advantage of using energy
relations to analyze mechanical systems, instead of using Newton's
laws of motion directly, is that workless constraint forces can be ignored.

The internal force acting on the body in the system, mentioned in condition
2, is the force exerted on it by the spring. This force does work on
the body because it acts along the same line as the body's displacement.
However, the total work it does is zero when the body moves from some location
to some other location and then back to the original location. We will
show this by following the body through such a journey.

Consider the oscillating body at an instant when it is moving to the
right and passes the point $x = 0$ so that the spring is relaxed. The round
trip will consist of its journey from that point to the point where the spring
has its maximum extension, and then back to the point $x = 0$. In the first
half of the trip, the spring force does a certain amount of negative work on
the body. The work is negative since the force exerted by the spring acts to
the left and the displacement of the body it acts on is to the right. In the second
half of the trip, the spring force does positive work on the body because
the force acts to the left and the displacement is also to the left.

Focus attention on the infinitesimal segment of the path traversed by
the body which is located at $x$ and whose length is $|dx|$. The body passes
through this segment twice, once in each half of the round trip. Since the
body travels in the positive direction during the first half of the trip, its displacement
$dx_{\text{first half}}$ as it passes through the segment the first time has a positive
value. And since the body travels in the negative direction during the
second half of the trip, its displacement $dx_{\text{second half}}$ as it passes through the
segment the second time has a negative value. That is,

$$dx_{\text{second half}} = -dx_{\text{first half}}$$

In each passage of the body through the path segment, the spring exerts
the same force $F(x)$ on the body, since $x$ has the same value both times. The
contributions to the work integral from the two passages are

$$dW_{\text{first half}} = F(x)\,dx_{\text{first half}}$$

and

$$dW_{\text{second half}} = F(x)\,dx_{\text{second half}} = -F(x)\,dx_{\text{first half}} = -dW_{\text{first half}}$$

Thus the two contributions cancel:

$$dW_{\text{first half}} + dW_{\text{second half}} = 0$$


Fig. 7-15 (a) A body traveling from one point to another along path 1 and then
traveling back to the starting point along path 2. The coordinate measured
along path 1 of the starting point is $s_{1a}$, and that of the intermediate point is $s_{1b}$.
The coordinate measured along path 2 of the intermediate point is $s_{2b}$, and that
of the starting point is $s_{2a}$. (b) For a conservative force, the work it does on a
body when the body moves from a point specified by coordinate $s_a$ to a point
specified by coordinate $s_b$ does not depend on the path followed by the
body in moving between the points. Thus the work done is the same
whether the body follows path 1 or path 2.


This  argument  does  not  depend  on  the  particular  value  of  x which  specifies 
the  location  of  the  path  segment.  Consequently,  it  holds  for  all  values  of  x 
within  the  region  traversed  by  the  body.  The  total  work  done  by  the  spring 
force  in  the  round  trip  is  therefore  zero. 

By applying Eq. (7-38b),

\[ W = \Delta K \]

to the round trip, we can conclude immediately that the kinetic energy $K$ of
the body when it returns to $x = 0$ must equal its kinetic energy when it left
that point. This is true because the equation says that if the work done on
the body is $W = 0$, then its change in kinetic energy is $\Delta K = 0$. The
same conclusion will be reached for the next round trip made by the oscillating
body, and so on. Therefore the body always has the same value of $K$ when it
passes through the point $x = 0$. When the body is at $x = 0$, the spring is at
its relaxed length. At that length the spring cannot exert a force on
anything. So it has no energy content because it has no ability to do work.
Hence,  the  total  mechanical  energy  of  the  system  equals  the  kinetic  energy 
of  the  body  at  x = 0.  We  have  thus  shown  that  the  total  mechanical  energy 
of  the  system  has  the  same  value  each  time  the  body  passes  x = 0,  because 
the  spring  force  does  zero  total  work  on  the  body  in  each  of  its  round  trips 
from  x = 0.  This  is  consistent  with  condition  2 stated  at  the  beginning  of 
this  section. 
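
The cancellation argument above can be checked numerically. The sketch below is only an illustration: it assumes a Hooke's-law force F(x) = -kx (the argument in the text holds for any F(x) that depends on x alone), and the values of `k` and `x_max` are made up. It sums F(x) dx over the outward leg and over the return leg of the round trip.

```python
def spring_force(x, k):
    """Assumed Hooke's-law force on the body at displacement x from x = 0."""
    return -k * x

def work_along(x_start, x_end, k, steps=100_000):
    """Midpoint-rule estimate of the integral of F(x) dx from x_start to x_end."""
    dx = (x_end - x_start) / steps  # dx is negative on the return leg
    total = 0.0
    for i in range(steps):
        x_mid = x_start + (i + 0.5) * dx
        total += spring_force(x_mid, k) * dx
    return total

k = 50.0      # force constant in N/m (illustrative value)
x_max = 0.20  # maximum extension in m (illustrative value)

w_out = work_along(0.0, x_max, k)   # first half: negative work
w_back = work_along(x_max, 0.0, k)  # second half: positive work
w_round_trip = w_out + w_back

print(w_out, w_back, w_round_trip)
```

The outward leg gives negative work and the return leg gives positive work of the same size, so the sum vanishes to within rounding error.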

Soon we will learn how to evaluate the energy content of the spring
when $x \neq 0$. We will find that the total mechanical energy of the
body-plus-spring system has the same value whatever the location of the body, so
that the system conserves its total mechanical energy. This will prove to be
a consequence of the fact that the spring force does zero total work in a
round trip from any point $x \neq 0$ back to the same point.

We now consider conservative forces in systems where bodies move in
more than one dimension. A general test for a conservative force is that it
must do zero total work on a body when the body moves through any closed path.
Expressed mathematically, a conservative force $\mathbf{F}$ is defined as one
that satisfies the relation

\[ \int_{s_{1a}}^{s_{1b}} \mathbf{F} \cdot d\mathbf{s}_1 + \int_{s_{2b}}^{s_{2a}} \mathbf{F} \cdot d\mathbf{s}_2 = 0 \tag{7-39a} \]

The body on which the force acts makes the round trip depicted in Fig.
7-15a. The first integral is the work done along some path labeled 1 as the
body moves from some position whose coordinate, measured along that
path, is $s_{1a}$ to some other position whose coordinate is $s_{1b}$. The second
integral is the work done in the body's return along some other path 2 from the
position on that path whose coordinate, measured along it, is $s_{2b}$ to the
position whose coordinate is $s_{2a}$. Note that $s_{1b}$ and $s_{2b}$ are
corresponding values of the coordinates $s_1$ and $s_2$; that is, they specify
the same position. The same is true for $s_{1a}$ and $s_{2a}$. We can write the
relation defining a conservative force in the compact form

\[ W_{a\,\text{to}\,b\,\text{on}\,1} + W_{b\,\text{to}\,a\,\text{on}\,2} = 0 \tag{7-39b} \]

by introducing the notation

\[ \int_{s_{1a}}^{s_{1b}} \mathbf{F} \cdot d\mathbf{s}_1 \equiv W_{a\,\text{to}\,b\,\text{on}\,1} \tag{7-39c} \]




and  similarly  for  the  integral  along  path  2.  That  there  is  a connection 
between  conservative  forces  and  energy  conservation  is  evident  from  the 
discussion,  immediately  above,  of  the  body  and  spring,  since  the  discussion 
led  to  a special  case  of  Eq.  (7-39 b).  We  develop  this  connection  fully  in  Sec. 
7-6,  but  first  we  must  learn  more  about  conservative  forces. 

A completely  equivalent  form  of  Eq.  (7-39 b),  which  is  more  conve- 
niently used  to  test  for  a conservative  force,  can  be  obtained  by  considering 
a body  acted  on  by  a conservative  force  which  makes  two  separate  round 
trips  from  the  point  whose  label  is  a to  the  point  whose  label  is  b,  and  then 
back.  In  the  first  trip  it  goes  out  along  path  1 and  then  back  along  path  1. 
By  the  definition  of  a conservative  force,  zero  total  work  is  done  in  this 
round  trip.  So 

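\[ W_{a\,\text{to}\,b\,\text{on}\,1} + W_{b\,\text{to}\,a\,\text{on}\,1} = 0 \]

or

\[ W_{a\,\text{to}\,b\,\text{on}\,1} = -W_{b\,\text{to}\,a\,\text{on}\,1} \tag{7-40} \]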

Thus  we  see  that  reversing  the  direction  in  which  a body  traverses  any  path  between 
any  two  points  reverses  the  sign  of  the  work  done  on  the  body  by  a conservative  force. 
In  the  second  trip  the  body  goes  out  along  path  2 and  then  back  along  path 
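1. Again, the conservative force does zero total work in the round trip,
and we have

\[ W_{a\,\text{to}\,b\,\text{on}\,2} + W_{b\,\text{to}\,a\,\text{on}\,1} = 0 \]

Applying Eq. (7-40) to the second term on the left side of this equation
allows us to rewrite it as

\[ W_{a\,\text{to}\,b\,\text{on}\,2} - W_{a\,\text{to}\,b\,\text{on}\,1} = 0 \]

or

\[ W_{a\,\text{to}\,b\,\text{on}\,1} = W_{a\,\text{to}\,b\,\text{on}\,2} \tag{7-41} \]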

This  relation  must  be  satisfied  if  the  force  acting  on  the  body  is  conserva- 
tive. The  relation  tells  us  that  for  a force  to  be  conservative,  the  work  it  does  when 
the  body  it  acts  on  moves  from  any  position  to  any  other  position  must  depend  only  on 
the two positions. See Fig. 7-15b. In particular, the work that a conservative
force does cannot depend on any other specific characteristics, such as the
path followed by the body, the speed of the body when making the trip, or the
time when it does so. Examples 7-6 through 7-8 demonstrate how Eq. (7-41) is
used to find out whether a force is conservative. (The results of the first two
examples are also used later to evaluate potential energies.)


EXAMPLE  7-6 

A bead moves along the path shown in Fig. 7-16 between the position whose
coordinate is $s_a$ and the position whose coordinate is $s_b$. It is guided in
the path because it is threaded through a wire bent to the shape shown, and it
moves along the path because of its inertia. The bead is connected to one end
of a very extensible spring, whose other end can rotate freely around a fixed
pin. Assume that the spring is so extensible that its relaxed length is
negligible compared to its length when the bead is anywhere along the part of
the wire being considered. This will simplify the mathematics to be used,
without affecting the point of physics to be learned. Evaluate the work
$W_{a\,\text{to}\,b}$ done by the force that the spring exerts on the bead. Then
determine if the spring force is conservative. (Other forces may be exerted on
the bead. In particular, the wire exerts a force on it. But only the spring
force is being investigated here.)

Fig. 7-16 A bead connected to one end of a spring and guided
by a wire from point a to point b. The other end of the spring
is connected to a fixed pin.

■ If you let $\mathbf{r}$ be a vector extending from the pin to the bead, then
its magnitude $r$ is essentially equal to the extension of the spring. The
Hooke's-law force which the spring exerts on the bead will be of magnitude
$kr$, where $k$ is the force constant. Since the force always acts in the
direction toward the pin, the expression

\[ \mathbf{F} = -k\mathbf{r} \]

gives both its direction and its magnitude. Thus

\[ W_{a\,\text{to}\,b} = \int_{s_a}^{s_b} \mathbf{F} \cdot d\mathbf{s} = -k \int_{s_a}^{s_b} \mathbf{r} \cdot d\mathbf{s} \]

where $d\mathbf{s}$ is an infinitesimal element of the path followed by the
bead. Application of the definition of the dot product to Fig. 7-16 shows you
that

\[ \mathbf{r} \cdot d\mathbf{s} = r \cos\theta \, ds \]

But $\cos\theta \, ds$ is equal to $dr$, the change in the distance from the pin
to the bead during the displacement. Thus

\[ \mathbf{r} \cdot d\mathbf{s} = r \, dr \]

and so

\[ W_{a\,\text{to}\,b} = -k \int_{r_a}^{r_b} r \, dr \]

Using Eq. (7-20) with $n = 1$ to evaluate the integral, you obtain

\[ \int_{r_a}^{r_b} r \, dr = \frac{r_b^2}{2} - \frac{r_a^2}{2} \]

Therefore you have

\[ W_{a\,\text{to}\,b} = -k \left( \frac{r_b^2}{2} - \frac{r_a^2}{2} \right) \tag{7-42} \]

The value of $W_{a\,\text{to}\,b}$ in Eq. (7-42) was obtained without specifying
the shape of the path followed by the bead in going from $a$ to $b$. The same
amount of work would be done by the force, no matter what path the wire made
the bead follow, providing the bead went from position $a$ to position $b$. So
you have shown that the spring force is conservative. In this particular case,
$W_{a\,\text{to}\,b}$ actually depends not on the complete specification of
positions $a$ and $b$, but only on their distances $r_a$ and $r_b$ from the pin.

The physical reason for these results is as follows. In any displacement
$d\mathbf{s}$, one component of the displacement lies in a direction parallel to
the spring axis, and the other component lies in a direction perpendicular to
that axis. The parallel component stretches the spring along its axis, and the
perpendicular component rotates it about the pin. But only the parallel
component of the displacement leads to work being done by the spring force,
because the force acts along the spring axis. Since the work done in each
displacement is related to how much the spring is stretched by the
displacement, you can see why the total work done depends on only the
quantities $r_a$ and $r_b$, which give the amount of stretch at the beginning
and end of the total displacement. If $r_b$ is larger than $r_a$, as in the
figure, the motion increases the stretch of the spring and the total work done
is negative. This is because the force exerted by the spring on the bead is in
the general direction opposing that motion. Figure 7-17, and the explanation
in its caption, presents a somewhat different version of this argument.

Fig. 7-17 The path from a to b is approximated by a sequence of
radial line segments along the spring axis and arcs centered on the pin.
No work is done by the spring force on the arcs, and the total work
done by that force on the radial segments depends only on the total
extension of the spring. The result obtained in Eq. (7-42) can be
thought of as the limit of the result obtained by using this approximate
path, since the approximation to the actual path is improved by
making the arcs and radial line segments shorter and increasing their
number.
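
The path independence found in Example 7-6 can also be seen numerically. The following is a minimal sketch under stated assumptions: the pin is placed at the origin, the relaxed length is neglected as in the example, and the force constant, the end points, and the two wire shapes (`straight_wire`, `bent_wire`) are invented for illustration. Each wire is chopped into short chords, and the force F = -kr is evaluated at the midpoint of each chord.

```python
import math

def work_by_spring(path, k, steps=2000):
    """Estimate the integral of F . ds along path(t), 0 <= t <= 1, for F = -k r."""
    total = 0.0
    x0, y0 = path(0.0)
    for i in range(1, steps + 1):
        x1, y1 = path(i / steps)
        # Force evaluated at the midpoint of the chord from (x0, y0) to (x1, y1).
        fx = -k * 0.5 * (x0 + x1)
        fy = -k * 0.5 * (y0 + y1)
        total += fx * (x1 - x0) + fy * (y1 - y0)
        x0, y0 = x1, y1
    return total

k = 10.0                       # force constant in N/m (illustrative value)
a, b = (1.0, 0.0), (0.0, 2.0)  # bead positions in m, measured from the pin

def straight_wire(t):
    return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))

def bent_wire(t):
    # Same end points as straight_wire; the bulge vanishes at t = 0 and t = 1.
    x, y = straight_wire(t)
    bulge = 1.5 * math.sin(math.pi * t)
    return (x + bulge, y + bulge * math.sin(3 * math.pi * t))

r_a = math.hypot(*a)
r_b = math.hypot(*b)
w_formula = -k * (r_b ** 2 / 2 - r_a ** 2 / 2)  # Eq. (7-42)

w_straight = work_by_spring(straight_wire, k)
w_bent = work_by_spring(bent_wire, k)
print(w_straight, w_bent, w_formula)
```

Both wires give the same work as Eq. (7-42), which involves only the distances of the end points from the pin.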


EXAMPLE  7-7 

The wire of Example 7-6 lies in a vertical plane near the surface of the earth.
Thus the bead experiences a downward force $m\mathbf{g}$, where $m$ is its mass
and $\mathbf{g}$ is the gravitational acceleration. Calculate the work
$W_{a\,\text{to}\,b}$ done by the uniform gravitational force when the bead moves
from $a$ to $b$. Then decide whether the gravitational force is conservative.

■ The calculation that must be performed is similar to the one in Example 7-5.
First you express the gravitational force as

\[ \mathbf{F} = m\mathbf{g} \]

Then you note from Fig. 7-18 that

\[ \mathbf{F} \cdot d\mathbf{s} = F \cos\theta \, ds = F \cos(\pi - \phi) \, ds \]

Fig. 7-18 The evaluation of the dot product $\mathbf{F} \cdot d\mathbf{s}$ for a
particle moving under the influence of a gravitational force
$\mathbf{F} = m\mathbf{g}$.

Now $\cos(\pi - \phi) = -\cos\phi$, so

\[ \mathbf{F} \cdot d\mathbf{s} = -F \cos\phi \, ds \]

The figure also shows that $\cos\phi \, ds = dy$, where $y$ is the vertical
coordinate of the bead whose upward direction is defined to be positive. Thus

\[ \mathbf{F} \cdot d\mathbf{s} = -F \, dy = -mg \, dy \]

The work done by the force is

\[ W_{a\,\text{to}\,b} = \int_{s_a}^{s_b} \mathbf{F} \cdot d\mathbf{s} = \int_{y_a}^{y_b} -mg \, dy = -mg \int_{y_a}^{y_b} dy \]

The integral immediately yields

\[ W_{a\,\text{to}\,b} = -mg(y_b - y_a) = mgy_a - mgy_b \tag{7-43} \]

Since $W_{a\,\text{to}\,b}$ depends only on the positions of the beginning and
end of the path (in fact, only on the heights of these positions), you can
conclude that the uniform gravitational force is conservative.

For a case such as is illustrated in the figure, where $y_b$ represents a higher
elevation than $y_a$, the work $W_{a\,\text{to}\,b}$ will be negative because the
force acts downward and the general motion is upward. You should be able to
give a physical explanation comparable to those at the end of Example 7-6 of
why the work done depends only on the (heights of the) beginning and end points
of the motion of the bead, and not on the particular path it follows in going
between those points.

If  the  wire  is  perfectly  smooth,  then  the  force  it  exerts  on  the  bead  always  is 
normal  to  the  surface  of  the  wire,  and  therefore  always  perpendicular  to  the  bead’s 
displacement  along  the  wire.  Hence  this  force  constraining  the  bead  to  move  along 
the  path  determined  by  the  shape  of  the  wire  can  do  no  work  at  all  on  the  bead  — it 
is  a workless  constraint  and  can  be  ignored.  So  in  such  a case  only  two  forces  are 
acting  on  the  bead  which  do  work — the  spring  force  and  the  gravitational  force. 
Both  are  conservative.  Is  the  net  force  acting  on  the  bead  conservative? 
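
One way to explore the closing question numerically is to integrate the net force (spring plus gravity) along two differently shaped wires joining the same end points. The sketch below is an illustration only. It assumes the pin is at the origin, neglects the relaxed length of the spring as in Example 7-6, and uses made-up values for `k`, `m`, `g` and the wire vertices. Because both forces vary at most linearly with position, evaluating the force at the midpoint of each straight segment introduces no discretization error.

```python
def net_force(x, y, k, m, g):
    """Spring force -k r (pin at the origin) plus the uniform weight (0, -m g)."""
    return (-k * x, -k * y - m * g)

def work_along_polyline(vertices, k, m, g):
    """Sum of F . ds over the straight segments of a wire, F taken at each midpoint."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        fx, fy = net_force(0.5 * (x0 + x1), 0.5 * (y0 + y1), k, m, g)
        total += fx * (x1 - x0) + fy * (y1 - y0)
    return total

k, m, g = 10.0, 0.1, 9.8       # illustrative values in N/m, kg, m/s^2
a, b = (1.0, 0.0), (0.0, 2.0)  # end points of the wire in m (y axis upward)

direct_wire = [a, b]
detour_wire = [a, (3.0, 1.0), (-1.0, 4.0), b]

w_direct = work_along_polyline(direct_wire, k, m, g)
w_detour = work_along_polyline(detour_wire, k, m, g)

# Sum of Eq. (7-42) and Eq. (7-43) for the same end points.
r_a_squared = a[0] ** 2 + a[1] ** 2
r_b_squared = b[0] ** 2 + b[1] ** 2
w_expected = -k * (r_b_squared / 2 - r_a_squared / 2) + (m * g * a[1] - m * g * b[1])

print(w_direct, w_detour, w_expected)
```

The work done by the net force is the sum of the works done by the two forces separately, so it is the same for both wires.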


EXAMPLE  7-8 

The wire in Examples 7-6 and 7-7 is not perfectly smooth. It exerts a contact
friction force of constant magnitude $F$ on the bead. Evaluate the work
$W_{a\,\text{to}\,b}$ done by this force as the bead moves from position $a$ to
position $b$. Is the frictional force conservative?

■ Since the contact friction force will always act on the bead in such a
direction as to oppose its motion, when the bead makes the displacement
$d\mathbf{s}$ along the wire, the direction of the contact friction force
$\mathbf{F}$ will be opposite to the direction of $d\mathbf{s}$. So you can
write the force in the form

\[ \mathbf{F} = -F \, \widehat{d\mathbf{s}} \]

Here $\widehat{d\mathbf{s}}$ is a unit vector in the direction of $d\mathbf{s}$.
(Note that the "hat" covers the entire symbol $d\mathbf{s}$ to remind you that
the vector $\widehat{d\mathbf{s}}$ has a unit magnitude, and not an
infinitesimal magnitude.) Using this form, you have

\[ W_{a\,\text{to}\,b} = \int_{s_a}^{s_b} \mathbf{F} \cdot d\mathbf{s} = -\int_{s_a}^{s_b} F \, \widehat{d\mathbf{s}} \cdot d\mathbf{s} = -F \int_{s_a}^{s_b} \widehat{d\mathbf{s}} \cdot d\mathbf{s} \]

since $F$ is constant. But $\widehat{d\mathbf{s}} \cdot d\mathbf{s} = 1(\cos 0) \, ds = ds$. So

\[ W_{a\,\text{to}\,b} = -F \int_{s_a}^{s_b} ds \]

The integral can be evaluated immediately, and it yields the result

\[ W_{a\,\text{to}\,b} = -F(s_b - s_a) \tag{7-44} \]




At first glance, this may seem much like the results obtained in Examples 7-6
and 7-7. Certainly the mathematical expression in Eq. (7-44) looks quite
similar to the one in Eq. (7-43). But its physical significance is very
different. Since the infinitesimal element of displacement $d\mathbf{s}$ always
lies along the path from $a$ to $b$, the integral of its magnitude $ds$
represents the total length of the path taken between those positions. That
is, in the relation

\[ \int_{s_a}^{s_b} ds = s_b - s_a \]

the quantity $s_b$ is the distance measured along the path from some reference
position on the path to position $b$, $s_a$ is the distance along the path from
the reference position to position $a$, and $s_b - s_a$ is therefore the
distance along the path from $a$ to $b$. Thus the work done depends on the
length of the path followed by the bead on which the frictional force acts. The
quantity $W_{a\,\text{to}\,b}$ is not the same for all paths connecting $a$ and
$b$, and so the frictional force is not conservative. The point is illustrated
in Fig. 7-19.

Equation  (7-44)  shows  that  the  work  done  by  the  frictional  force  is  always  nega- 
tive because  the  path  length  is  always  positive.  The  physical  reason  is  that  the  fric- 
tional force  always  acts  on  the  bead  in  the  direction  opposite  to  its  displacement, 
and  consequently  the  force  does  negative  work  in  any  motion  of  the  bead.  Thus 
when  the  bead  makes  a round  trip,  from  a to  b and  then  from  b back  to  a,  this  force 
does  negative  work  on  the  bead  throughout.  Friction  continually  removes  mechan- 
ical energy  from  the  system.  So  mechanical  energy  is  not  conserved  in  a system  in- 
volving frictional  forces,  even  if  it  is  an  isolated  system.  To  put  it  another  way,  since 
one  of  the  forces  contributing  to  the  net  force  acting  on  the  bead  is  the  noncon- 
servative frictional  force,  the  net  force  itself  is  not  conservative. 


It  can  be  said  that  the  contact  friction  force  is  not  a conservative  force 
because  it  depends  on  the  direction  of  the  velocity  v of  the  object  on  which 
it  acts.  This  is  so  since  the  direction  of  v = ds/dt  is  the  direction  of  ds.  The 
fluid  friction  force  depends  on  both  the  direction  and  the  magnitude  of  the 
velocity,  and  it  also  is  not  a conservative  force.  A force  which  does  work 
cannot  be  conservative  if  the  force  varies  with  the  velocity  of  the  object  on 
which  it  acts.  The  work  done  by  such  a force  on  the  object  does  not  depend 
only  on  the  positions  between  which  the  object  travels.  Rather  the  work  de- 
pends both  on  where  the  object  it  acts  on  goes  and  on  how  it  gets  there  (on 
its  direction  and/or  speed  when  making  the  trip),  and  so  it  is  not  conserva- 
tive. Furthermore,  any  force  that  depends  explicitly  on  time— that  is,  on 
when  the  trip  is  taken — is  not  conservative.  Can  you  see  why? 


Fig.  7-19  Three  wires  of  the  same  type  extend  between  a and  b.  If  they 
exert  contact  friction  on  the  bead  when  it  slides  along  them,  then  the  contact 
friction  always  does  negative  work  on  the  bead  as  it  moves  from  a to  b.  The 
amount  of  work  done  is  proportional  to  the  length  of  the  path.  Thus  the 
work  done  is  least  for  the  straight  path,  intermediate  for  the  curved  path,  and 
greatest  for  the  looped  path. 
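
The comparison in Fig. 7-19 can be made concrete with Eq. (7-44). The sketch below is illustrative only: the friction magnitude and the three wire shapes (a straight wire, a semicircular arc, and a wire that doubles back, standing in loosely for the paths of the figure) are assumptions chosen for the example. Each wire is described by a list of points, and its length is the sum of the distances between successive points.

```python
import math

def path_length(points):
    """Total length of a wire described by a list of (x, y) points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def friction_work(points, friction_magnitude):
    """Eq. (7-44): W = -F (s_b - s_a), where s_b - s_a is the length of the path."""
    return -friction_magnitude * path_length(points)

F = 0.5      # magnitude of the contact friction force in N (illustrative value)
n = 10_000   # number of chords used to approximate the arc

# Three wires joining a = (0, 0) and b = (1, 0), coordinates in m.
straight = [(0.0, 0.0), (1.0, 0.0)]
arc = [(0.5 - 0.5 * math.cos(math.pi * i / n), 0.5 * math.sin(math.pi * i / n))
       for i in range(n + 1)]
detour = [(0.0, 0.0), (0.8, 0.3), (0.2, 0.6), (1.0, 0.0)]

w_straight = friction_work(straight, F)
w_arc = friction_work(arc, F)
w_detour = friction_work(detour, F)

# Round trip out and back along the straight wire: the two legs add, not cancel.
w_round_trip = w_straight + friction_work(list(reversed(straight)), F)

print(w_straight, w_arc, w_detour, w_round_trip)
```

The three values differ even though the end points are the same, and the round-trip work is negative rather than zero.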




7-6  POTENTIAL ENERGY AND ENERGY CONSERVATION


In  this  section  we  will  continue  to  generalize  the  concepts  introduced  in 
Sec.  7-1  by  developing  the  connection  between  a conservative  force  and  the 
potential  energy  associated  with  that  force.  Then  we  will  use  this  connec- 
tion to  obtain  a general  statement  of  one  of  the  most  important  laws  of 
physics,  the  law  of  conservation  of  total  mechanical  energy. 

Suppose  that  a system  contains  only  a single  body  whose  position  we 
must consider. Suppose also that the only force acting on the body is a
conservative force exerted because of the presence of something else in the
system.  This  force  is  an  internal  force — it  is  not  an  external  force  applied  to 
the  body  from  outside  the  system.  An  example  is  a system  consisting  of  the 
earth  and  a brick.  The  brick  can  move,  and  so  its  position  must  be  consid- 
ered. But  seen  from  a reference  frame  fixed  to  the  ground,  the  earth 
cannot  move.  The  conservative  force  acting  on  the  movable  body  is  the 
force  of  gravity.  It  is  a force  internal  to  the  system,  and  it  is  exerted  on  the 
brick  because  of  the  presence  in  the  system  of  the  earth.  Another  example 
is  a system  comprising  the  brick  supported  by  a horizontal  air  track  and 
connected  to  one  end  of  a spring,  whose  other  end  is  attached  to  a pin  in 
the  track.  As  seen  by  an  observer  stationed  at  the  air  track,  the  position  of 
the  brick  completely  describes  the  appearance  of  the  system  at  any  instant. 
The  conservative  internal  force  acting  on  the  brick  is  the  force  exerted  on  it 
by  the  spring.  (The  forces  exerted  by  gravity  and  the  air  track  cancel,  and 
so  they  can  be  ignored.) 

Consider  the  system  when  the  movable  body  is  at  some  position  a.  The 
conservative  internal  force  acting  on  it  will  do  work  if  the  body  moves  to 
some  other  position  o.  Thus  when  the  body  is  at  position  a,  there  is  the 
potentiality  that  work  can  be  done  by  this  internal  force,  and  so  it  is  said 
that  the  system  has  potential  energy.  Specifically,  the  potential  energy  of 
the  system  is  defined  as  the  work  which  will  be  done  on  the  body  by  the  internal 
force  acting  on  it  if  the  body  moves  from  position  a to  some  agreed-upon  position  o. 
The  symbol  U is  used  to  represent  potential  energy.  By  the  definition  just 
stated,  its  value  is 

\[ U = W_{a\,\text{to}\,o} \tag{7-45} \]

The  position  o is  called  the  reference  position.  It  is  chosen  on  the  basis  of 
convenience.  In  due  course  you  will  see  many  examples  of  how  this  is  done. 
But note here that the potential energy is zero if the body happens to be at
the reference position. That is, $U = 0$ if $a$ is $o$, since
$W_{o\,\text{to}\,o} = 0$. Note also that $U$ is well defined, even though the
path from $a$ to $o$ is not specified, because the value of
$W_{a\,\text{to}\,o}$ is independent of the path for a conservative force. This
comment
should  make  it  apparent  to  you  that  there  is  no  such  thing  as  a potential  energy 
for  a nonconservative  force.  It  is  impossible  to  associate  a unique  energy  U 
with  each  position  a in  a situation  where  the  work  done  if  the  body  moves 
from  a to  o does  not  have  a unique  value. 
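
As a small illustration of the definition in Eq. (7-45), the sketch below evaluates the potential energy of a body in a uniform gravitational field by using the work expression of Eq. (7-43). The function names, the mass, the heights, and the two candidate reference heights are assumptions made for the example. It shows that U vanishes at the reference position, and that choosing a different reference position leaves the difference in U between two positions unchanged.

```python
def work_by_gravity(y_from, y_to, m, g=9.8):
    """Eq. (7-43): work done by uniform gravity as the body moves from y_from to y_to."""
    return m * g * y_from - m * g * y_to

def potential_energy(y, y_ref, m, g=9.8):
    """Eq. (7-45): U is the work done if the body moves from y to the reference y_ref."""
    return work_by_gravity(y, y_ref, m, g)

m = 2.0                     # mass in kg (illustrative value)
y_low, y_high = 1.0, 3.0    # two positions of the body, heights in m

# Two different but equally valid choices of the reference height.
difference_ref_floor = potential_energy(y_high, 0.0, m) - potential_energy(y_low, 0.0, m)
difference_ref_shelf = potential_energy(y_high, 1.5, m) - potential_energy(y_low, 1.5, m)

print(potential_energy(1.5, 1.5, m))   # zero at the reference position
print(difference_ref_floor, difference_ref_shelf)
```

The freedom to pick the reference position on the basis of convenience rests on this: only differences in potential energy enter the relations developed in the rest of the section.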


We  now  embark  on  an  argument  which  will  lead  to  the  law  of  mechan- 
ical energy  conservation.  Consider  a system  containing  a single  movable 
body.  The  system  is  isolated  from  all  external  forces  that  can  do  work  on  it. 
(This  allows  there  to  be  external  forces  from  workless  constraints — such  as 
a force  exerted  by  a completely  frictionless  track  that  guides  the  movable 
body — since  such  forces  do  no  work  on  the  system.)  Let  the  net  internal  force 
acting on the movable body be conservative, so that the system has a potential
energy corresponding to this force. Suppose the body moves from position $a$,
where the potential energy has the value $U$, to some other position $a'$,
where the potential energy is $U'$, as shown in Fig. 7-20. The potential
energy changes by the amount

\[ \Delta U = U' - U = W_{a'\,\text{to}\,o} - W_{a\,\text{to}\,o} \]

But

\[ W_{a'\,\text{to}\,o} = W_{a'\,\text{to}\,a} + W_{a\,\text{to}\,o} \]

The reason is that the work done by the conservative net force is
path-independent. So it will be the same for either of the paths indicated in
the figure connecting $a'$ with the reference position $o$. Substituting the
value of $W_{a'\,\text{to}\,o}$ in the equation for $\Delta U$, we find the
change in the potential energy of the system is

\[ \Delta U = W_{a'\,\text{to}\,a} + W_{a\,\text{to}\,o} - W_{a\,\text{to}\,o} = W_{a'\,\text{to}\,a} \]

Now we can write

\[ W_{a'\,\text{to}\,a} = -W_{a\,\text{to}\,a'} \]

because Eq. (7-40) tells us that reversing the direction in which a body
traverses any path between any two points reverses the sign of the work done
on the body by a conservative force. Using this in the equation for
$\Delta U$, we obtain a relation between the potential energy change when the
body moves from $a$ to $a'$ and the work done on the body by the conservative
force in this motion. It is

\[ \Delta U = -W_{a\,\text{to}\,a'} \tag{7-46} \]

This is the work-potential energy relation: The change in the potential energy
of an isolated system equals the negative of the work done by the conservative
net internal force acting on the body during its motion.

Fig. 7-20 Paths followed by a body which are used to establish the
work-potential energy relation.


According to Eq. (7-38b), the work done by the net force acting on the
body as it moves from $a$ to $a'$ is also equal to the change $\Delta K$ in its
kinetic energy if, as we assume, the system is viewed from an inertial
reference frame. Thus, using our present notation,

\[ W_{a\,\text{to}\,a'} = \Delta K \tag{7-47} \]

Combining this with Eq. (7-46), we have

\[ \Delta U = -\Delta K \]

or

\[ \Delta K + \Delta U = 0 \tag{7-48} \]

Since the sum of the changes in two quantities equals the change in their
sum, we can write

\[ \Delta K + \Delta U = \Delta(K + U) \]

So Eq. (7-48) can be written as

\[ \Delta(K + U) = 0 \tag{7-49} \]

This result suggests the utility of defining the sum of the kinetic and
potential energies of the system to be its total mechanical energy $E$. In
symbols,

\[ E = K + U \tag{7-50} \]

Then Eq. (7-49) says

\[ \Delta E = 0 \tag{7-51} \]


The  total  mechanical  energy  of  the  isolated  system  does  not  change 
when  the  body  moves  under  the  influence  of  a conservative  net  force.  The 
reason  is  that  any  change  in  the  kinetic  energy  is  exactly  compensated  for 
by  a change  in  the  potential  energy  so  that  there  is  no  change  in  their  sum, 
the  total  mechanical  energy.  This  observation  leads  us  to  conclude  that  Eq. 
(7-51)  is  equivalent  to  the  statement 

E = K + U = constant  (7-52) 

This  is  the  very  important  law  of  conservation  of  total  mechanical  energy: 

If  a system  is  isolated  except  for  workless  constraints,  and  all  its  internal  forces  are 
conservative  so  that  the  net  internal  force  acting  on  a body  of  the  system  is  conserva- 
tive, then  its  total  mechanical  energy  remains  constant  when  it  is  viewed  from  an  in- 
ertial reference  frame.  We  developed  the  law  by  considering  a system  with 
only  one  moving  body,  but  an  analogous  development  shows  it  to  be  true 
no  matter  how  many  moving  bodies  the  system  contains. 
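
The conservation law can be watched at work in a short simulation of the body-and-spring system on the air track. This is a sketch under stated assumptions, not part of the original development: it takes the spring potential energy to be U = kx^2/2, which follows from Eqs. (7-42) and (7-45) when the reference position is the relaxed length, it advances the motion with Newton's second law using the velocity-Verlet stepping scheme, and all numerical values are invented.

```python
def simulate(m, k, x0, v0, dt, steps):
    """Step m x'' = -k x forward in time and record E = K + U before each step."""
    x, v = x0, v0
    a = -k * x / m
    energies = []
    for _ in range(steps):
        kinetic = 0.5 * m * v * v
        potential = 0.5 * k * x * x  # assumed form, reference at the relaxed length
        energies.append(kinetic + potential)
        # Velocity-Verlet update: position first, then velocity with the mean acceleration.
        x += v * dt + 0.5 * a * dt * dt
        a_new = -k * x / m
        v += 0.5 * (a + a_new) * dt
        a = a_new
    return energies

# Illustrative values: 0.5 kg body, 20 N/m spring, released from rest at x = 0.1 m.
energies = simulate(m=0.5, k=20.0, x0=0.1, v0=0.0, dt=1e-4, steps=20_000)

e_initial = energies[0]
largest_drift = max(abs(e - e_initial) for e in energies)
print(e_initial, largest_drift / e_initial)
```

Kinetic and potential energy trade back and forth during the roughly two oscillations covered, while their sum stays fixed apart from a small error due to the finite time step.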

The  law  of  conservation  of  total  mechanical  energy  applies  to  a wide 
variety  of  systems.  In  each  of  them  this  conservation  law  plays  as  vital  a role, 
in  governing  the  behavior  of  the  system,  as  does  the  law  of  total  momentum 
conservation.  The  law  of  total  mechanical  energy  conservation  is  less  im- 
portant in  physics  than  the  total  momentum  conservation  law  only  in  that 
the  latter  applies  to  every  isolated  system  viewed  from  an  inertial  frame, 
while  the  former  does  not  apply  if  there  are  any  forces  which  are  not  con- 
servative acting  within  such  a system.  In  particular,  any  frictional  force  will 
prevent  the  law  of  total  mechanical  energy  conservation  from  applying  to  a 
system,  since  all  frictional  forces  are  nonconservative. 

The  total  mechanical  energy  of  an  isolated  system  involving  only  conserva- 
tive forces  is  not  constant  when  the  system  is  viewed  from  a reference  frame  that  is 
not  inertial.  Imagine  a block  stationary  on  the  floor  (considered  to  be  an  inertial 
frame).  Think  what  the  kinetic  and  gravitational  potential  energies  will  be  when 
these  quantities  are  evaluated  from  a frame  of  reference  which  is  accelerating  in 
some  horizontal  direction  with  respect  to  the  floor  (and  which  therefore  is  not  an 
inertial  frame).  Since  the  speed  of  the  block  changes  continually,  as  seen  from  this 
frame,  its  kinetic  energy  changes.  But  the  horizontal  motion  of  the  noninertial 
frame  does  not  affect  the  potential  energy,  and  so  the  total  mechanical  energy  does 
not  appear  to  be  constant. 

An  example  of  the  application  of  mechanical  energy  conservation  in 
analyzing  the  behavior  of  a simple  system  was  given  in  Sec.  7-1.  Another 
simple  example  is  given  immediately  below.  Chapter  8 is  devoted  to  work- 
ing out  examples  using  the  energy  relations,  particularly  the  law  of  con- 
servation of  total  mechanical  energy. 


EXAMPLE  7-9 

A pendulum bob of mass $m$ at the end of a cord of length $l$ is displaced from
its stable equilibrium position so that the cord is at right angles to the
vertical (see Fig. 7-21). The bob is then released from rest. How fast will it
be moving when it goes through the bottom of its swing? This is the first
question posed at the beginning of Sec. 7-1.

Fig. 7-21 A pendulum with the cord initially horizontal and the bob released
from rest.


■ You  can  answer  the  question  easily  by  applying  the  law  of  conservation  of  total 
mechanical  energy.  Take  the  frictionless,  isolated  system  to  be  the  bob  plus  the 
earth,  which  you  view  from  the  essentially  inertial  frame  of  the  ground.  The  cord 
acts  as  a workless  constraint,  since  the  force  it  exerts  on  the  bob  is  always  perpendic- 
ular to  the  bob’s  instantaneous  motion.  The  only  internal  force  to  be  considered  is 
the  gravitational  attraction  exerted  by  the  earth  on  the  bob.  To  obtain  an  expression 
for  the  potential  energy  U of  the  system,  combine  its  definition  in  Eq.  (7-45), 

\[ U = W_{a\,\text{to}\,o} \]

with Eq. (7-43)

\[ W_{a\,\text{to}\,b} = mgy_a - mgy_b \]

to obtain

\[ U = mgy_a - mgy_o \]

Then, as indicated in Fig. 7-21, take the reference height $y_o$ to be the
height of the bob at its stable equilibrium position; also use that height to
fix the origin of the upward-directed $y$ axis. With these choices you have
$y_o = 0$, so that

\[ U = mgy_a \]

Dropping the now-unneeded subscript, you get

\[ U = mgy \tag{7-53} \]

This is the potential energy of the system when the bob is at any height $y$
above the reference height $y_o$. The kinetic energy is that of the motion of
the bob (the earth being motionless in your reference frame).

The initial values of the potential, kinetic, and total mechanical energies
are, since $y = l$ initially,

\[ U = mgl \]

\[ K = 0 \]

and

\[ E = K + U = mgl \]

The final values of these energies are

\[ U = 0 \]

\[ K = \frac{mv^2}{2} \]

and

\[ E = K + U = \frac{mv^2}{2} \]

where $v$ is the speed of the bob when $y = 0$. Equating the initial and final
values of the total mechanical energy, you have

\[ mgl = \frac{mv^2}{2} \]

Solving for $v$ produces the equation you set out to find:

\[ v = \sqrt{2gl} \tag{7-54} \]

You  should  use  similar,  but  even  simpler,  energy  considerations  to  answer  the 
questions posed in Sec. 7-1 about the maximum displacement of the bob on the
opposite side of its swing, and about the swing of the interrupted pendulum.
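The energy balance used above can be checked numerically. The following is a minimal sketch, not part of the original example: the function names and the default value g = 9.8 m/s² are assumptions chosen for illustration. It evaluates Eq. (7-54) and the more general relation mgl = mv²/2 + mgy for the speed at any height y.

```python
import math


def speed_at_bottom(l, g=9.8):
    """Speed of the bob at y = 0 after release from rest at height y = l (Eq. 7-54)."""
    return math.sqrt(2.0 * g * l)


def speed_at_height(y, l, g=9.8):
    """Speed at height y from m*g*l = m*v**2/2 + m*g*y; the mass cancels."""
    if y > l:
        raise ValueError("the bob cannot rise above its release height")
    return math.sqrt(2.0 * g * (l - y))


if __name__ == "__main__":
    l = 1.0  # assumed cord length in meters
    print("speed at the bottom:", speed_at_bottom(l))
    print("speed back at the release height:", speed_at_height(l, l))
```

The second call returns zero, which is the energy argument for the bob rising to its release height on the opposite side of the swing.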




7-7  EVALUATION OF FORCE FROM POTENTIAL ENERGY


If  both  the  conservative  net  force  F and  the  motion  of  the  body  it  acts  on  lie 
along  the  x axis,  the  definition  of  Eq.  (7-45)  for  the  potential  energy  U at 
position  xa  reduces  to  the  one-dimensional  form 

$$ U = W_{x_a \to x_0} = \int_{x_a}^{x_0} F\,dx \tag{7-55} $$


where  x0  is  the  reference  position.  This  relation  can  be  used  to  evaluate  the 
potential  energy  from  the  force.  But  sometimes  it  is  necessary  to  go  the 
other  way,  that  is,  to  evaluate  the  force  from  the  potential  energy.  This  can 
be  done  by  using  the  relation 


$$ F = -\frac{dU}{dx} \tag{7-56} $$


To  show  that  Eq.  (7-56)  is  consistent  with  the  definition  of  Eq.  (7-55), 
we  substitute  it  into  the  definition  and  then  show  the  validity  of  the  result. 
We  have 


$$ U = -\int_{x_a}^{x_0} \frac{dU}{dx}\,dx = -\int_{U_a}^{U_0} dU = -(U_0 - U_a) = U_a - U_0 $$


Here  Ua  is  the  value  of  the  potential  energy  at  position  xa,  and  U0  is  its  value 
at  the  reference  position  x0.  As  explained  immediately  below  Eq.  (7-45),  the 
potential  energy  is  zero  at  the  reference  position.  So  U0  = 0,  and  we  have 

$$ U = U_a $$

This  is  certainly  valid  since  the  symbol  on  the  left  is  just  an  abbreviation  for 
the  symbol  on  the  right.  Therefore  Eq.  (7-56)  is  consistent  with  Eq.  (7-55). 
Example  7-10  makes  use  of  Eq.  (7-56). 
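The consistency argument above can also be illustrated numerically: integrate a force to get U as in Eq. (7-55), then differentiate that U to recover the force as in Eq. (7-56). This is a sketch under stated assumptions. The force law in `sample_force` is hypothetical (it does not come from the text), and the midpoint rule and central difference are simply convenient numerical stand-ins for the integral and the derivative.

```python
def potential_from_force(force, x_a, x_0=0.0, n=2000):
    """U(x_a) = integral of F dx from x_a to x_0 (Eq. 7-55), by the midpoint rule."""
    h = (x_0 - x_a) / n
    return sum(force(x_a + (i + 0.5) * h) for i in range(n)) * h


def force_from_potential(potential, x, dx=1e-4):
    """F(x) = -dU/dx (Eq. 7-56), approximated by a central difference."""
    return -(potential(x + dx) - potential(x - dx)) / (2.0 * dx)


def sample_force(x):
    """Hypothetical conservative force in newtons, for x in meters."""
    return -3.0 * x - 0.5 * x**3


def sample_potential(x):
    """Potential energy built from sample_force with reference position x_0 = 0."""
    return potential_from_force(sample_force, x)


if __name__ == "__main__":
    x = 1.2
    print("F given:    ", sample_force(x))
    print("F recovered:", force_from_potential(sample_potential, x))
```

The two printed values should agree to within the accuracy of the numerical approximations.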


EXAMPLE  7-10 


The results of Example 7-6, and particularly Eq. (7-42), show that the potential
energy associated with the force produced by the spring in Fig. 7-16 can be expressed
as

$$ U = \frac{kr_a^2}{2} \tag{7-57} $$


Fig. 7-22  A spring with one end fixed and the other end at $x_a$, a coordinate
measured from the position of the free end when the spring has its relaxed length
(where $x_0 = 0$).


For this expression the fixed pin at one end of the spring is chosen to be the
reference position of the body at the other end, and $r_a$ is the length of the spring when
extended. In obtaining Eq. (7-42) it was assumed that the length of the spring is
negligible when relaxed, so that $r_a$ is also equal to the extension of the spring.
Figure 7-22 shows a spring with the same force constant k whose relaxed length is not
negligible. One end is fixed, and the other is attached to a body which can move
along the x axis. In light of Eq. (7-57), it is reasonable to guess that the potential
energy in this case can be expressed as

$$ U = \frac{kx_a^2}{2} $$

or, without the subscript,

$$ U = \frac{kx^2}{2} \tag{7-58} $$




Here  x is  the  location  of  the  free  end  of  the  spring,  measured  from  an  origin  located 
at  the  free  end  of  the  spring  when  it  is  relaxed.  Thus  x is  also  the  extension  of  the 
spring if x is positive or its compression if x is negative. Show that the force
corresponding to this potential agrees with Hooke’s law, and thereby verify Eq. (7-58).
Also  find  the  reference  position  chosen  for  the  potential  energy  in  Eq.  (7-58). 

■ You employ Eq. (7-56)

$$ F = -\frac{dU}{dx} $$

and obtain

$$ F = -\frac{d}{dx}\left(\frac{kx^2}{2}\right) = -kx $$

Since this expression for F is precisely Hooke’s law, the correctness of the
expression for U in Eq. (7-58) is proved.

The  reference  position  used  in  specifying  the  values  of  a potential  energy  is  that 
position  where  the  potential  energy  is  defined  to  be  zero.  Since  Eq.  (7-58)  yields 
U = 0 when  x = 0,  it  is  apparent  that  the  reference  position  chosen  in  the  definition 
of  this  potential  energy  is  the  origin  of  the  x axis.  This  is  the  location  of  the  free  end 
of  the  spring  when  it  is  at  its  relaxed  length. 


An  important  point  can  be  made  in  connection  with  Example  7-10.  If 
the  reference  position  used  to  define  the  potential  energy  associated  with 
the  spring  force  is  changed  (without  changing  the  way  the  coordinate  x is 
defined),  the  effect  will  be  to  add  a constant  to  the  right  side  of  Eq.  (7-58). 
This  constant  is,  physically,  the  work  that  the  spring  force  does  on  a body 
connected  to  the  end  of  the  spring  if  the  body  moves  to  the  new  reference 
position  from  the  original  reference  position.  But  adding  a constant  to  the 
potential energy U will have no effect at all on the force, $F = -dU/dx$,
calculated from U, because the derivative of a constant is zero. The same
situation occurs for any type of force and its associated potential energy. The
actual value of the potential energy of a system is arbitrary, in the sense that
a change in the choice of the reference position has the effect of adding a
constant to the potential energy. But this makes no change in the force
associated with the potential energy. And since motion is produced by force,
it  makes  no  change  in  the  motion  of  the  body  on  which  the  force  acts. 
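A short numerical sketch can make the point concrete. It assumes the spring of Example 7-10 and moves the reference position from x = 0 to some other point `x_ref`; the function names and the sample values of k and x are hypothetical. The added constant is the work done by the spring force in going from the original to the new reference position, which is −k·x_ref²/2.

```python
def spring_potential(x, k, x_ref=0.0):
    """Spring potential energy with U = 0 at x = x_ref.

    Equals k*x**2/2 plus the constant -k*x_ref**2/2.
    """
    return 0.5 * k * x**2 - 0.5 * k * x_ref**2


def force(potential, x, dx=1e-5):
    """F = -dU/dx, approximated by a central difference."""
    return -(potential(x + dx) - potential(x - dx)) / (2.0 * dx)


if __name__ == "__main__":
    k = 50.0   # assumed force constant, N/m
    x = 0.2    # assumed position, m
    f_original = force(lambda s: spring_potential(s, k), x)
    f_shifted = force(lambda s: spring_potential(s, k, x_ref=0.1), x)
    print(f_original, f_shifted)  # both approximate -k*x
```

The two potentials differ by a constant at every x, yet the computed forces agree, as the argument above requires.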

The choice actually made for a reference position is dictated by
convenience. Usually, it is chosen so as to make the form of the expression for U
as simple as possible. For instance, in Example 7-10 the reference position
was chosen in such a way that Eq. (7-58) had the form $U = kx^2/2$. Any other
choice would lead to an expression of the form $U = kx^2/2 + C$. For the
spring force the simplest expression results from choosing U = 0 at the
origin x = 0. But in other cases a different choice is required to produce the
simplest expression.

In more complicated situations the potential energy may depend on
all three space coordinates x, y, z, so that it is written as U(x, y, z). The
components $F_x$, $F_y$, $F_z$ of the force F along the corresponding axes can be
evaluated from the relations




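$$ F_x = -\frac{\partial U(x, y, z)}{\partial x} \tag{7-59a} $$

$$ F_y = -\frac{\partial U(x, y, z)}{\partial y} \tag{7-59b} $$

$$ F_z = -\frac{\partial U(x, y, z)}{\partial z} \tag{7-59c} $$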


The quantities in these relations are partial derivatives. A partial derivative
of a function of several independent variables is a derivative evaluated by
allowing one of the variables to vary while treating the others as if they
were constants. For instance, the meaning of the quantity on the right side of
the first relation is simply


$$ \frac{\partial U(x, y, z)}{\partial x} = \frac{dU(x, y, z)}{dx} \quad \text{evaluated by treating } y \text{ and } z \text{ as constants} \tag{7-60} $$


The validity of Eqs. (7-59) is established in much the same way as we
verified Eq. (7-56). Example 7-11 uses them.
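The rule "vary one coordinate, hold the others constant" translates directly into a numerical recipe. The sketch below is illustrative only: `force_components` and the potential `sample_U` are hypothetical names and choices, and central differences stand in for the exact partial derivatives of Eqs. (7-59).

```python
def force_components(U, x, y, z, h=1e-5):
    """Return (Fx, Fy, Fz) = (-dU/dx, -dU/dy, -dU/dz) by central differences.

    Each partial derivative varies one coordinate and holds the other two fixed.
    """
    Fx = -(U(x + h, y, z) - U(x - h, y, z)) / (2.0 * h)
    Fy = -(U(x, y + h, z) - U(x, y - h, z)) / (2.0 * h)
    Fz = -(U(x, y, z + h) - U(x, y, z - h)) / (2.0 * h)
    return Fx, Fy, Fz


def sample_U(x, y, z):
    """Hypothetical potential energy in joules: U = x**2 * y + 3*z."""
    return x**2 * y + 3.0 * z


if __name__ == "__main__":
    # Exact partial derivatives give Fx = -2*x*y, Fy = -x**2, Fz = -3.
    print(force_components(sample_U, 1.0, 2.0, 0.5))
```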


EXAMPLE  7-11 


Fig. 7-23  Springs of force constant $k_1$
and $k_2$ connected to a body located
somewhere in the xy plane. The other
end of the first spring is at x = 0, y = 0,
and the other end of the second spring
is at x = c, y = 0.


Figure 7-23 shows a two-dimensional system consisting of a spring extending from
the point (0, 0) to a body at the point (x, y) and a second spring extending from the
point (c, 0) to the same body. The force constant of the first is $k_1$; the force constant
of the second is $k_2$. As in Example 7-6, both springs are assumed to be so
extensible that their lengths when relaxed are negligible. Thus vectors $\mathbf{r}_1$ and $\mathbf{r}_2$ specify the
magnitudes and directions of the extensions of the two springs. (This idealization
simplifies the mathematics without affecting the significant physical concepts.)
Determine the x and y components of the total force exerted on the body.

■ Using  Eq.  (7-57),  you  can  write  the  potential  energy  arising  from  the  first 
spring  as 


$$ U_1 = \frac{k_1 r_1^2}{2} = \frac{k_1}{2}(x^2 + y^2) $$


and  the  potential  energy  arising  from  the  second  spring  as 

$$ U_2 = \frac{k_2 r_2^2}{2} = \frac{k_2}{2}\left[(x - c)^2 + y^2\right] $$

The  total  potential  energy  of  the  system  is 

$$ U = U_1 + U_2 = \frac{k_1}{2}(x^2 + y^2) + \frac{k_2}{2}\left[(x - c)^2 + y^2\right] $$

Knowing U, you can immediately obtain the components $F_x$ and $F_y$ of the total
force acting on the body by using Eqs. (7-59a) and (7-59b) for U = U(x, y). That is,


$$ F_x = -\frac{\partial U}{\partial x} = -\frac{k_1}{2}\,2x - \frac{k_2}{2}\,2(x - c) = -(k_1 + k_2)x + k_2 c $$


and 


$$ F_y = -\frac{\partial U}{\partial y} = -\frac{k_1}{2}\,2y - \frac{k_2}{2}\,2y = -(k_1 + k_2)y $$




It is easy to evaluate the magnitude and direction of the total force from its
components $F_x$ and $F_y$. But it is likely that the reason you want to know the total
force is that you want to set up the equations governing the motion of the body. If
so, then it is the components that you need, since the equations are

$$ F_x = m\frac{d^2x}{dt^2} $$

and

$$ F_y = m\frac{d^2y}{dt^2} $$


You  can  also  determine  the  components  of  the  total  force  by  adding  the  forces 
produced  by  the  two  springs  and  then  taking  components,  or  more  easily  by  taking 
the  components  of  the  two  forces  and  then  adding  them.  Do  this,  and  show  that  the 
results  obtained  are  the  same  as  those  obtained  here. 
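The comparison suggested here can be sketched numerically. The code below is an illustration, not the worked solution: the function names and the sample values are assumptions, and it relies on the example's idealization that each spring has negligible relaxed length, so that each spring pulls the body toward its fixed end with a force −k r.

```python
def force_from_energy(x, y, k1, k2, c):
    """Components from the example: Fx = -(k1 + k2)*x + k2*c, Fy = -(k1 + k2)*y."""
    return -(k1 + k2) * x + k2 * c, -(k1 + k2) * y


def force_by_vector_sum(x, y, k1, k2, c):
    """Take components of each spring force, then add them."""
    f1x, f1y = -k1 * x, -k1 * y            # spring with fixed end at (0, 0)
    f2x, f2y = -k2 * (x - c), -k2 * y      # spring with fixed end at (c, 0)
    return f1x + f2x, f1y + f2y


if __name__ == "__main__":
    # Assumed sample values: force constants in N/m, lengths in m.
    args = dict(x=0.3, y=-0.4, k1=10.0, k2=20.0, c=0.5)
    print(force_from_energy(**args))
    print(force_by_vector_sum(**args))
```

Both routes should print the same pair of components for any choice of position and force constants.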


EXERCISES 

Group  A 

7-1.  Kinetic  energy  and  stopping  distance. 

a.  Find the kinetic energy of each of the following objects:

(1)  A pitched baseball: mass = 0.15 kg, speed = 40 m/s

(2)  Rifle bullet: 0.002 kg, 500 m/s

(3)  Jogger: 70 kg, 3.0 m/s

(4)  Automobile on the highway: 2000 kg, 25 m/s

(5)  Medium-sized cargo ship approaching dock: $3 \times 10^7$ kg, 1.0 m/s

b.  Suppose  each  of  the  objects  listed  in  part  a were 
acted  on  by  a steady  retarding  force  equal  in  magnitude  to 
the  weight  of  a typical  adult  human,  700  N.  What  distance 
would  be  required  to  bring  each  to  rest? 

7-2.  Sprinting. 

a.  Estimate  your  top  sprinting  speed. 

b.  Starting  from  rest,  what  is  the  minimum  amount 
of  work  you  must  do  to  reach  your  top  speed? 

7-3.  Gunsmoke.  A gun of mass M fires a bullet of mass
m.  Therefore  MV  = mv,  where  V and  v are  the  speeds  of 
the  gun  and  bullet,  respectively,  immediately  after  firing. 
Calculate  the  ratio  of  the  kinetic  energy  of  the  gun  to  the 
kinetic  energy  of  the  bullet  at  that  moment. 

7-4.  Energy and the pendulum, I.  As shown in Fig. 7E-4,
a pendulum bob is hanging at one end of a rod of length
2 m. The other end of the rod is mounted on a frictionless
axle. The mass of the rod is negligible.

Fig. 7E-4

a.  The  bob  is  struck  sharply,  which  gives  it  an  initial 
speed  v.  Use  energy  relations  to  find  the  value  of  v for 
which  the  bob  will  almost  reach  the  point  X (directly  above 
the  axle)  before  it  reverses  its  motion. 

b.  Compare  your  result  with  the  result  obtained  by 
direct  application  of  Newton’s  laws  in  Exercise  6-38. 

7-5.  What’s the angle?  A 1.00-m long pendulum is tied
to  the  top  of  a cupboard  (Fig.  7E-5).  The  bob  is  raised  so 
that  the  string  makes  an  angle  of  30°  with  the  vertical.  The 
bob  is  released.  If  the  side  of  the  cupboard  is  0.50  m long, 
what  angle  will  the  string  make  with  the  vertical  when  the 
bob  is  at  its  highest  point  under  the  cupboard?  Assume 
all  frictional  effects  to  be  negligible. 


7-6.  Pushing  a block.  Starting  from  rest,  a 10-kg  block 
is  pushed  along  a horizontal  surface  for  a distance  of  5.0 
m.  The  horizontal  force  used  to  push  the  block  is  30  N, 
and  there  is  a resisting  frictional  force  of  25  N. 

a.  What  is  the  total  work  done  by  the  applied  force? 

b.  How  much  work  is  done  in  overcoming  friction? 

c.  What  is  the  speed  of  the  block  when  it  has  covered 
5.0  m? 




7-7.  Speed check.  Verify the final projectile speed
obtained in Example 7-5 by using the equations developed in
Sec. 3-1.

7-8.  The  effect  of  work  on  speed.  The  net  force  acting  on 
an  automobile  of  mass  2000  kg  does  100,000  J of  work. 
Find  the  final  speed  of  the  automobile,  if  its  initial  speed  is 

a.  Zero 

b.  10.0  m/s  (about  22  mi/h) 

c.  25.0  m/s  (about  56  mi/h) 

d.  40.0  m/s  (about  89  mi/h) 

7-9.  Workforce.  Under the action of a net force whose
direction is along the direction of motion, a particle of
mass $m$ increases its speed from $v_i$ to $v_f$ in covering a
straight-line path of length $s$.

a.  How  much  work  is  done  by  the  net  force? 

b.  If  the  net  force  is  constant,  what  is  its  magnitude? 

c.  Evaluate your results for parts a and b for m = 10
kg, $v_i$ = 2.0 m/s, $v_f$ = 7.0 m/s, and s = 20 m.

7-10.  Working  out. 

a.  How  much  work  is  done  against  gravity  by  a 
weightlifter  who  lifts  100  kg  a vertical  distance  of  2.0  m? 

b.  Compare  the  result  of  part  a with  the  work  done 
against  gravity  by  a 70-kg  person  climbing  four  flights  of 
stairs  (total  vertical  distance  12  m). 

7-11.  Filled to the brim.  A cylindrical water tank 10 m
high has a capacity of 1000 m$^3$. It is to be filled with water
from a lake whose surface is at the same elevation as the
bottom of the tank. (In working this exercise, ignore
frictional losses in the pipes and the kinetic energy of the
water as it leaves the hose.)


Fig.  7E-11 

a.  Suppose that the tank is filled by raising a hose to
the top and pumping in water, as shown in Fig. 7E-11.
How much work is done in filling the tank in this manner?

b.  Suppose  that  the  tank  is  filled  by  connecting  the 
hose  to  an  inlet  at  the  bottom  of  the  tank.  How  much  work 
is  done  in  filling  the  tank  in  this  manner? 

c.  What happens to the extra energy in part a?

7-12.  Chair  lift.  The  sailor  in  the  bosun’s  chair  shown 
in  Fig.  7E-12  has  a mass  m.  He  plans  to  raise  himself  a 
distance  h to  the  upper  pulley  by  pulling  on  the  rope  on 
the  right. 

a.  By how much will his gravitational potential
energy increase?

b.  What force must he exert to lift himself?


c.  What  length  of  rope  must  he  pull  to  get  to  the 
upper  position? 

d.  How  much  work  will  he  do?  Neglect  friction. 

7-13.  Hauling  a sled.  As  shown  in  Fig.  7E-13,  a force 
of  100  N at  30°  to  the  horizontal  is  required  to  draw  a sled 
at  uniform  speed  along  a horizontal  sidewalk. 

a.  How  much  work  is  done  by  the  applied  force  in 
pulling  the  sled  a distance  of  10  m? 

b.  What  is  the  magnitude  of  the  frictional  force  ex- 
erted on  the  sled  by  the  sidewalk? 

c.  How  much  work  does  the  frictional  force  do  when 
the  sled  is  pulled  a distance  of  10  m? 

d.  What  is  the  net  work  done  on  the  sled? 


7-14.  Heavy  work.  You  are  pushing  a heavy  box  across 
the  floor  by  applying  a horizontally  directed  force  of 
magnitude  150  N.  The  force  of  kinetic  contact  friction 
applied  by  the  floor  to  the  box  has  a magnitude  of  140  N. 
The  box  moves  8.00  m. 

a.  How  much  work  is  done  on  the  box  by  the  force 
you  apply?  By  the  force  the  floor  applies?  By  the  net  force 
acting  on  the  box? 

b.  Describe  qualitatively  the  motion  of  the  box. 

7-15.  Down and around.  A small sphere of mass 1.00
kg is attached to one end of a 1.00-m rod of negligible mass.
The  other  end  is  mounted  on  an  axle  having  negligible 
friction.  Initially,  the  sphere  is  directly  above  the  axle, 
as  shown  in  Fig.  7E-15. 

a.  When  the  sphere  falls,  how  fast  will  it  be  moving  as 
it  passes  its  lowest  point? 




b.  What  will  be  the  tension  in  the  rod  at  that  instant? 


7-21.  Across  the  table,  I.  In  the  system  shown  in  Fig. 
7E-21,  the  pulley  and  string  have  negligible  mass,  and  the 
pulley  and  tabletop  are  frictionless.  The  system  is  released 
from  rest. 

a.  Find the speed of bodies A and B when B has
descended a distance D.

b.  Evaluate  your  result  for  mA  = 20  kg,  mB  = 30  kg, 
and  D = 5.0  m. 


Fig.  7E-21 


7-16.  A captive pendulum.  Figure 7E-16 shows a
pendulum consisting of a bob of mass $m$ attached to a cord of
negligible mass having length $l$. The pendulum bob is held
by a horizontal string at A, so that the pendulum cord is
inclined at an angle of 30° with the vertical.

a.  What  is  the  tension  in  the  pendulum  cord? 

b.  What  is  the  tension  in  the  horizontal  string? 

c.  The horizontal string is cut, releasing the
pendulum. What is the speed of the bob as it passes through
the lowest point, B?

d.  What is the tension in the pendulum cord at B?

e.  What  is  the  tension  in  the  cord  when  the  bob 
reaches  the  highest  point,  C? 

7-17.  Net force exerted by two springs.  Verify the
expressions obtained in Example 7-11 for the components of the
net force exerted by the two springs. Do this by finding the
components of each spring force and then adding them.

Group  B 

7-18.  Driving  a nail.  Refer  to  the  discussion  in  the  text 
following  Eq.  (7-4).  A procedure  is  suggested  in  the  sec- 
ond paragraph  after  Eq.  (7-4)  for  evaluating  the  work 
done  by  a puck  when  it  drives  a nail  into  a block  of  wood 
fastened  to  the  edge  of  an  air  table.  Follow  the  procedure 
and  obtain  an  expression  for  the  work  done,  in  terms  of 
the  mass  of  the  puck  and  its  initial  speed. 

7-19.  Lifting  and  lowering.  By  following  the  procedure 
suggested  in  the  small-print  section  after  Eq.  (7-7),  calcu- 
late the  net  work  done  by  the  force  you  apply  in  slowly 
raising  a puck  of  mass  m vertically  upward  from  the  floor 
at  x = 0 to  a position  x"  and  then  lowering  it  vertically 
downward  to  a position  x lower  than  x"  but  higher  than 
x = 0. 

7-20.  Energy  and  the  pendulum,  II.  A pendulum  is  re- 
leased from  rest  with  the  cord  horizontal.  The  length  of 
the  cord  is  0.50  m.  Use  energy  relations  to  determine  the 
speed  of  the  pendulum  bob  at  an  instant  when  the  cord  is 
vertical.  Compare  your  results  with  the  results  obtained  by 
direct  application  of  Newton’s  laws  in  Exercise  6-37. 


7-22.  An unbalanced rod.  A light stick of length $l$ is
pivoted at its center (Fig. 7E-22). Bodies of mass $2m$
and $m$ are attached at its ends. The stick is held
horizontally and released. What will be the speed of either end
when the stick is vertical?




7-23.  Kinetic  energy  in  athletics.  Outstanding  perform- 
ances for  a number  of  athletic  events  are  listed  below. 
Neglecting  air  resistance  and  assuming  that  each  projectile 
is  launched  at  the  optimum  45°  elevation  angle,  calculate 
the  initial  kinetic  energy  for  each  case. 

a.  Shot  put:  mass  = 7.26  kg,  distance  thrown  = 22.0  m 

b.  Discus  throw:  2.00  kg,  70.9  m 

c.  Hammer  throw:  7.26  kg,  79.3  m 

d.  Javelin  throw:  0.800  kg,  94.6  m 

e.  Long  jump:  60.0  kg,  8.90  m 

f.  Baseball  throw:  0.145  kg,  130  m 

7-24.  To  the  top.  A child  at  play  wishes  to  launch  a 
2.0-kg  block  up  an  inclined  plane  with  sufficient  speed  to 
reach  the  top  of  the  incline.  The  plane  is  3.0  m long  and  is 
inclined  at  20°.  The  coefficient  of  kinetic  friction  between 
the  block  and  the  plane  is  0.40.  What  minimum  initial 
kinetic  energy  must  the  child  supply  to  the  block? 

7-25.  Rocket mass and velocity.  The motion of a rocket
moving in free space is governed by Eq. (5-31a),
$m(dV/dt) = v_g(dm/dt)$, where $m$ is the mass of the rocket, $V$
is its velocity, and $v_g$ is the constant velocity of the gas
ejected from the rocket engine, as seen from the rocket, so
that $v_g < 0$.

a.  The infinitesimal change $dV$ in the velocity of the
rocket can be written $dV = v_g\,dm/m$. Integrate both sides
of this equation between initial and final values of $V$ and $m$
to find a relation between the change in the rocket’s
velocity and the change in its mass necessary to produce it.




b.  A rocket  starts  from  rest  with  mass  . What  is  the 
mass of the rocket when it reaches a speed $|V| = 1.5|v_g|$?

7-26.  Stopping  distance. 

a.  The speed of a car is $v$, and the coefficient of static
friction between the tires and the road is $\mu_s$. Use energy
considerations to derive an expression for the shortest
distance in which the car can come to a full stop. Neglect the
reaction time of the driver and the effect of the idling
engine.

b.  Using the expression derived in part a, calculate
the stopping distance for a car traveling at 30 m/s (about
70 mi/h) with $\mu_s = 0.5$.


7-27.  Calculating the work done, I.  One of the forces
acting on a certain particle depends on the particle’s
position in the xy plane. This force $\mathbf{F}_1$, expressed in newtons, is
given by the expression $\mathbf{F}_1 = (2x^2\hat{\mathbf{x}} + 3y^2\hat{\mathbf{y}})(1\ \text{N/m}^2)$, where
x and y are expressed in meters. Calculate the work
$\int_A^D \mathbf{F}_1 \cdot d\mathbf{s}$ done by this force when the particle moves
from point A to point D in Fig. 7E-27.

a.  along  the  straight  line  AD 

b.  along the path ABD, which consists of two straight lines


c.  along  the  path  ACD,  which  consists  of  the  straight 
line  AC  followed  by  the  circular  quadrant  CD 

d.  Is $\mathbf{F}_1$ a conservative force?


7-29.  Up the slope.  As shown in Fig. 7E-29, a body of
mass 1.00 kg is pulled slowly up a 30° slope 1.00 m long by a
force  directed  parallel  to  the  plane.  The  coefficient  of 
kinetic  friction  between  the  body  and  the  plane  is  0.30. 

a.  How  much  work  is  done  to  increase  the  gravita- 
tional  potential  energy? 

b.  How  much  work  is  done  against  friction? 

c.  If  the  body  is  released  and  slides  down  the  incline, 
what  is  its  kinetic  energy  at  the  bottom? 


7-30.  Table  pounding.  A body  of  mass  100  g is  at- 
tached to  a hanging  spring  whose  force  constant  is  10 
N/m.  The  body  is  lifted  until  the  spring  is  in  its  un- 
stretched state.  The  body  is  then  released.  Using  the  law 
of  conservation  of  total  mechanical  energy,  calculate  the 
speed  of  the  body  when  it  strikes  a table  15  cm  below 
the  release  point. 


Fig. 7E-27 (axes in meters; point D at (3, 3))


7-31.  Down  the  track.  In  the  track  shown  in  Fig.  7E-31, 
section  AB  is  a quadrant  of  a circle  of  1.0-m  radius.  A block 
is  released  at  A and  slides  without  friction  until  it  reaches 
point  B. 

a.  How  fast  is  it  moving  at  B,  the  bottom  of  the  quad- 
rant? 

b.  The  horizontal  part  is  not  smooth.  If  the  block 
comes  to  rest  3.0  m from  B,  what  is  the  coefficient  of  kinetic 
friction? 


Fig. 7E-31


7-28.  Down  the  incline.  A crate  of  mass  50  kg  slides 
down  a 30°  incline.  The  crate's  acceleration  is  2.0  m/s2, 
and  the  incline  is  10  m long. 

a.  What  is  the  kinetic  energy  of  the  crate  as  it  reaches 
the  bottom  of  the  incline? 

b.  How  much  work  is  spent  in  overcoming  friction? 

c.  What  is  the  magnitude  of  the  frictional  force  that 
acts  on  the  crate  as  it  slides  down  the  incline? 

d.  What  is  the  coefficient  of  kinetic  friction  between 
the  crate  and  the  incline? 

e.  At  the  base  of  the  incline  there  is  a horizontal  sur- 
face with  the  same  coefficient  of  kinetic  friction.  How  far 
will  the  crate  slide  before  coming  to  rest? 


7-32.  Spring power.  A spring with negligible mass and
a force constant of 600 N/m is kept straight by confining it
within a smooth-walled guiding tube. The tube is
anchored in a horizontal position on a tabletop. The spring
is compressed by 10.0 cm and held there by a latch pin
inserted through the wall of the tube. A 200-g ball of the
same diameter as the spring is placed in contact with the
spring, as shown in Fig. 7E-32. Then the latch pin is
removed, releasing the spring.




a.  What  speed  does  the  ball  acquire? 

b.  If  the  same  procedure  is  followed  with  the  tube 
pointing  vertically  upward,  what  will  be  the  speed  of  the 
ball  as  it  leaves  contact  with  the  spring? 

7-33.  Roller  coaster.  Figure  7E-33  shows  the  plan  for 
a proposed  roller  coaster  track.  Each  car  will  start  from 
rest  at  point  A and  will  roll  with  negligible  friction.  For 
safety, it is important that there be at least some small
positive normal force (that is, a push) exerted by the track on
the car at all points. (Why?) What is the minimum safe
value for the radius of curvature at point B?




Fig.  7E-33 


a.  What  is  the  minimum  speed  required  at  the  top  of 
the  loop? 

b.  Assuming  energy  losses  to  be  negligible,  what  is 
the  minimum  height  h above  the  top  of  the  loop  from 
which  the  car  must  start? 

c.  Experience  with  a particular  toy  suggests  that  the 
actual  minimum  height  that  allows  the  car  to  loop  the  loop 
is 1.3 times the value found in part b. Compare the actual
kinetic  energy  of  the  car  as  it  passes  the  top  of  the  loop 
with  the  kinetic  energy  the  car  would  have  when  released 
from  the  same  point  if  there  were  no  frictional  losses. 

7-36.  Across  the  table,  II.  In  the  system  shown  in  Fig. 
7E-36,  friction  is  negligible  and  the  string  and  pulleys  are 
of  negligible  mass.  The  system  is  released  from  rest. 


Fig.  7E-36 


7-34.  Bob  in  a vertical  circle.  A bob  of  mass  m is  revolving 
in  a vertical  circle  at  the  end  of  a freely  pivoted  rod  of 
length  R and  negligible  mass.  The  bob’s  speed  at  the  lowest 
point  of  the  circle  is  v0,  but  the  speed  varies  with  position 
as  a result  of  the  pull  of  gravity. 

a.  What  is  the  tension  in  the  rod  as  the  bob  passes 
through  the  lowest  point? 

b.  How  fast  is  the  bob  moving  as  it  passes  through  the 
highest  point  of  the  circle? 

c.  What  is  the  tension  in  the  rod  when  the  bob  is  at 
the  highest  point? 

d.  What  is  the  difference  between  the  tensions  found 
in  parts  a and  c?  Express  the  difference  as  a multiple  of 
the  bob’s  weight. 

e.  Interpret your result for part c for the case of $v_0^2$
less than $5gR$.

f.  What  minimum  value  is  implied  for  v0  by  the  fact 
that  the  bob  is  traversing  complete  circles? 

7-35.  Looping  the  loop.  In  the  toy  illustrated  in  Fig. 
7E-35  a small  car  loops  the  loop. 


a.  Find  a relationship  between  the  vertical  drop  of 
body  B and  the  horizontal  displacement  of  body  A. 

b.  Find  a relationship  between  the  speeds  vA  and  vB 
of  bodies  A and  B. 

c.  After  body  B has  descended  a distance  D,  what  is 
the  speed  vA  of  body  A? 

d.  Evaluate  the  result  of  part  c for  mA  = mB  and  D = 

2.0  m. 

7-37.  Block, spring, and kinetic friction.  As shown in
Fig. 7E-37, a block of mass $m$ is resting on a horizontal
surface. The coefficients of static and kinetic friction between
the block and the surface are $\mu_s$ and $\mu_k$, respectively. The
block is attached to a spring of negligible mass having
spring constant $k$. Initially the block is at rest, and the
spring is relaxed. Then the block is struck sharply, so that
it begins moving to the right with speed $v_0$.


m 

Fig.  7E-37 

a.  How  far  does  the  spring  extend  before  the  right- 
ward  motion  is  arrested? 

b.  Find a criterion that can be used to determine
whether the block begins to move back to the left or simply
remains at the point of maximum extension found in
part a.

c.  Evaluate the expressions you obtained in parts a
and b for the case m = 10 kg, k = 100 N/m, $\mu_s = 0.30$,
$\mu_k = 0.15$, and $v_0 = 1.0$ m/s.

7-38.  Launch speed versus elevation angle.  A spring gun
uses a spring of negligible mass having spring constant $k$
to launch a ball of mass $m$. The spring is initially
compressed through a distance $s$. Use energy considerations to
show that the launch speed $v$ depends on the launch angle
$\theta$. Specifically, show that $v^2 = ks^2/m - 2gs\sin\theta$.




Group  C 

7-39.  A loss  of  support.  A body  of  mass  m is  attached  to 
the  hook  of  a stationary  spring  scale,  as  shown  in  Fig. 
7E-39.  The  body  is  supported  so  that  the  reading  of  the 
scale  is  zero.  The  support  is  then  removed.  What  is  the 
maximum  momentary  reading  of  the  spring  balance?  (As- 
sume the  damping  is  very  slight.) 


from  point  O to  point  C in  Fig.  7E-41  along 

a.  The  path  OAC,  which  consists  of  two  straight  lines 

b.  The  path  OBC,  which  consists  of  two  straight  lines 

c.  The  straight  line  OC 

d.  Is  F2  a conservative  force?  Explain  your  answer. 
Compare  it  with  the  results  obtained  in  Exercise  7-27. 


Fig. 7E-39


7-42.  Oscillations of a leaky cart.  A small lab cart is able
to roll without friction on a linear horizontal track. The
cart is connected to an anchored spring of negligible
mass having spring constant $k$, as shown in Fig. 7E-42.
The cart is designed like a railroad hopper car. It is
initially full of sand and has total mass $M_i$. The spring is
extended by an amount $A_i$ and is then released at t = 0. At
the same time, the outlet on the bottom of the cart is
opened, so that sand leaks out of the cart at a constant rate
$dM/dt$ whose value is negative. The leak is very slow and
the cart is heavy, so that the fractional decrease of the cart
mass in a time $2\pi\sqrt{M/k}$ is very small. Under these
conditions, the motion of the cart is approximately harmonic,
but with gradually changing frequency and amplitude.


7-40.  A sliding launch.  Starting from rest at the top, a
body slides down a frictionless hemispherical dome, as
shown in Fig. 7E-40. Show that the body leaves the dome
surface when $\theta = \cos^{-1}\tfrac{2}{3}$.


Fig.  7E-42 


Fig.  7E-40 

7-41.  Calculating  the  work  done,  II.  One  of  the  forces 
acting  on  a certain  particle  depends  on  the  particle’s  posi- 
tion in  the  xy  plane.  This  force  F2,  expressed  in  newtons,  is 
given by the expression $\mathbf{F}_2 = (xy\,\hat{\mathbf{x}} + xy\,\hat{\mathbf{y}})(1\ \text{N/m}^2)$, where
x and y are expressed in meters. Calculate the work
$\int_O^C \mathbf{F}_2 \cdot d\mathbf{s}$ done by this force when the particle moves


a. Find the angular frequency of oscillation $\omega(t)$ at
time t, when $M(t) = M_i + t\,(dM/dt)$.

b.  Find  the  kinetic  energy  carried  out  of  the  system 
by  the  sand  leaking  out  in  any  one  cycle  of  oscillation. 
Express your result in terms of dM/dt, $\omega(t)$, and the
time-dependent amplitude A(t).

c.  Use  energy  considerations  to  determine  how  the 
amplitude  of  the  oscillation  varies  with  time.  That  is,  find 
an  expression  for  A(t). 

d.  The  leak  stops  when  the  cart  is  empty.  Its  final 
(unloaded)  mass  is  Mf.  With  what  frequency,  amplitude, 
and  total  energy  does  the  empty  cart  oscillate? 

e. Compare initial and final values of $\omega$, A, and total
energy E for the case $M_f = 0.10M_i$.




7-43.  Body,  track,  spring,  and  pivots.  One  end  of  a spring 
is attached to a pivot O at the end of a fixed vertical support,
as shown in Fig. 7E-43. The spring has spring constant k
and relaxed length $l_0$. A body is attached by a pivot to the
free end of the spring. The body is constrained by a yoke
to a horizontal, frictionless circular track of radius r. The
track is fixed, with its center a distance h from O. In completing
parts a through d, take the relaxed configuration
$l = l_0$ to define the reference position for potential energy.




(a)  Perspective  view 


Fig.  7E-43 


a.  What  is  the  potential  energy  of  the  system  when 
the  body  is  at  point  A ? Express  your  result  in  terms  of  k,  h, 
r,  and  l0. 

b.  What  is  the  potential  energy  when  the  body  is  at 
point  B ? 

c.  How  much  work  is  done  by  the  spring  if  the  body 
moves  from  point  A to  point  B ? 

d.  What  is  the  potential  energy  of  the  system  when 
the  body  is  located  at  the  point  P? 

e. Suppose the body is started from point A in the
counterclockwise sense with initial speed $v_A$. Describe the
possible subsequent motion in the following cases: (1)
$h + r < l_0$; (2) $|h - r| < l_0 < h + r$; (3) $|h - r| > l_0$; (4) h = 0.

f.  Which  of  your  answers  in  parts  a through  e would 
have  to  be  changed  if  a different  reference  position  were 
chosen  for  potential  energy?  Explain  your  answer. 


7-44.  Two  bodies,  two  tracks,  and  a spring.  Two  bodies 
of equal mass m are constrained by yokes to move on identical
horizontal frictionless circular tracks of radius r. As
shown in Fig. 7E-44, the centers of the tracks are separated
by a distance h. The bodies are linked by pivots to a
spring  of  spring  constant  k and  relaxed  length  l0.  The  two 
bodies  are  initially  at  maximum  separation  as  shown;  each 
body  is  given  a small  initial  speed.  The  body  on  the  left  is 
started  in  the  clockwise  sense,  and  that  on  the  right  in  the 
counterclockwise  sense. 

a.  Under  what  condition  will  the  bodies  reach  the 
points C and C'?

b.  Assuming  the  condition  in  part  a is  satisfied,  find 
the  magnitude  and  direction  of  the  force  exerted  by  each 
track  on  its  body  as  the  bodies  pass  through  C and  C' . 

c. For given values of r and $l_0$, find the value of h for
which  the  force  found  in  part  b is  zero. 

7-45.  Frictionless  but  restrained.  As  shown  in  Fig.  7E-45, 
a smooth  rod  is  mounted  horizontally  just  above  a table 
top.  A 10-kg  collar,  which  is  able  to  slide  on  the  rod  with 
negligible  friction,  is  fastened  to  a spring  whose  other  end 
is  attached  to  a pivot  at  O.  The  spring  has  negligible  mass, 
a relaxed  length  of  10  cm,  and  a spring  constant  of  500 
N/m.  The  collar  is  released  from  rest  at  point  5. 


Fig. 7E-45 (labeled dimensions: 10 cm and 15 cm)


a. What is the speed of the collar as it passes A, the
closest  point  to  O? 

b.  What  is  the  speed  of  the  collar  as  it  passes  point  B ? 


Numerical 

7-46.  Integration:  A comparison  of  methods.  For  each  of 
the  integrals  below,  use  the  numerical  method  of  Example 
7-2  to  obtain  a result  to  two-place  accuracy.  Then  evaluate 
the  integral  using  the  appropriate  equation(s)  from  Eqs. 
(7-20)  to  (7-27).  Compare  the  two  results  for  each  integral. 


x2  dx 


b.  | y?  dx 


c. 


$\sin\theta\,d\theta$


d. 


e. 


f. 


$x(1 - x)\,dx$


dx 


7-47.  An  important  integral. 

a.  Use  the  numerical  method  of  Example  7-2  to 
evaluate  to  three  significant  figures  the  integral 


dx 




for $x_s$ = 0.5, 1, 1.5, and 2. This integral is called the gaussian
integral or normal probability integral, and it plays
a very  important  role  in  the  error  analysis  of  experimental 
data.  It  can  be  evaluated  only  by  numerical  methods;  there 
is  no  analytical  expression  for  the  value  of  the  integral. 

b.  Compare  the  values  you  obtained  in  part  a with 
values that can be found in almost any table of mathematical
data.

7-48. Integrating over a spectrum. The surface of the
sun emits radiation over a wide range of wavelengths. The
power radiated by 1 m² of the surface is different at different
wavelengths $\lambda$. It is specified by the emitted power
per unit wavelength, $R(\lambda)$, given by the so-called Planck
function

$$R(\lambda) = \frac{3.74 \times 10^{-16}}{\lambda^5\left(e^{2.52 \times 10^{-6}/\lambda} - 1\right)}\ \text{W/m}^3$$

where the wavelength $\lambda$ is expressed in meters. The
wavelength $\lambda = 3.50 \times 10^{-7}$ m represents approximately
the extreme blue end of the visible spectrum, and the
wavelength $\lambda = 7.00 \times 10^{-7}$ m represents approximately
the extreme red end. The integral

$$I = \int_{3.50 \times 10^{-7}}^{7.00 \times 10^{-7}} R(\lambda)\,d\lambda$$

gives the power per unit area radiated by the surface of
the sun in the visible range of wavelengths. Use the numerical
method of Example 7-2 to evaluate the integral to
an accuracy of three significant figures. The integral can
be evaluated only by numerical methods. (Note: You may
find it convenient to factor out a numerical constant before
integrating.)


7-49. Trapezoidal integration procedure. There is another
numerical integration procedure that converges
rapidly enough to be useful and is simple enough to run
on any calculating device capable of running the program
used in Example 7-2. It amounts to calculating the total
area under the set of trapezoids in Fig. 7E-49. The sides of
the trapezoids are perpendicular to the x axis and intersect
it at $x_i$, $x_i + \Delta x$, $x_i + 2\Delta x$, . . . , $x_f - \Delta x$, $x_f$. The tops
of the trapezoids are straight lines joining the intersections
of the sides and the curve F(x).


Fig.  7E-49 


a. Show that the total area will be
$\tfrac{1}{2}F(x_i)\,\Delta x + F(x_i + \Delta x)\,\Delta x + F(x_i + 2\Delta x)\,\Delta x + \cdots + F(x_f - \Delta x)\,\Delta x + \tfrac{1}{2}F(x_f)\,\Delta x$.

b. Write a program to carry out this trapezoidal integration
procedure. Use it to evaluate $\int x^2\,dx$ to three-decimal-place
accuracy, and compare the result with that
obtained in Example 7-2.




Applications of Energy Relations


8-1 POWER

The principal purpose of this chapter is to present a number of examples
demonstrating  the  application  of  the  theory  of  energy  to  interesting  and 
important  physical  systems.  A secondary  purpose  is  to  introduce  several 
points  of  the  theory  that  were  not  raised  in  Chap.  7 because  they  were  not 
essential  to  the  development  of  the  main  theme.  One  of  these  is  the  concept 
of  power. 

In  many  circumstances  it  is  important  to  distinguish  between  work 
done  rapidly  and  work  done  slowly.  For  an  automobile  engine  to  bring  a 
car  from  rest  up  to  a certain  speed  in  a short  time,  the  engine  must  be  able 
in that time to do the work required to give the automobile the corresponding
amount of kinetic energy. Such an engine is one of high power.
To  be  more  specific,  power  measures  the  rate  of  doing  work.  If  work  is  being 
done  at  a constant  rate,  the  power  P is,  by  definition, 

$$P = \frac{W}{t} \quad \text{for constant } P \tag{8-1a}$$


where  W is  the  total  amount  of  work  done  in  the  total  time  t.  If  work  is 
being done at a varying rate, the power P expended at any instant is defined
as

$$P = \frac{dW}{dt} \tag{8-1b}$$


The  unit  of  power  in  the  SI  system  is  the  watt  (W),  named  after  James  Watt 
(1736-1819),  the  inventor  of  the  most  important  form  of  the  steam  engine. 
If  work  is  being  done  at  the  rate  of  1 joule  per  second  (J/s),  the  power  is 
1 W: 


$$1\ \text{W} = 1\ \text{J/s} \tag{8-2}$$


A commonly used, non-SI unit of power is the horsepower (hp). The conversion
factor is

$$1\ \text{hp} = 746\ \text{W} \tag{8-3}$$


The  unit  was  introduced  by  Watt  so  that  he  could  specify  the  power  of  his 
engines  in  terms  that  would  be  understood  by  his  contemporaries.  It  was 
supposed  to  represent  the  rate  at  which  a horse  can  do  work  (for  prolonged 
periods),  but  in  fact  it  overestimates  the  ability  of  an  average  horse  by  about 
50  percent.  A human  can  do  work  at  an  appreciable  fraction  of  this  rate 
(but  not  for  prolonged  periods),  as  you  will  see  in  Example  8-1. 
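The difference between total work divided by total time and the instantaneous rate dW/dt can be illustrated numerically. The Python sketch below is only an illustration under stated assumptions: the car's mass and acceleration are hypothetical values that do not come from the text, friction is ignored, and the work done is taken to be the kinetic energy gained from rest.

```python
WATTS_PER_HP = 746.0  # conversion factor quoted in Eq. (8-3)

def work_done(t, mass=1200.0, accel=3.0):
    """Work (J) done on a car of assumed mass (kg) that accelerates
    uniformly (m/s^2) from rest for t seconds: W = (1/2) m (a t)^2."""
    return 0.5 * mass * (accel * t) ** 2

def mean_power(t_total):
    """Total work divided by total time, W/t."""
    return work_done(t_total) / t_total

def instantaneous_power(t, h=1e-4):
    """Central-difference estimate of dW/dt at time t."""
    return (work_done(t + h) - work_done(t - h)) / (2.0 * h)

if __name__ == "__main__":
    t = 5.0
    for label, p in (("mean", mean_power(t)),
                     ("instantaneous", instantaneous_power(t))):
        print(f"{label} power at t = {t} s: {p:.0f} W = {p / WATTS_PER_HP:.1f} hp")
```

For uniform acceleration from rest, the instantaneous power at the end of the interval is twice the mean power over the interval, so W/t alone would understate what the engine must deliver at the final instant.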


EXAMPLE  8-1 


A bicyclist  is  pedaling  up  a hill  at  a speed  of  3.0  m/s.  The  slope  of  the  road  is  4.0°, 
the  mass  of  the  bicycle  is  15  kg,  and  the  mass  of  the  bicyclist  is  65  kg.  Estimate  the 
power he is expending over and above the rather low power he would use to maintain
this speed against friction if the road were level.

■ Work can result in an increase in potential energy, in kinetic energy, or in
thermal energy as a result of frictional losses. The various frictional losses (in the
bearings, drive chain, flexing of the tire against the road, and air resistance) are the
same as those encountered in riding at the same speed on a level road. Since the bicycle’s
speed is constant during the trip up the hill, there is no increase in kinetic energy.
Thus all the additional work done by the bicyclist goes to increase the potential
energy of the system comprising the bicycle plus the bicyclist plus the earth. The
additional work per unit time dW/dt equals the increase in potential energy per unit
time dU/dt. If you measure the elevation of the bicyclist by the vertical coordinate y,
as in Fig. 8-1, U can be written


U = mgy 


Here  m is  the  combined  mass  of  the  bicycle  and  bicyclist,  and  the  reference  position 
in  the  definition  of  the  potential  energy  U is  whatever  elevation  is  used  to  define 
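y = 0. So you have

$$\frac{dU}{dt} = mg\,\frac{dy}{dt} = mg\,v_y$$

where $v_y$ is the vertical component of the velocity of the bicyclist. Thus the additional
power required to climb the hill is

$$P = \frac{dW}{dt} = mg\,v_y$$

Fig. 8-1 A bicyclist pedaling up a hill.

As Fig. 8-1 shows, $v_y = v \sin\theta$, where $\theta$ is the slope angle of the road.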


The numerical values specified lead to the result

$$P = (15\ \text{kg} + 65\ \text{kg}) \times 9.8\ \text{m/s}^2 \times 3.0\ \text{m/s} \times \sin 4.0^\circ = 1.6 \times 10^2\ \text{W} = 0.22\ \text{hp}$$

Although the power is constant in this example, so that it could be computed
from Eq. (8-1a), it is really more convenient to use Eq. (8-1b) in the computation.
Then you do not have to bother specifying the conditions at which U, W, and t are
zero. If the power varies, you must use Eq. (8-1b).
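The arithmetic of Example 8-1 is easy to check by machine. The short Python sketch below evaluates P = mg v sin θ with the values given in the example; the function and variable names are illustrative choices, not notation from the text.

```python
import math

WATTS_PER_HP = 746.0  # Eq. (8-3)

def climbing_power(total_mass, speed, slope_deg, g=9.8):
    """Extra power (W) needed to climb at constant speed: P = m g v sin(theta)."""
    v_y = speed * math.sin(math.radians(slope_deg))  # vertical velocity component
    return total_mass * g * v_y

power = climbing_power(total_mass=15.0 + 65.0, speed=3.0, slope_deg=4.0)
print(f"P = {power:.0f} W = {power / WATTS_PER_HP:.2f} hp")
```

The result is about 164 W, or 0.22 hp. Because the power is proportional to both the speed and the sine of the slope angle, halving the speed on a hill twice as steep (for small angles) costs roughly the same power.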


EXAMPLE  8-2 

In  Example  4-9  you  saw  how  to  evaluate  the  force  that  must  be  applied  to  pull  a 
very  long  conveyor  belt  at  a constant  speed  of  2.00  m/s  while  crushed  ore  from  a 
mine  drops  on  the  belt,  thereby  adding  mass  to  the  belt  at  a rate  dm/dt  = 300  kg/s. 
The  situation  is  depicted  again  in  Fig.  8-2.  Evaluate  the  power  P expended  in 
moving  the  conveyor  belt. 

■ Define an x axis whose positive direction is in the direction of motion of the
belt, and use signed scalars. You can then write the work dW done by the force F applied
to move the belt during a displacement dx of the belt as

$$dW = F\,dx$$


The power P is

$$P = \frac{dW}{dt} = F\,\frac{dx}{dt}$$

or, in terms of the velocity v of the belt,

$$P = Fv \tag{8-4}$$

That  is,  the  power  expended  equals  the  force  applied  times  the  velocity  of  the  object 
to  which  it  is  applied. 

According to Eq. (4-15), the force applied is

$$F = v\,\frac{dm}{dt}$$

So you have

$$P = \left(v\,\frac{dm}{dt}\right)v = v^2\,\frac{dm}{dt}$$

Since  v and  dm/dt  are  constants,  F and  P are  also  constants. 
The  numerical  value  of  P is 


$$P = (2.00\ \text{m/s})^2 \times 300\ \text{kg/s} = 1.20 \times 10^3\ \text{W} = 1.20\ \text{kW}$$
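As a quick check of Eq. (8-4) for the conveyor belt, the Python sketch below computes the force and the power for a few belt speeds. It assumes that the force quoted from Eq. (4-15) has the form F = v dm/dt, which is consistent with the result P = v² dm/dt used in this example. Only the 2.00 m/s case is taken from the text; the other speeds are included for comparison.

```python
def belt_force(speed, mass_rate):
    """Force (N) needed to keep the belt at constant speed: F = v dm/dt."""
    return speed * mass_rate

def belt_power(speed, mass_rate):
    """Power (W) expended by that force: P = F v, Eq. (8-4)."""
    return belt_force(speed, mass_rate) * speed

for v in (1.00, 2.00, 4.00):  # m/s; only 2.00 m/s is the value used in the example
    print(f"v = {v:.2f} m/s -> F = {belt_force(v, 300.0):.0f} N, "
          f"P = {belt_power(v, 300.0):.0f} W")
```

Because P grows as v², doubling the belt speed at a fixed loading rate quadruples the power required.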


Since v is a constant, the expression for the power P can be written as

$$P = \frac{d(mv^2)}{dt}$$

(Evaluate the derivative for constant v, and you will see immediately that it gives
$P = v^2\,dm/dt$, as before.) Introducing a factor of ½ in the quantity being differentiated,
and a compensating factor of 2 in front of the derivative, you can also write
the new expression for P as

$$P = 2\,\frac{d(mv^2/2)}{dt}$$

But $mv^2/2$ is just the kinetic energy K of the moving belt and the moving ore lying
on it. So

$$P = 2\,\frac{dK}{dt}$$

where dK/dt is the rate at which this energy increases as the amount of ore on the
belt increases. It may surprise you that P is twice as large as dK/dt, because this
means that only half the expended power goes into increasing the kinetic energy of
the system comprising the belt plus the ore lying on it. Where does the other
half go?

Inspection of Fig. 8-2 will verify that the “missing” power certainly does not go
into increasing the potential energy of the system. In fact, the other half of the expended
power is lost to frictional effects. These occur in the inelastic collisions of
the rocks striking the moving belt. When the rocks first strike the belt, they skid and
bump until they have come up to speed. Are you surprised to find that this power
loss is calculable?


8-2 MACHINES

In this section we will apply the energy relations in a set of examples which
analyze a number of simple mechanical devices or, as it is said, machines.
The most elementary of these analyses will serve mainly to exemplify the
energy approach. In others, the use of energy relations will make the analysis
easier than it would be if Newton’s second law were applied directly.
Perhaps more important is that these relations will provide a quite different
point of view which can add substantially to your depth of understanding
of mechanical systems.


EXAMPLE 8-3

Use  the  mechanical-energy  conservation  law  to  find  the  force  required  to  raise  the 
weight  with  the  lever  shown  in  Fig.  8-3. 

■ You begin by imagining that the free end of the lever is slowly raised through a
small angle $\Delta\phi$, which you define to be positive. As a result, the weight is slowly
given a vertical displacement that you can express by the signed scalar $\Delta y_1 = r_1\,\Delta\phi$.
The quantity $r_1$ is the distance along the lever from the fixed axis, called the fulcrum
(from the Latin word meaning a support), to the point from which the weight
is suspended.


advantage  assumes  that  the  angle  through  which  the 
lever  is  rotated  about  its  fulcrum  is  small  compared  to 
1 rad.  But  for  the  sake  of  clarity,  the  figure  exaggerates 
the  angle  somewhat. 



In performing this task, you must give the end of the lever a vertical displacement
$\Delta y_2 = r_2\,\Delta\phi$, where $r_2$ is the distance from the fulcrum to the point of application
of the vertical force F you exert on the lever.

Since  everything  is  moving  slowly,  no  appreciable  kinetic  energy  is  involved.  If 
you  assume  friction  in  the  fulcrum  is  negligible,  it  follows  that  the  work  you  do  by 
applying  an  external  force  to  the  weight  goes  into  increasing  the  potential  energy  of 
the  weight-plus-earth  system.  The  work  done  is 

$$W = F\,\Delta y_2 = F r_2\,\Delta\phi$$

Write the gravitational potential energy of the system in terms of the vertical coordinate
$y_1$ of the weight, measured from the reference position, as

$$U = mg\,y_1$$

Then the increase in potential energy is

$$\Delta U = mg\,\Delta y_1 = mg\,r_1\,\Delta\phi$$

Equating W to $\Delta U$, you have

$$F r_2\,\Delta\phi = mg\,r_1\,\Delta\phi$$

Thus you find

$$F = mg\,\frac{r_1}{r_2}$$


The quantity mg/F is the ratio of the weight lifted to the force applied. It is
called the mechanical advantage of the machine. For the lever, the mechanical advantage
is

$$\frac{mg}{F} = \frac{r_2}{r_1} \tag{8-5}$$

Since $r_2$ is greater than $r_1$, the weight mg lifted by the lever is greater than the force F
applied  to  the  lever.  To  put  it  the  other  way,  a lever  allows  you  to  lift  a given  weight 
by  applying  a smaller  force  than  you  would  apply  if  lifting  it  unassisted.  Of  course, 
you  do  not  get  something  for  nothing.  The  price  paid  for  reducing  the  necessary 
force  is  an  increase  in  the  distance  that  the  far  end  of  the  lever  must  be  moved  by 
this  force. 
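The energy bookkeeping behind Eq. (8-5) can be spelled out with numbers. In the Python sketch below, the load, the lever arms, and the rotation angle are illustrative values chosen for this check, not data from the text. It confirms that the work put in at the far end equals the potential energy gained by the weight.

```python
def lever_force(weight, r1, r2):
    """Force applied at distance r2 from the fulcrum that slowly lifts a
    weight hung at distance r1: F = weight * r1 / r2."""
    return weight * r1 / r2

# Illustrative numbers (not from the text): a 50-kg load hung 0.20 m from
# the fulcrum, lifted by a force applied 1.00 m from the fulcrum.
weight = 50.0 * 9.8   # N
r1, r2 = 0.20, 1.00   # m
d_phi = 0.01          # rad, a small rotation of the lever

force = lever_force(weight, r1, r2)
work_in = force * r2 * d_phi      # W = F * (displacement of the far end)
gain_in_U = weight * r1 * d_phi   # increase in potential energy of the weight
print(f"F = {force:.0f} N, mechanical advantage = {weight / force:.1f}")
print(f"work in = {work_in:.3f} J, potential energy gained = {gain_in_U:.3f} J")
```

The applied force is one-fifth of the weight, but the far end of the lever moves five times as far as the weight does, so the two energies agree.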


EXAMPLE  8-4 

Use  energy  conservation  to  find  the  mechanical  advantage  of  the  block  and  tackle, 
shown  in  Fig.  8-4a  and  b. 

■ Suppose that you do an amount of work W on the weight-plus-earth system by
pulling on the free end of the rope so that the weight is raised, through a vertical
displacement $\Delta y$, slowly enough that its kinetic energy can be neglected. If friction
in the pulley wheels is negligible, W will be equal to the increase $\Delta U = mg\,\Delta y$ in the
system’s potential energy. Thus

$$W = mg\,\Delta y$$

In raising the weight, each of the n segments of the rope which support it must
be shortened by an amount equal in magnitude to $\Delta y$. The total length of rope you
must pull through the pulleys at the free end is therefore $n\,\Delta y$. This is the magnitude
of the displacement of the end of the rope to which you apply a force. Since
the force, of magnitude F, acts in the direction of this displacement, the work it does
is

$$W = Fn\,\Delta y$$

Equating the two expressions for W yields

$$mg\,\Delta y = Fn\,\Delta y$$


or

$$\frac{mg}{F} = n \tag{8-6}$$

Fig. 8-4 (a) A block and tackle with two pulleys. (b) A schematic drawing of a block and tackle.
Since the number of pulleys is arbitrary, only the first few and the last few are shown. (In practice,
friction in the pulley bearings limits their total number to a maximum of about 10.)


Thus  the  mechanical  advantage  of  the  block  and  tackle,  neglecting  friction, 
equals  the  number  n of  rope  segments  supporting  the  weight.  In  a real  block  and 
tackle the friction in the pulley bearings can be significant, and this reduces the mechanical
advantage to an appreciably lower value.


EXAMPLE  8-5 

A system often used in place of the block and tackle is the differential pulley. It is
shown in Fig. 8-5. The two pulleys of radii $r_1$ and $r_2$ are joined and rotate together.
Their grooves have teeth which engage the links of a continuous loop of chain so
that it passes over them without slipping. As you pull on the chain at point P, the
segment labeled 2 is taken up more than the segment labeled 1 is let out because $r_2$ is
larger than $r_1$. Thus the weight supported by the chain is raised. The convenience
of this machine lies in the fact that there are only two pulleys and one axle, plus the
auxiliary pulley beneath. Yet by properly choosing the radii $r_1$ and $r_2$, any desired
mechanical advantage can be obtained. Find the strength F of the force you must
apply to raise slowly an object of mass m = 1000 kg if the radii have the values $r_1$ =
9.50 cm and $r_2$ = 10.00 cm. Neglect friction.

■ Consider what happens when you slowly pull the chain at the point P and the
differential pulley goes around once. The segment of chain labeled 2 in the figure is
taken up by an amount $2\pi r_2$. However, the segment labeled 1 is let out by an
amount $2\pi r_1$. So the loop supporting the object is shortened by the amount
$2\pi(r_2 - r_1)$. The object is raised by half this amount. (Why?) Thus its upward displacement
is

$$\Delta y = \pi(r_2 - r_1)$$


The increase in potential energy of the weight-plus-earth system is

$$\Delta U = mg\,\Delta y = mg\,\pi(r_2 - r_1)$$

This energy is supplied by the work you do on the system in pulling at point P with
a force of magnitude F. The force is applied through a displacement of magnitude
$2\pi r_2$ because the chain comes off the large pulley. Since the force acts in the direction
of this displacement, the work done is

$$W = F\,2\pi r_2$$

Setting $\Delta U = W$, you get

$$mg\,\pi(r_2 - r_1) = F\,2\pi r_2$$

or

$$\frac{mg}{F} = \frac{2r_2}{r_2 - r_1}$$

So you have

$$\frac{mg}{F} = \frac{2}{1 - r_1/r_2} \tag{8-7}$$


Fig.  8-5  A differential  pulley.  Its 
operation  is  explained  in  Example  8-5. 


This is the mechanical advantage of a differential pulley. If $r_1$ is nearly equal to $r_2$,
the denominator of the fraction on the right side of Eq. (8-7) will be very small and
the mechanical advantage will be very large.

To find the force required to lift the object, you write Eq. (8-7) as

$$F = \frac{mg}{2}\left(1 - \frac{r_1}{r_2}\right)$$

and then substitute the values given to obtain

$$F = \frac{1000\ \text{kg} \times 9.80\ \text{m/s}^2}{2}\left(1 - \frac{0.0950\ \text{m}}{0.1000\ \text{m}}\right) = 245\ \text{N}$$


It  takes  a force  of  only  245  N to  lift  an  object  whose  weight  is  9800  N because  this 
differential  pulley  has  a mechanical  advantage  of  40. 
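The one-revolution bookkeeping of Example 8-5 can be followed step by step in a few lines of Python. This is a sketch for an ideal, frictionless pulley; the function name and the tuple it returns are illustrative choices.

```python
import math

def differential_pulley(mass, r1, r2, g=9.80):
    """One full turn of an ideal (frictionless) differential pulley.
    Returns (force, rise, pull, mechanical advantage)."""
    rise = math.pi * (r2 - r1)      # load rises by half the loop shortening
    pull = 2.0 * math.pi * r2       # chain pulled off the larger pulley
    force = mass * g * rise / pull  # from F * pull = m g * rise
    return force, rise, pull, mass * g / force

force, rise, pull, advantage = differential_pulley(1000.0, 0.0950, 0.1000)
print(f"F = {force:.0f} N, mechanical advantage = {advantage:.0f}")
print(f"pull {pull * 100:.1f} cm of chain to raise the load {rise * 100:.2f} cm")
```

The price of the large mechanical advantage shows up in the second line of output: about 63 cm of chain must be pulled to raise the load by less than 2 cm.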


In  Examples  8-3  through  8-5  we  have  used  energy  relations  to  solve 
what  are  essentially  static  problems.  That  is,  we  were  not  concerned  with 
acceleration because the systems we studied are generally used to move objects
slowly and steadily, with an applied force just large enough to do the
necessary useful work. Consequently, the kinetic energy was negligible in
every case. In Example 8-6 we will again consider Atwood’s machine, a
system in which acceleration is important and the kinetic energy is significant.


EXAMPLE  8-6 

In the Atwood machine shown in Fig. 8-6, the bodies at the ends of the cord of negligible
mass have masses $m_1$ = 2.10 kg and $m_2$ = 2.00 kg, and the mass of the
friction-free pulley wheel is assumed to be negligible. The bodies are released with
zero velocity at the same height. At a certain later instant, when they are separated
by a vertical distance $2|\Delta y|$ = 1.5 m, their accelerations have given them certain
velocities. Find these velocities.

■ You take as an isolated, friction-free system the two bodies and the earth. The
cord and pulley act as workless constraints that always make one body move up at
the same speed as the other body moves down. Since the bodies accelerate, they
gain speed; therefore the system gains kinetic energy. This gain is at the expense of
the system’s gravitational potential energy. Choose the initial height of the bodies as
the origin of an upward-directed y axis, and also as the reference level for gravitational
potential energy. Then the potential energy of the system at the initial instant
has the convenient value U = 0. Since both bodies start from rest, the system has
kinetic energy K = 0 initially. Thus the total mechanical energy E of the system has
the initial value E = K + U = 0. It will maintain this value.

At the instant depicted in the figure, the change in the potential energy of the
system from its initial value of zero is

$$\Delta U = m_2 g\,\Delta y_2 + m_1 g\,\Delta y_1 = m_2 g|\Delta y| - m_1 g|\Delta y|$$

or

$$\Delta U = -(m_1 - m_2)\,g|\Delta y|$$

Fig. 8-6 Atwood’s machine.

Note that the displacement of the body with the smaller mass $m_2$ has caused the
system to gain potential energy, but the displacement of the body with the larger
mass $m_1$ has caused the system to lose a larger amount of potential energy. So there
is a net loss of potential energy. This loss of potential energy must show up as a gain
in kinetic energy, in order that the total mechanical energy remain constant. That
is, you must have

$$\Delta K + \Delta U = 0$$

where $\Delta K$ is the change in the kinetic energy of the system from its initial value of
zero.

Now the change in the kinetic energy of the initially motionless system is

$$\Delta K = \frac{m_1 v_1^2}{2} + \frac{m_2 v_2^2}{2}$$

where $v_1$ and $v_2$ are the velocities of the two bodies at the instant being considered.
Writing the common value of their speeds as |v|, you have

$$\Delta K = \frac{m_1 |v|^2}{2} + \frac{m_2 |v|^2}{2}$$

Therefore the mechanical-energy conservation relation demands that

$$\frac{m_1 |v|^2}{2} + \frac{m_2 |v|^2}{2} - (m_1 - m_2)\,g|\Delta y| = 0$$

This simplifies to

$$\frac{(m_1 + m_2)\,|v|^2}{2} - (m_1 - m_2)\,g|\Delta y| = 0$$

or

$$|v|^2 = 2g|\Delta y|\,\frac{m_1 - m_2}{m_1 + m_2}$$

Thus you obtain

$$|v| = \left[2g|\Delta y|\,\frac{m_1 - m_2}{m_1 + m_2}\right]^{1/2} \tag{8-8}$$


The body with the larger mass $m_1$ is moving downward and so has velocity $v_1 = -|v|$.
The body of smaller mass $m_2$ is moving upward with the velocity $v_2 = |v|$. The
speed of the downward-moving body is the speed it would acquire in free fall
through the same distance, $(2g|\Delta y|)^{1/2}$, multiplied by the quantity
$[(m_1 - m_2)/(m_1 + m_2)]^{1/2}$, which is always less than 1.
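Equation (8-8) can be cross-checked against a force-based calculation. The Python sketch below evaluates the energy result and compares it with the speed obtained from the standard Atwood acceleration a = (m₁ − m₂)g/(m₁ + m₂) together with v² = 2a|Δy|. That acceleration formula is not derived in this section and is brought in here as an assumption for the comparison.

```python
import math

def atwood_speed_energy(m1, m2, dy, g=9.8):
    """Common speed from Eq. (8-8) after each body has moved a distance dy."""
    return math.sqrt(2.0 * g * dy * (m1 - m2) / (m1 + m2))

def atwood_speed_newton(m1, m2, dy, g=9.8):
    """Same speed from the assumed constant acceleration
    a = (m1 - m2) g / (m1 + m2) and the relation v^2 = 2 a dy."""
    a = (m1 - m2) * g / (m1 + m2)
    return math.sqrt(2.0 * a * dy)

m1, m2 = 2.10, 2.00  # kg
dy = 1.5 / 2.0       # m; the bodies are 1.5 m apart, so each has moved 0.75 m
v = atwood_speed_energy(m1, m2, dy)
print(f"|v| = {v:.2f} m/s (energy), "
      f"{atwood_speed_newton(m1, m2, dy):.2f} m/s (force method)")
print(f"free-fall speed over the same distance: {math.sqrt(2 * 9.8 * dy):.2f} m/s")
```

The two methods agree at about 0.60 m/s, far below the free-fall speed over the same distance, because the masses are nearly balanced.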


Inserting the numerical values given, you find

$$|v| = \left[2 \times 9.8\ \text{m/s}^2 \times 0.75\ \text{m} \times \left(\frac{2.10\ \text{kg} - 2.00\ \text{kg}}{2.10\ \text{kg} + 2.00\ \text{kg}}\right)\right]^{1/2} = 0.60\ \text{m/s}$$

The velocities of the bodies are therefore $v_1$ = −0.60 m/s and $v_2$ = 0.60 m/s. They
are the same as the velocities evaluated by direct application of Newton’s second law
in Example 5-3.

Suppose you had chosen some other reference height for the zero of potential
energy. Would there be any difference in the result? As a test, take U = 0 at the
final position of the body of mass $m_2$, and rework the calculation. You will find that
this choice of reference height, or any other choice, will lead to the same result. It is
only the change $\Delta U$ in potential energy which counts.

Note how the approach to the Atwood machine through the energy relations
completely avoids a detailed analysis of the internal and external forces acting on the
system. You need to know only the initial and final conditions in order to find the
final velocities of the bodies.


8-3 IMPULSE AND COLLISIONS

In Chap. 4 we studied the behavior of bodies experiencing collisions by analyzing
the changes in their momenta. In this section we will learn more
about the behavior of colliding bodies by analyzing also the changes in the
kinetic energies of the bodies. But first we will introduce a quantity called
impulse. It is related to change in momentum, and the impulse-momentum
relation will give us some additional insight into the momentum changes
taking place in a collision.

The motivation and procedure leading to the impulse-momentum relation
have a strong analogy to those leading to the work-kinetic energy
relation in Chap. 7. Let us review very briefly how and why the work-kinetic
energy relation is obtained. This relation is derived by calculating a work
integral, that is, the integral of the component of the net force acting on a
body, along the directions of its infinitesimal position changes, multiplied
by these position changes. Newton’s second law is used in the calculation to
express the net force in terms of the mass of the body and the derivative of
its velocity. Since integration is the inverse of differentiation, the calculation
produces results which involve the velocity itself rather than the derivative
of the velocity. This is why the work-kinetic energy relation involves
only the velocity, although Newton’s second law involves its derivative, the
acceleration. Application of the relation often leads much more rapidly to a
description of the behavior of the body than does application of Newton’s
second law. The reason is that in working directly with the velocity, you are
dealing with a quantity that is closer to the goal (finding the position of
the body as a function of time) than is the acceleration. Another advantage
of the work-kinetic energy relation over Newton’s second law is that
the former involves scalars such as W, while the latter involves vectors such
as F, and scalars are easier to handle than vectors.

Since  consideration  of  the  integral  of  the  net  force  over  the  change  in 
position  proves  to  be  so  fruitful,  it  is  reasonable  to  ask:  Will  an  integration 
of  the  net  force  over  the  change  in  time  also  yield  a useful  quantity?  Let’s 
try it and see. Consider the integral

$$\mathbf{I} = \int_{t_i}^{t_f} \mathbf{F}\,dt \tag{8-9}$$


Fig.  8-7  Qualitative  representation  of 
the  time  dependence  of  the  strength 
of  the  force  acting  on  a magnetic  puck 
when it collides head-on (without actually
touching) with another magnetic
puck. 


Here  the  integral,  whose  value  is  represented  by  the  symbol  I,  is  a sum  of 
terms  each  of  which  is  the  product  of  a vector,  the  net  force  F acting  on  a 
body,  and  a scalar  dt,  the  infinitesimal  increment  of  time.  Thus  I is  a vector. 
We  use  Newton’s  second  law  in  its  most  basic  form,  F = dp/dt,  to  evaluate  F 
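in terms of the time derivative of the momentum p of the body. This gives
us

$$\mathbf{I} = \int_{t_i}^{t_f} \frac{d\mathbf{p}}{dt}\,dt$$

The fundamental theorem of calculus immediately yields

$$\mathbf{I} = \mathbf{p}_f - \mathbf{p}_i = \Delta\mathbf{p} \tag{8-10}$$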


The quantity I is called the impulse, and Eq. (8-10) is called the
impulse-momentum relation. It says that the time integral of the net force
acting on a body, called the impulse I, equals the change in its momentum
$\Delta\mathbf{p}$. Since Newton’s second law was employed in deriving the
impulse-momentum relation, it applies only when the body is viewed from an
inertial frame of reference.

Is this relation useful? Not very. For the special case in which F = 0, we
have I = 0 and $\Delta\mathbf{p} = 0$, or $\mathbf{p}_f = \mathbf{p}_i$. But
this is just the familiar law of momentum conservation for the uninteresting
situation in which a system contains only a single body. For the general case
where F ≠ 0, we have I ≠ 0 so $\Delta\mathbf{p} \neq 0$, and we would like to
know just what the value of $\Delta\mathbf{p}$ is. But we can
obtain the value of I only if we can evaluate the integral in Eq. (8-9). To do
this, we must know how F depends on t. That is, we must be able to express
the force as F(t). However, we usually do not know F(t). (That is, we do not
know it until the behavior of the body has been completely determined so
that we know the position of the body at all times. But when this has been
done, there is no motivation for additional calculations.) The point is that
the forces in nature usually do not depend explicitly on time. Instead, they
depend explicitly on position. If one of these typical forces is acting on the
body of interest, we can immediately evaluate its integral over change in
position, which is the work. But we cannot immediately evaluate its integral
over change in time, which is the impulse.

Although  it  has  limited  practical  application,  impulse  is  useful  concep- 
tually. In  particular,  it  helps  us  think  about  what  happens  during  collisions. 
A collision  is  an  event  in  which  two  objects  approach  each  other,  interact 
strongly  during  the  short  time  that  they  are  in  proximity,  and  then  (gener- 
ally) move  apart.  A number  of  examples  of  collisions  between  pucks  on  an 
air  table  were  studied  in  Chap.  4. 

Figure  8-7  illustrates  qualitatively  the  time  dependence  of  the  repul- 
sive force  acting  on  one  of  two  colliding  magnetic  pucks,  assuming  for  sim- 
plicity a “head-on”  collision  so  that  the  force  can  be  represented  by  a signed 
scalar.  The  direction  of  the  force  is  defined  to  be  positive.  The  force  builds 
up  rapidly  in  strength  as  the  other  puck  approaches,  reaches  a maximum 
when  that  puck  is  at  the  point  of  closest  approach,  and  then  drops  rapidly 
as  it  recedes.  The  effect  of  the  force  is  to  change  the  momentum  of  the 
puck  on  which  it  acts.  According  to  the  impulse-momentum  relation,  the 
net  momentum  change  is  just  equal  to  the  time  integral  of  the  force,  which 
is  measured  by  the  area  under  the  F(t)  curve.  Of  course,  the  total  mo- 
mentum of  the  system  of  two  pucks  is  unchanged  by  the  collision.  The 




8-3  Impulse  and  Collisions  311 


Fig. 8-8 The solid curve is a qualitative representation of the time
dependence of the strength of the force F(t) acting on a plastic puck when it
collides head-on with another plastic puck. Compared to the magnetic force
illustrated in Fig. 8-7, the time during which the contact force acts on the
plastic puck is very short, and its maximum strength is very large. The gray
curve represents a constant, or slowly varying, force of moderate strength
which could also be acting on the puck. In studying the collision, the impulse
produced by this force can be neglected. The impulse it produces during the
collision is the area under the gray curve between the two marks on the time
axis. This is very small compared to the area under the solid curve between
the two marks, which is the impulse produced by the contact force.


other  puck  always  experiences  a force  of  the  same  magnitude  but  opposite 
direction.  So  it  receives  an  equal  but  opposite  impulse  and  experiences  an 
equal  but  opposite  net  momentum  change.  As  in  all  collisions,  the  impulses 
acting  on  the  two  colliding  magnetic  pucks  cause  momentum  to  be  trans- 
ferred from  one  to  the  other,  without  changing  the  total  momentum  of  the 
system. 

Figure 8-8 indicates the much more abrupt F(t) curve for a “head-on”
collision between two plastic pucks on an air table. The collision is abrupt
because the pucks interact only when they are actually touching. When this
happens, very strong repulsive contact forces develop for a very short time.
But if the numerical value of the impulse happens to be the same as in the
magnetic puck collision (that is, if the areas under the F(t) curves in the
two figures happen to be equal), the momentum transfer will be the same
in the two collisions. What counts in determining the momentum transfer
in a collision is the impulse, not the detailed behavior of F(t).
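The statement that only the area under the F(t) curve matters can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes two hypothetical Gaussian-shaped force pulses (a broad one standing in for the magnetic pucks, a sharp one for the plastic pucks) with the same area, and estimates each impulse with a trapezoidal sum. The pulse shapes, the area of 0.30 N·s, and the puck mass of 0.05 kg are illustrative assumptions only.

```python
import math


def impulse(force, t_start, t_end, steps=20000):
    """Trapezoidal estimate of the time integral of force(t)."""
    dt = (t_end - t_start) / steps
    total = 0.5 * (force(t_start) + force(t_end))
    for k in range(1, steps):
        total += force(t_start + k * dt)
    return total * dt


def gaussian_pulse(area, width):
    """Return F(t): a smooth pulse centred on t = 0 enclosing the given area."""
    peak = area / (width * math.sqrt(2.0 * math.pi))
    return lambda t: peak * math.exp(-0.5 * (t / width) ** 2)


AREA = 0.30   # N*s, assumed impulse common to both collisions
MASS = 0.05   # kg, assumed puck mass

broad = gaussian_pulse(AREA, width=0.05)    # gentle, long-lasting force
sharp = gaussian_pulse(AREA, width=0.002)   # strong, brief contact force

I_broad = impulse(broad, -1.0, 1.0)
I_sharp = impulse(sharp, -1.0, 1.0)

# Impulse-momentum relation, Eq. (8-10): delta p = I, so delta v = I / m.
dv_broad = I_broad / MASS
dv_sharp = I_sharp / MASS

print(f"peak forces: {broad(0.0):.2f} N and {sharp(0.0):.2f} N")
print(f"impulses:    {I_broad:.4f} N*s and {I_sharp:.4f} N*s")
print(f"delta v:     {dv_broad:.3f} m/s and {dv_sharp:.3f} m/s")
```

In this sketch the peak forces differ by a factor of 25, yet the two impulses, and therefore the two velocity changes, agree.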

Another  thing  that  is  unimportant  in  a collision  is  the  action  of  any 
force  of  moderate  strength.  As  is  illustrated  in  Fig.  8-8,  if  the  collision  force 
is  strong  and  the  collision  time  is  short,  any  much  weaker  force  can  be  ig- 
nored because  its  contribution  to  the  impulse  will  be  negligible.  For  in- 
stance, the  momentum  transferred  between  two  plastic  pucks  that  collide 
while  flying  through  the  air,  instead  of  while  moving  across  the  surface  of 
an  air  table,  is  virtually  unaffected  by  the  uncompensated  gravitational 
forces  acting  on  them  during  the  collision  because  these  forces  are  very 
much  weaker  than  the  contact  forces. 
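A rough estimate shows how small the gravitational contribution is. The numbers below are assumptions chosen only for illustration (a 0.05-kg puck, a contact time of 1 ms, and a velocity change of 0.5 m/s); they are not measurements from the text.

```python
# Compare the impulse delivered by gravity during a brief collision with the
# impulse delivered by the contact force. All numerical values are assumed.
G = 9.8                 # m/s^2, gravitational acceleration
mass = 0.05             # kg, assumed puck mass
contact_time = 1.0e-3   # s, assumed duration of the collision
delta_v = 0.5           # m/s, assumed velocity change of the puck

contact_impulse = mass * delta_v             # equals the momentum change
gravity_impulse = mass * G * contact_time    # constant force times time

average_contact_force = contact_impulse / contact_time
weight = mass * G
ratio = gravity_impulse / contact_impulse

print(f"average contact force: {average_contact_force:.1f} N")
print(f"weight of puck:        {weight:.2f} N")
print(f"gravity/contact impulse ratio: {ratio:.4f}")
```

With these assumed values the gravitational impulse is about 2 percent of the contact impulse. The ratio equals g times the contact time divided by the velocity change, so it does not depend on the mass, and it shrinks further as the contact time gets shorter.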

Examples  8-7  through  8-10  analyze  some  of  the  air  table  puck  collisions 
that  were  studied  in  Chap.  4.  Both  momentum  and  energy  are  taken  into 
account  in  the  analyses.  As  a consequence,  the  examples  provide  a more 
thorough  understanding  of  the  collisions  than  was  obtained  in  Chap.  4, 
where  only  momentum  was  considered. 


EXAMPLE  8-7 

The  strobe  photo  in  Fig.  4-14,  reproduced  here  as  Fig.  8-9,  shows  a collision 
between  two  equal-mass  magnetic  pucks  on  an  air  table.  Puck  2 was  initially  sta- 
tionary near  the  center  of  the  air  table,  and  puck  1 was  initially  moving  toward  the 
center  from  the  upper  right.  The  strobe  photo  shows  that  the  final  trajectories  of 
the  pucks  make  a 90°  angle.  Prove  that  such  behavior  is  consistent  with  the  assump- 
tion that  mechanical  energy  is  conserved  in  the  collision.  Do  this  by  assuming  that 
the  mechanical-energy  conservation  law  holds  and  then  predicting  the  value  of  the 
angle  between  the  final  trajectories  of  the  pucks. 

■ The magnetic pucks do not exert forces on each other of appreciable strength
until they are quite close. You can see this from the photo. It shows that puck 1
maintains an essentially constant velocity during its approach to puck 2. (The word
“approach” is used to mean that part of the motion of puck 1 toward puck 2 which
ends when the separation between the pucks is less than something like one puck
diameter.) Thus there can be no appreciable force exerted between pucks during
the approach. As a consequence, the kinetic energy of puck 1 is essentially all the
energy stored in the system when puck 1 is at any position in the approach.
Additional energy could result only from work done by a force acting on puck 1 while it
moved up to that position. But there is no such force. In other words, you can say
that the system has essentially no potential energy relative to the potential energy
reference value zero for puck 1 at a reference position chosen to be very far from
puck 2, when puck 1 is at any position in the approach. (When puck 1 is very close




Fig.  8-9  Strobe  photo  of  a collision  between  two 
identical  magnetic  pucks  on  an  air  table.  Puck  1 comes 
from  the  right,  and  puck  2 is  initially  at  rest  near  the 
center  of  the  table. 


to  puck  2,  the  system  does  have  potential  energy.  However,  you  will  see  that  there  is 
no  need  to  consider  it.)  A similar  argument  allows  you  to  say  that  the  system  has  es- 
sentially no  potential  energy  during  the  part  of  the  motion  when  the  two  pucks  are 
receding  from  each  other  and  their  separation  exceeds  something  like  one  puck 
diameter. 

Taking advantage of the conclusions just drawn, you need consider the energy
content of the system only in the initial time interval when the velocity vector
$\mathbf{v}_{1i}$ in the photo was measured and in the final time interval when
the velocity vectors $\mathbf{v}_{1f}$ and $\mathbf{v}_{2f}$ were measured. When
the pucks have these initial or final velocities, the system has initial or final
kinetic energies $K_i$ or $K_f$. But the corresponding values of
both the initial and the final potential energies of the system are zero. Therefore,
what you do in following the instruction to assume that mechanical energy is
conserved in the collision is to equate the initial and final values of the kinetic energy:

$$K_i = K_f$$

To obtain information about the angle between the final trajectories of the pucks,
you need to deal with directions. For this reason, you write the expressions for the
kinetic energies in their vector forms. Using Eq. (7-37b) to express $K_i$ and $K_f$ in
terms of $\mathbf{v}_{1i}$, $\mathbf{v}_{1f}$, $\mathbf{v}_{2f}$, and the mass m of
either identical puck, you have

$$\frac{m}{2}\,\mathbf{v}_{1i}\cdot\mathbf{v}_{1i} = \frac{m}{2}\,\mathbf{v}_{1f}\cdot\mathbf{v}_{1f} + \frac{m}{2}\,\mathbf{v}_{2f}\cdot\mathbf{v}_{2f} \qquad (8\text{-}11a)$$

The left side of this equality is the initial total kinetic energy of the system, since
puck 2 is initially stationary. It is also the initial total mechanical energy, since
initially there is zero potential energy in the system. The right side is the final total
kinetic energy of the system, and also its final total mechanical energy. For the
particular case at hand where the masses are equal, the factor m/2 can be divided out of
each term to yield the simplified form

$$\mathbf{v}_{1i}\cdot\mathbf{v}_{1i} = \mathbf{v}_{1f}\cdot\mathbf{v}_{1f} + \mathbf{v}_{2f}\cdot\mathbf{v}_{2f} \qquad (8\text{-}11b)$$

You have assumed that the total mechanical energy of the system is unchanged
in the collision. But it is not necessary to assume that the total momentum of the
system is unchanged. You know that to be true because the system is an isolated one
viewed from an (approximately) inertial frame. Thus you can also write

$$m\mathbf{v}_{1i} = m\mathbf{v}_{1f} + m\mathbf{v}_{2f} \qquad (8\text{-}12a)$$




Fig. 8-10 An analysis of the collision of Fig. 8-9 between
identical magnetic pucks. Before the collision puck 1
has velocity $\mathbf{v}_{1i}$ and approaches the stationary puck 2.
After the collision, puck 1 moves away from the collision
point with velocity $\mathbf{v}_{1f}$ and puck 2 moves away from that
point with velocity $\mathbf{v}_{2f}$. The angle between the two final
velocity vectors $\mathbf{v}_{1f}$ and $\mathbf{v}_{2f}$ is $\theta$.


Here the left side is the initial total momentum of the system, and the right side is
the final total momentum. This equation is quite independent of Eq. (8-11b),
because it has nothing to do with energy considerations. It can be simplified
immediately to the form

$$\mathbf{v}_{1i} = \mathbf{v}_{1f} + \mathbf{v}_{2f} \qquad (8\text{-}12b)$$


You now have two simultaneous equations, Eqs. (8-11b) and (8-12b), involving
the three quantities $\mathbf{v}_{1i}$, $\mathbf{v}_{1f}$, and $\mathbf{v}_{2f}$.
You want to prove that the angle between the two final trajectories is 90°. In
other words, you want to prove that $\mathbf{v}_{1f}$ and $\mathbf{v}_{2f}$ form a
90° angle. To obtain a relation between these two quantities, you eliminate
$\mathbf{v}_{1i}$ from the two equations. To do this, you take the dot product of
the left side of Eq. (8-12b) with itself to obtain
$\mathbf{v}_{1i}\cdot\mathbf{v}_{1i}$. You maintain the equality by also taking the
dot product of the right side with itself, producing

$$\mathbf{v}_{1i}\cdot\mathbf{v}_{1i} = (\mathbf{v}_{1f} + \mathbf{v}_{2f})\cdot(\mathbf{v}_{1f} + \mathbf{v}_{2f})$$

or

$$\mathbf{v}_{1i}\cdot\mathbf{v}_{1i} = \mathbf{v}_{1f}\cdot\mathbf{v}_{1f} + \mathbf{v}_{2f}\cdot\mathbf{v}_{2f} + 2\,\mathbf{v}_{1f}\cdot\mathbf{v}_{2f} \qquad (8\text{-}13)$$

This is the vector equivalent of squaring both sides of an equation. Subtracting
Eq. (8-11b) from Eq. (8-13), you obtain

$$0 = 2\,\mathbf{v}_{1f}\cdot\mathbf{v}_{2f} = 2\,v_{1f}\,v_{2f}\cos\theta$$

Here $\theta$ is the angle between the final trajectories of the two pucks, as shown
in Fig. 8-10. In general, neither $v_{1f}$ nor $v_{2f}$ is zero, and thus it must be
true that

$$\cos\theta = 0 \qquad (8\text{-}14)$$

So you have predicted that if the collision is energy-conserving, the angle between
the final puck trajectories must be $\theta$ = 90°. Your prediction is borne out by Fig. 8-9.
An interesting special case, familiar to billiard players, is a head-on collision where
$v_{1f} = 0$. How does this case fit into the analysis you have just gone through?
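The 90° result of Example 8-7 can be checked numerically for any deflection of puck 1. The sketch below is an illustration, not part of the original analysis. It assumes puck 1 initially moves along the +x axis and is deflected through a hypothetical angle `phi_deg`. Substituting Eq. (8-12b) into Eq. (8-11b) gives the final speed of puck 1 as its initial speed times the cosine of that deflection angle, and the code uses this to build both final velocities.

```python
import math


def equal_mass_elastic(v1i, phi_deg):
    """Final velocities (x, y) of two equal-mass pucks, puck 2 initially at rest.

    Puck 1 arrives along +x with speed v1i and leaves at angle phi_deg to +x.
    """
    phi = math.radians(phi_deg)
    speed_1f = v1i * math.cos(phi)            # from Eqs. (8-11b) and (8-12b)
    v1f = (speed_1f * math.cos(phi), speed_1f * math.sin(phi))
    v2f = (v1i - v1f[0], 0.0 - v1f[1])        # momentum conservation, Eq. (8-12b)
    return v1f, v2f


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def angle_between(a, b):
    """Angle in degrees between two nonzero 2-D vectors."""
    cosine = dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


v1i = 1.0  # m/s, assumed initial speed
v1f, v2f = equal_mass_elastic(v1i, phi_deg=30.0)
theta = angle_between(v1f, v2f)
energy_check = dot(v1f, v1f) + dot(v2f, v2f)   # should equal v1i**2, Eq. (8-11b)

print(f"theta = {theta:.6f} degrees")
print(f"v1f.v1f + v2f.v2f = {energy_check:.6f} (v1i.v1i = {v1i ** 2:.6f})")
```

Trying other deflection angles between 0° and 90° (exclusive) gives the same right angle between the final velocities, as Eq. (8-14) requires.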


EXAMPLE  8-8 


The  collision  between  identical  magnetic  pucks  of  Fig.  8-9  was  called  an  elastic  col- 
lision in  Chap.  4.  There  an  elastic  collision  was  defined  as  one  in  which  the  pucks 
move  apart  after  the  collision  with  a relative  speed  equal  to  the  relative  speed  with 




which they came together before the collision. That is, in an elastic collision the
velocities satisfy the condition of Eq. (4-9a), which is

$$|\mathbf{v}_{1f} - \mathbf{v}_{2f}| = |\mathbf{v}_{1i} - \mathbf{v}_{2i}| \qquad (8\text{-}15)$$

Show  that  this  condition  is  consistent  with  the  assumption  that  mechanical  energy  is 
conserved  in  the  collision. 

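■ You want to evaluate the final relative speed
$|\mathbf{v}_{1f} - \mathbf{v}_{2f}|$. You can do this by considering the final
relative velocity, $\mathbf{v}_{1f} - \mathbf{v}_{2f}$, and then generating the
square of the speed from the velocity by taking the dot product of the velocity
into itself. You obtain, on expanding,

$$(\mathbf{v}_{1f} - \mathbf{v}_{2f})\cdot(\mathbf{v}_{1f} - \mathbf{v}_{2f}) = \mathbf{v}_{1f}\cdot\mathbf{v}_{1f} + \mathbf{v}_{2f}\cdot\mathbf{v}_{2f} - 2\,\mathbf{v}_{1f}\cdot\mathbf{v}_{2f}$$

By using Eq. (8-14), you simplify this to

$$(\mathbf{v}_{1f} - \mathbf{v}_{2f})\cdot(\mathbf{v}_{1f} - \mathbf{v}_{2f}) = \mathbf{v}_{1f}\cdot\mathbf{v}_{1f} + \mathbf{v}_{2f}\cdot\mathbf{v}_{2f}$$

And by using Eq. (8-11b), you simplify it further to

$$(\mathbf{v}_{1f} - \mathbf{v}_{2f})\cdot(\mathbf{v}_{1f} - \mathbf{v}_{2f}) = \mathbf{v}_{1i}\cdot\mathbf{v}_{1i}$$

This result can be written in the form

$$|\mathbf{v}_{1f} - \mathbf{v}_{2f}|^2 = |\mathbf{v}_{1i}|^2$$

The terms on both sides of the equality are scalars. Taking their square roots, you
have

$$|\mathbf{v}_{1f} - \mathbf{v}_{2f}| = |\mathbf{v}_{1i}|$$

For the case at hand, $\mathbf{v}_{2i} = 0$. So the equation you have just obtained
by using the conservation of mechanical energy is consistent with Eq. (8-15), the
kinematical condition defining elastic collisions.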


The  conclusion  reached  in  Example  8-8  concerning  an  elastic  colli- 
sion is  valid  in  general.  Even  when  both  colliding  bodies  are  moving  before 
the  collision,  and  even  when  their  masses  are  unequal,  the  kinematical  defi- 
nition for  elastic  collisions  (the  bodies  move  apart  with  a final  relative  speed 
equal  to  the  initial  relative  speed  at  which  they  came  together)  is  equivalent 
to  the  statement  that  total  mechanical  energy  is  conserved  in  elastic  colli- 
sions (the  bodies  move  apart  with  a final  total  kinetic  energy  equal  to  the 
initial  total  kinetic  energy  they  have  as  they  come  together).  Briefly  put,  an 
elastic  collision  is  one  in  which  mechanical  energy  is  conserved. 
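The general claim can be illustrated for a one-dimensional collision in which both bodies are moving and the masses are unequal. The sketch below is an assumption-laden illustration rather than a derivation from the text: it uses the standard closed-form final velocities that follow from solving the momentum and kinetic-energy conservation equations together in one dimension, and then checks that the relative speed is unchanged. The masses and velocities are arbitrary illustrative values.

```python
def elastic_collision_1d(m1, v1i, m2, v2i):
    """Final velocities for a 1-D elastic collision.

    Standard result of solving momentum and kinetic-energy conservation
    simultaneously (the non-trivial root, in which the bodies do interact).
    """
    total = m1 + m2
    v1f = ((m1 - m2) * v1i + 2.0 * m2 * v2i) / total
    v2f = ((m2 - m1) * v2i + 2.0 * m1 * v1i) / total
    return v1f, v2f


def kinetic_energy(m, v):
    return 0.5 * m * v * v


# Assumed values: unequal masses, both bodies moving before the collision.
m1, v1i = 1.0, 2.0     # kg, m/s
m2, v2i = 3.0, -1.0    # kg, m/s

v1f, v2f = elastic_collision_1d(m1, v1i, m2, v2i)

p_before = m1 * v1i + m2 * v2i
p_after = m1 * v1f + m2 * v2f
k_before = kinetic_energy(m1, v1i) + kinetic_energy(m2, v2i)
k_after = kinetic_energy(m1, v1f) + kinetic_energy(m2, v2f)

print(f"final velocities: {v1f} m/s, {v2f} m/s")
print(f"momentum:       {p_before} -> {p_after}")
print(f"kinetic energy: {k_before} -> {k_after}")
print(f"relative speed: {abs(v1i - v2i)} -> {abs(v1f - v2f)}")
```

For these values the bodies approach at a relative speed of 3 m/s and separate at the same relative speed, while momentum and kinetic energy are both unchanged.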

An  inelastic  collision  is  one  in  which  there  is  a loss  of  mechanical  energy.  There 
is  no  potential  energy  in  the  system  of  colliding  bodies  in  either  the  initial 
or  the  final  state  where  they  are  well  separated  and  so  do  not  exert  forces 
on each other, as was explained in Example 8-7. Thus the loss of mechanical
energy in an inelastic collision must be a result of a loss of kinetic energy.
The speeds of both bodies are generally lower after an inelastic collision
because their kinetic energies are generally smaller. Thus the colliding
bodies rebound from an inelastic collision with a relative speed that is lower
than the relative speed with which they approach before the collision, in
agreement with the kinematical definition of Eq. (4-9b). An example is
found  in  Fig.  4-12,  reproduced  here  as  Fig.  8-11.  The  strobe  photo  shows  a 
collision  between  an  incident  plastic  puck  and  an  initially  stationary  plastic 
puck  of  the  same  mass.  When  the  pucks  come  into  contact,  they  are  de- 
formed by  the  very  strong  contact  force.  Most  of,  but  not  all,  the  work  done 
in  producing  the  deformation  is  recovered  when  the  pucks  move  apart. 
Some  of  the  associated  mechanical  energy  remains  in  the  pucks  in  the  form 




Fig.  8-11  Strobe  photo  of  a collision  between  two 
identical  plastic  pucks.  Puck  1 is  incident  from  the 
upper  right  on  puck  2,  which  is  initially  at  rest  near 
the  center  of  the  table. 


of  thermal  energy  of  vibration  of  their  constituent  molecules  (and  a bit  is 
lost  to  the  acoustical  energy  in  the  “click”  that  can  be  heard  when  they  col- 
lide). But  it  would  be  very  difficult  to  determine  how  much  mechanical  en- 
ergy is  lost  by  making  thermal  (and  acoustical)  measurements.  Example  8-9 
shows  that  there  is  a much  easier  way. 


EXAMPLE 8-9

Develop  an  equation  which  can  be  used  as  a basis  for  a convenient  mechanical  mea- 
surement of  the  amount  of  mechanical  energy  lost  in  an  inelastic  collision.  Do  this 
for  a case  in  which  one  of  the  colliding  bodies  is  initially  stationary  but  the  masses  of 
the  two  bodies  are  unequal. 

■ For an inelastic collision, an energy conservation equation like Eq. (8-11a) is
invalid since the left side is actually larger than the right side. The difference
between their values is $\Delta K$, the change in kinetic energy between the initial
and final states. It equals $\Delta E$, the change in the total mechanical energy of
the system between the initial and final states, because there is no potential energy
in the system in either state. Using labels defined in Fig. 8-12 to specify the masses
of the colliding bodies, you can express these energy changes as follows:

$$\Delta E = \Delta K = \left(\frac{m_1}{2}\,\mathbf{v}_{1f}\cdot\mathbf{v}_{1f} + \frac{m_2}{2}\,\mathbf{v}_{2f}\cdot\mathbf{v}_{2f}\right) - \frac{m_1}{2}\,\mathbf{v}_{1i}\cdot\mathbf{v}_{1i} \qquad (8\text{-}16)$$

In an inelastic collision the energy changes are negative since kinetic and total
mechanical energies are lost. If you know both masses, a measurement of all three
velocities in Eq. (8-16) would allow a measurement of the energy loss.


Fig. 8-12 Analysis of a collision. Before the collision, the body of
mass $m_1$ has velocity $\mathbf{v}_{1i}$ and approaches the stationary body of mass $m_2$.
After the collision, the body of mass $m_1$ moves from the collision point
with velocity $\mathbf{v}_{1f}$, and the body of mass $m_2$ moves away from that point
with velocity $\mathbf{v}_{2f}$. The angle between $\mathbf{v}_{1i}$ and $\mathbf{v}_{1f}$, the initial and final
velocity vectors of the body of mass $m_1$, is $\phi$.




However, by combining Eq. (8-16) with the momentum conservation equation,
the need for measuring one of these velocities can be eliminated. This is a very great
convenience because it considerably simplifies the experimental technique that
must be used to measure the energy loss. The momentum conservation equation is

$$m_1\mathbf{v}_{1i} = m_1\mathbf{v}_{1f} + m_2\mathbf{v}_{2f} \qquad (8\text{-}17)$$

It applies whether the collision is elastic or inelastic.

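What you do to use Eq. (8-17) is to solve it for $\mathbf{v}_{2f}$, obtaining

$$\mathbf{v}_{2f} = \frac{m_1}{m_2}\,(\mathbf{v}_{1i} - \mathbf{v}_{1f})$$

Then you evaluate

$$\mathbf{v}_{2f}\cdot\mathbf{v}_{2f} = \frac{m_1}{m_2}\,(\mathbf{v}_{1i} - \mathbf{v}_{1f})\cdot\frac{m_1}{m_2}\,(\mathbf{v}_{1i} - \mathbf{v}_{1f})$$

Expanding, you obtain

$$\mathbf{v}_{2f}\cdot\mathbf{v}_{2f} = \frac{m_1^2}{m_2^2}\,(\mathbf{v}_{1i}\cdot\mathbf{v}_{1i} + \mathbf{v}_{1f}\cdot\mathbf{v}_{1f} - 2\,\mathbf{v}_{1i}\cdot\mathbf{v}_{1f})$$

Now you substitute this expression for $\mathbf{v}_{2f}\cdot\mathbf{v}_{2f}$ into Eq. (8-16) and find

$$\Delta K = \frac{m_1}{2}\,\mathbf{v}_{1f}\cdot\mathbf{v}_{1f} + \frac{m_1^2}{2m_2}\,(\mathbf{v}_{1i}\cdot\mathbf{v}_{1i} + \mathbf{v}_{1f}\cdot\mathbf{v}_{1f} - 2\,\mathbf{v}_{1i}\cdot\mathbf{v}_{1f}) - \frac{m_1}{2}\,\mathbf{v}_{1i}\cdot\mathbf{v}_{1i}$$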

All the dot products can be evaluated immediately, except for the one involving
$\mathbf{v}_{1i}$ and $\mathbf{v}_{1f}$. But it can be expressed in terms of the
magnitudes of these velocities and the angle $\phi$ between them. This is also the
angle between the initial and final trajectories of the incident body (see Fig. 8-12).
Expressing the value of $\mathbf{v}_{1i}\cdot\mathbf{v}_{1f}$ in terms of $v_{1i}$,
$v_{1f}$, and $\cos\phi$, and placing the $\cos\phi$ factor at the end of the
expression for convenience, you have

$$\Delta K = \frac{m_1}{2}\,v_{1f}^2 + \frac{m_1^2}{2m_2}\,v_{1i}^2 + \frac{m_1^2}{2m_2}\,v_{1f}^2 - \frac{m_1}{2}\,v_{1i}^2 - \frac{m_1^2}{m_2}\,v_{1i}v_{1f}\cos\phi$$

Gathering the terms in $v_{1i}^2$ and $v_{1f}^2$, you get the result

$$\Delta K = \frac{m_1 v_{1f}^2}{2}\left(1 + \frac{m_1}{m_2}\right) - \frac{m_1 v_{1i}^2}{2}\left(1 - \frac{m_1}{m_2}\right) - \frac{m_1^2}{m_2}\,v_{1i}v_{1f}\cos\phi \qquad (8\text{-}18a)$$

An equivalent form, expressed in terms of the initial and final kinetic energies
$K_{1i}$ and $K_{1f}$, is

$$\Delta K = K_{1f}\left(1 + \frac{m_1}{m_2}\right) - K_{1i}\left(1 - \frac{m_1}{m_2}\right) - 2\,\frac{m_1}{m_2}\sqrt{K_{1i}K_{1f}}\,\cos\phi \qquad (8\text{-}18b)$$

Thus a measurement of the initial and final kinetic energies $K_{1i}$ and $K_{1f}$ of the
so-called scattered body, and of its scattering angle $\phi$, gives you the kinetic energy
change $\Delta K$ of the system. From this you immediately find the change in total
mechanical energy $\Delta E$, since $\Delta E = \Delta K$.
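As a consistency check on the algebra of Example 8-9, the scattering form of Eq. (8-18b) can be compared with a direct evaluation of Eq. (8-16), in which the velocity of the struck body is first obtained from Eq. (8-17). The following sketch assumes the incident body moves along the +x axis; the function names and the numerical values are hypothetical and serve only as an illustration.

```python
import math


def delta_k_direct(m1, m2, v1i, v1f, phi_deg):
    """Kinetic energy change from Eq. (8-16), with v2f taken from Eq. (8-17)."""
    phi = math.radians(phi_deg)
    v1f_vec = (v1f * math.cos(phi), v1f * math.sin(phi))
    # Eq. (8-17) solved for v2f, with v1i directed along +x.
    v2f_vec = (m1 / m2 * (v1i - v1f_vec[0]), m1 / m2 * (0.0 - v1f_vec[1]))
    k_final = (0.5 * m1 * (v1f_vec[0] ** 2 + v1f_vec[1] ** 2)
               + 0.5 * m2 * (v2f_vec[0] ** 2 + v2f_vec[1] ** 2))
    return k_final - 0.5 * m1 * v1i ** 2


def delta_k_scattering(m1, m2, k1i, k1f, phi_deg):
    """Kinetic energy change from Eq. (8-18b)."""
    r = m1 / m2
    return (k1f * (1.0 + r) - k1i * (1.0 - r)
            - 2.0 * r * math.sqrt(k1i * k1f) * math.cos(math.radians(phi_deg)))


# Assumed measurement: a 1-kg body slows from 1.0 m/s to 0.6 m/s while being
# deflected through 40 degrees by an initially stationary 2-kg body.
m1, m2 = 1.0, 2.0
v1i, v1f, phi_deg = 1.0, 0.6, 40.0

dk_direct = delta_k_direct(m1, m2, v1i, v1f, phi_deg)
dk_scatter = delta_k_scattering(m1, m2, 0.5 * m1 * v1i ** 2,
                                0.5 * m1 * v1f ** 2, phi_deg)

print(f"Eq. (8-16) with (8-17): {dk_direct:.6f} J")
print(f"Eq. (8-18b):            {dk_scatter:.6f} J")
```

The two evaluations agree, and the negative sign of the result marks this assumed collision as inelastic.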


Either form of Eqs. (8-18) provides a very convenient way of analyzing
inelastic collisions. These equations are also very useful for elastic
collisions. To apply them to such a collision, you just set $\Delta K = 0$. This is done
for a specific case in Example 8-10, and the predictions obtained are tested
against experiment.




EXAMPLE  8-10 




A body of mass 1.00 kg, moving at a speed of 1.00 m/s, collides elastically with a
stationary body of mass 2.00 kg. The angle between the final trajectory of the 1.00-kg
body and its initial trajectory is $\phi$ = 65°. Calculate its final speed.

■ Since the collision is elastic, you set $\Delta K = 0$ in Eq. (8-18a). Dividing through by
$m_1/2$ and reordering terms, you have

$$v_{1f}^2\left(1 + \frac{m_1}{m_2}\right) - 2\,\frac{m_1}{m_2}\,v_{1i}v_{1f}\cos\phi - v_{1i}^2\left(1 - \frac{m_1}{m_2}\right) = 0$$

Using $m_1/m_2$ = 1.00 kg/2.00 kg = 0.50, $v_{1i}$ = 1.00 m/s, and $\cos\phi$ = cos 65° = 0.42,
you obtain

$$1.50\,v_{1f}^2 - (0.42\ \text{m/s})\,v_{1f} - 0.50\ \text{m}^2/\text{s}^2 = 0$$

This is a quadratic equation in the unknown $v_{1f}$. The standard expression for the
solutions to such an equation tells you that

$$v_{1f} = \frac{0.42\ \text{m/s} \pm \sqrt{(0.42\ \text{m/s})^2 + 4 \times 1.50 \times 0.50\ \text{m}^2/\text{s}^2}}{2 \times 1.50} = \frac{0.42\ \text{m/s} \pm 1.78\ \text{m/s}}{2 \times 1.50}$$

Since $v_{1f}$ is a speed, it cannot be negative. So the physically significant solution is
found by choosing the positive sign that precedes the square root. With this choice
of sign, you obtain the result

$$v_{1f} = 0.74\ \text{m/s}$$

Figure 8-13 is a reproduction of the strobe photo of Fig. 4-16. It shows an
elastic collision between a single (that is, $m_1 = 1$) magnetic puck coming from the
launcher and a double (that is, $m_2 = 2$) magnetic puck which is initially stationary
at the center of the air table. The collision scatters the incident puck through an angle
$\phi$, and it moves off to the upper left of the photo. Measurement with a protractor
will show you that $\phi \approx$ 65°. You thus have all the information you need to make a
comparison  of  the  puck  collision  with  the  collision  calculated  in  this  example.  Mea- 


Fig.  8-13  Strobe  photo  of  a collision  between  a single 
and  a double  magnetic  puck.  The  single  puck,  labeled 
1,  comes  from  the  upper  right;  the  double  puck, 
labeled  2,  is  initially  at  rest  near  the  center  of  the 
air table. If the mass of the single puck is taken as the
unit of mass, that is, $m_1 = 1$, then the mass of the
double puck is $m_2 = 2$.




sure  the  center-to-center  separations  of  adjacent  images  of  the  incident  puck,  after 
the collision and before the collision, to obtain the ratio of its final and initial speeds.
Then compare this ratio obtained from measurement with the calculated ratio
$v_{1f}/v_{1i}$ = (0.74 m/s)/(1.00 m/s) = 0.74. You will find the comparison to be quite
satisfactory.
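The arithmetic of Example 8-10 can be reproduced with a few lines of code. This is a minimal sketch under stated assumptions: the function name is hypothetical, the incident body is taken to be the lighter one (so that only one root of the quadratic is positive), and the cosine is evaluated without rounding to two figures.

```python
import math


def elastic_final_speed(m1, m2, v1i, phi_deg):
    """Final speed of the incident body from Eq. (8-18a) with delta K = 0.

    Solves  (1 + r) v^2 - 2 r v1i cos(phi) v - (1 - r) v1i^2 = 0,  r = m1/m2,
    assuming m1 < m2 so that exactly one root is positive.
    """
    r = m1 / m2
    a = 1.0 + r
    b = -2.0 * r * v1i * math.cos(math.radians(phi_deg))
    c = -(1.0 - r) * v1i ** 2
    discriminant = b * b - 4.0 * a * c
    return (-b + math.sqrt(discriminant)) / (2.0 * a)


v1f = elastic_final_speed(m1=1.00, m2=2.00, v1i=1.00, phi_deg=65.0)
print(f"v1f = {v1f:.3f} m/s, ratio v1f/v1i = {v1f / 1.00:.3f}")
```

The unrounded calculation gives a final speed close to 0.735 m/s, which agrees with the two-figure value 0.74 m/s quoted in the example.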


It should be emphasized that Eq. (8-18a) or (8-18b), having been
obtained by using only conservation laws, can relate only possible final
conditions of a system of colliding bodies to possible initial conditions of the
system. Since the conservation laws do not contain complete information
about the forces acting between the bodies, the equations obtained from
them cannot answer all the questions that can be asked about a particular
collision. For instance, the question posed in Example 8-10 was: Given that
the incident body is scattered through an angle of 65°, what will its final
speed be? This question can be answered by using Eqs. (8-18a) and (8-18b).
Another interesting question is: In what circumstances will the incident
body be scattered through an angle of 65°? The equations will not provide
an answer to this question. It can be answered only by a calculation which
fully takes into account the properties of the forces acting between the
colliding bodies. An example of such a calculation is considered in Chap. 20,
when we study the scattering of an alpha particle by an atomic nucleus.

An equation completely equivalent to Eq. (8-18b) is very frequently
used in atomic and nuclear physics to analyze measurements of the
scattering of microscopic particles. And a slightly modified equation is used in
the analysis of reactions between such particles. We will obtain the
modified equation in Chap. 15, which treats relativistic mechanics. The
measurement of scattering and reactions is one of the most widely used
experimental techniques of contemporary physics. It is used to study the forces
which microscopic particles exert on each other when they interact.

Example 8-11 deals with a completely inelastic collision.


EXAMPLE  8-11 


Fig.  8-14  A ballistic  pendulum. 


Figure 8-14 depicts the ballistic pendulum technique that can be used to measure
the speed of a bullet. The bullet, of mass m, is fired into a wood block of mass M
which is sufficiently thick to stop the bullet. The wood block is suspended by a wire
of length l and mass negligible compared to m and M. After the bullet enters the
block, the pendulum swings to a maximum angle $\phi$, which is measured. Derive an
expression giving the speed v of the bullet in terms of the measured value of $\phi$.

■ You must consider, in sequence, two processes. In the first, the bullet collides
with the wood block, penetrating it until the bullet comes to rest with respect to the
block. The total momentum of the bullet-plus-block system is conserved in the
collision. The reason is that no forces act on this system during the collision which have
components along the direction of motion of the bullet. (And even if there were
such a force, its effect on the momentum of the system could be ignored, to a good
approximation, since it could produce no significant impulse during the very short
duration of the collision.) Writing the velocity of the bullet immediately before
colliding with the block as $\mathbf{v}$, you get by momentum conservation the equation

$$m\mathbf{v} = (m + M)\mathbf{V} \qquad (8\text{-}19)$$

Here $m\mathbf{v}$ is the initial total momentum of the system, since the block initially was
stationary. And $(m + M)\mathbf{V}$ is the final total momentum of the system with the block
and embedded bullet moving at velocity $\mathbf{V}$ immediately after the collision.


8-3  Impulse  and  Collisions  319 


8-4  HARMONIC 
OSCILLATIONS 


EXAMPLE  8-12 


In  the  second  process,  the  pendulum  swings  to  </>,  where  it  is  instantaneously  at 
rest.  In  this  process  the  mechanical  energy  of  the  bullet-plus-block-plus-earth 
system  is  conserved.  The  reason  is  that  the  force  exerted  on  this  system  by  the  wire 
connected  to  the  block  does  no  work  on  the  system;  the  wire  is  a workless  constraint. 
Take  the  potential  energy  to  be  zero  when  the  bullet  and  block  are  at  the  height  y = 
0 at  the  start  of  the  swing,  by  defining  that  to  be  the  reference  height.  Then  equate 
the  initial  kinetic  energy  of  the  system 


K = 


( to  + MIT2 
~2 


(8-20) 


to  its  final  potential  energy 

U = (to  + M)gy  = (to  + M)g(l  — l cos  <£) 

Doing  so,  you  obtain 


m + M 
2 


V 2 = (to  + M)g(l 


l COS  (j)) 


or 

V2  = 2gl(l  - cos  4>) 

Using  Eq.  (8-19)  to  evaluate  V2  in  terms  of  v 2,  you  then  have 

( — 7T7)  v2  = 2gl(\  - cos  0) 

\m  + Ml 

So 


v= V2g/(1  - cos  <f>)  (8-21) 

m 

The  result  obtained  in  Eq.  (8-21)  allows  v to  be  determined  from  the  known  values 
of  to,  M,  g,  /,  and  (f). 

The collision between the bullet and the block is completely inelastic, in that the
colliding objects remain together after the collision. Thus you would expect mechanical
energy to be lost in the collision. But it is apparent that the mechanical energy is
not completely lost in the completely inelastic collision: the block and bullet
are moving immediately afterward. A portion of the mechanical energy must remain
because momentum must not be lost in the collision. Use Eq. (8-19) to express
the K of Eq. (8-20) in terms of the initial kinetic energy of the bullet, and thereby
determine just how much mechanical energy has disappeared in the collision.
Where did it go?
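One way to check your answer to the first part of this question is the sketch below. It assumes that Eq. (8-19) is the momentum-conservation relation mv = (m + M)V for a block initially at rest, which is consistent with the way Eq. (8-21) was obtained. The function name and the sample numbers are hypothetical.

```python
def fraction_of_kinetic_energy_retained(m, M):
    """Ratio K_after / K_bullet for a completely inelastic collision
    of a bullet (mass m) with a block (mass M) initially at rest."""
    return m / (m + M)

# Direct check with hypothetical numbers.
m, M, v = 0.010, 2.0, 200.0
V = m * v / (m + M)               # assumed form of Eq. (8-19)
K_bullet = 0.5 * m * v**2         # kinetic energy before the collision
K_after = 0.5 * (m + M) * V**2    # Eq. (8-20)

print(K_after / K_bullet, fraction_of_kinetic_energy_retained(m, M))
```

For a light bullet and a heavy block, only a small fraction of the bullet's kinetic energy survives as mechanical energy of the swinging pendulum.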


In Examples 7-9 and 8-11 we have applied energy relations to predict certain
aspects of the motion of a pendulum. In this section we will apply energy
relations to a harmonic oscillator, such as the body at the end of a spring
illustrated in Fig. 8-15. The purpose is not to predict the motion; we
already know about the motion of a harmonic oscillator from the detailed
study we made in Chap. 6 with the aid of Newton’s second law. Rather, the
purpose is to gain the additional insight that the energy relations have to
offer about properties of this important system.


Using the analytical solutions of the harmonic oscillator differential equation
obtained in Sec. 6-4, evaluate the kinetic, potential, and total mechanical energies of
the oscillator.




Fig. 8-15 A harmonic oscillator.


According to Eq. (6-17), the solutions are

$$x = A\cos(\omega t + \delta) \tag{8-22}$$

Here x is the displacement of the oscillating body at time t, measured from its
equilibrium position at x = 0. See Fig. 8-15. The angular frequency $\omega$ is determined by
the mechanical properties of the oscillator through the relation

$$\omega = \sqrt{\frac{k}{m}} \tag{8-23}$$

In this expression k specifies the stiffness of the spring connected to the body, and m
specifies the mass of the body. But the amplitude constant A and the phase constant
$\delta$ can have whatever values are required to describe a particular oscillation. In other
words, the values of A and $\delta$ are determined by the initial conditions.

To evaluate the kinetic energy K, you need to compute the velocity dx/dt of the
body since

$$K = \frac{m}{2}\left(\frac{dx}{dt}\right)^2$$

Differentiating Eq. (8-22), or copying Eq. (6-21a), you immediately obtain

$$\frac{dx}{dt} = -A\omega\sin(\omega t + \delta) \tag{8-24}$$

Substituting this into the expression for K, you have

$$K = \frac{mA^2\omega^2}{2}\sin^2(\omega t + \delta) \tag{8-25}$$

According to Eq. (7-58), the potential energy stored in the spring is

$$U = \frac{kx^2}{2} \tag{8-26}$$


Using Eq. (8-22), you get

$$U = \frac{kA^2}{2}\cos^2(\omega t + \delta)$$

To facilitate comparison with K, you can use Eq. (8-23) to write $k = m\omega^2$. Then you
have

$$U = \frac{mA^2\omega^2}{2}\cos^2(\omega t + \delta) \tag{8-27}$$


The total mechanical energy E of the oscillator is

$$E = K + U$$

Evaluating K and U from Eqs. (8-25) and (8-27), you have

$$E = \frac{mA^2\omega^2}{2}\left[\sin^2(\omega t + \delta) + \cos^2(\omega t + \delta)\right] \tag{8-28a}$$

The trigonometric identity $\sin^2\theta + \cos^2\theta = 1$ holds for any angle $\theta$ and for the
angle $\omega t + \delta$ in particular. Thus the identity allows you to simplify Eq. (8-28a) to

$$E = \frac{mA^2\omega^2}{2} \tag{8-28b}$$


Although  the  kinetic  and  potential  energies  of  the  harmonic  oscillator  vary  in  time, 
these  results  show  that  its  total  mechanical  energy  has  no  time  dependence. 
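The constancy of E can also be checked numerically. The sketch below evaluates K, U, and their sum from the solution x = A cos(ωt + δ) at a series of times; the mass, spring constant, amplitude, and phase are hypothetical values chosen only for illustration.

```python
import math

def oscillator_energies(t, m, k, A, delta=0.0):
    """Return (K, U, E) of an undamped harmonic oscillator at time t,
    using x = A cos(omega t + delta) with omega = sqrt(k/m)."""
    omega = math.sqrt(k / m)
    x = A * math.cos(omega * t + delta)
    v = -A * omega * math.sin(omega * t + delta)
    K = 0.5 * m * v**2
    U = 0.5 * k * x**2
    return K, U, K + U

# Hypothetical oscillator: 0.5-kg body, 8.0-N/m spring, 0.1-m amplitude.
m, k, A = 0.5, 8.0, 0.1
expected_E = 0.5 * m * A**2 * (k / m)      # m A^2 omega^2 / 2

times = [0.1 * n for n in range(50)]
totals = [oscillator_energies(t, m, k, A, delta=0.3)[2] for t in times]
print(min(totals), max(totals), expected_E)
```

Although K and U individually swing between 0 and E, every entry of `totals` agrees with the single value of E to within rounding error.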

Thus you can conclude that built into the solutions of the harmonic oscillator
differential equation is the information that the total energy of the oscillator is a
constant. The solutions “know” that the oscillator conserves mechanical energy,
although they came directly from Newton’s laws of motion, and were found before
we developed the energy relations and proved that the force produced by a spring
is conservative! Equation (8-28b) provides a convincing demonstration of the
consistency of all the theory which has led to it.


Fig. 8-16 The kinetic energy K, potential energy U, and total mechanical
energy E of a harmonic oscillator plotted as a function of the time t.



The way the mechanical energy of the oscillating system is transformed
back and forth between its kinetic manifestation and its potential
manifestation is made very evident by Eq. (8-28a). Figure 8-16 plots the
total, kinetic, and potential energies of a harmonic oscillator, E, K, and U,
from the term on the left side of the equation and the two terms on the
right side. At the time labeled $t_a$ in the figure, U is zero and K is a maximum.
At this instant the spring has its relaxed length, and the body at its
end is moving with maximum speed through its position of stable equilibrium,
say in the direction which will subsequently result in compression of
the spring. As time passes, the spring is compressed and the body slows
down. The increase in U is exactly as great as the decrease in K, so that at $t_b$
both of these energies have half their maximum value. At $t_c$ the spring has
maximum compression and the body is instantaneously at rest, so that U is
a maximum and K is zero. The time interval from $t_a$ to $t_c$ is one-quarter of
the oscillation period $T = 2\pi/\omega$. At $t_d$ the body is again moving through the
equilibrium position, but now in the direction which will result in extension
of the spring. Thus one-half of a period has elapsed. The time $t_e$ corresponds
to the end of a full period of the oscillation.

The total mechanical energy E of the harmonic oscillator is conserved
because of the way any change in its kinetic energy K is always exactly
compensated for by an opposite change in its potential energy U.

The figure also makes clear that $\langle K \rangle$, the average of the kinetic energy
K over any full oscillation period, is equal to $\langle U \rangle$, the average of the
potential energy U over the same period. That is

$$\langle K \rangle = \langle U \rangle$$

since both K and U vary in the same way between a minimum value of 0
and a maximum value of E. An easy way to determine the actual value of
$\langle K \rangle$ or $\langle U \rangle$ is to note that the definition

$$K + U = E$$

tells us that

$$\langle K + U \rangle = \langle E \rangle$$

This, in turn, tells us that

$$\langle K \rangle + \langle U \rangle = \langle E \rangle \tag{8-29}$$

since the average value of the sum of two quantities equals the sum of their
average values. Equating $\langle U \rangle$ to $\langle K \rangle$, we have


$$2\langle K \rangle = \langle E \rangle \tag{8-30}$$

We can write this as

$$\langle K \rangle = \langle U \rangle = \frac{E}{2}$$

since $\langle E \rangle = E$ because the total energy E is constant. Evaluating E from Eq.
(8-28b), we obtain

$$\langle K \rangle = \langle U \rangle = \frac{mA^2\omega^2}{4} \tag{8-31}$$

(A more formal proof of this result can be obtained by direct calculation of
$\langle K \rangle$ and $\langle U \rangle$. The calculation requires evaluating the integral over one
period of the square of a sinusoid.)

A number of different energies can be associated with a harmonic
oscillator: K, U, E, $\langle K \rangle$, and $\langle U \rangle$. But Eqs. (8-25) through (8-31) show that all
these energies have a common feature: the energy is proportional to both the
square of the amplitude of the harmonic oscillator and the square of its frequency.
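The direct calculation of the averages can be sketched numerically instead of by integration. The code below averages K and U over one period with a simple midpoint rule; the helper `cycle_average` and the oscillator parameters are hypothetical and serve only as an illustration.

```python
import math

def cycle_average(f, period, n=10000):
    """Midpoint-rule average of f(t) over one full period."""
    dt = period / n
    return sum(f((i + 0.5) * dt) for i in range(n)) / n

# Hypothetical oscillator parameters.
m, k, A, delta = 0.5, 8.0, 0.1, 0.3
omega = math.sqrt(k / m)
T = 2.0 * math.pi / omega

def K(t):
    return 0.5 * m * (A * omega * math.sin(omega * t + delta))**2   # Eq. (8-25)

def U(t):
    return 0.5 * k * (A * math.cos(omega * t + delta))**2           # Eq. (8-27)

K_avg = cycle_average(K, T)
U_avg = cycle_average(U, T)
expected = m * A**2 * omega**2 / 4.0                                # Eq. (8-31)
print(K_avg, U_avg, expected)
```

Both averages come out equal to one-half of the total energy, whatever value is chosen for the phase constant.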


Fig. 8-17 The kinetic energy K, potential energy U, and total mechanical
energy E of a harmonic oscillator plotted as a function of the displacement x of the
oscillating body from its equilibrium position.


A different, and even more useful, representation of the relation
among K, U, and E for a harmonic oscillator is shown in Fig. 8-17, which
was obtained by plotting U and E versus x, the position of the oscillating
body. You can see from inspection of Eq. (8-26) why the curve of U versus x
is a vertically oriented parabola, with “vertex” at the origin and “width” that
is determined by the force constant k of the spring. The curve of E versus x
is a horizontal line, since E has no x dependence. The possible values of x lie
in the range $-A \le x \le A$, where A is the amplitude of the oscillator. Also
note that the limits of this range are the values of x where U = E. The
oscillating body cannot be found outside these limits because it is necessary to
have $U \le E$ in order to have $K = E - U \ge 0$. (Since $K = mv^2/2$, it is not
possible to have K < 0.) When the oscillating body is anywhere within the
allowed range of x, the value of U is measured by the vertical distance from
the point on the x axis characterizing the body’s location to the U curve, and
the value of K is measured by the remainder of the vertical distance to the E
line.
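The reading of Fig. 8-17 described above can be expressed as a short calculation. In the sketch below, the function names and the sample values of E and k are hypothetical; the turning points follow from setting U = kx²/2 equal to E.

```python
import math

def turning_points(E, k):
    """Positions where U = k x^2 / 2 equals the total energy E."""
    A = math.sqrt(2.0 * E / k)
    return -A, A

def kinetic_energy_at(x, E, k):
    """K = E - U at position x; raises if x lies outside the allowed range."""
    U = 0.5 * k * x**2
    if U > E:
        raise ValueError("x is outside the allowed range -A <= x <= A")
    return E - U

# Hypothetical values: E = 0.04 J, k = 8.0 N/m.
E, k = 0.04, 8.0
x_min, x_max = turning_points(E, k)
K_mid = kinetic_energy_at(0.05, E, k)   # halfway out to the turning point
print(x_min, x_max, K_mid)
```

At half the amplitude the potential energy is only one-quarter of E, so three-quarters of the energy is still kinetic.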

If you begin watching the oscillating body when it is moving in the positive
direction through x = 0, you will see it continue to move to x = A,
slowing down all the while because its kinetic energy is decreasing. At the
turning point x = A, it reverses its direction of motion to turn around and
start back toward x = 0. It picks up speed as its kinetic energy increases,
until it passes x = 0. At that point it has completed one-half of a cycle of
oscillation. The second half-cycle is just the reverse of the first half, taking it
to the other turning point at x = -A and then back to x = 0.

Giving the oscillating body an appropriately timed blow will increase
its total energy. If this happens, the E line in Fig. 8-17 will move up and the
turning points, where U = E, will move farther from x = 0. This means the
amplitude A of the oscillation will increase. Because of the parabolic shape
of the curve described by U, the relation between E and A is $E \propto A^2$, in
agreement with Eq. (8-28b).

Figure 8-17 can also be used to give you information about the force F
acting on the body when it is at location x. To obtain this information, use
Eq. (7-56), the relation between force and potential energy:

$$F = -\frac{dU}{dx} \tag{8-32}$$


Since dU/dx is the slope of the U curve, this tells you that F is the negative of
the slope. Thus F = 0 at x = 0 because the U curve has zero slope there. As
the body moves toward x = A, the slope of the U curve becomes ever more
positive. The body feels an ever-increasing negative force, that is, a force
acting back toward x = 0. The signs are reversed when the body moves
from x = 0 toward x = -A.
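The statement that the force is the negative of the slope of the U curve can be tried out numerically. The sketch below estimates the slope with a central difference; the function `force_from_potential`, the step size h, and the spring constant are hypothetical choices made for illustration.

```python
def force_from_potential(U, x, h=1e-6):
    """Estimate F = -dU/dx at x from the slope of the U curve
    (central difference with a small step h)."""
    return -(U(x + h) - U(x - h)) / (2.0 * h)

k = 8.0                        # hypothetical spring constant, N/m

def spring_U(x):
    return 0.5 * k * x**2      # Eq. (8-26)

for x in (-0.1, 0.0, 0.1):
    print(x, force_from_potential(spring_U, x))
```

For the parabolic U curve the estimate reproduces Hooke's law, F = -kx: zero at the origin, negative for x > 0, and positive for x < 0.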

Another example of a plot of the x dependence of the potential and total
energies of a system is shown in Fig. 8-18. The system consists of two atoms that can
bind together and form a molecule, say, sodium and chlorine. The potential
energy U is plotted versus the separation x between the centers of the two atoms. The
minimum in U occurs at the equilibrium separation $x_e$ of the two atoms in the
molecule. At that separation, the force F acting on either atom is $F = -dU/dx = 0$.
If the separation is smaller than the equilibrium separation, a strong force
$F = -dU/dx > 0$ develops. The force acts in the direction tending to increase the
separation; in other words, the force is repulsive. If x is larger than $x_e$, the atoms feel
an attractive force $F = -dU/dx < 0$. For values of x close to $x_e$, the magnitude of
the force tending to restore the separation to its equilibrium value is a function of
x which is symmetrical about $x_e$. In fact, near $x_e$ the potential energy curve
approximates a parabola, as any smooth curve with a minimum at $x_e$ must. The
corresponding force obeys Hooke’s law for x near $x_e$. The potential energy of a crystal,
such as in a metal, also is a parabolic function of the center-to-center separation of
the atoms comprising the crystal, providing the separation is near its equilibrium
value. This microscopic property is what causes metals, and many other
materials, to obey Hooke’s law on the macroscopic level. Its macroscopic aspect is
studied in Chap. 16.

But for a sufficiently large increase in x, a point is reached where the
attractive force begins to increase less rapidly than Hooke’s law would predict. This is
the microscopic equivalent of the point at which a crystalline material ceases to
obey Hooke’s law. At an even larger value of x, the attractive force between the
atoms begins to become weaker with increasing x. At this point the attractive force
has its ultimate strength. It is the microscopic equivalent of the yield strength of a
material. If x exceeds the dissociation separation $x_d$, the force drops to zero. This
happens because the two atoms have become so widely separated that they no
longer interact. A plot of F versus x is shown in Fig. 8-19 for the U versus x plot
in Fig. 8-18. Inspecting it will help clarify the points made in this paragraph.

If the value of the total energy E is less than the dissociation energy $E_d$, as in
the case $E_1$ illustrated in Fig. 8-18, then the two atoms are bound in the molecule.
That is, their separation distance x will oscillate within the range $x_1 \le x \le x_1'$. At
a somewhat higher value $E_2$, the allowed range increases to $x_2 \le x \le x_2'$. Note that
because U is not symmetrical about $x_e$, the outer limit of the range has moved out
more than the inner limit has moved in. Thus the oscillations in x are not
symmetrical about $x_e$. When x is averaged over a cycle of oscillation, the molecule is
found to have expanded as its total energy has become higher. This is the essential
mechanism operating in the thermal expansion of most materials, since the total
energy of a molecule tends to increase as the temperature of its surroundings increases.
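To see the asymmetry in numbers, you can try a model potential. The sketch below uses a Morse potential, U(x) = D[1 - e^{-a(x - x_e)}]², which is an assumption introduced here only because it has the qualitative shape of Fig. 8-18; it is not the curve used in the text, and the parameters D, a, and x_e are hypothetical. The midpoint of the allowed range is used as a crude stand-in for the cycle average of x.

```python
import math

def morse_turning_points(E, D, a, x_e):
    """Inner and outer turning points for the assumed Morse potential
    U(x) = D * (1 - exp(-a * (x - x_e)))**2, valid for 0 <= E < D."""
    if not 0.0 <= E < D:
        raise ValueError("a bound state requires 0 <= E < D")
    s = math.sqrt(E / D)
    inner = x_e - math.log(1.0 + s) / a
    outer = x_e - math.log(1.0 - s) / a
    return inner, outer

# Hypothetical parameters in arbitrary units; D plays the role of E_d.
D, a, x_e = 1.0, 1.0, 1.0
x1, x1p = morse_turning_points(0.25, D, a, x_e)   # lower energy, like E_1
x2, x2p = morse_turning_points(0.64, D, a, x_e)   # higher energy, like E_2

mid1 = 0.5 * (x1 + x1p)
mid2 = 0.5 * (x2 + x2p)
print(x1, x1p, mid1)
print(x2, x2p, mid2)
```

With these numbers the outer limit moves out several times farther than the inner limit moves in, and the midpoint of the range shifts to larger x as the energy rises.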


Fig. 8-18 This curve shows qualitatively the dependence
of the potential energy U of a diatomic molecule on the
separation x between the centers of its two atoms. The horizontal lines
represent three possible values of the constant total energy
E of the molecule. The equilibrium separation of the atoms
is $x_e$, and the dissociation separation is $x_d$. The dissociation
energy is $E_d$.




Fig. 8-19 Qualitative representation of the force F acting
on an atom of a diatomic molecule along the line between
the centers of its atoms, as a function of the center-to-center
separation x. This curve is drawn from Fig. 8-18 by using
the relation $F = -dU/dx$.


For a total mechanical energy higher than the dissociation energy, such as $E_3$,
the separation x can be any value in an infinite range beginning at $x_3$, and the
two atoms are unbound.

How do the atoms become bound? If they approach each other from a large
initial separation, with total energy $E_3$, they move together with constant relative
speed until $x < x_d$. Their relative speed then increases until $x < x_e$ and
subsequently decreases until the atoms are instantaneously motionless with respect to
each other at $x = x_3$. They then retrace their relative motion in the opposite
direction. But suppose that while x is near $x_e$, the system gets rid of enough energy that
the total energy drops to a value less than $E_d$. Then the two atoms will form a
bound molecule because they will no longer have enough energy to separate. This
can happen if the system emits energy in the form of electromagnetic radiation.
Such radiation emission characterizes many inelastic collisions on the atomic
level.

8-5 LIGHTLY DAMPED OSCILLATIONS

In this section we investigate the decrease in the mechanical energy of a
lightly damped oscillator, whose motion was treated in Sec. 6-5. One way we
could approach this task is to follow the method of Example 8-12. Beginning
with the analytical solutions to the differential equation for a lightly
damped oscillator given in Eq. (6-33),

$$x = Ae^{-(r/2m)t}\cos(\omega t + \delta)$$

we could differentiate to find dx/dt and then find $K = m(dx/dt)^2/2$. Adding
the result to $U = kx^2/2$, we would obtain E = K + U. But these expressions
would have rather complicated forms. It is easier and more instructive to
work directly with the differential equation.

According to Eq. (6-29), the differential equation for a damped
oscillator is

$$m\frac{d^2x}{dt^2} = -kx - r\frac{dx}{dt} \tag{8-33}$$


where $-r(dx/dt)$ is the frictional damping force. From the point of view of
energy, the main difference between the damped oscillator and the
undamped oscillator is that in the former the total mechanical energy E of the
system diminishes as a result of the frictional term $-r(dx/dt)$. To stress
this point, we will recast the differential equation in the form

$$\frac{d}{dt}(\text{total mechanical energy}) = \text{frictional power drain}$$

First, we rewrite Eq. (8-33) with the frictional term isolated on the right
side:

$$m\frac{d^2x}{dt^2} + kx = -r\frac{dx}{dt} \tag{8-34}$$


Each term in this equation has the dimensions of a force. Thus if the term
on the right side were multiplied by dx/dt, the product would then have the
desired dimensions of power because Eq. (8-4) shows us that

(force)(velocity) = power

This suggests that we multiply Eq. (8-34) through by dx/dt to obtain

$$m\frac{dx}{dt}\frac{d^2x}{dt^2} + kx\frac{dx}{dt} = -r\left(\frac{dx}{dt}\right)^2 \tag{8-35}$$

Then we note that the left side of this equation can be rewritten so that it
becomes

$$\frac{d}{dt}\left[\frac{m}{2}\left(\frac{dx}{dt}\right)^2 + \frac{k}{2}x^2\right] \tag{8-36}$$

This can be verified immediately by working out the derivative of the two
terms within the brackets.

Now, the first of the terms in the brackets is the kinetic energy K of the
oscillating body, and Eq. (8-26) says that the second term is the potential
energy U associated with the spring force. So we have

$$\frac{d}{dt}[K + U] = -r\left(\frac{dx}{dt}\right)^2$$

or

$$\frac{dE}{dt} = -r\left(\frac{dx}{dt}\right)^2 \tag{8-37}$$

Here E is the total mechanical energy in the body-and-spring system, and
dE/dt is its rate of change. Thus the differential equation has been recast
into the desired form, and we can identify its right side $-r(dx/dt)^2$ as the
frictional power drain. For an undamped oscillator r = 0, and Eq. (8-37)
shows immediately that in such a system dE/dt = 0 and so E = constant.
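Equation (8-37) can be checked against an actual damped motion. The sketch below assumes the standard underdamped solution x = A e^{-(r/2m)t} cos(ω't) with ω' = sqrt(k/m - (r/2m)²), and compares a numerically estimated dE/dt with -r(dx/dt)². The parameter values and function names are hypothetical.

```python
import math

# Hypothetical lightly damped oscillator.
m, k, r, A = 1.0, 25.0, 0.2, 0.1
gamma = r / (2.0 * m)                        # decay constant of the amplitude
omega_d = math.sqrt(k / m - gamma**2)        # assumed damped angular frequency

def x(t):
    return A * math.exp(-gamma * t) * math.cos(omega_d * t)

def v(t):
    # Exact derivative of x(t).
    return A * math.exp(-gamma * t) * (
        -gamma * math.cos(omega_d * t) - omega_d * math.sin(omega_d * t))

def energy(t):
    return 0.5 * m * v(t)**2 + 0.5 * k * x(t)**2

def energy_rate(t, h=1e-5):
    """Central-difference estimate of dE/dt."""
    return (energy(t + h) - energy(t - h)) / (2.0 * h)

def power_drain(t):
    return -r * v(t)**2                      # right side of Eq. (8-37)

for t in (0.0, 0.3, 1.0, 2.5):
    print(t, energy_rate(t), power_drain(t))
```

The two columns agree at every instant, including the instants at which the body is momentarily at rest and the energy loss rate drops to zero.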

In a lightly damped oscillator the damping term removes mechanical
energy from the system at a rate which varies throughout each oscillation
cycle because $(dx/dt)^2$ varies. Consequently, it is useful to calculate the
average of this rate over one full cycle. To do this, we first solve the
equation $K = m(dx/dt)^2/2$ for $(dx/dt)^2$, obtaining

$$\left(\frac{dx}{dt}\right)^2 = \frac{2K}{m}$$

We substitute this expression into the right side of Eq. (8-37), which yields

$$\frac{dE}{dt} = -\frac{2r}{m}K$$


Taking averages over one cycle and remembering that $-2r/m$ is a constant,
we have

$$\left\langle\frac{dE}{dt}\right\rangle = -\frac{2r}{m}\langle K \rangle$$

Let us assume that the oscillator is quite lightly damped. Then its
amplitude will decrease only slightly from one cycle to the next, and its motion
over any one cycle will not be significantly different from the motion of an
undamped oscillator that happens to be oscillating with the same
amplitude. Thus when the damping is small, it is a good approximation to use
Eq. (8-30):

$$2\langle K \rangle = \langle E \rangle$$

with $\langle E \rangle$ representing the slowly decreasing average total mechanical
energy of the damped oscillator. Then we have

$$\left\langle\frac{dE}{dt}\right\rangle = -\frac{r}{m}\langle E \rangle \tag{8-38}$$


For the purpose of determining how the mechanical energy decreases over
many cycles, we can replace the average value of its derivative by the
derivative of its average value. That is, we can go from Eq. (8-38) to

$$\frac{d\langle E \rangle}{dt} = -\frac{r}{m}\langle E \rangle \tag{8-39}$$

In actuality, the rate of decrease of E fluctuates through each cycle of
oscillation, since energy is lost most rapidly during the parts of the cycle when
the motion is most rapid. This detailed behavior is indicated schematically
by the solid curve in Fig. 8-20. The shaded curve represents the averaged
behavior of E. The justification of replacing the average of the derivative
with the derivative of the average is that the slope of the solid curve,
averaged over a particular cycle, is accurately represented by the slope of
the shaded curve at that cycle. In using Eq. (8-39) we are trying to describe
only the overall time dependence of the mechanical energy of the
oscillator, not the fine structure of this time dependence.

Equation (8-39) says that the average mechanical energy of a lightly
damped oscillator is a function of time having the property that its first
derivative is proportional to its value. What function has this property?
According to Eq. (6-37), an exponential function does. In fact, a form for $\langle E \rangle$
that satisfies Eq. (8-39) is

$$\langle E \rangle = \langle E \rangle_0 e^{-(r/m)t} \tag{8-40}$$


Fig. 8-20 The time dependence of the total mechanical energy of a
lightly damped oscillator.



The quantity $\langle E \rangle_0$ is a constant whose value we can adjust to fit any
particular case by setting it equal to the value of $\langle E \rangle$ at t = 0. To verify that Eq.
(8-40) satisfies Eq. (8-39), we differentiate Eq. (8-40) by applying Eq. (6-37),
the rule for differentiating an exponential. We have

$$\frac{d\langle E \rangle}{dt} = \frac{d}{dt}\left[\langle E \rangle_0 e^{-(r/m)t}\right] = \langle E \rangle_0 \frac{d}{dt}e^{-(r/m)t} = -\frac{r}{m}\langle E \rangle_0 e^{-(r/m)t}$$

Substituting this and Eq. (8-40) itself into Eq. (8-39), we obtain

$$-\frac{r}{m}\langle E \rangle_0 e^{-(r/m)t} = -\frac{r}{m}\langle E \rangle_0 e^{-(r/m)t}$$

Since this equation is certainly valid, and since it was obtained from Eq.
(8-40), we have proved that Eq. (8-40) is a valid solution of Eq. (8-39).

Our conclusion is that $\langle E \rangle$, the average mechanical energy stored in a
lightly damped oscillator, decreases exponentially from one cycle to the
next as frictional effects convert the mechanical energy into thermal
energy. The rapidity of the decrease is governed by the value of r/m, the
coefficient of t in the negative exponent. The shaded curve showing the
behavior of $\langle E \rangle$ in Fig. 8-20 was obtained by plotting Eq. (8-40).
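The cycle-to-cycle exponential decrease can be illustrated with a short calculation. The sketch below again assumes the standard underdamped solution x = A e^{-(r/2m)t} cos(ω't), samples the mechanical energy once per cycle, and compares it with the envelope of Eq. (8-40). All parameter values are hypothetical.

```python
import math

# Hypothetical lightly damped oscillator.
m, k, r, A = 1.0, 25.0, 0.2, 0.1
gamma = r / (2.0 * m)
omega_d = math.sqrt(k / m - gamma**2)        # assumed damped angular frequency
T_d = 2.0 * math.pi / omega_d                # time for one cycle

def energy(t):
    """Mechanical energy K + U of the assumed damped motion at time t."""
    decay = A * math.exp(-gamma * t)
    x = decay * math.cos(omega_d * t)
    v = decay * (-gamma * math.cos(omega_d * t) - omega_d * math.sin(omega_d * t))
    return 0.5 * m * v**2 + 0.5 * k * x**2

def envelope(t, E0):
    """Exponential decay of Eq. (8-40)."""
    return E0 * math.exp(-(r / m) * t)

E0 = energy(0.0)
for n in range(6):
    t = n * T_d
    print(n, energy(t), envelope(t, E0))
```

Sampled at the same point of successive cycles, the energy falls by the same factor e^{-(r/m)T} each time, which is the behavior shown by the shaded curve.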


In a damped oscillator (and in any equivalent system such as an
oscillating electrical system), the ratio of the energy stored to the energy loss in
one cycle of oscillation, multiplied by $2\pi$, is called the quality factor, or Q
factor. That is, we define

$$Q = 2\pi\,\frac{\text{energy stored}}{\text{energy loss per cycle}} \tag{8-41a}$$


The energy stored in the oscillator is given by the quantity $\langle E \rangle$, and the
energy loss per cycle is the magnitude of its average change per unit time
$|\langle dE/dt \rangle|$ multiplied by the time per cycle, the period T. Thus we can write

$$Q = \frac{2\pi\langle E \rangle}{|\langle dE/dt \rangle|\,T} \tag{8-41b}$$

The Q factor is a figure of merit that indicates how thorough a job has been
done in eliminating frictional sources of energy loss from an oscillating
mechanical system (or resistive losses in an electrical system). A high Q factor
means a low loss.

To evaluate Q for a lightly damped oscillator, we use Eq. (8-38) for
$\langle dE/dt \rangle$. This yields

$$Q = \frac{2\pi\langle E \rangle}{(r/m)\langle E \rangle T}$$

or

$$Q = \frac{2\pi m}{rT} \tag{8-42a}$$

Expressed in terms of the angular frequency $\omega = 2\pi/T$, it is

$$Q = \frac{m\omega}{r} \tag{8-42b}$$

Since r specifies the strength of the frictional force acting on the oscillating
body, it is not surprising to find that Q is proportional to 1/r. Can you
explain on physical grounds why Q is proportional to m and also to $\omega$?




It is difficult to make the Q factor in practical mechanical systems much
larger than $10^2$. (In high-quality electrical systems Q factors in excess of $10^6$
can be attained, while in superconducting circuits $Q \to \infty$.) We will find in
Chap. 26 that the Q factor has a very important bearing on the sharpness of
the response of oscillatory systems (such as those in radio receivers) when
they are driven by an applied oscillation.

Example  8-13  evaluates  the  Q factor  of  a lightly  damped  oscillator. 


EXAMPLE 8-13

Again you are the engineer working on the design of the automobile springing
system considered in Example 6-9. With springs but no shock absorbers installed on
the preliminary model, you give the automobile body an initial downward push to
set it into vertical oscillation. Your measurements show that from each cycle to the
next the amplitude of the oscillation decreases by 16 percent. What is the Q factor of
the system?

■ You can find the Q factor from the results of your measurements by using Eqs.
(6-30), (6-33), and (6-38) to write an expression for the position x of a lightly
damped oscillator that involves the quantities m and r. It is

$$x = Ae^{-(r/2m)t}\cos(\omega t + \delta) \tag{8-43}$$

Let $t_1$ be the time at which the body reaches the highest point $x_1$ in its first oscillation
cycle. For this value of t, you know that $\cos(\omega t_1 + \delta) = 1$. So

$$x_1 = Ae^{-(r/2m)t_1}$$

Similarly, if $t_2$ is the time for the highest point $x_2$ in the second cycle, you have
$\cos(\omega t_2 + \delta) = 1$, and

$$x_2 = Ae^{-(r/2m)t_2}$$


Divide the second equation by the first:

$$\frac{x_2}{x_1} = \frac{e^{-(r/2m)t_2}}{e^{-(r/2m)t_1}} = e^{-(r/2m)(t_2 - t_1)}$$

Your measurements show that

$$x_2 = (1 - 0.16)x_1 = 0.84x_1$$

or

$$\frac{x_2}{x_1} = 0.84$$

Also, $t_2 - t_1 = T$, the period of the oscillation. So you have

$$0.84 = e^{-(r/2m)T}$$

Now you solve for m/r in terms of T. First you transpose and take reciprocals,
to obtain

$$e^{(r/2m)T} = \frac{1}{0.84} = 1.19$$

Taking logarithms to the base e, you then have

$$\ln e^{(r/2m)T} = \ln 1.19$$

By the definition of a logarithm, the left side of this equation gives you simply
(r/2m)T = rT/2m. So you have

$$\frac{rT}{2m} = \ln 1.19$$

Evaluating the logarithm, you find

$$\frac{rT}{2m} = 0.174$$

or

$$\frac{2m}{rT} = 5.74$$

To two significant figures, you obtain the result

$$\frac{m}{r} = 2.9T$$


Would you obtain the same result if you used the amplitudes for the fifth and sixth
cycles to evaluate m/r, instead of the amplitudes for the first and second cycles?
Why? What is the advantage of using the first and second cycles for measurement?
Also, can you explain on physical grounds why the coefficient of t in the negative
exponent of Eq. (8-43) is r/2m, whereas it is r/m in Eq. (8-40)?

From Eq. (8-42a), you know that

$$Q = \frac{2\pi m}{rT}$$

Using the result you obtained for m/r, you have

$$Q = \frac{2\pi}{T} \times 2.9T$$

or

$$Q = 18$$

The system does not have a high Q factor. But it is not intended to be a
“high-Q” system. Indeed, you must make the Q factor close to $2\pi$ in order to make
the automobile have a comfortable ride. As Q approaches $2\pi$, the energy loss in one
cycle of oscillation becomes the same as the energy content of the oscillation, and
critical damping is approached. How do you reduce the Q factor of the springing
system?
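The arithmetic of this example can be condensed into a few lines. The sketch below combines 0.84 = e^{-(r/2m)T} with Q = 2πm/(rT); the function name `q_from_amplitude_ratio` is a hypothetical label introduced here for illustration.

```python
import math

def q_from_amplitude_ratio(ratio):
    """Q factor of a lightly damped oscillator from the ratio x2/x1 of
    successive amplitudes, using x2/x1 = exp(-(r/2m) T) and Q = 2 pi m / (r T)."""
    rT_over_2m = math.log(1.0 / ratio)      # (r/2m) T
    m_over_rT = 1.0 / (2.0 * rT_over_2m)    # m / (r T)
    return 2.0 * math.pi * m_over_rT

Q = q_from_amplitude_ratio(0.84)            # amplitude drops 16 percent per cycle
print(f"Q = {Q:.0f}")
```

Notice that neither m, r, nor T needs to be known separately; the fractional loss of amplitude per cycle is enough to fix Q.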


EXERCISES 

Group  A 

8-1.  Upstairs.  You  increase  your  elevation  by  15  m in 
50  s by  running  up  the  stairs  of  a multistory  building. 
Assuming  that  all  the  work  you  do  goes  into  increasing  the 
gravitational  potential  energy  of  the  system  you-plus- 
earth  and  that  your  mass  is  70  kg,  calculate  your  power 
output  in  watts  and  in  horsepower. 

8-2. Niagara Falls. Approximately 180,000 metric
tons of water per minute plunges over Niagara Falls,
which is 48 m high. Find the rate of release of gravitational
potential energy. Express your result in kilowatts.


8-3.  Rain  power.  A heavy  rainstorm  can  drop  10  mm 
of  water  per  hour  on  a locality,  and  good-sized  raindrops 
have  a terminal  speed  of  8 m/s. 

a.  If  the  rain  cloud  is  at  an  average  altitude  of  600  m, 
find  the  rate  at  which  the  rainstorm  releases  gravitational 
potential  energy  per  unit  of  land  area.  Express  your  result 
in  watts  per  square  meter. 

b.  What  fraction  of  the  power  per  unit  area  found  in 
part  a is  in  the  form  of  the  kinetic  energy  of  the  impacting 
raindrops?  What  has  happened  to  the  rest  of  the  initial 
gravitational  potential  energy? 




8-4. Rain on the roof. A gently sloped roof of area
100 m² is exposed to the rainstorm described in Exercise
8-3.

a. What is the time-averaged force the roof must
exert to stop the impacting drops?

b. What force per unit area is this?

c. Express the result of part b as a fraction of the
standard atmospheric pressure, 1.013 × 10⁵ N/m².

8-5.  Measuring  an  engine’s  brake  horsepower.  1 he  device 
shown  in  Fig.  8E-5  can  be  used  to  measure  the  brake 
horsepower  of  an  engine.  The  engine  is  used  to  drive  the 
drum  at  a steady  rate  against  the  frictional  resistance  pro- 


Fig.  8E-5 


vided  by  a heat-resistant  band  that  is  held  against  the 
drum  by  the  two  spring  balances. 

a.  If  the  drum  is  turning  at  a steady  rate,  how  much 
work  is  done  against  friction  each  time  the  drum  rotates? 
Express  your  result  in  terms  of  the  drum  radius  r and  the 
tensions  Tj  and  T2. 

b.  If  the  drum  is  turning  steadily  at  n rotations  per 
second,  find  the  power  P being  delivered  by  the  engine. 

c.  With  a drum  of  radius  15  cm,  an  engine  is  tested  at 
a speed  of  3000  rotations  per  minute  (rot/min).  With  the 
engine at full throttle, tensions T1 = 1100 N and T2 = 200
N are required to keep the drum from turning faster.
What  maximum  power  can  the  engine  provide  at  3000 
rot/min?  Express  your  result  in  horsepower.  This  is  the 
engine's  brake  horsepower  at  3000  rot/min. 

8-6.  A long,  hard  climb.  According  to  the  seventeenth 
edition  of  the  Guinness  Book  of  World  Records,  a New  York 
City  resident  once  climbed  the  1575  steps  of  the  Empire 
State  Building  in  12  min  32  s.  Assume  that  the  climber’s 
mass  was  70  kg.  If  the  stairstep  height  is  24  cm,  at  what 
average  rate  did  the  climber  do  work  against  gravity? 
Express  your  result  in  watts  and  in  horsepower. 

8-7. The advantage of a lever. Sketch a lever with a fulcrum
located at a point between the two ends. The distance
from the fulcrum to the end of the lever where the load
being lifted is connected is r1, and the distance from the
fulcrum to the end where the lifting force is applied is r2.
Find  the  mechanical  advantage  of  this  lever.  Compare 
your  results  with  Eq.  (8-5). 


8-8.  The  advantage  of  a crank.  What  is  the  mechanical 
advantage  of  the  crank  mechanism  shown  in  Fig.  8E-8? 


8-9.  The  advantage  of  a gear  train.  A train  of  gears  is 
used  to  lift  a block  of  mass  M,  as  shown  in  Fig.  8E-9.  In 
each  gear,  the  radius  of  the  central  section  is  r,  and  the 
overall  radius  is  R. 

a. Neglecting friction, what is the mechanical advantage
of the gear train?

b. Suppose R = 5r. Evaluate the mechanical advantage.
What applied force F would be required to lift a
100-kg block?

8-10.  A painter  pulls  a pulley.  A painter  raises  himself 
by  pulling  on  rope  A in  the  pulley  system  shown  in  Fig. 
8E-10. 


Fig.  8E-10 




a. Neglecting friction, what is the mechanical advantage
of the system?

b. If the painter lets someone on the ground hoist
him by pulling on rope A, what will the mechanical advantage
be?


8-11. More pulleys. Neglecting friction, what is the mechanical
advantage of each of the pulley systems shown in
Fig. 8E-11?


8-12. The advantage of a wedge. A wooden wedge is
pushed horizontally to raise a heavy object. All surfaces
are frictionless. Apply the law of conservation of mechanical
energy to determine the mechanical advantage in
terms of the wedge angle θ.


8-13. Winding a windlass. A windlass mechanism is
shown in Fig. 8E-13.

a. Neglecting friction, what is its mechanical advantage
if the radius of the lower gear is N times the radius of
the upper gear?

b. Evaluate your results for N = 3.0, R = 40 cm, and
r = 5.0 cm.

Fig. 8E-13


8-14.  The  advantage  of  a jackscrew.  In  a jackscrew,  a 
horizontal  arm  of  length  R extends  the  screw  and  raises 
the  jack  a distance  p for  each  complete  turn  of  the  arm. 
The  quantity  p is  called  the  pitch  of  the  screw. 

a. Neglecting friction, what is the mechanical advantage
of the jackscrew in terms of p and R?

b.  Evaluate  the  mechanical  advantage  for  R = 1.0  m 
and  p = 1.0  cm. 

8-15.  A safety  valve.  A safety  valve  for  a steam  boiler 
is  shown  in  Fig.  8E-15.  The  total  mass  of  the  arm  and  plug 
is  small  compared  to  M. 


Fig.  8E-15 


a.  What  force  must  the  steam  exert  on  the  plug  to 
open  the  valve? 

b.  Evaluate  your  result  for  d = 3.0  cm,  D = 30  cm, 
and  M = 5.0  kg. 

c. The plug bottom has an area of 3.0 cm² exposed to
the steam. What is the required force per unit area?

d. The force per unit area exerted by a fluid is called
the fluid pressure. A common unit for pressure is the standard
atmosphere (atm), whose SI equivalent is 1.013 × 10⁵
N/m². Express the result of part c in atmospheres.
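
The unit conversion described in part d can be sketched in a few lines of Python. This is only an illustration: the function names and the sample pressure are assumptions introduced here, not values taken from the exercise.

```python
# Standard atmosphere in pascals (N/m^2), the value quoted in the text.
ATM_IN_PA = 1.013e5


def pa_to_atm(pressure_pa):
    """Convert a pressure in pascals to standard atmospheres."""
    return pressure_pa / ATM_IN_PA


def atm_to_pa(pressure_atm):
    """Convert a pressure in standard atmospheres to pascals."""
    return pressure_atm * ATM_IN_PA


# Arbitrary sample value (an assumption for illustration only).
sample_pa = 2.026e5
print(f"{sample_pa:.3e} Pa = {pa_to_atm(sample_pa):.3f} atm")
```

Dividing by the constant keeps the conversion factor in one place, so the same helper serves any exercise that asks for a result in atmospheres.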

8-16. Head-on collision with a stationary object. A billiard
ball moving with velocity v1i makes a head-on collision with
another billiard ball. The second ball is stationary before
the collision; that is, v2i = 0. Both balls have the same
mass. Assuming the collision to be elastic, predict the final
velocities of both balls, v1f and v2f. Explain how your results
fit in with the analysis of Eq. (8-14) given in the text.

8-17.  Where  did  it  go?  What  fraction  of  the  initial 
kinetic  energy  was  lost  when  the  bullet  in  Example  8-11 
stopped  in  the  wood  block  of  the  ballistic  pendulum? 
Where  did  the  lost  kinetic  energy  go? 

8-18. Splat! Imagine that you are hit by a rotten tomato
thrown by a prankster. The tomato has the same
density  as  water  and  is  roughly  spherical,  with  a radius  of 
3.0  cm.  It  strikes  you  at  a speed  of  10  m/s  and  flattens,  so 




that  it  comes  to  rest  in  approximately  the  time  taken  to 
travel  its  own  diameter  at  its  initial  speed. 

a.  What  impulse  must  be  supplied  to  the  tomato? 

b.  During  what  time  interval  is  this  impulse  deliv- 
ered? 

c.  What  is  the  average  contact  force  between  you  and 
the  tomato  during  this  interval? 

8-19. A hailstone strikes. A hailstone of radius 5.0 mm
and density 0.90 g/cm³ strikes the roof of a parked car.
The  hailstone  hits  with  a speed  of  10  m/s  and  rebounds 
to  a height  of  0.20  m. 

a.  What  fraction  of  its  initial  kinetic  energy  is  lost  in 
the  impact? 

b.  Assume  that  during  the  collision  the  hailstone  is 
decelerated  to  rest  in  the  time  it  would  take  to  travel  its 
own  diameter.  What  average  force  must  the  roof  exert? 

c. Assuming that this force must be provided by a circular
area equal in radius to the hailstone, what is the
pressure? Express your result in newtons per square
meter and in atmospheres. (See Exercise 8-15.)

8-20. Experimental determination of a pendulum's Q
factor. Construct a pendulum by suspending any conveniently
available compact object of mass about 0.25 kg
from  the  lower  end  of  a string  approximately  0.5  m long. 
When  the  amplitude  of  the  oscillation  is  about  0.10  m, 
make  measurements  that  will  allow  you  to  determine 
approximately  the  Q factor  of  the  pendulum,  in  the 
manner  employed  in  Example  8-13.  Compare  your  result 
with  the  Q factor  of  the  system  studied  in  the  example, 
and  explain  the  difference  in  the  values  of  the  two  Q 
factors. 

Group  B 

8-21.  Flea  power.  According  to  the  Guinness  Book  of 
World  Records,  the  common  flea  can  perform  vertical 
jumps as high as 20 cm (or about 130 times its own
height!). It is reported that in the process the flea subjects
itself to an upward acceleration of about 200g!

a. What launch speed is required to reach the quoted
height, neglecting air resistance? Neglect the small vertical
displacement during launch.

b.  If  the  launch  involves  a uniform  200g  acceleration, 
how  much  time  does  the  flea  use  to  launch  itself?  How  far 
does  it  move  during  launch?  Is  your  answer  consistent 
with  the  size  of  a flea,  as  given  implicitly  above? 

c.  Make  a reasoned  estimate  for  the  mass  of  a flea. 

d.  Use  the  mass  estimate  to  calculate  the  flea’s  initial 
kinetic  energy  and  the  average  power  expenditure  during 
launch. 

e.  Compare  the  power  output  per  kilogram  of  a 
jumping  flea  with  the  power  output  per  kilogram  of  the 
champion  stair  climber  of  Exercise  8-6. 

8-22. Rolling freight. Each car of a 50-car freight train
has a mass of 4.0 × 10⁴ kg. The coefficient of static friction
for iron wheels on iron rails is 0.0040. The train is being
pulled up a one percent grade, so that the elevation increases
by 10 m per kilometer of horizontal displacement.


Find  the  power  required  to  pull  the  train  at  a steady  speed 
of  15  m/s.  Express  your  result  in  kilowatts. 

8-23. Atwood's machine and energy conservation. Use energy
considerations to analyze the motion of the Atwood
machine shown in Fig. 8E-23. The string and pulley have
negligible mass, the pulley is frictionless, and the mass m1
is greater than m2. The system is initially held at rest.

Fig.  8E-23 


a. If you take the tabletop on which m2 rests as the
reference level, what is the initial total energy of the
system?

b. The system is released, and m1 descends to the
table. Write an expression for the total energy of the
system just before m1 strikes the table.

c. Use the results of parts a and b to determine the
speed of the bodies just before m1 strikes the table.

d. When m1 strikes the table, the string goes slack.
Use energy considerations to determine how far m2 rises
after the string goes slack.

8-24. A self-locking machine. An inclined plane can be
regarded as a machine for doing work against gravity. Because
of the kinetic friction between the object being elevated
and the surface of the incline, the applied force
must do an amount of work W that exceeds the increase
ΔU of gravitational potential energy. Consider the motion
of a load up an inclined plane. Prove that if W > 2ΔU, the
incline is self-locking. That is, if the applied force is removed,
the object will not slip back down the incline.

8-25. An elastic head-on collision. A body of mass m1
traveling with velocity v1i makes a head-on elastic collision
with a stationary body of mass m2. The velocities after
collision are v1f and v2f.

a. Calculate v1f and v2f in terms of v1i.

b. Calculate the ratio of the kinetic energy transferred
to m2 to the original kinetic energy.




c.  For  what  value  of  m2  is  all  the  energy  transferred  to 
the  stationary  body? 

d. If m1 were a neutron and m2 were a carbon atom
whose mass is about 12 times the neutron's mass, what
fraction of the neutron's energy would be transferred to
the carbon atom in the head-on collision?

8-26. Elastic collisions within an isolated system. Three
bodies form an isolated system. Their masses are m1,
m2 = 2m1, and m3 = 3m1. They have different directions of
motion, but they all have the same initial speed v0. One or
more elastic collisions occur between pairs of the bodies,
which otherwise do not interact. Use energy considerations
to find the maximum possible final speed of each of
the  three  bodies.  (It  should  be  quite  evident  that  not  all 
the  bodies  could  have  their  maximum  speeds  in  the  same 
situation.) 

8-27.  Colliding  pucks. 

a. Make measurements on the strobe photo of the
collision between identical plastic pucks in Fig. 8-11, to obtain
the initial and final kinetic energies of the two pucks.
Assume that the time interval between strobe light flashes
is 0.200 s, that the distance between adjacent positions of
the center of puck 1 before the collision is 0.100 m, and
that each puck has a mass of 0.250 kg. Evaluate the kinetic
energy loss ΔK from your measurements of K1i, K2i, K1f,
and K2f.

b. Apply Eq. (8-18b) to calculate ΔK from Ku, Ku,
and the value you measure for the scattering angle φ.
Compare  your  results  with  those  you  obtained  in  part  a. 

8-28. Coefficient of restitution. For head-on collisions,
the coefficient of restitution e for a pair of bodies is defined
as the ratio of the magnitude of the relative velocity
vf after the collision to the magnitude of vi, that before
the collision. Thus, e = vf/vi. The usefulness of the coefficient
e lies in the fact that, for given materials and/or
objects, it is approximately constant over a reasonable
range of impact speeds. The value of the coefficient can
be determined by clamping one body, dropping the other
object onto it from a known height hi, and measuring
the height of rebound hf.

a. Find an expression for e in terms of hi and hf.

b.  It  is  observed  that  a Ping-Pong  ball  rebounds  24 
cm  when  dropped  onto  a hardwood  table  from  a height  of 
30  cm.  What  is  the  coefficient  of  restitution? 

c. A ball is dropped onto a fixed horizontal surface
from height hi. The coefficient of restitution is e. Find the
total distance traveled by the ball as it bounces before it
comes to rest on the surface. Express your result in terms
of hi and e.

d.  Evaluate  the  total  distance  traveled  by  the  Ping- 
Pong  ball  described  in  part  b. 

8-29.  Bouncing  down  the  stairs.  A ball  is  bouncing  down 
a flight  of  stairs.  The  coefficient  of  restitution,  as  defined 
in Exercise 8-28, is e. The height of each step is d, and the
ball descends one step at each bounce. After each bounce
it rebounds to a height h above the next lower step. The
height h is large enough compared with the width of a step
that the impacts are effectively head on. Show that
h = d/(1 − e²).

8-30. On the rebound. A body of mass m collides with a
frictionless surface. Its initial speed is vi, and it strikes the
surface at an angle θi. It bounces from the surface, but the
collision is not elastic, so that after the impact the magnitude
of the normal component of the velocity is only a
fraction e of the original value vi sin θi.

a.  Find  the  impulse  delivered  by  the  surface  to  the 
body. 

b. Find the angle θf at which the body leaves the surface.

c. Find the speed at which the body leaves the surface.

d. Express the ratio of final to initial kinetic energy in
terms of e and θi.

8-31. Tennis power. A powerful tennis player can
serve the ball at 50 m/s (more than 110 mi/h!). The mass
of a tennis ball is 57 g.

a.  What  is  the  impulse  delivered  to  a tennis  ball  as  it  is 
served? 

b. Estimate the duration of contact between the
tennis racket and the ball. To do this, estimate the diameter
of a tennis ball and assume that it is (temporarily)
squashed to half this value by the racket.

c. Use your results for parts a and b to obtain estimates
for the contact force between ball and racket and
for the rate of increase of kinetic energy of the ball during
the serve. Express your answer in kilowatts.

8-32. The right impulse. An astronaut is doing maintenance
work outside a space station. She is coasting along
the station at a speed of 1.00 m/s. She wishes to change
her direction of motion by 90° and to increase her speed
to 2.00 m/s. Her total mass is 100 kg, including her spacesuit
and rocket belt, which provides a thrust of 50 N.

a.  Find  the  magnitude  and  direction  of  the  impulse 
needed  to  accomplish  the  desired  change  in  motion. 

b.  What  is  the  shortest  time  in  which  the  astronaut 
can  complete  the  change  in  motion?  How  must  the  rocket 
be  pointed? 

c. Suppose the astronaut makes the change by decelerating
to rest, turning the rocket exhaust by 90°, and
then accelerating up to the desired final speed. How long
would this take? How much rocket fuel is used, compared
to the minimum?

Group  C 

8-33.  Energy  of  a harmonic  oscillator. 

a. Find an expression for the average kinetic energy
⟨K⟩, over one period T, of the total mechanical energy of a
harmonic oscillator by evaluating the integral in the expression

$$\langle K \rangle = \frac{1}{T} \int_0^T K(t)\, dt$$

where K(t) is given by Eq. (8-25). The trigonometric identity
sin²θ = (1 − cos 2θ)/2 will be useful in evaluating the
integral. Compare your result with the expression for ⟨K⟩
given in Eq. (8-31).

b. Carry out a similar evaluation of ⟨U⟩ by using the
expression for U(t) given by Eq. (8-27). The identity
cos²θ = (1 + cos 2θ)/2 will be useful. Compare this
result with the expression for ⟨U⟩ given in Eq. (8-31).

8-34.  Energy  loss  in  a lightly  damped  oscillator.  Apply  the 
method  of  Example  8-12  to  investigate  the  decrease  in 
mechanical  energy  of  a lightly  damped  oscillator  whose 
coordinate  x is  given  by  the  expression 

$$x = A e^{-(r/2m)t} \cos(\omega t + \delta)$$

That is, evaluate E = K + U = m(dx/dt)²/2 + kx²/2, and
then compute ⟨E⟩, the average of E over one period of the
oscillation. Compare your result with Eq. (8-40). What assumptions
must you make in order to bring your result
into complete agreement with Eq. (8-40)? How do these
assumptions relate to those used in deriving Eq. (8-40)?

8-35. Collisions, collisions. A body of mass m1 and
velocity v1i approaches a stationary body of mass m2 =
Cm1. After the collision, the two bodies have a relative
speed |v2f − v1f| = e|v1i|, where e is some constant.

a. For perfectly elastic collisions, in which the total
kinetic energy K does not change, what is the value of e?
What values of e describe inelastic collisions, in which
Kf < Ki? Justify your answers.

b. Show that

$$\cos\theta = \frac{(1 - e^2)\, v_{1f}^2 + (1 - e^2 C^2)\, v_{2f}^2}{2 (1 + e^2 C)\, v_{1f} v_{2f}}$$

Here θ is the angle between the final velocities, v1f and v2f.

c.  Use  the  result  of  part  b to  prove  the  following 
statements: 

(1) When a body collides elastically with a more massive
body that is initially at rest, the final velocities form
an obtuse angle.

(2) When a body collides elastically with a less massive
body that is initially at rest, the final velocities form
an acute angle.

(3) When a body collides inelastically with an initially
stationary body of equal or smaller mass, the final
velocities form an acute angle.

8-36.  Collision  with  a “massive”  body.  An  alpha  particle 
(helium  nucleus)  enters  a chamber  containing  mercury 
vapor.  The  alpha  particle  has  an  initial  kinetic  energy  of 
3.0 × 10⁶ electron volts (eV). (The electron volt is a unit
of energy commonly used in describing motion in the
microscopic world: 1 eV = 1.602 × 10⁻¹⁹ J.) The alpha
particle  is  scattered  through  90°  by  an  elastic  collision 
with  the  nucleus  of  a mercury  atom.  The  mercury  atom 
was  essentially  at  rest  before  being  struck.  Its  mass  is  50.0 
times  that  of  the  alpha  particle. 


a.  In  what  direction  does  the  mercury  atom  recoil 
after  the  collision? 

b.  Find  the  final  kinetic  energy  of  the  mercury  atom 
(1)  as  a fraction  of  the  initial  kinetic  energy  of  the  alpha 
particle  and  (2)  in  electron  volts. 
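
The electron-volt conversion quoted in this exercise can be checked with a short Python sketch. The function and variable names are assumptions chosen for illustration; the conversion factor and the 3.0 × 10⁶ eV energy are the values given in the exercise.

```python
# Conversion factor quoted in the exercise: 1 eV = 1.602e-19 J.
EV_IN_J = 1.602e-19


def ev_to_joule(energy_ev):
    """Convert an energy in electron volts to joules."""
    return energy_ev * EV_IN_J


def joule_to_ev(energy_j):
    """Convert an energy in joules to electron volts."""
    return energy_j / EV_IN_J


# Initial kinetic energy of the alpha particle, as stated in the exercise.
alpha_energy_j = ev_to_joule(3.0e6)
print(f"3.0e6 eV = {alpha_energy_j:.2e} J")
```

For 3.0 × 10⁶ eV this gives roughly 4.8 × 10⁻¹³ J, which shows how small particle energies are on the everyday joule scale.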


8-37. Collision with a “light” body. An alpha particle
enters a chamber containing atomic hydrogen. The alpha
particle has an initial kinetic energy of 3.0 × 10⁶ eV
(= 4.8 × 10⁻¹³ J). It collides elastically with the nucleus of
a hydrogen atom (a proton). The hydrogen atom is essentially
at rest prior to the collision. The mass of a proton
is 0.25 times the mass of the alpha particle.

a. What is the maximum possible angle through
which the alpha particle can be scattered? If this maximum
deflection occurs, in what direction does the proton
recoil?

b. For maximum deflection of the alpha particle, find
the recoil energy of the proton (1) as a fraction of the initial
kinetic energy of the alpha particle and (2) in electron
volts.

8-38. Hockey puck. A hockey puck strikes the base of a
vertical wall. It impacts at speed vi and angle θi. In addition
to the normal force N(t) that the wall exerts on the
puck, there is a kinetic frictional force μk N(t) acting parallel
to the wall surface. Neglect any effects related to
the spinning of the puck, and assume that the final magnitude
of the normal velocity component vf sin θf is equal to
γ vi sin θi, where γ is a constant whose value is less than 1.

a. Show that if

$$\theta_i > \tan^{-1} \frac{1}{\mu_k (1 + \gamma)}$$

then θf = 90°.

b. Suppose that

$$\theta_i < \tan^{-1} \frac{1}{\mu_k (1 + \gamma)}$$

Show that θf satisfies the equation

$$\gamma \cot\theta_f = \cot\theta_i - (1 + \gamma)\mu_k$$

c. The largest reasonable value of γ is γ = 1.
Assuming that γ = 1, independent of vi and θi, find the
ratio of the final kinetic energy to the initial kinetic energy
as a function of θi and μk.


8-39.  Pushed  apart.  A spring  of  negligible  mass  having 
spring  constant  k is  held  by  a latch  between  two  bodies,  1 
and  2,  having  masses  mx  and  m2,  as  shown  in  Fig.  8E-39a. 
The  spring  is  compressed  through  a distance  d.  Initially 
the entire assembly is moving with velocity vi = vi x̂, and
the line joining body 1 to body 2 makes an angle α with vi.
At  t = 0 the  latch  breaks,  allowing  the  spring  to  push 
apart  the  bodies. 

Fig. 8E-39 (a) t < 0. (b) t > tr.




a. Show that the spring reaches its relaxed length at
tr = ¼(2π/ω), where ω² = k(m1 + m2)/(m1 m2).

b. The spring is not bonded to the two bodies, so
they move freely after t = tr. Find the total kinetic energy
Kf of the bodies for t ≥ tr.

c. Find the magnitude and direction of the final relative
velocity uf = v2f − v1f in terms of ω, d, and α. Express
uf in magnitude-and-direction form.

d. The final motion of the two bodies is illustrated in
Fig. 8E-39b. Solve for the speeds v1f and v2f and the angles
β1 and β2 in terms of m1, m2, ω, d, α, and vi.

e. Carefully evaluate tr, Kf, uf, v1f, v2f, β1, and β2 for
the following case: k = 480 N/m, m1 = 3.00 kg, m2 = 2.00
kg, d = 0.200 m, α = 110°, and vi = 2.00 m/s. When you
have obtained the numerical values, check these by confirming
that the final kinetic energy is numerically equal
to the initial total energy.

8-40.  Oscillating  system  with  constant  frictional  force. 
Consider an oscillating system which is subject to a frictional
retarding force Fr whose magnitude is independent
of the speed, that is, Fr = −f v̂, with f a constant. Assume
that the damping is small, so that the motion is oscillatory
with gradually decreasing amplitude A(t). That is, the position
of the body is given by x(t) = A(t) cos(ωt + δ).

a. Obtain an expression for dE(t)/dt for the oscillator.
In obtaining an expression for v(t), neglect the term containing
the small factor dA(t)/dt.

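b. Integrate your expression over one cycle to show
that

$$\left\langle \frac{dE}{dt} \right\rangle = -\gamma_0 \langle E \rangle^{1/2}, \qquad \gamma_0 = \frac{2\sqrt{2}}{\pi m^{1/2}}\, f$$

c. Show that the equation given in part b is satisfied
by

$$\langle E \rangle = \langle E \rangle_0 \left( 1 - \frac{\gamma_0 t}{2 \langle E \rangle_0^{1/2}} \right)^2$$

d. Using the definition

$$Q = \frac{\omega \langle E \rangle}{\left| \langle dE/dt \rangle \right|}$$

show that the Q value of this oscillator depends on the amplitude
of the oscillation.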

e. Does the Q value of this oscillator increase or decrease
as the oscillation is gradually damped down?

f. The results obtained above are dependent on the
assumption of light damping. This assumption is correct
only if Q ≫ 1. What restriction does this inequality place
on  the  amplitude  of  oscillation?  Interpret  this  restriction 
in  terms  of  a comparison  between  the  damping  force  and 
the  oscillator's  restoring  force. 




9

Rotational
Motion, I


9-1 ROTATIONAL KINEMATICS FOR A FIXED AXIS

As was discussed briefly in Sec. 2-1, there are two basically different ways
an object can move: one is to change its location, the other is to change its
orientation. In some cases that we have considered, both the location and
the orientation of an object change. A puck tied to a string and orbiting on
an  air  table  is  an  example.  The  orientation  of  the  puck  changes,  as  well  as 
its  location.  This  is  because  as  it  moves  around  the  air  table,  the  string  to 
which  one  side  of  the  puck  is  attached  is  continually  changing  orientation. 
But  we  concentrated  on  the  changes  in  location  of  the  puck,  paying  no 
attention  to  its  changes  in  orientation.  Many  of  the  objects  that  we  have 
studied move by changing only their location. We now give the name translation
(from the Latin word meaning “carried across”) to motion involving
change in location with no change in orientation. When an object experiences
translational motion, every point in the object has the same displacement in
any  small  time  interval  as  every  other  point. 

A brick flying through the air without changing its orientation provides
an example of translational motion. See Fig. 9-1. Since every point
in  the  brick  moves  in  the  same  way  as  every  other  point,  at  each  instant  the 
location  of  a single  point  in  the  brick  (such  as  its  center)  can  be  used  to  give 
the  location  of  the  brick.  Up  to  now  we  have  always  described  the  motion  of 
a body  in  terms  of  the  motion  of  a single  point  in  the  body.  This  means 
that  we  have  assumed  either  that  the  body  moves  in  translation,  or  that  if  it 
also  changes  its  orientation  these  changes  can  be  ignored  (as  for  the  puck 
orbiting  on  the  air  table).  Thus  we  have  been  using  translational  kinematics 
to  describe  motion  and  translational  mechanics  to  explain  it. 

Fig. 9-1 A brick undergoing translation with respect to
the reference frame of the viewer. It moves without
changing its orientation. As a consequence, every point
in the brick has the same displacement in any time interval
as every other point. Its motion thus can be described
completely in terms of the motion of a single point in
the brick, such as the point at its center.

Fig. 9-2 A child's top. With its tip resting
on the floor, the top rotates rapidly
about an axis along its line of symmetry.
The arrow circling the axis indicates
the direction of rotation about the axis.
A point on the rotation axis at the tip
remains at a fixed location in the reference
frame of the viewer. But the alignment
of the axis in the reference frame
changes slowly in such a way that the
axis traces out a cone about the vertical,
as indicated by the other arrow.

In this chapter and the next we will study the kind of motion in which
the orientation of an object changes. An example of such motion is depicted
in Fig. 9-2. The child's top shown there moves by changing its orientation,
while the location of one point (the tip that is supported by the
floor) remains fixed. The object is said to be experiencing rotation. In rotational
motion every point in an object moves through an arc of a circle in any
small  time  interval,  and  the  centers  of  all  these  circles  lie  along  a straight  line.  The 
straight  line  is  called  the  rotation  axis.  For  a case  like  that  in  Fig.  9-2,  one 
point  on  the  rotation  axis  remains  at  a fixed  location  because  the  rotation 
axis  must  pass  through  the  one  fixed  point  in  the  object.  But  the  rotation 
axis  is  free  to  change  its  alignment  in  space  because  no  other  point  in  the 
object  is  fixed.  It  is  also  possible  for  an  object  to  move  by  rotating  about  an 
axis  which  changes  its  alignment  in  space,  with  no  point  on  the  rotation  axis 
remaining  at  a fixed  location.  This  is  what  the  earth  does.  The  earth  rotates 
daily about its polar axis. The polar axis changes alignment in space cyclically,
in much the same way as a child's top, taking 25,920 yr to complete
each  cycle.  No  point  on  this  rotation  axis  remains  fixed  because  the  earth 
also  moves  in  its  annual  orbit  about  the  sun. 

In  Chap.  10  we  consider  the  motion  of  the  earth  and  of  other  objects 
rotating  about  an  axis  that  passes  through  no  fixed  points.  We  will  see  that 
such  motion  can  be  treated  by  considering  it  to  be  a combination  of  rotation 
about  an  axis  going  through  one  fixed  point  and  translation  of  that  point. 
In  Sec.  9-2  we  take  up  the  simpler  case  of  rotation  of  an  object  about  an  axis 
that  actually  does  pass  through  one  fixed  point.  In  this  section  we  start  with 
the  simplest  case,  in  which  an  object  rotates  about  an  axis  going  through 
two  fixed  points. 


The  flywheel  in  Fig.  9-3  exemplifies  a body  rotating  about  an  axis  that 
is  completely  fixed  in  the  reference  frame  from  which  the  body  is  observed. 
The  axis  is  fixed  because  the  bearings  at  the  two  ends  of  the  flywheel  axle 
are attached to a supporting structure. We consider the motion of the flywheel
for the purpose of developing rotational kinematics for a fixed axis.
That  is,  we  develop  the  ideas,  language,  and  relations  needed  to  describe  the 
motion,  leaving  until  later  the  task  of  developing  the  rotational  mechanics 
needed  to  explain  it. 




Fig.  9-3  A flywheel  rotating  about  an 
axis  which  has  a fixed  alignment  and  is 
in  a fixed  location,  in  the  reference  frame 
of  the  viewer.  The  axis  is  fixed  because 
the  axle  of  the  flywheel  passes  through 
rigidly  supported  bearings  at  its  two 
ends.  The  figure  shows  the  motion  of 
two  small  pieces  of  the  flywheel  while 
it rotates through the angle dφ in a time
dt. The direction of rotation, or, as it
usually is said, the sense of rotation, is
counterclockwise from the viewpoint illustrated.
According to the standard
sign convention for angles, the angle dφ
specifying the counterclockwise rotation
is counted as positive. (But note that as
seen from the other side of the wheel it
appears to be rotating clockwise, and
the angle dφ would be counted as negative.
Hence, if you are describing the
sense of a rotation to someone else by
giving  the  sign  of  the  rotation  angle, 
you  must  be  sure  that  both  of  you  agree 
on  the  side  from  which  the  rotating  body 
is  to  be  observed.) 


A flywheel  is  also  a fine  example  of  a rigid  body — a body  whose  con- 
stituent parts  do  not  change  their  locations  relative  to  one  another  while 
the  body  moves  as  a whole.  Most  of  the  objects  we  will  be  concerned  with  in 
studying  rotational  motion  are  rigid  bodies. 

When the flywheel rotates through a small angle dφ about its axis, the
small piece of the flywheel labeled 1, located a distance r_1 from the axis, is
displaced an amount ds_1 along an arc lying in the plane of the wheel and
centered on its axis. Measuring the rotation angle in radians makes it
possible to write

$$ds_1 = r_1 \, d\phi$$

By  continuing  to  follow  the  motion  of  piece  1,  we  could  determine  its  veloc- 
ity and  acceleration.  Then  we  could  multiply  the  acceleration  by  the  mass 
of  piece  1 to  determine  the  force  which  must  be  exerted  on  the  piece.  In 
principle,  the  same  thing  could  be  done  for  all  the  other  small  pieces  of  the 
body. 

The difficulty with this approach is made clear if you consider the
simultaneous motion of piece 2, which is located at a different distance r_2
from the axis of the flywheel. Although the rotation angle dφ is the same
for all parts of the wheel, the displacement ds_2 is different from the
displacement ds_1. So, too, are all the related kinematical quantities, such as the
acceleration.  As  a consequence,  if  piece  1 and  piece  2 have  the  same  mass, 
they  do  not  experience  the  same  force.  This  means  that  it  would  take  an 
unreasonable  amount  of  labor  to  make  direct  application  of  the  rules  of 
translational  mechanics  to  treat  the  motion  of  the  many  small  pieces  of  the 
body,  even  in  this  simple  case  where  the  body  is  a highly  symmetrical  disk. 

But while the displacement ds is different for parts of the wheel at
different distances r from its axis, the rotation angle dφ, given by

$$d\phi = \frac{ds}{r} \tag{9-1}$$

is  the  same  for  every  part  of  the  wheel.  Indeed,  rotation  is  perceived  intui- 
tively in  these  very  terms.  It  is  the  fact  that  all  parts  of  a rigid  body  rotate 
simultaneously  through  the  same  angle  that  leads  us  to  say  that  the  flywheel 
is  rotating  as  a whole.  So  what  we  will  try  to  do  is  to  develop  a rotational 
kinematics — and  then  later  a rotational  mechanics — in  which  rotation 
angle,  rather  than  displacement,  is  the  basic  quantity  measured.  In  treating 
rotation  about  a fixed  axis,  we  will  represent  rotation  angles,  and  related 
quantities,  as  signed  scalars. 
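
As a quick numerical illustration of Eq. (9-1), the following Python sketch computes the arc length ds = r dφ for pieces at several distances from the axis and then recovers the angle ds/r for each piece. The rotation angle and the radii are arbitrary values assumed for illustration; they are not taken from Fig. 9-3.

```python
d_phi = 0.010                      # small rotation angle in radians (assumed value)
radii = [0.100, 0.250, 0.400]      # distances from the axis in meters (assumed values)

# Each piece moves along an arc of length ds = r * d_phi,
# so the displacement differs from piece to piece.
arc_lengths = [r * d_phi for r in radii]

# Dividing each displacement by its own radius, as in Eq. (9-1),
# recovers one common rotation angle.
angles = [ds / r for ds, r in zip(arc_lengths, radii)]

for r, ds, angle in zip(radii, arc_lengths, angles):
    print(f"r = {r:.3f} m   ds = {ds:.5f} m   d_phi = {angle:.4f} rad")
```

The displacements printed in the middle column differ, while the last column shows the same angle for every piece, which is the reason for taking the angle as the basic quantity.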

As the flywheel rotates, it takes a certain time dt for any point on the
wheel to pass through the angle dφ. We define the angular velocity ω of
the wheel to be

$$\omega = \frac{d\phi}{dt} \tag{9-2}$$

in direct analogy to the definition of the familiar translational velocity
v = dx/dt.

By convention, the rotation angle dφ has a positive value if the rotation
described appears to be counterclockwise from an agreed-upon point of
view (that is, as seen from a particular side of the wheel), and dφ is
negative if the rotation appears to be clockwise from this point of view.
Since the quantity dt has a positive value, the sign of ω is the same as the
sign of dφ. And since the angle dφ is measured in the dimensionless units
radians, the units for ω must be reciprocal seconds (s⁻¹). What amounts to
the same thing, the units for ω are often quoted as radians per second (rad/s).

If the wheel is speeding up or slowing down, ω will change. We define
the wheel’s angular acceleration α to be

$$\alpha = \frac{d\omega}{dt} = \frac{d^2\phi}{dt^2} \tag{9-3}$$

The units for α are reciprocal seconds squared (s⁻²), or radians per second
squared (rad/s²). Here again there is a complete analogy with the
translational quantity a = dv/dt = d²x/dt². Just as is true of the relation between
v and a, when ω is positive and its magnitude is increasing, then α is positive.
When ω is positive and its magnitude is decreasing, then α is negative. What
is the sign of α when ω is negative and its magnitude is increasing? What
is it if ω is negative and its magnitude is decreasing?
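
The sign cases can be explored with a short Python sketch. It relies on the identity d(ω²)/dt = 2ωα, which follows from Eq. (9-3) by the chain rule but is not stated in the text, so treat it as an added assumption. The function name and the sample values are purely illustrative.

```python
def angular_speed_trend(omega, alpha):
    """Classify how the magnitude of omega is changing at one instant.

    Uses the sign of omega * alpha, which is the sign of d(omega**2)/dt.
    """
    product = omega * alpha
    if product > 0:
        return "magnitude increasing"
    if product < 0:
        return "magnitude decreasing"
    return "undetermined by this rule"   # omega or alpha is zero

# The four sign combinations discussed in the text (sample values assumed).
cases = [(+2.0, +0.5), (+2.0, -0.5), (-2.0, -0.5), (-2.0, +0.5)]
for omega, alpha in cases:
    print(f"omega = {omega:+.1f} rad/s, alpha = {alpha:+.1f} rad/s^2: "
          f"{angular_speed_trend(omega, alpha)}")
```

Read in reverse, the output gives the sign of α for each combination of the sign of ω and the trend of its magnitude.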


In Example 9-1 you will be led through a derivation of all the
kinematical equations relating the angular coordinate φ of a body rotating about a
fixed axis, its angular velocity ω, its angular acceleration α, and the time t
for the cases of constant ω and of constant α. The process will be much
easier and quicker than the derivation of the analogous translational equations
in Chap. 2. First, you will be closely guided by analogy. Second, you will
carry out the process by integration, in the order α → ω → φ, rather than
in the reverse order of Chap. 2.


EXAMPLE 9-1

a. Derive the relations among φ, ω, and t for constant angular velocity.

b. Derive the relations among φ, ω, α, and t for constant angular acceleration.

c. A wheel accelerates from rest with α = 1.53 rad/s². Use the relations
developed in part b to find ω and φ after 3.00 s has elapsed.

■ a. From Eq. (9-2) you have

$$d\phi = \omega \, dt \tag{9-4}$$

Integrating this expression, you obtain

$$\int_{\phi_i}^{\phi_f} d\phi = \int_{t_i}^{t_f} \omega \, dt \tag{9-5}$$

where the subscripts i and f denote the initial and final values of the quantities. In
general, ω will be a function of t which must be determined in order to evaluate the
integral. Here, however, you have ω constant, so that

$$\int_{\phi_i}^{\phi_f} d\phi = \omega \int_{t_i}^{t_f} dt$$

The fundamental theorem of calculus then can be applied to produce the result

$$\phi_f - \phi_i = \omega (t_f - t_i) \tag{9-6}$$

If you set t_i = 0 and drop the subscript f, since the relation holds for any later time t_f,
the value of φ_f at that time is given by

$$\phi = \phi_i + \omega t \qquad \text{for constant } \omega \tag{9-7}$$

The angular coordinate φ is positive if the wheel appears in the agreed-upon view
to have rotated counterclockwise from the orientation used to define φ = 0.
Equation (9-7) is analogous to the translational equation x = x_i + vt, which is valid for
constant v.

b. From the first part of Eq. (9-3), you can write

$$d\omega = \alpha \, dt$$

Thus you have

$$\int_{\omega_i}^{\omega_f} d\omega = \int_{t_i}^{t_f} \alpha \, dt$$

Taking α constant and applying the fundamental theorem, you obtain

$$\omega_f - \omega_i = \alpha \int_{t_i}^{t_f} dt = \alpha (t_f - t_i)$$

By setting t_i = 0 and dropping the unneeded subscript f, this simplifies to

$$\omega = \omega_i + \alpha t \qquad \text{for constant } \alpha \tag{9-8}$$

Equation (9-8) is analogous to the equation of translational mechanics: v = v_i + at
for constant a.

Using Eq. (9-5) gives you

$$\int_{\phi_i}^{\phi_f} d\phi = \int_0^{t_f} \omega \, dt = \int_0^{t_f} (\omega_i + \alpha t) \, dt = \omega_i \int_0^{t_f} dt + \alpha \int_0^{t_f} t \, dt$$

Applying the fundamental theorem and Eq. (7-21), you get

$$\phi_f - \phi_i = \omega_i t_f + \frac{\alpha t_f^2}{2}$$

This is written as

$$\phi = \phi_i + \omega_i t + \frac{\alpha t^2}{2} \qquad \text{for constant } \alpha \tag{9-9}$$

where again you drop the subscript f. Equation (9-9) is analogous to the equation of
translational kinematics: x = x_i + v_i t + at²/2 for constant a.

The analogy that you have found between Eqs. (9-8) and (9-9), on the one
hand, and the corresponding translational equations, on the other hand, allows you
to write immediately the relations

$$\phi = \phi_i + \frac{\omega^2 - \omega_i^2}{2\alpha} \qquad \text{for constant } \alpha \tag{9-10}$$

and

$$\phi = \phi_i + \frac{(\omega + \omega_i) t}{2} \qquad \text{for constant } \alpha \tag{9-11}$$

The first of these is the rotational analogue to Eq. (2-32), and the second is the
rotational analogue to Eq. (2-33).

c. Since the wheel starts from rest, you have ω_i = 0. Equation (9-8) thus gives
you

$$\omega = 0 + \alpha t = 1.53 \ \text{rad/s}^2 \times 3.00 \ \text{s} = 4.59 \ \text{rad/s}$$

for the angular velocity after 3.00 s. The positive value means the wheel is rotating
counterclockwise, as viewed from the agreed-upon side of the wheel.

To find the angle through which the wheel has turned, you can apply Eq. (9-9),
setting φ_i = 0 since no particular initial angular orientation is specified. You have

$$\phi = 0 + 0 + \frac{\alpha t^2}{2} = \frac{1.53 \ \text{rad/s}^2 \times (3.00 \ \text{s})^2}{2} = 6.89 \ \text{rad}$$




You can obtain the same result by applying Eq. (9-10), using the final value of ω just
obtained. What does the positive value of φ mean? The magnitude of φ is greater
than 2π. What does this mean?
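
The arithmetic of part c can be reproduced with a minimal Python sketch of Eqs. (9-8) through (9-11). The function name and its default arguments are illustrative choices, not part of the text. The conversion of φ to revolutions assumes only that one full revolution is 2π rad.

```python
import math

def rotate_constant_alpha(alpha, t, omega_i=0.0, phi_i=0.0):
    """Return (omega, phi) at time t for constant angular acceleration alpha."""
    omega = omega_i + alpha * t                       # Eq. (9-8)
    phi = phi_i + omega_i * t + 0.5 * alpha * t**2    # Eq. (9-9)
    return omega, phi

alpha, t = 1.53, 3.00            # rad/s^2 and s, the values given in part c
omega, phi = rotate_constant_alpha(alpha, t)

# Cross-checks with Eqs. (9-10) and (9-11), using omega_i = 0 and phi_i = 0.
phi_from_9_10 = (omega**2 - 0.0**2) / (2 * alpha)
phi_from_9_11 = (omega + 0.0) * t / 2

# Express the angle as a number of full turns of the wheel.
revolutions = phi / (2 * math.pi)

print(f"omega = {omega:.3f} rad/s")
print(f"phi   = {phi:.3f} rad  ({revolutions:.3f} revolutions)")
print(f"phi from Eq. (9-10) = {phi_from_9_10:.3f} rad")
print(f"phi from Eq. (9-11) = {phi_from_9_11:.3f} rad")
```

All three routes to φ should agree with one another and, to three significant figures, with the values found in part c.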

The relation between the velocity of a point in the flywheel and the
angular velocity of the flywheel can be established by writing Eq. (9-1) in
the form

$$ds = r \, d\phi \tag{9-12}$$

Here r is the distance of the point from the rotation axis, and ds is its
infinitesimal displacement along its circular path when the wheel rotates
through an infinitesimal angle dφ. Dividing through by the time dt it takes
for this to happen, we have

$$\frac{ds}{dt} = r \frac{d\phi}{dt}$$

or

$$\frac{ds}{dt} = r\omega$$

The magnitude of the quantity ds/dt gives the magnitude of the point’s
velocity. The direction of the velocity is always tangent to its path and is
counterclockwise if the sign of ds/dt is positive. Thus we can specify the
velocity of the point by saying that it has only a tangential component and
that the tangential velocity component v_t has the value

$$v_t = r\omega \tag{9-13}$$

Since r is intrinsically positive, this means that v_t has the same sign as the
angular velocity ω. That is, v_t is positive if the wheel is rotating in the
counterclockwise sense.

If the value of v_t is changing, then the speed of the point along its
circular path is not constant. It was shown in Sec. 3-3 that in these
circumstances the point will have an acceleration along its path. The value of this
tangential acceleration is given by the rate of change of the speed. Thus we
can find an expression for the point’s tangential acceleration component a_t
by evaluating

$$a_t = \frac{dv_t}{dt}$$

Using Eq. (9-13), we have, since r is constant,

$$a_t = \frac{d(r\omega)}{dt} = r \frac{d\omega}{dt}$$

Then writing dω/dt as α, we obtain the expression

$$a_t = r\alpha \tag{9-14}$$

Again, the sign of a_t is the same as the sign of the angular acceleration α.
You should go through an analysis like the one immediately below Eq. (9-3)
and relate the sign of a_t to the four cases enumerated there.

The calculation leading to Eq. (9-14) is very similar to the one leading
to Eq. (6-7) for the tangential component of the acceleration of a bob at the
end of a pendulum. But the argument and the symbolism used in the
earlier calculation are somewhat different. It would be worthwhile for you to
review the development of Eq. (6-7). If you do so, you will be reminded
that whereas the v_t in Eq. (9-13) is the only component of the rotating
point’s velocity, the a_t in Eq. (9-14) is not the only component of its
acceleration. The point also has an acceleration component in the centripetal
direction, that is, in the direction toward the center of its circular path. We
can evaluate the centripetal acceleration component a_c by combining Eq.
(3-41), written as a_c = v_t²/r, with Eq. (9-13), to obtain

$$a_c = \frac{(r\omega)^2}{r}$$

or

$$a_c = r\omega^2 \tag{9-15}$$

In  Sec.  9-2  we  rederive  Eqs.  (9-13),  (9-14),  and  (9-15)  from  more  gen- 
eral and  more  rigorous  arguments.  Example  9-2  gives  an  application  of 
these  equations. 


EXAMPLE  9-2 


The wheel of Example 9-1 accelerates from rest with a constant angular
acceleration α = 1.53 rad/s². The distance from the axis of rotation to a certain point in the
wheel is r = 0.250 m.

a. Find the velocity components of the point 3.00 s after the wheel begins to
move.

b. Find the acceleration components of the point at that time.

■ a. The velocity of the point has only a tangential component. That is, the
velocity vector is perpendicular to the direction from the axis to the point. According to
Eq. (9-13), the value of the tangential velocity component is

$$v_t = r\omega$$

Setting r = 0.250 m and using the value of the angular velocity calculated in
Example 9-1, ω = 4.59 rad/s, you have

$$v_t = 0.250 \ \text{m} \times 4.59 \ \text{rad/s} = 1.15 \ \text{m/s}$$

The positive sign tells you that the point appears to be moving counterclockwise, as
seen from the agreed-upon side of the wheel.

b. The acceleration of the point has a tangential component given by Eq.
(9-14),

$$a_t = r\alpha$$

Using the values quoted for r and α, you get

$$a_t = 0.250 \ \text{m} \times 1.53 \ \text{rad/s}^2 = 0.383 \ \text{m/s}^2$$

The direction of the tangential acceleration is also counterclockwise.

From Eq. (9-15), you know that the centripetal component of the point’s
acceleration is

$$a_c = r\omega^2$$

With the given value of r and the previously calculated value of ω, you have

$$a_c = 0.250 \ \text{m} \times (4.59 \ \text{rad/s})^2 = 5.27 \ \text{m/s}^2$$

The direction of the centripetal acceleration is that from the location of the point
directly toward the axis of rotation.




Fig. 9-4 The velocity v_t, tangential acceleration a_t, and centripetal
acceleration a_c of a point on a wheel whose distance from the axis is r.
The angular coordinate of the point is φ, and ω is the angular velocity of
the wheel. The wheel started from rest at t = 0, rotated with constant
angular acceleration α, and is depicted at a later time t.


The velocity of the point and its tangential and centripetal acceleration are
indicated in Fig. 9-4. Also indicated are the angular coordinate of the point and the
angular velocity and angular acceleration of the wheel. What is the total
acceleration of the point? What would it be if the angular velocity were the same but the
angular acceleration were −1.53 rad/s²?
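
The numbers in Example 9-2 can be checked with a short Python sketch of Eqs. (9-13), (9-14), and (9-15). The function name is illustrative. The last step, combining the two acceleration components by the Pythagorean theorem, is an added assumption that rests on the tangential and centripetal directions being perpendicular; it is one way to approach the closing questions, not the text's own worked answer.

```python
import math

def point_kinematics(r, omega, alpha):
    """Return (v_t, a_t, a_c) for a point a distance r from a fixed axis."""
    v_t = r * omega         # Eq. (9-13)
    a_t = r * alpha         # Eq. (9-14)
    a_c = r * omega**2      # Eq. (9-15)
    return v_t, a_t, a_c

r = 0.250                   # m
alpha = 1.53                # rad/s^2
omega = alpha * 3.00        # rad/s after 3.00 s, starting from rest (Example 9-1)

v_t, a_t, a_c = point_kinematics(r, omega, alpha)

# Magnitude of the total acceleration from its two perpendicular components.
a_total = math.hypot(a_t, a_c)

# Same angular velocity but the opposite sign of angular acceleration.
_, a_t_rev, a_c_rev = point_kinematics(r, omega, -alpha)
a_total_rev = math.hypot(a_t_rev, a_c_rev)

print(f"v_t = {v_t:.3f} m/s, a_t = {a_t:.3f} m/s^2, a_c = {a_c:.3f} m/s^2")
print(f"|a| = {a_total:.3f} m/s^2 for alpha = +1.53 rad/s^2")
print(f"|a| = {a_total_rev:.3f} m/s^2 for alpha = -1.53 rad/s^2 (a_t = {a_t_rev:.3f})")
```

Reversing the sign of α reverses only the tangential component, so the magnitude of the total acceleration is unchanged while its direction tilts the other way along the path.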


9-2  ROTATIONAL 
KINEMATICS  FOR 
A FREE  AXIS 




Fig.  9-5  A rotating  rigid  body.  In  the 
short  time  during  which  its  orientation 
changes  from  the  one  given  by  the  gray 
lines  to  the  one  given  by  the  black  lines, 
the  body  is  rotating  about  the  instan- 
taneous axis  shown  as  a dashed  line. 
The  instantaneous  axis  is  defined  by  the 
property  that  all  moving  points  in  the 
body  travel  on  short  arcs  centered  on 
the  axis.  If  there  are  points  in  the  body 
which  do  not  move,  they  lie  on  the 
instantaneous axis. The vector dφ describes the rotation in a way explained
in the text.


In  some  of  the  most  interesting  cases  of  rotational  motion,  the  axis  about 
which  a body  rotates  is  free  to  change  its  alignment  in  the  reference  frame 
used  to  observe  the  body.  An  example  is  the  child’s  top  that  was  illustrated 
in  Fig.  9-2.  The  top  rotates  rapidly  about  an  axis  that  lies  along  the  top’s 
line  of  symmetry  while  that  axis  slowly  changes  its  alignment,  tracing  out  a 
cone  whose  apex  is  where  the  pointed  end  of  the  top  rests  on  the  floor.  The 
top  performs  this  motion  in  apparent  defiance  of  the  tendency  for  gravity 
to  make  it  fall  over.  If  we  want  to  understand  the  mechanics  that  explains 
this  motion,  first  we  must  extend  our  treatment  of  rotational  kinematics  to 
situations  more  general  than  that  of  rotation  about  a fixed  axis. 

In  this  section  you  will  see  that  rotational  kinematics  for  a free  axis  re- 
quires introducing  vector  quantities  to  describe  rotations.  You  will  also  see 
that  these  vector  quantities  are  so  convenient  that  they  are  used  not  only  in 
the kinematics of systems where the rotation axis is free to change its
alignment, but also in the kinematics of systems, like flywheels, where the
alignment of the rotation axis is fixed. In fact, the vector description of rotation
is  used  mostly  in  this  chapter  for  simple  systems  that  have  fixed  rotation 
axes.  But  in  Chap.  10  we  treat  more  complicated  systems  that  have  free  ro- 
tation axes,  like  the  spinning  top  and  the  spinning  earth.  There  the  great 
power  of  the  vector  description  of  rotation  becomes  completely  apparent. 

Figure  9-5  shows  a rotating  rigid  body  of  arbitrary  shape.  It  is  depicted 
at  some  initial  time  and  at  a slightly  later  time.  The  dashed  line  is  the  instan- 
taneous axis  of  rotation — the  axis  about  which  the  body  is  rotating  during 
this  small  time  interval.  All  points  in  the  body  move  in  short  arcs  centered 
on  that  axis,  except  for  those  points  in  the  body  (if  there  are  any)  which  lie 
on  the  axis  and  so  do  not  move.  The  rotation  axis  may  change  alignment  in 
space  in  subsequent  time  intervals.  But  at  present  we  restrict  our  attention 
to  situations  in  which  at  all  times  the  instantaneous  axis  passes  through  a 
point  O which  is  fixed  in  the  reference  frame  used  to  observe  the  motion.  In 
Sec.  10-4  we  relax  this  restriction. 

A very useful description of the small rotation of the body can be given
in terms of the rotation vector dφ. By definition, this vector is aligned
parallel to the instantaneous rotation axis, in the direction given by the
right-hand rule for rotation vectors: Place your right hand so that the fingers curl
in the sense of rotation. The thumb then points in the direction of the
rotation vector dφ.

Fig. 9-6 The right-hand rule for specifying the direction of a rotation vector.
The alignment of a rotation axis does not describe uniquely the rotation of a
body because the body can be rotating about an axis either one way or the
other. To specify the particular sense of rotation, the thumb of the right hand
is pointed along the rotation axis with the fingers of the hand curling in the
sense of rotation. Then, by definition, the thumb points along the rotation
vector.

The choice of a right-hand rule, instead of a left-hand rule, is a matter
of convention. It is important to remember the rule and to use it
consistently. The rule is presented pictorially in Fig. 9-6. Note that the relation
between  the  sense  of  rotation  and  the  direction  of  the  rotation  vector,  given 
by  the  right-hand  rule,  is  the  same  as  the  relation  between  the  sense  of  rota- 
tion of  a wood  screw  and  its  direction  of  motion  as  it  is  screwed  into  a block 
of  wood. 

The magnitude of the rotation vector is equal to the angle dφ through
which  the  body  has  rotated  about  the  axis,  measured  in  radians.  The  vector 
describing  the  rotation  illustrated  in  Fig.  9-5  is  shown  on  the  axis  of  rota- 
tion. Rotation  vectors  are  sometimes  called  axial  vectors.  The  many  vectors 
that  we  will  encounter,  which  have  mathematical  properties  like  those  of 
rotation  vectors,  are  also  given  this  generic  name. 

It is only for a rigid body that there is a single rotation vector dφ which
completely  describes  a small  rotation.  If  a body  is  not  rigid,  the  rotation  in 
one  part  may  be  different  from  the  rotation  in  another  part.  We  will  con- 
sider only  rigid  bodies. 

When  a rigid  body  undergoes  a small  rotation,  each  point  in  the  body 
experiences  a small  displacement  (except  for  points  on  the  rotation  axis). 
The  direction  and  magnitude  of  the  displacement  depend  on  the  location 
of  the  point  relative  to  the  rotation  axis,  as  well  as  on  the  magnitude  of  the 
rotation  angle.  We  need  to  know  the  relation  between  these  quantities  in 
order  to  calculate  the  velocity  and  acceleration  of  each  small  piece  of  the 
rotating  body.  The  relation  can  be  obtained  by  considering  Fig.  9-7. 

In this figure the vector dφ characterizing the rotation is drawn along
the axis of rotation. The fixed point O on that axis is chosen as the origin of
coordinates. Thus the position of a point P in the body is specified by a
vector r extending from O to P. You can see from the figure that the
shortest distance from P to the axis is the length of the perpendicular from that
point to the axis. This is r sin θ, where θ is the angle between the vectors r
and dφ. When the body rotates about the axis through an angle dφ,
expressed in radians, the distance moved by the tip of the line of length r is dφ
times r sin θ. This distance is ds, the magnitude of the displacement vector ds of the
point P. That is,

$$ds = d\phi \; r \sin\theta \tag{9-16}$$

Fig. 9-7 A rigid body undergoing a small rotation about a dashed rotation
axis. The L-shaped symbol at the intersection of the line from a typical
point P in the body to the axis means that the angle formed at the
intersection is a right angle. The displacement ds of the point is in a
direction normal to the plane containing r and dφ. Thus ds is
perpendicular to both r and dφ.

The direction of the vector ds is perpendicular to both r and dφ. This
direction  can  be  described  trigonometrically,  but  the  process  and  the  result 
are  cumbersome.  However,  a vector  operation,  called  taking  the  vector 
product,  directly  specifies  both  the  magnitude  and  the  direction  of  ds.  The 
vector  product  will  also  be  of  great  use  in  connection  with  a number  of 
other  vectors  that  also  arise  in  the  treatment  of  rotations.  So  we  now  digress 
for  the  purpose  of  introducing  it. 

The vector product of two vector quantities A and B is itself a vector
quantity. In symbols, the vector product is written as A x B. The bold cross
is used both to indicate the specific mathematical operation of taking the
vector product and to emphasize that this operation is completely different
from the operation of taking the scalar, or dot, product of two vectors,
A • B. The expression A x B is read “A cross B.” Consequently, the vector
product is frequently called the cross product. By definition, the
magnitude of A x B is A (sin θ) B. We write this as

$$|\mathbf{A} \times \mathbf{B}| = A \sin\theta \, B \tag{9-17}$$

Here A and B are the magnitudes of the vectors A and B, and θ is the angle between their
directions. The angle θ is always counted as positive and is always the smaller
of the two angles formed by the directions of A and B. Thus θ always lies
in the range 0 ≤ θ ≤ π. Absolute value signs are used in Eq. (9-17) to
indicate the magnitude of the vector A x B. So the left side of the equation
is necessarily nonnegative. The right side is too, since the magnitudes A and B are
nonnegative and sin θ is nowhere negative in the range 0 ≤ θ ≤ π.

The  direction  of  A x B is,  by  definition,  perpendicular  to  both  A and 
B and  in  the  direction  given  by  the  right-hand  rule  for  cross  products: 
Place  your  right  hand  so  that  the  fingers  curl  in  the  sense  that  would  sweep 
A into  alignment  with  B through  the  smaller  angle  between  them.  The  thumb 
then  points  in  the  direction  of  A x B.  You  will  find  illustrations  of  the  use  of 
this  rule  to  determine  the  direction  of  A x B in  Fig.  9-8.  Can  you  state  a 
“wood  screw  rule”  for  cross  products? 

In  Fig.  9-8  you  also  find  examples  of  the  use  of  Eq.  (9-17)  to  determine 
the  magnitude  of  A x B.  The  figure  demonstrates  some  important  prop- 
erties of  the  cross  product.  First,  A x B = 0 if  A is  parallel  or  antiparallel 
(oppositely  directed)  to  B.  Second,  |A  x B|  = AB  if  A is  perpendicular 
to  B.  The  directions  of  A,  B,  and  A x B are  related  in  this  case  just  like 
those  of  the  x,  y,  and  z axes  of  a “right-handed”  coordinate  system.  Third, 
A x B = −B x A. This means that the cross product does not obey the
commutativity  rule  obeyed  by  either  the  product  of  two  scalars,  that  is, 
AB  = BA,  or  the  dot  product  of  two  vectors,  that  is,  A • B = B • A.  In- 
stead, the  cross  product  obeys  what  may  be  called  an  anticommutativity  rule. 
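
These properties can be verified numerically with a small Python sketch. It evaluates the cross product from Cartesian components using the standard component formula, which is not derived in this section and is assumed here. The helper names and the sample vectors are illustrative.

```python
import math

def cross(a, b):
    """Cross product of two 3-component vectors given as (x, y, z) tuples."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def magnitude(a):
    return math.sqrt(dot(a, a))

A = (2.0, 0.0, 0.0)
B = (1.0, 1.0, 0.0)          # lies in the xy plane, 45 degrees from A

AxB = cross(A, B)
BxA = cross(B, A)

# Angle between A and B from the dot product, then the right side of Eq. (9-17).
theta = math.acos(dot(A, B) / (magnitude(A) * magnitude(B)))
rhs_9_17 = magnitude(A) * math.sin(theta) * magnitude(B)

parallel = cross(A, (6.0, 0.0, 0.0))        # A x (3A)
antiparallel = cross(A, (-6.0, 0.0, 0.0))   # A x (-3A)
x_cross_y = cross((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))

print("A x B =", AxB, "  |A x B| =", magnitude(AxB), "  A sin(theta) B =", rhs_9_17)
print("B x A =", BxA)
print("parallel:", parallel, "  antiparallel:", antiparallel)
print("x-hat cross y-hat =", x_cross_y)
```

The product of the unit vectors along x and y comes out along z, matching the statement about right-handed coordinate systems, and swapping the order of A and B reverses the sign of every component.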

The  cross,  or  vector,  product  is  in  almost  constant  use  in  any  general 
treatment  of  rotations  because  it  enters  in  almost  all  the  kinematical  and 
mechanical  quantities  pertaining  to  rotational  motion.  It  is  also  very  useful 
in  other  branches  of  physics,  such  as  electromagnetism,  where  rotationlike 
quantities  are  encountered.  In  geometry  and  trigonometry  it  can  be  em- 
ployed to  generate  the  component  of  one  vector  along  a direction  perpen- 
dicular to  another,  for  the  purpose  of  determining  the  shortest  distance  to 
a line  or  a plane. 




Fig. 9-8 (a) The cross product A x B of two vectors A and B is a third vector oriented
normal to the plane containing A and B, in the direction specified by the right-hand rule
illustrated in the figure. In applying this rule, the fingers of the right hand must be curling in
the sense that would cause the first vector in the cross product to sweep into alignment with the
second vector, going through the smaller of the two angles between them. Here the curling
fingers show the sense which would cause A to sweep into alignment with B, going through
an angle somewhat less than π/2. The magnitude of the cross product, |A x B|, is A sin θ B.
The angle θ is always positive and is the smaller of the two angles between A and B. The quantity
A sin θ is the length of the perpendicular dropped from the tip of vector A to the line along
vector B. That is, A sin θ is the “height” of the shaded parallelogram whose sides are defined
by A and B. The length B of the vector B is the “base” of this parallelogram. Since the area
of the parallelogram is the product of its base and its height, the magnitude |A x B| =
A sin θ B of the cross product is numerically equal to the area of the parallelogram.
(b) The cross product B x A has the magnitude |B x A| = B sin θ A. This is the same as
A sin θ B = |A x B|, the magnitude of A x B. But the direction of B x A is opposite to the
direction of A x B. The direction of B x A is shown by the thumb of the right hand in the figure,
the fingers of which curl in the sense that would cause B to sweep into alignment with A, going
through the smaller angle. The relation between A x B and B x A is specified completely by
the equation A x B = −B x A. (c) If A is parallel to B, then θ = 0 and sin θ = 0, so
|A x B| = A sin θ B = 0. Since the magnitude of the vector A x B is zero in this case, we can
write A x B = 0. (d) If A is antiparallel to B, then θ = π and sin θ = 0, so |A x B| =
A sin θ B = 0 and A x B = 0. (e) If A is perpendicular to B, then θ = π/2 and sin θ = 1, so
|A x B| = A sin θ B = AB.


At  first  the  cross  product  may  strike  you  as  artificial  and  unduly  complicated. 
This  feeling  can  arise  out  of  the  question:  Why  define  an  operation  on  two  vectors 
in  such  a way  that  the  product  is  a vector  perpendicular  to  both?  The  answer  is 
that  the  mathematical  operation  involved  is  intrinsically  three-dimensional — in 
fact,  the  cross  product  cannot  be  defined  in  a space  with  any  other  number  of  di- 
mensions. Now,  the  physical  world  is  also  intrinsically  three-dimensional,  and 
there  are  many  physical  phenomena  which  simply  cannot  exist  in  one  or  two  di- 
mensions. Rotation  is  the  first  of  these  phenomena  which  we  will  discuss 
thoroughly.  The  simplest  case  of  rotation — a wheel  spinning  on  a fixed  axis — can 
be  handled  in  two  dimensions  (just  barely!).  But  even  in  this  case,  it  is  necessary 
to define an axis which lies in a third dimension. We need an intrinsically
three-dimensional mathematical operation to describe three-dimensional physical
phenomena. Believe it or not, the cross product is the simplest operation with this
capability. (If you don’t believe it, try to invent a simpler one.)


We can make immediate use of the cross product by applying it in Eq.
(9-16) to write

$$d\mathbf{s} = d\boldsymbol{\phi} \times \mathbf{r} \tag{9-18}$$

for the displacement in Fig. 9-7. This equation describes at once the
magnitude and direction of the vector ds, which represents the displacement of a
point at the position r when the body undergoes a rotation dφ. You can
check this by inspecting Eq. (9-17) and Fig. 9-8.

If  we  divide  Eq.  (9-18)  by  the  infinitesimal  time  increment  dt,  during 
which  the  infinitesimal  rotation  and  displacement  occur,  we  have 

$$ \frac{d\mathbf{s}}{dt} = \frac{d\boldsymbol{\phi}}{dt} \times \mathbf{r} $$

Because ds/dt gives the velocity v of the point, this can be written

$$ \mathbf{v} = \boldsymbol{\omega} \times \mathbf{r} \tag{9-19} $$

where we define

$$ \boldsymbol{\omega} = \frac{d\boldsymbol{\phi}}{dt} \tag{9-20} $$

The quantity ω is the angular velocity of the rotating body. Since dφ is an
angle expressed in radians, the magnitude ω of the angular velocity, the
angular speed, gives the rate of rotation in radians per second. The direction
of ω specifies the alignment of the rotation axis and the sense of rotation
about that axis, just as dφ does, because ω has the same direction as dφ.
Equation (9-19) allows you to calculate the vector describing the velocity v
of the point P in the body at any instant, given its position vector r and the
angular velocity vector ω of the body at that instant.
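
As a minimal sketch of how Eq. (9-19) can be evaluated numerically, the following Python fragment computes v = ω × r from components. The helper names `cross` and `point_velocity` and the numerical values are illustrative assumptions, not taken from the text.

```python
def cross(a, b):
    """Return the cross product a x b of two 3-component vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def point_velocity(omega, r):
    """Velocity v = omega x r of a point at position r, as in Eq. (9-19)."""
    return cross(omega, r)


# Assumed example values: a body spinning at 2 rad/s about the z axis,
# and a point 0.5 m from the axis along the x direction.
omega = (0.0, 0.0, 2.0)  # rad/s
r = (0.5, 0.0, 0.0)      # m
v = point_velocity(omega, r)
print(v)  # velocity components in m/s
```

For these assumed values the velocity lies along the y direction, perpendicular to both ω and r, and its magnitude equals ωr, as the right-hand rule for cross products requires.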


Figure 9-9 illustrates the application of the general relation of Eq.
(9-19) to calculate the velocity v of a point P on the rim of a flywheel of
radius r and angular velocity ω, rotating in the sense shown in the figure. It
seems natural to choose the origin O on the axis of the wheel at its intersection
with the plane of the wheel. This choice makes the angular velocity ω
perpendicular to the position vector r that gives the location of the point.
Thus, since the angle between ω and r has the value θ = π/2, we find for
the magnitude of v the expression

$$ v = |\boldsymbol{\omega} \times \mathbf{r}| = \omega \sin\theta \, r = \omega r $$


Fig. 9-9 Use of the equation v = ω × r to find the velocity v of a
point P in a flywheel whose location relative to the coordinate
origin O on the body’s rotation axis is given by the vector r. The
angular velocity of rotation is ω. To the right is an auxiliary
construction which determines the direction of ω × r.

Fig. 9-10 The equation v = ω × r can
be used to find the velocity v of a point P
in a top, just as it was used in Fig. 9-9 for
a point in a flywheel. The only difference
is that in this figure the origin O
is not in the plane of rotation of P.
Consequently, the angular velocity ω of the
rotating body is not perpendicular to the
position vector r of the point in the body.


This agrees with Eq. (9-13), the particular relation obtained earlier for such
a case. The figure also shows that the direction of v = ω × r correctly
describes the tangential motion of P at the instant depicted.

Let  us  use  Eq.  (9-19)  to  determine  the  velocity  of  a point  in  a spinning 
top,  say  a point  on  its  surface  at  the  widest  part.  See  Fig.  9-10.  We  fix  the 
coordinate  origin  O at  the  motionless  tip  of  the  top.  The  location  of  the 
point  P at  the  instant  shown  is  then  given  by  the  position  vector  r from  O to 
P. The angular velocity vector ω is parallel to the top’s symmetry axis and is
in the direction shown if the sense of rotation is as shown. Also, we choose
to draw ω along the symmetry axis of the top. The instantaneous direction
of the velocity v of point P is tangential, and perpendicular to both ω and
r, since this is the direction of ω × r. Its magnitude is

$$ v = |\boldsymbol{\omega} \times \mathbf{r}| = \omega \sin\theta \, r = \omega d $$

Here d = r sin θ is the perpendicular distance from P to the axis, as shown
in the figure.

What is the rationale for our choice of the origin O of the vector r describing
the position of a particular point P in the body? At any instant the angular velocity
ω has a certain direction, and therefore the axis of rotation has a certain alignment.
At that instant any point on the rotation axis could be chosen as an origin of
coordinates. Each different choice of O would lead to a different value of the vector r
extending to the same point P. But the velocity v of that point, found from Eq.
(9-19), is independent of the choice of O. All choices lead to the same value of ω × r.
The reason is that the cross product, with its implicit term sin θ, produces the
component of r perpendicular to the direction of the rotation axis (the quantity d
in the equation immediately above), and this does not depend on where O lies on
that axis. However, O should almost always be chosen at the single fixed point on
the axis. This is the only choice of O which will continue to be on the rotation axis
as the axis changes orientation.
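
The independence of v from the choice of origin along the rotation axis can be checked with a short numerical sketch. The function name `velocity_of_point`, the axis direction, and all numbers below are assumptions chosen only for illustration.

```python
def cross(a, b):
    """Return the cross product a x b of two 3-component vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def velocity_of_point(omega, point, origin):
    """v = omega x r, with r drawn from `origin` to `point`."""
    r = tuple(p - o for p, o in zip(point, origin))
    return cross(omega, r)


# Assumed setup: rotation axis along z through (0, 0, 0), angular speed 3 rad/s.
omega = (0.0, 0.0, 3.0)
point = (0.2, 0.0, 0.5)  # location of P in meters

# Try three different origins, all lying on the rotation axis.
for z0 in (0.0, 0.3, -1.0):
    origin = (0.0, 0.0, z0)
    print(z0, velocity_of_point(omega, point, origin))
```

Each origin gives a different vector r, but only the component of r perpendicular to the axis enters ω × r, so the three printed velocities agree.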


Next  we  will  develop  an  expression  for  the  acceleration  a of  a point  P 
in  a rotating  body.  Taking  the  time  derivative  of  Eq.  (9-19),  we  have 


$$ \frac{d\mathbf{v}}{dt} = \frac{d}{dt}(\boldsymbol{\omega} \times \mathbf{r}) $$

or

$$ \mathbf{a} = \frac{d\boldsymbol{\omega}}{dt} \times \mathbf{r} + \boldsymbol{\omega} \times \frac{d\mathbf{r}}{dt} $$

The last step is an example of the rule for differentiating cross products.
The rule combines the scalar calculus rule for differentiating a product of
two quantities and the rule that the ordering of terms in a cross product
must not be changed, because the cross product is not commutative.

Now  dr/dt  gives  the  velocity  v of  the  point  as  well  as  ds/dt,  since  ds  = 
dr.  (See  Fig.  9-7.)  So  the  point’s  acceleration  a is 

$$ \mathbf{a} = \frac{d\boldsymbol{\omega}}{dt} \times \mathbf{r} + \boldsymbol{\omega} \times \mathbf{v} $$

This result can be written

$$ \mathbf{a} = \boldsymbol{\alpha} \times \mathbf{r} + \boldsymbol{\omega} \times \mathbf{v} \tag{9-21} $$

where

$$ \boldsymbol{\alpha} = \frac{d\boldsymbol{\omega}}{dt} \tag{9-22} $$

We have introduced α to represent the angular acceleration vector, defined
as the rate of change of the body’s angular velocity vector ω. Example
9-3 explores the physical meaning of each of the terms in Eq. (9-21).
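
Before working through the example, it may help to see Eq. (9-21) evaluated from components. The sketch below is only an illustration: the function name `point_acceleration` and the flywheel numbers are assumptions, and the velocity is obtained from Eq. (9-19).

```python
def cross(a, b):
    """Return the cross product a x b of two 3-component vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def point_acceleration(alpha, omega, r):
    """a = alpha x r + omega x v, with v = omega x r (Eqs. 9-19 and 9-21)."""
    v = cross(omega, r)
    tangential = cross(alpha, r)   # term due to changing angular velocity
    centripetal = cross(omega, v)  # term due to changing direction of v
    return tuple(t + c for t, c in zip(tangential, centripetal))


# Assumed flywheel: axis along z, rim point 0.25 m from the axis along x.
omega = (0.0, 0.0, 4.0)  # rad/s
r = (0.25, 0.0, 0.0)     # m

print(point_acceleration((0.0, 0.0, 0.0), omega, r))  # constant angular speed
print(point_acceleration((0.0, 0.0, 2.0), omega, r))  # angular speed increasing
```

With zero angular acceleration only the inward radial component appears, and its magnitude is ω²r. With a nonzero α along the axis, a tangential component of magnitude αr is added.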


EXAMPLE  9-3 

a. Apply Eq. (9-21) to calculate the acceleration a of a point P on the rim of a
flywheel rotating with constant angular speed ω.

■ The angular velocity ω for a flywheel is in a constant direction pointing
along the fixed rotation axis in the way that is related to the sense of rotation of the
wheel by the right-hand rule for rotation vectors. If its magnitude ω is also constant,
then the angular acceleration α = dω/dt is zero. Thus the only term contributing to
a in Eq. (9-21) is the term ω × v. As you saw in Fig. 9-9, v = ω × r is in a tangential
direction and has magnitude v = ωr. You can see from Fig. 9-11a that since ω is in
an axial direction, its cross product with v must be in a radial direction; specifically,
Fig. 9-11a shows you that ω × v is in the inward radial direction, that is, in the
centripetal direction. And since there is a right angle between ω and v, the magnitude
of their vector product is

$$ a = |\boldsymbol{\omega} \times \mathbf{v}| = \omega v $$

Writing v = ωr, and using the subscript c to indicate that the acceleration is
centripetal, you have

$$ a_c = \omega^2 r $$

This  result  is  identical  to  Eq.  (9-15)  for  the  centripetal  acceleration  of  a point 
moving  with  constant  speed  around  a circle  of  radius  r.  It  constitutes  an  independent 
derivation  of  that  important  equation.  ■ 

b. Now use Eq. (9-21) to calculate the acceleration of a point on the rim when
the angular speed ω of the flywheel is not constant, but is increasing at the rate α.

■ In this case there is an angular acceleration vector α = dω/dt. Its magnitude is
α = dω/dt, the rate of increase of the angular speed, and its direction is the same as
the direction of ω. So, in addition to the centripetal acceleration found in part a,
there is also a contribution to the acceleration of the point on the rim of the flywheel
due to the term α × r. This vector quantity is in the same tangential direction as the
velocity v of the point, as you can see by constructing Fig. 9-11b. So you find that the
magnitude a_t of the tangential acceleration is

$$ a_t = |\boldsymbol{\alpha} \times \mathbf{r}| = \alpha r $$

in agreement with Eq. (9-14).

To evaluate the total acceleration a of the point on the rim of the angularly
accelerating flywheel, you take the vector sum of its tangential and radial
constituents. This gives

$$ \mathbf{a} = \alpha r \, \hat{\mathbf{v}} - \omega^2 r \, \hat{\mathbf{r}} \tag{9-23} $$

Here v̂ is a unit vector tangential to the path of the point and in its direction of
motion, and r̂ is a unit vector in the outward radial direction. The first term on the
right side of this equation is the acceleration resulting from the changing magnitude
of the velocity of the point on the rim of the wheel when the angular speed of the
wheel is increasing. The second term is the acceleration resulting from the changing
direction of the velocity of the point on the rim as the point rotates around the
axis of the wheel. How would Eq. (9-23) be modified if the angular speed of the
flywheel were decreasing?

Fig. 9-11 (a) The cross product ω × v in Example 9-3a. (b) The cross product
α × r in Example 9-3b.


9-3 ANGULAR MOMENTUM


Angular momentum is a quantity that plays a role in rotational mechanics
completely analogous to the role played by momentum in translational
mechanics. The fundamental reason why momentum is important is that
when a certain set of conditions is satisfied, the momentum of a body
remains constant. Angular momentum is important for the same reason:
when a certain other set of conditions is satisfied, a body maintains a
constant angular momentum. The concept of angular momentum will be
introduced by analyzing several strobe photos of a puck moving on an air
table. In the first of these, the puck moves in such a way that both its
momentum and its angular momentum are constant. Then photos will be
considered which depict cases where the momentum of the puck changes as it
moves, but its angular momentum does not change. In such cases it is more
useful to consider the angular momentum of the moving body than to
consider its momentum, because the angular momentum has a simpler
behavior.

Figure 9-12 shows a puck moving freely across the air table, as viewed
by a camera fixed to the earth’s surface. The position of the puck at each
instant when the strobe light flashes is specified by a position vector r drawn
from an arbitrarily chosen point O to the center of the puck. This point is
the origin of a reference frame which can be considered to be an inertial
frame. As the puck moves, there is a continual change in its direction from
O because the direction of the vector r changes. The situation bears a
similarity to the continual change in the direction of a small piece of a flywheel
from a point on the axis of the wheel as the small piece rotates about the
point. Thus it can be said that the puck rotates about O, even though it never
makes a complete circuit. Certainly there is an angular velocity ω associated
with the puck’s motion about O since the angle φ from some fixed line
through O to the line through r changes as time passes. The right-hand
rule for rotation vectors shows that ω is always directed outward (that is,
normal to the page and toward you when you view the page). But its
magnitude ω varies as the puck moves. Inspection of the photo shows that the
largest change Δφ in the angle specifying the direction of r, during any of
the equal time intervals Δt between consecutive strobe light flashes, occurs
when the puck is closest to O. Thus the average value of ω for each time
interval, which is given by Δφ/Δt, increases as the puck approaches that point
and decreases as it recedes from it. Because the magnitude of the vector ω
varies, that quantity is not particularly useful in describing the rotation of
the puck about O.

Fig. 9-12 A strobe photo of a puck moving freely
across the top of an air table from lower left to
upper right. The notation (outward), next to the
vector ω, means that the vector is pointing directly
outward from the page, toward the viewer.

A quantity that is particularly useful is the angular momentum vector,
to which we assign the symbol l (the letter pronounced “el”, not the number
pronounced “one”). By definition,

$$ \mathbf{l} = \mathbf{r} \times \mathbf{p} \tag{9-24} $$

A particle’s angular momentum vector about an origin O is the cross product of its
position vector and its momentum vector. The magnitude of the angular
momentum is l = |r × p| = r sin θ p, where θ is the angle between r and p.
Units used to measure l are m·kg·m/s = m²·kg/s. The direction of the
angular momentum is given by the cross-product right-hand rule.
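
The definition in Eq. (9-24) translates directly into a few lines of code. The following is a sketch under assumed values; the function name `angular_momentum` and the particle's mass, position, and velocity are hypothetical.

```python
import math


def cross(a, b):
    """Return the cross product a x b of two 3-component vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def angular_momentum(r, mass, v):
    """l = r x p about the origin, with p = m v (Eq. 9-24)."""
    p = tuple(mass * vi for vi in v)
    return cross(r, p)


# Assumed values: a 0.5 kg particle at (0.3, 0.4, 0) m moving at 2 m/s along +x.
r = (0.3, 0.4, 0.0)
l = angular_momentum(r, 0.5, (2.0, 0.0, 0.0))
magnitude = math.sqrt(sum(c * c for c in l))
print(l, magnitude)  # components and magnitude in m^2*kg/s
```

For these numbers r = 0.5 m, p = 1 kg·m/s, and sin θ = 0.8, so the magnitude agrees with r sin θ p. The negative z component indicates a clockwise sense of rotation about the origin when viewed from the +z side.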

Angular momentum is defined in Eq. (9-24) for a particle. In everyday
language, a particle is something that is very small. In the language of
newtonian mechanics, a particle is a body having mass whose motion can be
treated completely by considering only the motion of a single point. A
particle is an idealization, not something actually found in nature. But
approximations of actual bodies by particles are common. To be adequately
approximated by a particle, the size of a body must be sufficiently small, and its
changes in orientation must be sufficiently slow.

The puck in Fig. 9-12 is reasonably approximated by a particle. First,
its radius is rather small compared to its distance from O, even at the closest
point. Thus the position vector r from O to the center of the puck is much
the same as the position vector from O to any piece of the puck. So the
single vector r does an adequate job of representing the position of every
piece, and hence the position of the puck as a whole. Second, the motion of
any piece resulting from the changing orientation of the puck is rather slow
compared to the motion resulting from the changing location of the puck.
As a consequence, the velocity of the center of the puck is not very
different from the velocity of any piece of the puck. This means that the
momentum vector p, obtained by multiplying the puck’s mass into the velocity
of its center, is a fair representation of the momentum of the puck as a
whole. Thus we can consider the puck to be a particle. And we can evaluate
its angular momentum by using the equation l = r × p, with r the position
vector of its center and p its mass times the velocity vector of its center. Let
us do so.

Figure 9-13 reproduces the position vector r of the puck at the
beginning of each of three time intervals. The momentum vector p of the puck
for each of these time intervals is shown also. This is done, as in Fig. 4-18,
by using a scale for the magnitude of the momentum such that the
momentum vector in a time interval extends from the puck’s position at the
beginning of the interval (when the strobe light flashes) to its position at the
end of the interval (when the strobe light flashes next). A vector
construction in the figure applies the right-hand rule for cross products to determine
the direction of the puck’s angular momentum l = r × p for a typical pair
of r and p vectors. For that pair, and all the others, l has the same direction,
namely, outward. So l has a constant direction.

The magnitude of l is also constant. To prove this, we evaluate

$$ l = |\mathbf{r} \times \mathbf{p}| = r \sin\theta \, p $$

The figure shows that for a typical pair of vectors r and p, the quantity
r sin θ equals d_m, the perpendicular distance from O to an extension of the
line of motion of the puck during the time interval used to determine r and p.
Thus we have

$$ l = d_m p \tag{9-25} $$

But since the puck is moving along a straight line, the value of d_m is the
same for every pair of vectors. Furthermore, since the puck moves with
momentum of constant magnitude, the value of p is the same for every
pair. Therefore, the magnitude l of the puck’s angular momentum about O
has the constant value l = d_m p. We conclude that the puck moving with
constant momentum p maintains an angular momentum l about an
arbitrary origin O which is constant in both direction and magnitude. The
direction of l specifies the sense of rotation about O in much the same
way that the direction of ω does. Its magnitude l gives the value of the
magnitude p of the puck’s momentum times the perpendicular distance d_m
from O to its line of motion.

Fig. 9-13 The position vector r from origin O, and
the momentum vector p, of a puck moving uniformly
across an air table. Its angular momentum vector
l = r × p about O is constant. The perpendicular
distance from O to the line of motion is d_m.
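
The constancy of l for force-free, straight-line motion can be illustrated with a small numerical sketch. The mass, starting position, and velocity below are assumed values and do not come from the strobe photo. Because the motion is confined to the plane of the table, only the component of l normal to that plane is computed.

```python
MASS = 0.5             # kg, assumed
START = (1.0, -2.0)    # m, assumed position at t = 0 relative to O
VELOCITY = (1.0, 0.5)  # m/s, assumed constant velocity


def angular_momentum_z(t):
    """Outward (z) component of l = r x p at time t for uniform motion."""
    x = START[0] + VELOCITY[0] * t
    y = START[1] + VELOCITY[1] * t
    px = MASS * VELOCITY[0]
    py = MASS * VELOCITY[1]
    return x * py - y * px


# Evaluate l at several equally spaced "strobe flashes".
for t in (0.0, 0.5, 1.0, 1.5):
    print(t, angular_momentum_z(t))
```

The position vector r changes from flash to flash, but the printed value of l does not. Its positive sign corresponds to the outward direction found from the right-hand rule.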

The puck moving freely across the air table with constant momentum
p and angular momentum l has no net force acting on it. What happens if
there is a net force acting on the puck? Its momentum p will surely no
longer be constant. But in certain circumstances, which apply in many very
important cases, the puck can still have a constant angular momentum l.
Figure 9-14 is a familiar strobe photo of a puck orbiting on an air table and
viewed by a camera mounted on the ground. The net force acting on the
puck is the force exerted on it by the string going from the puck, over a
swiveling pulley in the middle of the table, to a weight hanging beneath the
table. This force is always directed toward a certain fixed point (the pulley), no
matter where the puck on which it acts is located. Such a force is called a
central force. The puck’s angular momentum l about the origin O of an
essentially inertial reference frame is constant, provided the proper point is
chosen for the origin. The origin must be chosen at the point, called the
force center, toward which the central force is directed.

To see this, we again evaluate l = r × p. Application of the right-hand
rule for cross products to each of the r and p pairs constructed in the figure
demonstrates that, for each, l is in the direction outward. Its magnitude l
can be determined by applying Eq. (9-25),

$$ l = d_m p $$

[This relation is valid, no matter how the puck moves before or after a
particular time interval during which its position and momentum are specified
by a certain pair of values r and p. This is because the relation involves only
the magnitude p of the puck’s momentum and the perpendicular distance
d_m from the origin O to its line of motion, during that time interval. Thus
the relation can be applied to the circular motion of interest here, as well as
in the straight-line motion used to derive it. If this is not evident, use Fig.
9-14 to rederive the relation, using the fact that sin (π − θ) = sin θ.] Since
the puck is moving in a circular path, and since the origin O is chosen to be
at the center of the circle, the perpendicular distance d_m from O to the
puck’s line of motion has the same value for each pair of r and p vectors.
Furthermore, because the puck moves uniformly through its path, the
magnitude p of its momentum is also the same for every pair. Thus the
magnitude l = d_m p of its angular momentum about O is constant.

Fig. 9-14 A puck moving uniformly in a circular
orbit on an air table, under the influence of a force
always directed toward a fixed point called the force
center. The puck maintains a constant angular
momentum l = r × p about an origin O at the force
center.

Another way to reach the same conclusion is to note that the angle θ
between r and p is shown by the figure to be close to π/2. If the strobe flash
time interval were reduced to an infinitesimal value, the angle would
approach π/2. In this limit, l has the value l = |r × p| = r sin θ p = rp. Since r
is a constant equal to the radius of the circle, and since p is a constant equal
to the magnitude of the puck’s instantaneous momentum, it follows that
l = rp is constant. (It is better to determine the actual value of l in this way
than to use the relation l = d_m p, with d_m and p constructed as in the figure.
The construction in the figure underestimates the magnitude of l for two
reasons. First, the d_m constructed in the figure is smaller than the actual
value, the radius of the circular orbit. Second, the construction gives the
magnitude of an average momentum p which is smaller than the
magnitude of the actual instantaneous momentum, because the puck moves in
a circular path and so has a higher speed than the distance measured on a
straight line between consecutive positions indicates.)
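
The size of the underestimate can be sketched for uniform circular motion. The geometry used here is an assumption made for illustration: consecutive strobe positions are joined by a chord subtending the angle Δφ at the center, so the chord length is 2r sin(Δφ/2) and the perpendicular distance from the center to the chord is r cos(Δφ/2). The function name and the numbers of flashes per orbit are hypothetical.

```python
import math


def estimated_over_actual(flashes_per_orbit):
    """Ratio of l from the chord construction to the true value l = r p."""
    dphi = 2.0 * math.pi / flashes_per_orbit
    # Average momentum from the chord, relative to the instantaneous value.
    momentum_factor = 2.0 * math.sin(dphi / 2.0) / dphi
    # Perpendicular distance to the chord, relative to the orbit radius.
    distance_factor = math.cos(dphi / 2.0)
    return momentum_factor * distance_factor


for n in (8, 12, 24, 1000):
    print(n, estimated_over_actual(n))
```

Both factors are less than 1, and their product equals sin Δφ / Δφ. The ratio approaches 1 as the flash interval shrinks, which is the limit described in the parenthetical remark above.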

We have found that a puck moving through an inertial frame in a
circular orbit under the influence of a central force maintains an angular
momentum l about an origin O at the force center which is constant in both
direction and magnitude. The magnitude of its momentum p is also
constant, but the direction of this vector is continually changing as the puck
moves around the orbit. So in this case l is constant, but p is not. As a
consequence, it may be easier to deal with l than with p in studying the motion of
the puck. This advantage will become still greater for more complicated
motions, as we will see soon.

For a given motion the value of the angular momentum l = r × p depends on
the choice of the origin O. This is because the value of the vector r from O to a
given position changes if the choice of O is changed. For example, the angular
momentum of the puck moving uniformly in a straight line in Fig. 9-13 is reversed in
direction, and changed in magnitude, if the origin is chosen to be at the point in
the figure labeled O′. However, the new angular momentum l′ will also have a
constant value, and so O′ would probably be an equally appropriate choice for the
origin. For uniform circular motion, the off-center origin O′ shown in Fig. 9-14
would most likely be inappropriate, since the angular momentum l′ for that origin
does not have a constant value. The question of choice of origin arises repeatedly
in the study of rotational motion. In some cases the location of the origin makes no
practical difference. But in others considerable simplification in the analysis of
the system can be achieved by a judicious choice. If the system has an obvious
point of symmetry, or an obvious rotation axis, then the origin should usually be
located at the symmetry point, or on the rotation axis (at a fixed point, if there is
one).
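
The dependence of l on the choice of origin can be seen in a short sketch of uniform circular motion. All values here are assumptions (orbit radius, mass, angular speed, and the location of the off-center origin), and only the component of l normal to the plane of the orbit is computed.

```python
import math

RADIUS = 0.3     # m, assumed orbit radius
MASS = 0.5       # kg, assumed
OMEGA = math.pi  # rad/s, assumed angular speed


def angular_momentum_z(angle, origin):
    """Normal component of l = r x p for a puck at `angle` on the circle,
    with r measured from `origin` (coordinates relative to the center)."""
    x = RADIUS * math.cos(angle) - origin[0]
    y = RADIUS * math.sin(angle) - origin[1]
    px = -MASS * RADIUS * OMEGA * math.sin(angle)
    py = MASS * RADIUS * OMEGA * math.cos(angle)
    return x * py - y * px


for angle in (0.0, math.pi / 2, math.pi):
    at_center = angular_momentum_z(angle, (0.0, 0.0))
    off_center = angular_momentum_z(angle, (0.1, 0.0))
    print(angle, at_center, off_center)
```

About the center of the circle the value is the same at every position. About the assumed off-center origin it changes as the puck moves around the orbit, which is why that origin would be an inconvenient choice.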


A puck moving under the influence of a central force, as viewed from
an inertial frame, maintains a constant angular momentum about an origin
at the force center, even if its orbit is not circular. An example is shown in
Fig. 9-15. The same apparatus was used to obtain this strobe photo as was
used to obtain the photo of a puck in a circular orbit. But in the motion
recorded in Fig. 9-15 the puck was given an initial velocity, perpendicular to
the string, of smaller magnitude than that required to put it into a circular
orbit.

The angular momentum of the puck in the noncircular orbit can be
analyzed by applying the definition l = r × p. Doing so, you see once more
that its angular momentum l about an origin O at the force center is in the
fixed direction outward. In this case the magnitudes of both its position
vector r and its momentum vector p vary, as does the angle θ between
them. Nevertheless, the relation l = d_m p applies. You can use it by
measuring with a ruler the lengths of the lines labeled d_m and p in Fig. 9-15 and
then calculating their product l. You will find the same value for each of the
three r and p pairs in the figure, within the accuracy that can be expected
of the technique. Thus you will show, within this accuracy, that the angular
momentum of the puck about an origin is constant if the puck is acted on
by a central force directed toward that origin. (The accuracy is only of the
order of 10 percent. The source of error is the underestimate of l resulting
from the use of the relation l = d_m p, with d_m and p constructed by
employing consecutive puck positions for the finite strobe flash time interval
in Fig. 9-15. See the parenthetical remark made about Fig. 9-14. The
underestimate is quite small for the construction in Fig. 9-15 where the puck
is farthest from O, but the underestimate is more than 10 percent for the
other two constructions.)

Angular momentum is particularly important in systems containing
particles at either of the two extreme ends of the scale of size. At the large
end is a satellite or a planet. This “particle” is acted on by a gravitational
force, which is a central force always directed toward a certain point in a
reference frame that can be considered to be inertial. The angular
momentum of the particle about an origin at that point is constant. For
instance, Halley’s comet is observed to move through its highly elliptical orbit
with constant angular momentum about a certain origin. This origin is at
the sun, the point toward which the gravitational force exerted on the
comet is always directed.

Fig. 9-15 A puck moving nonuniformly in a noncircular
orbit on an air table under the influence of
a central force. The puck maintains a constant angular
momentum l = r × p about an origin O at the force
center.

At  the  small  end  of  the  size  scale,  the  same  situation  is  found  for  an 
electron  in  a hydrogen  atom.  The  dominant  force  acting  on  the  electron  is 
an  electric  force  that  is  always  directed  toward  the  nucleus  of  the  atom, 
which  can  be  taken  as  fixed  in  an  inertial  frame.  Experiment  shows  that  the 
electron  maintains  a constant  angular  momentum  about  an  origin  at  the 
nucleus.  In  treating  all  these  systems,  it  can  be  much  more  productive  to 
deal  with  the  constant  angular  momentum  of  the  particle  than  to  deal  with 
its  varying  momentum  — for  the  same  reason  that  it  can  be  very  useful  to 
make  calculations  involving  the  total  mechanical  energy  in  a system  where 
it  is  constant. 

This section has presented strobe photos and mentioned other
experimental evidence which leads to the conclusion that a particle’s angular
momentum about the origin of some inertial frame is constant if no net force
acts on it, or if the net force acting on the particle is a central force and the
origin is taken at the force center. In Sec. 9-4 it is shown that this
conclusion is not really a new law of physics. It can be obtained by combining
the definition of angular momentum, l = r × p, with Newton’s second law
of motion, F = dp/dt. But before we get into these matters, the conclusion
is used in Example 9-4 to solve a problem that would be very difficult to
solve without it.

EXAMPLE  9-4 

Figure 9-16 shows a student projecting a puck of mass 500 g into a circular orbit of
radius 30.0 cm on an air table. The orbital period (the time required for one trip
around the orbit) is 2.00 s. The necessary tension in the string is supplied by a
second student crouched beneath the table, who applies a force to the lower end of the
string.

a. Evaluate the magnitude of the angular momentum of the puck.

b. After a while, the second student increases the force exerted on the string,
pulling it until the puck goes into a circular orbit of radius 15.0 cm. What happens to
the angular momentum of the puck? What happens to its orbital period?

Fig. 9-16 Experiment considered in Example 9-4.

■ a. In the initial circular orbit, the puck’s position vector r relative to an origin at
the center is always perpendicular to its momentum vector p. So its angular
momentum has magnitude

$$ l = rp = rmv $$

where r is the orbit radius, m is the mass of the puck, and v is its speed. Since

$$ v = \frac{2\pi r}{t} $$

with t being the orbital period, you have

$$ l = \frac{2\pi r^2 m}{t} \tag{9-26} $$

For the values given,

$$ l = \frac{2\pi \times (0.300\ \text{m})^2 \times 0.500\ \text{kg}}{2.00\ \text{s}} = 0.141\ \text{m}^2 \cdot \text{kg/s} $$

b. Because the string exerts a central force on the puck, the puck’s angular
momentum is constant, even when the radius of its orbit is reduced from 30.0 cm to
15.0 cm. This allows you to predict the period of the new orbit. Writing Eq. (9-26) as

$$ t = \frac{2\pi m}{l} r^2 $$

and noting that 2πm/l does not change, you can see that when r is one-half as large,
t will be one-quarter as large. So the period of the new orbit will be 2.00 s/4 =
0.500 s.

Finding  this  answer  by  using  angular  momentum  was  a simple  task.  It  is  a very 
different  matter  if  you  use  momentum  and  force.  Try  it! 
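
The arithmetic of Example 9-4 can be reproduced in a few lines. This is a sketch only; the function names are hypothetical, and the inputs are the values stated in the example, converted to SI units.

```python
import math


def angular_momentum_circular(radius, mass, period):
    """l = 2*pi*r^2*m / t for a uniform circular orbit (Eq. 9-26)."""
    return 2.0 * math.pi * radius ** 2 * mass / period


def period_for_radius(radius, mass, l):
    """Eq. (9-26) solved for the period: t = (2*pi*m / l) * r^2."""
    return 2.0 * math.pi * mass * radius ** 2 / l


# Part a: initial orbit with r = 0.300 m, m = 0.500 kg, t = 2.00 s.
l = angular_momentum_circular(0.300, 0.500, 2.00)

# Part b: l is unchanged when the radius is reduced to 0.150 m.
t_new = period_for_radius(0.150, 0.500, l)

print(f"l = {l:.3f} m^2*kg/s, new period = {t_new:.3f} s")
```

Because the period scales as r² at fixed l and m, halving the radius divides the period by four, in agreement with part b.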


9-4 TORQUE

Why is a particle’s angular momentum l about the origin O of an inertial
reference frame constant if no net force acts on the particle, or if the net
force acting on it is always directed toward that origin? We can find out by
starting from the definition of angular momentum, l = r × p, where r is
the position of the particle relative to O and p is its momentum. To
investigate the circumstances in which this quantity does not change in time, we
evaluate its time derivative.


    $\frac{d\mathbf{l}}{dt} = \frac{d(\mathbf{r} \times \mathbf{p})}{dt}$

Using the rule for finding the derivative of a cross product, we have

    $\frac{d\mathbf{l}}{dt} = \frac{d\mathbf{r}}{dt} \times \mathbf{p} + \mathbf{r} \times \frac{d\mathbf{p}}{dt}$    (9-27)

For the moment, let us consider only the first term on the right side of this equation. Since dr/dt = v, where v is the velocity of the particle, and since p = mv, where m is its mass, the term can be written

    $\frac{d\mathbf{r}}{dt} \times \mathbf{p} = \mathbf{v} \times m\mathbf{v}$



Fig. 9-17 An attractive central force, F = −F r̂, acting on a particle of mass m that is viewed from an inertial frame whose origin O is at the force center.


But v × mv = 0 always, because the vector v is always parallel to the vector mv. Therefore, we have

    $\frac{d\mathbf{r}}{dt} \times \mathbf{p} = 0$

and so Eq. (9-27) simplifies to

    $\frac{d\mathbf{l}}{dt} = \mathbf{r} \times \frac{d\mathbf{p}}{dt}$

Finally, we invoke Newton's second law to write dp/dt = F, where F is the net force acting on the particle. We can do this because we use the origin of an inertial frame for O. The result is

    $\frac{d\mathbf{l}}{dt} = \mathbf{r} \times \mathbf{F}$    (9-28)

Equation (9-28) provides the answer to the question posed. Consider first the zero-force case. When no net force acts on a particle, r × F = 0 because F = 0. Then Eq. (9-28) says dl/dt = 0, or l = constant. This result holds for any choice of the inertial frame origin O, since no choice has been specified.

Next consider the central-force case. Here there is a net force F acting on the particle, but it is always directed toward a certain point in the inertial frame, and that point has been chosen for the origin O. As is illustrated in Fig. 9-17, the force can be expressed in terms of a unit vector r̂ that is in the direction of the particle's position vector r. The expression is

    $\mathbf{F} = -F\hat{\mathbf{r}}$    (9-29)

Here F is the magnitude of the attractive central force, and it may or may not be a constant. Evaluating the right side of Eq. (9-28) with this F, we have

    $\mathbf{r} \times \mathbf{F} = \mathbf{r} \times (-F\hat{\mathbf{r}}) = -F\,\mathbf{r} \times \hat{\mathbf{r}}$

But r × r̂ = 0 always, because the vector r is always parallel to its own unit vector r̂. Hence

    $\mathbf{r} \times \mathbf{F} = 0$

and so Eq. (9-28) again says dl/dt = 0, or l = constant.
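The vanishing of r × F for a force of the form of Eq. (9-29) can also be seen numerically. The sketch below is a minimal illustration, not part of the original argument: the position vector and force magnitude are arbitrary made-up values, and `cross` and `central_force` are helper names introduced here.

```python
import math

def cross(a, b):
    """Cross product of two 3-component vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def central_force(r, F_mag):
    """Attractive central force of Eq. (9-29): F = -F_mag * r_hat."""
    length = math.sqrt(sum(c * c for c in r))
    return tuple(-F_mag * c / length for c in r)

# Hypothetical position (m) and force magnitude (N), chosen only for illustration.
r = (0.3, -1.2, 0.5)
F = central_force(r, 4.0)

# Right side of Eq. (9-28): zero, apart from floating-point rounding.
rate_of_change_of_l = cross(r, F)
print(rate_of_change_of_l)
```

Any other choice of r or of the force magnitude gives the same null result, since F is always antiparallel to r.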

The  force  expressed  by  Eq.  (9-29)  is  called  an  attractive  central  force 
because  its  effect  is  to  attract  the  particle  on  which  it  acts  toward  the  force 
center.  The  force  exerted  by  the  string  on  the  air  table  puck  is  of  this  form, 
as are the gravitational force exerted on a planet or a satellite and the dominant electric force exerted on an electron in an atom. In Chap. 20 we will
be  concerned  with  a repulsive  central  force,  whose  effect  is  to  repel  the 
particle  on  which  it  acts  from  the  force  center.  Such  a force  is  the  electric 
force  exerted  on  an  alpha  particle  as  it  is  scattered  by  an  atomic  nucleus. 
How would you express a repulsive central force? Can you use the expression to prove that the particle on which it acts must maintain a constant
angular  momentum  about  an  origin  at  the  force  center? 

If a particle is moving under the influence of a net force which is not a central force (or if the force is central but for some reason the origin O of the inertial frame used to observe the particle has not been chosen at the force center), then the right side of Eq. (9-28),

    $\mathbf{r} \times \mathbf{F}$

will not be zero. In such a case the particle's angular momentum l about O will have a nonzero rate of change dl/dt, and l will change in time. It is said that the particle's angular momentum about O varies because the force acting on it produces a torque about O. To be specific, the torque T about an origin O produced by a force F applied at a point whose position with respect to O is r has, by definition, the value

    $\mathbf{T} \equiv \mathbf{r} \times \mathbf{F}$    (9-30)

The torque vector about an origin O is the cross product of the position vector of the point of application of the force and the force vector. The magnitude of the torque is T = |r × F| = r sin θ F, where θ is the angle between r and F. Units used to measure T are meter-newtons (m·N). The direction of the torque is given by the right-hand rule for cross products.

In common language the word "twist" is frequently used to convey the same idea as the technical word "torque." (In fact, torque comes from the Latin word torquere, to twist.) Either implies the act of imparting rotation. Some examples of torque are illustrated in Fig. 9-18. Observe that the sense of the rotational effect produced by a torque about O is in agreement with the two right-hand rules. The magnitude of the torque is

    $T = |\mathbf{r} \times \mathbf{F}| = r \sin\theta\, F$

The caption to Fig. 9-18c shows that this can be written in the form

T = daF  (9-31) 

where da is the perpendicular distance from O to an extension of the line of action of the force. For a force of given magnitude F, maximum torque is
achieved  by  maximizing  da. 
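The three ways of writing the torque magnitude (the cross product of Eq. (9-30), the form r sin θ F, and the moment-arm form of Eq. (9-31)) can be compared in a short Python sketch. The door-like numbers below are invented for illustration, and the variable `d_perp` stands for the perpendicular distance that the text calls da.

```python
import math

def cross(a, b):
    """Cross product of two 3-component vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

# Hypothetical values: r from the origin O to the point of application (m),
# and a force F (N) that is not perpendicular to r.
r = (0.80, 0.0, 0.0)
F = (3.0, 4.0, 0.0)

# Eq. (9-30): torque vector; here it points along +z by the right-hand rule.
T = cross(r, F)
T_mag = norm(T)

# Same magnitude from r sin(theta) F, with theta the angle between r and F.
theta = math.acos(dot(r, F) / (norm(r) * norm(F)))
T_from_angle = norm(r) * math.sin(theta) * norm(F)

# Eq. (9-31): perpendicular distance from O to the line of action, times F.
d_perp = norm(r) * math.sin(theta)
T_from_arm = d_perp * norm(F)

print(T, T_mag, T_from_angle, T_from_arm)
```

All three magnitudes agree; rotating F until it is perpendicular to r would raise `d_perp` to its largest possible value, the full length of r.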

In terms of the net torque acting on a particle when it experiences the net force occurring in Eq. (9-28), that equation can be written

    $\mathbf{T} = \frac{d\mathbf{l}}{dt}$    (9-32)

In words, net torque equals rate of change of angular momentum. This is the rotational form of Newton's second law. The relation is as basic to rotational mechanics as the relation "net force equals rate of change of momentum" is to translational mechanics. We have obtained the relation by considering the motion of a single particle viewed from an inertial frame. But soon we will see that a very similar relation applies to a system containing many particles.
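Equation (9-32) can be tested numerically for a case in which the torque is not zero. The sketch below follows a particle in uniform gravity (an assumed, hypothetical situation with made-up initial conditions), computes l = r × p at two nearby times, and compares the finite-difference rate of change with r × F.

```python
def cross(a, b):
    """Cross product of two 3-component vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

# Hypothetical particle: mass (kg), gravitational acceleration (m/s^2),
# initial position (m), and initial velocity (m/s).
m = 2.0
g = (0.0, -9.80, 0.0)
r0 = (1.0, 2.0, 0.0)
v0 = (3.0, 4.0, 0.0)

def state(t):
    """Position and velocity at time t for constant acceleration g."""
    r = tuple(r0[i] + v0[i] * t + 0.5 * g[i] * t * t for i in range(3))
    v = tuple(v0[i] + g[i] * t for i in range(3))
    return r, v

def angular_momentum(t):
    """l = r x p about the origin O."""
    r, v = state(t)
    return cross(r, tuple(m * c for c in v))

def torque(t):
    """T = r x F about the origin O, with F = m g."""
    r, _ = state(t)
    return cross(r, tuple(m * c for c in g))

t = 0.7     # instant at which the comparison is made, s
h = 1e-5    # small time step for the central difference, s
l_plus, l_minus = angular_momentum(t + h), angular_momentum(t - h)
dl_dt = tuple((a - b) / (2.0 * h) for a, b in zip(l_plus, l_minus))
T = torque(t)

print("dl/dt =", dl_dt)
print("T     =", T)
```

The two printed vectors agree to within rounding error, as Eq. (9-32) requires.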

There is a striking analogy between the rotational form of Newton's second law, Eq. (9-32), and its translational form,

    $\mathbf{F} = \frac{d\mathbf{p}}{dt}$

The momentum p measures the inertial tendency of a particle to maintain its translational motion. If this measure of the translational motion changes, we say the reason is that a net force F acts on it. The value of F is equal to the rate of change of p. Similarly, the angular momentum l measures the inertial tendency of a particle to maintain its rotational motion about an origin. If this measure of the rotational motion changes, we say the reason is that a net torque T about the origin acts on the particle. And the value of T is equal to the rate of change of l. Of course, the analogy is not accidental; we defined torque and angular momentum so as to obtain it!

Fig. 9-18 (a) To set a heavy door into rotation, you must produce a torque T = r × F about a point O on the hinge axis, directed parallel to that axis. Applying a force F of a given magnitude leads to a torque in the required direction that has the largest magnitude when the vector r from O to the point of application of the force has a maximum magnitude and a direction perpendicular to the direction of the force. (b) If the magnitude of the vector r from O to the point of application of the force F is reduced, the magnitude of the torque T is reduced. (c) Also, if the direction of the applied force F is changed, so that it is no longer perpendicular to the direction of the vector r from O to the point of application, the magnitude of the torque T is reduced. In all cases, the magnitude of the torque is given by T = |r × F| = r sin θ F. Since sin(π − θ) = sin θ, this can be written T = r sin(π − θ) F. The figure shows that r sin(π − θ) = da. The quantity da is the perpendicular distance from O to a line passing through the point of application of the force and extending in its direction, called the line of action of the force. So the magnitude of the torque is T = daF. The three parts of this figure give three examples of the proportionality of T to da. Does this agree with your intuition? Do you agree, on an intuitive basis, that T is proportional to F?

From the point of view of Eq. (9-32), the reason why the angular momentum about an inertial frame origin O does not change for a particle acted on by no net force, or by a force directed to or from O, is that in both cases no net torque is exerted about O. Example 9-5 applies Eq. (9-32) to investigate the motion of a particle in a situation in which the force acting on the particle does exert a torque about an origin O, and consequently changes its angular momentum about that origin.

EXAMPLE 9-5

a.  The  air  table  puck  of  mass  500  g in  Example  9-4  is  traveling  in  the  circular 
orbit of radius 30.0 cm with period 2.00 s. Suddenly the power fails, the pump supplying air to the table stops, and the puck comes into direct contact with the tabletop. The coefficient of kinetic friction between the plastic puck and the aluminum tabletop is 0.13. How much time elapses from the moment when the puck contacts the tabletop to the moment when it comes to rest?

■ First you should make a sketch, like Fig. 9-19, showing the puck at some point in its orbit while it is moving in contact with the tabletop. The forces acting on the puck of mass m are the force mg exerted in the downward direction by gravity, the supporting force N exerted by the tabletop in the upward direction normal to the surface, the contact kinetic friction force C_k exerted by the tabletop in the direction parallel to the surface and opposite to its direction of motion (the direction of its momentum p), and the force S exerted by the string in the inward radial direction to keep the puck in orbit.

Fig. 9-19 Vectors associated with the motion of the air table puck considered in Example 9-5.

Fig. 9-20 The relation between the angular momentum l of the puck in Example 9-5 and the change dl in the angular momentum occurring during the time interval dt.

The torque exerted about O by the force N is r × N, where r is the position vector of the puck relative to the origin O. Similarly, the torque exerted about O by the force mg is r × mg. The sum of these torques is r × N + r × mg = r × (N + mg) = 0. This is true because the puck does not accelerate in the vertical direction, and so N + mg, which is the net vertical force it feels, must be zero.

The torque about O produced by the force S is r × S = r × (−S r̂) = −S r × r̂ = 0, where r̂ is a unit vector in the direction of r. In words, the force that the string exerts on the puck is a central force and so produces no torque about the origin located at the force center.

Consequently, the net torque about O acting on the puck is the torque produced by the force C_k. This torque is

    $\mathbf{T} = \mathbf{r} \times \mathbf{C}_k$

The right-hand rule for cross products shows you that the direction of T is downward, as indicated in the figure. Its magnitude is

    $T = |\mathbf{r} \times \mathbf{C}_k| = r \sin\theta\, C_k = r C_k$

because the angle θ between r and C_k has the value π/2. According to Eq. (4-23), you have C_k = μ_k N, with μ_k being the coefficient of kinetic friction. Since N = mg, this can be written C_k = μ_k mg, and so you have

    $T = r \mu_k m g$

The angular momentum of the puck about O is

    $\mathbf{l} = \mathbf{r} \times \mathbf{p}$

where p is its momentum. The right-hand rule for cross products shows that l is in the upward direction, as the figure indicates.

Hence you find that the puck is acted on by a torque vector whose direction is opposite to that of its angular momentum vector. Since the torque is equal to the rate of change of the angular momentum, the fact that the torque is directed opposite to the angular momentum means that the angular momentum will have an ever-decreasing magnitude. You can see that this is the case by making a sketch, as in Fig. 9-20, showing the vector l at some instant and its change dl during the small time interval dt immediately following that instant. Since dl/dt = T, the vector dl is given by the expression dl = T dt. And since T is directed opposite to l and dt is positive, dl must be directed opposite to l, as shown in the sketch. The sketch makes it clear that the magnitude of l is decreasing.




The magnitude of T gives the rate at which the magnitude of l decreases, because T = dl/dt. The rate has the constant value r μ_k mg. If l decreases by this amount each second, in Δt seconds it will decrease by the amount Δl, where

    $\Delta l = r \mu_k m g\, \Delta t$

Solving for Δt, you obtain

    $\Delta t = \frac{\Delta l}{r \mu_k m g}$

When the puck comes to rest, the magnitude of its angular momentum has decreased from the value it had at the instant it first touched the tabletop to its final value, zero. The numerical result obtained in Example 9-4a shows you that the decrease is

    $\Delta l = 0.141\ \text{m}^2\cdot\text{kg/s}$

Using this, the well-known value of g, and the values quoted for r, μ_k, and m, you find that the time required for the puck to come to rest is

    $\Delta t = \frac{0.141\ \text{m}^2\cdot\text{kg/s}}{0.300\ \text{m} \times 0.13 \times 0.500\ \text{kg} \times 9.80\ \text{m/s}^2} = 0.74\ \text{s}$ ■

b. If the air pump stays on until after the student below the table pulls on the string to make the puck go into the orbit of radius 15.0 cm, and then the pump stops, how much time elapses from when the puck contacts the tabletop to when it comes to rest?

■ The puck has the same angular momentum about O in the orbit of 15.0 cm radius as it does in the orbit of 30.0 cm radius, because the string cannot exert a torque on the puck to change its angular momentum. Therefore the value of Δl is the same as in part a. The force of kinetic friction has the same value as in part a also. However, this force exerts only half as much torque about O on the puck because the radius r of its orbit is only half as large as in part a. This means the decrease per second in l is half as much as in part a, and consequently it takes twice as much time for the decrease to take place. Hence in this case you have

    $\Delta t = 2 \times 0.74\ \text{s} = 1.5\ \text{s}$
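Both parts of Example 9-5 reduce to one formula, Δt = Δl/(r μ_k m g), so they are easy to check in Python. The sketch below assumes, as the example does, that the frictional torque is constant while the puck slows; `stopping_time` is an illustrative name, not one used in the text.

```python
import math

def stopping_time(delta_l, r, mu_k, m, g=9.80):
    """Time for a constant frictional torque r*mu_k*m*g to remove delta_l."""
    return delta_l / (r * mu_k * m * g)

m = 0.500      # mass of the puck, kg
mu_k = 0.13    # coefficient of kinetic friction

# Angular momentum from Example 9-4, Eq. (9-26); the same for both orbits.
l0 = 2.0 * math.pi * 0.300**2 * m / 2.00

t_a = stopping_time(l0, 0.300, mu_k, m)   # part a: orbit radius 30.0 cm
t_b = stopping_time(l0, 0.150, mu_k, m)   # part b: orbit radius 15.0 cm

print(f"part a: {t_a:.2f} s")
print(f"part b: {t_b:.1f} s")
```

Halving the radius halves the torque while leaving Δl unchanged, so the second time is twice the first.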


9-5  ROTATION OF SYSTEMS AND ANGULAR MOMENTUM CONSERVATION


The next step in our study of rotation is to broaden the scope of the equation governing rotational motion so that it applies to a system comprising many particles. This will allow us to treat the rotation of a body which cannot be considered to be a single particle. It also will lead us to the very important law of the conservation of angular momentum of an isolated system.

Consider a system containing n ideal particles, viewed from an inertial reference frame whose origin is O. As shown in Fig. 9-21, the position of the first particle relative to O is specified by the position vector r_1. Its momentum vector is p_1. Subscripts ranging from 2 to n are used to identify the other particles of the system and to label their positions and momenta. The letter j is used to represent a typical subscript. Thus the position and momentum of a typical particle are r_j and p_j. According to the definition of Eq. (9-24), the jth particle's angular momentum about O is

    $\mathbf{l}_j = \mathbf{r}_j \times \mathbf{p}_j$




Fig. 9-21 A rotating body comprising a set of n particles labeled by the subscripts 1, 2, 3, . . . , j, . . . , n. The momentum of a typical particle is given by the vector p_j, and its position with respect to the origin O is given by the vector r_j.


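We define the system's total angular momentum as the sum of the angular momenta of its constituent particles. Thus the total angular momentum L about O is given by the vector sum

    $\mathbf{L} = \sum_{j=1}^{n} \mathbf{l}_j$    (9-33)

Evaluating l_j, we have

    $\mathbf{L} = \sum_{j=1}^{n} \mathbf{r}_j \times \mathbf{p}_j$

To develop the equation governing the rotation of the system as a whole, we proceed in exactly the same way as in developing Eq. (9-32), the equation governing the rotation of a single particle. That is, we calculate

    $\frac{d\mathbf{L}}{dt} = \frac{d}{dt} \sum_{j=1}^{n} \mathbf{r}_j \times \mathbf{p}_j$

Since the derivative of a sum of terms equals the sum of their derivatives, this is

    $\frac{d\mathbf{L}}{dt} = \sum_{j=1}^{n} \frac{d}{dt} (\mathbf{r}_j \times \mathbf{p}_j)$

Evaluating the derivative of the cross product, we have

    $\frac{d\mathbf{L}}{dt} = \sum_{j=1}^{n} \frac{d\mathbf{r}_j}{dt} \times \mathbf{p}_j + \sum_{j=1}^{n} \mathbf{r}_j \times \frac{d\mathbf{p}_j}{dt}$    (9-34)

Now each of the terms in the first sum has the value

    $\frac{d\mathbf{r}_j}{dt} \times \mathbf{p}_j = \mathbf{v}_j \times m_j \mathbf{v}_j = 0$

So the first sum on the right side of Eq. (9-34) is zero. Thus we have

    $\frac{d\mathbf{L}}{dt} = \sum_{j=1}^{n} \mathbf{r}_j \times \frac{d\mathbf{p}_j}{dt}$

Since we are working in an inertial frame, we can employ Newton's second law to yield

    $\frac{d\mathbf{L}}{dt} = \sum_{j=1}^{n} \mathbf{r}_j \times \mathbf{F}_j$    (9-35)

where F_j is the net force acting on the jth particle of the system.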


We can achieve a tremendous simplification of Eq. (9-35) by realizing that the net force F_j felt by a typical particle arises from two distinct sources, external and internal. The first part of the net force comes from the force applied externally to the particle. This is the force acting on it from outside the system being considered, which we write as F_ext,j. The second part of the net force F_j experienced by a typical particle is the internally applied force F_int,j acting on it from inside the system. The F_int,j are the forces which the particles of the system exert on one another. As an example, consider a system containing only the particles of a child's top. Each particle of the system is acted on by an external force, the gravitational force which an object external to the system (the earth) exerts on the particle. The particle at the tip is also acted on by the external force exerted by the surface on which the top rests. The internal forces acting on the particles of the top are those which bind them rigidly to the other particles, so that the entire system remains a top. But a system containing only a single rigid body is not the only type of system for which a decomposition of total force into external and internal forces can be made. The decomposition can be done for any system. Thus the results we obtain apply as well to a system containing no rigid bodies, or several rigid bodies moving with respect to one another.

These considerations suggest that we write

    $\mathbf{F}_j = \mathbf{F}_{\text{ext},j} + \mathbf{F}_{\text{int},j}$

Doing so makes the right side of Eq. (9-35) break into two parts, to yield

    $\frac{d\mathbf{L}}{dt} = \sum_{j=1}^{n} \mathbf{r}_j \times \mathbf{F}_{\text{ext},j} + \sum_{j=1}^{n} \mathbf{r}_j \times \mathbf{F}_{\text{int},j}$    (9-36)

The simplification resulting from this decomposition is that the second summation on the right side of Eq. (9-36) has the value zero. Hence the rate of change of the total angular momentum of the system depends only on the forces applied externally to its particles.

To see this, we must realize that each of the terms

    $\mathbf{r}_j \times \mathbf{F}_{\text{int},j}$

in the second summation of Eq. (9-36) is itself the sum of a series of terms. This is because F_int,j is the vector sum of all the forces exerted on particle j by all the other particles. For instance, for particle 1

    $\mathbf{r}_1 \times \mathbf{F}_{\text{int},1} = \mathbf{r}_1 \times (\mathbf{F}_{\text{on 1 by 2}} + \mathbf{F}_{\text{on 1 by 3}} + \cdots + \mathbf{F}_{\text{on 1 by } n})$

There is a similar series for each of the particles 2, 3, . . . , n which comprise the system.

Now note that we can arrange the terms in the sum

    $\sum_{j=1}^{n} \mathbf{r}_j \times \mathbf{F}_{\text{int},j}$

by pairing each force exerted on particle j by particle k with the force exerted on particle k by particle j. For particles 1 and 2, as an example, the two terms associated with the pair of forces look like this:

    $\mathbf{r}_1 \times \mathbf{F}_{\text{on 1 by 2}} + \mathbf{r}_2 \times \mathbf{F}_{\text{on 2 by 1}}$

The two forces in this pair are illustrated in Fig. 9-22. They constitute (as does every similar pair) an equal but oppositely directed action-reaction pair, according to Newton's third law:

    $\mathbf{F}_{\text{on 1 by 2}} = -\mathbf{F}_{\text{on 2 by 1}}$

So we can write

    $\mathbf{r}_1 \times \mathbf{F}_{\text{on 1 by 2}} + \mathbf{r}_2 \times \mathbf{F}_{\text{on 2 by 1}} = \mathbf{r}_1 \times \mathbf{F}_{\text{on 1 by 2}} - \mathbf{r}_2 \times \mathbf{F}_{\text{on 1 by 2}}$
    $= (\mathbf{r}_1 - \mathbf{r}_2) \times \mathbf{F}_{\text{on 1 by 2}}$
    $= 0$

The reason for the zero also is illustrated in Fig. 9-22. The vector r_1 − r_2 is antiparallel to the vector F_on 1 by 2 if the forces are attractive (or parallel if they are repulsive). So the sine factor determining the magnitude of the cross product is sin π (or sin 0), and the cross product is therefore zero. This is the case because in the newtonian domain the forces two particles exert on each other always act along the line passing through the two points specifying their positions, as the figure indicates. We will have more to say about this property of forces soon.

Fig. 9-22 A pair of particles in a system and the action-reaction forces they exert on each other.

Since  each  of  the  pairs  of  the  sum  adds  to  zero  and  since  there  are  no 
leftover  terms,  the  entire  sum  adds  to  zero  also.  Hence  we  have 

    $\sum_{j=1}^{n} \mathbf{r}_j \times \mathbf{F}_{\text{int},j} = 0$    (9-37)

This  result  can  be  explained  by  saying  that  while  each  internal  force  of  an 
action-reaction  pair  produces  a torque  about  the  origin,  the  torque  is  exactly 
canceled  by  the  torque  produced  by  the  equal  but  oppositely  directed  other 
force  of  that  pair,  because  the  two  forces  have  a common  line  of  action. 
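The pairwise cancellation behind Eq. (9-37) can be illustrated with a small numerical experiment. In the sketch below, four particles at arbitrary made-up positions attract one another along the lines joining them; the inverse-square strength is an assumption chosen only for definiteness, and any force law directed along the line between the particles would serve. The helper names are introduced here and are not from the text.

```python
import itertools
import math

def cross(a, b):
    """Cross product of two 3-component vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

# Hypothetical particle positions relative to the origin O (m).
positions = [(0.5, 0.1, -0.3), (-0.7, 0.4, 0.2), (0.2, -0.9, 0.6), (1.1, 0.8, 0.0)]
n = len(positions)

# Accumulate the internal force on each particle, one action-reaction pair at a time.
F_int = [(0.0, 0.0, 0.0)] * n
for j, k in itertools.combinations(range(n), 2):
    separation = sub(positions[k], positions[j])       # points from j toward k
    distance = math.sqrt(sum(c * c for c in separation))
    strength = 1.0 / distance**2                       # assumed force law
    force_on_j = tuple(strength * c / distance for c in separation)
    force_on_k = tuple(-c for c in force_on_j)         # Newton's third law
    F_int[j] = add(F_int[j], force_on_j)
    F_int[k] = add(F_int[k], force_on_k)

# Left side of Eq. (9-37): sum over particles of r_j x F_int,j.
net_internal_torque = (0.0, 0.0, 0.0)
for r, F in zip(positions, F_int):
    net_internal_torque = add(net_internal_torque, cross(r, F))

print(net_internal_torque)
```

The individual torques r_j × F_int,j are not zero, but their sum vanishes to within rounding error, whatever origin the positions are measured from.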


Returning to Eq. (9-36), we see that it simplifies to

    $\frac{d\mathbf{L}}{dt} = \sum_{j=1}^{n} \mathbf{r}_j \times \mathbf{F}_{\text{ext},j}$

We now use the definition of torque in Eq. (9-30) to write

    $\mathbf{r}_j \times \mathbf{F}_{\text{ext},j} = \mathbf{T}_j$

Here T_j is the torque applied about the origin O to the jth particle by the external force acting on it. Then we have

    $\frac{d\mathbf{L}}{dt} = \sum_{j=1}^{n} \mathbf{T}_j$

Next, we define the net torque T about O exerted on the system as

    $\mathbf{T} = \sum_{j=1}^{n} \mathbf{T}_j$    (9-38)

[Of course, T is actually the net external torque, but we do not need to indicate this by using a subscript "ext" because Eq. (9-37) tells us that there is no net internal torque.] In terms of T, we have

    $\mathbf{T} = \frac{d\mathbf{L}}{dt}$    (9-39)


Net torque equals the rate of change of angular momentum. The angular momentum in the mathematical and verbal statements of the equation is the
total  angular  momentum  of  a system  of  particles.  The  system  is  viewed 
from  an  inertial  reference  frame,  and  the  torque  and  angular  momentum 
are  both  taken  about  its  origin.  This  equation  is  the  rotational  form  of 
Newton's  second  law.  It  is  more  general  than  Eq.  (9-32)  in  that  it  applies 
to  any  system,  and  not  only  to  a system  containing  a single  particle. 


We  use  Eq.  (9-39)  to  determine  how  the  angular  momentum  of  a 
system changes when there is a net torque applied to it. But first let us consider the important special case of a system isolated from its environment in
such  a way  that  no  net  torque  can  be  applied.  We  view  the  system  from  an 
inertial  frame,  as  assumed  in  obtaining  Eq.  (9-39).  For  such  a system  the 
equation  states  that  dL/dt  = 0,  so  that  L must  be  constant.  This  is  the  law 
of  conservation  of  angular  momentum:  As  seen  by  an  observer  in  an  inertial 
frame,  a system  to  which  no  net  torque  is  applied  maintains  a constant  total  angular 
momentum. 




Fig. 9-23 (a) A lecturer sitting on a swiveling stool with low-friction bearings, while holding a spinning bicycle wheel by its axle with the axle horizontal. (b) As the lecturer turns the axle so that the angular momentum vector of the wheel points upward, his body begins to rotate so that its angular momentum vector points downward. (c) By turning the axle so as to reverse the direction of the wheel's angular momentum vector, the lecturer can reverse the direction of his body's angular momentum vector.


The  law  of  conservation  of  angular  momentum  is  a fundamental  law  of 
physics.  It  is  on  a completely  equal  footing  with  the  law  of  conservation  of 
momentum.  Each  of  these  basic  conservation  laws  is  founded  firmly  on 
experimental evidence. In Sec. 4-3 we obtained the momentum conservation law directly from experimental observation. Here also we obtained the angular momentum conservation law by using an experimental observation, namely that in the newtonian domain the forces which a pair of particles exert on each other always act along the line passing through the particles. Another way of describing the situation is to state that in newtonian
mechanics  the  force  which  particle  1 exerts  on  particle  2 is  a central  force 
since  it  is  always  directed  toward  or  from  a point,  lying  somewhere  along 
the  line  between  the  particles,  that  can  be  taken  as  fixed  in  a suitably  chosen 
inertial  frame.  And  similarly,  the  reaction  force  which  particle  2 exerts  on 
particle  1 is  a central  force  acting  toward  or  away  from  the  same  force 
center.  There  is  experimental  evidence  for  this  statement.  But  there  is  even 
more  direct  experimental  evidence  for  the  conclusion  reached  by  using  the 
statement,  that  is,  for  the  law  of  conservation  of  angular  momentum. 

An example of this evidence is pictured in Fig. 9-23a. A lecturer seats himself on a stool that can rotate freely about a vertical axis, while holding a bicycle wheel that is spinning about a horizontal axis so that the angular momentum vector of the wheel is horizontal. The tire is filled with sand, causing the wheel to have an appreciable amount of angular momentum when it spins. In Fig. 9-23b the lecturer turns the rotation axis of the wheel so that its angular momentum vector is upward. As he does this, he starts rotating on the stool, with his body's angular momentum vector downward. The lecturer is demonstrating conservation of the vertical component of the total angular momentum of the system consisting of the wheel plus his body and the seat of the stool. The system is isolated by the smooth bearings of the stool from the application of external torques with vertical components. Since the total vertical angular momentum of the system is initially zero, it must remain zero. In Fig. 9-23c, the lecturer turns the bicycle wheel so that its angular momentum vector points downward, which causes his body to reverse its sense of rotation in order to reverse the direction of its angular momentum. If you can carry out this demonstration yourself, you will literally get a seat-of-the-pants feeling for angular momentum conservation.

There is a tremendous variety of direct experimental evidence demonstrating that the total angular momentum of an isolated system is conserved. Observation shows this to be true of the solar system, as an example. In fact, the most fundamental view is to take the angular momentum conservation law to be based on direct observation, just as the momentum conservation law was justified in Sec. 4-3 on the basis of direct observation. From
this  point  of  view,  the  calculation  leading  to  Eq.  (9-39),  and  its  special  case 
L = constant  for  T = 0,  suggests  that  in  the  newtonian  domain  the  forces 
acting  between  a pair  of  particles  are  central  forces. 

When  you  come  to  the  study  of  electromagnetism,  you  will  learn  that  there 
are  situations  outside  the  newtonian  domain  where  noncentral  forces  arise.  For  in- 
stance, the  forces  acting  between  a pair  of  charged  particles,  such  as  a pair  of  elec- 
trons, depart  significantly  from  being  central  forces  if  the  particles  are  moving  rel- 
ative to  each  other  at  a speed  comparable  to  the  speed  of  light.  The  situation  is  il- 
lustrated in  Fig.  9-24  from  the  point  of  view  of  an  observer  stationed  at  an  origin  of 
an  inertial  frame  which  remains  midway  between  the  two  electrons.  These  forces 
cause  a net  torque  about  O to  act  on  the  electrons,  and  their  angular  momentum 


9-5  Rotation  of  Systems  and  Angular  Momentum  Conservation  367 


Fig.  9-24  Two  electrons  moving  past 
each  other  at  a speed  comparable  to  the 
speed  of  light  exert  forces  on  each  other 
which  are  not  central  forces.  But  this 
situation  does  not  lead  to  a violation  of 
the  law  of  conservation  of  angular 
momentum,  for  reasons  explained 
in  the  text. 


EXAMPLE  9-6 


Fig.  9-25  Experiment  analyzed  in 
Example  9-6. 


does  not  remain  constant.  It  would  appear  that  this  is  a violation  of  the  law  of 
angular  momentum  conservation.  But  actually  it  is  not  because  there  is  something 
in  the  isolated  system  not  shown  in  Fig.  9-24.  This  is  the  so-called  electromagnetic 
field,  which  also  contains  angular  momentum.  When  the  complete  electron- 
pair-plus-field  system  is  considered,  its  total  angular  momentum  is  found  to  be 
conserved  because  every  change  in  the  angular  momentum  of  the  electron  pair  is 
exactly  compensated  for  by  a change  in  the  angular  momentum  of  the  field. 

Thus even for microscopic systems containing particles moving at relativistic
speeds, the total angular momentum is conserved, provided that the system is
isolated, all its parts are taken into account, and it is viewed from an inertial frame.
The uranium atom is an example. Its inner electrons move at speeds quite close to
the speed of light, exchanging angular momentum between themselves and the
electromagnetic field in the atom in a variety of complex processes. But
measurement shows that the total angular momentum of the system remains constant.

The  outer  electrons  of  an  atom  are  involved  in  the  forces  that  bind  atoms  into 
molecules  and  molecules  into  solids.  They  move  at  speeds  which  are  very  small 
compared to the speed of light. As a consequence, the internal forces playing a
significant role in keeping a rigid body rigid are central forces. The same is true of all
the  other  forces  of  interest  in  newtonian  mechanics. 


Example  9-6  applies  the  law  of  angular  momentum  conservation. 


Figure  9-25  shows  the  loaded  bicycle  wheel  with  its  axle  secured  firmly  in  a vertical 
orientation  and  initially  not  rotating.  A small-caliber  rifle  fires  a bullet  tangentially 
into  the  sand-filled  tire,  where  it  embeds  itself.  The  mass  of  the  bullet  is  m = 8.1  g, 
and  its  speed  is  v = 370  m/s.  The  mass  of  the  sand,  the  tire,  and  the  rim  on  which  it 
is  mounted  is  M = 5.2  kg.  The  radius  to  the  center  of  the  narrow  tire  is  r = 33  cm. 
Determine the angular speed $\omega$ of the wheel after the impact.

■ Before using angular momentum conservation in the system comprising the
bullet plus the wheel to solve this problem, you should note that it is the only
conservation law applicable to the system. Mechanical energy is not conserved because
the bullet undergoes a sequence of inelastic collisions when it embeds itself in the
sand. Momentum is not conserved because the structure supporting the axle exerts
forces on it, particularly during the impact, so the system is not isolated from
external forces, and its total momentum changes. You can see that this is true by
considering the total momentum of the system before impact and at several times after
impact, using the fact that the symmetrical wheel itself has zero total momentum
whether it is rotating or not.

But  the  forces  exerted  on  the  axle  by  the  support  do  not  exert  a torque  about 
an  origin  O at  the  intersection  of  the  axle  and  the  plane  of  the  wheel.  Thus  the  total 
angular  momentum  about  O of  the  system  maintains  a constant  magnitude  and  a 
constant  direction,  the  direction  being  vertically  upward  in  Fig.  9-25. 

You can obtain a simple but sufficiently accurate expression for the final
angular momentum magnitude L in terms of the angular speed $\omega$ by assuming that all
the mass M of the sand, tire, and rim of the wheel is rotating in a circle of the same
radius r, with the same speed $v = \omega r$. Then each element $m_j$ of this mass makes a
contribution $\mathbf{L}_j$ to the total angular momentum vector $\mathbf{L}$. All the $\mathbf{L}_j$ are parallel to
the axle, and each has magnitude

$$L_j = r m_j v = r m_j \omega r = r^2 \omega m_j$$

Summing over the mass elements immediately gives

$$L = \sum_{j=1}^{n} L_j = \sum_{j=1}^{n} r^2 \omega m_j = r^2 \omega \sum_{j=1}^{n} m_j = r^2 \omega M$$




EXAMPLE  9- 


An  Atwood  machine. 


The  justification  for  ignoring  the  angular  momentum  of  the  spokes  is  that  their 
mass  is  small  compared  to  M and  that  much  of  this  mass  is  rotating  at  a distance 
from  the  axis  that  is  small  compared  to  r.  For  the  same  two  reasons,  and  particularly 
for  the  latter,  you  can  ignore  the  angular  momentum  of  the  hub  of  the  wheel.  And 
the angular momentum of the bullet embedded in the rotating wheel can be
ignored in evaluating the final total angular momentum magnitude L because its mass
is  very  small  compared  to  M. 

The bullet contains all the angular momentum of the system before it hits the
wheel. The bullet of mass m is moving at the high speed v along a line whose
perpendicular distance from O is r. According to Eq. (9-25), the magnitude of its angular
momentum is this perpendicular distance times the magnitude of its momentum
mv. So the bullet has angular momentum of magnitude rmv. Equating the initial
angular momentum to the final angular momentum $r^2 \omega M$, you have

$$r^2 \omega M = r m v$$

or

$$\omega = \frac{mv}{Mr}$$

The numerical value is

$$\omega = \frac{8.1 \times 10^{-3}\ \text{kg} \times 3.7 \times 10^{2}\ \text{m/s}}{5.2\ \text{kg} \times 3.3 \times 10^{-1}\ \text{m}} = 1.7\ \text{s}^{-1}$$

So, after the impact, the wheel rotates at 1.7 rad/s (or about 0.28 rotation per second).
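The arithmetic of Example 9-6 can be checked with a short Python sketch. The function name and its `include_bullet` option are illustrative choices, not part of the text. The option adds the embedded bullet's own contribution to the final angular momentum, which the example neglects, so that you can see how little it matters.

```python
import math

def wheel_angular_speed(m_bullet, v_bullet, m_rim, radius, include_bullet=False):
    """Angular speed (rad/s) from r*m*v = r**2 * omega * M.

    If include_bullet is True, the embedded bullet is counted as part of
    the rotating mass at radius r (an extension of the example, which
    neglects it because m is very small compared to M).
    """
    rotating_mass = m_rim + (m_bullet if include_bullet else 0.0)
    return (m_bullet * v_bullet) / (rotating_mass * radius)

# Values from Example 9-6, converted to SI units.
omega = wheel_angular_speed(8.1e-3, 370.0, 5.2, 0.33)
omega_with_bullet = wheel_angular_speed(8.1e-3, 370.0, 5.2, 0.33, include_bullet=True)
rotations_per_second = omega / (2 * math.pi)

print(f"omega = {omega:.2f} rad/s")
print(f"omega with bullet mass = {omega_with_bullet:.2f} rad/s")
print(f"rotations per second = {rotations_per_second:.2f}")
```

To two significant figures both versions give the same angular speed, which supports the approximation made in the example.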


Example 9-7 treats a system whose angular momentum is not conserved
because a net torque is acting on it.


The Atwood machine of Fig. 9-26 contains a pulley whose mass cannot be
neglected. The pulley consists of a substantial rim, of mass M and radius r,
supported by spokes of negligible mass. Compact bodies 1 and 2 hang from each end of
a cord of negligible mass, which passes over the pulley. Their masses are $m_1$ and $m_2$,
with $m_1 > m_2$. Find the magnitude of the downward acceleration of body 1.

■ Choose the center of the pulley as the origin O, and then evaluate the
magnitude of the torque exerted on the pulley-plus-bodies-plus-string system by the
gravitational force acting on body 1. It is

$$T_1 = r m_1 g$$

The direction of this torque is outward. The torque exerted by the gravitational
force on body 2 is of magnitude

$$T_2 = r m_2 g$$

and is directed inward (that is, normal to the page and away from you as you view
the page). Since $m_1 > m_2$, the net torque is outward and has magnitude

$$T = T_1 - T_2 = r(m_1 - m_2)g$$

In this example the net torque applied to the system is not zero.

When the pulley is rotating with an angular velocity that is directed outward
and is of magnitude $\omega$, the speed of body 1 is $\omega r$ and its angular momentum has
magnitude $r m_1 \omega r = r^2 \omega m_1$. The direction of this angular momentum is outward.
The angular momentum of body 2 is in the same direction and has the magnitude
$r^2 \omega m_2$. The angular momentum of the pulley is also outward. As argued in Example



9-6, its magnitude is $r^2 \omega M$. So the magnitude L of the total angular momentum of
the system will be the sum of the angular momenta of bodies 1 and 2 and of the
pulley:

$$L = r^2 \omega m_1 + r^2 \omega m_2 + r^2 \omega M = r^2 \omega (m_1 + m_2 + M)$$

The  total  angular  momentum  is  outward. 

Now you write the rotational form of Newton’s second law, Eq. (9-39), for a
situation where both T and L have the same constant direction. It is

$$T = \frac{dL}{dt}$$

When  this  is  applied  to  the  case  at  hand,  you  have 

$$r(m_1 - m_2)g = \frac{d}{dt}\left[r^2 \omega (m_1 + m_2 + M)\right] = r^2 (m_1 + m_2 + M)\frac{d\omega}{dt} = r^2 (m_1 + m_2 + M)\alpha$$


where $\alpha$ is the magnitude of the angular acceleration. Solving for this quantity
yields

$$\alpha = \frac{(m_1 - m_2)g}{(m_1 + m_2 + M)r}$$

The  direction  of  the  angular  acceleration  is  also  outward. 

The acceleration of body 1 is the same as the tangential part of the acceleration
of the point on the wheel to which the string to body 1 is tangent. Thus you can use
the first term on the right side of Eq. (9-21) to evaluate its magnitude as

$$a_1 = \alpha r$$

Then you have the required result

$$a_1 = \frac{(m_1 - m_2)g}{m_1 + m_2 + M}$$  (9-40)


If M is so much smaller than $m_1 + m_2$ that it can be neglected, the value of $a_1$
predicted by this result agrees with Eq. (5-7a), the result obtained in Sec. 5-2 for an
Atwood machine with a massless pulley. Comparison of the present calculation with
the earlier one will show you that it is easier to give a realistic treatment of an Atwood
machine by considering torques than it is to give a simplified treatment, ignoring
the mass of the pulley, by considering forces.
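As a quick numerical illustration of Eq. (9-40), the following Python sketch evaluates the acceleration of body 1 and the angular acceleration of the pulley. The masses, the pulley radius, and the value g = 9.80 m/s² are hypothetical values chosen only for the illustration. They do not come from the example.

```python
def atwood_acceleration(m1, m2, rim_mass, g=9.80):
    """Magnitude of the acceleration of body 1, Eq. (9-40)."""
    return (m1 - m2) * g / (m1 + m2 + rim_mass)

def pulley_angular_acceleration(m1, m2, rim_mass, radius, g=9.80):
    """Angular acceleration of the pulley, from a_1 = alpha * r."""
    return atwood_acceleration(m1, m2, rim_mass, g) / radius

# Hypothetical values: m1 = 2.0 kg, m2 = 1.0 kg, rim M = 0.5 kg, r = 0.10 m.
a_with_rim = atwood_acceleration(2.0, 1.0, 0.5)
a_massless = atwood_acceleration(2.0, 1.0, 0.0)  # the massless-pulley limit
alpha = pulley_angular_acceleration(2.0, 1.0, 0.5, 0.10)

print(f"a_1 with rim mass: {a_with_rim:.2f} m/s^2")
print(f"a_1, massless pulley: {a_massless:.2f} m/s^2")
print(f"alpha: {alpha:.1f} rad/s^2")
```

The rim mass enters only through the denominator, so a heavier pulley always reduces the acceleration below its massless-pulley value.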


If you carry out the stool-and-bicycle-wheel demonstration of angular
momentum conservation pictured in Fig. 9-23, you will experience the
torque that your arms must apply to the wheel to change its angular
momentum vector, as well as the reaction torque which the wheel must exert
through your arms to change your angular momentum vector.
Furthermore, you will appreciate that the behavior of the system can be explained
either by the law of angular momentum conservation or by the rotational
form of Newton’s third law. The latter states that when two bodies exert
torques on each other in their interaction, the torques have equal magnitude but
opposite direction. That is,

$$\mathbf{T}_{\text{on 1 by 2}} = -\mathbf{T}_{\text{on 2 by 1}}$$  (9-41)




Table 9-1  Some Equations Used in Rotational and Translational Mechanics

Rotational equations          Connecting equations               Translational equations

ω = dφ/dt                                                        v = ds/dt
α = dω/dt                     a = α × r + ω × v                  a = dv/dt

For a single particle:
T = dl/dt                     T = r × F;  l = r × p              F = dp/dt

For n particles:
T = dL/dt                     T = Σ(j=1..n) r_j × F_ext,j        F = dP/dt, where F = Σ(j=1..n) F_ext,j
                              L = Σ(j=1..n) r_j × p_j              and P = Σ(j=1..n) p_j
T_on 1 by 2 = −T_on 2 by 1                                       F_on 1 by 2 = −F_on 2 by 1


In Sec. 4-5, the translational form of Newton’s third law was derived by
using the law of momentum conservation and the definition of force given
in the translational form of Newton’s second law. You should have no
difficulty in modifying the derivation to obtain the rotational form of the third
law by using the law of angular momentum conservation and the rotational
form of the second law. What is the rotational form of Newton’s first law?

Table 9-1 summarizes the most important equations obtained up to
this point in our study of rotational motion, and it also shows their
relationships to analogous equations that we employ when studying translational
motion. In Chap. 10 we continue the development of equations that we will
use in the investigation of how objects rotate. In particular, we will find the
rotational analogue of Newton’s second law in the form “force equals mass
times acceleration” and the rotational analogues of the energy relations.
We close this chapter by turning our attention in Secs. 9-6 and 9-7 to an
important practical application of the theory already at hand.


9-6  STATIC EQUILIBRIUM OF RIGID BODIES AND CENTER OF MASS


In the design of structures such as buildings or bridges, it is essential to
ensure that they will remain in place. In doing this work, structural engineers
often  begin  their  calculations  by  applying  Newton's  second  law,  in  both  the 
rotational  form  and  the  translational  form,  to  a stationary  rigid  body.  They 
are  then  working  with  the  subject  of  this  section:  the  static  equilibrium  of 
rigid  bodies.  It  is  relatively  simple,  at  least  in  concept.  At  later  stages  of 
their  calculations,  structural  engineers  must  frequently  take  into  account 
the  fact  that  no  body  is  perfectly  rigid.  This  leads  to  complications  which  we 
will  not  consider  here. 

The rotational and translational forms of Newton’s second law show
that for a body to be stationary with respect to the earth’s surface
(considered as an inertial frame), two conditions must be met: (1) The net torque
about any origin acting on the body must be zero because the nonrotating body
has the constant angular momentum zero about any origin. (2) The net force




acting on the body must be zero because the nontranslating body has the
constant momentum zero. Example 9-8 illustrates the application of these two
conditions to a simple case.


EXAMPLE 9-8

Find  the  force  exerted  on  the  lever  in  Fig.  9-27  by  its  fulcrum,  labeled  O.  The  lever 
is  stationary  and  is  supporting  a body  of  mass  m because  an  upward  force  is  applied 
to  the  free  end  of  the  lever.  Assume  that  the  mass  of  the  lever  is  zero. 

■ Consider first the condition that the net force acting on the lever must be zero.
The force exerted on the lever by the body it supports certainly acts in the
downward direction. And you are told that the force exerted on the lever at its free end
acts in the upward direction. Since these forces have no horizontal components,
neither can the force exerted on the lever by the fulcrum. This is because the net
force acting on the lever must have a zero horizontal component. Thus the force
applied by the fulcrum must act in either the upward or the downward direction.
You can use signed scalars to specify the three forces. If you take the upward
direction as positive, the force applied by the fulcrum is $F_0$ (whose value has unknown
sign and magnitude), the force applied by the supported body is $F_1$ (whose value
$-mg$ is negative), and the force applied at the free end is $F_2$ (whose value has a
positive sign but an unknown magnitude). The condition that the net force acting on
the lever is zero,

$$F_0 + F_1 + F_2 = 0$$

then reads

$$F_0 - mg + F_2 = 0$$  (9-42a)

or

$$F_0 = mg - F_2$$  (9-42b)

To find $F_0$, you need another equation so that you can determine the value of
$F_2$. You can obtain the equation by considering the condition that the net torque
about any origin acting on the lever must be zero. But before utilizing this
condition, you must choose an origin. The best choice is to take the origin O at the
fulcrum, as shown in the figure. The reason is that then the force $F_0$ can exert no
torque about O, and hence $F_0$ will not appear in the equation that determines the
value of $F_2$. With this choice, the torque about O exerted on the lever by the force $F_2$
is described by a vector which is directed outward. That is, this torque tends to
rotate the lever in the counterclockwise sense, from the point of view illustrated in the
figure. You can see that this is so by using the right-hand rules or, better, by using
your intuition. Equation (9-31) shows that the magnitude of the torque equals the
perpendicular distance $r_2$ from O to the line of action of the force, multiplied by its


Fig.  9-27  A lever. 


[Fig. 9-27 labels: F0 (sign unknown initially) at the fulcrum O; F1 = -mg at the supported body of mass m; F2 > 0 at the free end.]




value $F_2$. Signed scalars can be used to specify torque vectors that all act either in
one direction or in the opposite direction, just as signed scalars can be used for
force vectors that are all directed in one way or in the opposite way. Choosing the
direction outward as the direction of positive torque, the torque due to the force $F_2$
is written as $T_2 = r_2 F_2$ (whose value is positive). Similarly, the torque due to the
force $F_1$ is $T_1 = r_1 F_1 = r_1(-mg) = -r_1 mg$ (whose value is negative). Then the
condition $T_1 + T_2 = 0$, that the net torque about O be zero, becomes

$$-r_1 mg + r_2 F_2 = 0$$  (9-43a)

or

$$F_2 = \frac{r_1}{r_2}\, mg$$  (9-43b)

Using Eq. (9-43b) in Eq. (9-42b), you obtain

$$F_0 = mg - \frac{r_1}{r_2}\, mg$$

or

$$F_0 = mg\left(1 - \frac{r_1}{r_2}\right)$$

Since $r_1$ cannot be greater than $r_2$, the value of $F_0$ can never be negative. Thus the
force exerted on the lever by the fulcrum, whenever it is not zero, is directed upward, as
indicated in the figure. Explain to yourself the physical meaning of the value predicted
for $F_0$ in the three cases $r_1 = 0$, $r_1 = r_2/2$, and $r_1 = r_2$.
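The three cases just mentioned can be explored with a minimal Python sketch of Eqs. (9-42b) and (9-43b). The function name, the mass of 10 kg, the lever length of 2.0 m, and g = 9.80 m/s² are assumptions made only for this illustration.

```python
def lever_forces(mass, r1, r2, g=9.80):
    """Return (F0, F2) for the massless lever of Example 9-8.

    F2 comes from the zero-net-torque condition about the fulcrum,
    Eq. (9-43b); F0 then follows from the zero-net-force condition,
    Eq. (9-42b). Upward forces are positive.
    """
    weight = mass * g
    f2 = (r1 / r2) * weight
    f0 = weight - f2
    return f0, f2

# Hypothetical lever: supported mass 10 kg, free end at r2 = 2.0 m.
mass, r2 = 10.0, 2.0
results = {r1: lever_forces(mass, r1, r2) for r1 in (0.0, r2 / 2, r2)}

for r1, (f0, f2) in results.items():
    print(f"r1 = {r1:.1f} m: F0 = {f0:.1f} N, F2 = {f2:.1f} N")
```

With the body directly over the fulcrum the fulcrum carries the whole weight, with the body halfway out the load is shared equally, and with the body at the free end the applied force carries everything and the fulcrum force is zero.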


How could you extend the calculation in Example 9-8 to take into
account the fact that a lever is not actually massless? At first thought, this
would seem to be a difficult thing to do because there is a gravitational
force acting on every one of its particles. Each of these particles is at a
different distance from the origin O, so each of the many forces produces a
different torque about O. But the concept of center of mass makes such an
extension easy, as we now show.

Consider the body of arbitrary shape and mass distribution in Fig.
9-28. A typical particle of the body has mass $m_j$ and position $\mathbf{r}_j$ relative to
some origin O. Let us evaluate the total torque T about O produced by all
the gravitational forces which the earth exerts on the particles of the body.
The gravitational force on a typical particle is

$$\mathbf{F}_j = m_j \mathbf{g}$$


Fig. 9-28  The position $\mathbf{r}_j$, relative to an origin O, of a typical particle
in an extended body. The gravitational force acting on that particle
is $m_j \mathbf{g}$.




No label is needed on the gravitational acceleration vector g. This vector
has the same magnitude and same direction over the entire body if, as we
will assume to be the case, its size is small compared to the size of the earth.
The torque about O produced by this gravitational force is

$$\mathbf{T}_j = \mathbf{r}_j \times \mathbf{F}_j = \mathbf{r}_j \times m_j \mathbf{g}$$

The total torque produced by all the gravitational forces acting on the
n particles in the body is

$$\mathbf{T} = \sum_{j=1}^{n} \mathbf{T}_j = \sum_{j=1}^{n} \mathbf{r}_j \times m_j \mathbf{g}$$

But since the $m_j$ are scalars, their ordering in the vector product is of no
consequence, and we can write

$$\mathbf{T} = \sum_{j=1}^{n} m_j \mathbf{r}_j \times \mathbf{g}$$

Using parentheses to indicate explicitly that the summation of terms $m_j \mathbf{r}_j$
can be performed first, and then the vector product of the result can be
taken with g, we have

$$\mathbf{T} = \left(\sum_{j=1}^{n} m_j \mathbf{r}_j\right) \times \mathbf{g}$$  (9-44)

The purpose of this manipulation is to isolate the quantity in
parentheses:

$$\sum_{j=1}^{n} m_j \mathbf{r}_j$$

It is a summation, over all the particles of the body, of the mass of each
times its position vector. The value of this summation is not obvious, but it
must have some value. Because the summation is taken over all the
constituent masses of the body, the total mass M of the body should appear as a
factor in the value. In other words, we should be able to write

$$\sum_{j=1}^{n} m_j \mathbf{r}_j = M\mathbf{r}$$  (9-45)

where r represents a vector whose value is such that when it is multiplied by
M, the result has the same value as the summation. Deferring for the
moment the question of just how we are going to determine r, we can use Eq.
(9-45) to put Eq. (9-44) in the form

$$\mathbf{T} = M\mathbf{r} \times \mathbf{g}$$

or

$$\mathbf{T} = \mathbf{r} \times M\mathbf{g}$$  (9-46)


To interpret the meaning of this result, first we note that since the
gravitational force acting on the jth particle of the body is

$$\mathbf{F}_j = m_j \mathbf{g}$$

the total gravitational force acting on the body is

$$\mathbf{F} = \sum_{j=1}^{n} \mathbf{F}_j = \sum_{j=1}^{n} m_j \mathbf{g} = \left(\sum_{j=1}^{n} m_j\right) \mathbf{g}$$




The summation in parentheses is M, the total mass of the body. So we have

$$\mathbf{F} = M\mathbf{g}$$  (9-47)

This equation makes the obviously correct statement that the net force
produced by the gravitational forces acting on the particles of the body of total
mass M is the same as the gravitational force acting on a particle of mass M.
Next we use Eq. (9-47) to write Eq. (9-46) for the net torque about O acting
on the body as T = r × F. We then conclude that the net torque T about O
produced by the gravitational forces acting on the particles of the body of
total mass M is the same as that which would be produced by the net
gravitational force F if it were acting on a particle of mass M at position r. In
other words, Eqs. (9-46) and (9-47) show that the effect of the gravitational
forces acting on all the particles of the body is the same, with regard to both the net
force and the net torque, as if the entire mass M of the body were concentrated at the
position r. The vector r gives the location relative to O of a point in the body
called its center of mass.
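The equivalence expressed by Eqs. (9-44) through (9-47) can be checked numerically. The Python sketch below sums the gravitational torques particle by particle and compares the result with the single torque r × Mg, using r as defined by Eq. (9-45). The three particles, their positions, and the choice of g along the negative z axis are hypothetical values chosen only for the check.

```python
def cross(a, b):
    """Vector product of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torque_particle_by_particle(masses, positions, g):
    """Sum of r_j x (m_j g) over all particles."""
    total = (0.0, 0.0, 0.0)
    for m, r in zip(masses, positions):
        force = tuple(m * gk for gk in g)
        t = cross(r, force)
        total = tuple(a + b for a, b in zip(total, t))
    return total

def mass_weighted_position(masses, positions):
    """The vector r defined by Eq. (9-45): sum(m_j r_j) / M."""
    total_mass = sum(masses)
    return tuple(sum(m * r[k] for m, r in zip(masses, positions)) / total_mass
                 for k in range(3))

# Hypothetical body made of three particles (kg, m); g points along -z.
masses = [1.0, 2.0, 3.0]
positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (-1.0, 1.0, 1.0)]
g = (0.0, 0.0, -9.8)

torque_sum = torque_particle_by_particle(masses, positions, g)
r_cm = mass_weighted_position(masses, positions)
total_force = tuple(sum(masses) * gk for gk in g)   # Eq. (9-47)
torque_single = cross(r_cm, total_force)            # Eq. (9-46)

print("sum over particles:", torque_sum)
print("r x Mg:            ", torque_single)
```

The two printed vectors agree to within rounding, as the derivation requires.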


It remains, of course, to find r. In general terms, this is a matter of
solving Eq. (9-45) for r. Doing so, we obtain the following expression for
the vector r that gives the location of the center of mass:

$$\mathbf{r} = \frac{\sum_{j=1}^{n} m_j \mathbf{r}_j}{M}$$  (9-48a)

Here $m_j$ is the mass of the jth particle of the body whose location is given by
the vector $\mathbf{r}_j$. There are n particles in the body, and its total mass is M. An
equivalent expression for the vector r is

$$\mathbf{r} = \frac{\sum_{j=1}^{n} m_j \mathbf{r}_j}{\sum_{j=1}^{n} m_j}$$  (9-48b)

The two expressions are equivalent because M is just the sum of the masses
$m_j$. Before we discuss the use of Eqs. (9-48) to evaluate r in specific cases,
it is worthwhile considering some general properties of the center of mass.

Fig. 9-29  If the origin O used in the definition of a body’s center of mass (CM)
is moved to O', the position vector $\mathbf{r}_j$ of a typical particle in the body becomes
$\mathbf{r}_j'$. The relation between the two vectors is $\mathbf{r}_j = \mathbf{r}_j' + \mathbf{R}$, where R is the
vector giving the position of O' relative to O. Using this relation in Eq. (9-48a) yields
$\mathbf{r} = \sum_{j=1}^{n} m_j(\mathbf{r}_j' + \mathbf{R})/M = \sum_{j=1}^{n} m_j \mathbf{r}_j'/M + \mathbf{R}\sum_{j=1}^{n} m_j/M$.
This immediately reduces to the result r = r' + R, where r' gives the location
of the body’s CM relative to the origin O'. Inspection of the figure will show that this
result is consistent with the fact that the CM is fixed in the body and does not depend
on the location of the origin used in its definition.


Fig. 9-30  Location of the center of mass of a sphere, a cylinder, a rectangular
plate, a square plate, and a circular plate. In each case the body is made of
material of uniform density.


The location of a body’s center of mass does not depend on the choice
of the origin O, even though the vectors $\mathbf{r}_j$ and r change if the origin is
changed. This statement is verified in Fig. 9-29 and the accompanying
caption. Rather, the center of mass of a body is at a position fixed relative to
the body (though not necessarily within the body). The position does
depend on the distribution of mass in the body. Indeed, the vector r specifies
the average location of that mass. If each particle of the body has the same
mass, that is, if all the $m_j$ have a common value m, then Eq. (9-48b) simplifies
to

$$\mathbf{r} = \frac{m \sum_{j=1}^{n} \mathbf{r}_j}{nm} = \frac{\sum_{j=1}^{n} \mathbf{r}_j}{n}$$

In this case, the computation of r is nothing more than a matter of
computing the simple arithmetical average, over all the n particles of the body,
of the position vectors $\mathbf{r}_j$ to each particle. You add all the $\mathbf{r}_j$ and then divide
by the total number of terms in the sum, which is n. In a general case where
the particle masses $m_j$ are not all the same, in computing the average you
“weight” the position $\mathbf{r}_j$ of each particle by multiplying it by the mass $m_j$ of
the particle. In fact, Eq. (9-48b) is a direct application of the general
mathematical rule for calculating a weighted average, or mean. Figure 9-30
illustrates the location of the center of mass of some simple bodies. Note that if
the mass distribution of a body is symmetrical about a point, then its center
of mass will lie at that point. If the mass distribution is symmetrical about a
line, then the center of mass will lie somewhere on the line. If it is
symmetrical about a plane, the center of mass will lie somewhere in the plane. If
there is more than one line or plane of symmetry, the center of mass must
be where they intersect.
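The weighted-average character of Eq. (9-48b), and the fact that moving the origin by R simply changes r to r' = r - R, can both be seen in a small Python sketch. The three particles, their two-dimensional positions, and the shift R are hypothetical values used only for the illustration.

```python
def center_of_mass(masses, positions):
    """Mass-weighted average of the position vectors, Eq. (9-48b)."""
    total_mass = sum(masses)
    dims = len(positions[0])
    return tuple(sum(m * r[k] for m, r in zip(masses, positions)) / total_mass
                 for k in range(dims))

# Hypothetical particles in a plane (kg, m).
masses = [1.0, 2.0, 3.0]
positions = [(0.0, 0.0), (2.0, 0.0), (2.0, 4.0)]
cm = center_of_mass(masses, positions)

# Move the origin from O to O', where O' is at R relative to O.
R = (5.0, -3.0)
shifted = [(x - R[0], y - R[1]) for x, y in positions]   # r_j' = r_j - R
cm_shifted = center_of_mass(masses, shifted)             # should equal r - R

# Equal masses: the weighted average reduces to the arithmetic mean.
cm_equal = center_of_mass([1.0, 1.0, 1.0], positions)

print("CM relative to O: ", cm)
print("CM relative to O':", cm_shifted)
print("CM, equal masses: ", cm_equal)
```

The center of mass found from the shifted origin differs from the first one by exactly R, so both calculations pick out the same point of the body.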


The center of mass of a body is often called its center of gravity, for reasons
arising from the interpretation given to Eq. (9-46). The point at which it is located
is sometimes called the balance point. The terminology is appropriate because
when an object is balanced about a point O, its center of mass lies somewhere on a
vertical line passing through O. The reason is that the body is balanced if there is
no total gravitational torque about O. Equation (9-46) shows that there will be no
such torque when the vector r from O to the center of mass is antiparallel, or
parallel, to the downward-directed vector g because the center of mass is immediately


Fig. 9-31  The center of mass of an irregular body can be located
experimentally by suspending it from a point on its rim, recording
on the body the direction of a line extending vertically downward
from the suspension point, and then repeating the procedure. The
center of mass is at the intersection of the two lines. Why?




above, or below, O. This fact provides the basis of an experimental method,
indicated in Fig. 9-31, for locating the center of mass of a body.

The theoretical method of finding the center of mass is to use Eq. (9-48a). If
the body is not composed of a relatively small number of discrete particles, this is
most easily done by replacing the sum by an integral. That is, an alternative
expression for the position vector r of the center of mass is

$$\mathbf{r} = \frac{\int_{\text{body}} \mathbf{r}'\, dm}{M}$$  (9-49a)

Here dm is an infinitesimal element of mass of the body whose total mass is M, r' is
the position vector of that mass element, and the integral is taken over the entire
body. This definition can also be written as

$$\mathbf{r} = \frac{\int_{\text{body}} \mathbf{r}'\,(dm/dV)\, dV}{M} = \frac{\int_{\text{body}} \mathbf{r}' \rho\, dV}{M}$$  (9-49b)

where $\rho = dm/dV$ is the density of the body and dV is an element of its volume.
The integral over the volume of the body of the vector quantity $\mathbf{r}'\rho$ is evaluated by
writing r' in terms of its components. For instance, in rectangular coordinates
doing so gives $\mathbf{r}'\rho = x'\rho\,\hat{\mathbf{x}} + y'\rho\,\hat{\mathbf{y}} + z'\rho\,\hat{\mathbf{z}}$. Using this in Eq. (9-49b) leads to
expressions for the components of r, each of which involves an integral of a scalar
quantity.
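A one-dimensional analogue of Eq. (9-49b) can be evaluated numerically. The Python sketch below treats a hypothetical thin rod lying along the x axis, with a mass per unit length playing the role of the density, and approximates the integrals by sums over thin slices (the midpoint rule). The rod length, the two density functions, and the number of slices are assumptions chosen only for the illustration.

```python
def rod_center_of_mass(linear_density, length, n_slices=10_000):
    """Approximate x_cm = (integral of x dm) / (integral of dm) for a thin rod.

    linear_density(x) gives the mass per unit length at position x,
    measured from one end of the rod.
    """
    dx = length / n_slices
    mass = 0.0
    moment = 0.0
    for i in range(n_slices):
        x = (i + 0.5) * dx              # midpoint of the slice
        dm = linear_density(x) * dx     # mass of the slice
        mass += dm
        moment += x * dm
    return moment / mass

# Uniform rod of length 1 m: symmetry puts the center of mass at the midpoint.
x_uniform = rod_center_of_mass(lambda x: 2.0, 1.0)

# Rod whose density grows in proportion to x: the center of mass moves
# toward the heavy end, to two-thirds of the length.
x_tapered = rod_center_of_mass(lambda x: 3.0 * x, 1.0)

print(f"uniform rod: x_cm = {x_uniform:.4f} m")
print(f"tapered rod: x_cm = {x_tapered:.4f} m")
```

The uniform case reproduces the symmetry argument, while the nonuniform case shows how the density weights the average position.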


In practice, the center of mass of a body often can be obtained quite
simply by invoking whatever symmetries it has. For instance, the symmetry
of a uniform, straight rod makes it apparent that its center of mass lies at its
midpoint. Therefore you can modify Example 9-8 to take account of the
mass M of the lever by adding $-Mg$ to the left side of Eq. (9-42a) and by
adding $-(r_2/2)Mg$ to the left side of Eq. (9-43a). Something very similar is
done in Example 9-9.


To summarize, investigating the conditions in which bodies will
remain stationary in an inertial reference frame involves applying the two
conditions

$$\mathbf{F} = 0$$  (9-50a)

and

$$\mathbf{T} = 0$$  (9-50b)

The first states that there must be zero net force acting on a stationary body;
the second states that there must be zero net torque acting on the body,
about any origin. The net contribution from a set of gravitational forces
acting on an extended member of the body is evaluated by treating those
forces as if they were all applied at the center of mass of the member. The
choice of the origin about which the torques are computed is dictated by
convenience, since the second condition must be satisfied for any origin.
This is emphasized by Example 9-9.


EXAMPLE  9-9 

A ladder of uniform construction, length 5.00 m, and mass 20.0 kg is supported by
a rough floor. It is found that it remains leaning against a smooth wall if the
distance from the wall to the bottom of the ladder is no greater than 3.00 m. If this
distance is exceeded, the ladder slips. Determine the force acting between the wall and
the top of the ladder when its bottom is at the critical location.



Fig.  9-32  The  forces  acting  on  a ladder  resting  on  a rough  floor 
and  leaning  against  a perfectly  smooth  wall. 




■ The drawing in Fig. 9-32 shows the ladder when it is just at the point of
slipping and the forces which then act on it. The gravitational forces exerted on it have
been replaced by a single force of magnitude Mg, where M is its total mass. This
force acts downward at the midpoint of the uniform ladder, which is the location of
its center of mass. The force applied by the floor to the bottom of the ladder is not
directed perpendicular to the floor, since the floor is not smooth. The figure shows
the vertical and horizontal components of this force, taking the upward direction to
be positive and the direction to the right to be positive. Since Eq. (9-50a) requires
that the net force acting on the ladder have no vertical component, you find
immediately that the vertical component is the positive quantity Mg, as shown in the
figure. The figure also shows that the horizontal component of the force exerted by
the floor is the positive quantity $\mu Mg$, where $\mu$ is the coefficient of static friction
between the floor and the bottom of the ladder. The sign is found to be positive by
considering that this component is the frictional force which acts to the right in
resisting the incipient leftward motion of the bottom of the ladder. The magnitude
is found by using Eq. (4-22b), which relates the static friction force that the floor
exerts on the ladder to the normal force exerted by the floor on the ladder. Finally,
the force applied by the wall on the top of the ladder is drawn perpendicular to the
wall. This is because the wall is frictionless and hence cannot apply a force parallel
to its surface. Using Eq. (9-50a) again, you can see that this horizontal force applied
by the vertical wall must act to the left and have magnitude $\mu Mg$, as shown. The
reason is that the net force exerted on the ladder can have no horizontal
component, according to Eq. (9-50a).

Applying Eq. (9-50b) requires choosing a location for the origin O. Four possible
choices are shown in the figure. Perhaps the most obvious choice is location 1,
at the center of mass of the ladder. But with this choice there would be torques
produced by two forces, the ones acting on the ends of the ladder. The same objection
holds for location 2, at the top of the ladder. Location 3, at the bottom of the ladder,
will simplify the calculation somewhat because the more complicated force acting at
the bottom of the ladder exerts no torque about an origin at that location.

The  least  obvious,  but  most  advantageous,  choice  of  origin  is  location  4,  at  the 
intersection  of  the  lines  of  action  of  the  forces  exerted  on  the  top  of  the  ladder  and 
on  its  center  of  mass.  Neither  of  these  forces  produces  a torque  about  that  origin. 




Rotational  Motion,  I 


9-7  STABILITY  OF 
EQUILIBRIUM 


Equation (9-50b) therefore requires that the same be true of the force acting at the
bottom of the ladder, in order that there be zero net torque about the origin. Thus
the line of action of the force on the bottom of the ladder must also pass through
location 4.

The figure shows that this means the line of action is inclined to the horizontal
at an angle $\theta$ such that $\cot\theta = 1.50\ \text{m}/4.00\ \text{m} = 0.375$.
Furthermore, it shows that $\cot\theta = \mu Mg/Mg = \mu$. So you find that the
coefficient of static friction is $\mu = 0.375$. Then you can immediately evaluate the
force acting between the wall and the top of the ladder; it has the value
$\mu Mg = 0.375 \times 20.0\ \text{kg} \times 9.80\ \text{m/s}^2 = 73.5\ \text{N}$.
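The same numbers can be cross-checked with a torque balance about location 3, the bottom of the ladder. The sketch below is only an illustration: the function name `ladder_at_slip` is hypothetical, and the 3.00 m base distance and 4.00 m height of the top are read off the distances quoted in the example (1.50 m is half of 3.00 m).

```python
def ladder_at_slip(mass_kg, base_m, height_m, g=9.80):
    """Return (mu, wall_force_N) for a uniform ladder on the verge of slipping.

    Assumes a smooth wall and a rough floor, as in the example.
    """
    weight = mass_kg * g
    # Torque balance about the bottom of the ladder (location 3):
    #   wall_force * height = weight * (base / 2)
    wall_force = weight * (base_m / 2.0) / height_m
    # Horizontal balance: friction equals the wall force.
    # Vertical balance: the normal force from the floor equals the weight.
    mu = wall_force / weight
    return mu, wall_force


mu, wall_force = ladder_at_slip(mass_kg=20.0, base_m=3.00, height_m=4.00)
print(f"mu = {mu:.3f}, wall force = {wall_force:.1f} N")
# prints: mu = 0.375, wall force = 73.5 N
```

Choosing location 3 instead of location 4 costs one extra line of algebra but gives the same result, which is a useful check on the geometric argument.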


Cases can arise in which it is impossible to treat the equilibrium of a
rigid structure because there are more unknown force components than
there are scalar equations involving these components. The structure is
then said to be underdetermined. This would be so in Example 9-9 if both
the floor and the wall were rough, with both coefficients of friction being
unknown. For a less contrived case, consider the forces which the floor
exerts on the four legs of a chair. One equation relating the four unknown
forces, and the gravitational force acting at its center of mass, is obtained
from the vertical component of $\mathbf{F} = 0$. Relative to any particular origin,
two more equations relating these forces are obtained from the two horizontal
components of $\boldsymbol{\tau} = 0$. The structure is underdetermined because
there are four unknowns and only three independent equations. In contrast,
a three-legged stool is not underdetermined.
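A short numerical sketch can make the chair case concrete. It assumes a hypothetical square chair with legs 0.2 m from the center along each axis and a 40 N weight acting at the center; none of these numbers come from the text. Two quite different sets of leg forces satisfy all three equilibrium equations, so the equations alone cannot select one.

```python
def equilibrium_residuals(leg_forces, leg_positions, weight, cm_xy=(0.0, 0.0)):
    """Residuals of the three usable equilibrium equations for a chair.

    leg_forces:    upward forces exerted by the floor on each leg (N)
    leg_positions: (x, y) floor coordinates of each leg (m)
    Returns (net vertical force, torque about x axis, torque about y axis).
    """
    net_force = sum(leg_forces) - weight
    # r x F for an upward force N at (x, y) is (y*N, -x*N, 0).
    torque_x = sum(n * y for n, (_, y) in zip(leg_forces, leg_positions)) - weight * cm_xy[1]
    torque_y = -sum(n * x for n, (x, _) in zip(leg_forces, leg_positions)) + weight * cm_xy[0]
    return net_force, torque_x, torque_y


# Hypothetical geometry and load, chosen only for illustration.
legs = [(0.2, 0.2), (-0.2, 0.2), (-0.2, -0.2), (0.2, -0.2)]
weight = 40.0

even_load = [10.0, 10.0, 10.0, 10.0]     # all four legs share equally
diagonal_load = [20.0, 0.0, 20.0, 0.0]   # only one diagonal pair carries load

print(equilibrium_residuals(even_load, legs, weight))
print(equilibrium_residuals(diagonal_load, legs, weight))
```

Both calls give residuals of zero, which is the numerical face of "four unknowns and only three independent equations."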

The  difficulty  is  removed  by  taking  into  account  the  fact  that  no  real 
structure  is  completely  rigid.  When  this  is  done,  additional  equations  are 
found  that  involve  the  elastic  properties  of  the  members  of  the  deformable 
structure.  They  make  possible  a complete  analysis  of  the  conditions  for  its 
equilibrium.  But  we  leave  these  matters  to  specialized  books  on  structural 
engineering. 

We turn instead to a treatment of the stability of equilibrium. This
treatment extends the one given in Chap. 6 by using energy relations developed
in Chap. 7 and the concepts of center of mass and torque introduced
in this chapter.


Every structure is continually subjected to influences which displace it
slightly from its equilibrium position. Gusts of wind and vibrations in the
ground have such an effect. Since the displaced position is not an equilibrium
position, there will be a net force and/or net torque acting on the
structure in that position. If for all positions near the equilibrium position
they are directed so as to tend to return the structure to the equilibrium
position, then it is a position of stable equilibrium.

In Sec. 6-1 we studied the equilibrium of a particle in a system in cases
where the particle's position can be specified completely by one coordinate,
and it is not necessary to consider torque. If we write the coordinate as $x$
and the net force acting on the particle as $F(x)$, the conclusions reached
there can be summarized as follows: (1) $F(x) = 0$ at $x = x_e$ if $x_e$ is an
equilibrium position. (2) In addition, $dF(x)/dx < 0$ at $x = x_e$ if $x_e$ is a
position of stable equilibrium. In Sec. 7-6 we learned that if $F(x)$ is a
conservative force arising from an interaction between the particle and some other
body in the system, then the potential energy $U(x)$ of the system is related to
$F(x)$ by the equation $F(x) = -dU(x)/dx$. Using this, we can write the condition
for equilibrium at $x = x_e$ in the form


$$-\frac{dU(x)}{dx} = 0 \qquad \text{at } x = x_e$$

This is equivalent to

$$\frac{dU(x)}{dx} = 0 \qquad \text{at } x = x_e \quad \text{(equilibrium condition)} \qquad \text{(9-51)}$$


And we can write the condition that the equilibrium at $x = x_e$ is stable in the
form

$$\frac{d}{dx}\left[-\frac{dU(x)}{dx}\right] < 0 \qquad \text{at } x = x_e$$

or

$$-\frac{d^2U(x)}{dx^2} < 0 \qquad \text{at } x = x_e$$

If the negative of a quantity is less than zero, the quantity itself is greater
than zero. Thus the condition for the equilibrium to be stable can be
expressed as

$$\frac{d^2U(x)}{dx^2} > 0 \qquad \text{at } x = x_e \quad \text{(stability condition)} \qquad \text{(9-52)}$$

Figure 9-33a illustrates qualitatively the behavior of the potential energy
$U(x)$ near a position of stable equilibrium at $x_e$. The slope of the $U(x)$
curve [in other words, the first derivative of $U(x)$] is zero at $x_e$. This satisfies
the mathematical requirement of Eq. (9-51). Physically, it means that the
net force acting on the particle, $F(x) = -dU(x)/dx$, is zero at $x_e$, so that $x_e$ is,
indeed, an equilibrium position. The rate of change of slope of the $U(x)$
curve [in other words, the second derivative of $U(x)$] is positive at $x_e$. Thus
the requirement of Eq. (9-52) is also satisfied. You can see the physical
significance of this by using the relation $F(x) = -dU(x)/dx$ to determine from
the negative of the slope of the $U(x)$ curve the direction of the net force that
the particle feels when some influence has given the particle a small
displacement from $x_e$. You will find that, whether the particle has been
displaced to one side of $x_e$ or to the other, the net force is always directed
toward $x_e$. Hence $x_e$ is, in fact, a position of stable equilibrium. The effect of
Eq. (9-51) is to require the $U(x)$ curve to have zero slope at $x_e$. The effect of
Eq. (9-52) is to require the curve to be concave upward at $x_e$. It is apparent
from the figure that their combined effect is to require the curve to
describe a potential energy $U(x)$ which has a minimum value at the position of
stable equilibrium $x_e$.
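As a concrete check of Eqs. (9-51) and (9-52), consider a particle held by an ideal spring. This case is an illustration rather than part of the text's argument; the spring constant $k > 0$ is an assumed symbol, and $x_e$ is taken to be the position at which the spring is relaxed.

```latex
\begin{align*}
U(x) &= \tfrac{1}{2}\,k\,(x - x_e)^2 \\
\frac{dU(x)}{dx} &= k\,(x - x_e) = 0 \quad \text{at } x = x_e
  && \text{Eq. (9-51) is satisfied} \\
\frac{d^2U(x)}{dx^2} &= k > 0 \quad \text{at } x = x_e
  && \text{Eq. (9-52) is satisfied} \\
F(x) &= -\frac{dU(x)}{dx} = -k\,(x - x_e)
  && \text{directed toward } x_e \text{ on either side}
\end{align*}
```

The parabola has zero slope and is concave upward at $x_e$, so it is the simplest potential energy with a minimum at the equilibrium position.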


Figure 9-33b sketches the potential energy $U(x)$ for a case where it has
a maximum value at $x_e$. In this case $x_e$ is still an equilibrium position since
the relation $F(x) = -dU(x)/dx$ shows that the net force $F(x)$ acting on the
particle is still zero at $x_e$. But if you apply this relation to determine the
direction of the force that the particle feels when some influence has given
it a small displacement from $x_e$, you will find that the force tends to push it
even farther from that position. Thus $x_e$ is a position of unstable equilibrium
if $U(x)$ has a maximum value there.


Fig. 9-33 The three types of equilibrium. Each panel plots $U(x)$ versus $x$ near
$x_e$: (a) stable equilibrium, (b) unstable equilibrium, (c) neutral equilibrium.


Figure 9-33c shows the constant value of the potential energy $U(x)$ near
a position $x_e$ of neutral equilibrium. The name is appropriate because if
the particle is given a displacement from that position to any nearby position,
there is no net force acting on it at the displaced position. This is
because $F(x) = -dU(x)/dx = 0$ everywhere near $x_e$. As a consequence, there is
no tendency for the displaced particle then to move either closer to or farther
from $x_e$.
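The three cases of Fig. 9-33 can be told apart numerically by estimating the first and second derivatives of the potential energy. The sketch below uses central finite differences; the function name, the step size `h`, and the tolerance `tol` are assumptions chosen for illustration, and the test potentials are arbitrary examples rather than ones taken from the text.

```python
def classify_equilibrium(U, x_e, h=1e-4, tol=1e-6):
    """Classify x_e from finite-difference estimates of U'(x_e) and U''(x_e)."""
    first = (U(x_e + h) - U(x_e - h)) / (2 * h)
    second = (U(x_e + h) - 2 * U(x_e) + U(x_e - h)) / h**2
    if abs(first) > tol:
        return "not an equilibrium"      # Eq. (9-51) fails
    if second > tol:
        return "stable"                  # Eq. (9-52) holds: U has a minimum
    if second < -tol:
        return "unstable"                # U has a maximum
    # A vanishing second derivative is what a constant U gives, but it can also
    # occur for potentials such as x**4, so this test alone cannot decide.
    return "neutral or undetermined"


print(classify_equilibrium(lambda x: x**2, 0.0))    # concave upward
print(classify_equilibrium(lambda x: -x**2, 0.0))   # concave downward
print(classify_equilibrium(lambda x: 3.0, 0.0))     # constant
print(classify_equilibrium(lambda x: x, 0.0))       # nonzero slope
```

The last branch is deliberately cautious: a zero second derivative at a single point is necessary for neutral equilibrium but does not prove that $U$ is constant nearby.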

Although we have been considering a potential energy which is a function
of the single coordinate $x$ of a particle, our conclusions are perfectly
general. Whether the movable object is a particle or an extended body, and
no matter what coordinates are used to specify its position, the potential
energy $U$ of a system has a minimum at a position of stable equilibrium, has a maximum
at a position of unstable equilibrium, and is constant at a position of neutral
equilibrium. This statement applies to any type of potential energy. That is, it
makes no difference what type of force gives rise to the potential
energy, except that the force must be conservative if there is to be a
potential energy.

A particularly important case is the one in which $U$ represents the
gravitational potential energy of a system containing the earth and a movable
body near the earth's surface. To obtain an expression for $U$, we consider
the body to be composed of particles and calculate the gravitational
potential energy associated with each of them. We then sum over the $n$
particles in the body. With the subscript $j$ employed to designate a typical
particle, Eq. (7-53) states that the potential energy resulting from the
gravitational force exerted on it by the earth is

$$U_j = m_j g y_j$$

Here $m_j$ is the particle's mass, $y_j$ is its coordinate measured vertically
upward from the reference height chosen to be at $y = 0$, and $g$ is the magnitude
of the gravitational acceleration. The gravitational potential energy of
the system is

$$U = \sum_{j=1}^{n} U_j = \sum_{j=1}^{n} m_j g y_j$$

Since $g$ is a constant, this is

$$U = g \sum_{j=1}^{n} m_j y_j \qquad \text{(9-53)}$$


The sum can be evaluated by writing Eq. (9-47a) as

$$\sum_{j=1}^{n} m_j \mathbf{r}_j = M\bar{\mathbf{r}}$$

In this equation $M$ represents the total mass of the body, and $\bar{\mathbf{r}}$ the
position vector of its center of mass. Considering the $y$ component of both sides of
the vector equation gives us the scalar equation

$$\sum_{j=1}^{n} m_j y_j = M\bar{y}$$

where $\bar{y}$ is the $y$ coordinate of the body's center of mass. Using this relation
in Eq. (9-53), we obtain

$$U = Mg\bar{y} \qquad \text{(9-54)}$$



EXAMPLE  9-10 


Fig. 9-34 (a) A body with a curved bottom supported in an upright position by
a rough plane. (b) The body after it has rolled through a small angle $\theta$.
The height of the CM above its original position is labeled
$[(r - a) - (r - a)\cos\theta]$.


Our result tells us that the gravitational potential energy is the same as if the
entire mass of the body were concentrated at its center of mass.
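This equivalence is easy to confirm numerically. The sketch below compares the particle-by-particle sum of Eq. (9-53) with the center-of-mass form of Eq. (9-54); the function names and the three sample particles are hypothetical values chosen only for illustration.

```python
def potential_energy_by_particles(masses, heights, g=9.80):
    """U = g * sum(m_j * y_j), the form of Eq. (9-53)."""
    return g * sum(m * y for m, y in zip(masses, heights))


def potential_energy_by_center_of_mass(masses, heights, g=9.80):
    """U = M * g * y_cm, the form of Eq. (9-54)."""
    total_mass = sum(masses)
    y_cm = sum(m * y for m, y in zip(masses, heights)) / total_mass
    return total_mass * g * y_cm


# Hypothetical body made of three particles (kg and m).
masses = [1.0, 2.0, 3.0]
heights = [0.5, 1.0, 2.0]

print(potential_energy_by_particles(masses, heights))
print(potential_energy_by_center_of_mass(masses, heights))
```

The two results agree to within rounding error because the second function simply divides by the total mass and multiplies by it again, which is what the derivation above does algebraically.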

Now look again at Fig. 9-33, considering the case where $U$ is the
gravitational potential energy given by Eq. (9-54). According to the equation,
the vertical scale in each part of the figure is a measure of the height of the
body's center of mass. Restate the conditions for stable equilibrium, unstable
equilibrium, and neutral equilibrium in these terms. Do the restated
conditions agree with your intuition?

Example 9-10 uses Eqs. (9-51), (9-52), and (9-54) to investigate the
equilibrium of a rigid body acted on by gravity.


The  rigid  body  shown  in  Fig.  9-34a  is  supported  by  a rough  horizontal  plane.  The 
density  of  the  body  is  nonuniform,  the  mass  being  distributed  in  such  a way  that  its 
center  of  mass  (CM)  is  at  a distance  a above  the  point  of  contact  between  the  curved 
surface  and  the  plane,  with  a being  less  than  the  distance  r from  the  point  of  contact 
to  the  center  of  curvature  (CC)  of  the  surface.  Is  the  body  in  equilibrium?  Is  the 
equilibrium  stable? 

■ To answer these questions by using Eqs. (9-51) and (9-52), first you must obtain
a mathematical expression for the potential energy of the body-plus-earth system,
when the body is in some position near the one shown in Fig. 9-34a. It gets to the
new position, illustrated in Fig. 9-34b, by rolling over the rough plane from its
original position. In the rolling process, the force which the plane exerts on the body is
always applied to a point in the body that is at rest at the instant it is in contact with
the plane. Therefore the force exerted by the plane does no work since there is no
displacement of the point to which the force is applied. In other words, rolling is a
workless constraint. This means the potential energy of the system arises entirely from
the work done by the gravitational force exerted on the body by the earth. So you
will be able to evaluate the potential energy by using Eq. (9-54) as soon as you have
an expression for the height of the CM of the body.

To this end, you specify the position of the body in terms of the angle $\theta$ shown
in Fig. 9-34b. It is the angle at the CC between two lines. One is
the radial line from the CC through the CM to the point that had been in contact
with the plane. The other is the radial line from the CC to the point that now is in
contact with the plane. Choose the reference height used to define the gravitational
potential energy of the system as the height of the CM when the body is in the
position shown in Fig. 9-34a, and use this height to locate the origin of a vertically
directed $y$ axis. Then inspection of the figure will show you that the $y$ coordinate of
the CM when the body has rolled to the position shown in Fig. 9-34b is

$$\bar{y} = (r - a) - (r - a)\cos\theta$$

or

$$\bar{y} = (r - a)(1 - \cos\theta)$$

Now use Eq. (9-54) to express the gravitational potential energy of the system as

$$U(\theta) = Mg\bar{y}$$

or

$$U(\theta) = Mg(r - a)(1 - \cos\theta) \qquad \text{(9-55)}$$

where $M$ is the mass of the body and $g$ is the magnitude of the gravitational
acceleration.

You can find an equilibrium position of the body by applying Eq. (9-51), if you
write it in terms of the coordinate $\theta$ instead of the coordinate $x$. Doing so, you have

$$\frac{dU(\theta)}{d\theta} = 0 \qquad \text{for } \theta = \theta_e$$

where $\theta_e$ specifies an equilibrium position. Differentiating both sides of Eq. (9-55)
with respect to $\theta$, you obtain

$$\frac{dU(\theta)}{d\theta} = Mg(r - a)\sin\theta \qquad \text{(9-56)}$$

Setting this equal to zero for $\theta = \theta_e$ gives you

$$Mg(r - a)\sin\theta_e = 0$$

Since $Mg(r - a) \neq 0$, it must be that $\sin\theta_e = 0$. Thus you find that

$$\theta_e = 0$$

describes an equilibrium position of the body. The body is, therefore, in equilibrium
in the position shown in Fig. 9-34a.

To determine the stability of the equilibrium position at $\theta_e = 0$, you
differentiate both sides of Eq. (9-56) with respect to $\theta$, producing

$$\frac{d^2U(\theta)}{d\theta^2} = Mg(r - a)\cos\theta$$

Setting $\theta = \theta_e = 0$, you have

$$\frac{d^2U(\theta)}{d\theta^2} = Mg(r - a) \qquad \text{for } \theta = \theta_e = 0 \qquad \text{(9-57)}$$

But $r > a$, so $Mg(r - a) > 0$, and therefore

$$\frac{d^2U(\theta)}{d\theta^2} > 0 \qquad \text{for } \theta = \theta_e = 0$$

Since Eq. (9-52) is satisfied, $\theta_e = 0$ describes a stable equilibrium position. The
position shown in Fig. 9-34a is a position of stable equilibrium.


If the mass in the body is distributed so that in Fig. 9-34a the CM lies above the
CC, you can show that Eqs. (9-55) and (9-57) still apply with $r < a$. In such a case,
$d^2U(\theta)/d\theta^2 < 0$ for $\theta = \theta_e = 0$. This means that $\theta_e = 0$ is a
position of unstable equilibrium. If the CM and the CC are coincident, then $r = a$,
$d^2U(\theta)/d\theta^2 = 0$ for $\theta = \theta_e = 0$, and $\theta_e = 0$ is a position of
neutral equilibrium. To verify these statements, and to see that they agree with your
intuition, use Eq. (9-55) to sketch $U$ versus $\theta$ for the three cases $r > a$, $r < a$,
and $r = a$. Then compare your sketches with those for $U$ versus $x$ for the three cases
in Fig. 9-33.
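Before sketching, it can help to tabulate a few values. The following sketch evaluates Eq. (9-55) and the curvature of Eq. (9-57) for the three cases; the mass, radius, and the three values of $a$ are hypothetical numbers chosen only to produce one example of each case.

```python
import math


def potential(theta, M, r, a, g=9.80):
    """U(theta) from Eq. (9-55): Mg(r - a)(1 - cos(theta))."""
    return M * g * (r - a) * (1.0 - math.cos(theta))


def equilibrium_type(M, r, a, g=9.80):
    """Classify theta_e = 0 from the sign of Eq. (9-57), Mg(r - a)."""
    curvature = M * g * (r - a)
    if curvature > 0:
        return "stable"
    if curvature < 0:
        return "unstable"
    return "neutral"


M, r = 2.0, 0.30                 # hypothetical mass (kg) and radius of curvature (m)
for a in (0.10, 0.50, 0.30):     # CM below, above, and at the CC
    values = [potential(t, M, r, a) for t in (-0.2, 0.0, 0.2)]
    print(f"a = {a:.2f} m: {equilibrium_type(M, r, a)}, U at -0.2, 0, 0.2 rad = {values}")
```

For $r > a$ the tabulated energy rises on both sides of $\theta = 0$, for $r < a$ it falls on both sides, and for $r = a$ it stays at zero, matching the three panels of Fig. 9-33.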

You can calculate the net torque about some origin acting on the body from the
potential energy $U(\theta)$ of the system by writing the torque as the signed scalar
$\tau(\theta)$ and employing the relation

$$\tau(\theta) = -\frac{dU(\theta)}{d\theta} \qquad \text{(9-58)}$$

This relation is obtained by analogy to the one in Eq. (7-56), $F(x) = -dU(x)/dx$. Find
$\tau(\theta)$, and then use it to explain to yourself the equilibrium properties of the body
in the cases $r > a$, $r < a$, and $r = a$, in terms of the net torque acting on the body. To
check the validity of Eq. (9-58), find an expression for $\tau(\theta)$ from a direct
consideration of the two forces acting on the body and the geometry of Fig. 9-34b, taking
account of the fact that both forces have the same magnitude $Mg$. Note that you get
the same expression for $\tau(\theta)$, no matter where you choose the origin O. Can you
explain why?
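One way to check your work on this comparison is numerical. The sketch below differentiates Eq. (9-55) by finite differences and compares the result with the torque of the couple formed by the weight at the CM and the equal upward force at the point of contact, whose lines of action are separated horizontally by $(r - a)\sin\theta$. The sign convention (negative torque tends to decrease $\theta$), the function names, and the sample numbers are assumptions made for this illustration.

```python
import math


def torque_from_potential(theta, M, r, a, g=9.80, h=1e-6):
    """tau = -dU/dtheta, Eq. (9-58), with U from Eq. (9-55), by central difference."""
    def U(t):
        return M * g * (r - a) * (1.0 - math.cos(t))
    return -(U(theta + h) - U(theta - h)) / (2.0 * h)


def torque_from_forces(theta, M, r, a, g=9.80):
    """Torque of the couple: force Mg times horizontal separation (r - a) sin(theta).

    The negative sign marks a torque that tends to reduce theta when r > a.
    """
    return -M * g * (r - a) * math.sin(theta)


M, r, a = 2.0, 0.30, 0.10        # hypothetical values (kg, m, m)
for theta in (-0.3, 0.1, 0.3):
    print(theta, torque_from_potential(theta, M, r, a), torque_from_forces(theta, M, r, a))
```

Because the two forces form a couple, their net torque does not depend on the origin, which is why a single function of $\theta$ suffices in `torque_from_forces`.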




EXERCISES 

Group  A 

9-1.  How  much  turning?  Evaluate  the  angle  through 
which  the  wheel  in  Example  9-1  has  turned  by  applying 
Eq.  (9-10),  and  compare  your  results  with  those  obtained 
in  the  example. 

9-2. A change of sign. Compute the direction and magnitude of the total
acceleration of the point on the wheel in Example 9-2, using
$\alpha = 1.53\ \text{rad/s}^2$. Then repeat the calculation using
$\alpha = -1.53\ \text{rad/s}^2$. Compare the two results.

9-3. Plane facts. If the rotation $d\phi$ in Fig. 9-3 is to
be represented by a vector, one might be tempted to use a
vector in the plane of the flywheel. Give two reasons why
such a procedure would not give a determinate vector.

9-4.  Angular  acceleration  and  tangential  acceleration. 
Starting  from  rest,  a flywheel  of  30-cm  radius  makes  6.0 
revolutions  in  3.0  s.  Assume  that  the  angular  acceleration 
is  constant  during  this  interval. 

a. Evaluate the magnitude of the angular acceleration.

b. What is the magnitude of the tangential acceleration
of a point on the rim of the flywheel?

9-5.  Grinding  to  a halt.  A motorized  grindstone  is 
turning  at  its  full  speed  of  1800  revolutions  per  minute 
when  a worker  switches  off  the  motor.  The  grindstone 
comes  to  rest  exactly  2 minutes  later. 

a.  What  is  the  average  angular  deceleration? 

b. Assuming that the angular deceleration is constant,
how many revolutions does the grindstone make
during its deceleration?

9-6. Speed and acceleration in steady spinning. A grindstone
is turning at 1800 revolutions per minute. Its diameter
is 20 cm.

a.  What  is  the  speed  of  any  point  on  its  rim? 

b.  What  is  the  acceleration  of  this  point? 

9-7.  Angular  velocities.  A ball  of  radius  r rolls  along  a 
loop-the-loop  track  of  radius  R.  As  the  ball  passes  through 
the  bottom  of  the  loop,  the  magnitude  of  the  velocity  of 
the  center  of  the  ball  is  v0 . 

a. What is the instantaneous angular velocity of the ball?

b.  What  is  the  instantaneous  angular  velocity  of  the 
line  joining  the  center  of  the  ball  to  the  center  of  the  loop? 

c.  Evaluate  your  results  for  r = 1.0  cm,  R = 5.0  cm, 
and  v0  = 20.0  cm/s. 

9-8. Geometry of the cross product. Two vectors $\mathbf{A}$ and $\mathbf{B}$
determine a parallelogram. Show that $|\mathbf{A} \times \mathbf{B}|$ is equal to
the area of the parallelogram. Hence the area can be represented
by a single vector $\mathbf{A} \times \mathbf{B}$. What is the spatial
relationship between this vector and the plane of the parallelogram?

9-9. What if the earth lost its caps? The ice in the Greenland
and Antarctic caps is almost entirely above the
present sea level. Suppose the earth's temperature rose
sufficiently to melt this ice. Neglecting any adjustments of
the solid earth, how would the length of earth's day be
affected? Explain your answer.

9-10. Encounter with a merry-go-round. A playground
merry-go-round is shaped like a bicycle wheel, with a circular
bench supported by rods arranged like spokes about
the vertical central axle. Its radius is 1.0 m, and its mass of
50 kg is concentrated in the circumference. The
merry-go-round is stationary. A 30-kg child runs along a line
tangent to the circumference with a speed of 5.0 m/s and
jumps onto the merry-go-round.

a. Neglecting friction, what angular velocity does the
merry-go-round acquire?

b. The child jumps from the moving merry-go-round,
being careful to push only along the radial direction.
What is the final angular velocity of the merry-go-round?

9-11.  A student's  turn.  A student  volunteer  is  sitting 
stationary  on  a piano  stool  with  her  feet  off  the  floor.  The 
stool  can  turn  freely  on  its  axle. 

a.  The  volunteer  is  handed  a nonrotating  bicycle 
wheel  which  has  handles  on  the  axle.  Holding  the  axle 
vertically  with  one  hand,  she  grasps  the  rim  of  the  wheel 
with  the  other  and  spins  the  wheel  clockwise  (as  seen  from 
above).  What  happens  to  the  volunteer  as  she  does  this? 

b.  She  now  grasps  the  ends  of  the  vertical  axle  and 
turns  the  wheel  until  the  axle  is  horizontal.  What  happens? 

c. Next she gives the rotating wheel to the instructor,
who turns the axle until it is vertical with the wheel
rotating clockwise, as seen from above. The instructor now
hands the wheel back to the volunteer. What happens?

d.  The  volunteer  grasps  the  ends  of  the  axle  and 
turns  the  axle  until  it  is  horizontal.  What  happens  now? 

e.  She  continues  turning  the  axle  until  it  is  vertical 
but  with  the  wheel  rotating  counterclockwise  as  viewed 
from  above.  What  is  the  result? 

9-12. A force couple. A pair of equal, oppositely
directed, but noncollinear forces is called a couple.
Consider the couple shown in Fig. 9E-12.

a.  What  torque  does  the  couple  exert  about  point  A? 

b.  What  torque  does  the  couple  exert  about  point  B? 

c.  If  a force  couple  is  applied  to  an  initially  stationary 
and  nonrotating  object,  describe  the  resultant  motion. 

Fig. 9E-12


9-13. Between earth and moon. The mass of the earth is
81 times the mass of the moon. The distance between
their centers is 384,000 km. Where is the center of mass of
the earth-moon system in relation to the surface of the
earth? Take the radius of the earth as 6400 km.

9-14. Center of mass in a three-particle system, I. Three
particles of equal masses are placed so that they form a 3,
4, 5 right triangle as shown in Fig. 9E-14. Locate their
center of mass using a cartesian system (x, y), with x and y
as shown and with the origin located at the vertex of the
right angle.

Fig. 9E-14

9-15. Center of mass in a four-particle system. Locate the
center of mass for the system shown in Fig. 9E-15, in
which the four particles form a square of side l. Use a
cartesian system (x, y) whose origin coincides with the 40-g
particle and whose orientation is as indicated.

Fig. 9E-15 (particles of 10 g, 20 g, 30 g, and 40 g at the corners of the square)

9-16. How much work? A plank of mass 25 kg and
length 2.0 m lies flat on the floor with one end against a wall.
A worker takes hold of the other end and raises the plank
to the vertical position by using the end against the wall as
a pivot. How much work does the worker do?

9-17. When the mass of a lever cannot be neglected. Modify
the calculation in Example 9-8 to take into account the
mass M of the lever.

9-18. A delicate balance. The beam of the balance in
Fig. 9E-18 has a mass of 25.0 g. The center of mass of the
beam is at G, 4.00 mm below the point of support, O.

a. If a load of exactly 1 mg is placed in one pan,
through what angle $\theta$ will the beam be deflected? Express
your result in radians.

b. If the pointer (of negligible mass) attached to the
beam is 20.0 cm long, through what distance in mm will
1 mg deflect the free end of the pointer? This is called the
sensitivity of the balance (in millimeters per milligram).

Fig. 9E-18

9-19. Surprisingly stable. A half-dollar is partly embedded
in a large cork into which two forks have been
stuck, as shown in Fig. 9E-19. If the edge of the coin is
placed on a needle, the system is stable. It can oscillate on
the point of the needle without falling over. Account for
the stability of this system.

Fig. 9E-19

Group B

9-20. Angular kinematics.

a. Derive Eq. (9-10), $\phi = \phi_i + (\omega^2 - \omega_i^2)/2\alpha$ for
constant $\alpha$, directly from Eqs. (9-7) through (9-9).

b. Derive Eq. (9-11), $\phi = \phi_i + (\omega + \omega_i)t/2$ for
constant $\alpha$, directly from Eqs. (9-7) through (9-9).

9-21. Running down. Modify Eq. (9-23),
$\mathbf{a} = \alpha r\,\hat{\boldsymbol{\theta}} - \omega^2 r\,\hat{\mathbf{r}}$,
so that it describes the total acceleration of a
point on the rim of a flywheel whose angular speed is
decreasing. Remember that $\alpha = |\boldsymbol{\alpha}| = |d\omega/dt|$.


9-22.  Central  force  and  angular  momentum  conservation. 
A particle  is  acted  on  by  a repulsive  central  force.  Write  a 
vector  expression  for  the  force,  and  then  use  it  to  prove 
that  the  particle  maintains  a constant  angular  momentum 
in  an  inertial  frame  about  an  origin  located  at  the  force 
center. 

9-23.  Alternative  derivation  of  the  rotational  form  of 
Newton’s  third  law.  Modify  the  derivation  of  Sec.  4-5  to 
prove  that  the  law  of  conservation  of  angular  momentum 
and  the  rotational  form  of  Newton’s  second  law  lead  to  the 
rotational  form  of  Newton’s  third  law. 

9-24. Which way will it go? A spool of thread rests on a
level tabletop, as shown in Fig. 9E-24. The thread is pulled
gently, so that there is no slippage at P, the point of contact
between the spool and the tabletop. For each of the
thread positions a through d, determine which way the
spool will roll. Explain your answers. Notice that in position
d the line determined by the thread passes through
point P.

Fig. 9E-24

9-25.  Eight  lives  to  go.  A uniform  board  6.0  m long 
overhangs  the  edge  of  a table  by  2.0  m.  The  mass  of  the 
board  is  12.0  kg.  An  8.0-kg  cat  walks  out  along  the  board. 
How  far  from  the  edge  of  the  table  will  the  cat  be  when 
the  board  begins  to  tip? 


9-26. Center of mass in a three-particle system, II. As indicated
in Fig. 9E-26, the position vectors of three particles
of equal mass m are given by a, b, and c.


Fig.  9E-26 


a.  Find  the  position  vector  r which  locates  the  center 
of  mass  of  the  system. 

b.  Show  that  the  position  specified  by  r lies  precisely 
at  the  intersection  of  the  medians  of  the  triangle  ABC  of 
Fig.  9E-26. 

9-27. Center of mass of a homogeneous plane triangle.
Prove that the center of mass of a homogeneous sheet-metal
triangle lies at the intersection of the medians. (This
point is called the barycenter.)

9-28. Center of mass of a homogeneous hemisphere. A
hemispherical object of uniform density has radius R.
Prove by integration that its center of mass lies along its
axis of symmetry at a distance 3R/8 from the center of the
plane surface.


9-29.  Center  of  mass  of  a homogeneous  tetrahedron.  Prove 
that  the  center  of  mass  of  any  tetrahedron  of  uniform 
density  lies  at  the  common  intersection  of  four  lines,  each 
drawn  from  a vertex  to  the  intersection  of  the  medians  of 
the  opposite  base.  (It  may  be  helpful  to  refer  to  Exercise 
9-27.) 


9-30. Take the upper end! A heavy trunk is uniformly
filled so that its center of mass is at its center of volume. Its
length is twice its height. As shown in Fig. 9E-30, it is
being carried down a flight of stairs by two movers, A and
B, with A in front. The trunk is at an angle of 45° with the
horizontal, and A and B apply strictly vertical forces at
opposite ends of the bottom. What fraction of the weight
does A carry?


Fig.  9E-30 


9-31. No tipping, please. A uniformly loaded crate of
height h and width w rests on the floor. The coefficient of
kinetic friction between crate and floor is $\mu_k$.

a. Suppose the crate is pulled slowly along the floor
by a horizontal force F applied at the level of the center of
mass. How far from the center of the base does the resultant
normal reaction force of the floor act?

b. What is the maximum height above the floor at
which the horizontal force F can be applied without tipping
the crate?

c. Evaluate your results of parts a and b for h = 1.00
m, w = 0.50 m, and $\mu_k$ = 0.33.


9-32.  The  tasks  of  a hinge.  A uniform  door  of  height  h, 
width  b,  and  mass  m is  hung  from  two  hinges,  which  are  at 
a distance  d from  the  upper  and  lower  ends  of  the  door, 
respectively. 


386  Rotational  Motion,  I 


a. What is the magnitude of the horizontal component
of the force exerted on the door (1) by the upper
hinge; (2) by the lower hinge?

b. What is the total magnitude of the vertical component
of the force carried by both hinges? (Note: The separate
vertical components of the force carried by each
hinge cannot be determined from the information given.)

9-33.  Finding  the  center  of  mass  of  a rod.  A horizontal 
rod or beam of mass M (not necessarily uniform) is supported
by a person using one finger from each hand. The
points of support lie on opposite sides of the center of
mass at distances $d_1$ and $d_2$ from it.

a. Show that the magnitudes of the normal forces $N_1$
and $N_2$ exerted by the fingers are given by
$N_1 = Mg\,d_2/(d_1 + d_2)$ and $N_2 = Mg\,d_1/(d_1 + d_2)$.

b. Show that $N_1 > N_2$ if $d_1 < d_2$.

c. Suppose that the person supporting the rod moves
his hands closer together by exerting inward-directed
horizontal forces that are equal in magnitude and just
strong enough to cause slippage between (at least) one of
the support fingers and the rod. Assuming that the same
coefficient of friction applies to the contact between each
finger and the rod, show that if $d_1 < d_2$, then only finger 2
moves relative to the rod until $d_1 = d_2$, after which both
fingers move equally relative to the rod. Show also that if
$d_1 > d_2$, then only finger 1 moves relative to the rod, until
$d_1 = d_2$, after which both fingers move equally.

d. Use the above results to explain why the following
method can be used to locate the center of mass of any
rod: “Use one finger from each hand to support the rod in
a horizontal position. Using the smallest horizontal force
that results in some motion, draw the support fingers inward.
Continue until the fingers touch. When they do so,
they are directly under the center of mass of the rod.”

e. Suppose the coefficients of friction for the two
supports differ slightly. Will the method still work? Explain
your answer.

9-34. Slippery job. Consider once more the 20-kg
ladder described in Example 9-9, but now suppose that
both  the  ground  and  the  wall  are  frictionless.  The  ladder 
is  kept  from  slipping  by  a horizontal  rope  which  is  tied  to 
the  ladder’s  center  and  anchored  to  the  wall.  A 70-kg  man 
is  standing  on  the  ladder  at  some  point  P above  C,  the 
midpoint  of  the  ladder. 

a.  What  force  does  the  ground  exert  on  the  base  of 
the  ladder? 

b.  What  force  does  the  wall  exert  on  the  top  end  of 
the  ladder? 

c.  What  is  the  tension  in  the  rope? 

9-35.  The  statics  of  a derrick.  As  shown  in  Fig.  9E-35,  a 
derrick  consists  of  a uniform  6.00-m  boom  of  mass  100  kg, 
hinged  to  a vertical  mast.  A load  of  mass  400  kg  is  fastened 
to  the  free  end  of  the  boom.  A cable  is  attached  to  a hook 
1.50  m from  the  end  of  the  boom.  The  cable  is  fastened  to 
the  mast,  and  the  angles  between  the  various  members  are 
as shown.


Fig.  9E-35 


a.  What  is  the  tension  in  the  cable? 

b.  What  are  the  horizontal  and  vertical  components 
of  the  force  exerted  by  the  hinge  on  the  base  of  the  boom? 

c.  Using  an  origin  located  at  the  hinge,  evaluate  the 
torque  exerted  on  the  mast  by  the  cable.  What  prevents 
the  mast  from  toppling? 


9-36.  To  maintain  equilibrium.  As  shown  in  Fig.  9E-36, 
a light,  bent  rod  is  pivoted  at  point  B.  The  distances  AB, 
BC,  and  CD  are  equal,  and  the  angles  at  B and  C are  right 
angles. A force $F_1$ of magnitude 100 N is applied at point
A; the force is directed parallel to the line from B to C. A
second force $F_2$ is applied at point D in order to maintain
the equilibrium of the rod. What is the minimum possible
magnitude of $F_2$?


Fig. 9E-36


9-37.  Hanging  by  a thread,  I.  A homogeneous  sphere  is 
suspended  by  a string  from  a wall.  The  sphere  is  set 
against  the  wall  in  such  a way  that  the  point  of  attachment 
A of  the  string  to  the  sphere  is  vertically  above  the  center 
of  the  sphere  O.  Prove  that  the  coefficient  of  static  friction 
between  the  sphere  and  wall  must  equal  or  exceed  1 if  the 
sphere  is  to  remain  in  the  given  position. 

9-38. Supporting a sign. As shown in Fig. 9E-38, a
50-kg  sign  is  supported  by  a light  horizontal  beam.  The 
sign’s  mass  is  uniformly  distributed.  The  beam  is  hinged 
to  a wall  bracket  at  one  end.  The  other  end  of  the  beam  is 
supported  by  a guy  wire  which  makes  an  angle  of  30°  with 
the  horizontal. 

a. Show that the force $F_B$ exerted by the wall bracket
on the beam acts along a line that passes through the midpoint
of the guy wire. What angle does $F_B$ make with the
horizontal?


Exercises  387 


Fig.  9E-38 


Fig.  9E-42 


b. Determine the magnitude of the force $F_B$ and its
horizontal and vertical components.

c.  What  is  the  magnitude  of  the  tension  in  the  guy 
wire? 

9-39. A moving center of mass. A body of mass $m_1$ is
moving with velocity $\mathbf{v}_1$. A second body of mass $m_2$ is
stationary.

a. What is the velocity of the center of mass of the
system consisting of the two bodies?

b. What is the velocity $\mathbf{v}_1'$ of body 1 relative to the
center of mass?

c. What is the velocity $\mathbf{v}_2'$ of body 2 relative to the
center of mass?

d.  What  is  the  momentum  of  the  system  in  a frame  of 
reference  attached  to  the  moving  center  of  mass? 

9-40.  Frozen  fast.  A hollow  cylindrical  tube  is  resting 
on  its  side  on  a horizontal  surface,  half-filled  with  water. 
The water freezes, so that the tube is half full of ice, which
has a mass $M_i$. The tube has outer radius R and inner
radius r. The total mass of the tube and ice is $M_t + M_i$.

a. Determine the location of the center of mass of the
system.

b. Suppose that the tube rolls to one side (without
slipping) so that it turns through an angle $\theta$. What is the
increase in gravitational potential energy of the system?
Assume  that  the  ice  is  frozen  fast  to  the  inner  wall  of  the 
tube. 

9-41.  An  important  intersection.  Explain  why  the  center 
of mass of the body in Fig. 9-31 is located at the intersection
of the two dashed lines.

9-42.  Making  the  best  of  an  unequal-arm  balance.  The 
mass  of  an  object  is  to  be  determined  by  using  a balance.  It 
is  suspected  that  the  arms  of  the  balance  are  not  exactly 
equal; that is, $l_1 \neq l_2$ in Fig. 9E-42. The body of unknown
mass M is placed in the right pan, and a known mass $m_1$ is
required in the left pan to level the beam. The body of
unknown mass is now placed in the left pan. It is found
that a known mass $m_2$ is required in the right pan for
leveling.

a. Show that $M = \sqrt{m_1 m_2}$.

b. Show that $l_1/l_2 = \sqrt{m_1/m_2}$.

9-43. Torque and potential energy. Verify Eq. (9-58),
$T(\theta) = -dU(\theta)/d\theta$, by following the procedure suggested
below that equation.


9-44.  Off  her  rocker.  A homogeneous,  hemispherical 
solid of mass M and radius R is resting on a horizontal surface.
An aluminum pole of negligible mass extends vertically
upward from the center of the flat face of the hemisphere.
As shown in Fig. 9E-44, an acrobat of mass m is
climbing the pole. How far up the pole can she climb before
the equilibrium becomes unstable?


Fig.  9E-44 


9-45. Futuristic rocking chair? A homogeneous half-cylinder
of mass M and radius R is able to rock without
slipping on a horizontal surface, as shown in Fig. 9E-45.
Find the gravitational potential energy of the system as a
function of the angle $\theta$. (Assume that $\theta < 90°$.)

Fig.  9E-45 


9-46.  Weighting  the  base.  A uniform  rod  of  length  100 
cm  and  mass  300  g protrudes  from  a cylindrical  can,  with 
its  bottom  end  resting  against  the  bottom  edge  of  the  can. 
The  can  has  a diameter  of  12.0  cm,  a height  of  16.0  cm, 
and  negligible  mass.  What  minimum  depth  of  water  must 




be  poured  into  the  can  to  prevent  the  can  from  tipping? 
The  density  of  water  is  1.00  g/cm3. 

9-47. Point of no return. Consider the uniform block
shown in Fig. 9E-47.

a. What is the maximum angle through which the
block can be rotated about edge AB and still return to its
original position when released?

b. What would the maximum angle be if the block
were rotated about edge BC?

9-48. Topspin. Apply Eq. (9-21), $\mathbf{a} = \boldsymbol{\alpha} \times \mathbf{r} + \boldsymbol{\omega} \times \mathbf{v}$,
to describe qualitatively the acceleration $\mathbf{a}$ of a point fixed
on the surface of a child’s top at the widest point of the
top. The top spins about its axis of symmetry with angular
velocity $\boldsymbol{\omega}$. The magnitude of $\boldsymbol{\omega}$ is constant, but its
direction changes as the axis of symmetry moves
around a cone, as in Fig. 9-2. First make a sketch at an instant
when $\boldsymbol{\omega}$ is just moving into the plane of the page,
and determine the direction of the top’s angular acceleration
$\boldsymbol{\alpha}$. Next make two more sketches with $\boldsymbol{\omega}$ just moving
into the plane of the page. In the first of these, the point
fixed on the surface of the top lies outside the cone
described by the rotation of $\boldsymbol{\omega}$. In the second, which illustrates
the situation a short time later, the point has rotated
around the axis of the top so that it lies inside the cone.
Use Eq. (9-19), $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$, to find the direction of the
velocity $\mathbf{v}$ of the point in both cases, measuring its position
vector $\mathbf{r}$ from an origin at the pointed tip of the top. Then
find the direction of $\mathbf{a}$ in both cases. Also compare the
magnitude of $\mathbf{a}$ in the two cases. Can you relate the behavior
of $\mathbf{a}$ to the path traced out in space by the point fixed
on the surface of the top?

9-49. Surmounting an obstacle.

a. A roller is 1.0 m in diameter and has a mass of 50
kg. Find the magnitude, direction, and point of application
of the smallest applied force that can get it on top of a
flat block 0.30 m high. Assume that the roller does not slip
at its line of contact with the upper edge of the block.

b. If the force found in part a is applied, what will be
the magnitude and direction of the force F' that the edge
of the block exerts on the roller as the roller loses contact
with the floor?

c. For the force F' found in part b, find the component
tangential to the roller surface as well as the normal
component. Evaluate the ratio of the tangential component
to the normal component.

9-50. A common (and frustrating) occurrence. A dresser
drawer has depth l and width w. The handles are a distance
d apart and are properly centered, as shown in Fig.
9E-50. The drawer binds when it is pushed with a force F
at one handle, as shown in the figure. This happens because
the corners A and B press against the drawer guides;
the resulting frictional forces $f_A$ and $f_B$ immobilize the
drawer. The coefficient of static friction between the
drawer and the guides is $\mu_s$.

Fig. 9E-50

a. What is the relationship between the normal forces
$N_A$ and $N_B$?

b. What is the relationship between the frictional
forces $f_A$ and $f_B$ if the drawer is just barely stuck?

c. Find $f_A$ and $f_B$ in terms of F if the drawer is stuck.

d. Given that the drawer sticks when pushed at one
handle, find a relationship among l, d, and $\mu_s$.

e. Can the sticking be overcome simply by pushing
harder, that is, by increasing the magnitude of F? Explain
your answer.

9-51. The ground, the wall, and the ladder. A uniform
ladder is standing on a horizontal surface and leaning
against a vertical wall. The coefficient of static friction
between the ladder and the horizontal ground surface is
$\mu_G$, and the coefficient of static friction between the ladder
and the wall is $\mu_W$.

a. In terms of $\mu_G$ and $\mu_W$, find the minimum elevation
angle $\theta_0$ at which the ladder can be leaned without beginning
to slip.

b. Evaluate $\theta_0$ for $\mu_G = \mu_W = 0.40$.

c. Evaluate $\theta_0$ for $\mu_G = 0.40$ and $\mu_W = 0$.

d. Evaluate $\theta_0$ for $\mu_G = 0$ and $\mu_W = 0.40$.

e. Based on your results for parts b to d, which coefficient
of friction appears to be more important in determining
the minimum elevation angle?

9-52.  Appropriate  compensation.  An  Atwood  machine  is 
set  up  on  a lever,  as  shown  in  Fig.  9E-52.  Frictional  forces 
and pulley masses are negligible. Initially, the masses $m_A$
and $m_B$ are equal, and weights are placed in the pan until
the beam is horizontal. Then an amount of mass m is
transferred from $m_A$ to $m_B$, so that $m_A' = m_A - m$ and
$m_B' = m_A + m$.




a. How much mass $\Delta m$ must be simultaneously removed
from the pan if the lever is to remain horizontal
while the Atwood machine operates?

b. Evaluate the result of part a for the case $m_A$ =
0.100 kg, m = 0.010 kg, $l_1 = l_3$ = 0.100 m, and $l_2$ =
0.050 m.

9-53.  Center  of  mass  of  a right-angle  bracket.  Consider  a 
uniform right-angle iron bracket, ABC, with equal arms of
length l.

a. Where is the center of mass of the system relative
to B, the angle vertex?

b.  If  end  A is  tied  to  a vertical  string  and  the  bracket 
is  suspended,  what  will  be  the  angle  between  the  line  AB 
and  the  downward  vertical? 

9-54.  A well-designed  scale.  Figure  9E-54  is  a schematic 
diagram of a platform scale similar to ones used by physicians.
This type of scale has three advantages: (1) It can be
used to measure a large mass with a small one; (2) the result
is independent of the position of the body of mass M
on the platform EF; (3) EF always remains horizontal.
These advantages result from the construction of the scale
so that the ratios BD/BC and GH/GF are equal. Call the
ratio n.

a. If EK = a and KF = b, what are the forces $F_E$
pulling at E and $F_F$ pushing at F?

b. What is the force $F_H$ pulling at H?

c. Write the equation for the equilibrium of the beam
AD and show that the value of the mass m required for
equilibrium satisfies the condition mg × AB = Mg × BC.

d. If AB/BC = 100, what is the relation between M
and m? Show that your result is independent of the position
of the body of mass M along the platform EF.

e.  Suppose  E descends  a distance  d.  How  far  will  H 
descend?  How  far  will  F descend?  Show  that  the  platform 
EF  which  bears  the  load  remains  horizontal. 


9-55. The critical slope. A cylinder lies on an incline of
angle $\theta$. A string is wrapped around it many times over, so
as to cover its surface. The string continues above the cylinder
parallel to the incline. The end of the string is tied to
a nail. Let the coefficient of static friction between the
string and the plane be $\mu_s$. Prove that if $\tan\theta < 2\mu_s$, the
cylinder will remain in place.

9-56.  Pappus’s  theorem.  A plane  lamina  (that  is,  a flat, 
thin  sheet  of  material  of  uniform  density  and  thickness)  is 
shown  in  Fig.  9E-56.  This  lamina  has  surface  area  A,  total 
mass M, and mass per unit area $\sigma = M/A$. The shape of
the lamina is arbitrary. The lamina lies in the xy plane.
The x axis does not pass through it, though it may be
tangent to the lamina. The center of mass of the lamina
lies at some point whose y coordinate is $Y_c$.

Fig. 9E-56

a. Suppose that the lamina is revolved about the x
axis to sweep out a solid of revolution. Prove that the volume
V of the solid (not including the axial hole that may
pass through it) satisfies the equation $V = 2\pi A Y_c$. This result
is known as Pappus’s theorem, after the Alexandrian
mathematician Pappus (fourth century A.D.).

b.  To  see  one  practical  use  of  Pappus’s  theorem, 
apply  it  to  determine  the  position  of  the  center  of  mass  of 
a semicircular  piece  of  sheet  metal  whose  radius  is  R. 

9-57.  Precarious  balance.  Consider  a stack  consisting  of 
a large  number  of  uniform  and  identical  rectangular 
boards  of  length  l.  Initially  the  boards  are  neatly  stacked 
one  above  another;  starting  from  the  top,  they  are  labeled 
1,  2,  3,  and  so  on. 

a. Board 1 is moved (along the length of the stack) as
far as possible without causing it to tilt. What is this maximum
overhang?




b.  With  board  1 overhanging  board  2 by  the  distance 
found  in  part  a,  boards  2 and  1 are  moved,  in  the  same 
direction  as  in  part  a,  as  far  as  possible  without  causing 
board  2 to  tilt.  By  how  much  does  board  2 overhang  board 
3?  What  is  the  total  overhang  of  boards  1 and  2?  That  is, 
by  how  much  does  board  1 overhang  board  3? 

c. The process described in parts a and b is continued
until boards 1 through N are all displaced as far as possible.
By how much does board N overhang board N + 1?
What is the total overhang of boards 1 through N?

d. Show that the total overhang of boards 1 to 4 is
greater than l, the length of each board.

e. Show that the total overhang found in part c increases
without limit as $N \to \infty$.

9-58.  Hanging  by  a thread,  II.  A homogeneous  sphere 
of  mass  M and  radius  R is  suspended  from  a vertical  wall 


by  a string  of  length  L.  There  is  no  friction  between  the 
wall  and  the  sphere. 

a.  Find  the  tension  S in  the  string. 

b.  Find  the  normal  force  N exerted  on  the  sphere  by 
the  wall. 

c.  For  what  value  of  L/R  does  the  tension  equal  twice 
the  sphere’s  weight  (S  = 2 Mg)?  What  is  the  corresponding 
value  of  N ? 

d. Find appropriate approximations for S and for N
in the limiting cases $L \ll R$ and $L \gg R$.

9-59. On top of the globe. A homogeneous, solid hemisphere
of radius r is sitting, flat side up, on top of a fixed
sphere  of  radius  R.  The  contact  surface  is  sufficiently 
rough  to  prevent  slippage,  but  the  hemisphere  can  roll 
freely.  What  is  the  minimum  value  of  R for  which  the 
equilibrium  is  a stable  equilibrium? 




Rotational  Motion,  II 


10-1 MOMENT OF INERTIA

When you analyze the translational motion of a rigid body, it is not necessary
to consider its individual particles because each has the same velocity $\mathbf{v}$.
Furthermore, the total momentum of the body is related to $\mathbf{v}$ in a very
simple way. If you write the total momentum as $\mathbf{P}$, then $\mathbf{P} = M\mathbf{v}$, where the
scalar constant M is the total mass of the body. The translational form of
Newton’s second law gives you an equation for the rate of change of $\mathbf{v}$, in
terms of M and the net force applied to the body. You have had considerable
experience working with that equation, studying its solutions and their
physical interpretations in a wide variety of cases.

It would be very desirable for you to be able to use an analogous procedure
in analyzing the rotational motion of a rigid body. If this were possible,
you could use analogy to carry over directly to rotational motion
much of what you have learned from the study of translational motion. Is
it possible? Each particle of a rigid body rotating about some axis has the
same angular velocity $\boldsymbol{\omega}$, just as each particle of a translating rigid body has
the same velocity. Furthermore, in certain circumstances the body’s total
angular momentum $\mathbf{L}$ about some origin can be written as

$\mathbf{L} = I\boldsymbol{\omega}$   (10-1)

The constant I is called the moment of inertia of the body for rotation
about the axis. When Eq. (10-1) is valid, prediction of the rotational motion
of the body from $\mathbf{T}$, the net torque about the origin that acts on the body, is
completely analogous to the prediction of the translational motion of a
body from $\mathbf{F}$, the net force acting on it. This is so because the rotational
form of Newton’s second law, $\mathbf{T} = d\mathbf{L}/dt$, is mathematically equivalent to its
translational form $\mathbf{F} = d\mathbf{P}/dt$ and because the relation $\mathbf{L} = I\boldsymbol{\omega}$ is mathematically
equivalent to the relation $\mathbf{P} = M\mathbf{v}$.




To see an example of the analogy, recall that for constant M the translational
form of Newton’s second law gives $\mathbf{F} = d\mathbf{P}/dt = d(M\mathbf{v})/dt =
M\,d\mathbf{v}/dt$. Writing $d\mathbf{v}/dt = \mathbf{a}$, the body’s acceleration, you have $\mathbf{F} = M\mathbf{a}$. This
is the equation you have used most frequently in studying translational mechanics.
If the relation $\mathbf{L} = I\boldsymbol{\omega}$ is valid and I is constant, then the rotational
form of Newton’s second law gives $\mathbf{T} = d\mathbf{L}/dt = d(I\boldsymbol{\omega})/dt = I\,d\boldsymbol{\omega}/dt$. Writing
$d\boldsymbol{\omega}/dt = \boldsymbol{\alpha}$, the angular acceleration of the body, you obtain

$\mathbf{T} = I\boldsymbol{\alpha}$   (10-2)

This equation allows you to find a body’s angular acceleration from the net
torque acting on it and its moment of inertia. Then you can use the angular
acceleration to determine the angular velocity and the angular position of
the body as functions of time. Since the procedure is identical, from a
mathematical point of view, to the one starting with the equation $\mathbf{F} = M\mathbf{a}$,
frequently you will not have to repeat the detailed mathematical arguments.
Instead you can go directly to the physical insight you seek.
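As an illustration of this procedure, the following short Python sketch treats the simplest case: rotation about a fixed axis under a constant net torque, so that the scalar form of Eq. (10-2) gives a constant angular acceleration. The function name and all numerical values are assumptions chosen only for illustration; they do not come from the text.

```python
# Sketch: rotational analog of F = M a for a fixed axis, T = I * alpha.
# The function name and every numerical value here are illustrative assumptions.

def angular_motion(torque, inertia, omega0, theta0, t):
    """Return (alpha, omega, theta) at time t for a constant net torque."""
    alpha = torque / inertia                           # from T = I * alpha, Eq. (10-2)
    omega = omega0 + alpha * t                         # analog of v = v0 + a t
    theta = theta0 + omega0 * t + 0.5 * alpha * t**2   # analog of x = x0 + v0 t + a t^2 / 2
    return alpha, omega, theta

# Torque in N*m, moment of inertia in kg*m^2, angular speed in rad/s, time in s.
alpha, omega, theta = angular_motion(torque=6.0, inertia=2.0,
                                     omega0=1.0, theta0=0.0, t=4.0)
print(alpha, omega, theta)
```

The two kinematic lines are exactly the constant-acceleration formulas of translational motion with the replacements x → θ, v → ω, a → α, which is the point of the analogy.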

However, the relation $\mathbf{L} = I\boldsymbol{\omega}$, with I a scalar constant, is valid only in
certain cases. While the angular velocity $\boldsymbol{\omega}$ is, by definition, always directed
along the axis about which a rigid body rotates, you will soon see that the
total angular momentum $\mathbf{L}$ of the body is not necessarily directed along the
rotation axis. Thus $\mathbf{L}$ may be at an angle to $\boldsymbol{\omega}$. When this is the case, the
equation $\mathbf{L} = I\boldsymbol{\omega}$ certainly cannot be correct. But you will see also that there
are many cases in which $\mathbf{L} = I\boldsymbol{\omega}$ is a valid equation because $\mathbf{L}$ is in the direction
of $\boldsymbol{\omega}$ and because the magnitude of $\mathbf{L}$ is proportional to the magnitude
of $\boldsymbol{\omega}$. These cases, which fortunately include many of great practical interest,
are ones in which the particles whose masses make up the total mass of
the body are distributed in special ways relative to the rotation axis, or relative
to the origin used to define $\mathbf{L}$.

Furthermore, you will see that even when the total angular momentum
of a rigid body is not aligned parallel to the rotation axis, its component
$L_\parallel$ in a particular direction along that axis has a value proportional
to the value of the signed scalar $\omega$ specifying the angular velocity of the
body. Thus in all cases the scalar relation $L_\parallel = I\omega$ is correct. The positive
constant I appearing in this relation also is called the body’s moment of inertia
for rotation about the axis. The relation makes it possible to use the
equation $T_\parallel = I\alpha$ to find the component $T_\parallel$ of the net torque applied to the
body when it has an angular acceleration given by the signed scalar $\alpha$. You
cannot use a similar equation to determine the components of the net
torque in directions perpendicular to the rotation axis. But in many practical
circumstances $T_\parallel$ is the only component you need to know.

The  actual  value  of  the  moment  of  inertia  depends  not  just  on  the  total 
mass  of  a rigid  body  but  also  on  how  that  mass  is  distributed  with  respect  to 
the  axis  about  which  the  body  rotates.  Although  in  the  newtonian  domain  a 
particular  rigid  body  has  a unique  total  mass  M,  it  does  not  have  a unique 
moment  of  inertia  I.  Changing  the  rotation  axis  changes  the  value  of  I. 
Thus  the  concept  of  moment  of  inertia  is  more  complex  and  less  general 
than  the  concept  of  mass.  Nevertheless,  it  is  a worthwhile  concept  because  it 
leads  to  an  efficient  way  of  treating  rotational  motion. 

We begin this treatment by investigating the relation between a body’s
angular momentum about an origin and its angular velocity. First we consider
the simplest possible case, illustrated in Fig. 10-1a. A single particle of
mass m is rotating in a circle with angular velocity $\boldsymbol{\omega}$. We fix the origin O,
used to define the particle’s angular momentum $\mathbf{l}$, on the rotation axis at its
intersection with the plane of the circle. Then the particle’s position vector
$\mathbf{r}$ is in the plane of rotation and is always perpendicular to its momentum
vector $\mathbf{p}$. Since $\mathbf{p}$ also lies in that plane, the angular momentum $\mathbf{l} = \mathbf{r} \times \mathbf{p}$ is
perpendicular to the plane. The right-hand rules show that $\mathbf{l}$ points in the
same direction as $\boldsymbol{\omega}$. This parallelism of $\mathbf{l}$ and $\boldsymbol{\omega}$ is essential to the argument
which follows.

Fig. 10-1 (a) A single particle rotating in a circle centered on a point O, which is
used for an origin. Its angular velocity is $\boldsymbol{\omega}$. (b) A thin, flat, rigid plate rotating
in its own plane. The n particles of the body are all rotating in circles centered
on the point O that is used for the origin, but only the jth particle is shown. They
all have the same angular velocity $\boldsymbol{\omega}$.

The magnitude of $\mathbf{l}$ has the value

$l = |\mathbf{r} \times \mathbf{p}| = rp$

because $\mathbf{r}$ and $\mathbf{p}$ are perpendicular. Since p = mv, where v is the particle’s
speed, we can write this as

$l = rmv$

But the particle’s velocity $\mathbf{v}$ is given by $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$. So $v = |\boldsymbol{\omega} \times \mathbf{r}| = \omega r$ because
$\boldsymbol{\omega}$ and $\mathbf{r}$ are perpendicular. Thus we have

$l = rm\omega r = mr^2\omega$

The magnitude of $\mathbf{l}$ is proportional to the magnitude of $\boldsymbol{\omega}$, the proportionality
constant being $mr^2$. Since the two vectors are in the same direction, we
can write the vector equation

$\mathbf{l} = mr^2\boldsymbol{\omega}$

We express this result as

$\mathbf{l} = i\boldsymbol{\omega}$   (10-3)

where

$i = mr^2$   (10-4)

The quantity i is a rudimentary example of a moment of inertia. Specifically,
it is the moment of inertia of the particle for rotation about an axis for
which the perpendicular distance from the axis to the particle always has
the same value r.
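The chain of steps leading to Eqs. (10-3) and (10-4) can be checked numerically. The sketch below builds $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$, $\mathbf{p} = m\mathbf{v}$, and $\mathbf{l} = \mathbf{r} \times \mathbf{p}$ for one particle and compares the result with $mr^2\boldsymbol{\omega}$. The choice of the z axis as the rotation axis, the helper function `cross`, and all numerical values are assumptions made only for this illustration.

```python
import math

# Sketch: check l = r x p against l = (m r^2) omega for a single particle
# whose origin O lies in the plane of its circular path.
# The helper name and every numerical value are illustrative assumptions.

def cross(a, b):
    """Cross product of two 3-component vectors given as tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

m = 2.0          # mass, kg
r_mag = 0.5      # radius of the circle, m
w = 3.0          # angular speed, rad/s
phi = 0.7        # arbitrary position angle on the circle, rad

omega = (0.0, 0.0, w)                                   # rotation axis taken along z
r = (r_mag * math.cos(phi), r_mag * math.sin(phi), 0.0) # position vector in the xy plane

v = cross(omega, r)                 # v = omega x r
p = tuple(m * c for c in v)         # p = m v
l = cross(r, p)                     # l = r x p

i = m * r_mag**2                    # Eq. (10-4)
expected = tuple(i * c for c in omega)   # Eq. (10-3): l = i omega

print(l, expected)
```

Changing `phi` moves the particle around its circle but leaves `l` unchanged, which reflects the parallelism of $\mathbf{l}$ and $\boldsymbol{\omega}$ for this choice of origin.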


Now consider the case shown in Fig. 10-1b. There are n particles
forming a thin, flat rigid plate of irregular shape and mass distribution.
The plate is rotating in its own plane about a fixed origin O chosen in that
plane. Since Eqs. (10-3) and (10-4) apply to each of its particles, a typical
one has angular momentum

$\mathbf{l}_j = m_j r_j^2 \boldsymbol{\omega}$

The total angular momentum about O of the body is

$\mathbf{L} = \sum_{j=1}^{n} \mathbf{l}_j = \sum_{j=1}^{n} m_j r_j^2 \boldsymbol{\omega} = \left( \sum_{j=1}^{n} m_j r_j^2 \right) \boldsymbol{\omega}$

No subscript is needed on the $\boldsymbol{\omega}$ since the angular velocity is the same for all
the particles in the rigid body. We can write this result as

$\mathbf{L} = I\boldsymbol{\omega}$   (10-5)

where

$I = \sum_{j=1}^{n} m_j r_j^2$   (10-6)




The quantity I is the moment of inertia of the rigid body for rotation about
the axis.

Note that we can also get this result by evaluating

$I = \sum_{j=1}^{n} i_j$

with

$i_j = m_j r_j^2$

so that, again,

$I = \sum_{j=1}^{n} m_j r_j^2$

Thus the total moment of inertia of a rigid body for rotation about an axis equals
the sum of the moments of inertia of its component particles for rotation about that
axis, just as its total mass equals the sum of their masses.
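A small numerical sketch may make the summation concrete. It models a "plate" as three particles in the xy plane, evaluates the sum for I, and checks that the directly summed angular momentum equals I times the angular velocity. The particle masses and positions, the angular speed, and the helper function `cross` are all assumptions chosen for illustration.

```python
# Sketch: total angular momentum of a thin plate rotating in its own plane.
# The three "particles", the angular speed, and the helper name are
# illustrative assumptions, not data from the text.

def cross(a, b):
    """Cross product of two 3-component vectors."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

# (mass, x, y) of each particle; the plate lies in the xy plane and the
# origin O is where the rotation axis (the z axis) meets that plane.
particles = [(1.0, 0.0, 0.0), (2.0, 1.0, 0.0), (3.0, 0.0, 2.0)]
omega = (0.0, 0.0, 2.0)

# I = sum over particles of m_j r_j^2
I = sum(m * (x*x + y*y) for m, x, y in particles)

# Direct sum of l_j = r_j x p_j, with p_j = m_j (omega x r_j)
L = [0.0, 0.0, 0.0]
for m, x, y in particles:
    r = (x, y, 0.0)
    v = cross(omega, r)
    p = tuple(m * c for c in v)
    l = cross(r, p)
    L = [L[k] + l[k] for k in range(3)]

print(I, L)   # L should point along z with magnitude I * omega
```

The particle sitting at the origin contributes nothing to either sum, since its distance from the axis is zero.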


Fig. 10-2 The same rotating particle
shown in Fig. 10-1a. If the origin O is
not chosen to be in the plane of its rotation,
its angular momentum $\mathbf{l}$ is not parallel
to its angular velocity $\boldsymbol{\omega}$.


We have shown that for a flat plate of arbitrary mass distribution the
angular momentum $\mathbf{L}$ is proportional to the angular velocity $\boldsymbol{\omega}$ if the plate
is rotating about an axis perpendicular to its plane and if the origin O is
chosen at the intersection of the axis and the plane of rotation. The proportionality
constant I is the moment of inertia of the flat plate when rotating
about the perpendicular axis.

There is no unique moment of inertia I for a particular flat plate. The value of
I depends on where the perpendicular rotation axis passes through the
plate. For different rotation axes, all the $r_j^2$ terms in Eq. (10-6) change, and
so there is a different value of I.
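To see this dependence on the axis in numbers, the sketch below evaluates the sum in Eq. (10-6) for the same set of particles about two different perpendicular axes. The particle data, the axis locations, and the function name `moment_of_inertia` are assumptions made for illustration only.

```python
# Sketch: the same flat "plate" has different moments of inertia about
# different perpendicular axes. Particle data, axis locations, and the
# function name are illustrative assumptions.

def moment_of_inertia(particles, axis_xy):
    """Sum of m_j r_j^2, with r_j measured from the point (x0, y0)
    where the perpendicular rotation axis passes through the plate."""
    x0, y0 = axis_xy
    return sum(m * ((x - x0)**2 + (y - y0)**2) for m, x, y in particles)

# (mass, x, y) of each particle in the plane of the plate
particles = [(1.0, 0.0, 0.0), (2.0, 1.0, 0.0), (3.0, 0.0, 2.0)]

I_a = moment_of_inertia(particles, (0.0, 0.0))   # axis through the first particle
I_b = moment_of_inertia(particles, (1.0, 0.0))   # axis through the second particle

print(I_a, I_b)
```

The total mass is the same in both cases; only the distances $r_j$ have changed.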

Furthermore, for rotation about a particular perpendicular axis, it is
essential to locate the origin on the axis at its intersection with the plane of rotation of
the plate, as in Fig. 10-1b. Otherwise, Eq. (10-5) may not be valid. The
reason is that if the origin O is not chosen at the intersection of the rotation
axis and the rotation plane, L may not be parallel to ω. In that case, L and
ω cannot be connected by a scalar, as they are in Eq. (10-5). This can be
seen from Fig. 10-2, which shows the same rotating particle depicted in Fig.
10-1a but with the origin O fixed at a point on the rotation axis that is not at
the intersection with the rotation plane. The figure shows the position
vector r from O to the particle, its momentum vector p, and its angular
momentum vector l = r × p. The magnitude of l has the constant value l =
rp, since r is perpendicular to p and both r and p have constant magnitudes.
But the direction of l, being at all times perpendicular to the direction of p,
is always changing. Thus the vector l rotates around the axis, in step with
the rotation of the particle about the axis. In contrast, the direction of the
angular velocity ω is constant since it always lies along the rotation axis.
Therefore the angular momentum l is not parallel to the angular velocity ω
if we use this choice of the origin O.
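The behavior of l for an origin off the rotation plane can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the rotation axis is the z axis, places the plane of rotation at z = h (so h is a hypothetical name for the distance from O to that plane), and uses arbitrary illustrative values for m, R, h, and ω.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def particle_angular_momentum(m, R, h, omega, t):
    """Return l = r x p about an origin O on the rotation (z) axis.

    The particle moves on a circle of radius R in the plane z = h, so h is
    the assumed distance from O to the plane of rotation.
    """
    angle = omega * t
    r = (R * math.cos(angle), R * math.sin(angle), h)
    p = (-m * R * omega * math.sin(angle), m * R * omega * math.cos(angle), 0.0)
    return cross(r, p)

def norm(v):
    """Magnitude of a 3-vector."""
    return math.sqrt(sum(c * c for c in v))

if __name__ == "__main__":
    m, R, h, omega = 2.0, 0.5, 0.3, 4.0   # illustrative values, SI units
    for t in (0.0, 0.2, 0.4, 0.6):
        l = particle_angular_momentum(m, R, h, omega, t)
        # |l| and the z component stay fixed; the x and y components rotate.
        print(t, l, norm(l))
```

With h different from zero the x and y components of l change with time while its magnitude and z component do not; with h = 0 the x and y components vanish and l lies along the axis.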

Since l is not parallel to ω for a single rotating particle when O is not in
the rotation plane, typically it is true that the total angular momentum L of
all the particles that form the rotating plate of Fig. 10-1b will not be parallel
to its angular velocity ω if the origin O is not taken to be at the intersection
of the rotation axis and the rotation plane. Of course, for a flat plate
rotating in its own plane, we can always locate O in the plane and thereby
obtain the very considerable simplification of having L parallel to ω. But the
situation is more complicated for a rotating body whose extension in space
is three-dimensional, instead of two-dimensional like a flat plate. If a body
extends along the direction parallel to the rotation axis, as well as along the
directions perpendicular to that axis, then no matter where on the axis O is
located, most of the particles in the body do not rotate in a plane passing
through O. This point is made in Fig. 10-3.

Fig. 10-3 The motion of three of the

Fig. 10-4 A rotating body composed of two equal-mass particles. The body
has enough symmetry to make its total angular momentum L parallel to its
angular velocity ω, even though the origin O does not lie in the plane of
rotation of the two particles.

However, if an extended body has a sufficient degree of symmetry
relative to the axis about which it is rotating, then its total angular momentum L
will be parallel to its angular velocity ω, no matter where on the axis O is
located. In addition, the magnitudes of these vectors will be proportional.
Thus we will be able to write L = Iω. To understand the nature of the
required symmetry, we consider first the very simple example shown in Fig.
10-4. The figure shows a system consisting of two equal-mass particles
rotating at opposite ends of a diameter of their common circle of rotation.
Although neither of the individual angular momenta l_1 or l_2 is parallel to
the angular velocity ω (because the origin O does not lie in the plane of the
circle), their components perpendicular to ω cancel so that the total
angular momentum L = l_1 + l_2 is parallel to ω.

The same kind of construction shows that the same result will be
obtained for three equal-mass particles distributed uniformly around a
common circle of rotation, as in Fig. 10-5a. And Fig. 10-5b shows four
equal-mass particles uniformly distributed around the circle that will
produce a total L parallel to ω. Furthermore, for four particles distributed as
shown in Fig. 10-5c, it will also be true that L is parallel to ω. As the number
of equal-mass particles rotating around the common circle increases, there
is an increasing number of ways that they can be distributed so that their
total angular momentum will be parallel to their angular velocity. But in all
cases of two or more particles, a simple way that this can be done is to
distribute them uniformly around the circle. It is also the most commonly
encountered distribution. In particular, a uniform ring of particles forming a
rigid body (a hoop) has a total angular momentum L that is parallel to its
angular velocity ω if the body rotates about its symmetry axis, no matter
where O is on that axis.
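The cancellation described above can be tested for any arrangement of particles on a common circle. The sketch below rests on a step not spelled out in the text: working out l = r × p for a particle at angle a_j on the circle, with the origin a distance h along the axis from the plane of the circle, gives a perpendicular part of the total angular momentum with magnitude hRω times the magnitude of the sum of m_j (cos a_j, sin a_j). The function names and the sample angles are hypothetical.

```python
import math

def perpendicular_imbalance(masses, angles):
    """Magnitude of sum_j m_j (cos a_j, sin a_j) for particles on one circle.

    Assumption (derived from l = r x p, not quoted from the text): the part
    of L perpendicular to the axis is h * R * omega times this quantity, so
    L is parallel to omega exactly when the imbalance vanishes.
    """
    sx = sum(m * math.cos(a) for m, a in zip(masses, angles))
    sy = sum(m * math.sin(a) for m, a in zip(masses, angles))
    return math.hypot(sx, sy)

def uniform_angles(n, offset=0.0):
    """Angles of n particles spaced uniformly around the circle."""
    return [offset + 2.0 * math.pi * k / n for k in range(n)]

if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, perpendicular_imbalance([1.0] * n, uniform_angles(n)))
    # Two diametrically opposite pairs that are not uniformly spaced.
    pairs = [0.0, math.pi, 1.0, 1.0 + math.pi]
    print("pairs", perpendicular_imbalance([1.0] * 4, pairs))
```

A single particle gives a nonzero imbalance, while two or more uniformly spaced equal masses, or any set of diametrically opposite equal-mass pairs, give zero to within rounding error.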



Fig. 10-5 Other simple examples of a body composed of a few equal-mass
particles with symmetry such that the total angular momentum L is parallel
to the angular velocity ω. This is true no matter where along the rotation
axis the origin O is located.


Fig. 10-6 A typical particle in a rotating body consisting of a set of n
particles of equal mass distributed in some manner around a circle about
the rotation axis.


Consider a set of particles rotating in a common circle about an axis,
with angular velocity ω, and distributed in an arbitrary manner around the
circle. Let us evaluate the component along the axis of rotation of their
total angular momentum about an origin somewhere on the axis. In Fig.
10-6 the magnitude of the angular momentum l_j of the particle having
position r_j and momentum p_j is

\[ l_j = |\mathbf{r}_j \times \mathbf{p}_j| = r_j p_j \]

since r_j is perpendicular to p_j. Now p_j = m_j v_j, where m_j and v_j are the
particle’s mass and speed. So we have

\[ l_j = m_j r_j v_j \]

The speed is

\[ v_j = |\boldsymbol{\omega} \times \mathbf{r}_j| = \omega r_j \sin\theta \]

Here θ is the angle between ω and r_j, and the sin θ factor has been placed at
the end of the expression for the sake of convenience. We will write this in
terms of the angle γ = π − θ. As the figure shows, γ is the angle between the
rotation axis and r_j, as well as the angle between the perpendicular to the axis
and l_j. Since sin (π − θ) = sin θ, we have sin θ = sin γ and so v_j = ωr_j sin γ.
Using this in the equation for l_j, we obtain

\[ l_j = m_j r_j^2 \omega \sin\gamma \]

The contribution of this particle to the total component of angular
momentum along the rotation axis is the component of l_j that is parallel to that
axis. Since π/2 − γ is the angle between the rotation axis and l_j, this is

\[ l_{j\parallel} = l_j \cos(\pi/2 - \gamma) = l_j \sin\gamma \]

Summing the contributions of all n particles rotating in the circle, we have

\[ L_\parallel = \sum_{j=1}^{n} l_{j\parallel} = \sum_{j=1}^{n} l_j \sin\gamma = \sum_{j=1}^{n} m_j r_j^2 \omega \sin^2\gamma \]

The figure shows that r_j sin γ = R, where R is the distance from the jth
particle to the axis of rotation; in other words, R is the radius of the
common circle of rotation. In terms of R, the expression for L_∥ can be
written

\[ L_\parallel = \sum_{j=1}^{n} m_j R^2 \omega \]

Since R and ω are the same for all terms in the sum, we have

\[ L_\parallel = R^2 \omega \sum_{j=1}^{n} m_j \]

This is

\[ L_\parallel = M R^2 \omega \tag{10-7} \]

with M being the total mass of the rotating particles.
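Equation (10-7) can be checked by summing r_j × p_j directly for particles placed arbitrarily around a circle. This is a sketch under stated assumptions rather than anything taken from the text: the rotation axis is taken as the z axis, h is a hypothetical name for the position of the plane of rotation relative to the origin, and the masses and angles are arbitrary illustrative values.

```python
import math

def total_angular_momentum(masses, angles, R, h, omega):
    """Sum of l_j = r_j x p_j for particles on a circle of radius R in z = h.

    The origin is on the rotation (z) axis; h is the assumed offset of the
    plane of rotation from the origin.
    """
    Lx = Ly = Lz = 0.0
    for m, a in zip(masses, angles):
        x, y, z = R * math.cos(a), R * math.sin(a), h
        px, py, pz = -m * R * omega * math.sin(a), m * R * omega * math.cos(a), 0.0
        Lx += y * pz - z * py
        Ly += z * px - x * pz
        Lz += x * py - y * px
    return (Lx, Ly, Lz)

if __name__ == "__main__":
    masses = [1.0, 1.0, 1.0]          # equal masses, as in the text
    angles = [0.1, 0.9, 2.5]          # deliberately not symmetric
    R, omega = 0.4, 3.0
    for h in (-0.5, 0.0, 0.2, 1.0):
        L = total_angular_momentum(masses, angles, R, h, omega)
        # The z component should equal M R^2 omega for every h.
        print(h, L, sum(masses) * R ** 2 * omega)
```

The component along the axis is the same for every choice of h, while the components perpendicular to the axis depend on h and vanish only when the origin lies in the plane of rotation or the distribution is suitably symmetric.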

Now let the rotating particles be distributed around the circle of
rotation with any symmetry that leads to a cancellation of the components of
angular momenta perpendicular to the rotation axis. (See again the
examples illustrated in Fig. 10-5.) Then the L_∥ just evaluated is the magnitude
of the total angular momentum of the system. That is, we will have

\[ L = L_\parallel = M R^2 \omega \]

and we can write our results in the vector form

\[ \mathbf{L} = I \boldsymbol{\omega} \]

In this expression the moment of inertia I for rotation about the axis has
the value

\[ I = M R^2 \tag{10-8} \]

These  results  are  valid  for  any  location  of  the  origin  O on  the  rotation  axis. 
The  moment  of  inertia  of  the  rotating  mass  distribution  depends  only  on 
its  total  mass  M and  the  square  of  the  common  distance  R from  all  its  parts 
to  the  axis. 

If the particles are distributed on their common circle of rotation in such
a manner that L is not parallel to ω for an origin O that does not lie in the
plane of rotation, the basic results expressed in Eq. (10-7) are still valid.
They can be used to calculate L_∥, the component of L along the direction of
ω. This may be all that you are interested in calculating. For instance, say
you have an old and rather asymmetrical grinding wheel that is manually
operated. You want to operate it electrically, and you need to calculate how
much torque an electric motor must supply to bring the wheel up to a
certain angular speed in a certain amount of time. Then you do not care about
the torque applied by the bearings supporting the axle of the wheel. This
torque, which fixes the rotation axis, is always directed perpendicular to
that axis. The torque you care about is the one applied by the motor in a
direction parallel to the axis to change the angular speed of the wheel. If
this is the case, you need only L_∥, the component of L along a direction
parallel to ω, and you can use Eq. (10-7) to write

\[ L_\parallel = I \omega \tag{10-9} \]

where

\[ I = M R^2 \]

just as in Eq. (10-8).

The basic point to understand is that Eq. (10-8) can be used in all
circumstances to evaluate the proportionality constant I between the
component L_∥ of the rotating body’s total angular momentum parallel to the
rotation axis and its angular speed ω. The only restriction is that all the mass
elements be at the same distance R from the rotation axis and that M be the
sum of their masses. If the mass distribution has sufficient symmetry, there
will be no perpendicular components of the total angular momentum.
Otherwise there will be. But whether there is or is not symmetry has no
effect on the parallel component.

Another point to be made is that Eq. (10-8) can be applied immediately
to particles which are distributed along a line parallel to the rotation axis at a
constant distance R from it. The reason is that Eq. (10-7) is valid no matter
where the origin O is located with respect to the plane of rotation of a
constituent particle, that is, no matter where that plane is located with respect to
the origin.




For a distribution of particles extending in the directions that are not
along the rotation axis, the body’s moment of inertia I about the axis can be
obtained by using Eq. (10-8) to evaluate the contribution of each element of
mass M_j, consisting of all the particles having essentially the same distance
R_j from the axis, and then summing over all of its n mass elements. That is,
the total moment of inertia of the body for rotation about the given axis is

\[ I = \sum_{j=1}^{n} M_j R_j^2 \tag{10-10} \]


In actually evaluating I for an extended body, it is usually convenient to
replace the sum over its elements of finite mass M_j by an integral over
elements of infinitesimal mass dM. Each mass element has a moment of inertia
dI = R² dM, and the total moment of inertia of the body is, consequently,

\[ I = \int_{\text{body}} R^2 \, dM \tag{10-11a} \]

The integrals are taken over the entire body, although the notation is
abbreviated by not showing explicit limits on the integral signs. Equation
(10-11a) can also be expressed as

\[ I = \int_{\text{body}} R^2 \rho \, dV \tag{10-11b} \]

where ρ = dM/dV is the density of the body, with dV an element of its
volume.
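When the integral in Eq. (10-11a) is awkward to do by hand, it can be approximated by going back to the sum of Eq. (10-10) over many thin mass elements. The sketch below is one way to do this, assuming the mass per unit distance from the axis, dM/dR, is known as a function of R. The function name, the midpoint rule, and the sample values of M and A are choices made for this illustration; the test case is the thin bar treated in Example 10-1, for which dM/dR = 2M/A.

```python
def moment_of_inertia(dM_dR, R_min, R_max, n=1000):
    """Midpoint-rule estimate of I = integral of R^2 dM, Eq. (10-11a).

    dM_dR(R) is the mass per unit distance from the axis (an assumed input);
    the range R_min..R_max is cut into n elements, each treated as in
    Eq. (10-10) with all of its mass at the midpoint distance.
    """
    dR = (R_max - R_min) / n
    total = 0.0
    for k in range(n):
        R = R_min + (k + 0.5) * dR
        total += R * R * dM_dR(R) * dR
    return total

M, A = 3.0, 2.0                                   # illustrative mass (kg) and length (m)
I_bar = moment_of_inertia(lambda R: 2.0 * M / A, 0.0, A / 2.0)
print(I_bar, M * A ** 2 / 12.0)                   # numerical estimate and M A^2 / 12
```

The estimate approaches M A²/12 as n grows, and the same function can be reused for other bodies by supplying a different dM/dR.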


Before we work out some examples, it is worthwhile making a general
comment about the presence in these equations of R², the square of the
distance from a rotating mass element to the axis about which it rotates.
Since I depends quadratically on the distance from the mass element to the
axis, and only linearly on the mass of the element, the dominant factor in
determining the moment of inertia of a rotating body tends to be not how
much mass it has but how far the mass is from the axis. To put the matter
another way, the part of the mass of the rotating body that is farthest from
the axis of rotation makes the dominant contribution to its moment of
inertia.


EXAMPLE  10-1 

A uniform bar of length A and mass M is rotating about an axis passing through its
center and oriented as shown in Fig. 10-7. The thickness t of the bar in the direction
perpendicular to the axis is very small, but its width w along the axis is not.
Determine whether the total angular momentum of the bar about the origin O indicated
in the figure is parallel to its angular velocity. Then evaluate the bar’s moment of
inertia for rotation about the axis.

■ You can think of the bar as consisting of a sum of pairs of elements of
infinitesimal length dR, shown in the figure. Since the mass in each pair of elements has
the same symmetry as that in Fig. 10-5c, their net contribution to the angular
momentum will be parallel to the angular velocity. Thus the total angular momentum
L will be parallel to the angular velocity ω, and you will have L = Iω.

To evaluate I, you make use of the fact that the thickness of the bar is
negligible, and thus all the mass in any pair of elements of length dR is at the same
distance R from the rotation axis. The ratio of the mass dM of this pair of elements to
the total mass M of the bar is the same as the ratio of their combined length 2 dR to
the total length A of the bar. That is,

\[ \frac{dM}{M} = \frac{2\,dR}{A} \]

or

\[ dM = \frac{2M\,dR}{A} \]

So Eq. (10-11a) becomes

\[ I = \int_{\text{body}} R^2 \, \frac{2M\,dR}{A} = \frac{2M}{A} \int_{\text{body}} R^2 \, dR \]

To evaluate the integral, its limits of integration must be made explicit. Since the
smallest value of the variable R is 0 and the largest is A/2, you have

\[ I = \frac{2M}{A} \int_{0}^{A/2} R^2 \, dR \]

Using Eq. (7-20), you find

\[ I = \frac{2M}{A} \left[ \frac{(A/2)^3}{3} - \frac{(0)^3}{3} \right] = \frac{2M}{3A} \left( \frac{A^3}{8} \right) \]

or

\[ I = \frac{1}{12} M A^2 \]

The factor 1/12 indicates how ineffectively the mass located near the rotation axis
contributes to the moment of inertia.

Fig. 10-7 A bar of uniform density rotating about an axis passing through
its center.


EXAMPLE 10-2

A hollow cylinder of uniform density, shown in Fig. 10-8, is rotating about an axis
along the center of the cylinder. The inner radius is R_1, the outer radius is R_2, and
the length along the axis is A. Is L parallel to ω for the origin O shown in the figure?
What is the value of I for rotation about the axis?

■ The answer to the first question is yes. To see this, decompose the cylinder
mentally into a set of thin, concentric tubes, as indicated. The mass in each is
uniformly distributed around common circles of rotation, with the symmetry of Fig.
10-5b. Each tube contributes a net angular momentum parallel to the angular
velocity, and the total angular momentum of the hollow cylinder is therefore also parallel
to the angular velocity.

To evaluate I, you can first find the volume of an elemental tube of radius R
and infinitesimal wall thickness dR. Since its periphery is 2πR, its volume is

\[ dV = A \, 2\pi R \, dR \]


Using this expression in the integral of Eq. (10-11b) and writing explicit limits, you
have

\[ I = \int_{R_1}^{R_2} R^2 \rho A \, 2\pi R \, dR \]

or

\[ I = 2\pi\rho A \int_{R_1}^{R_2} R^3 \, dR \]

since both ρ and A are constants. The limits of integration on the definite integral
show that it is taken from the inner radius of the cylindrical body to its outer radius.
Using Eq. (7-20) to evaluate the integral, you obtain

\[ I = 2\pi\rho A \left[ \frac{R_2^4}{4} - \frac{R_1^4}{4} \right] \]

It is convenient to factor this result into the expression

\[ I = \frac{\pi\rho A}{2} \, (R_2^2 - R_1^2)(R_2^2 + R_1^2) \]

The reason is that the volume of the hollow cylinder is π(R_2² − R_1²)A, and so its total
mass M is

\[ M = \rho\pi (R_2^2 - R_1^2) A \]

Therefore the moment of inertia of the hollow cylinder rotating about its axis of
symmetry simplifies to

\[ I = M \, \frac{R_1^2 + R_2^2}{2} \tag{10-12} \]


The moment of inertia I of the hollow cylinder evaluated in Eq.
(10-12) is its mass times a quantity that can be interpreted as the average of
the squares of its inner and outer radii. This result seems reasonable in
consideration of the remarks made immediately before the examples about
the significance of R². For a solid cylinder of radius R rotating about its
axis, Eq. (10-12) can be used by setting R_1 = 0 and R_2 = R, producing

\[ I = \tfrac{1}{2} M R^2 \tag{10-13} \]

Equation (10-12) contains also the case of a thin-walled tube. Just set
R_1 = R_2 = R, and it yields

\[ I = M \, \frac{R^2 + R^2}{2} = M R^2 \]

in agreement with Eq. (10-8).
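The hollow-cylinder result and its two limiting cases can be confirmed by building the cylinder out of thin tubes, each contributing M_j R_j² as in Eqs. (10-8) and (10-10). The following is an illustrative sketch only. The function names, the number of tubes, and the density and dimensions are assumptions chosen for the example, not values from the text.

```python
import math

def hollow_cylinder_inertia(M, R1, R2):
    """Closed form for a hollow cylinder about its axis: I = M (R1^2 + R2^2) / 2."""
    return 0.5 * M * (R1 ** 2 + R2 ** 2)

def stacked_tubes_inertia(rho, A, R1, R2, n=2000):
    """Sum M_j R_j^2 over n thin tubes of mass rho * A * 2 pi R dR each."""
    dR = (R2 - R1) / n
    total = 0.0
    for k in range(n):
        R = R1 + (k + 0.5) * dR                   # midpoint radius of tube k
        total += (rho * A * 2.0 * math.pi * R * dR) * R ** 2
    return total

rho, A, R1, R2 = 7800.0, 0.20, 0.03, 0.05         # illustrative SI values
M = rho * math.pi * (R2 ** 2 - R1 ** 2) * A       # total mass of the hollow cylinder

print(stacked_tubes_inertia(rho, A, R1, R2), hollow_cylinder_inertia(M, R1, R2))
print(hollow_cylinder_inertia(M, 0.0, R2), 0.5 * M * R2 ** 2)   # solid cylinder limit
print(hollow_cylinder_inertia(M, R2, R2), M * R2 ** 2)          # thin-walled tube limit
```

For equal mass and outer radius, the thin-walled tube has twice the moment of inertia of the solid cylinder, which illustrates the earlier remark that mass far from the axis dominates.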

The relation L = Iω is not restricted to the rotation of bodies that have some
sort of symmetry in their mass distribution. In fact, it is possible to prove that
through every point in any body, no matter how asymmetric, there pass three
perpendicular axes, each having the property that L is proportional to ω if the body
rotates about it. The proportionality constant usually is not the same for rotation
about each of these axes. In other words, it is possible to write L = I_x ω for rotation
about the first so-called principal axis, L = I_y ω for rotation about the second
principal axis, L = I_z ω for rotation about the third, and generally I_x ≠ I_y ≠ I_z. If the
body rotates about an axis other than one of its principal axes, then L is not in the
same direction as ω. A principal axis can be located experimentally by finding a
rotation axis for which a constant ω leads to a constant L, so that dL/dt = τ = 0,
and no torque τ need be applied to maintain the constant ω. One way to do this is
to throw the body in the air in such a way that it spins. If the rotation axis
maintains a fixed alignment, you have found a principal axis. If it “wobbles,” the body
is not rotating about a principal axis.

Fig. 10-9 (a) The parallel-axis theorem, Eq. (10-14). The dashed line is a
rotation axis passing through the body’s center of mass. The solid line is a
parallel rotation axis at distance D. (b) A view of the body looking along the
parallel axes.


When the moment of inertia of a body rotating about an axis passing
through its center of mass is known to have the value I, it is easy to evaluate
the moment of inertia I' it will have for rotation about a parallel axis passing
through some other point, either inside or outside the body. The situation
under consideration is shown in Fig. 10-9a. According to the parallel-axis
theorem,

\[ I' = I + M D^2 \tag{10-14} \]

where M is the mass of the body and D is the distance from the new axis to
the parallel axis passing through its center of mass. We will prove this and
then apply it in Example 10-3.

Figure 10-9b shows a view of Fig. 10-9a that would be seen by looking
along the parallel axes; that is, it is Fig. 10-9a projected onto a plane
perpendicular to the axes. The distance D is the magnitude of a vector D
extending from the new axis to the original axis passing through the center of
mass of the body. The distance from an element of the body with mass M_j
to the original axis is the magnitude of a vector R_j. The distance from the
same element to the new axis is the magnitude of a vector R'_j. The relation
between these quantities is

\[ \mathbf{R}'_j = \mathbf{R}_j + \mathbf{D} \tag{10-15} \]

The new moment of inertia is

\[ I' = \sum_{j=1}^{n} M_j R_j'^2 \]




To evaluate R'_j², we take the dot product of Eq. (10-15) into itself,
producing

\[ \mathbf{R}'_j \cdot \mathbf{R}'_j = (\mathbf{R}_j + \mathbf{D}) \cdot (\mathbf{R}_j + \mathbf{D}) = \mathbf{R}_j \cdot \mathbf{R}_j + \mathbf{D} \cdot \mathbf{D} + 2\mathbf{R}_j \cdot \mathbf{D} \]

or

\[ R_j'^2 = R_j^2 + D^2 + 2\mathbf{R}_j \cdot \mathbf{D} \]

So we have

\[ I' = \sum_{j=1}^{n} M_j (R_j^2 + D^2 + 2\mathbf{R}_j \cdot \mathbf{D}) \]

\[ I' = \sum_{j=1}^{n} M_j R_j^2 + \sum_{j=1}^{n} M_j D^2 + 2 \sum_{j=1}^{n} M_j \mathbf{R}_j \cdot \mathbf{D} \]

The first term on the right side of this equality is the moment of inertia I of
the body for rotation about the original axis. The second term is a
summation, giving the body’s total mass M times the constant D². Thus we have

\[ I' = I + M D^2 + 2 \left( \sum_{j=1}^{n} M_j \mathbf{R}_j \right) \cdot \mathbf{D} \]

According to the definition of center of mass given by Eq. (9-48a), the
summation in the third term (which has been put in parentheses for emphasis)
has a value equal to M times a vector extending from the original axis to the
center of mass of the body. But this vector has zero length since the center
of mass is located on the original axis. Therefore the third term is zero, and
the parallel-axis theorem, Eq. (10-14), is proved.
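The proof can be checked on a small set of point masses viewed along the axes, as in Fig. 10-9b. The sketch below is illustrative only: the helper names, masses, and coordinates are made up for the example. It also shows that the shortcut fails when the reference axis does not pass through the center of mass, because the third term then does not vanish.

```python
import math

def center_of_mass(masses, points):
    """Mass-weighted mean position of point masses in the plane of the view."""
    M = sum(masses)
    x = sum(m * p[0] for m, p in zip(masses, points)) / M
    y = sum(m * p[1] for m, p in zip(masses, points)) / M
    return (x, y)

def inertia_about(masses, points, axis):
    """Sum of M_j R_j^2 for an axis piercing the plane at the point `axis`."""
    return sum(m * ((p[0] - axis[0]) ** 2 + (p[1] - axis[1]) ** 2)
               for m, p in zip(masses, points))

masses = [1.0, 2.0, 3.5]                            # kg, illustrative
points = [(0.0, 0.0), (1.0, 2.0), (-3.0, 0.5)]      # m, illustrative
new_axis = (4.0, -1.0)                              # where the new axis pierces the plane

cm = center_of_mass(masses, points)
I_cm = inertia_about(masses, points, cm)
D = math.hypot(new_axis[0] - cm[0], new_axis[1] - cm[1])
I_new = inertia_about(masses, points, new_axis)
print(I_new, I_cm + sum(masses) * D ** 2)           # the two values should agree
```

Replacing the center of mass by any other reference point in this calculation leaves a nonzero cross term, so the simple form I' = I + MD² holds only when the original axis passes through the center of mass.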


EXAMPLE 10-3

Determine I for a uniform, solid cylinder of mass M = 10.0 kg and radius R = 5.00
cm, about an axis parallel to its symmetry axis and tangent to its surface. See Fig.
10-10.

■ Using Eq. (10-13), obtained from Example 10-2, you know that the moment of
inertia for rotation about the axis of symmetry of the cylinder is

\[ I = \tfrac{1}{2} M R^2 \]

The axis of symmetry passes through the center of mass. So you can apply Eq.
(10-14), by setting the distance D between the parallel axes equal to the radius R.
Then you have

\[ I' = I + M D^2 = \tfrac{1}{2} M R^2 + M R^2 \]

or

\[ I' = \tfrac{3}{2} M R^2 \]

It is very much easier to evaluate I' this way than to evaluate it by a direct
application of Eq. (10-11).

The numerical value of I' is

\[ I' = \frac{3 \times 10.0\ \text{kg} \times (0.0500\ \text{m})^2}{2} = 3.75 \times 10^{-2}\ \text{kg}\cdot\text{m}^2 \]

Fig. 10-10 A uniform, solid cylinder, to be rotated about the axis shown as
a solid line.




Table 10-1

Square of Gyration Radius for Uniform-Density Bodies

Body and rotation axis                                   Square of gyration radius

Thin-walled, cylindrical shell about axis of             G² = R²
symmetry

Solid cylinder about axis of symmetry                    G² = (1/2)R²

Hollow cylinder about axis of symmetry                   G² = (1/2)(R_1² + R_2²)

Thin-walled, cylindrical shell about axis through        G² = (1/2)R² + (1/12)A²
center of mass, perpendicular to axis of symmetry

Solid cylinder about axis through center of mass,        G² = (1/4)R² + (1/12)A²
perpendicular to axis of symmetry (for thin rod,
set R = 0)

Thin-walled, spherical shell about diameter              G² = (2/3)R²

Solid sphere about diameter                              G² = (2/5)R²

Rectangular block about axis through center of           G² = (1/12)(A² + B²)
mass perpendicular to face




Table 10-1 shows a number of bodies of uniform density and
commonly encountered shapes, rotating about the axes indicated. It lists for
each body the quantity G², the square of the gyration radius. This quantity
provides a convenient way of specifying the moment of inertia of a body
for rotation about a certain axis, being defined so that

\[ I = M G^2 \tag{10-16} \]

where M is the mass of the body. The gyration radius is just the radius of a
hoop, having the same mass as the body, whose moment of inertia for
rotation about its axis of symmetry is equal to the moment of inertia of the
body. In terms of gyration radii, the parallel-axis theorem reads

\[ G'^2 = G^2 + D^2 \tag{10-17} \]

Example 10-4 makes use of Eq. (10-2), τ = Iα, and a gyration radius
obtained from Table 10-1.
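The way a tabulated gyration radius feeds into Eqs. (10-16), (10-17), and τ = Iα can be laid out as a few small functions. This is a sketch with hypothetical function names. It uses only the three cylinder entries that the text itself derives (Eqs. 10-8, 10-12, and 10-13), and it assumes a constant torque acting on a body that starts from rest, which is the situation of Example 10-4.

```python
import math

def g2_thin_cylindrical_shell(R):
    """Square of gyration radius, thin-walled shell about its symmetry axis."""
    return R ** 2

def g2_solid_cylinder(R):
    """Square of gyration radius, solid cylinder about its symmetry axis."""
    return 0.5 * R ** 2

def g2_hollow_cylinder(R1, R2):
    """Square of gyration radius, hollow cylinder about its symmetry axis."""
    return 0.5 * (R1 ** 2 + R2 ** 2)

def moment_of_inertia(M, G2):
    """Eq. (10-16): I = M G^2."""
    return M * G2

def shift_axis(G2, D):
    """Eq. (10-17): gyration radius squared about a parallel axis a distance D away."""
    return G2 + D ** 2

def spin_up_torque(M, G2, omega_final, t):
    """Constant torque that takes the body from rest to omega_final in time t."""
    alpha = omega_final / t                       # constant angular acceleration
    return moment_of_inertia(M, G2) * alpha

# Wheel of Example 10-4: 100 kg, radius 0.32 m, 200 rotations per minute in 3.0 s.
omega = 200.0 * 2.0 * math.pi / 60.0              # rad/s
print(spin_up_torque(100.0, g2_solid_cylinder(0.32), omega, 3.0))

# Cylinder of Example 10-3: axis tangent to the surface, so D = R.
print(moment_of_inertia(10.0, shift_axis(g2_solid_cylinder(0.0500), 0.0500)))
```

Keeping the gyration radius separate from the mass means that the same geometric entry can be reused for bodies of any mass and for any parallel axis.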


EXAMPLE 10-4

Fig. 10-11 A grinding wheel mounted on the shaft of an electric motor.

A grinding wheel of uniform density, radius 32 cm, and mass 100 kg is depicted in
Fig. 10-11. When it is switched on, the electric motor, whose shaft forms the axle of
the wheel, applies a torque which can be assumed to be of constant magnitude.
(Certain types of electric motors do produce approximately constant torques.) What
must be the magnitude of the torque if the motor is to be able to bring the wheel
from rest to its operating speed of 200 rotations per minute in 3.0 s? Ignore
friction.

■ Because of the symmetry of the wheel about the rotation axis, its angular
momentum about an origin on this axis is parallel to its angular velocity, no matter
where along the axis you choose to locate the origin. Thus L = Iω, and you can
use Newton’s second law for rotational motion in the form given by Eq. (10-2):

\[ \boldsymbol{\tau} = I \boldsymbol{\alpha} \]

The direction of the torque vector τ is the same as that of the angular acceleration
vector α, namely along the shaft. Hence you can write the equation in terms of
signed scalars:

\[ \tau = I \alpha \]

To evaluate the wheel’s moment of inertia I when rotating about the axle, you
consult Table 10-1 and find from the second entry that the square of its gyration
radius is

\[ G^2 = \tfrac{1}{2} R^2 \]

where R is the radius of the wheel. According to Eq. (10-16),

\[ I = M G^2 \]

or

\[ I = \tfrac{1}{2} M R^2 \]

where M is the mass of the wheel. Hence

\[ \tau = \frac{M R^2 \alpha}{2} \]

Since τ is constant, α is also. So you can evaluate α in terms of the given values
of the wheel’s final angular velocity ω and the elapsed time t by using Eq. (9-8),

\[ \omega = \omega_i + \alpha t \]

which pertains to constant angular acceleration. Setting ω_i = 0 and solving for α,
you have

\[ \alpha = \frac{\omega}{t} \]

Thus

\[ \tau = \frac{M R^2 \omega}{2t} \]

To obtain a numerical value, first you must express ω = 200 rotations per
minute in terms of radians per second. This is done by writing

\[ \omega = 200\ \frac{\text{rotations}}{\text{min}} \times \frac{2\pi\ \text{rad}}{1\ \text{rotation}} \times \frac{1\ \text{min}}{60\ \text{s}} = 21\ \text{rad/s} \]

Using this value and the other values quoted, you obtain

\[ \tau = \frac{100\ \text{kg} \times (0.32\ \text{m})^2 \times 21\ \text{s}^{-1}}{2 \times 3.0\ \text{s}} = 36\ \text{m}\cdot\text{kg}\cdot\text{m/s}^2 \]

or

\[ \tau = 36\ \text{m}\cdot\text{N} \]


10-2 THE PHYSICAL PENDULUM AND THE TORSION PENDULUM


A pendulum is a suspended body oscillating under the influence of the
gravitational force acting on it. If the body is treated as a particle and if the
mass of the suspending cord or rod is ignored, the system is called a simple
pendulum. The system is called a physical pendulum if the body is not
treated as a particle and/or if the mass of the suspending member is not
ignored. To make accurate predictions concerning any pendulum, we must
treat it as a physical pendulum. This can be done in an easy way that builds
directly on the results of the detailed analysis of the simple pendulum
carried out in Secs. 6-3 and 6-5. The procedure involves using a form of
Newton’s second law to relate the torque, moment of inertia, and angular
acceleration of the physical pendulum in an equation completely analogous
to the equation relating the force, mass, and acceleration in a simple
pendulum. In fact, we will find the equations governing the oscillations of these
two systems to be of identical mathematical form. We will thus be able to
apply directly to the physical pendulum all the results obtained in Chap. 6
for the simple pendulum.

A system which must certainly be treated as a physical pendulum is
shown in Fig. 10-12. A body of mass M and arbitrary shape is suspended
from a fixed axle about which it can rotate, the axle being perpendicular to
the vertical plane represented by the page. This plane passes through the
body’s center of mass. The origin O is located on the axle at its intersection
with the plane. At the instant illustrated, a vector D extending from O to
the center of mass is inclined at an angle φ. This angle is measured from
the downward direction, and we take it to be positive when D is rotated
counterclockwise from that direction, as in the figure.




Fig. 10-12 A physical pendulum consisting of a body of mass M suspended
from a horizontal axle. The origin O lies on the axle at its intersection with
the vertical plane containing the body’s center of mass. The position of the
center of mass with respect to O is given by the vector D. In the auxiliary
construction, the notation (inward) next to the vector D × Mg means that
the vector points directly into the page, away from the viewer.


The first thing we must do is to obtain a rotational form of Newton’s
second law that is applicable to the case at hand. Because of the arbitrary
shape of the body, its angular momentum L about O may not be parallel to
its angular velocity ω. Thus here there is a question about the validity of
Eq. (10-1), L = Iω. This puts into question the validity of Eq. (10-2), τ =
Iα, because the second equation depends on the first. However, we can
always use Eq. (10-9) to write

\[ L_\parallel = I \omega \]

where L_∥ is the component of L in a direction along the rotation axis. Then
we take components along that direction of both sides of the always-valid
equation τ = dL/dt. We obtain τ_∥ = dL_∥/dt = d(Iω)/dt = I dω/dt = Iα, or

\[ \tau_\parallel = I \alpha \tag{10-18} \]

The quantity τ_∥ in Eq. (10-18) is the component of τ, the net torque
about O acting on the body, in the direction along the rotation axis. We take
that direction to be positive outward. Since the force applied to the body
by the axle exerts no torque about O, τ is the torque about O produced
by the gravitational force Mg applied to the body’s center of mass. This
torque is

\[ \boldsymbol{\tau} = \mathbf{D} \times M\mathbf{g} \]

Its magnitude is

\[ \tau = |\mathbf{D} \times M\mathbf{g}| = D \sin\theta \, Mg \]

The auxiliary construction in the figure shows that the angle θ between D
and Mg equals the magnitude of φ. Also, the construction shows that the
cross-product right-hand rule gives the direction of τ as inward at the
instant illustrated. Thus

\[ \tau_\parallel = -DMg \sin\phi \tag{10-19} \]

The minus sign reflects the fact that we have chosen the outward direction
as positive.

The quantity α in Eq. (10-18) is a signed scalar describing the
magnitude and sense of the body’s angular acceleration about the axis. Its value is

\[ \alpha = \frac{d\omega}{dt} = \frac{d}{dt}\left(\frac{d\phi}{dt}\right) = \frac{d^2\phi}{dt^2} \tag{10-20} \]

The correct sign is given by this expression because we have taken the
positive sense of rotation (counterclockwise) and the positive direction along
the rotation axis (outward) so as to conform with the rotation-vector
right-hand rule.

The quantity I in Eq. (10-18) is the body’s moment of inertia for
rotation about the axis. It can be written as

\[ I = M G^2 \tag{10-21} \]

where G is the body’s gyration radius for that axis.

Substituting Eqs. (10-19) through (10-21) into Eq. (10-18), we obtain

$$-DMg \sin\phi = MG^2 \frac{d^2\phi}{dt^2}$$

After canceling and transposing, we have

$$\frac{d^2\phi}{dt^2} = -\frac{Dg}{G^2} \sin\phi \tag{10-22}$$

The solutions to this differential equation describe the motion of the physical pendulum by specifying how its angular coordinate $\phi$ changes with time $t$.

The differential equation governing the motion of a simple pendulum is Eq. (6-8b):

$$\frac{d^2\phi}{dt^2} = -\frac{g}{l} \sin\phi$$

Compare this with Eq. (10-22). The two seem to be identical, except that the constant factor on the right side of the physical-pendulum equation is $Dg/G^2$, whereas it is $g/l$ in the simple-pendulum equation. This means that we can apply to the physical pendulum everything we learned in Chap. 6 about a simple pendulum from solving the simple-pendulum equation. All we have to do is replace $g/l$ everywhere by $Dg/G^2$. For instance, by making the replacement in Eq. (6-28b), $\nu = (1/2\pi)\sqrt{g/l}$ where $\phi \ll 1$, we can conclude immediately that the frequency $\nu$ for small-amplitude oscillations of a physical pendulum is

$$\nu = \frac{1}{2\pi}\sqrt{\frac{Dg}{G^2}} \qquad \text{where } \phi \ll 1 \tag{10-23}$$

Note that this general result contains also the case of a simple pendulum of length $l$ because in a simple pendulum $G = l$ and $D = l$, so $Dg/G^2 = g/l$.
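The replacement rule can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the sample length of 0.50 m are chosen here only for illustration, and $g = 9.80$ m/s² is assumed.

```python
import math

def physical_pendulum_frequency(D, G, g=9.80):
    """Small-amplitude frequency in Hz from Eq. (10-23): nu = (1/2pi) sqrt(D g / G^2)."""
    return math.sqrt(D * g / G**2) / (2 * math.pi)

def simple_pendulum_frequency(l, g=9.80):
    """Small-amplitude frequency in Hz of a simple pendulum: nu = (1/2pi) sqrt(g / l)."""
    return math.sqrt(g / l) / (2 * math.pi)

# Illustrative length (an assumption, not a value from the text).
l = 0.50  # m

# A simple pendulum is the special case G = l and D = l of the physical pendulum.
nu_simple = simple_pendulum_frequency(l)
nu_physical = physical_pendulum_frequency(D=l, G=l)
print(f"simple: {nu_simple:.4f} Hz, physical with G = D = l: {nu_physical:.4f} Hz")
```

The two printed frequencies agree, as they must, since setting $G = D = l$ turns $Dg/G^2$ into $g/l$.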

Example  10-5  makes  use  of  the  relation  between  Dg/G2  in  a physical 
pendulum  and  g/l  in  a simple  pendulum. 


EXAMPLE  10-5 

A physical  pendulum  is  made  by  hanging  a meter  stick  from  a nail  passing  through 
a hole  very  near  the  end  of  the  stick.  Determine  the  length  of  the  string  of  a 
simple  pendulum  that  has  approximately  the  same  frequency  for  small  oscillations 
as  that  of  the  physical  pendulum. 

■ Considering the meter stick to be a rod of length $A$ and negligible width $B$, you find from the last entry in Table 10-1 that the square of the gyration radius for an axis perpendicular to the rod, and passing through its center of mass, has the value $\tfrac{1}{12}A^2$. Using the parallel-axis theorem in the form of Eq. (10-17), with

$$D = \frac{A}{2}$$

you find that for an axis at the end of the rod the square of the gyration radius is

$$G^2 = \frac{A^2}{12} + D^2 = \frac{A^2}{12} + \left(\frac{A}{2}\right)^2 = \frac{A^2}{3}$$

So

$$\frac{G^2}{Dg} = \frac{A^2}{3}\,\frac{2}{Ag} = \frac{2A}{3g}$$

The simple pendulum will have the same period if its length $l$ is such that

$$\frac{l}{g} = \frac{2A}{3g}$$

so that

$$l = \frac{2A}{3}$$

For the particular case of a meter stick, you have

$$l = \frac{2 \times 1\ \text{m}}{3} = \tfrac{2}{3}\ \text{m}$$

You  should  test  this  result  experimentally,  using  a meter  stick  (or  a rod  of  any 
convenient  length)  and  a string  supporting  a compact  body. 
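The arithmetic of Example 10-5 can be reproduced in a few lines. This is a sketch under the example's own assumptions (a thin uniform rod pivoted at its end); the helper names are hypothetical.

```python
def rod_end_gyration_radius_squared(A):
    """G^2 for a thin rod of length A about an axis through one end.

    Uses G_cm^2 = A^2/12 (thin rod about its center) and the
    parallel-axis theorem G^2 = G_cm^2 + D^2 with D = A/2.
    """
    D = A / 2
    return A**2 / 12 + D**2, D

def equivalent_simple_pendulum_length(G2, D):
    """Length l satisfying g/l = D g / G^2, that is, l = G^2 / D."""
    return G2 / D

A = 1.0  # m, the meter stick of the example
G2, D = rod_end_gyration_radius_squared(A)
l_equivalent = equivalent_simple_pendulum_length(G2, D)
print(f"G^2 = {G2:.4f} m^2, D = {D:.2f} m, l = {l_equivalent:.4f} m")
```

The computed length is $2A/3$, in agreement with the result obtained above.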


Fig. 10-13 A torsion pendulum whose total mass is $M$. It consists of two compact bodies, each of mass $M/2$, connected by a rigid rod of negligible mass having length $2R$. The rod is suspended from its central balance point by a torsion fiber.


For an extended body, the square of the gyration radius about a fixed axis which does not pass through its center of mass can be measured by using the body as a physical pendulum. First a measurement is made to determine the location of the center of mass by suspending the body from several points, preferably on its periphery, as in Fig. 9-31. The value of $D$ can then be measured. Next the body is mounted on the axis and set into oscillation, and the value of the frequency $\nu$ is measured. Since the gravitational acceleration $g$ is known, Eq. (10-23) can be used to evaluate $G^2$. The moment of inertia $I$ can then be found by measuring the mass $M$ of the body and setting $I = MG^2$.

Alternatively, if $G^2$ and $D$ are known for a physical pendulum, a measurement of $\nu$ can be used to determine the value of $g$. This is the technique used in 1817 by Captain Henry Kater for making the first accurate measurements of the variation of the gravitational acceleration from one location on the surface of the earth to another. Such measurements of $g$ are of great importance in the field of geology, where variations of the order of a few parts per million can yield information concerning the shape of the earth, the composition and structure of the subsurface, and the presence of valuable minerals.
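Both measurement procedures amount to solving Eq. (10-23) for a different unknown. The sketch below illustrates this; the function names and the "measured" numbers are hypothetical, chosen only to show the round trip.

```python
import math

def gyration_radius_squared(nu, D, g=9.80):
    """Solve Eq. (10-23) for G^2 from a measured frequency nu (Hz) and distance D (m)."""
    return D * g / (2 * math.pi * nu) ** 2

def moment_of_inertia(M, G2):
    """I = M G^2, Eq. (10-21)."""
    return M * G2

def gravitational_acceleration(nu, D, G2):
    """Solve Eq. (10-23) for g when G^2 and D are already known."""
    return (2 * math.pi * nu) ** 2 * G2 / D

# Hypothetical measurement (illustrative values only).
M, D, nu = 0.250, 0.200, 0.900  # kg, m, Hz

G2 = gyration_radius_squared(nu, D)
I = moment_of_inertia(M, G2)

# Round trip: feeding G^2 back in recovers the assumed value of g.
g_recovered = gravitational_acceleration(nu, D, G2)
print(f"G^2 = {G2:.4f} m^2, I = {I:.5f} kg m^2, g = {g_recovered:.2f} m/s^2")
```

Because $\nu$ enters squared, a fractional error in the measured frequency produces twice that fractional error in $G^2$ or in $g$, which is why precise timing matters in measurements of this kind.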

Now we will apply rotational mechanics to analyze the behavior of an important system called a torsion pendulum. The simplest form of a torsion pendulum is illustrated in Fig. 10-13. It consists of a rod of length $2R$ having a negligible mass, with compact bodies of mass $M/2$ mounted on each end. The rod is suspended horizontally at its center from a wire called the torsion fiber. If the wire is twisted, it obeys a rotational version of Hooke's law quite analogous to the law $F = -kx$ obeyed by a wire when it is stretched by an amount $x$. The rotational Hooke's law is

$$T = -k\phi \tag{10-24}$$

In this expression the signed scalar $\phi$ is the angle through which the torsion fiber is twisted, the signed scalar $T$ is the restoring torque the fiber exerts because it is twisted, and $k$ is a positive constant known as its torsion constant.

If the rod is displaced from its equilibrium position through some angle in the horizontal plane and then released, the rod will start to oscillate about the equilibrium position. We can use the analogy between rotational and translational mechanics to show that the oscillation is harmonic. Consider the system at a moment when the angle of displacement from equilibrium is $\phi$. The restoring torque $T = -k\phi$ produces an angular acceleration $d^2\phi/dt^2$ in accordance with the rotational form of Newton's second law, $T = I\alpha = I\,d^2\phi/dt^2$. Thus

$$\frac{d^2\phi}{dt^2} = \frac{T}{I}$$

or

$$\frac{d^2\phi}{dt^2} = -\frac{k}{I}\,\phi \tag{10-25}$$


Here $I$ is the moment of inertia of the torsion pendulum for rotation about the perpendicular axis through its center of mass. This differential equation is to be compared to Eq. (6-11),

$$\frac{d^2x}{dt^2} = -\frac{k}{m}\,x$$

which pertains to a body of mass $m$ at the free end of a spring of force constant $k$. The differences between these equations are that the dependent variable is a rotational coordinate in the first and a translational coordinate in the second, and that the constant on the right side is a rotational Hooke's-law constant divided by a moment of inertia in the first and a translational Hooke's-law constant divided by a mass in the second. But these differences are trivial from a mathematical point of view. Therefore we know that if the torsion pendulum is displaced from its equilibrium position and then released, the angular coordinate $\phi$ will henceforth be a sinusoidal function of time.

By analogy to Eq. (6-28a), the frequency of the harmonic oscillation is

$$\nu = \frac{1}{2\pi}\sqrt{\frac{k}{I}} \tag{10-26}$$

Note that this result is not limited to oscillations in which $\phi \ll 1$. The only restriction on the oscillation is that it must never be large enough that the restoring torque ceases to be proportional to $\phi$, so that Eq. (10-24) ceases to apply. In the particular case of the ideal torsion pendulum in Fig. 10-13, whose entire mass $M$ is concentrated at the ends of the rod of length $2R$, we have $I = MR^2$. The frequency in this case is

$$\nu = \frac{1}{2\pi}\sqrt{\frac{k}{MR^2}} \tag{10-27}$$
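As a numerical illustration of Eqs. (10-26) and (10-27), the sketch below evaluates the frequency of the ideal torsion pendulum of Fig. 10-13 and then inverts Eq. (10-26) to recover the torsion constant. The function names and the sample values of $M$, $R$, and $k$ are assumptions made for illustration.

```python
import math

def torsion_frequency(k, I):
    """Eq. (10-26): nu = (1/2pi) sqrt(k / I), in Hz."""
    return math.sqrt(k / I) / (2 * math.pi)

def ideal_torsion_pendulum_inertia(M, R):
    """Two compact bodies of mass M/2, each at distance R from the axis: I = M R^2."""
    return 2 * (M / 2) * R**2

def torsion_constant(nu, I):
    """Invert Eq. (10-26) for k given a measured frequency and a known I."""
    return (2 * math.pi * nu) ** 2 * I

# Illustrative values (assumptions, not from the text).
M, R, k = 0.100, 0.150, 2.0e-4  # kg, m, N m/rad

I = ideal_torsion_pendulum_inertia(M, R)
nu = torsion_frequency(k, I)

# Eq. (10-27) evaluated directly, with I = M R^2 already substituted.
nu_direct = math.sqrt(k / (M * R**2)) / (2 * math.pi)

k_recovered = torsion_constant(nu, I)
print(f"I = {I:.5f} kg m^2, nu = {nu:.4f} Hz, k recovered = {k_recovered:.2e} N m/rad")
```

Neither function takes the amplitude as an argument: so long as Eq. (10-24) holds, the frequency depends only on $k$ and $I$.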


The timing of a mechanical watch is regulated by a torsion pendulum in the form of a wheel connected to a fine spiral spring which resists rotation of the wheel from its equilibrium position. The wheel oscillates about this position at a frequency governed by Eq. (10-26). Very good bearings are used to minimize friction. Also, the wheel is made from an alloy such as Invar, which expands and contracts very little as the temperature changes. The use of this metal reduces temperature variations of the wheel's moment of inertia and therefore stabilizes the oscillation frequency. Each time the wheel oscillates, it causes the motion of a ratchet arrangement that allows a gear driven by the mainspring of the watch to rotate by one tooth. When this happens, a small blow is transmitted from the gear through the ratchet to the oscillating wheel. The purpose is to compensate for friction in the wheel bearings, which tends to damp the oscillation. As the mainspring unwinds, the strength of the blows diminishes, and thus the amplitude of the oscillation diminishes also. But since the oscillation is harmonic, the oscillation frequency does not change. The watch therefore continues to run at a constant rate.




The torsion pendulum is often used to determine the torsion constant $k$ of a torsion fiber. If the moment of inertia of the system is known from other measurements or can be calculated, a measurement of the oscillation frequency immediately yields the value of $k$ through use of Eq. (10-26). Once this has been determined, the system can be used to measure very small torques (and thus very small forces as well). As we will see in Chap. 11, a torsion pendulum was used by Cavendish in the late 1790s to determine the so-called universal gravitational constant. This constant is a measure of the strength of the gravitational force exerted between any two bodies in the universe. Cavendish used its value and the value of the gravitational acceleration to obtain the first reliable evaluation of the mass of the earth.


10-3  THE  TOP 


Fig.  10-14  The  uniform  precessional 
motion  of  a child’s  top.  The  symbols 
are  explained  in  the  text. 


In  Chap.  9 the  kinematics  of  rotation  about  a free  axis  was  introduced  by 
calling  your  attention  to  the  gravity-defying  motion  of  a child's  top,  and  an 
explanation  of  the  motion  was  promised.  We  give  the  explanation  in  this 
section.  Then  we  give  examples  of  systems,  much  more  important  than  a 
child’s  top,  whose  motions  have  the  same  explanation. 

A top performing what is called uniform precessional motion is illustrated in Fig. 10-14. The body spins rapidly about its axis of symmetry with a large spin angular velocity $\boldsymbol{\omega}_s$. We assume the sense of spin to be counterclockwise as seen from above, so that $\boldsymbol{\omega}_s$ is directed along the axis away from the pointed end. At the same time the axis of symmetry itself precesses slowly and uniformly about the vertical direction in the same sense as the spin, tracing out a cone whose apex lies where the tip (the pointed end) of the top rests in the small depression that the top's weight produces in the floor. Thus the top's center of mass rotates about a vertical axis with a small precessional angular velocity $\boldsymbol{\omega}_p$, directed vertically upward.

The total angular momentum of the top about an origin $O$ at the fixed location of its tip is due principally to the rapid rotation of the individual particles of the top around the spin axis, that is, the $\boldsymbol{\omega}_s$ axis. But there is also a small contribution coming from the slow rotation of the center of mass of the top around the precession axis, that is, the $\boldsymbol{\omega}_p$ axis. We cannot begin our explanation by neglecting the latter, even though it is small, because the precessional motion is precisely what we want to explain.

There is a relation, that we will prove soon, which makes it easy to take into account both contributions to the top's total angular momentum. The relation shows that the total angular momentum $\mathbf{L}$ of a body about an origin $O$ can be expressed as the sum of two parts. One part is $\mathbf{L}_{\text{about CM}}$, the angular momentum of the body about an origin moving with its center of mass. This angular momentum arises from the rotation of the body about its center of mass. The other part is $\mathbf{L}_{\text{of CM}}$, the angular momentum about $O$ due to the rotation of the center of mass of the body about that origin. Thus the total angular momentum $\mathbf{L}$ has the value

$$\mathbf{L} = \mathbf{L}_{\text{about CM}} + \mathbf{L}_{\text{of CM}} \tag{10-28}$$

A body's total angular momentum about an origin is the angular momentum due to its rotation about its center of mass plus the angular momentum due to the rotation of its center of mass about the origin. This statement sounds so plausible that you might think it to be true if "center of mass" were replaced by "any point in the body." The proof of Eq. (10-28) given below in small print shows, however, that it applies only to the center of mass.
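Eq. (10-28) can also be checked numerically before working through the proof. The sketch below is an illustration under stated assumptions: it uses an arbitrary set of particles with random masses, positions, and velocities (the result does not require the particles to form a rigid body), and all helper names are hypothetical. It compares the direct sum of $\mathbf{r}_j \times \mathbf{p}_j$ with the two-part decomposition, first about the center of mass and then about another point moving with one of the particles.

```python
import random

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(c, a):
    return tuple(c * x for x in a)

def total_angular_momentum(masses, positions, velocities):
    """Direct sum of r_j x (m_j v_j) about the fixed origin O."""
    L = (0.0, 0.0, 0.0)
    for m, r, v in zip(masses, positions, velocities):
        L = add(L, cross(r, scale(m, v)))
    return L

def split_about_point(masses, positions, velocities, r0, v0):
    """Return (L_about, L_of) for a reference point at r0 moving with velocity v0."""
    L_about = (0.0, 0.0, 0.0)
    for m, r, v in zip(masses, positions, velocities):
        L_about = add(L_about, cross(sub(r, r0), scale(m, sub(v, v0))))
    L_of = cross(r0, scale(sum(masses), v0))
    return L_about, L_of

def max_abs_diff(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

random.seed(0)
n = 6
masses = [random.uniform(0.5, 2.0) for _ in range(n)]
positions = [tuple(random.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(n)]
velocities = [tuple(random.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(n)]

M = sum(masses)
r_cm = scale(1.0 / M, tuple(sum(m * r[i] for m, r in zip(masses, positions)) for i in range(3)))
v_cm = scale(1.0 / M, tuple(sum(m * v[i] for m, v in zip(masses, velocities)) for i in range(3)))

L = total_angular_momentum(masses, positions, velocities)

# Decomposition about the center of mass: agrees with L to rounding error.
L_about_cm, L_of_cm = split_about_point(masses, positions, velocities, r_cm, v_cm)
error_cm = max_abs_diff(L, add(L_about_cm, L_of_cm))

# Same decomposition about a point moving with particle 0: does not agree in general.
L_about_p0, L_of_p0 = split_about_point(masses, positions, velocities, positions[0], velocities[0])
error_p0 = max_abs_diff(L, add(L_about_p0, L_of_p0))

print(f"mismatch using center of mass: {error_cm:.2e}")
print(f"mismatch using particle 0:     {error_p0:.2e}")
```

The first mismatch is at the level of floating-point rounding, while the second is not, which illustrates the remark that the decomposition applies only to the center of mass.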




Consider the top in Fig. 10-14 as an example of a body to which Eq. (10-28) applies. A typical particle of the body has mass $m_j$, as indicated in the figure. Its position relative to a fixed origin $O$ is given by the vector $\mathbf{r}_j$. Its position relative to a moving origin $O'$, located at the body's center of mass, is given by the vector $\mathbf{r}'_j$. If we write the position vector of $O'$ relative to $O$ as $\mathbf{r}$, the figure shows that

$$\mathbf{r}_j = \mathbf{r}'_j + \mathbf{r} \tag{10-29}$$

We calculate the velocity $\mathbf{v}_j = d\mathbf{r}_j/dt$ of the particle with respect to $O$ by taking the time derivative of all terms in Eq. (10-29), obtaining

$$\mathbf{v}_j = \mathbf{v}'_j + \mathbf{v}$$

Here $\mathbf{v}'_j = d\mathbf{r}'_j/dt$ is the velocity of the particle with respect to $O'$ and $\mathbf{v} = d\mathbf{r}/dt$ is the velocity of $O'$ with respect to $O$. Next we multiply the velocity equation by $m_j$. This leads to a relation between $\mathbf{p}_j = m_j\mathbf{v}_j$, the momentum of the particle as seen from the origin $O$, and $\mathbf{p}'_j = m_j\mathbf{v}'_j$, its momentum as seen from $O'$. The relation is

$$\mathbf{p}_j = \mathbf{p}'_j + m_j\mathbf{v} \tag{10-30}$$

The angular momentum of the particle about $O$ is

$$\mathbf{l}_j = \mathbf{r}_j \times \mathbf{p}_j$$

By using Eqs. (10-29) and (10-30), $\mathbf{l}_j$ can be expressed in terms of the primed quantities as

$$\mathbf{l}_j = (\mathbf{r}'_j + \mathbf{r}) \times (\mathbf{p}'_j + m_j\mathbf{v})$$

or

$$\mathbf{l}_j = \mathbf{r}'_j \times \mathbf{p}'_j + \mathbf{r}'_j \times m_j\mathbf{v} + \mathbf{r} \times \mathbf{p}'_j + \mathbf{r} \times m_j\mathbf{v}$$


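To evaluate $\mathbf{L}$, the body's total angular momentum about $O$, we sum over the $n$ particles it contains. Thus

$$\mathbf{L} = \sum_{j=1}^{n} \mathbf{l}_j$$

Using the evaluation of $\mathbf{l}_j$ just obtained, we have

$$\mathbf{L} = \sum_{j=1}^{n} \mathbf{r}'_j \times \mathbf{p}'_j + \sum_{j=1}^{n} \mathbf{r}'_j \times m_j\mathbf{v} + \sum_{j=1}^{n} \mathbf{r} \times \mathbf{p}'_j + \sum_{j=1}^{n} \mathbf{r} \times m_j\mathbf{v} \tag{10-31}$$

The first term on the right side of this equation has the value

$$\sum_{j=1}^{n} \mathbf{r}'_j \times \mathbf{p}'_j = \sum_{j=1}^{n} \mathbf{l}'_j = \mathbf{L}'$$

The quantity $\mathbf{L}'$ is the body's angular momentum about $O'$ due to the motion of its particles about that point. Since $O'$ is at the body's center of mass, $\mathbf{L}'$ can also be written as $\mathbf{L}_{\text{about CM}}$. Thus

$$\sum_{j=1}^{n} \mathbf{r}'_j \times \mathbf{p}'_j = \mathbf{L}' = \mathbf{L}_{\text{about CM}} \tag{10-32a}$$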


The second term in Eq. (10-31) is

$$\sum_{j=1}^{n} \mathbf{r}'_j \times m_j\mathbf{v} = \left(\sum_{j=1}^{n} m_j\mathbf{r}'_j\right) \times \mathbf{v}$$

According to Eq. (9-48a), the summation over $j$ of $m_j\mathbf{r}'_j$ gives the total mass $M$ of the body multiplied into a vector extending from $O'$ to the body's center of mass. Since $O'$ is at the center of mass, this vector has zero length. Thus the summation has the value zero, and so the second term in Eq. (10-31) is zero.




To evaluate the third term in Eq. (10-31), we write $\mathbf{p}'_j = m_j\mathbf{v}'_j = m_j\,d\mathbf{r}'_j/dt$ and use these equalities to obtain

$$\sum_{j=1}^{n} \mathbf{r} \times \mathbf{p}'_j = \sum_{j=1}^{n} \mathbf{r} \times m_j\frac{d\mathbf{r}'_j}{dt}$$

The summation on the right side of this equation can be rewritten by considering the following facts: (1) Since $\mathbf{r}$ is the same for all terms in the summation, it can be taken outside the summation. (2) Since all the $m_j$ are constants, $m_j\,d\mathbf{r}'_j/dt = (d/dt)(m_j\mathbf{r}'_j)$. (3) The sum of the derivatives equals the derivative of the sum; see Eq. (2-14). Hence the differentiation can be performed after the summation is carried out, instead of before. Thus the third term of Eq. (10-31) gives us

$$\sum_{j=1}^{n} \mathbf{r} \times \mathbf{p}'_j = \mathbf{r} \times \frac{d}{dt}\sum_{j=1}^{n} m_j\mathbf{r}'_j$$

Since we have just shown that the summation over $j$ of $m_j\mathbf{r}'_j$ is zero, its time derivative is zero as well. Hence the third term in Eq. (10-31) is zero also.

The last term in Eq. (10-31) is

$$\sum_{j=1}^{n} \mathbf{r} \times m_j\mathbf{v} = \mathbf{r} \times \left(\sum_{j=1}^{n} m_j\right)\mathbf{v} = \mathbf{r} \times M\mathbf{v} = \mathbf{r} \times \mathbf{P} = \mathbf{L}_{\text{of CM}} \tag{10-32b}$$

Here $M$ is the total mass of the body and $\mathbf{P}$ is the momentum which a particle of that mass would have as viewed from $O$ if it moved with the velocity $\mathbf{v}$ of the center of mass relative to $O$. Since $\mathbf{r}$ gives the position of the particle relative to $O$, the quantity $\mathbf{r} \times \mathbf{P}$ is $\mathbf{L}_{\text{of CM}}$, the angular momentum about $O$ due to the motion of the body's center of mass about that origin.

Using Eqs. (10-32a) and (10-32b) to evaluate the first and last terms on the right side of Eq. (10-31), together with the fact that the other two terms are zero, we obtain the proof of Eq. (10-28).

Fig. 10-15 Vectors describing the uniform precessional motion of a top. At the instant depicted, the top's center of mass is moving through the plane of the page, and all the vectors in the figure lie in that plane, except for the vector $\mathbf{P}$ which extends into the plane.

Figure 10-15 applies Eq. (10-28) to the uniform precessional motion of a top at an instant when its spin angular velocity vector $\boldsymbol{\omega}_s$ lies in the plane of the page. The angular momentum $\mathbf{L}_{\text{about CM}}$ arises from the top's rotation about its center of mass along the axis of $\boldsymbol{\omega}_s$. Since the top is symmetrical with respect to this axis, the direction of $\mathbf{L}_{\text{about CM}}$ is the same as that of $\boldsymbol{\omega}_s$. The magnitude of $\mathbf{L}_{\text{about CM}}$ is large because the magnitude of $\boldsymbol{\omega}_s$ is large. The vector $\mathbf{r}$ extends from the fixed origin $O$ along the spin axis to the top's center of mass. At the instant depicted in the figure, the center of mass of the top of total mass $M$ is moving into the page with velocity $\mathbf{v}$. So the associated momentum $\mathbf{P} = M\mathbf{v}$ is directed into the page. The angular momentum arising from the rotation about $O$ of the center of mass at the precessional angular velocity $\boldsymbol{\omega}_p$ is $\mathbf{L}_{\text{of CM}} = \mathbf{r} \times \mathbf{P}$. The cross-product right-hand rule shows that $\mathbf{L}_{\text{of CM}}$ is in the plane of the page and inclined to the vertically directed axis of $\boldsymbol{\omega}_p$, as illustrated in the figure. The magnitude of $\mathbf{L}_{\text{of CM}}$ is small because the magnitude of $\boldsymbol{\omega}_p$ is small. The total angular momentum of the top about $O$ is given by the relation $\mathbf{L} = \mathbf{L}_{\text{about CM}} + \mathbf{L}_{\text{of CM}}$. All three of these vectors lie in the plane of the page at the instant depicted. The construction shows that the direction of $\mathbf{L}$ is a little closer to the vertical than is the direction of $\mathbf{L}_{\text{about CM}}$ and that its magnitude is a little larger than is the magnitude of $\mathbf{L}_{\text{about CM}}$. As time passes, the plane containing all three angular momentum vectors rotates slowly about a vertical line with the precessional angular velocity $\boldsymbol{\omega}_p$. But in every other regard these vectors remain as in the construction of Fig. 10-15.

The only angular momentum vector shown in Fig. 10-16 is $\mathbf{L}$, the top's total angular momentum vector about the fixed origin $O$. Also shown is the
position vector $\mathbf{r}$, from $O$ to the center of mass of the top, and the gravitational force vector $\mathbf{F} = M\mathbf{g}$. This downward-directed force applied at the center of mass is completely equivalent in its effects to the combined effects of all the gravitational forces applied to each of the particles of the top. In particular, it can be used to evaluate the net torque $\mathbf{T}$ about $O$, exerted on the top by gravity. This torque is $\mathbf{T} = \mathbf{r} \times \mathbf{F}$, or

$$\mathbf{T} = \mathbf{r} \times M\mathbf{g} \tag{10-33}$$

The cross-product right-hand rule shows that $\mathbf{T}$ is horizontally directed, as illustrated in the figure. (There is also an upward-directed force applied at $O$ to the tip of the top by the table on which it rests. But this force exerts no torque about $O$. Why?)

Fig. 10-16 A top is acted on by a net torque $\mathbf{T}$ about an origin at its pointed end, with $\mathbf{T}$ always in the horizontal direction. The reason is that $\mathbf{T} = \mathbf{r} \times M\mathbf{g}$, so it is the cross product of two vectors which both always lie in a vertical plane. The first of these is the vector $\mathbf{r}$ from the origin to the top's center of mass, and the second is the gravitational force $M\mathbf{g}$ acting on the center of mass. In any small time interval $dt$, the change $d\mathbf{L}$ in the top's total angular momentum $\mathbf{L}$ is in the same horizontal direction as $\mathbf{T}$ because $d\mathbf{L} = \mathbf{T}\,dt$. Since $\mathbf{L}$ itself is always in a vertical plane (the one containing $\mathbf{r}$ and $M\mathbf{g}$), this means that $d\mathbf{L}$ is always perpendicular to $\mathbf{L}$. Consequently, $\mathbf{L}$ cannot change its magnitude; it can change only its direction. At the end of the time interval $dt$, the tip of the vector $\mathbf{L}$ will have moved to the tip of the vector $d\mathbf{L}$ shown in the figure. In the next time interval the process repeats itself. However, the plane containing $\mathbf{r}$, $M\mathbf{g}$, and $\mathbf{L}$ has precessed through a small angle about the vertical line, causing $\mathbf{T}$ and $d\mathbf{L}$ to do the same. As time continues to pass, the tip of the vector $\mathbf{L}$ slowly describes a circle about the vertical line, and the vector itself describes a cone about that line.

The net torque $\mathbf{T}$ applied to the top causes its total angular momentum $\mathbf{L}$ to change in such a way as to satisfy the rotational form of Newton's second law,

$$\mathbf{T} = \frac{d\mathbf{L}}{dt} \tag{10-34}$$

We can apply the law because both $\mathbf{T}$ and $\mathbf{L}$ are specified about an origin $O$ which can be considered fixed in an inertial frame.

The essential point in the explanation of the top's uniform precessional motion is illustrated in Fig. 10-16. Since the torque $\mathbf{T}$ applied to the top always acts in a horizontal direction, while its total angular momentum $\mathbf{L}$ is always in a vertical plane, $\mathbf{T}$ is always perpendicular to $\mathbf{L}$. According to Eq. (10-34), the change in $\mathbf{L}$ in a small time interval $dt$ is $d\mathbf{L} = \mathbf{T}\,dt$. Thus $d\mathbf{L}$ is always in the same direction as $\mathbf{T}$. It follows that $d\mathbf{L}$ is always perpendicular to $\mathbf{L}$ itself. (This is quite analogous to the fact that when a centripetal force $\mathbf{F}$ acts on a body whose momentum is $\mathbf{p}$, the change in momentum $d\mathbf{p}$ is perpendicular to $\mathbf{p}$ itself.) Inspection of the figure will clarify the directions of the vectors and show that their consequence is to make $\mathbf{L}$ change in time in such a way that its magnitude remains constant while its direction precesses in a cone about a vertical line. Thus $\mathbf{L}$ should behave just as it is observed to behave.
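The claim that a torque always perpendicular to $\mathbf{L}$ changes only its direction can be illustrated by stepping $d\mathbf{L} = \mathbf{T}\,dt$ forward in time. The sketch below makes several assumptions that are not in the text: $\mathbf{L}$ is taken to lie exactly along the spin axis (so $\mathbf{r}$ is parallel to $\mathbf{L}$), the mass, tilt angle, and step size are arbitrary illustrative values, and a fourth-order Runge-Kutta step is used simply as a convenient integrator.

```python
import math

G_ACCEL = 9.80  # m/s^2

def torque(L, r_len, M):
    """T = r x Mg, with r of length r_len directed along L (assumed along the spin axis)."""
    L_mag = math.sqrt(L[0]**2 + L[1]**2 + L[2]**2)
    rx, ry, rz = (r_len * c / L_mag for c in L)
    Fx, Fy, Fz = 0.0, 0.0, -M * G_ACCEL
    return (ry*Fz - rz*Fy, rz*Fx - rx*Fz, rx*Fy - ry*Fx)

def rk4_step(L, dt, r_len, M):
    """Advance dL/dt = T(L) by one Runge-Kutta step."""
    def shifted(base, k, h):
        return tuple(base[i] + h * k[i] for i in range(3))
    k1 = torque(L, r_len, M)
    k2 = torque(shifted(L, k1, 0.5 * dt), r_len, M)
    k3 = torque(shifted(L, k2, 0.5 * dt), r_len, M)
    k4 = torque(shifted(L, k3, dt), r_len, M)
    return tuple(L[i] + dt * (k1[i] + 2*k2[i] + 2*k3[i] + k4[i]) / 6 for i in range(3))

# Illustrative top (assumed values): mass, tip-to-CM distance, gyration radius, spin rate.
M, r_len, G, omega_s = 0.100, 0.0400, 0.0300, 200.0
L0 = M * G**2 * omega_s                 # magnitude of the spin angular momentum
tilt = math.radians(30.0)               # angle between L and the vertical
L = (L0 * math.sin(tilt), 0.0, L0 * math.cos(tilt))

dt, steps = 1.0e-4, 10000               # integrate for 1 s in total
for _ in range(steps):
    L = rk4_step(L, dt, r_len, M)

L_mag = math.sqrt(sum(c * c for c in L))
azimuth = math.atan2(L[1], L[0])        # angle swept about the vertical, starting from 0
rate_expected = r_len * M * G_ACCEL / L0

print(f"|L| start = {L0:.6f}, |L| end = {L_mag:.6f}")
print(f"azimuth swept = {azimuth:.4f} rad, expected rate x time = {rate_expected * dt * steps:.4f} rad")
```

The magnitude of $\mathbf{L}$ and its vertical component stay fixed while its tip moves around a horizontal circle, counterclockwise as seen from above, which is the cone described in the text.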

A crucial feature responsible for the behavior of a uniformly precessing top is its large spin angular momentum $\mathbf{L}_{\text{about CM}}$. If a top that is not spinning is placed with its pointed end on the floor, and its symmetry axis is inclined to the vertical, the top will certainly fall over when released. Since the top has no initial angular momentum, the horizontal increment in its angular momentum produced by the horizontally acting torque in the first time interval after release results in the top having a purely horizontal angular momentum at the end of the first time interval. This means that the top's center of mass is starting to rotate about the fixed origin $O$ at its tip in a vertical plane containing the symmetry axis of the top. In other words, the top is beginning to fall, as illustrated in Fig. 10-17. In the next time interval the increment in angular momentum is in the same direction as in the first time interval because the direction of the torque is the same. So the magnitude of the angular momentum $\mathbf{L}_{\text{of CM}}$ increases as the fall accelerates.

But if the top has a large angular momentum along its symmetry axis when released, the situation is completely different. Since the increment in angular momentum is perpendicular to the initial angular momentum, the angular momentum at the end of the first time interval is not changed in magnitude, but is changed in direction. This causes the direction of the applied torque to change. So the angular momentum of the top precesses about the vertical, maintaining a constant magnitude. The top cannot fall over because such a motion corresponds to an ever-increasing magnitude of angular momentum, as we found in the preceding paragraph.

Fig. 10-17 If a top is released without spin, the gravitational torque about the origin $O$ at its pointed end causes it to fall over. The dashed vector labeled $\mathbf{r}$ is the position vector of the top's center of mass relative to $O$ at the instant of release. The dashed vector labeled $M\mathbf{g}$ is the gravitational force exerted at that point. The corresponding torque about $O$ is $\mathbf{T} = \mathbf{r} \times M\mathbf{g}$, and its direction is into the page. The angular momentum change $d\mathbf{L}$ produced by the torque in a small time interval $dt$ is $d\mathbf{L} = \mathbf{T}\,dt$, and so it also is directed into the page. Since the angular momentum itself was zero at the beginning of the time interval, $\mathbf{L}$ is directed into the page at the end of the time interval. For the beginning of the next time interval, $\mathbf{r}$ and $M\mathbf{g}$ are shown by solid vectors. The corresponding torque vector $\mathbf{T}$ is again directed into the page, and hence the vector $d\mathbf{L}$ is also. Thus at the end of that interval $\mathbf{L}$ is still directed into the page but has a larger magnitude. The process continues, with the magnitude of $\mathbf{L}$ continuing to increase, until the top strikes the floor. In this case, where $\mathbf{L}_{\text{about CM}} = 0$, we have $\mathbf{L} = \mathbf{L}_{\text{of CM}}$.

Example  10-6  develops  a relation  between  magnitudes  of  the  spin 
angular  velocity  and  the  precessional  angular  velocity  and  then  applies  it  to 
obtain  numerical  values  in  a particular  case. 


EXAMPLE 10-6

A top is spinning about its axis of symmetry with angular speed $\omega_s = 200$ rad/s ($\simeq 32$ rotations per second). Its gyration radius about that axis is $G = 3.00$ cm, and the distance from its tip to its center of mass is $r = 4.00$ cm. Evaluate $\omega_p$, the top's angular speed of precession, assuming it to be very small compared to $\omega_s$. Then check the validity of the assumption.

Fig. 10-18 A sketch used in Example 10-6 to evaluate the precessional angular velocity of a top. The vectors $\mathbf{T}$ and $d\mathbf{L}$ are parallel to each other, and both are perpendicular to the plane containing the vectors $\mathbf{r}$, $M\mathbf{g}$, and $\mathbf{L}$.

■ First you should make a sketch like the one in Fig. 10-18, in which you assume the total angular momentum $\mathbf{L}$ of the top about an origin $O$ at its tip to have the value $\mathbf{L} = \mathbf{L}_{\text{about CM}}$. That is, you assume $\mathbf{L}_{\text{of CM}}$ has such a small magnitude in comparison to the magnitude of $\mathbf{L}_{\text{about CM}}$ that you can ignore it in evaluating $\mathbf{L}$. This assumption is based on the fact that $\omega_p$ for a well-spun top is very small compared to $\omega_s$. In the figure the angle between $\mathbf{L}$ and the axis of the precession cone is labeled $\gamma$, and $d\phi$ is the angle swept out in time $dt$ by a line drawn from the tip of the vector $\mathbf{L}$ perpendicularly in to the cone's axis. The desired quantity $\omega_p$ is precisely $d\phi/dt$. The figure shows you that

$$d\phi = \frac{dL}{L \sin\gamma}$$

And Eq. (10-34) tells you that the relation between $dL$ and the torque $T$ is

$$dL = T\,dt$$

So direct substitution gives

$$d\phi = \frac{T\,dt}{L \sin\gamma}$$

or

$$\omega_p = \frac{d\phi}{dt} = \frac{T}{L \sin\gamma} \tag{10-35}$$

You can use Eq. (10-33) to evaluate $T$. This gives

$$T = |\mathbf{r} \times M\mathbf{g}| = rMg \sin(\pi - \gamma)$$

where $M$ is the mass of the top. Since $\sin(\pi - \gamma) = \sin\gamma$, this is

$$T = rMg \sin\gamma$$

Substitution into Eq. (10-35) gives you the following expression for the top's precessional angular speed:

$$\omega_p = \frac{rMg \sin\gamma}{L \sin\gamma} = \frac{rMg}{L}$$

Note that the precessional angular speed is independent of the angle $\gamma$ between the angular momentum of the top and the vertical direction. Now $L = I\omega_s$ for the rotation of the symmetrical top about its axis of symmetry at angular speed $\omega_s$. Also, its moment of inertia is $I = MG^2$, where $G$ is its gyration radius. Hence you can write the result as

$$\omega_p = \frac{rMg}{MG^2\omega_s}$$

or

$$\omega_p = \frac{rg}{G^2}\,\frac{1}{\omega_s} \tag{10-36}$$


The precessional angular speed $\omega_p$ of the top is inversely proportional to its spin angular speed $\omega_s$. (If you have ever played with a top, you probably have noticed the increase in $\omega_p$ as $\omega_s$ decreases as a result of friction.) The proportionality constant connecting $\omega_p$ and $1/\omega_s$ is the distance $r$ from its tip to its center of mass, multiplied by the gravitational acceleration $g$, and divided by the square of its gyration radius $G$ for rotation about its symmetry axis.

To  find  the  numerical  value  of  cop,  you  substitute  the  values  given  for  r,  G,  and 
cos,  and  also  the  value  of  g,  into  Eq.  (10-36).  You  obtain 


$$\omega_p = \frac{4.00 \times 10^{-2}\ \text{m} \times 9.80\ \text{m/s}^2}{(3.00 \times 10^{-2}\ \text{m})^2 \times 2.00 \times 10^{2}\ \text{rad/s}} = 2.18\ \text{rad/s} \quad (\approx 0.35\ \text{rotations per second})$$
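The arithmetic of Eq. (10-36) can be checked with a few lines of Python. This is only a sketch: the function name `precession_rate` and the variable names are chosen here for illustration, and the input values are the ones quoted in the example.

```python
import math

def precession_rate(r, G, omega_s, g=9.80):
    """Precessional angular speed from Eq. (10-36): omega_p = (r*g/G**2) * (1/omega_s)."""
    return r * g / (G**2 * omega_s)

r = 4.00e-2        # distance from tip to center of mass, in m
G = 3.00e-2        # gyration radius about the symmetry axis, in m
omega_s = 2.00e2   # spin angular speed, in rad/s

omega_p = precession_rate(r, G, omega_s)
rotations_per_second = omega_p / (2 * math.pi)
ratio = omega_p / omega_s   # should be small if the assumption omega_p << omega_s holds

print(f"omega_p = {omega_p:.2f} rad/s")
print(f"rotations per second = {rotations_per_second:.2f}")
print(f"omega_p / omega_s = {ratio:.4f}")
```

Doubling `omega_s` in this sketch halves `omega_p`, which is the inverse proportionality stated in Eq. (10-36).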


Since $\omega_p$ is about 1 percent of $\omega_s$, this justifies the assumption made at the beginning of the example: that $L_{\text{of CM}}$ is very small compared to $L_{\text{about CM}}$. What would you do to obtain an accurate evaluation of $\omega_p$ if the result obtained by making the assumption $\omega_p \ll \omega_s$ led to a value of $\omega_p$ which was, say, half as large as the value of $\omega_s$?


416  Rotational  Motion,  II 


The  behavior  of  a top  can  be  more  complicated  than  it  is  in  uniform 
precessional  motion.  One  thing  that  can  happen  is  that  the  axis  of  sym- 
metry can  rise  and  fall  periodically,  while  precessing  at  a nonuniform  rate 
about the vertical direction. This nodding motion is called nutation. Another possible motion of a top is called sleeping. When a top sleeps, it rotates with its axis of symmetry remaining vertical. These motions are more
difficult  than  uniform  precession  to  observe  experimentally  and  to  analyze 
theoretically,  so  they  will  not  be  treated  in  this  book. 

Uniform  precession  is  by  far  the  most  significant  motion  of  a top  because  it  is 
exactly  analogous  to  a motion  found  in  systems  that  are  of  much  more  interest  to 
contemporary  physics  and  related  fields  than  the  top.  In  most  atoms,  the  nucleus 
has  a spin  angular  momentum  parallel  to  its  axis  of  symmetry,  much  like  the 
angular  momentum  of  a top  spinning  about  its  axis  of  symmetry.  Such  a nucleus 
also  has  magnetic  properties,  as  if  it  contained  a microscopic  bar  magnet  along  the 
direction  of  the  axis  of  symmetry.  In  the  presence  of  a magnetic  field,  the  nucleus 
experiences  a torque  that  is  always  perpendicular  to  the  direction  of  the  axis  of 
symmetry,  and  therefore  always  perpendicular  to  the  direction  of  the  angular  mo- 
mentum. This  is  just  the  same  as  the  relation  between  the  direction  of  the  gravita- 
tional torque  experienced  by  a top  and  the  direction  of  its  angular  momentum. 
Therefore  the  nuclear  angular  momentum  performs  uniform  precessional  motion, 
like  a top.  In  fact,  the  situation  is  somewhat  simpler  than  that  for  a top.  This  is  be- 
cause the  center  of  mass  of  the  nucleus  can  be  considered  at  rest  in  an  inertial 
frame, so that the origin O can be taken at the center of mass. This means that $L = L_{\text{about CM}}$ exactly. Just as in Eq. (10-35), the angular speed of precession of the nucleus is proportional to the torque (now magnetic rather than gravitational) acting on it. Since the torque is proportional to the strength of the magnetic field applied to the nucleus, the precession speed is proportional to the strength of that magnetic field. Corresponding to the precession speed $\omega_p$ is a precession frequency $\nu_p = \omega_p/2\pi$. This precession frequency is also proportional to the strength of the magnetic field acting on the precessing nucleus.

The  precessing  magnetic  nuclei  in  a sample  of  material  can  absorb  electro- 
magnetic waves,  just  as  if  they  were  tiny  radio  receivers,  providing  the  frequency 
of  the  electromagnetic  waves  is  the  same  as  the  precession  frequency  of  the  nu- 
clei. By  determining  the  frequency  at  which  absorption  occurs,  and  therefore  de- 
termining the  precession  frequency,  physicists  and  chemists  can  measure  with 
great  precision  the  strength  of  the  magnetic  field  acting  on  the  nuclei.  This  allows 
the  scientists  to  study  the  magnetic  field  present  at  the  nuclei  of  atoms,  molecules, 
or  solids  as  a result  of  the  circulation  of  electrons  about  the  nuclei.  The  magnetic 
field  depends  on  the  precise  distribution  of  the  electrons,  and  so  its  measurement 
provides  detailed  information  about  the  structure  of  systems  of  prime  importance 
in  solid-state  physics  and  chemistry.  The  experimental  technique  is  called  nu- 
clear magnetic  resonance. 

Section  10-4  opens  by  considering  one  more  example  of  uniform  pre- 
cessional motion. 


10-4 ROTATION ABOUT AN ACCELERATING CENTER OF MASS

The earth is a majestic example of the topic of Sec. 10-3: uniform precessional motion. The precessional motion of the earth can be studied most simply by using a reference frame with an origin fixed at the earth's center of mass. We will do so, obtaining thereby an example of the topic of this section: the rotation of a body about an accelerating center of mass.

The  earth  spins  daily  about  its  polar  axis.  This  axis  is  inclined  at  an 
angle  of  23.45°  to  a perpendicular  to  the  plane  of  the  earth’s  annual  orbit 
about the sun. The situation is depicted in Fig. 10-19 from the point of view of a reference frame whose origin is fixed at the earth's center of mass. This reference frame does not participate in the daily rotation of the earth. That is, the earth is seen to spin about its polar axis when viewed from the reference frame.

Because  of  its  rapid  rotation  about  the  polar  axis,  the  earth’s  shape 
has  deformed  slightly  from  perfect  sphericity.  The  earth  bulges  at  the 
equator.  A consequence  of  the  equatorial  bulge  is  that  the  gravitational 
forces  exerted  by  the  moon  on  various  parts  of  the  earth  produce  a torque 
on  the  earth.  The  torque  varies  in  strength  as  the  moon  travels  each  month 
through  its  orbit  about  the  earth.  But  the  torque  is  always  directed  perpen- 
dicular to  the  earth's  polar  axis.  You  can  see  this  by  considering  the  forces 
exerted  at  the  four  moon  positions  shown  in  the  figure.  In  position  1 the 
gravitational  attraction  of  the  moon  for  the  nearer  bulge  of  the  earth, 
which  is  below  the  plane  of  the  moon’s  orbit,  is  greater  than  its  attrac- 
tion for  the  farther  bulge,  which  is  above  the  plane.  So  the  moon  pulls  up- 
ward on  the  bulge  below  the  plane  more  than  it  pulls  downward  on  the 
bulge  above  the  plane.  This  produces  a torque  about  an  origin  at  the  earth’s 
center  of  mass,  acting  in  the  direction  shown  in  Fig.  10-19.  In  position  3 the 
moon  exerts  a greater  downward  pull  on  the  nearer  bulge  above  the  plane 
than  on  the  farther  bulge  below  the  plane,  and  again  the  torque  vector  acts 
in the direction shown in the figure. At positions 2 and 4 there is no torque.
If  we  average  over  each  orbit  of  the  moon  about  the  earth,  there  is  a net 
torque  exerted  on  the  earth  by  the  moon.  The  torque  is  directed  perpen- 
dicular to  the  earth’s  polar  axis.  It  is  quite  weak  on  the  scale  of  this  system. 
A very  similar,  but  somewhat  weaker,  torque  is  exerted  on  the  earth  by  the 
sun.  The  sum  of  these  two  constitutes  a net  torque  T acting  on  the  earth 
about  its  center  of  mass.  Averaged  over  any  year,  this  torque  has  an  essen- 
tially constant  magnitude  and  a direction  which  is  in  the  plane  of  the  earth’s 
annual  orbit  about  the  sun  and  perpendicular  to  the  direction  of  the  earth’s 
polar  axis  during  the  particular  year.  That  is,  T is  always  perpendicular  to 
the  spin  angular  velocity  describing  the  daily  rotation  of  the  earth  about 
its  polar  axis. 

The  earth  has  a large  angular  momentum  about  an  origin  at  its  center 
of mass because of the daily rotation. Since it is essentially symmetrical about the polar axis, this angular momentum $\mathbf{L} = \mathbf{L}_{\text{about CM}}$ is in the same direction as $\boldsymbol{\omega}_s$.

Thus  there  is  a weak  net  torque  T acting  on  the  earth  whose  direction 
is  always  perpendicular  to  the  earth’s  large  angular  momentum  L,  just  as  in 
the  case  of  a top.  This  results  in  a very  slow  precession  of  the  angular  mo- 
mentum vector  around  a cone  whose  axis  is  normal  to  the  plane  of  the 
earth’s  motion  about  the  sun.  The  half-apex  angle  of  the  cone  is  23.45°, 
and  the  period  of  the  precession  is  25,920  yr.  The  angular  momentum 
vector  is  at  present  directed  so  that  the  North  end  of  the  polar  axis  of  the 
earth  points  within  1°  of  a particular  star  that  we  call  the  North  Star,  or  Po- 
laris. People  living  13,000  yr  from  now  will  find  that  the  North  pole  misses 
pointing  at  that  particular  star  by  about  47°,  and  they  will  surely  give  it  a 
different  name.  But  after  an  additional  13,000  yr,  Polaris  will  again  be  an 
appropriate  name  because  the  continued  precessional  motion  will  have 
brought  the  earth’s  polar  axis  back  to  its  present  alignment. 
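The figures quoted for the earth's precession can be cross-checked numerically. The sketch below assumes only the period and cone half-angle given in the text; the function name `pole_offset_deg` is made up for this illustration. It uses the standard spherical-trigonometry result for the angle between two directions that lie on the same cone.

```python
import math

PERIOD_YR = 25_920.0   # precession period quoted in the text, in years
TILT_DEG = 23.45       # half-apex angle of the precession cone, in degrees

# Angular rate at which the polar axis sweeps around the cone.
rate_arcsec_per_yr = 360.0 * 3600.0 / PERIOD_YR

def pole_offset_deg(years):
    """Angle between the present polar axis direction and its direction after `years`.

    Both directions lie on a cone of half-angle TILT_DEG and differ in azimuth
    by dphi, so cos(separation) = cos^2(tilt) + sin^2(tilt) * cos(dphi).
    """
    eps = math.radians(TILT_DEG)
    dphi = 2.0 * math.pi * years / PERIOD_YR
    cos_sep = math.cos(eps) ** 2 + math.sin(eps) ** 2 * math.cos(dphi)
    cos_sep = max(-1.0, min(1.0, cos_sep))   # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(cos_sep))

print(f"precession rate = {rate_arcsec_per_yr:.1f} arcseconds per year")
print(f"offset after 13,000 yr = {pole_offset_deg(13_000):.1f} degrees")
print(f"offset after 25,920 yr = {pole_offset_deg(25_920):.4f} degrees")
```

Half a period from now the axis is on the opposite side of the cone, so the offset is twice the half-apex angle, which is the origin of the figure of about 47° in the text.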


The above discussion of the precessional motion of the earth was simplified by taking the center of mass of the earth as the origin O used to define the torque and angular momentum under consideration. It was implied in the discussion that for this choice of origin the torque and angular
momentum  are  related  by  Newton’s  second  law  for  rotational  motion  — 
torque  equals  rate  of  change  of  angular  momentum  — just  as  they  are  in 
the  case  of  a top.  Is  such  an  implication  valid? 

In  the  treatment  of  the  precession  of  the  top,  the  origin  O was  at  the 
position  where  its  tip  rests  on  the  floor.  That  choice  of  origin  satisfies  to  a 
very  good  approximation  the  condition,  used  in  deriving  the  rotational 
form  of  Newton’s  second  law  in  Sec.  9-5,  that  O be  the  origin  of  an  inertial 
reference  frame.  The  reason  is  that  the  acceleration  of  O with  respect  to  an 
exact  inertial  frame  is  completely  negligible,  compared  to  the  acceleration 
associated  with  the  motion  of  the  top  about  O.  But  in  the  treatment  of  the 
precession  of  the  earth,  the  origin  O was  at  the  earth’s  center  of  mass.  In 
this  case  the  acceleration  of  O with  respect  to  an  exact  inertial  frame,  re- 
sulting from  the  annual  motion  of  O about  the  sun,  is  not  very  much  smaller 
than  the  acceleration  associated  with  the  motion  of  the  earth  about  O.  Thus 
it  is  not  apparent  that  it  is  valid  to  use  Newton’s  second  law  for  rotational 
motion  with  an  origin  fixed  at  the  earth’s  center  of  mass  in  discussing  the 
precession  of  the  earth.  But,  in  fact,  it  is  valid  to  do  so.  One  of  the  sur- 
prising properties  of  the  center  of  mass  of  any  body  is  that  when  the  net 
torque  acting  on  a body,  and  its  total  angular  momentum,  are  taken  about 
an  origin  at  the  center  of  mass  of  the  body,  then  the  torque  equals  the  rate  of 
change  of  angular  momentum,  no  matter  how  the  center  of  mass  is  moving.  This 
useful  fact  is  proved  in  the  material  in  small  print  which  follows. 


Consider  the  rotating  body  shown  in  Fig.  10-20  and  two  origins  about  which 
its angular momentum can be measured. One is the origin O of an inertial reference frame. The other is the origin O' fixed at the center of mass of the body. In general, the frame of which O' is the origin is not inertial. We designate the angular momentum of the body about O by the symbol L and its angular momentum about O' by the symbol L'. A relation between these two quantities is given by Eq. (10-28). Using Eqs. (10-32) to write this relation in terms of our present notation, we find


L = L'  + r x Mv 




Here  r is  the  vector  from  O to  O',  M is  the  total  mass  of  the  body,  and  v is  the  veloc- 
ity of  O'  as  seen  from  O. 

Next  we  take  the  time  derivative  of  all  terms  in  the  relation  to  obtain 


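$$\frac{d\mathbf{L}}{dt} = \frac{d\mathbf{L}'}{dt} + \frac{d\mathbf{r}}{dt} \times M\mathbf{v} + \mathbf{r} \times \frac{d}{dt}(M\mathbf{v}) \tag{10-37}$$

The second term on the right side of this equation has the value

$$\frac{d\mathbf{r}}{dt} \times M\mathbf{v} = \mathbf{v} \times M\mathbf{v} = 0 \tag{10-38a}$$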


since  v is  parallel  to  Mv.  Newton’s  second  law  for  translational  motion  can  be  ap- 
plied to  evaluate  the  third  term.  This  is  because  the  momentum  Mv  is  measured 
relative  to  the  inertial  frame  whose  origin  is  O,  so  the  law  is  applicable.  It  gives 

$$\mathbf{r} \times \frac{d}{dt}(M\mathbf{v}) = \mathbf{r} \times \mathbf{F} \tag{10-38b}$$

where  F is  the  net  force  applied  to  the  body.  Newton’s  second  law  for  rotational 
motion  can  be  applied  to  the  term  on  the  left  side  of  Eq.  (10-37)  because  the  angu- 
lar momentum  L is  measured  about  the  origin  O of  an  inertial  frame.  This  form  of 
the  law  gives 

$$\frac{d\mathbf{L}}{dt} = \mathbf{T} \tag{10-38c}$$


where  T is  the  net  torque  about  O applied  to  the  body.  Using  Eqs.  (10-38)  in  Eq. 
(10-37),  we  have 

$$\mathbf{T} = \frac{d\mathbf{L}'}{dt} + \mathbf{r} \times \mathbf{F} \tag{10-39}$$


Now  the  net  torque  about  O applied  to  the  body  is  the  sum  of  the  torques 
about  that  origin  produced  by  the  external  forces  applied  to  each  of  its  n particles. 
The external force applied to the jth particle is $\mathbf{F}_j$, its position vector from O is $\mathbf{r}_j$, and the torque about O produced by the force is $\mathbf{T}_j = \mathbf{r}_j \times \mathbf{F}_j$. Thus

$$\mathbf{T} = \sum_{j=1}^{n} \mathbf{r}_j \times \mathbf{F}_j \tag{10-40a}$$

Furthermore, the net force applied to the body is the sum over the n particles of the individual forces $\mathbf{F}_j$. So

$$\mathbf{F} = \sum_{j=1}^{n} \mathbf{F}_j \tag{10-40b}$$


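Substituting the values of T and F given by Eqs. (10-40) into Eq. (10-39), we have

$$\sum_{j=1}^{n} \mathbf{r}_j \times \mathbf{F}_j = \frac{d\mathbf{L}'}{dt} + \mathbf{r} \times \sum_{j=1}^{n} \mathbf{F}_j$$

Transposing the second summation on the right side of this equality, and using the fact that r is the same for all terms in this summation, we can rewrite the equality as

$$\sum_{j=1}^{n} \mathbf{r}_j \times \mathbf{F}_j - \sum_{j=1}^{n} \mathbf{r} \times \mathbf{F}_j = \frac{d\mathbf{L}'}{dt}$$

This can be expressed in terms of a single summation, as follows:

$$\sum_{j=1}^{n} (\mathbf{r}_j - \mathbf{r}) \times \mathbf{F}_j = \frac{d\mathbf{L}'}{dt}$$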




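Figure 10-20 shows that $\mathbf{r}_j - \mathbf{r} = \mathbf{r}_j'$, where $\mathbf{r}_j'$ is the position vector of the jth particle from O'. Thus we have

$$\sum_{j=1}^{n} \mathbf{r}_j' \times \mathbf{F}_j = \frac{d\mathbf{L}'}{dt}$$

This is

$$\sum_{j=1}^{n} \mathbf{T}_j' = \frac{d\mathbf{L}'}{dt}$$

where $\mathbf{T}_j'$ is the torque about O' applied to the jth particle. The sum of these torques is just T', the net torque about O' applied to the body. Hence we have proved that

$$\mathbf{T}' = \frac{d\mathbf{L}'}{dt} \tag{10-41}$$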


This  has  the  form  of  Newton’s  second  law  for  rotational  motion,  even  though  O' 
may  not  be  the  origin  of  an  inertial  reference  frame. 
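A numerical spot check can make Eq. (10-41) more concrete. The sketch below is an illustration under stated assumptions, not part of the proof: the three particles, their masses, initial conditions, and the constant external forces are all hypothetical values chosen so that the net force is not zero and the center of mass accelerates. It compares a finite-difference estimate of dL'/dt with the torque T', both taken about the center of mass.

```python
def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(c, a):
    return tuple(c * x for x in a)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

ZERO = (0.0, 0.0, 0.0)

# Hypothetical particles: (mass, initial position, initial velocity, constant external force).
# The forces sum to (2, 3, 1), so the center of mass accelerates.
PARTICLES = [
    (1.0, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)),
    (2.0, (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (3.0, 0.0, 1.0)),
    (3.0, (0.0, 2.0, 1.0), (0.0, 0.0, -1.0), (-1.0, 1.0, 0.0)),
]

def state(t):
    """Exact positions and velocities at time t for motion under constant forces."""
    out = []
    for m, r0, v0, F in PARTICLES:
        a = scale(1.0 / m, F)
        r = add(add(r0, scale(t, v0)), scale(0.5 * t * t, a))
        v = add(v0, scale(t, a))
        out.append((m, r, v, F))
    return out

def about_cm(t):
    """Return (L', T'): angular momentum and net torque about the center of mass at time t."""
    s = state(t)
    M = sum(m for m, _, _, _ in s)
    r_cm, v_cm = ZERO, ZERO
    for m, r, v, _ in s:
        r_cm = add(r_cm, scale(m / M, r))
        v_cm = add(v_cm, scale(m / M, v))
    L, T = ZERO, ZERO
    for m, r, v, F in s:
        r_prime = sub(r, r_cm)                      # position relative to O'
        v_prime = sub(v, v_cm)                      # velocity relative to O'
        L = add(L, cross(r_prime, scale(m, v_prime)))
        T = add(T, cross(r_prime, F))
    return L, T

t, h = 1.0, 1e-4
L_plus, _ = about_cm(t + h)
L_minus, _ = about_cm(t - h)
_, T_prime = about_cm(t)
dL_dt = scale(1.0 / (2.0 * h), sub(L_plus, L_minus))   # central-difference derivative

print("dL'/dt =", dL_dt)
print("T'     =", T_prime)
```

The two printed vectors should agree to within the small error of the finite-difference step, even though the frame attached to the center of mass is not inertial here.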


The  physical  reason  for  the  property  of  the  center  of  mass  described 
by  Eq.  (10-41)  can  be  explained  in  terms  of  the  concept  of  fictitious  forces, 
discussed  in  Sec.  5-4.  Newton’s  second  law  involving  force  applies  directly 
only  in  an  inertial  reference  frame.  But  in  an  accelerating  frame  it  can  be 
made  to  apply  if  appropriate  fictitious  forces  are  added  to  the  real  forces. 
The  sum  of  all  the  fictitious  forces  for  a body  is  equivalent  to  a single  fic- 
titious force,  and  it  turns  out  that  this  force  acts  at  the  body’s  center  of  mass. 
Therefore  the  fictitious  force  exerts  no  torque  about  an  origin  at  the  center 
of  mass,  and  it  can  be  neglected  as  far  as  rotation  about  the  center  of  mass  is 
concerned.  The  result  is  that  Newton’s  second  law  for  rotation  applies 
directly  to  rotation  about  an  origin  at  the  center  of  mass,  even  if  the  center 
of  mass  is  accelerating  with  respect  to  an  inertial  frame! 


Equations  (9-39),  T = dL/dt,  and  (10-41),  T'  = dL'/dt,  can  be  sum- 
marized by  saying  the  following:  The  net  torque  applied  to  a body  is  equal  to  the 
rate of change of its total angular momentum if either of the following conditions is
met  in  choosing  the  origin  about  which  the  torque  and  angular  momentum  are  taken: 

1.  The  origin  is  located  at  any  point,  but  is  the  origin  of  an  inertial  frame. 

2.  The  origin  is  located  at  the  body's  center  of  mass,  but  is  the  origin  of  a refer- 
ence frame  which  need  not  be  inertial. 


Knowing  this,  we  will  henceforth  not  bother  to  make  a distinction  between 
Eqs.  (9-39)  and  (10-41).  That  is,  we  will  drop  the  primes  in  the  latter  and 
write  both  as 


$$\mathbf{T} = \frac{d\mathbf{L}}{dt} \tag{10-42}$$


with  the  understanding  that  T and  L are  the  net  torque  applied  to  the  body 
and  its  total  angular  momentum,  about  an  origin  O of  a frame  which  does 
not  need  to  be  an  inertial  frame  if  O lies  at  the  body’s  center  of  mass.  In  many, 
but  not  all,  circumstances,  the  total  angular  momentum  of  the  body  can  be 
written in terms of its moment of inertia I about the rotation axis and its angular velocity $\boldsymbol{\omega}$ as

$$\mathbf{L} = I\boldsymbol{\omega} \tag{10-43}$$



The center of mass itself moves in a manner governed by the equation

$$\mathbf{F} = \frac{d\mathbf{P}}{dt} \tag{10-44}$$

where F and P are the net force applied to the body and its total momentum. In this equation the total momentum is measured from an origin O which must be the origin of an inertial frame. In all circumstances the total momentum of the body can be written in terms of its mass M and the velocity v of its center of mass, as

$$\mathbf{P} = M\mathbf{v} \tag{10-45}$$

This property of the center of mass can be summarized by saying the following: The center of mass of a body moves in such a way that the net force acting on
the  body  equals  the  rate  of  change  of  its  total  momentum  relative  to  an  inertial  frame, 
the  total  momentum  being  the  product  of  the  body’s  mass  and  the  velocity  of  its  center 
of  mass  in  the  inertial  frame.  The  justification  of  this  statement  comes  directly 
from  experiment.  We  specified  the  position  of  each  of  the  air  table  pucks  in 
Chap.  4 by  the  position  of  its  center  of  mass,  and  its  velocity  by  the  rate  of 
change  of  that  position.  Then  we  took  the  momentum  of  the  puck  to  be  the 
product  of  its  mass  and  the  velocity  of  its  center  of  mass,  and  we  defined 
the  net  force  acting  on  it  as  the  rate  of  change  of  this  total  momentum.  No 
matter  what  bodies  are  studied  to  deduce  Newton’s  second  law  for  transla- 
tional motion,  the  same  thing  is  done.  So  the  direct  results  of  the  deduction 
are  Eqs.  (10-44)  and  (10-45). 

If  you  wish,  you  can  say  that  the  most  basic  form  of  Newton’s  second  law  is 
F = dp/dt,  where  p = mv  and  where  all  these  quantities  pertain  to  a particle.  An 
exercise  at  the  end  of  this  chapter  requires  you  to  do  a calculation  in  which  these 
relations  are  applied  to  each  of  the  particles  in  a body,  and  then  a summation  is 
taken  over  all  its  particles.  The  definition  of  center  of  mass  is  used  to  express  the 
right  side  of  the  equation  obtained  in  terms  of  the  time  derivative  of  the  product  of 
the  body’s  mass  and  the  velocity  of  its  center  of  mass.  Newton’s  law  of  action  and 
reaction  is  used  to  show  that  the  forces  acting  between  the  particles  in  the  body  to 
hold  it  together  cancel  in  pairs,  so  that  the  left  side  of  the  equation  reduces  to  the 
net  force  applied  to  the  body.  The  equation  is  then  seen  to  be  precisely  Eq.  (10-44), 
in  which  the  quantity  being  differentiated  on  the  right  side  satisfies  Eq.  (10-45). 

Since  the  translational  form  of  Newton’s  second  law,  F = dP/dt,  deals 
with different quantities than those dealt with in the rotational form, T = dL/dt, the motion of the center of mass of a body can be considered separately from the rotation of the body about its center of mass. We have done just this, quite successfully, in many of the treatments given prior to Chap. 9. These are the
ones  in  which  a body  actually  changes  its  orientation  as  well  as  its  location  as 
it  moves,  but  the  changes  in  orientation  are  ignored  and  only  the  changes 
in  location  are  considered.  We  have  done  the  same  thing  in  Chap.  9,  and 
up  to  this  point  in  Chap.  10,  by  ignoring  whatever  changes  in  location  a 
body  might  have  as  it  moves  and  considering  only  its  changes  in  orienta- 
tion. 


Now  we  investigate  situations  in  which  we  must  take  into  account  both 
changes  in  orientation  and  changes  in  location  of  a body.  We  do  this  by 
describing  the  body’s  motion  as  a combination  of  rotation  about  its  center 
of  mass  and  motion  of  the  center  of  mass.  This  is  not  the  only  way  that  a 
general motion of a body can be described. (If you give this book some general motion and think about the motion for a moment, you will conclude that the motion can be described as a rotation about any point you choose, combined with a motion of that point.) We choose the point about which we consider the body to be rotating to be its center of mass because then we can be sure that the important equation T = dL/dt will be valid, no matter how that point moves.

Fig. 10-21 Strobe photo of a drum majorette's baton. The path of its center of mass while in flight is shown by the solid part of the curve superimposed on the photo. (Photo courtesy of Harold E. Edgerton.)

The  strobe  photo  of  a drum  majorette’s  baton  in  Fig.  10-21  gives  an 
example  of  a general  motion  of  a body  that  can  be  understood  by  using  the 
rotational form of Newton's second law, T = dL/dt, to analyze the motion
of  the  body  about  its  center  of  mass,  and  its  translational  form,  F = dP/dt,  to 
analyze  the  motion  of  the  center  of  mass.  The  baton  is  a stick  with  one  end 
weighted  so  that  its  center  of  mass  is  not  at  the  center  of  the  stick.  After  the 
baton  is  thrown,  only  the  gravitational  attraction  of  the  earth  acts  on  it.  As  a 
consequence,  its  center  of  mass  moves  along  the  parabolic  trajectory  shown 
by  the  solid  curve  in  the  photo.  This  trajectory  is  the  path  that  any  particle 
obeying the equations dP/dt = F = Mg would follow. Because the baton is thrown with an initial angular momentum about the center of mass, the baton also rotates while in flight. In fact, it rotates with constant angular velocity $\boldsymbol{\omega}$ about the center of mass so as to maintain a constant angular momentum $\mathbf{L} = I\boldsymbol{\omega}$ about that point, its moment of inertia I about the rotation
axis  being  constant.  (The  mass  elements  of  the  baton  are  distributed  in 
what  is  essentially  a single  plane  of  rotation  passing  through  the  origin  at 
the  center  of  mass,  so  its  angular  momentum  is  always  parallel  to  its  angu- 
lar velocity.  Therefore  it  is  valid  to  describe  its  rotation  in  terms  of  its  mo- 
ment of  inertia.)  The  angular  momentum  cannot  change  because  the  net 
gravitational  attraction  of  the  earth  for  the  body  acts  as  if  the  attraction  were 
applied  at  the  center  of  mass.  So  the  gravitational  force  exerts  no  torque 
about that point, and dL/dt = T = 0.




Fig. 10-22 The motion of a diver. Her center of mass moves in the parabolic trajectory
of  a particle,  while  her  body  rotates  with 
constant  angular  momentum  about  the 
center  of  mass.  But  her  angular  velocity  is 
not  constant  since  she  changes  her  moment 
of  inertia  about  the  rotation  axis. 


An  artist’s  rendition  of  a strobe  photo  of  a diver  is  shown  in  Fig.  10-22. 
The  motion  is  similar  to  that  of  the  baton  because  dP/dt  = Mg,  so  the 
diver's center of mass moves along a parabolic trajectory. Also, dL/dt = 0, as for the baton. However, during the dive the diver temporarily decreases the moment of inertia I of her body about the rotation axis passing through her center of mass by doubling up. This causes her angular velocity $\omega$ to increase, since the angular momentum $L = I\omega$ must be constant. The increased angular velocity helps her complete the forward turn before hitting the water. [It is valid to describe the rotation of the diver in terms of a
moment  of  inertia  because  her  angular  momentum  is  always  parallel  to  her 
angular  velocity.  The  reason  is  that  the  diver  is  rotating  about  a principal 
axis.  See  the  discussion  in  small  print  following  Eq.  (10-13).] 
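The diver's change in angular velocity follows directly from holding $L = I\omega$ constant. The sketch below illustrates the arithmetic; the moments of inertia and the initial angular speed are hypothetical round numbers chosen for illustration, not measured values, and the function name `spin_after_tuck` is invented here.

```python
def spin_after_tuck(I_initial, omega_initial, I_final):
    """Angular speed after the moment of inertia changes, with L = I*omega held constant."""
    L = I_initial * omega_initial
    return L / I_final

# Hypothetical values, for illustration only.
I_extended = 15.0      # kg*m^2, body stretched out
I_tucked = 4.0         # kg*m^2, body doubled up
omega_extended = 2.0   # rad/s

omega_tucked = spin_after_tuck(I_extended, omega_extended, I_tucked)

print(f"L before = {I_extended * omega_extended:.1f} kg*m^2/s")
print(f"L after  = {I_tucked * omega_tucked:.1f} kg*m^2/s")
print(f"omega increases from {omega_extended:.1f} to {omega_tucked:.1f} rad/s")
```

The angular speed rises by the same factor by which the moment of inertia falls, since no torque acts about the center of mass during the dive.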

The  motion  of  the  earth  exemplifies  a case  in  which  there  are  nonzero 
values  both  for  the  force  F in  the  equation  governing  the  motion  of  the 
center  of  mass  of  a body  and  for  the  torque  T in  the  equation  governing  its 
rotation about the center of mass. As we will see in Chap. 11, the center of
mass  of  the  earth  moves  in  an  elliptical  orbit  about  the  sun  under  the  influ- 
ence of  a gravitational  force  exerted  on  that  point  by  the  sun.  The  earth 
also  spins  about  an  axis  passing  through  its  center  of  mass.  We  saw  in  Sec. 
10-3  that  the  angular  momentum  is  not  constant  in  direction  because  of  a 
torque  due  primarily  to  the  unequal  strengths  of  the  moon’s  gravitational 
attraction  on  the  near  and  far  parts  of  the  earth's  equatorial  bulge.  Ex- 
ample 10-7  treats  a simpler  case  of  motion  with  nonzero  values  of  F and  T. 

A wheel  in  the  form  of  a uniform,  solid  disk  of  radius  R rolls  without  slipping  down 
a plane inclined to the horizontal at an angle $\theta$. If the wheel starts from rest at a
height  y above  the  bottom  of  the  plane,  determine  the  speed  v of  its  center  of  mass 
when  it  reaches  the  bottom. 




■ Your  first  step  should  be  to  make  a drawing  of  the  system  and  the  forces  acting 
on  the  wheel,  as  in  Fig.  10-23.  These  are  the  weight  Mg  acting  vertically  downward 
on  the  center  of  mass,  the  force  N exerted  by  the  plane  on  the  wheel  in  the  direction 
normal  to  its  surface,  and  the  static  contact  friction  force  Cs  exerted  by  the  plane  on 
the  wheel  in  the  generally  upward  direction  along  the  plane  that  prevents  the  wheel 
from  slipping. 

The  motion  of  the  wheel’s  center  of  mass  along  the  inclined  plane  is  found  by 
taking  the  components  in  that  direction  of  the  equation  stating  Newton’s  second  law 
in the translational form, F = dP/dt. Choosing the positive direction to be downward along the plane, you have

$$Mg \sin\theta - C_s = \frac{dP}{dt} = \frac{d(Mv)}{dt} = M\frac{dv}{dt}$$

Fig. 10-23 A uniform solid disk rolling down an inclined plane.

Since  dv/dt  is  the  acceleration  a of  the  center  of  mass  down  the  plane,  this  can  be 
written 


$$Mg \sin\theta - C_s = Ma \tag{10-46}$$

When the wheel of radius R rotates through an angle $\phi = 2\pi$ to make one complete turn, its center of mass moves down the plane a distance $s = 2\pi R$. (This is the distance around its circumference.) In general, the relation between s and $\phi$ is $s = \phi R$, as long as the wheel does not slip. Differentiating with respect to time, you find the speed v of the center of mass of the rolling wheel to be

$$v = \frac{ds}{dt} = R\frac{d\phi}{dt}$$

Since $d\phi/dt$ is the wheel's angular speed of rotation $\omega$, this can be written

$$v = R\omega \tag{10-47}$$


The  rotational  motion  is  found  by  taking  the  components  along  the  rotation 
axis  of  the  equation  stating  Newton’s  second  law  in  the  rotational  form,  T = dL/dt. 
As  explained,  this  equation  can  be  used  for  an  origin  O at  the  center  of  mass,  even 
though  the  point  is  accelerating  down  the  plane.  The  only  force  producing  a torque 
about  the  center  of  mass  is  the  frictional  force,  since  the  gravitational  force  is  ap- 
plied at  that  point  and  the  normal  force  is  directed  toward  it.  Taking  the  inward 
direction  as  positive,  you  have 


$$RC_s = \frac{dL}{dt} = \frac{d(I\omega)}{dt} = I\frac{d\omega}{dt} \tag{10-48}$$


Here I is the moment of inertia of the wheel for rotation about its symmetry axis. Solving Eq. (10-48) for $C_s$ gives

$$C_s = \frac{I}{R}\frac{d\omega}{dt}$$

Differentiating Eq. (10-47) with respect to t, you obtain

$$\frac{dv}{dt} = R\frac{d\omega}{dt}$$

or

$$a = R\frac{d\omega}{dt}$$

This allows you to write

$$C_s = \frac{I}{R}\frac{a}{R} = \frac{Ia}{R^2}$$




Then you substitute this into Eq. (10-46), producing

$$Mg \sin\theta - \frac{Ia}{R^2} = Ma$$

or

$$a = \frac{Mg \sin\theta}{M + I/R^2} \tag{10-49}$$


Table  10-1  shows  that  the  square  of  the  gyration  radius  of  the  uniform  disk  ro- 
tating about  its  symmetry  axis  is 

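$$G^2 = \tfrac{1}{2}R^2$$

So you have

$$\frac{I}{R^2} = \frac{MG^2}{R^2} = \tfrac{1}{2}M$$

Thus Eq. (10-49) becomes

$$a = \frac{Mg \sin\theta}{M + \tfrac{1}{2}M}$$

or

$$a = \tfrac{2}{3}\,g \sin\theta \tag{10-50}$$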


This  result  for  the  acceleration  a of  the  center  of  mass  of  a disk  rolling  down  a plane 
inclined at an angle $\theta$ can be compared with the result $a = g \sin\theta$, obtained in Example 4-4, for a block sliding without friction down the plane. Can you give a qualitative physical explanation of why the acceleration is smaller here?

When the wheel rolls through a distance s down the plane, its center of mass drops through a height y. The figure shows you that $y = s \sin\theta$, or

$$s = \frac{y}{\sin\theta} \tag{10-51}$$


The  speed  v of  its  center  of  mass  after  moving  from  rest  through  a distance  s with 
acceleration  a is  obtainable  immediately  from  Eq.  (2-32).  Take  the  initial  value  of  a 
coordinate  extending  down  the  plane  in  the  direction  of  the  acceleration  a to  have 
the  value  zero,  and  assume  that  the  initial  velocity  in  that  direction  has  the  value 
zero  also.  If  the  final  value  of  the  coordinate  is  s,  the  equation  shows  that  s = v2/2 a. 
Thus 


v2  = 2 as 

Using  Eqs.  (10-50)  and  (10-51),  you  find 


$$v^2 = 2\left(\frac{2g \sin\theta}{3}\right)\left(\frac{y}{\sin\theta}\right) = \frac{4gy}{3}$$


Hence 


$$v = \sqrt{\frac{4gy}{3}} = \sqrt{\frac{2}{3}}\,\sqrt{2gy} \tag{10-52}$$


The final speed of the center of mass of the rolling wheel is less than the value
$v = \sqrt{2gy}$ for the sliding block by the factor $\sqrt{2/3}$ because of its smaller acceleration.
So  if  you  have  a physical  explanation  for  why  the  acceleration  is  smaller,  you  should 
have  one  for  why  the  final  speed  is  smaller.  Do  you? 




10-5  ENERGY  IN 
ROTATIONAL  MOTION 


Fig. 10-24 A rotating body and one of its constituent particles.


It can be quite difficult to give a qualitative physical explanation of the motion
of a body which is both translating and rotating by considering the
forces  and  torques  acting  on  it.  If  you  tried  to  do  this  for  the  wheel  rolling 
down  the  inclined  plane  in  Example  10-7,  you  will  undoubtedly  agree.  The 
trouble  is  that  your  intuition  must  deal  simultaneously  with  the  two  quite 
different  types  of  vector  quantities  involved,  the  forces  and  the  torques,  and 
with  the  interplay  between  the  effects  that  they  produce. 

In  many  cases  it  is  much  less  difficult  to  think  through  the  reasons  for 
the  behavior  of  a moving  body  by  considering  energy,  instead  of  force  and 
torque.  You  have  seen  this  before  for  motion  that  involves  only  translation. 
It  is  even  more  true  for  the  case  of  motion  involving  both  translation  and 
rotation.  The  simplicity  of  energy  considerations  results  from  the  fact  that 
they  concern  only  scalar  quantities,  and  scalars  are  much  easier  to  keep 
track  of  than  vectors. 

The physical understanding that can frequently be found by considering
energy deserves to be stressed. As important as it is to be able to make
quantitative  predictions  concerning  the  behavior  of  a physical  system,  it  is 
even  more  important  to  develop  a qualitative  feeling  for  the  physical 
reasons  underlying  the  behavior. 

As  you  know,  the  energy  relations  also  frequently  have  the  additional 
advantage  of  leading  much  more  easily  than  direct  application  of  Newton’s 
second  law  to  quantitative  predictions  about  physical  systems.  Thus  we 
have  ample  motivation  for  extending  the  energy  relations,  developed  in 
Chap. 7 for translational motion, to the case of a body whose motion is observed
to be rotational, or both translational and rotational.


A body and its jth constituent particle are shown in Fig. 10-24. The position of that particle relative to the center of mass at O' is given by the vector $\mathbf{r}'_j$, and its position relative to the origin O of the reference system is given by $\mathbf{r}_j$. The vector $\mathbf{r}$ specifies the position of O' relative to O. Writing

$$\mathbf{r}_j = \mathbf{r} + \mathbf{r}'_j$$

and then taking time derivatives, we obtain the relation

$$\mathbf{v}_j = \mathbf{v} + \mathbf{v}'_j \tag{10-53}$$

among the velocity $\mathbf{v}_j$ of the particle with respect to O, its velocity $\mathbf{v}'_j$ with respect to O', and the velocity $\mathbf{v}$ of O' with respect to O.

The kinetic energy K of the body, as seen from O, is the sum of the kinetic energies of its n particles. From the point of view of O, each particle of mass $m_j$ has kinetic energy $m_j v_j^2/2$, since its speed is $v_j$. Therefore

$$K = \sum_{j=1}^{n} \frac{m_j v_j^2}{2}$$


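It will be very useful to express K in terms of $v$ and $v'_j$. To this end, we take the dot product of Eq. (10-53) into itself, producing

$$\mathbf{v}_j \cdot \mathbf{v}_j = \mathbf{v} \cdot \mathbf{v} + \mathbf{v}'_j \cdot \mathbf{v}'_j + 2\,\mathbf{v}'_j \cdot \mathbf{v}$$

or

$$v_j^2 = v^2 + v_j'^2 + 2\,\mathbf{v}'_j \cdot \mathbf{v}$$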

Substitution  then  gives 


$$K = \sum_{j=1}^{n} \frac{m_j v^2}{2} + \sum_{j=1}^{n} \frac{m_j v_j'^2}{2} + \sum_{j=1}^{n} m_j \mathbf{v}'_j \cdot \mathbf{v} \tag{10-54}$$




The first term on the right side of this equation is just

$$\sum_{j=1}^{n} \frac{m_j v^2}{2} = \frac{Mv^2}{2} \tag{10-55a}$$

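where M is the total mass of the body. The third term can be evaluated by writing $\mathbf{v}'_j = d\mathbf{r}'_j/dt$, so as to obtain

$$\sum_{j=1}^{n} m_j \mathbf{v}'_j \cdot \mathbf{v} = \sum_{j=1}^{n} m_j \frac{d\mathbf{r}'_j}{dt} \cdot \mathbf{v}$$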


But all the $m_j$ are constants, so $m_j\, d\mathbf{r}'_j/dt = (d/dt)(m_j \mathbf{r}'_j)$. Furthermore, the sum of the derivatives equals the derivative of the sum, as is shown by Eq. (2-14). So the differentiation can be performed after the summation is carried out, instead of before. Thus this term can be written as


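$$\sum_{j=1}^{n} m_j \mathbf{v}'_j \cdot \mathbf{v} = \left[\frac{d}{dt}\left(\sum_{j=1}^{n} m_j \mathbf{r}'_j\right)\right] \cdot \mathbf{v}$$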


Now according to the definition of center of mass, the summation over j of $m_j \mathbf{r}'_j$ equals M times a vector extending from O' to the center of mass of the body. But since O' is at the center of mass, the vector is of zero magnitude. Hence the value of the summation on the right side of the last equation is zero, and so we have


$$\sum_{j=1}^{n} m_j \mathbf{v}'_j \cdot \mathbf{v} = 0 \tag{10-55b}$$


Using Eqs. (10-55a) and (10-55b) in Eq. (10-54), we obtain

$$K = \frac{Mv^2}{2} + \sum_{j=1}^{n} \frac{m_j v_j'^2}{2} \tag{10-56}$$


Thus  the  kinetic  energy  K of  the  body,  measured  in  the  reference  frame  of 
origin  O,  is  the  sum  of  a term  which  is  the  kinetic  energy  of  a single  particle, 
whose  mass  M is  the  total  mass  of  the  body  and  whose  speed  v is  the  speed 
of its center of mass, and a term which is the summation of the kinetic energies
of the particles of mass $m_j$ of the body due to their motion with
speeds $v'_j$ relative to the center of mass. This motion relative to the center of
mass  is  a result  of  the  rotation  of  the  body  about  its  center  of  mass.  So  we 
can  summarize  the  meaning  of  Eq.  (10-56)  by  saying  that:  the  kinetic  energy 
of  a body  is  the  kinetic  energy  due  to  the  motion  of  its  center  of  mass  plus  the  kinetic 
energy  due  to  its  rotation  about  the  center  of  mass. 

The  first  term  on  the  right  side  of  Eq.  (10-56)  is  the  kinetic  energy  a 
body  would  have  if  its  center  of  mass  were  moving  as  it  actually  does,  but 
without  any  rotation  of  the  body  about  its  center  of  mass,  so  that  the  motion 
of the body were purely translational. The second term is the kinetic energy
arising from the body's actual rotation about its center of mass. Thus
we  can  also  interpret  the  equation  by  saying  that  the  kinetic  energy  of  a body  is 
its  kinetic  energy  of  translation  plus  its  kinetic  energy  of  rotation. 
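The decomposition in Eq. (10-56) can be checked numerically. The sketch below is only an illustration under stated assumptions: the particle masses and velocities are random made-up values, and all variable names are hypothetical. It relies only on the fact used in the derivation, namely that the mass-weighted sum of the velocities relative to the center of mass vanishes.

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable

n = 50
# Assumed, arbitrary particle data: masses in kg, velocities in m/s.
masses = [random.uniform(0.1, 2.0) for _ in range(n)]
vels = [tuple(random.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(n)]

def speed_sq(v):
    """Squared magnitude of a 3-component velocity."""
    return sum(c * c for c in v)

M = sum(masses)
# Velocity of the center of mass: mass-weighted average of particle velocities.
v_cm = tuple(sum(m * v[k] for m, v in zip(masses, vels)) / M for k in range(3))

# Left side of Eq. (10-56): kinetic energy as seen from O.
K_total = sum(0.5 * m * speed_sq(v) for m, v in zip(masses, vels))

# Right side: center-of-mass term plus energy of motion relative to it.
K_cm = 0.5 * M * speed_sq(v_cm)
K_rel = sum(
    0.5 * m * speed_sq(tuple(v[k] - v_cm[k] for k in range(3)))
    for m, v in zip(masses, vels)
)

print(K_total, K_cm + K_rel)  # the two values should agree to rounding error
```

The agreement does not depend on the particular random values, because the cross term of Eq. (10-55b) is zero for any set of particles.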


A much more concise expression for the kinetic energy of rotation can be found by writing the speed $v'_j$ resulting from the rotation of the body about an axis passing through its center of mass as

$$v'_j = \omega R_j$$

Here $R_j$ is the radius of the circle on which the jth particle is instantaneously moving about the rotation axis, and ω is the angular speed of rotation about that axis (see Fig. 10-24). Squaring both sides of this equation gives

$$v_j'^2 = \omega^2 R_j^2$$


Therefore the second term in Eq. (10-56) can be written

$$\sum_{j=1}^{n} \frac{m_j v_j'^2}{2} = \sum_{j=1}^{n} \frac{m_j \omega^2 R_j^2}{2} = \frac{\omega^2}{2} \sum_{j=1}^{n} m_j R_j^2$$


According to Eq. (10-10), the summation on the far right is just the moment of inertia I of the body for rotation about the axis. So

$$\sum_{j=1}^{n} \frac{m_j v_j'^2}{2} = \frac{I\omega^2}{2} \tag{10-57}$$


This  allows  us  to  write  Eq.  (10-56)  in  the  simple  and  symmetrical  form 

$$K = \frac{Mv^2}{2} + \frac{I\omega^2}{2} \tag{10-58}$$


The second term on the right is the body's kinetic energy of rotation. The kinetic energy of rotation of a body is one-half its moment of inertia times the square of its angular speed. Equations (10-57) and (10-58) are valid even if the rotation axis passing through the body's center of mass is changing its orientation in the body, so that the value of I is changing. But the equations are most useful in circumstances (which, fortunately, are the most frequently encountered ones) where the rotation axis is fixed in the body so that I is constant.


The  work-kinetic  energy  relation,  derived  in  Chap.  7 for  translational 
motion,  applies  just  as  well  to  motion  that  is  rotational  or  a combination  of 
translational  and  rotational.  That  relation  says  that  the  work  W done  by  the  net 
force  acting  on  a body  during  a motion  viewed  from  an  inertial  reference  frame 
equals the change ΔK in the body's kinetic energy in that reference frame:

$$W = \Delta K \tag{10-59}$$

Equation (10-58) can be used to calculate ΔK.

The work W done by a force applied to a point in a rotating body often can be calculated more easily by considering not the force and the displacement of the point of application but instead the torque produced by the force about some origin and the rotation of the point of application about that origin. Figure 10-25 shows a body rotating through an infinitesimal angle dφ about an axis extending outward through the origin O. During the infinitesimal rotation, the direction of the force F applied to the point P is assumed to be in the plane of the page, at an angle θ to the displacement ds of P. The work dW done by the force is

$$dW = \mathbf{F} \cdot d\mathbf{s} = F \cos\theta \, ds$$

Since $\cos\theta = \sin(\pi/2 - \theta)$, and since $ds = r\,d\phi$ with r being the distance from O to P, this expression can be written as

$$dW = F \sin(\pi/2 - \theta)\, r\, d\phi$$


Fig. 10-25 A construction used to establish the relation $dW = T\,d\phi$.




The figure shows that $r \sin(\pi/2 - \theta) = d_a$, where $d_a$ is the perpendicular distance from O to the line of action of the force. Thus

$$dW = F d_a \, d\phi$$

But $F d_a = T$, where T is the magnitude of the torque about O produced by the force. Therefore we have

$$dW = T \, d\phi \tag{10-60}$$

In the case leading to Eq. (10-60), F is in the plane of the page, and the vector $\mathbf{T}$ describing the torque about O applied to the body is directed outward, as is the vector $d\boldsymbol{\phi}$ describing the infinitesimal rotation of the body. If the direction of F forms an angle α with the plane, then only its component along the plane does work in the displacement ds along the plane. This component has the value $F \cos\alpha$. Thus in the equations leading to Eq. (10-60), F will be replaced by $F \cos\alpha$, and Eq. (10-60) will become $dW = T \cos\alpha \, d\phi$. But when F is at an angle α to the plane, $\mathbf{T}$ is at an angle α to $d\boldsymbol{\phi}$. Hence $T \cos\alpha \, d\phi = \mathbf{T} \cdot d\boldsymbol{\phi}$, and Eq. (10-60) becomes

$$dW = \mathbf{T} \cdot d\boldsymbol{\phi} \tag{10-61}$$

The total work done in a finite rotation is

$$W = \int_{\phi_i}^{\phi_f} \mathbf{T} \cdot d\boldsymbol{\phi} \tag{10-62}$$

Equations (10-60) through (10-62) are valid not only when T is a single torque applied to a body because of the application of a single force, but also when T is the net torque applied to a body which experiences a rotation dφ. Note the complete analogy between these equations and the corresponding ones involving the net force F and the displacement ds.

Example 10-8 calculates the work done by a torque applied to a rotating body.


EXAMPLE  10-8 

Calculate  the  work  done  by  the  constant-torque  electric  motor  of  Example  10-4  in 
bringing the grinding wheel from rest to its operating speed. The wheel has uniform
density, radius 32 cm, mass 100 kg, and an operating speed of 200 rotations
per  minute  = 21  rad/s.  Also  calculate  the  average  power  produced  by  the  motor  in 
bringing  the  wheel  up  to  speed  in  3.0  s. 

■ The easiest way to calculate the work done is to use the work-kinetic energy relation of Eq. (10-59),

$$W = \Delta K$$

Since the initial kinetic energy of the wheel is zero, its final kinetic energy K, when rotating at its operating speed, equals the change in kinetic energy ΔK. Hence you have

$$W = K$$

According to Eq. (10-58), K is given by

$$K = \frac{I\omega^2}{2}$$


where I is the wheel's moment of inertia for rotation about its fixed axle and ω is the operating value of its angular speed. In Example 10-4, the moment of inertia was found to have the value

$$I = \frac{MR^2}{2}$$

where M is the mass of the wheel and R is its radius. Putting it all together, you obtain

$$K = \frac{I\omega^2}{2} = \frac{MR^2\omega^2}{4}$$


The numerical value of the work done by the motor is

$$W = \frac{100\ \text{kg} \times (0.32\ \text{m})^2 \times (21\ \text{s}^{-1})^2}{4} = 1.1 \times 10^3\ \text{J}$$


Calculate  this  value  by  applying  Eq.  (10-62),  and  compare  the  result  obtained  with 
the  one  obtained  here. 

A constant-torque  electric  motor  does  not  produce  constant  power.  Why?  The 
average  power  produced  by  this  motor  in  the  3.0-s  interval  while  it  is  bringing  the 
wheel  up  to  speed  is 


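$$P = \frac{1.1 \times 10^3\ \text{J}}{3.0\ \text{s}} = 3.8 \times 10^2\ \text{W}$$

or

$$P = 3.8 \times 10^2\ \text{W} \times \frac{1\ \text{hp}}{746\ \text{W}} = 0.50\ \text{hp}$$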
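The comparison suggested above can be sketched numerically. The following is a minimal illustration, assuming that the torque is constant so that the angular acceleration is ω/t and the angle turned is αt²/2; the variable names are hypothetical. It uses the exact conversion of 200 rotations per minute rather than the rounded 21 rad/s, so the last digits differ slightly from the hand calculation.

```python
import math

# Data of Example 10-8
M = 100.0                              # mass of the wheel, kg
R = 0.32                               # radius of the wheel, m
omega = 200.0 * 2.0 * math.pi / 60.0   # 200 rotations per minute, in rad/s
t = 3.0                                # time to reach operating speed, s

I = 0.5 * M * R**2                     # uniform disk about its axle

# Route 1: work-kinetic energy relation, W = I omega^2 / 2
W_energy = 0.5 * I * omega**2

# Route 2: Eq. (10-62) with a constant torque, W = T * (angle turned)
alpha = omega / t                      # constant angular acceleration
torque = I * alpha                     # T = I alpha
phi = 0.5 * alpha * t**2               # angle turned starting from rest
W_torque = torque * phi

P_avg = W_energy / t                   # average power, W
print(W_energy, W_torque)              # both about 1.1e3 J
print(P_avg, P_avg / 746.0)            # watts and horsepower
```

The two routes agree because, for constant torque, $T\,\Delta\phi = I\alpha \cdot \alpha t^2/2 = I\omega^2/2$.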


The law of conservation of total mechanical energy applies to a body whose motion is completely or partly rotational, just as it does to a body whose motion is completely translational. That is, if the body is in an isolated system with workless constraints and conservative internal forces, and if its motion is observed from an inertial frame, then

E = K + U 

Here  E is  the  constant  total  mechanical  energy  of  the  system,  K is  the 
kinetic  energy  of  the  body,  and  U is  the  potential  energy  associated  with  the 
forces  internal  to  the  system.  Using  Eq.  (10-58)  to  evaluate  K , we  have 

$$E = \frac{Mv^2}{2} + \frac{I\omega^2}{2} + U \tag{10-63}$$

If U represents the gravitational potential energy of a system containing
a body of mass M near the surface of the earth, then Eq. (9-54)
shows  that 


U = Mgy  (10-64) 

In  this  expression  y is  used  to  represent  the  height  of  the  body’s  center  of 
mass  above  the  reference  height  y = 0 where  U = 0. 

Example  10-9  applies  the  energy  relations  of  Eqs.  (10-63)  and  (10-64) 
to solve a problem previously solved by applying the translational and rotational
forms of Newton's second law. It should make the advantages of
using  energy  relations  apparent  to  you. 




EXAMPLE  10-9 


Use  energy  relations  to  repeat  the  calculation  of  Example  10-7,  in  which  a wheel 
in the form of a uniform, solid disk of radius R rolls without slipping down an inclined
plane, starting from rest at a height y above the bottom. Determine the
speed  v of  its  center  of  mass  when  it  reaches  the  bottom. 

■ You take as an isolated system the wheel plus the earth. The inclined plane on which the wheel rolls acts as a workless constraint. Rolling is workless because even though a frictional force $C_s$ must be exerted on the wheel by the plane to make it roll rather than slide, the force is applied to the instantaneous point of contact of the wheel with the plane, and that point is always instantaneously at rest. So the frictional force does no work because it does not act through a distance. The same is true for the normal force N exerted by the plane on the wheel. Since the gravitational force Mg acting internal to the system is conservative, you can apply Eqs. (10-63) and (10-64).

At  the  instant  when  the  wheel  is  released,  you  have 

$$E = \frac{Mv^2}{2} + \frac{I\omega^2}{2} + U = 0 + 0 + Mgy$$


where  y is  the  initial  height  of  the  center  of  mass. 

At  the  bottom  of  the  inclined  plane,  where  y = 0,  you  have 


$$E = \frac{Mv^2}{2} + \frac{I\omega^2}{2}$$

Equating the two expressions for the constant total energy E gives

$$\frac{Mv^2}{2} + \frac{I\omega^2}{2} = Mgy$$


Now,  you  can  write 

$$I = MG^2$$

where G is the gyration radius of the rotating body. So you can write

$$\frac{Mv^2}{2} + \frac{MG^2\omega^2}{2} = Mgy$$

or

$$\frac{v^2}{2} + \frac{G^2\omega^2}{2} = gy \tag{10-65}$$


Using the value of $G^2 = \tfrac{1}{2}R^2$ for a uniform, disk-shaped wheel of radius R from Table 10-1, you obtain

$$\frac{v^2}{2} + \frac{R^2\omega^2}{4} = gy$$

According to Eq. (10-47), the relation between the angular speed ω of the rolling wheel and the speed v of its center of mass is

$$\omega = \frac{v}{R}$$


Substitution into the equation immediately above gives

$$\frac{v^2}{2} + \frac{v^2}{4} = gy$$

or

$$\frac{3v^2}{4} = gy$$




and

$$v = \sqrt{\frac{4gy}{3}} \tag{10-66}$$

This  result  is  identical  to  the  one  expressed  in  Eq.  (10-52)  of  Example  10-7. 

Comparison with Example 10-7 will show that it is appreciably easier to evaluate the final speed by considering the energy content of the system than by considering the forces and torques acting in the system. However, there are always interesting questions to be asked about a system which cannot be answered by using only energy relations. For instance, you may want to determine the ratio $C_s/N$ in order to verify that it is less than the coefficient of static friction between the wheel and the plane, so that it is actually possible for the wheel to roll without slipping. To do so, you must deal with forces and torques, not energies.

The energy relations certainly do provide a simple explanation of why the wheel rolls more slowly down an inclined plane than a block slides without friction down a plane inclined at the same angle. In both cases the lost gravitational potential energy appears as kinetic energy. If the body does not rotate about its center of mass, then all the lost potential energy appears in the form of kinetic energy of motion of the center of mass. But if the body rotates about the center of mass, then some of the lost potential energy appears as rotational kinetic energy, and so there is less available to appear as kinetic energy of motion of the center of mass. Hence the center of mass of the body moves more slowly down the plane.

You can see from Eq. (10-65) that the mass M of the rolling body does not affect
the rapidity of its motion. But the gyration radius G does. In other words, what
counts  is  not  the  mass,  but  how  the  mass  is  distributed  about  the  rotation  axis.  If  a 
hoop  of  the  same  mass  and  radius  as  the  solid  disk  were  started  down  the  inclined 
plane  at  the  same  time  as  the  disk,  would  it  arrive  at  the  bottom  at  the  same  time? 
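The dependence on the gyration radius can be explored numerically. The sketch below is an illustration only: it combines Eq. (10-65) with $\omega = v/R$ to get $v = \sqrt{2gy/(1 + G^2/R^2)}$, and evaluates it for a few shapes. The function name, the drop height of 1.0 m, and g = 9.8 m/s² are assumptions; the ratio $G^2/R^2 = 1$ for a hoop follows from $I = MR^2$.

```python
import math

def speed_at_bottom(y, gyration_ratio_sq, g=9.8):
    """Speed of the center of mass after dropping a height y from rest.

    Obtained from Eq. (10-65) with omega = v/R:
        v^2/2 * (1 + G^2/R^2) = g*y
    gyration_ratio_sq is G^2/R^2 (zero means no rotation at all).
    """
    return math.sqrt(2.0 * g * y / (1.0 + gyration_ratio_sq))

y = 1.0  # assumed drop height in meters

shapes = {
    "sliding block": 0.0,   # no rotational kinetic energy
    "uniform disk": 0.5,    # G^2 = R^2/2, Table 10-1
    "hoop": 1.0,            # all mass at radius R, so G^2 = R^2
}

speeds = {name: speed_at_bottom(y, ratio) for name, ratio in shapes.items()}
for name, v in speeds.items():
    print(f"{name:14s} v = {v:.2f} m/s")
```

Neither M nor R appears as an argument; only the ratio $G^2/R^2$ matters, which is the point made in the paragraph above.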


Table 10-2 lists some of the most important equations of rotational mechanics developed or extended since Table 9-1 was presented, and it also shows the analogous equations from translational mechanics.


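Table 10-2

More Equations Used in Rotational and Translational Mechanics

| Rotational motion | Translational motion |
|---|---|
| $\mathbf{T} = d\mathbf{L}/dt$ | $\mathbf{F} = d\mathbf{P}/dt$ |
| $L = I\omega$ | $P = Mv$ |
| $T = I\alpha$ | $F = Ma$ |
| $I = \sum_{j=1}^{n} m_j R_j^2$ | $M = \sum_{j=1}^{n} m_j$ |
| $W = \int_{\phi_i}^{\phi_f} \mathbf{T} \cdot d\boldsymbol{\phi}$ | $W = \int \mathbf{F} \cdot d\mathbf{s}$ |
| $T(\theta) = -\dfrac{dU(\theta)}{d\theta}$ | $F(x) = -\dfrac{dU(x)}{dx}$ |
| $K = \dfrac{I\omega^2}{2}$ | $K = \dfrac{Mv^2}{2}$ |

Combined motion

$$\mathbf{L} = \mathbf{L}_{\text{about CM}} + \mathbf{L}_{\text{of CM}}$$

$$E = \frac{Mv^2}{2} + \frac{I\omega^2}{2} + U$$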



EXERCISES 

Group  A 

10-1.  Circle  and  square.  A hoop  of  radius  R and  a 
square  frame  of  side  2 R each  have  mass  M. 

a.  Calculate  the  moment  of  inertia  of  each  object 
about  an  axis  through  the  center  of  mass  normal  to  the 
plane  in  which  it  lies. 

b.  Why  do  the  moments  of  inertia  differ? 

10-2. Moments of inertia: rectangular solid. The homogeneous
rectangular block shown in Fig. 10E-2 has mass
M.  Find  the  moment  of  inertia  of  the  block  about  each  of 
the  following  axes: 


Fig.  10E-2 


a.  The  axis  RR' , which  passes  through  the  center  of 
two  edges  of  length  B. 

b.  The  axis  SS',  which  coincides  with  an  edge  of 
length  C. 

10-3.  Flywheel-powered  cars,  I.  A 200-kg  flywheel  1.00 
m in  diameter  is  rotating  at  6000  rotations  per  minute. 
If  its  kinetic  energy  were  used  to  propel  a car  that  requires 
an  average  propulsive  force  of  500  N,  how  far  could  the 
car  travel  before  the  flywheel  stopped?  For  simplicity, 
assume  that  the  entire  mass  is  concentrated  in  the  rim  of 
the  flywheel. 

10-4.  Rolling  along.  A uniform  sphere  of  mass  M and 
radius  R rolls  along  a floor.  Its  center  of  mass  travels  with 
speed  vc. 

a.  What  is  the  total  energy  of  its  motion? 

b. Evaluate your result for M = 10 kg and $v_c$ = 5 m/s.

c. Explain why it is not necessary to know the sphere's radius and angular speed, as long as its translational speed is given.

10-5.  A design  problem.  How  would  you  construct  an 
object  of  specified  mass  M,  having  symmetry  about  its  axis 
of rotation, so that it would roll down a given inclined
plane  as  rapidly  as  possible?  How  would  you  construct  it 
so  that  it  would  roll  down  the  inclined  plane  as  slowly  as 
possible? 

10-6. Effective gravity and the precession of a top. A space-age child is among the passengers aboard a space bus that is about to take off from a launch pad on the earth. He is playing with a toy top, which spins with angular velocity $\omega_s$ and is supported on a smooth, horizontal table. The top is inclined to the vertical, so that it precesses steadily at angular velocity $\omega_p \ll \omega_s$. Describe what happens to the precession velocity of the top during launch, when the space bus gradually develops an upward acceleration equal to 3g. Assume that the time required for the vehicle to reach full acceleration is much longer than one precessional period, and very much longer than one spin period.


Group  B 

10-7.  Moment  of  inertia  of  a bar.  Find  the  moment  of 
inertia  for  the  bar  in  Fig.  10-7  when  it  rotates  about  an 
axis  that  passes  through  one  end  and  is  aligned  parallel  to 
the  direction  along  which  the  width  w is  measured.  Do  this 
by  direct  integration,  as  in  Example  10-1.  Then  do  it  again 
by applying the parallel-axis theorem to the moment of inertia
obtained in the example for rotation about an axis
passing  through  the  center  of  the  bar.  Compare  your  two 
results. 

10-8.  A broken  wheel.  A chip  of  mass  4.0  kg  is  broken 
from  the  rim  of  the  grinding  wheel  in  Example  10-4. 
Where  must  the  origin  O be  located  so  that  the  equation 
$T = I\alpha$ still applies to the asymmetrical wheel? Using that
origin,  recalculate  the  magnitude  of  the  torque  that  the 
motor  must  apply  to  bring  the  wheel  from  rest  to  200 
rotations  per  minute  in  3.0  s. 

10-9. The other way. Construct a figure like Fig. 10-6, but with the direction of the angular velocity ω reversed. Use it to make a calculation like the one in the text leading to Eq. (10-7), $L_n = MR^2\omega$.

10-10.  Moments  of  inertia  of  a hoop.  A flat  hoop  of  mass 
M and  radius  R is  shown  in  Fig.  10E-10.  The  hoop  lies  in 
the  xy  plane  and  is  centered  at  the  origin  O.  In  the  figure, 
the z axis rises from O directly toward the viewer. Consider the mass element dm. It contributes an amount $dI_z = R^2\,dm$ to the moment of inertia about the z axis. The element dm also contributes an amount $dI_x = y^2\,dm$ to the moment of inertia about the x axis; similarly, $dI_y = x^2\,dm$.

Fig. 10E-10

a. Find an equation relating $dI_x$, $dI_y$, and $dI_z$.

b. Find an equation relating $I_x$, $I_y$, and $I_z$.

c. Use symmetry considerations to obtain an equation relating $I_x$ and $I_y$.

d. Determine $I_x$, $I_y$, and $I_z$ for the hoop. Express your results in terms of M and R.

e.  Determine  the  moment  of  inertia  of  the  hoop  about 
an  axis  that  passes  through  point  A and  runs  parallel  to 
the  x axis. 

10-11.  A pivoted  rod.  A slender,  uniform  rod  of  mass 
m and  length  l is  pivoted  at  one  end  so  that  it  can  rotate  in 
a vertical  plane.  There  is  negligible  friction  at  the  pivot. 
The  free  end  is  held  almost  vertically  above  the  pivot  and 
then  released. 

a. What is the rod's angular acceleration when it makes an angle θ with the vertical?

b. What is the magnitude of the translational acceleration of the free end of the rod for this angle?

c. What is the vertical component of this translational acceleration?

d. For what value of θ does the vertical component of the translational acceleration become equal to g?

10-12. The perpendicular-axis theorem. Generalize the arguments involved in Exercise 10-10a and c to prove the following theorem: Any flat object in the xy plane has a moment of inertia $I_z$ about the z axis which is given by $I_z = I_x + I_y$, where $I_x$ and $I_y$ are the object's moments of inertia about the x axis and the y axis, respectively.

10-13.  A swinging  ring. 

a. A ring of mass M and radius R is hung from a knife-edge, so that the ring can swing in its own plane as a physical pendulum. Find the period $T_1$ of small oscillations.

b. Suppose that an identical ring is pivoted from an axis PP' lying in the ring plane and tangent to the circumference. This ring can execute oscillations in and out of the plane. Find the period $T_2$ of those small oscillations.

c.  Which  oscillation  has  the  longer  period?  How 
much  longer? 

10-14.  Moments  of  inertia  of  a thin  disk.  Employ  the 
perpendicular-axis  theorem  of  Exercise  10-12  to  relate 
the  moment  of  inertia  of  a thin,  circular  disk  (about  any 
diameter)  to  its  moment  of  inertia  about  its  central  axis. 
Check your results against Table 10-1 by using the tabulated
results for solid cylinders, in the limiting case that
the  length  of  the  cylinder  approaches  zero. 

10-15.  A gaucho’s  life.  Two  bodies,  one  of  mass  1.0  kg 
and  the  other  of  2.0  kg,  are  tied  to  the  ends  of  a wire  60  cm 
long.  The  1.0-kg  body  is  held  in  the  hand,  and  the  2.0-kg 
body  is  whirled  in  a horizontal  circle  at  2.0  rotations  per 
second.  Then  the  1.0-kg  body  is  released,  sending  the  pair 
whirling  horizontally  through  the  air.  What  is  the  angular 
velocity  of  rotation  of  the  system  just  after  the  release? 


10-16.  No  slipping,  please.  Starting  from  rest,  a sphere 
rolls  down  a 30°  incline.  What  is  the  minimum  value  of  the 
coefficient  of  static  friction  if  there  is  to  be  no  slipping? 

10-17. Descent of a Yo-Yo. Consider a Yo-Yo with outside radius R equal to 10 times its spool radius r. The moment of inertia $I_c$ of the Yo-Yo about its spool is given with good accuracy by $I_c = \tfrac{1}{2}MR^2$, where M is the total mass of the Yo-Yo. The upper end of the string is held motionless.

a.  Compute  the  acceleration  of  the  center  of  mass  of 
the  Yo-Yo.  How  does  it  compare  with  g? 

b.  Find  the  tension  in  the  string  as  the  Yo-Yo 
descends.  How  does  it  compare  with  Mg? 

10-18.  Turning  the  wheel.  A heavy  wheel  of  radius  20 
cm is mounted on a horizontal axle. A rope wrapped
around  its  rim  is  pulled  straight  downward  with  a constant 
force  of  50  N.  The  rope  moves  a distance  of  50  cm  in  1.0  s. 

a.  What  is  the  angular  acceleration  of  the  wheel? 

b.  What  is  the  moment  of  inertia  of  the  wheel? 

c. Suppose that an object whose weight is 50 N is attached
to the rope, and the system is released from rest.
What  would  the  angular  acceleration  of  the  wheel  be  in 
that  case? 

d.  Account  for  the  difference  between  the  results  of 
parts  a and  c. 

e.  The  wheel  is  a homogeneous,  solid  disk.  What  is  its 
mass? 

10-19.  Cylinder  versus  pipe.  A solid  cylinder  and  a 
thin-walled  pipe  are  simultaneously  released  from  rest  at 
the upper end of a ramp of inclination θ. Each object rolls
without  slipping. 

a.  Find  the  acceleration  of  the  center  of  mass  of  the 
solid  cylinder. 

b.  Find  the  acceleration  of  the  center  of  mass  of  the 
pipe. 

c. When the cylinder has rolled a distance $s_c$, how far
has the pipe rolled?

10-20.  From  pure  sliding  to  pure  rolling.  An  upright 
hoop is projected onto a pavement with an initial horizontal
speed $v_0$ but without spin, so that it slides. The resulting
frictional force causes the hoop to lose translational
speed  and  to  acquire  an  angular  speed.  Eventually  the 
hoop  rolls  without  slipping.  Prove  that  when  the  hoop 
ceases  to  slip,  it  has  speed  v0/2. 

10-21. Oscillations of an angle iron. A uniform, right-angle iron is hung over a thin nail so that the iron pivots freely at the bend. Each arm of the iron has mass m and length l. Show that the period T of small oscillations (in the plane of the iron) is given by $T = 2\pi\sqrt{2\sqrt{2}\,l/3g}$.

10-22.  A family  of  physical  pendulums,  I.  Figure  10E-22 
represents  a three-dimensional  object  (not  necessarily  of 
uniform  density)  whose  center  of  mass  is  at  point  C.  The 
axis  ZCZ'  passing  through  point  C has  been  chosen  at 
random  and  is  not  necessarily  one  of  the  “principal”  axes 






Fig.  10E-22 


Fig.  10E-24 


mentioned in the text. The body has gyration radius $G_C(ZZ')$ about the axis ZCZ'. The notation $G_C(ZZ')$ makes explicit the fact that the gyration radius for an axis through C depends on the particular choice (ZZ') of axis.

Suppose that a physical pendulum is constructed by pivoting the body about the axis PP', which is parallel to ZCZ' at a distance D. Prove that the frequency ν of small oscillations about equilibrium is given by

$$\nu = \frac{1}{2\pi}\sqrt{\frac{Dg}{G_C^2(ZZ') + D^2}}$$

This  shows  that  the  frequency  of  pendulum  oscillations  is 
the  same  for  any  choice  of  axis  PP'  on  a cylinder  of  radius 
D centered  on  ZZ' . 

10-23.  A family  of  physical  pendulums,  II.  Figure  10E-23 
represents  any  thin  and  flat  rigid  object  (not  necessarily  of 
uniform density). The center of mass is at C, and the indicated circle is centered at C. Imagine that the object is pivoted about an axis which (1) passes through some point P on the circle and (2) is perpendicular to the plane of the object. The object is allowed to move in a vertical plane about this axis, forming a physical pendulum. Prove that small oscillations about equilibrium have the same frequency for any choice of P on the indicated circle.


[Fig. 10E-23]


10-24. The Kater pendulum. Figure 10E-24 depicts a
special kind of physical pendulum invented in 1817 by the
British geodesist Captain Henry Kater. It is used as
described in this exercise to measure the acceleration of
gravity with high accuracy. (Until recently, it was the most
accurate method available.) It consists of a rigid rod on
which a bob is mounted. The mass of the bob is sufficiently
large so that the center of mass of the pendulum is
fairly far from the middle of the rod. The bob is mounted
on a slide; by moving the bob and then clamping it in position,
the location of the center of mass of the pendulum
can be adjusted. There are two very precisely made
knife-edges mounted on the rod. The pendulum can be
swung from knife-edge 1, and then reversed and swung
from knife-edge 2.

[Fig. 10E-24. Labels: knife-edge 1, center of mass, knife-edge 2; distances $D_1$ and $D_1 + D_2$.]

a. Let $G_0$ be the radius of gyration of the pendulum
about an axis through its center of mass. Show that if
knife-edge 1 and knife-edge 2 are located respectively at
distances $D_1$ and $D_2$ from the center of mass, the radii of
gyration $G_1$ and $G_2$ about the knife-edges satisfy the equations
$G_1^2 = G_0^2 + D_1^2$ and $G_2^2 = G_0^2 + D_2^2$.

b. Beginning with Eq. (10-23), express $T_1$, the period
of small oscillations of the pendulum about knife-edge 1,
in terms of $G_0$, $D_1$, and the acceleration of gravity g. Likewise,
express $T_2$, the period for knife-edge 2, in terms of
$G_0$, $D_2$, and g.

c. Using the expressions obtained in part b, show that
$(T_1^2 + T_2^2)/(D_1 + D_2) = (4\pi^2/g)(G_0^2/D_1D_2 + 1)$, and that
$(T_1^2 - T_2^2)/(D_1 - D_2) = -(4\pi^2/g)(G_0^2/D_1D_2 - 1)$.

d. Using the results of part c, show that

$$\frac{8\pi^2}{g} = \frac{T_1^2 + T_2^2}{D_1 + D_2} + \frac{T_1^2 - T_2^2}{D_1 - D_2}$$

e. By counting pendulum swings over an accurately
measured total time of many hours, the values of $T_1$ and
$T_2$ can be determined with great accuracy. Similarly, the
distance $D_1 + D_2$ between the knife-edges can be measured
very accurately by means of a traveling microscope
or a laser interferometer. But evaluation of the quantity
$D_1 - D_2$ depends on separate measurements of the distances
$D_1$ and $D_2$ between the knife-edges and the center
of mass. It is not possible to determine the position of the
center of mass with great accuracy. However, by moving
the pendulum bob, the oscillation periods $T_1$ and $T_2$ can be
adjusted. How would you adjust them to maximize the
accuracy of the value of g obtained by using the result of
part d?
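
The relation in part d can be checked numerically. The Python sketch below is an illustration only and is not part of the original exercise: the function names and the values chosen for G0, D1, D2, and g are assumptions. It builds the two periods from the results of parts a and b, recovers g from the part-d relation, and then repeats the calculation with the center of mass mislocated by 5 mm while the knife-edge separation D1 + D2 is held fixed.

```python
import math

def kater_periods(G0, D1, D2, g):
    """Periods about knife-edges 1 and 2, using G_i^2 = G0^2 + D_i^2."""
    T1 = 2 * math.pi * math.sqrt((G0**2 + D1**2) / (g * D1))
    T2 = 2 * math.pi * math.sqrt((G0**2 + D2**2) / (g * D2))
    return T1, T2

def g_from_kater(T1, T2, D1, D2):
    """Solve 8*pi^2/g = (T1^2+T2^2)/(D1+D2) + (T1^2-T2^2)/(D1-D2) for g."""
    s = (T1**2 + T2**2) / (D1 + D2) + (T1**2 - T2**2) / (D1 - D2)
    return 8 * math.pi**2 / s

# Illustrative (assumed) values in SI units.
G0, D1, D2, g_true = 0.30, 0.60, 0.40, 9.80
T1, T2 = kater_periods(G0, D1, D2, g_true)

# Exact distances: the part-d relation returns the input value of g.
g_est = g_from_kater(T1, T2, D1, D2)

# Center of mass mislocated by 5 mm; D1 + D2 is unchanged.
g_shifted = g_from_kater(T1, T2, D1 + 0.005, D2 - 0.005)

print(g_est, g_shifted)
```

Trying other values of G0 shows how the size of the error in g depends on the difference between the two periods, which bears directly on the question asked in part e.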


10-25.  Taking  the  cue.  How  far  above  its  center  should 
a billiard  ball  be  struck  in  order  to  make  it  roll  without  any 
initial  slippage?  Denote  the  ball’s  radius  by  R,  and  assume 
that  the  impulse  delivered  by  the  cue  is  purely  horizontal. 


10-26. A difficult way to hold up a cylinder. A string is
wound around an otherwise unsupported homogeneous,
horizontal cylinder of mass M and radius R. As the
string unwinds and the cylinder spins, the end of the
string is continually pulled vertically upward with a force
just sufficient to keep the cylinder from descending relative
to the ground.

a.  What  is  the  tension  in  the  vertical  portion  of  the 
string? 

b.  What  is  the  angular  acceleration  of  the  cylinder? 

c.  What  is  the  upward  acceleration  of  any  given  point 
along  the  vertical  portion  of  the  string? 


10-27. How much friction is needed for pure rolling? A
spool of mass M is resting on a horizontal surface. The
spool has moment of inertia $MG_C^2$ about its axis of symmetry.
The spool is subjected to a rightward horizontal
force of magnitude F, applied at a distance r above the
axis.

a. Show that if there is to be no slippage between the
spool and the supporting surface, a leftward frictional
force $f = F(G_C^2 - rR)/(G_C^2 + R^2)$ must act on the spool.

b. Show that the required frictional force f has the
value zero for a particular value $r_0$ of the distance r.

c. Interpret the result of part a when r exceeds the
value $r_0$ found in part b.

d. Show that the rightward translational acceleration
$a_C$ of the center of mass of the spool exceeds F/M
when $r > r_0$. Explain how this can happen.

10-28. Karate chop. A uniform, slender rod 1.00 m
long is initially at rest on a smooth, horizontal surface. It is
struck a sharp horizontal blow at one end, with the blow
directed at right angles to the rod axis. As a result, the rod
acquires an angular velocity of 3.00 rad/s.

a. What is the translational velocity of the center of
mass of the rod after the blow?

b. Which point on the rod is stationary just after the
blow? What name is given to the distance between this
point and the end of the rod?


10-29. Unwinding. Consider the system shown in Fig.
10E-29. The cylinder on the 30° incline has mass M and
radius R. A string, which is wound around the cylinder,
runs parallel to the incline and then passes over a pulley of
negligible mass to a hanging body of mass m. The tension
in the string and the kinetic frictional force exerted on the
cylinder by the incline are just sufficient to keep the cylinder
from descending the incline as it turns, while the
string unwinds and the body of mass m descends with acceleration
a. The coefficient of kinetic friction $\mu_k$ between the
incline and the cylinder is given by $\mu_k = 0.25$. Determine
the acceleration a of the hanging body and the mass ratio
M/m.

[Fig. 10E-29]

10-30. The motion of a system’s center of mass. Consider a
system composed of a 1-kg body and a 2-kg body initially
at rest at a center-to-center distance of 1 m. All numerical
values quoted in this exercise are to be considered
exact.


a.  How  far  is  the  system’s  center  of  mass  from  the 
center  of  the  1-kg  body? 

b.  Beginning  at  t = 0 s,  a net  rightward  force  of  2 N 
acts  on  the  2-kg  body.  What  is  its  resultant  acceleration? 

c.  How  far  does  the  2-kg  body  move  between  t = 0 s 
and  t = 1 s? 

d.  How  far  is  the  center  of  mass  from  the  1-kg  body 
at  t = 1 s? 


e.  How  far  did  the  center  of  mass  move  between  t = 
0 s and  t = 1 s? 

f.  What  is  the  acceleration  of  the  center  of  mass, 
beginning  at  t = 0 s? 

g.  Suppose  that  all  the  mass  in  both  objects  was  con- 
centrated at  the  center  of  mass,  and  the  net  rightward 
force  of  2 N acted  on  this  concentrated  mass.  What  would 
its  acceleration  be? 

h. State the general theorem which is illustrated by
the results of parts f and g.
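
The bookkeeping in parts a through g can be laid out in a few lines of Python. This is a sketch under stated assumptions and is not part of the original exercise: the variable names are illustrative, the 1-kg body is placed at x = 0 with the 2-kg body to its right, and the 1-kg body is assumed to feel no net force, so it stays at rest.

```python
# Assumed setup: 1-kg body at x = 0 m, 2-kg body at x = 1 m, both initially at rest.
m1, m2 = 1.0, 2.0          # masses, kg
x1, x2 = 0.0, 1.0          # initial positions, m
F = 2.0                    # net rightward force on the 2-kg body, N
t = 1.0                    # elapsed time, s

def center_of_mass(xa, xb):
    """Position of the center of mass for bodies m1 at xa and m2 at xb."""
    return (m1 * xa + m2 * xb) / (m1 + m2)

a2 = F / m2                         # acceleration of the 2-kg body
x2_t = x2 + 0.5 * a2 * t**2         # its position at time t (starts from rest)

cm_0 = center_of_mass(x1, x2)       # center of mass at t = 0
cm_t = center_of_mass(x1, x2_t)     # center of mass at time t

# Constant acceleration from rest: displacement = (1/2) a t^2.
a_cm = 2 * (cm_t - cm_0) / t**2

# Acceleration if all the mass were concentrated at the center of mass.
a_lumped = F / (m1 + m2)

print(a_cm, a_lumped)
```

The two printed accelerations agree, which is the content of the theorem requested in part h.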


10-31. Work to be done. Calculate the work done by the
electric motor in Example 10-8 by applying Eq. (10-61),
$W = \int_{\phi_i}^{\phi_f} \boldsymbol{\tau} \cdot d\boldsymbol{\phi}$.
Compare your results with those obtained in the example.


10-32. The three-dumbbell experiment? In a common
physics lecture demonstration, a lecturer sits on a stool
that can rotate freely about a vertical axis on low-friction
bearings. The lecturer holds with extended arms two
dumbbells, each of mass m, and kicks the floor so as to
achieve an initial angular speed $\omega_1$. The lecturer then
pulls in the dumbbells, so that their distances from the rotation
axis decrease from the initial value $R_1$ to the final
value $R_2$. Determine the final angular speed $\omega_2$, assuming
that the moment of inertia about the rotation axis of the
lecturer’s body plus the stool does not change in the
process. Then evaluate $K_1$ and $K_2$, the initial and final
kinetic energies in the system. What is the source of the
additional kinetic energy?


10-33. Toy top. A top consists of a uniform disk of
mass $m_0$ and radius $r_0$ rigidly attached to an axial rod of
negligible mass. The top is placed on a smooth table and set
spinning about its axis of symmetry with angular speed $\omega_s$.

a. How much work must be done in setting the top
spinning? Evaluate your result for $m_0 = 0.050$ kg, $r_0 =$ [illegible]
cm, and $\omega_s =$ [illegible] rad/s (or [illegible] rotations per minute).

b. The center of the disk is a distance d from the top’s
point of contact with the table. The top is observed to
precess steadily about the vertical axis with angular speed
$\omega_p$. Assuming that $\omega_p \ll \omega_s$, use Eq. (10-36) to write $\omega_p$
in terms of $r_0$, d, $\omega_s$, and g. Evaluate $\omega_p$ for d = [illegible] m and
g = 9.80 m/s², with the other quantities as given in part a.
Is your result consistent with the assumption $\omega_p \ll \omega_s$?

10-34. A loaded disk. A disk of mass M and radius R is
supported vertically by a pivot at its center. As shown in
Fig. 10E-34, a small, dense object (also of mass M) is attached
to the rim and raised to the highest point above the
center. The system is then released. What is the angular
speed of the system when the attached object passes
directly beneath the pivot?

[Fig. 10E-34. Label: mass M.]

10-35. Looping the loop. A sphere of mass m and radius
r rolls down an incline and acquires enough speed to loop
a loop of radius R, as shown in Fig. 10E-35. The sphere
rolls without slipping throughout its motion.

a. Find the minimum elevation h of the starting
point, measured from the top of the loop. Be careful to
take into account the rotational kinetic energy of the
sphere.

b. Show that the minimum elevation is negative for
r > [illegible] R. Explain how this can be correct.

[Fig. 10E-35]

10-36. The end of the leash. A horizontal, homogeneous
cylinder of mass M and radius R is pivoted about its
axis of symmetry. As shown in Fig. 10E-36a, a string is
wrapped several times around the cylinder and tied to a
body of mass m resting on a support positioned so that the
string has no slack. The body m is carefully lifted vertically
a distance h, and the support is then removed, as shown in
Fig. 10E-36b.

a. Just before the string becomes taut, evaluate (1) the
angular velocity $\omega_0$ of the cylinder, (2) the speed $v_0$ of the
falling body m, (3) the kinetic energy $K_0$ of the system.

b. Evaluate the corresponding quantities, $\omega_1$, $v_1$, and
$K_1$, for the instant just after the string becomes taut.

c. Why is $K_1$ less than $K_0$? Where does the energy go?

d. If M = m, what fraction of the kinetic energy is
lost when the string becomes taut?

[Fig. 10E-36, panels (a) and (b)]


10-37. A hollow sphere. A thick-walled hollow sphere
has outside radius $R_0$. It rolls down an incline without slipping,
and its speed at the bottom is $v_0$. Now the incline is
waxed, so that it is practically frictionless, and the sphere
is observed to slide down (without rolling). Its speed at the
bottom is observed to be $5v_0/4$.

a. Determine the radius of gyration of the hollow
sphere about an axis through its center.

b. The central hollow has zero density and unknown
radius $R_i$. For $R_i < r < R_0$, the density is uniform. Determine
$R_i/R_0$. Compare the volume of the cavity to the total
volume $4\pi R_0^3/3$.

10-38. Flywheel-powered cars, II. It has been proposed
that the kinetic energy stored in a flywheel be used to
propel an automobile. (A small electric motor located
where the automobile is parked overnight could be used
to spin up the flywheel each night and thus make up for
the energy used during the day.) Calculate the energy
content in a cylindrical flywheel of uniform density, radius
0.50 m, mass 200 kg, and angular speed 20,000 rotations
per minute. (This angular speed is near the limit at which
a steel flywheel would break apart.) With the flywheel
installed, an automobile has a total mass of 1000 kg. When
the car is traveling on a level road at a speed of 100 km/h,
the total frictional force acting on it has a magnitude equal
to 10 percent of the weight of the automobile. Calculate
how far it could travel before using up all the energy stored
in the flywheel.
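
The arithmetic for this exercise can be sketched in Python. The sketch is illustrative and not part of the original exercise: the variable names are assumptions, g = 9.80 m/s² is assumed, the flywheel is treated as a solid cylinder spinning about its axis of symmetry, and all of the stored energy is assumed to go into work against the constant frictional force (the car's own translational kinetic energy and any drive-train losses are ignored).

```python
import math

# Flywheel data from the exercise.
radius = 0.50              # m
mass = 200.0               # kg
rpm = 20000.0              # rotations per minute

# Car data from the exercise; g is an assumed value.
car_mass = 1000.0          # kg
g = 9.80                   # m/s^2
friction_fraction = 0.10   # friction magnitude as a fraction of the weight

# Uniform solid cylinder about its axis: I = (1/2) M R^2.
inertia = 0.5 * mass * radius**2

# Convert rotations per minute to rad/s.
omega = rpm * 2 * math.pi / 60.0

# Rotational kinetic energy K = (1/2) I omega^2.
energy = 0.5 * inertia * omega**2

# Distance at which work done against friction equals the stored energy.
friction_force = friction_fraction * car_mass * g
distance = energy / friction_force

print(f"stored energy = {energy:.3e} J")
print(f"range = {distance / 1000:.1f} km")
```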




10-39. The earth in space. The earth has a mass of
$5.98 \times 10^{24}$ kg, a radius of $6.37 \times 10^{6}$ m, and a moment of
inertia of $8.04 \times 10^{37}$ kg·m². It rotates around its axis once
every $8.62 \times 10^{4}$ s. It travels at (nearly) constant speed in
its (nearly) circular orbit about the sun, completing one
orbit in $3.16 \times 10^{7}$ s. Its orbital radius is $1.50 \times 10^{11}$ m.
The sense of rotation of the earth about the sun is the same
as the sense of its rotation about its own axis.

a. Compare the earth’s moment of inertia about its
own axis to that of a uniform sphere of the same mass and
radius. Explain the difference.

b. Evaluate the earth’s spin angular momentum $L_s$
about its own axis.

c. Evaluate the earth’s angular momentum of rotation
about the sun $L_r$, using the sun as the origin.

d. Of the total angular momentum $L_s + L_r$, what
fraction is spin angular momentum?

e. Evaluate the earth’s spin kinetic energy $K_s$.

f. Evaluate the earth’s kinetic energy of rotation
about the sun, $K_r$.

g. Of the total kinetic energy, $K_s + K_r$, what fraction
is spin kinetic energy?
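
The quantities in this exercise span many orders of magnitude, so a short script is a convenient way to keep track of them. The Python sketch below is illustrative and not part of the original exercise: the variable names are assumptions, the earth is treated as a point particle for its orbital motion, and the magnitudes of the spin and orbital angular momenta are simply added, which ignores the tilt of the spin axis relative to the orbital axis.

```python
import math

# Data from the exercise (SI units).
M = 5.98e24            # mass of the earth, kg
R = 6.37e6             # radius of the earth, m
I_earth = 8.04e37      # moment of inertia about the spin axis, kg*m^2
T_spin = 8.62e4        # rotation period, s
T_orbit = 3.16e7       # orbital period, s
r_orbit = 1.50e11      # orbital radius, m

# Part a: uniform sphere of the same mass and radius, I = (2/5) M R^2.
I_uniform = 0.4 * M * R**2
ratio = I_earth / I_uniform

# Angular speeds of spin and of orbital motion.
w_spin = 2 * math.pi / T_spin
w_orbit = 2 * math.pi / T_orbit

# Parts b through d: angular momenta (earth treated as a particle in orbit).
L_s = I_earth * w_spin
L_r = M * r_orbit**2 * w_orbit
L_fraction = L_s / (L_s + L_r)

# Parts e through g: kinetic energies.
K_s = 0.5 * I_earth * w_spin**2
K_r = 0.5 * M * (r_orbit * w_orbit)**2
K_fraction = K_s / (K_s + K_r)

print(f"I_earth / I_uniform = {ratio:.3f}")
print(f"L_s = {L_s:.3e}, L_r = {L_r:.3e}, fraction = {L_fraction:.2e}")
print(f"K_s = {K_s:.3e}, K_r = {K_r:.3e}, fraction = {K_fraction:.2e}")
```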


Group  C 

10-40. Divide and conquer: the spherical shell. Table 10-1
indicates that a homogeneous, thin-walled spherical shell
of mass M and radius R has a moment of inertia I about a
diameter given by $I = \frac{2}{3}MR^2$. In this exercise, you will
confirm that expression by adding the moments of inertia
of various small portions of the shell.

a. Show that the spherical shell has a mass per unit
spherical surface area given by $M/4\pi R^2$.

b. Imagine that the shell is divided into flat hoops of
various sizes, as shown in Fig. 10E-40. Show that the hoop
indicated there accounts for an area $dA = (2\pi R \sin\theta)(R\,d\theta)$
of the spherical surface. What is the mass dM of
that hoop?

c. Obtain an expression for the hoop’s moment of inertia
dI about the axis ZZ'.

d. What range of values of $\theta$ is needed to account for
the entire spherical shell?

e. Integrate the expression obtained in part c to obtain
I. Your result should agree with that in Table 10-1.

10-41. Divide and conquer: the sphere. Table 10-1 indicates
that a homogeneous sphere of mass M and radius R
has a moment of inertia I (about a diameter) given by
$I = \frac{2}{5}MR^2$. In this exercise, you can confirm that expression by
adding the moments of inertia of spherical shells.

a. Show that the sphere has a mass density $\rho$ given by
$\rho = M/\frac{4}{3}\pi R^3$.

b. Imagine that the sphere is divided into concentric
shells of various sizes. Show that a shell of radius r and
thickness dr has volume $dV = 4\pi r^2\,dr$. What is the mass dM
of that shell?

c. Use the expression for the moment of inertia of a
spherical shell found from Table 10-1, to write the moment
of inertia dI of the shell (about a diameter).

d. What range of values of r is needed to account for
the entire sphere?

e. Integrate the expression obtained in part c to obtain
I. Your result should agree with that found from
Table 10-1.
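
The two "divide and conquer" integrations in Exercises 10-40 and 10-41 can be checked numerically before they are done analytically. The Python sketch below is illustrative and not part of the original exercises: the function names and the choice of a midpoint sum with n slices are assumptions. The first function adds up hoops to build a spherical shell; the second adds up shells to build a solid sphere.

```python
import math

def shell_inertia(M, R, n=10000):
    """Sum hoop contributions dI = dM * (R sin(theta))^2 over 0..pi."""
    sigma = M / (4 * math.pi * R**2)          # mass per unit area
    dtheta = math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * dtheta            # midpoint of the slice
        dA = (2 * math.pi * R * math.sin(theta)) * (R * dtheta)
        dM = sigma * dA
        total += dM * (R * math.sin(theta))**2
    return total

def sphere_inertia(M, R, n=10000):
    """Sum shell contributions dI = (2/3) dM r^2 over 0..R."""
    rho = M / (4.0 / 3.0 * math.pi * R**3)    # mass density
    dr = R / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * dr                    # midpoint of the shell
        dM = rho * 4 * math.pi * r**2 * dr
        total += (2.0 / 3.0) * dM * r**2
    return total

# With M = 1 and R = 1 the sums should approach 2/3 and 2/5.
I_shell = shell_inertia(1.0, 1.0)
I_sphere = sphere_inertia(1.0, 1.0)
print(I_shell, I_sphere)
```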

10-42. Tough on the bearings. A rod of negligible mass
and length d has a particle of mass m attached to each end.
As shown in Fig. 10E-42, the midpoint O of the rod is
rigidly attached to the shaft AB, also of negligible mass,
making an angle $\gamma$ with it. (Notice that the point O is not
necessarily midway between A and B.) The shaft is supported
by bearings at A and B, and the system is rotating
about AB with angular velocity $\omega$, as shown. At the instant
shown, the particles of mass m are in the plane of the
figure.

a. Show that the center of mass of the system is located
at O and that it remains motionless.

b. Determine the angular momentum vector L of the
rotating system, using O as the origin. Show that the orientation
of L is correctly shown, and show that its magnitude
is $L = \frac{1}{2}m\omega d^2 \sin\gamma$.

c. Show that any other choice of origin would give
the same result for L.

d. What is the magnitude of the component $L_\parallel$ of L
along AB? What is the magnitude of the component $L_\perp$
perpendicular to AB?

e. Show that $L_\parallel$ is constant but that $L_\perp$ changes with
time. Find the magnitude and direction of $dL_\perp/dt$ for the
instant shown in the figure.

f. If the shaft AB has length h, find the magnitudes
and directions of the forces at A and at B for the instant
shown in the figure. Does your result depend on exactly
where the rod is attached along AB? Why or why not?


10-43. What’s inside? Suppose someone hands you a
sphere of radius R and tells you that it consists of an inner
solid sphere of one material, covered by a concentric
spherical outer covering of another material. You are required
to determine the size and density of each part,
without cutting or removing any material. Can you do it?
If not, why not? If so, how?

10-44. Pure rolling, or some slippage? A thin cylindrical
shell, a solid cylinder, a thin spherical shell, and a solid
sphere are placed side by side at the top of an incline of
angle $\theta$. The four objects are all made of the same material,
so that all have the same coefficients of friction with
respect to the surface of the incline: static friction $\mu_s$ and
kinetic friction $\mu_k$, which is less than $\mu_s$.

a. For each object, find the largest incline angle (call it
$\theta_M$) for which that object can roll down the incline without
slipping. (Each object has its own value for $\theta_M$.)

b. For each object, determine the translational motion
in the case $\theta \le \theta_M$.

c. For each object, determine the translational motion
in the case $\theta > \theta_M$.

d. Set $\mu_s = 0.30$, $\mu_k = 0.10$, and $\theta = 40°$. In what
order will the four objects reach the bottom of the incline?

10-45. Trying to hit the spot. A uniform bar of length l
and mass m is suspended from a very thin axle that passes
through a hole near the top end A of the bar.

a. How far from A should a blow be applied at right
angles to the bar, in order to start the bar rotating about A
without breaking the axle? (If the blow is properly placed,
there will be no impulsive force on the axle as the blow
is applied. In this case, the point of application of the blow
is called the center of percussion relative to A.)

b. What is the period of oscillation of the rod when it
is suspended from A?

c. What is the length of the simple pendulum having
the same period? The length you obtain here should be
the same as the distance you obtained in part a. The
center of percussion relative to A is also called the center of
oscillation relative to A.

10-46. Not so fast! Suppose the top described in Example
10-6 is spinning about its axis with an angular
speed $\omega_s = 40$ rad/s. Evaluate its angular speed of precession
$\omega_p$. This will require modifying the procedure used in
the example because $\omega_p$ is not very small compared to $\omega_s$.

10-47. Scaling: mass and moment of inertia. Symmetrical
tops 1 and 2 are built from the same material and are geometrically
similar in shape, but top 2 is $\lambda$ times as large as
top 1 in each linear dimension.

a. Express the mass $M_2$ of top 2 as a multiple of the
mass $M_1$ of top 1.

b. Designate the moments of inertia (about the axis
of symmetry) of tops 1 and 2 by $I_1$ and $I_2$, respectively.
Express $I_2$ as a multiple of $I_1$.

c. Evaluate the results of parts a and b for the following
values of $\lambda$: 1.2, 2.0, 5.0.


10-48. Scaling at constant angular speed. Top A and top
B are built from the same material and are geometrically
similar in shape, but top B is $\lambda$ times as large as top A in
each linear dimension. Top A is set spinning with angular
speed $\omega_{sA}$; top B is set spinning with the same angular
speed $\omega_{sB} = \omega_{sA}$.

a. Find the spin angular momentum $L_{sB}$ of top B as a
multiple of the spin angular momentum $L_{sA}$ of top A.
Your result should involve only $L_{sA}$, $L_{sB}$, and $\lambda$.

b. Find the spin kinetic energy $K_{sB}$ of top B as a multiple
of the rotational kinetic energy $K_{sA}$ of top A.

c. Top A is inclined to the vertical as it spins on a
smooth tabletop, and it is observed to precess with precessional
angular speed $\omega_{pA}$. This is much smaller than $\omega_{sA}$,
so Eq. (10-36) applies. Find the precessional angular
speed $\omega_{pB}$ of top B. Assume that $\omega_{pB} \ll \omega_{sB}$. Express $\omega_{pB}$
as a multiple of $\omega_{pA}$.

d. Express the ratio $\omega_{pB}/\omega_{sB}$ as a multiple of $\omega_{pA}/\omega_{sA}$.

e. Suppose that $\omega_{pA}/\omega_{sA} = 1.0 \times 10^{-3}$ and $\lambda = 2.0$. Is
the assumption that $\omega_{pB} \ll \omega_{sB}$ a valid one? Can you expect
your result for $\omega_{pB}$ to be accurate?


10-49. An adjustable top. Consider the adjustable top
shown in Fig. 10E-49. The main body of the top is a uniform
hemisphere of mass M with radius R. A short, light
rod of length H serves as a “foot” for the top. A light rod
of length D extends upward along the axis of symmetry of
the top. A cylinder of mass m, radius c, and thickness d can
be clamped to the rod so that its center is at any desired
distance l from the flat surface of the hemisphere. The extreme
values for l are $l_{min} = d/2$ and $l_{max} = D - d/2$.

[Fig. 10E-49]

a. Show that the moment of inertia I of the entire
top about the symmetry axis does not depend on l, and
obtain an expression for I.

b. Obtain an expression for the distance r between
the center of mass of the top and F, the point of contact
with the support surface.

c. Using the definition $G^2 = I/M$, which gives the
gyration radius of the top in terms of its moment of inertia
I and its total mass M, apply Eq. (10-36) in order to obtain
an expression for the angular speed of precession $\omega_p$ of the
top, if it has spin angular speed $\omega_s \gg \omega_p$.

d. Where should the cylinder be placed in order to
obtain the maximum angular speed of precession for a
given spin angular speed? To obtain the minimum angular
speed of precession for a given spin angular
speed?

e. Evaluate the ratio of the maximum angular speed
of precession to the minimum angular speed of precession,
under the following conditions: m/M = 0.20,
H/R = 0.50, D/R = 3.0, c/R = 0.50, and d/R = 0.20.


10-50. Precession of a conical top. As shown in Fig.
10E-50, a solid conical top of mass M, height h, and radius
R is spinning about its symmetry axis OO' with spin angular
speed $\omega_s$. The axis OO' makes an angle $\alpha$ with the vertical.

[Fig. 10E-50]

a. Show that the center of mass of the top is located
along OO' at a distance 3h/4 from the vertex O.

b. Show that the moment of inertia I about the axis
OO' is given by $I = \frac{3}{10}MR^2$.

c. Find the angular speed at which the top precesses
about the vertical.

d. Consider a top for which h = 10.0 cm and R = 3.0
cm. The top is spinning at [illegible] rotations per minute.
Using g = 9.80 m/s², evaluate the precession angular
speed $\omega_p$.

e. If the top were spinning at the same spin angular
speed but in a more tilted position (that is, with a larger
value of $\alpha$ than is shown), what would be the angular
speed of precession?

10-51. No slippage here, I. Consider the system shown
in Fig. 10E-51. Both the cylinder and the pulley are homogeneous,
each having mass M and radius R. The hanging
object also has mass M. The incline makes an angle $\theta$ with
the horizontal. There is no slippage between the cylinder
and the incline, nor between the pulley and the string.
The pulley’s axle is frictionless.

[Fig. 10E-51]

a. Determine the acceleration of the hanging weight,
the angular acceleration of the pulley, and the translational
and angular accelerations of the cylinder.

b. Determine the tensions in both straight sections of
the string.

c. Find the minimum coefficient of static friction $\mu_s$
between the cylinder and the incline which is consistent
with the lack of slippage.

10-52. No slippage here, II.

a. Consider the application of Newton’s second law to
a cord of negligible mass which lies along the surface of a
pulley of radius R. Show that the maximum rate of change
of tension with distance along the cord is given by
$dT/ds \le \mu_p T/R$, where $\mu_p$ is the coefficient of static friction
between the cord and the pulley.

b. Use the result of part a to show that the ratio of the
tension on either side of a pulley that is in contact with the
cord over a central angle $\alpha$ is less than or equal to $e^{\mu_p \alpha}$.

c. Show that the pulley in Fig. 10E-51 has a central
angle of contact given by $\alpha = \pi/2 + \theta$.

d. Find the minimum frictional coefficient $\mu_p$ that is
consistent with the assumed lack of slippage in Exercise
10-51.

10-53. The second law for a system of particles. Newton’s
second law for a system of particles is stated in Eq. (10-44).
Prove the validity of this law on the basis of Newton’s laws
for a single particle, in the manner suggested in small
print below the equation.




Gravitation  and 
Central  Force  Motion 


11-1 UNIVERSAL GRAVITATION

This chapter is devoted largely to the application of the mechanics of rotational
motion of particles, developed in Chap. 9, to a subject which has fascinated
humankind from prehistoric times. This subject is the motion of
heavenly bodies, and particularly the motion of the moon, the planets,
and the sun, as they appear to sweep majestically across the sky.

The apparent motion of the stars is relatively simple, corresponding as
it does to the daily rotation of the earth about its own axis. The moon, the
planets, and the sun share in this daily apparent motion. In addition, however,
they appear to move relative to the stars, each in its own way. The motions
of the planets are particularly complex in appearance. In spite of
these complexities, there are intriguing regularities about the motions and
the interconnections among them.

An accurate quantitative description of the apparent motions of the
moon, the planets, and the sun may be called the fundamental kinematical
problem of astronomy. Thoughtful persons have risen to the challenge of
describing heavenly motions since long before the invention of written
arithmetic. One of the most dramatic evidences of this response is the complex
at Stonehenge, in southern England. Figure 11-1a shows a view of
Stonehenge, which functioned (perhaps among other purposes) as an
astronomical observatory and calculator for measuring and predicting
astronomical events. The carefully oriented stones appear to have been
used in various combinations to establish sight lines for astronomical objects
near the horizon. One such application is suggested in the figure
caption.

Aside from such practical uses as predicting the onset of seasons for
agricultural and other purposes, kinematical descriptions of astronomical
motions made possible predictions of eclipses and other events which had
religious, astrological, or other mystical significance. But the intrinsic interest
of the problem of making more accurate observations and predictions
must always have been a strong motivation.

Fig. 11-1 (a) General view of Stonehenge. Besides the large stones, or megaliths,
which are evident in the photograph, many smaller stones are hidden in the grass.
The stones were transported to the site over a considerable distance by unknown
means. Apparently, the whole was intended as an astronomical observation instrument
and calculator which served to predict eclipses, among other purposes, not all of which are
known. At the vernal equinox (the day in the spring when day and night are each
12 h long), the shadow of the rising sun was cast along a major row of stones. (b) An instrument
used in the sixteenth century to determine the elevation and bearing angles of
a heavenly body. The sights are at S and T. (From Tycho Brahe, Astronomiae Instauratae
Mechanica, 1598. Courtesy New York Public Library.)

With  the  development  of  increasingly  powerful  mathematical  and  ob- 
servational techniques,  the  arts  of  astronomical  observation  and  predic- 
tion became  more  and  more  precise.  Naked-eye  observation  reached  its 
culmination  in  the  work  of  the  Danish  astronomer  Tycho  Brahe 
(1546-1601). Figure 11-1b shows one of his sighting instruments, which
were  designed  by  him  and  were  the  best  in  the  world  at  the  time.  While 
such  instruments  were  far  more  precise  and  versatile  than  the  sighting 
stones  of  Stonehenge,  their  underlying  principles  are  not  very  different. 
To  this  day,  indeed,  the  precise  measurement  of  sighting  angles  is  indis- 
pensable to  observational  astronomy. 

Invention  of  the  astronomical  telescope  by  Galileo  in  1609,  and  its  sub- 
sequent improvement,  made  possible  a precision  of  angular  measurement 
better  by  orders  of  magnitude  than  the  30  arc  seconds  (0.008°)  or  so  that 
represents  the  best  the  naked  eye  can  do  with  the  aid  of  sighting  devices 


11-1  Universal  Gravitation  443 


such  as  Tycho’s.  The  improvement  of  the  clock  by  Huygens  and  others  in 
the  mid-seventeenth  century  and  afterward  was  also  essential;  it  is  much 
more  useful  to  know  precisely  where  a heavenly  body  is  if  you  know  pre- 
cisely when  it  is  there. 

It  is  not  surprising  that  these  strides  in  the  kinematical  description  of 
the  heavens  inspired  attempts  at  mechanical  explanations,  that  is,  explana- 
tions of  the  observed  motions  in  terms  of  the  forces  governing  them.  Mod- 
ern physics  may  be  said  to  have  had  its  birth  in  these  attempts. 

As  long  as  it  was  generally  believed  that  the  earth  lay  at  the  center  of 
the  universe,  there  was  scant  hope  of  even  a semiquantitative  mechanical 
explanation.  But  in  the  first  half  of  the  sixteenth  century  the  Polish- 
Prussian  physician,  church  official,  and  astronomer  Nicolaus  Copernicus 
(1473-1543) advocated an alternative model of the universe. In the Copernican
system, the planets (including the earth) revolve around the sun, and
the  moon  revolves  around  the  earth.  This  arrangement — after  consider- 
able modification  of  its  details  had  been  made  by  others — ultimately  made 
possible  a relatively  simple  mechanical  description  in  terms  of  the  single, 
universal,  fundamental  force  called  gravitation. 

But  the  Copernican  system  in  its  original  form  was  not  remarkably 
simple,  and  progress  was  not  easy.  Such  outstanding  men  of  genius  as  Jo- 
hannes Kepler  (1571-1630),  Galileo,  and  Rene  Descartes  (1596-1650) 
speculated  on  the  mechanical  problem  without  real  success. 

It  remained  for  Newton  to  find  the  key,  using  the  groundwork  which 
had  been  laid  by  others.  More  than  a century  after  Copernicus,  he  estab- 
lished firmly  the  fact  that  the  lunar  and  planetary  motions,  as  seen  from 
properly  chosen  reference  frames,  can  be  accounted  for  on  the  basis  of  the 
influence  of  gravitation,  acting  as  a central  force. 

Why  was  the  Copernican  view  of  the  solar  system  so  important  in  Newton’s 
development  of  the  mechanics  of  that  system  on  the  basis  of  a single  kind  of  force? 
You  will  see,  as  we  follow  Newton’s  development  through  this  chapter,  that  the 
development  “lifts  itself  by  its  own  bootstraps.”  It  begins  by  considering  the 
behavior  of  an  ordinary  body  near  the  surface  of  the  earth,  such  as  an  apple,  as  the 
body  moves  under  the  influence  of  gravitational  force.  It  then  considers  quantita- 
tively the  consequences  of  assuming  that  the  motion  of  the  moon  around  the 
earth  is  governed  by  the  very  same  kind  of  force. 

The  next  step  depends  heavily  on  the  Copernican  viewpoint.  From  that  view- 
point, the  motion  of  the  moon  about  the  earth  is  akin  to  the  motion  of  the  planets 
(including  the  earth)  about  the  sun.  On  this  basis,  an  understanding  of  the  me- 
chanics of  lunar  motion  may  well  be  expected  to  lead  directly  to  insights  into  the 
mechanics  of  planetary  motion. 

A remark  on  nomenclature  is  appropriate  here.  In  astronomy,  it  is  conven- 
tional to  call  the  rotation  of  one  body  about  another  revolution.  The  word  rota- 
tion is  restricted  to  the  turning  of  a body  about  its  own  axis.  We  follow  the  astro- 
nomical convention  in  this  chapter.  Elsewhere  in  this  book,  the  word  “rotation”  is 
used  to  denote  any  angular  motion  about  any  center,  but  “revolution”  is  used 
occasionally  in  subsequent  chapters  when  the  distinction  must  be  made. 

Newton began by applying Huygens' formula for centripetal acceleration
to the revolution of the moon about the earth. In the form given by Eq.
(3-41b), this is

$$a = \frac{v^2}{r} \tag{11-1}$$


444  Gravitation  and  Central  Force  Motion 




Fig. 11-2 Method for measuring $R_m/R_e$, the ratio of the distance from the earth to the moon
to the radius of the earth. (a) Perspective view of the earth-moon system. Two observers A
and B are located at the same latitude $\lambda$, on opposite sides of the earth. Their distance apart
is $2L = 2R_e \cos\lambda$, where L is the axial distance shown in part b. If A and B simultaneously
measure the angles $\phi_A$ and $\phi_B$, they can determine the parallax angle $\theta$, which is given by
$\theta = \pi - (\phi_A + \phi_B)$. Since $R_m \gg R_e$, the angle $\theta$ is small, and the approximation $\theta \approx \sin\theta$
is valid. Thus $\theta = 2L/R_m = 2R_e \cos\lambda / R_m$, which can be solved to give $R_m/R_e = (2\cos\lambda)/\theta$.
(In practice, the same observer makes two observations approximately 12 h 25 min apart,
at the rising and the setting of the moon, and then corrects for the motion of the moon in
the interim.)


On  the  basis  of  the  approximation  that  the  moon’s  orbit  is  circular,  he  then 
made  a calculation  of  the  centripetal  acceleration  of  the  moon  about  the 
earth.  The  details  of  this  calculation  are  given  in  Example  11-1. 


EXAMPLE 11-1

The average distance from the center of the earth to the center of the moon is
$R_m = 3.84 \times 10^5$ km, or close to 60 times the radius of the earth, $R_e$. (The ratio of these
two distances was first determined in classical antiquity. A more modern method is
sketched in Fig. 11-2.) The sidereal period of the moon, that is, the time it takes to
make one complete circuit of the earth as seen by an observer fixed in space, is
T = 27.3 days. Find the centripetal acceleration $a_m$ of the moon.

■ Note, to begin with, that this is simply the satellite problem of Example 3-10
with the numbers changed. Using the argument developed in Sec. 3-6, you express
the tangential speed of the moon in terms of its orbit radius $R_m$ and period T, and
reexpress Eq. (11-1) in the form of Eq. (3-45) to produce

$$a_m = \frac{4\pi^2 R_m}{T^2} \tag{11-2}$$

You can then insert the numbers to obtain the magnitude of the centripetal
acceleration of the moon.

$$a_m = 4\pi^2 \, \frac{3.84 \times 10^8\ \text{m}}{(27.3\ \text{days} \times 86{,}400\ \text{s/day})^2} = 2.72 \times 10^{-3}\ \text{m/s}^2$$

The value of $a_m$ is very close to $1/3600 = 1/(60)^2$ that of the acceleration of gravity
$g = 9.80\ \text{m/s}^2$ at the surface of the earth. That is, it appears that $a_m/g = (R_e/R_m)^2$.
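
The arithmetic of Example 11-1 can be checked with a short script. This is only a sketch: the function and variable names are illustrative choices, not notation from the text, and the inputs are the rounded values quoted in the example.

```python
import math


def centripetal_acceleration(radius_m, period_s):
    """Return a = 4*pi^2*R/T^2, Eq. (11-2), for a circular orbit."""
    return 4.0 * math.pi**2 * radius_m / period_s**2


# Values quoted in Example 11-1 (names are illustrative).
R_MOON_ORBIT = 3.84e8          # earth-moon distance, m
T_MOON = 27.3 * 86_400         # sidereal period, s
G_SURFACE = 9.80               # acceleration of gravity at the surface, m/s^2

a_m = centripetal_acceleration(R_MOON_ORBIT, T_MOON)
ratio = G_SURFACE / a_m

print(f"a_m     = {a_m:.3e} m/s^2")
print(f"g / a_m = {ratio:.0f}   (compare 60^2 = 3600)")
```

With these rounded inputs the ratio g/a_m comes out within a fraction of a percent of 3600, which is the near-coincidence the example points to.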




Is it a coincidence that $a_m/g = (R_e/R_m)^2$? Quite to the contrary, it is a
pivotal clue in the line of reasoning that led Newton to his theory of
gravitation. Like his predecessors, Newton asked himself: What is the source of
the  force  which  produces  this  acceleration?  Kepler  had  speculated  that  the 
force  was  magnetic.  Galileo  had  suggested  tentatively  that  the  inertial  path 
of  an  isolated  body  is  not  a straight  line  but  a large  circle,  so  that  no  force  is 
necessary. 

Newton  was  thinking  about  these  matters,  and  about  mechanics  in 
general,  during  the  plague  year  of  1665.  Cambridge  University  was  closed, 
and  Newton  returned  to  his  childhood  home  in  the  village  of  Woolsthorpe. 
As  it  turned  out,  it  was  about  two  years  before  he  could  go  back  to  Cam- 
bridge, and  those  two  years  were  the  period  in  which  Newton,  who  was  in 
his  middle  twenties,  accomplished  the  major  portion  of  his  truly  prodigious 
creative  life  work.  (He  later  wrote,  “.  . . for  in  those  days  I was  in  the 
prime  of  my  age  for  invention,  and  minded  Mathematicks  and  Philosophy 
more  than  at  any  time  since.”) 

In  later  life  Newton  suggested  to  a friend  that  while  sitting  in  an  or- 
chard he  had  seen  an  apple  fall.  He  had  been  struck  by  the  idea  that  the  very 
same force that made the apple fall was that which constrained the moon to its
orbit — the  gravitational  force.  Taking  this  insight  as  far  as  it  would  go, 
Newton  made  the  hypothesis  that  every  body  in  the  universe  exerts  a gravi- 
tational force  on  every  other  body. 


Fig. 11-3 Direction of the gravitational
force in the case of two interacting
bodies whose sizes are small enough,
compared to the distance between them,
that they can be considered as point
masses $m_j$ and $m_k$. The position of body
j is given relative to body k by the vector
$\mathbf{r}_{\text{from } k \text{ to } j}$. The gravitational force $\mathbf{F}_{\text{on } j \text{ by } k}$
exerted on body j due to the presence of
body k is directed toward body k. That is,
$\mathbf{F}_{\text{on } j \text{ by } k} \propto -\hat{\mathbf{r}}_{\text{from } k \text{ to } j}$.


What sort of mathematical form must such a universal gravitational
force law take? The magnitude of the centripetal acceleration of the moon,
calculated in Example 11-1, gives the first hint. If that centripetal acceleration
is a gravitational acceleration, the apple would also experience a gravitational
acceleration if it were placed a distance $R_m$ from the center of the
earth equal to that of the moon. What is more, we already know that the
gravitational acceleration g experienced by bodies near the surface of the
earth is independent of their masses. So it is fair to guess that the apple, if
placed at a distance $R_m$ from the center of the earth, would experience an
acceleration of magnitude $a_m$ equal to that calculated for the moon in
Example 11-1. Thus, increasing the distance from the center of the earth to
the apple 60-fold, from $R_e$, the radius of the earth, to $R_m$, would decrease
the gravitational acceleration which it experiences $(60)^2$-fold, from g to
$a_m$. Comparison of the gravitational acceleration of the apple falling from
the tree with the gravitational acceleration of the moon “falling” in its
orbit suggests that the acceleration of a given body, because of the gravitational
force exerted on the body by the earth, depends inversely on the square
of the distance from the center of the earth to the body.

We are not yet in a position to verify this suggestion by considering
further evidence. However, if it is to lead to a truly universal law of gravitation,
it must apply to any pair of bodies j and k, and not just to the earth and
the apple, or the earth and the moon. So let us make a tentative generalization,
subject to later verification. In Fig. 11-3, the vector $\mathbf{r}_{\text{from } k \text{ to } j}$
describes the position of an arbitrary body j with respect to another body k.
We express the distance dependence of the magnitude of the gravitational
force $F_{\text{on } j \text{ by } k}$ exerted on body j due to the presence of body k in the
form

$$F_{\text{on } j \text{ by } k} \propto \frac{1}{(r_{\text{from } k \text{ to } j})^2} \tag{11-3}$$




This is the inverse-square law. Ultimately its validity will be justified by the
consistency  of  predictions  deduced  from  it  with  very  large  numbers  of  ob- 
servations made  over  a very  wide  range  of  circumstances.  For  the  moment, 
however,  we  continue  with  Newton’s  argument. 

The magnitude $F_{\text{on } j \text{ by } k}$ of the gravitational force must be proportional
to the mass $m_j$ of the body on which it is acting, say, the apple. If this were
not so, different bodies falling to earth from small heights above its surface
would not experience the same gravitational acceleration. Stated mathematically,
this is

$$F_{\text{on } j \text{ by } k} \propto m_j \tag{11-4a}$$

Since bodies j and k are chosen entirely arbitrarily, there must also be a
force $F_{\text{on } k \text{ by } j}$ exerted on body k due to the presence of body j. A repetition
of the argument immediately above leads to the relation

$$F_{\text{on } k \text{ by } j} \propto m_k \tag{11-4b}$$

where $m_k$ is the mass of body k.

According to Newton's third law of motion, the forces $\mathbf{F}_{\text{on } j \text{ by } k}$ and
$\mathbf{F}_{\text{on } k \text{ by } j}$ must have equal magnitudes; that is,

$$F_{\text{on } j \text{ by } k} = F_{\text{on } k \text{ by } j}$$

The proportionality (11-4b) can therefore be written

$$F_{\text{on } j \text{ by } k} \propto m_k$$

This proportionality, taken together with that of (11-4a), shows that the
force exerted on body j by body k is proportional to both masses. But if a
quantity is proportional to two independent quantities, it is proportional
to their product. Thus we can write

$$F_{\text{on } j \text{ by } k} \propto m_j m_k \tag{11-4c}$$

Combining this proportionality with the inverse-square law expressed
in the proportionality of Eq. (11-3), we obtain

$$F_{\text{on } j \text{ by } k} \propto \frac{m_j m_k}{(r_{\text{from } k \text{ to } j})^2} \tag{11-5}$$

As usual, this relation is more usefully written as an equation. We define
the universal gravitational constant G according to the equation

$$F_{\text{on } j \text{ by } k} = G \, \frac{m_j m_k}{(r_{\text{from } k \text{ to } j})^2} \tag{11-6a}$$

This equation can also be written in vector form, so as to express direction
as well as magnitude. As shown in Fig. 11-3, the force exerted on body j by
body k is always toward body k. Since the unit vector $\hat{\mathbf{r}}_{\text{from } k \text{ to } j}$ is directed from
body k toward body j, we have

$$\mathbf{F}_{\text{on } j \text{ by } k} = -G \, \frac{m_j m_k}{(r_{\text{from } k \text{ to } j})^2} \, \hat{\mathbf{r}}_{\text{from } k \text{ to } j} \tag{11-6b}$$


The bodies labeled j and k are quite arbitrary. Thus, if there is a force
exerted on body j by body k, there must likewise be a force exerted on body
k by body j. That force can be evaluated by repeating the entire argument
leading to the equation immediately above, with the roles of bodies j and k
interchanged. Doing so leads to the equation

$$\mathbf{F}_{\text{on } k \text{ by } j} = -G \, \frac{m_k m_j}{(r_{\text{from } j \text{ to } k})^2} \, \hat{\mathbf{r}}_{\text{from } j \text{ to } k} \tag{11-6c}$$

To compare this equation with the previous one, note that
$(r_{\text{from } j \text{ to } k})^2 = (r_{\text{from } k \text{ to } j})^2$ and $\hat{\mathbf{r}}_{\text{from } j \text{ to } k} = -\hat{\mathbf{r}}_{\text{from } k \text{ to } j}$. Consequently we have

$$\mathbf{F}_{\text{on } k \text{ by } j} = -\mathbf{F}_{\text{on } j \text{ by } k}$$

These two gravitational forces constitute an action-reaction pair, in
accordance with Newton's third law.


While the subscript notation used in the series of equations immediately
above is explicit in its meaning, it has the disadvantage of being cumbersome. We
may substitute a shorthand notation for the subscripts, in which the words are
omitted and their meaning is implied by the order of the indices j and k. For
example, we can define $F_{jk} = F_{\text{on } j \text{ by } k}$ and $r_{kj} = r_{\text{from } k \text{ to } j}$. In this more compact
notation, the equation for the magnitude of the gravitational forces becomes

$$F_{jk} = F_{kj} = G \, \frac{m_j m_k}{r_{kj}^2} \tag{11-6d}$$

The vector equations assume the forms

$$\mathbf{F}_{jk} = -G \, \frac{m_j m_k}{r_{kj}^2} \, \hat{\mathbf{r}}_{kj} \tag{11-6e}$$

and

$$\mathbf{F}_{kj} = -G \, \frac{m_k m_j}{r_{jk}^2} \, \hat{\mathbf{r}}_{jk} \tag{11-6f}$$

Any of Eqs. (11-6a) through (11-6f) is called Newton's law of universal
gravitation.

In some cases, it is necessary to consider the gravitational forces
exerted on body j due to the presence of a number of other bodies. The net
gravitational force on body j is found by taking the vector sum of the
individual forces calculated by repeated application of Eq. (11-6b).
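
The vector sum described above can be sketched in a few lines of Python. The function names `gravitational_force` and `net_force` are hypothetical, and the numerical value of G is an assumption taken from modern measurement; at this point in the argument the text has not yet assigned G a value.

```python
import math

# Assumed value of the gravitational constant (not given in this section).
G = 6.674e-11  # N m^2 / kg^2


def gravitational_force(m_j, pos_j, m_k, pos_k):
    """Force on body j by body k, following the vector form of Eq. (11-6b).

    Positions are (x, y, z) tuples in meters; the result is in newtons.
    """
    # Vector from k to j and its length.
    r = [pj - pk for pj, pk in zip(pos_j, pos_k)]
    dist = math.sqrt(sum(c * c for c in r))
    # -G m_j m_k / r^2 times the unit vector r/|r|, so divide by dist**3 overall.
    scale = -G * m_j * m_k / dist**3
    return tuple(scale * c for c in r)


def net_force(m_j, pos_j, others):
    """Vector sum of the forces on body j from a list of (mass, position) pairs."""
    total = [0.0, 0.0, 0.0]
    for m_k, pos_k in others:
        f = gravitational_force(m_j, pos_j, m_k, pos_k)
        total = [t + c for t, c in zip(total, f)]
    return tuple(total)


# Two equal masses placed symmetrically about body j pull equally in
# opposite directions, so the net force should vanish.
print(net_force(1.0, (0.0, 0.0, 0.0),
                [(5.0, (1.0, 0.0, 0.0)), (5.0, (-1.0, 0.0, 0.0))]))
```

Swapping the roles of the two bodies in `gravitational_force` reverses the sign of every component, which is the action-reaction property stated in the text.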


If the argument leading to Eq. (11-6b) stood alone, there would be
little cause for surprise in the fact that the gravitational force obeys the
inverse-square law. After all, the inverse-square law was “built into” the
equation in the light of the known facts that the distance from the center
of the earth to the moon is 60 times the distance from the center of the
earth to its surface, and that the magnitude of the acceleration of the moon
as it moves in its orbit is $1/(60)^2$ that of the falling apple. To see this
“building in” explicitly, we write Eq. (11-6a) for the special cases of the
apple of mass $m_a$ located a distance $R_e$ from the center of the earth, whose
mass is $m_e$, and for the moon of mass $m_m$ located a distance $R_m$ from the
center of the earth. The gravitational forces on the two bodies are,
respectively, of magnitude

$$F_{\text{on apple by earth}} = G \, \frac{m_a m_e}{R_e^2}$$

and

$$F_{\text{on moon by earth}} = G \, \frac{m_m m_e}{R_m^2}$$

Using Newton's second law, we can immediately write

$$F_{\text{on apple by earth}} = m_a g$$

(since we know the magnitude of the acceleration of gravity at the surface
of the earth to be g) and

$$F_{\text{on moon by earth}} = m_m a_m$$

where $a_m$ is the centripetal acceleration of the moon, which we have
determined in Example 11-1 on the basis of its orbit radius and period. When
these forces are substituted into the two gravitational-force equations
above, the masses $m_a$ and $m_m$, respectively, cancel, and we have

$$g = G \, \frac{m_e}{R_e^2} \tag{11-7a}$$

and

$$a_m = G \, \frac{m_e}{R_m^2} \tag{11-7b}$$

While we know neither the value of the universal gravitational constant G
nor the mass of the earth $m_e$, we can divide the second equation by the first
to obtain the ratio

$$\frac{a_m}{g} = \left(\frac{R_e}{R_m}\right)^2 \tag{11-8}$$


We  have  already  seen  that  the  numerical  values  of  the  ratios  on  the  two 
sides  of  the  equation  “answer  pretty  nearly,”  as  Newton  put  it. 

In  deriving  Eq.  (11-8),  we  have  used  Newton’s  second  law,  which  correctly 
describes  the  connection  between  force  and  motion  for  an  observer  fixed  in  an  in- 
ertial frame.  But  the  derivation  was  carried  out  in  a reference  frame  fixed  at  the 
center  of  the  earth.  The  earth  experiences  a centripetal  acceleration  as  it  moves  in 
its orbit about the sun which is by no means negligible compared to $a_m$, the centripetal
acceleration of the moon with respect to the earth. (Indeed, the acceleration
of the earth is more than twice $a_m$ in magnitude. Can you show this by direct
calculation?)  Moreover,  if  the  earth  exerts  a gravitational  force  on  the  moon,  the 
sun  presumably  does  so  as  well.  And  the  latter  force  has  been  ignored. 
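
The direct calculation invited above can be sketched as follows. The mean earth-sun distance and the length of the year are assumed values not given in this section; the lunar numbers are those of Example 11-1, and the names are illustrative.

```python
import math


def centripetal_acceleration(radius_m, period_s):
    """Return a = 4*pi^2*R/T^2 for a circular orbit."""
    return 4.0 * math.pi**2 * radius_m / period_s**2


# Moon about the earth (values from Example 11-1).
a_moon = centripetal_acceleration(3.84e8, 27.3 * 86_400)

# Earth about the sun. Both numbers below are assumptions brought in
# from outside this section: mean orbit radius and length of the year.
R_EARTH_ORBIT = 1.496e11       # m
T_YEAR = 365.25 * 86_400       # s
a_earth = centripetal_acceleration(R_EARTH_ORBIT, T_YEAR)

print(f"a_moon  = {a_moon:.2e} m/s^2")
print(f"a_earth = {a_earth:.2e} m/s^2")
print(f"a_earth / a_moon = {a_earth / a_moon:.2f}")
```

Under these assumptions the earth's centripetal acceleration about the sun comes out a little more than twice that of the moon about the earth, consistent with the claim in the text.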

Why,  then,  do  the  numerical  values  inserted  into  Eq.  (11-8)  “answer  pretty 
nearly”?  To  put  it  more  generally,  why  is  it  valid  to  use  Newton's  second  law  in 
the  noninertial  frame  fixed  at  the  center  of  the  earth  in  support  of  the  hypothesis 
that  the  law  of  gravitation  is  an  inverse-square  law? 

Let us begin to answer this question by looking at the earth-moon system from
a frame of reference fixed with respect to the sun, which can surely be considered
an inertial frame for the purpose at hand. Figure 11-4a is a free-body diagram of
the moon as seen in this frame. The force $\mathbf{F}_{\text{on moon by earth}}$ is the gravitational force
we have been discussing. In addition, there must be a force $\mathbf{F}_{\text{on moon by sun}}$
exerted on the moon which constrains it to follow the path around the sun that it shares,
generally speaking, with the earth. (We have not established that this force arises
from the same type of gravitational interaction which makes the moon orbit the
earth. But for the sake of the present argument, that is not important.)

Next, consider the earth-moon system from the noninertial frame fixed at the
center of mass of the earth-moon system itself. Figure 11-4b is a free-body diagram
of the moon as seen in this frame. As was shown in Sec. 5-4, it is possible to use
Newton's second law in a noninertial frame if a suitable fictitious force is added to
the forces acting on the body whose motion is being studied, in the present case
the moon. If the acceleration of the noninertial frame of reference, as seen from an
inertial frame, is $\mathbf{A}$ and the mass of the body being studied (here the moon) is $m_m$,
Eq. (5-29) gives the fictitious force $\mathbf{F}_{\text{fict}}$ as

$$\mathbf{F}_{\text{fict}} = -m_m \mathbf{A}$$


Fig. 11-4 Free-body diagram of the moon from the point of
view of (a) an observer in the substantially inertial frame of
reference fixed with respect to the sun and (b) an observer in
the noninertial frame of reference fixed with respect to the
center of mass of the earth-moon system. In part a the two
forces shown acting on the moon are those leading to its
acceleration about the sun and the earth respectively. In part b
is shown the additional fictitious force $\mathbf{F}_{\text{fict}} = -m_{\text{moon}} \mathbf{A}_{\text{earth-moon system}}$
which the accelerated observer requires in order to apply Newton's laws of motion.


The proper net force to be used in applying Newton's second law in the
noninertial frame of Fig. 11-4b is thus

$$\mathbf{F} = \mathbf{F}_{\text{on moon by earth}} + \mathbf{F}_{\text{on moon by sun}} + \mathbf{F}_{\text{fict}}$$

If the moon were located exactly at the center of mass of the earth-moon system,
the force exerted on the moon by the sun would produce an acceleration exactly
equal to $\mathbf{A}$, the acceleration of the earth-moon system about the sun. But, in fact,
the distance from the sun to the moon varies as the moon orbits the earth. Its value
oscillates about the distance from the sun to the center of mass of the earth-moon
system, with an amplitude which is small compared to the latter distance.

But in spite of this slight variation, we know that the force exerted on the
moon by the sun must be such that its average value would produce an acceleration
of the moon of magnitude A. We know this because the period of the moon's
orbit about the sun is 1 yr, the same as the period of the center of mass of the
earth-moon system about the sun. We thus have

$$\mathbf{F}_{\text{on moon by sun}} = m_m \mathbf{A}$$

Consequently, the net force acting on the moon from the point of view of the
observer in the noninertial frame fixed with respect to the center of mass of the
earth-moon system is, on average,

$$\mathbf{F} = \mathbf{F}_{\text{on moon by earth}} + m_m \mathbf{A} - m_m \mathbf{A}$$

or

$$\mathbf{F} = \mathbf{F}_{\text{on moon by earth}}$$

Finally,  we  must  note  that  the  actual  observer  is  located  in  a frame  fixed  not 
with  respect  to  the  center  of  mass  of  the  earth-moon  system,  but  with  respect  to  the 
earth  itself.  But  as  you  will  see  in  Example  11-6,  the  center  of  mass  is  only  4700 
km  from  the  center  of  the  earth — that  is,  inside  the  earth  itself.  The  additional 
acceleration  of  the  actual  observer  resulting  from  her  or  his  monthly  rotation 
around  the  center  of  mass  of  the  earth-moon  system  is  therefore  negligible. 

This  is  why  it  is  possible  to  carry  out  Newton’s  comparison  of  the  acceleration 
of  the  moon  about  the  earth  with  the  acceleration  of  the  apple  even  though  (1)  the 
frame  of  reference  is  not  inertial  and  (2)  the  force  exerted  on  the  moon  by  the  sun 
is  ignored. 

For  further  evidence  in  support  of  the  inverse-square  law,  Newton  had 
to  extend  his  vision — and  the  scope  of  validity  of  the  law  of  universal  gravi- 
tation— to  more  distant  parts  of  the  solar  system.  We  will  follow  Newton  in 




this shortly. But we have, as Newton did not, supporting “close-by” evidence
provided by artificial earth satellites. Example 11-2 illustrates the use
of such evidence.


EXAMPLE  11-2 

If the inverse-square law is correct, Eq. (11-8), $a_m/g = (R_e/R_m)^2$, should be
applicable to any earth satellite having a circular orbit of known radius and period. Such
a satellite is the synchronous satellite, whose sidereal period is equal to the sidereal
period of rotation of a point on the surface of the earth, T = 23 h 56 min. In
Example 3-10 the altitude, or distance above the surface of the earth, was given
without justification to be $3.58 \times 10^7$ m. Show that these data satisfy Eq. (11-8) and
hence the inverse-square law as well.

Some  comment  is  required  concerning  the  value  of  g appropriate  for  use  in  Eq. 
(11-8).  The  synchronous  satellite  revolves  about  the  earth  as  seen  by  an  observer 
fixed  with  respect  to  the  center  of  the  earth.  But  the  acceleration  of  gravity  is  actu- 
ally measured  by  observers  rotating  with  the  earth.  This  rotation  introduces  a cen- 
trifugal acceleration  which  leads  to  a measured  value  of  g smaller  than  that  which 
would  be  measured  by  a nonrotating  observer  (assuming  that  such  an  observer 
could  measure  g while  hovering  just  above  the  surface  of  the  earth  while  the  earth 
spun  beneath).  The  rotation  of  the  earth  also  introduces  another  complicating 
factor  indirectly.  Because  of  the  rotation,  the  earth  is  not  perfectly  spherical;  its 
radius  is  greater  at  the  equator  than  at  the  poles.  Thus  the  distance  of  the  observer 
from  the  center  of  the  earth  depends  on  the  latitude  at  which  the  observation  is 
made,  and  this  also  influences  the  measured  value  of  g.  When  these  matters  are 
taken  into  consideration,  the  proper  value  of  g,  for  a nonrotating  observer  located 
at a distance from the center of the earth equal to the earth's mean radius,
$R_e = 6.367 \times 10^6$ m, is $g = 9.848\ \text{m/s}^2$ to four significant figures.

■ You modify the notation of Eq. (11-8) for application to an arbitrary satellite.
Calling the magnitude of the satellite's centripetal acceleration under the influence
of the earth's gravitation $a_s$ and its orbit radius $R_s$, you have

$$\frac{a_s}{g} = \left(\frac{R_e}{R_s}\right)^2$$

The necessary value of $a_s$ for a body moving in a circular orbit of radius $R_s$ with
period T is given by Eq. (11-2), and is $a_s = 4\pi^2 R_s/T^2$. Substituting this value into the
equation immediately above, you obtain

$$\frac{4\pi^2 R_s}{gT^2} = \frac{R_e^2}{R_s^2}$$

or

$$R_s = \left(\frac{g R_e^2 T^2}{4\pi^2}\right)^{1/3}$$

The numerical values give you

$$R_s = \left[\frac{9.848\ \text{m/s}^2 \times (6.367 \times 10^6\ \text{m})^2 \times (23\ \text{h} \times 3600\ \text{s/h} + 56\ \text{min} \times 60\ \text{s/min})^2}{4\pi^2}\right]^{1/3} = 4.218 \times 10^7\ \text{m}$$

To find the satellite altitude h, you subtract the radius of the earth from $R_s$, which
gives you

$$h = R_s - R_e = 4.218 \times 10^7\ \text{m} - 6.367 \times 10^6\ \text{m}$$

Rounded off to three significant figures, this yields

$$h = 3.58 \times 10^7\ \text{m}$$
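
The numbers in Example 11-2 can be reproduced with a short script. This is a sketch only: `circular_orbit_radius` is an illustrative name, and the inputs are the values of g, the earth's mean radius, and the sidereal periods quoted in Examples 11-1 and 11-2.

```python
import math


def circular_orbit_radius(g, earth_radius_m, period_s):
    """Solve 4*pi^2*R_s/(g*T^2) = R_e^2/R_s^2 for the orbit radius R_s."""
    return (g * earth_radius_m**2 * period_s**2 / (4.0 * math.pi**2)) ** (1.0 / 3.0)


G_NONROTATING = 9.848               # m/s^2, value for a nonrotating observer
R_EARTH = 6.367e6                   # m, mean radius used in the example
T_SIDEREAL_DAY = 23 * 3600 + 56 * 60  # s

r_s = circular_orbit_radius(G_NONROTATING, R_EARTH, T_SIDEREAL_DAY)
altitude = r_s - R_EARTH

# Same relation applied to the moon's sidereal period from Example 11-1.
r_moon = circular_orbit_radius(G_NONROTATING, R_EARTH, 27.3 * 86_400)

print(f"R_s      = {r_s:.4e} m")
print(f"altitude = {altitude:.3e} m")
print(f"R_moon   = {r_moon:.3e} m")
```

Feeding the moon's period into the same function returns an orbit radius within about half a percent of the quoted $R_m = 3.84 \times 10^8$ m, so one inverse-square relation accounts for both orbits.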




Fig.  11-5  Qualitative  justification  of 
Newton’s  conjecture  that  the  gravita- 
tional attraction  resulting  from  a uni- 
form sphere  of  mass  M,  experienced  by 
a body  located  at  point  P outside  the 
sphere,  is  equal  to  that  resulting  from 
a point  particle  of  mass  M located  at  the 
center  of  the  sphere.  The  uniform 
sphere  is  divided  into  a nest  of  infini- 
tesimally thick  shells,  one  of  which  is 
illustrated.  Two  infinitesimally  wide 
zones  A and  B are  shown  on  the  shell. 
All points in zone A are essentially equidistant
from P, and the same is true for
the points in zone B. While zone B is
closer to P than is zone A, it also contains
less  mass.  Detailed  analysis  shows  that 
the  gravitational  attractions  of  the 
various  zones  "average  out"  to  the  result 
stated  above.  If  that  result  holds  for  a 
shell,  it  holds  as  well  for  a spherical  solid 
made  up  of  a nest  of  concentric  shells. 
A rigorous  proof  is  given  in  Chap.  20. 


It  is  very  gratifying  that  Newton’s  law  of  gravitation,  in  the  scalar  form 
of Eq. (11-6a), agrees so well with observation. However, there is a weak
link in the logic leading to this result, and we must now backtrack to discuss
it. In applying Eq. (11-6a) to the moon, it is reasonably clear what is meant
by  Rm,  the  distance  from  the  earth  to  the  moon.  Neither  of  them  is  a par- 
ticle, but  the  sizes  of  the  earth  and  of  the  moon  are  both  small  compared  to 
the  distance  between  them,  and  it  cannot  be  far  wrong  to  let  Rm  be  the  dis- 
tance between  their  centers. 

The  apple  is  in  a different  situation.  It  is  itself  small  enough  to  be 
regarded  as  a particle,  but  it  is  close  to  a very  large  earth.  Some  parts  of  the 
earth are just a few meters away, while others are $10^6$ times more distant.
To make a proper determination of the force exerted on the apple by the
earth, it is necessary to divide the earth into small mass elements $m_j$ and to
use Eq. (11-6c) to find the sum of their individual effects. In the discussion
above, we tacitly made the same guess as Newton initially did. According to
this guess, illustrated in Fig. 11-5, the relatively strong attraction of nearby
mass  elements  and  the  relatively  weak  attraction  of  distant  ones  “average 
out”  in  such  a way  that  the  earth  as  a whole  attracts  the  apple  just  as  if  all 
the  earth's  mass  were  concentrated  at  its  own  center.  It  turns  out  that  this  plausible 
guess  can  be  proved  correct  for  an  inverse-square  law. 
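
The guess can be made plausible numerically, in the spirit of the zones of Fig. 11-5. The sketch below is an illustration and not the proof: it slices a thin uniform shell into ring-shaped zones, adds up the components of their attractions along the line from the center to an outside point, and compares the total with the attraction of a point mass at the center. The function name, the units (G = 1, shell mass = 1), the number of zones, and the `power` parameter are all assumptions made for the illustration.

```python
import math


def shell_attraction(d, a=1.0, power=2, n_zones=20_000):
    """Attraction at distance d from the center of a thin uniform shell.

    Units are chosen so that G = 1 and the shell mass is 1. The shell of
    radius a is cut into n_zones rings; each ring attracts with a force
    proportional to 1/s**power, where s is its distance from the point.
    Only the component along the line to the center survives by symmetry.
    """
    dtheta = math.pi / n_zones
    total = 0.0
    for i in range(n_zones):
        theta = (i + 0.5) * dtheta                   # polar angle of the zone
        dm = 0.5 * math.sin(theta) * dtheta          # fraction of shell mass in the zone
        s2 = a * a + d * d - 2.0 * a * d * math.cos(theta)  # squared distance to the point
        cos_to_axis = (d - a * math.cos(theta)) / math.sqrt(s2)
        total += dm * cos_to_axis / s2 ** (power / 2.0)
    return total


for d in (1.5, 2.0, 60.0):
    print(d, shell_attraction(d), 1.0 / d**2)
```

For the inverse-square case the summed attraction agrees with $1/d^2$, the point-mass value, at every outside distance tried. Calling the function with `power=3` gives a total that no longer matches $1/d^3$, which illustrates that the point-mass replacement is special to the inverse-square law.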

The  need  to  verify  this  guess  caused  Newton  great  difficulty.  The  proof  was 
one  of  the  first  major  applications  of  the  integral  calculus  which  he  invented.  It  is 
straightforward,  though  somewhat  lengthy.  We  will  not  discuss  it  here,  however. 
An  argument  based  on  symmetry,  developed  a century  and  a half  after  Newton, 
provides a very much simpler proof and leads to much deeper insights into the
nature of the inverse-square forces. This very powerful approach, called Gauss’ law,
is  discussed  in  detail  in  Chap.  20,  in  connection  with  the  electric  force.  At  that 
point,  we  also  show  explicitly  how  Gauss’  law  verifies  the  guess. 

Artificial satellites provide experimental evidence, not available to
Newton, that the guess is correct. The synchronous satellite discussed in
Example 11-2 is only about six earth radii distant from the surface of the
earth. This is close enough that the earth, when seen from the satellite,
does not visually approximate a point mass. Thus any error in our guess
would lead to an improper choice of the orbit radius for synchronous
motion. A quite small deviation from the stationary appearance of the satellite
over a fixed meridian of longitude would be readily detectable. But, in fact,
the calculated synchronous orbit radius is quite accurate. That is, the earth
does “look” like a point as “seen” gravitationally from the satellite. Even
satellites much closer to the earth have orbit periods whose deviations from
those calculated on the basis of our guess are due only to the very small
departure of the earth's shape from perfect sphericity.

If the law of gravitation were not an inverse-square law, it would not be
correct to assume that the distributed mass of the earth acts on external
masses as if its own mass were concentrated at its center. In the long run,
the real verification of the inverse-square nature of the law of gravitation is
its consistency with observation. In Sec. 11-6, we study hypothetical
planetary systems in which the gravitational force depends on other powers of
the distance, and we compare them with what is actually observed.
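The guess discussed above can also be checked numerically. The following is a minimal sketch, not the proof Newton gave: it divides a uniform thin spherical shell into rings of mass elements, sums the axial components of their attractions on an outside particle, and compares the total with the attraction of the same mass placed at the center. The function name `shell_force_ratio`, the choice of a shell of unit radius, and the midpoint-rule summation are assumptions made for illustration. A solid sphere whose density depends only on radius is a set of nested shells, so the result for one shell carries over.

```python
import math

def shell_force_ratio(n, d, R=1.0, steps=20000):
    """Force of a uniform thin shell (radius R) on an outside particle at
    distance d from the shell's center, divided by the force the same mass
    would exert from the center, for a trial force law F = G m1 m2 / r**n.
    G and the two masses cancel in the ratio.
    """
    dtheta = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * dtheta               # midpoint of this ring
        dm = 0.5 * math.sin(theta) * dtheta      # fraction of shell mass in the ring
        s = math.sqrt(R * R + d * d - 2.0 * R * d * math.cos(theta))  # ring-to-particle distance
        cos_alpha = (d - R * math.cos(theta)) / s  # projects the pull onto the axis
        total += dm * cos_alpha / s**n
    return total * d**n                          # divide by the point-mass value 1/d**n

ratio_inverse_square = shell_force_ratio(2, d=2.0)
ratio_inverse_cube = shell_force_ratio(3, d=2.0)
print(ratio_inverse_square, ratio_inverse_cube)
```

For the inverse-square law the ratio comes out equal to 1 to within the accuracy of the summation, while for the trial inverse-cube law it does not, in line with the statement that the guess depends on the inverse-square form.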

The inverse-square law appears to be deeply embedded in the geometrical
nature of the space in which we live. To see this in a simple way, we can make the
following argument. The body j of mass $m_j$ in Fig. 11-6a is attracted toward the


452  Gravitation  and  Central  Force  Motion 




body k of mass $m_k$ “simply” as a result of the presence of body k in space. That is
what we mean when we say that gravitational attraction is a fundamental
observed property of bodies having mass. Now, j lies on the surface of an imaginary
sphere 1 of radius $r_1$, whose center is at k. The magnitude $F_1$ of the force on j would
be the same, no matter where j was located on sphere 1. That is, the gravitational
influence of the mass $m_k$ is the same everywhere on the surface of the sphere
centered on k. So we can imagine some sort of “influence” emanating from $m_k$
uniformly in all directions. The amount of the influence depends, as we have
said above, on the magnitude of the mass $m_k$.

What happens if body j is relocated farther away from body k, that is, on the
surface of the larger sphere 2 of radius $r_2$, as in Fig. 11-6b? The magnitude $F_2$ of the
force on body j is the same no matter where j is located on the surface of the larger
sphere. But $F_2$ is smaller than $F_1$. The influence emanating from body k still
depends on the magnitude of $m_k$ (which has not changed), but the influence has
“spread out” over the outer sphere, which has a greater surface area than the
inner one.

Although  the  gravitational  force  gets  smaller  and  smaller  as  the  distance 
between  bodies  increases,  it  never  disappears  entirely.  Thus,  while  the  influence  of 
mk  “spreads  out  thinner  and  thinner,”  it  does  not  vanish.  Put  another  way,  it  is 
plausible  to  assume  that  all  the  influence  which  “passes  through”  sphere  1 passes 
through  sphere  2 as  well. 

A direct measure of the spreading out of the influence of $m_k$ is the ratio of the
surface areas $A_1$ and $A_2$ of the two spheres. Making our plausibility argument
quantitative, we write it in the form

$$\frac{F_2}{F_1} = \frac{m_k/A_2}{m_k/A_1}$$

Thus we have

$$\frac{F_2}{F_1} = \frac{m_k/4\pi r_2^2}{m_k/4\pi r_1^2}$$

or

$$\frac{F_2}{F_1} = \frac{r_1^2}{r_2^2} \tag{11-9}$$

which is the inverse-square law we set out to make plausible. When seen in this
light, the inverse-square law is a fundamental geometrical property of the
three-dimensional space with which we are familiar, euclidean space.

Fig. 11-6 Diagram for the geometrical plausibility argument, given in the text,
for the inverse-square form of the law of gravitation for two point masses.

We will now consider the role of mass in more detail from a gravitational
point of view. In Sec. 5-2, we dwelt briefly on the distinction between
the inertial and gravitational aspects of mass. We defined mass in Chap. 4 as
the tendency of a body to resist acceleration, that is, mass in its inertial
aspect. To be specific, mass is defined inertially by either of the equations

$$\mathbf{p} = m\mathbf{v} \quad\text{or}\quad \mathbf{F} = m\mathbf{a}$$

In this chapter, mass appears in its gravitational aspect in Eq. (11-6c):

$$\mathbf{F} = -G\,\frac{m_j m_k}{r_{kj}^2}\,\hat{\mathbf{r}}_{kj}$$

This equation may also be regarded as a definition of mass. But mass so
defined has an experimental (or observational) basis completely independent of its
definition in terms of inertia.


11-1  Universal  Gravitation  453 


We are so used to sensing mass simultaneously in both these aspects,
inertial and gravitational, that we do not even give them separate names in
ordinary practice. This is so in spite of the fact that we must take care to
distinguish the two in analyzing such systems as the Atwood machine or the
block sliding down an inclined plane, with which we dealt in Chap. 5. But if
you think about it for a while, you will agree that it is most striking that for
any body, the two masses appear to be equal! It is entirely due to this fact that
different objects dropped to the ground fall with equal accelerations,
neglecting air resistance. Let us consider again the falling apple. This time, we
make explicit the distinction between the mass of the apple measured
inertially, $m_{Ia}$, and its mass measured gravitationally, $m_{Ga}$. (As far as the fall of
the apple is concerned, we are interested in only the gravitational aspect of
the mass of the earth, which we call $m_{Ge}$.) The magnitude of the force
exerted on the apple by the earth depends on the gravitational aspect of
the masses of the two bodies, and is found by applying Eq. (11-6a) to the
present situation to obtain

$$F_{\text{on apple by earth}} = G\,\frac{m_{Ga}\,m_{Ge}}{R_e^2}$$

But it is the inertial aspect of the mass of the apple which must be used in
Newton’s second law, F = ma. Using the fact that the apple’s acceleration
is equal to g, the acceleration of gravity near the surface of the earth for
any body, we can write Newton’s second law in the form

$$F_{\text{on apple by earth}} = m_{Ia}\,g$$

Equating the two expressions for the force exerted on the apple by the
earth, we have

$$m_{Ia}\,g = G\,\frac{m_{Ga}\,m_{Ge}}{R_e^2}$$

This equation can be solved for the acceleration of gravity g to yield

$$g = \frac{m_{Ga}}{m_{Ia}}\,\frac{G m_{Ge}}{R_e^2}$$

Generalizing from the apple to any body j, we can write the acceleration
of gravity for that particular body j in the form

$$g_j = \frac{m_{Gj}}{m_{Ij}}\,\frac{G m_{Ge}}{R_e^2} \tag{11-10}$$

The observed fact that g is the same for all bodies falling without friction near the
surface of the earth could not be true unless the ratio $m_{Gj}/m_{Ij}$ were the same for all
bodies.

Newton made a series of experiments to verify that this is indeed the
case. He made a number of pendulums of the same length but having bobs
of different materials. It is straightforward to show that the period $T_j$ of a
simple pendulum of length l, whose bob is body j, is

$$T_j = 2\pi\sqrt{\frac{m_{Ij}}{m_{Gj}}\,\frac{l\,R_e^2}{G m_{Ge}}}$$

if $m_{Ij}$ and $m_{Gj}$ are not assumed identical. Newton found that all his
pendulums had equal periods within the experimental error of about 1
percent, which leads to the conclusion that $m_{Gj}/m_{Ij}$ is substantially the same for
all the materials tried.
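To get a feeling for how sensitive such a pendulum test is, the sketch below evaluates the small-angle period for two hypothetical bobs. It assumes the standard result that a simple pendulum of length l has period 2π(l/g)^(1/2), with g replaced by the body-dependent value of Eq. (11-10). The function name `pendulum_period`, the reference value 9.8 m/s² for a body whose ratio is exactly 1, and the 2 percent difference in the ratio are illustrative assumptions, not data from Newton's experiments.

```python
import math

def pendulum_period(length, ratio_g_to_i, g0=9.8):
    """Small-angle period (s) of a simple pendulum whose bob has
    gravitational-to-inertial mass ratio m_G/m_I = ratio_g_to_i.
    g0 is the acceleration (m/s^2) of a body for which the ratio is exactly 1.
    """
    g_j = ratio_g_to_i * g0            # Eq. (11-10): g scales with m_G/m_I
    return 2.0 * math.pi * math.sqrt(length / g_j)

t_ref = pendulum_period(1.0, 1.00)     # 1-m pendulum, ratio exactly 1
t_odd = pendulum_period(1.0, 1.02)     # hypothetical material with a 2% larger ratio
relative_change = (t_odd - t_ref) / t_ref
print(t_ref, t_odd, relative_change)
```

Because the period depends on the square root of the ratio, a 2 percent difference in the ratio changes the period by only about 1 percent.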




In the late 1880s, the Hungarian physicist Roland von Eötvös (1848-1919)
undertook a long series of experiments to determine, with the highest possible
precision, just how close to the same value the ratio of the two masses is for
different materials. Eötvös used a torsion pendulum in which the bodies at the two
ends of the horizontal rod, one of platinum and the other of some other material,
had the same gravitational mass $m_G$, as determined by precise weighing techniques.
If the beam of the torsion pendulum is oriented east-west, the rotation of the earth
results in a centrifugal force being exerted on each of the two bodies in the
direction away from the earth’s axis. Each of the two centrifugal forces is proportional
to the inertial mass $m_I$ of the body on which it is exerted. If $m_I$ is the same for both
bodies, the net torque is zero and there is no deflection of the torsion pendulum.
However, if the two bodies, which have the same $m_G$, have different $m_I$, then the
apparatus will be twisted by the resulting net torque, since centrifugal force
depends on inertial mass. Eötvös could find no effect, although his apparatus could
have detected a difference in $m_I$ for the two bodies smaller than 1 part in $10^8$.

The Eötvös experiment can be done in other ways. More recently, null results
have been obtained with a margin of error almost a thousandfold smaller than that
of Eötvös.

You may well suspect that when two quite different quantities coincide
within 1 part in $10^{11}$ or better, it is no accident. The assertion that the two are
really the same quantity, observed in different ways which obscure their
fundamental identity, is the logical starting point of the general theory of relativity. For
this reason we have usually spoken of the inertial and gravitational aspects of
mass rather than separately of the inertial mass and the gravitational mass.


11-2 DETERMINATION OF THE UNIVERSAL GRAVITATIONAL CONSTANT G


In the calculations of Sec. 11-1, the gravitational constant G always appears
as part of the product $Gm_e$, where $m_e$ is the mass of the earth. We can
determine the value of that product from the expression

$$g = \frac{G m_e}{R_e^2} \tag{11-11}$$

which is just Eq. (11-10) simplified by the assumption that the ratio of a
gravitationally measured mass to an inertially measured mass has the value
$m_G/m_I = 1$ for all bodies. If either G or $m_e$ is known, the other can be
obtained immediately from Eq. (11-11), since the gravitational acceleration
g and the earth's radius $R_e$ are known. Knowing G, we will be able to find
the mass of any heavenly body which has a satellite, as you will see in
Sec. 11-3.

A first rough guess at the value of G is made in Example 11-3 by using
Eq. (11-11) together with an estimate of $m_e$.


EXAMPLE  11-3 

Make the crude assumption that the density of the earth is uniform throughout,
and estimate $m_e$. Then use the result to estimate G. Geological observations
distributed all over the world suggest that the approximate average density, or mass
per unit volume, of the solid material on or very near the surface of the earth is
$\rho = 4 \times 10^3$ kg/m³.

■ You have for the mass of the spherical earth, whose volume is $V = \tfrac{4}{3}\pi R_e^3$,

$$m_e = \rho V = \tfrac{4}{3}\pi\rho R_e^3$$

Insert this value into Eq. (11-11) to obtain

$$g = \tfrac{4}{3}\pi G\rho R_e$$

which you solve for G, obtaining

$$G = \frac{3g}{4\pi\rho R_e}$$

The numerical values give you

$$G = \frac{3 \times 9.8\ \text{m/s}^2}{4\pi \times 4 \times 10^3\ \text{kg/m}^3 \times 6.4 \times 10^6\ \text{m}} = 9 \times 10^{-11}\ \text{N}\cdot\text{m}^2/\text{kg}^2$$


This  calculation  must  certainly  overestimate  the  value  of  G.  Even  if  there  were 
no  tendency  for  denser  materials  to  lie  deeper  in  the  earth,  compression  resulting 
from  the  enormous  pressure  in  the  interior  would  guarantee  a higher  density  there 
than  on  the  surface.  Thus  the  calculation  underestimates  the  average  density  of  the 
earth  and  so  overestimates  G.  As  you  will  see,  the  value  of  G obtained  is  about  50 
percent  too  large.  Still,  that  is  not  bad  for  a first  guess. 
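The arithmetic of Example 11-3 is easy to reproduce. The short sketch below uses the same input values as the example; the variable names are chosen here for illustration only.

```python
import math

g = 9.8              # gravitational acceleration at the surface, m/s^2
rho_surface = 4e3    # average density of surface material, kg/m^3
R_e = 6.4e6          # radius of the earth, m

# Mass of a uniform sphere with the density of surface rock
m_e_estimate = (4.0 / 3.0) * math.pi * rho_surface * R_e**3

# Eq. (11-11) with that mass inserted, solved for G
G_estimate = 3.0 * g / (4.0 * math.pi * rho_surface * R_e)

print(m_e_estimate, G_estimate)
```

The two estimates are tied together by Eq. (11-11): inserting both back into G m_e / R_e² returns the value of g that was used.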


The first laboratory measurement of G was made in 1797-1798 by
Henry Cavendish (1731-1810), using a method still employed today with
modifications only in detail. Something of an eccentric, Cavendish was one
of the brilliant amateurs who made a very large share of the significant
British contributions to the physical sciences during the eighteenth and
nineteenth centuries. Cavendish described his equipment as follows: “The
apparatus is very simple; it consists of a wooden arm, 6 feet long, made
so as to unite great strength with little weight. [See Fig. 11-7.] This arm is
suspended in a horizontal position, by a slender wire 40 inches long, and to
each extremity is hung a leaden ball, about 2 inches in diameter; and the
whole is inclosed in a narrow wooden case, to defend it from the wind. As
no more force is required to make this arm turn round on its centre, than
what is necessary to twist the suspending wire, it is plain, that if the wire is
sufficiently slender, the most minute force, such as the [gravitational]
attraction of a leaden weight a few inches in diameter, will be sufficient to
draw the arm sensibly aside.”

Before the actual measurements leading to a determination of G can
be begun, the torsion constant k of the so-called torsion balance must be
determined. This is done by allowing it to oscillate and measuring the
oscillation frequency $\nu$. The moment of inertia I is then carefully calculated,
with due allowance being made for the small contribution of the beam hh,
shown in Fig. 11-7, which supports the bodies xx. Equation (10-25),
$\nu = (2\pi)^{-1}(k/I)^{1/2}$, is then used to find k.

In the main part of the experiment, the equilibrium position of the
torsion balance beam hh in Fig. 11-7b is measured with the large spheres W,
which Cavendish called “weights,” in the position shown by the solid lines.
Then the displacement of the equilibrium position is measured when the
large spheres are swung around into the position shown by the dotted lines.
Needless to say, all kinds of precautions are necessary. A very small amount
of electric charge on the spheres W, for example, will completely
overwhelm the gravitational effect. (That is what we meant when we said in Sec.
1-2 that the gravitational interaction is very much weaker than the electric
interaction.) Cavendish’s biggest problem, however, was to minimize the
effects of the air currents. Even minute differences in the temperature of
different parts of the apparatus could produce sufficient motion of the air to
lead to serious, disturbing effects. Besides enclosing the entire apparatus in
a large case, he made all necessary manipulations remotely (to avoid the
effect of his body heat) and made his readings with small telescopes.

Fig. 11-7 Two views of Cavendish’s torsion balance, taken from his 1798 paper.
The torsion balance proper consists of the two small masses x which are supported
by the light but rigid structure ghmh suspended from the torsion fiber lg. The
motion of the balance is observed by means of the light sources L, the small mirrors
n on the ends of the balance beam, and the small telescopes T. Gravitational
attraction is exerted on the masses x by the large masses W. The latter can be swung
around, as shown in part b, by means of the large pulley M. (Philosophical
Transactions of the Royal Society, 1798. Courtesy of the New York Public Library.)

The angular displacement of the torsion balance when the large
spheres W are swung around, together with the known value of the torsion
constant k, yields a value of the torque. This, together with the known
length of the beam hh, can be used to calculate the gravitational force.
Since the masses of WW and xx, and the distance between them, are known,
Eq. (11-6a) can be solved to give a numerical value of the universal
gravitational constant G.
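The chain of reasoning from oscillation frequency to G can be sketched in a few lines. This is an idealized illustration, not Cavendish's actual data reduction: it ignores the mass of the beam, treats all spheres as point masses, and counts only the attraction between each small sphere and its nearest large sphere. The function names and all numerical inputs below (beam length, masses, separation, oscillation period) are hypothetical values of roughly the size suggested by Cavendish's description. The deflection is generated from an assumed value of G and then inverted, so the sketch only demonstrates that the steps are mutually consistent.

```python
import math

def torsion_constant(nu, moment_of_inertia):
    """Invert Eq. (10-25), nu = (1/(2*pi)) * sqrt(k/I), for k."""
    return (2.0 * math.pi * nu) ** 2 * moment_of_inertia

def G_from_deflection(k, theta, beam_length, m_small, m_large, separation):
    """Deflection angle -> torque -> force -> G (idealized)."""
    torque = k * theta                       # torsion torque balances the gravitational torque
    force = torque / beam_length             # two equal forces, each with lever arm L/2
    return force * separation**2 / (m_small * m_large)   # Eq. (11-6a) solved for G

# Hypothetical apparatus (illustrative numbers only)
beam_length = 1.83       # m, about 6 feet
m_small = 0.78           # kg, lead ball about 2 inches in diameter
m_large = 160.0          # kg, assumed mass of each large sphere
separation = 0.20        # m, assumed center-to-center distance
nu = 1.0 / 400.0         # Hz, assumed oscillation frequency

I = 2.0 * m_small * (beam_length / 2.0) ** 2   # two point masses, beam neglected
k = torsion_constant(nu, I)

# Forward step: the deflection such an apparatus would show for an assumed G
G_assumed = 6.67e-11
force = G_assumed * m_small * m_large / separation**2
theta = force * beam_length / k

G_recovered = G_from_deflection(k, theta, beam_length, m_small, m_large, separation)
print(k, theta, G_recovered)
```

With these inputs the deflection is only about a thousandth of a radian, which suggests why the precautions described above were necessary.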

Cavendish’s result, incorporating all the many corrections required,
was equivalent to $G = (6.65 \pm 0.48) \times 10^{-11}$ N·m²/kg². More modern
measurements are in general agreement on the value

$$G = (6.67 \pm 0.01) \times 10^{-11}\ \text{N}\cdot\text{m}^2/\text{kg}^2 \tag{11-12}$$

This is the least accurately known of the fundamental physical constants,
because of the extreme weakness of the gravitational interaction.

As  a by-product  of  the  experiment,  Cavendish  also  verified  the 
inverse-square  law  at  distances  as  small  as  a few  millimeters.  Up  to  that 
time,  its  validity  had  been  supported  only  over  much  larger  distances  (such 
as  those  involved  in  the  moon-apple  calculation).  While  his  measurements 
seem  to  have  been  reliable  within  about  1 percent,  he  did  not  consider 
them worthy of more than passing mention (possibly because Coulomb had
obtained more definite results for the inverse-square nature of the
electric force a decade earlier). Nevertheless, Cavendish’s results lend
further support to the use of center-to-center distances in applying Newton’s
law of gravitation to spherical bodies.

Cavendish  and  his  contemporaries  referred  to  his  measuring  of  G as 
“weighing  the  earth,”  a dramatic  if  inaccurate  phrase  still  sometimes  used 
in speaking loosely of the experiment. Example 11-4 suggests that what is
really  meant  by  the  phrase  is  measurement  of  the  mass  of  the  earth. 


EXAMPLE  11-4 


Using the modern value of G, find the mass and the mean density of the earth.
■ From Eq. (11-11) you have for the mass of the earth

$$m_e = \frac{g R_e^2}{G}$$

Using the values of g and $R_e$ given in Example 11-2, you have

$$m_e = \frac{9.85\ \text{m/s}^2 \times (6.37 \times 10^6\ \text{m})^2}{6.67 \times 10^{-11}\ \text{N}\cdot\text{m}^2/\text{kg}^2} = 5.99 \times 10^{24}\ \text{kg} = 5.99 \times 10^{21}\ \text{t}$$

[The ton t used here and occasionally elsewhere in the book is the metric ton, equal
to $10^3$ kg.] For the density you write $\rho = m_e/V$, or

$$\rho = \frac{3m_e}{4\pi R_e^3} = \frac{3 \times 5.99 \times 10^{24}\ \text{kg}}{4\pi \times (6.37 \times 10^6\ \text{m})^3} = 5.53 \times 10^3\ \text{kg/m}^3$$

As expected, this is rather larger than the value $4 \times 10^3$ kg/m³ which comes from a
survey of surface rocks, because of the general increase in density with depth. In
fact, the core of the earth is probably made of metal, largely iron and nickel, whose
density is well above $8 \times 10^3$ kg/m³ at the pressures involved.
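The numbers in Example 11-4 can be checked with a few lines of code. The sketch below uses the values quoted in the example; the variable names are illustrative.

```python
import math

g = 9.85          # m/s^2, value used in Example 11-4
R_e = 6.37e6      # m, radius of the earth
G = 6.67e-11      # N*m^2/kg^2, Eq. (11-12)

m_e = g * R_e**2 / G                                   # Eq. (11-11) solved for the mass
rho_mean = 3.0 * m_e / (4.0 * math.pi * R_e**3)        # mass divided by the volume of a sphere

print(m_e, m_e / 1e3, rho_mean)   # kg, metric tons, kg/m^3
```

The mean density that results is noticeably larger than the surface value of 4 × 10³ kg/m³ used in Example 11-3, which is the point made in the text.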




11-3 THE MECHANICS OF CIRCULAR ORBITS: ANALYTICAL TREATMENT

In Sec. 11-1 you had a glimpse of the way in which Newton’s consideration
of the earth-moon system led him to combine his law of gravitation and his
laws of mechanics, and thus to extend the mechanics of terrestrial objects
so that it embraced the earth-moon system without essential change. The
next step was to go beyond the moon and consider the entire solar system.

As  the  main  basis  for  his  attack  on  this  more  general  problem,  Newton 
used  Kepler’s  analysis  of  the  enormous  volume  of  planetary  observations 
made  by  Tycho  Brahe.  Although  astronomical  data  had  become  very  much 
more  precise  in  the  decades  since  Tycho's  day,  Kepler’s  empirical  general 
rules  of  planetary  motion  remained  valid. 

A man of mystical bent who held a profound faith that the universe
was ultimately based on number, Kepler had spent many years searching
for every mathematical relationship he could find, in his efforts to
demonstrate the regularity of the solar system. At least in part because of the
religious significance he attached to the sun, he based his approach on the
Copernican system with the sun at the center and the planets revolving
around it. As you can imagine, he was able to find empirically many
different numerical and mathematical relationships in a system as complex as
the solar system. Most of these failed to hold up when more accurate data
became available, or turned out to be trivial consequences of others, or
came to be seen as mere accidental numerical coincidences of the sort in
which numerologists delight.

But  Kepler  had  used  three  such  relationships  in  achieving  “victory”  in 
his  “war  on  Mars,”  as  he  called  his  intense,  decade-long  effort  to  describe 
the  orbit  of  the  planet  named  after  the  Greek  god  of  war.  The  relatively 
large  deviation  from  circularity  of  that  orbit  made  it  a severe  test  of  any 
proposed  kinematical  theory  of  planetary  motion.  What  is  more,  Kepler 
had  access  to  the  unprecedentedly  accurate  observations  of  Tycho,  for 
whom  he  had  worked  briefly. 

Beginning  with  one  of  Tycho’s  observed  angular  positions  of  Mars, 
Kepler  tried  to  predict  an  angular  position  at  a later  time.  He  knew  that  any 
fit  between  his  predictions  and  Tycho’s  observed  angular  positions  had  to 
be  good  within  2 minutes  of  arc  if  his  kinematical  theory  was  to  be  a valid 
one.  In  this  test  a description  based  on  his  rules  succeeded  where  all  others 
had  failed. 

It  was  Newton  who  first  called  the  crucial  rules  Kepler’s  three  laws, 
leaving  Kepler’s  many  other  empirical  rules  to  fade  into  obscurity.  Newton 
demonstrated that the highly accurate but purely descriptive rules of Kepler were
necessary logical consequences of his own law of gravitation and the laws of motion.
In  doing  so,  Newton  put  both  the  science  of  terrestrial  mechanics  and  the 
science  of  celestial  mechanics  on  one  and  the  same  footing,  and  he  set  the 
stage  for  the  transformation  of  astronomy  into  a branch  of  physics. 

Kepler’s  laws  are  as  follows: 

1. The orbit of every planet in the solar system is an ellipse having the
sun at one focus. See Fig. 11-8 and its caption.

2. As a planet moves in its orbit, its speed, angular speed, and orbit
radius all vary. However, the vector from the sun to the planet sweeps out
equal areas in equal times, as shown in Fig. 11-8.

3. If y is half the length of the major axis (see Fig. 11-8) of the orbital
ellipse (the semimajor axis) and T is the period of revolution of the
planet around the sun as seen by an observer fixed in space (the sidereal
period), then the ratio $y^3/T^2$ is the same for all planets.
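The third law can be checked directly against orbital data. The sketch below uses the semimajor axes and periods listed in Table 11-1 of this chapter and computes the ratio y³/T² for each planet. The conversion factor of about 3.156 × 10⁷ s per sidereal year is an assumption supplied here, and the names `planets` and `ratios` are illustrative.

```python
SECONDS_PER_YEAR = 3.1558e7   # assumed length of the sidereal year, s

# (semimajor axis in 10^9 m, period in yr), from Table 11-1
planets = {
    "Mercury": (57.90, 0.2408),
    "Venus": (108.2, 0.6152),
    "Earth": (149.6, 1.0),
    "Mars": (228.0, 1.881),
    "Jupiter": (778.4, 11.86),
    "Saturn": (1427.0, 29.46),
    "Uranus": (2869.0, 84.01),
    "Neptune": (4497.0, 164.8),
    "Pluto": (5900.0, 248.4),
}

ratios = {}
for name, (y_gm, period_yr) in planets.items():
    y = y_gm * 1e9                      # semimajor axis in m
    T = period_yr * SECONDS_PER_YEAR    # period in s
    ratios[name] = y**3 / T**2          # m^3/s^2

for name, value in ratios.items():
    print(f"{name:8s} {value / 1e18:.3f} x 10^18 m^3/s^2")
```

Although the semimajor axes span a factor of about 100 and the periods a factor of about 1000, the ratios all lie within about 1 percent of one another.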




Fig. 11-8 Kepler’s first and second laws. A planet P orbits
the sun. The orbit is an ellipse. All the points on an ellipse
have the property that their distances from two specified
points, called foci, add to the same specified value. The sun
lies at one of the two foci of the orbit. The line passing
through the two foci is the longest that can be drawn
through the ellipse; it is called the major axis. Its
perpendicular bisector is the minor axis. The vector from the sun to
the planet is the position vector of the planet. The planet
passes from $P_1$ to $P_2$ in a time $\Delta t$. As it does so, the position
vector sweeps out the area $A_{12}$. In a time $\Delta t'$ the planet
passes from $P_3$ to $P_4$, the position vector sweeping out the
area $A_{34}$. If $\Delta t$ and $\Delta t'$ are equal times, then $A_{12} = A_{34}$.


With respect to the first law, Newton showed that the path of a particle
under  the  influence  of  an  inverse-square  force  emanating  from  a fixed 
point  is  always  a conic  section,  that  is,  a circle,  an  ellipse,  a parabola,  or  a 
hyperbola.  Figure  11-9  makes  clear  the  reason  for  calling  these  curves 
conic  sections.  The  planets  all  have  orbits  which  are  ellipses  differing  only 
slightly  from  circles.  But  there  are  other  objects  in  the  solar  system,  and  in 
other  inverse-square-law  systems,  which  demonstrate  all  the  paths  Newton 
showed  to  be  possible.  Kepler’s  first  law  is  strictly  true  only  if  the  force 
center — in  the  case  of  the  solar  system,  the  sun — is  fixed  in  space.  This  is  a 
good  approximation  for  the  solar  system,  because  the  sun  is  so  much  more 
massive  than  the  rest  of  the  system.  For  the  time  being,  we  will  confine  our 
attention  to  such  systems. 

The  analytical  derivation  of  Kepler’s  first  law  from  Newton’s  laws  is  too 
complex mathematically to discuss in this book. In Sec. 11-4, however, we
will  accomplish  the  task  numerically. 

You can make an experimental check on the validity of Kepler’s
second law by direct measurement on Fig. 9-15, which is a strobe photo of a
puck constrained by a string to move in a noncircular orbit on an air table.
Like the gravitational force, the force exerted on the puck by the string is a
central force, always directed toward the same point. The areas of the
triangles formed by any two successive images of the puck and the hole at the
center of the air table are equal, within the limits of accuracy imposed by the
experimental conditions and by the error introduced in using the areas of
triangles to approximate the areas actually swept out by the position vector.

Fig. 11-9 Conic sections. The curved peripheries of the shaded regions are
the intersections of the planes with the cones. These curves are: circles in (a),
ellipses in (b), parabolas in (c), and hyperbolas in (d).

We  will  now  show  that  Kepler’s  second  law  can  be  derived  from 
Newton’s  laws  of  motion  for  any  central  force.  A central  force  is  defined  to  be 
any  force  whose  direction  is  always  toward  (or  away  from)  a given  fixed 
point  in  space,  called  the  force  center,  and  whose  magnitude  depends  only 
on  the  distance  from  the  center  of  force  of  the  body  on  which  it  acts.  For 
convenience,  we  will  assume  that  the  body  is  a planet,  but  that  assumption 
will  not  affect  the  generality  of  the  derivation. 

In Eq. (9-24), we defined the angular momentum $\mathbf{l}$ of a body, about a
specified origin, to be

$$\mathbf{l} = \mathbf{r} \times \mathbf{p} \tag{11-13}$$

where $\mathbf{r}$ is the position vector of the body and $\mathbf{p}$ is its momentum. Since
$\mathbf{p} = m\mathbf{v}$, the angular momentum of a planet (or similar body) can be written

$$\mathbf{l} = m\,\mathbf{r} \times \mathbf{v}$$

Here m is the mass of the planet, $\mathbf{v}$ is its velocity, and $\mathbf{r}$ is the position vector
from the sun (fixed at the origin) to the planet. Multiplying both sides of
this equation by dt, we have

$$\mathbf{l}\,dt = m\,\mathbf{r} \times \mathbf{v}\,dt$$

Since $d\mathbf{s} = \mathbf{v}\,dt$ is the displacement of the planet during the infinitesimal
time interval dt, this can be written

$$\mathbf{l}\,dt = m\,\mathbf{r} \times d\mathbf{s} \tag{11-14}$$

Fig. 11-10 Diagram for deriving Kepler’s second law from the
principle of conservation of angular momentum. When the
planet experiences an infinitesimal displacement $d\mathbf{s}$ along its
central-force orbit, the position vector $\mathbf{r}$ of the planet relative
to the center of force sweeps out a triangular area dA. This
is one-half the area $|\mathbf{r} \times d\mathbf{s}|$ of the parallelogram shown.

In Fig. 9-8 you saw that the magnitude of the cross product of two
vectors is geometrically equivalent to the area of the parallelogram whose
adjacent sides are formed by the two vectors. In Fig. 11-10, the triangle of
area dA, which is swept out by $\mathbf{r}$ in the time dt, is just one-half of such a
parallelogram; the other half is shown by the dashed lines. Thus we have

$$dA = \tfrac{1}{2}\,|\mathbf{r} \times d\mathbf{s}|$$

Now the magnitude part of Eq. (11-14) is

$$l\,dt = m\,|\mathbf{r} \times d\mathbf{s}|$$

In terms of dA, this is

$$l\,dt = 2m\,dA$$

The equation may be integrated over an arbitrary time interval $t_i$ to $t_f$. This
gives

$$\int_{t_i}^{t_f} l\,dt = 2m \int_{\text{swept area}} dA = 2mA$$

where A is defined to be the entire area swept out by the position vector
during the time interval. To evaluate the integral on the left side of the
equation, we need to know how the angular momentum $\mathbf{l}$ of the planet,
about an origin located at the sun, varies with time. The change in angular
momentum is related to the force applied to the planet by means of Eq.
(9-28),

$$\frac{d\mathbf{l}}{dt} = \mathbf{r} \times \mathbf{F}$$


Suppose (as is very closely true) that the only significant force exerted on
the planet is the gravitational attraction $\mathbf{F}$ of the sun, which is a central
force $\mathbf{F} = -F\hat{\mathbf{r}}$. We then have

$$\frac{d\mathbf{l}}{dt} = \mathbf{r} \times (-F\hat{\mathbf{r}}) = -F\,\mathbf{r} \times \hat{\mathbf{r}} = 0$$

or

$$\mathbf{l} = \text{constant}$$

so that the angular momentum of the planet is conserved.

The  integration  of  / dt  can  thus  be  carried  out  immediately,  to  yield 

l(tf  — tj)  = 2mA 

or 


A — -J—  (tf  — tj)  (11-15) 

2m 

Since  l /2m  is  constant,  the  area  A swept  out  by  r in  equal  time  intervals  tf  — k 
will  be  equal,  as  Kepler  had  found  through  his  empirical  study  of  Tycho’s 
astronomical  data.  But  in  this  derivation  of  Kepler’s  second  law  from  the 
rotational  form  of  Newton’s  second  law  of  motion,  we  used  only  the  fact 
that  F lies  along  r,  so  that  r X F = 0.  We  never  invoked  the  inverse-square 
dependence  of  the  gravitational  force  on  distance.  Kepler’s  second  law  there- 
fore holds  true  for  any  central  force,  attractive  or  repulsive. 
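
The statement that equal areas are swept out in equal times for any central force can be checked numerically. The Python sketch below is illustrative only: the force law (an attractive force of magnitude k/r per unit mass, deliberately not inverse-square), the constant `k`, the starting values, and the function name `swept_areas` are all assumptions introduced here, and the simple stepwise integration is not part of the derivation above. Each triangular area is evaluated as one-half of |r x ds|, as in the text.

```python
def swept_areas(x, y, vx, vy, dt, steps, k=1.0):
    """Step a body under an attractive central force of magnitude k/r
    (per unit mass) and return the area (1/2)|r x ds| swept in each step."""
    areas = []
    # Start-up: advance the velocity by half a step (assumed stepping scheme).
    r2 = x * x + y * y
    vx += -k * x / r2 * dt / 2
    vy += -k * y / r2 * dt / 2
    for _ in range(steps):
        dx, dy = vx * dt, vy * dt                  # displacement ds in this step
        areas.append(0.5 * abs(x * dy - y * dx))   # (1/2)|r x ds|
        x, y = x + dx, y + dy
        r2 = x * x + y * y
        vx += -k * x / r2 * dt                     # acceleration lies along -r
        vy += -k * y / r2 * dt
    return areas

areas = swept_areas(x=1.0, y=0.0, vx=0.3, vy=0.8, dt=0.01, steps=2000)
spread = (max(areas) - min(areas)) / areas[0]
print(f"area per step = {areas[0]:.6f}, relative spread = {spread:.1e}")
```

In this sketch the areas agree from step to step to within rounding error even though the orbit is far from circular and the force is not inverse-square, which is the content of Eq. (11-15).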

Table 11-1

Planet     Semimajor axis y   Period T   y^3/T^2
           (in 10^9 m)        (in yr)    (in 10^18 m^3/s^2)
Mercury        57.90            0.2408     3.361
Venus         108.2             0.6152     3.357
Earth         149.6             1          3.362
Mars          228.0             1.881      3.363
Jupiter       778.4            11.86       3.366
Saturn       1427              29.46       3.362
Uranus       2869              84.01       3.361
Neptune      4497             164.8        3.363
Pluto        5900             248.4        3.343

We now derive Kepler’s third law from Newton’s laws for the special
case of circular orbits. Consider a planet of mass $m$ in a circular orbit about
the sun, whose mass is $M$. In this case, the radius $R$ of the orbit is equal to
the semimajor axis $y$, and can be substituted for it. Combining Newton's
second law of motion and his law of gravitation, we have for the magnitude
$a$ of the acceleration of the planet

\[ a = \frac{F}{m} = G\,\frac{M}{R^2} \]


But according to Eq. (11-2), $a = 4\pi^2 R/T^2$, where $T$ is the sidereal period of
the planet. So we have

\[ \frac{4\pi^2 R}{T^2} = G\,\frac{M}{R^2} \tag{11-16} \]

Rearranging terms yields the result

\[ \frac{R^3}{T^2} = \frac{GM}{4\pi^2} \tag{11-17} \]


This satisfies Kepler’s third law, since the terms on the right side are all
constants for the solar system. Table 11-1 gives the length $y$ of the
semimajor axis, the period $T$, and the value of $y^3/T^2$ for all the known planets.
Except for Pluto, Mercury, and Mars, the planets all have orbits which are
quite close to circular. Thus the derivation for circular orbits applies quite
closely to them. However, Kepler's third law is true for all elliptical orbits,
provided that $y^3/T^2$ is used in lieu of $R^3/T^2$. We show this by numerical
calculation in Sec. 11-5.
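
The last column of Table 11-1 can be reproduced directly from the first two. The following Python sketch does so; the names `TABLE_11_1`, `YEAR_S`, and `kepler_ratio` are introduced here for illustration, and the length of the year is taken as 365.26 days, the value quoted later in this chapter.

```python
YEAR_S = 365.26 * 86400.0   # seconds in one year (1 yr = 365.26 days)

# (planet, semimajor axis in 10^9 m, period in yr), copied from Table 11-1
TABLE_11_1 = [
    ("Mercury", 57.90, 0.2408),
    ("Venus", 108.2, 0.6152),
    ("Earth", 149.6, 1.0),
    ("Mars", 228.0, 1.881),
    ("Jupiter", 778.4, 11.86),
    ("Saturn", 1427.0, 29.46),
    ("Uranus", 2869.0, 84.01),
    ("Neptune", 4497.0, 164.8),
    ("Pluto", 5900.0, 248.4),
]

def kepler_ratio(axis_1e9_m, period_yr):
    """Return y^3/T^2 in m^3/s^2 for a semimajor axis given in units of
    10^9 m and a period given in years."""
    y = axis_1e9_m * 1e9
    T = period_yr * YEAR_S
    return y**3 / T**2

ratios = [kepler_ratio(axis, period) for _, axis, period in TABLE_11_1]
for (name, _, _), ratio in zip(TABLE_11_1, ratios):
    print(f"{name:8s} {ratio / 1e18:.3f} x 10^18 m^3/s^2")
```

Small differences in the last digit relative to the printed table are to be expected, since the tabulated axes and periods are themselves rounded.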


Since  Kepler’s  laws  depend  only  on  the  existence  of  an  inverse-square 
force  law,  they  are  applicable  to  many  more  cases  than  that  of  the  sun  and 
the  planets  alone.  Within  the  solar  system,  the  validity  of  Kepler’s  laws  can 
be  demonstrated  as  well  for  any  planet  which  has  more  than  one  satellite 
(artificial  satellites  included).  As  you  will  see  in  Chap.  20,  the  laws  work 
equally  well  for  pairs  of  charged  particles,  such  as  an  electron  revolving 
about  a proton  (so  far  as  the  newtonian  visualization  of  the  system  in  plane- 
tary terms  is  applicable). 

In Example 11-5, Kepler’s third law is used to determine the mass of
the  sun. 


EXAMPLE  11-5 

Find  the  mass  of  the  sun  from  the  data  in  Table  11-1. 

■ You can rewrite Kepler’s third law, Eq. (11-17), in the form

\[ M = \frac{4\pi^2 y^3}{G T^2} \]

Using any of the values of $y^3/T^2$ given in Table 11-1, you have

\[ M = \frac{4\pi^2}{6.67 \times 10^{-11}\ \mathrm{N \cdot m^2/kg^2}} \times 3.36 \times 10^{18}\ \mathrm{m^3/s^2} = 1.99 \times 10^{30}\ \mathrm{kg} = 1.99 \times 10^{27}\ \mathrm{t} \]

Thus  the  mass  of  the  sun  is  more  than  300,000  times  greater  than  that  of  the  earth. 
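
The arithmetic of Example 11-5 can be repeated in a few lines. This is a minimal sketch; the function name `central_mass` is an assumption made here, while the numerical values are those used in the example.

```python
import math

G = 6.67e-11      # N·m^2/kg^2, value used in Example 11-5
RATIO = 3.36e18   # y^3/T^2 in m^3/s^2, typical entry of Table 11-1

def central_mass(ratio, G=G):
    """Mass of the central body from Eq. (11-17): M = 4*pi^2*(y^3/T^2)/G."""
    return 4.0 * math.pi**2 * ratio / G

M_sun = central_mass(RATIO)
print(f"M = {M_sun:.3g} kg")   # prints: M = 1.99e+30 kg
```

The same function applies to any central body once $y^3/T^2$ has been measured for one of its satellites.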


The method of Example 11-5 can be used to find the mass of any body having a
satellite  whose  period  and  orbit  radius  can  be  determined.  For  this  reason,  the 
masses  of  such  distant  planets  as  Jupiter,  which  has  several  moons,  have  been 
known  with  considerable  accuracy  for  centuries,  while  the  mass  of  the  nearby  and 
highly  visible  but  moonless  planet  Venus  was  known  only  poorly  until  very  re- 
cently. In  the  past  few  years,  a number  of  artificial  satellites  have  been  flown  past 
Venus.  Even  in  an  open,  fly-by  orbit,  a satellite  yields  information  as  to  the 
planet’s  mass,  by  means  of  an  extension  of  the  above  theory  to  noncircular  orbits. 
However,  the  repetitive  nature  of  a closed  orbit,  such  as  the  orbits  of  the  vehicles 
used  to  launch  the  Venus  landers,  allows  it  to  yield  much  more  precise  informa- 
tion. 

Jupiter  is,  by  a considerable  margin,  the  most  massive  planet  in  the  solar 
system. Its mass is $1.900 \times 10^{27}$ kg, or about 318 times that of the earth. More
significantly, its mass is $9.55 \times 10^{-4}$, or about 0.1 percent, that of the sun. Since all
the  other  eight  planets  are  smaller  than  Jupiter,  the  sun  contains  practically  all  the 
mass  in  the  solar  system. 


11-4  REDUCED  MASS 


In  describing  the  motion  of  a body  such  as  a planet  about  a force  center,  it 
is  very  useful  to  make  the  assumption  that  the  force  center  is  itself  fixed  in 
an  inertial  frame  of  reference.  It  then  becomes  possible  to  apply  Newton's 
laws  of  motion  directly  to  observations  made  with  respect  to  the  force 
center,  without  the  need  for  dealing  with  fictitious  forces.  Assuming  that 
the force center is fixed in an inertial frame is tantamount to making the
approximation that the body whose motion is being studied has negligible
mass compared to the central body. While this approximation can never be
exactly true, it is not a bad one for the solar system. For instance, compare
the mass $m_e$ of the earth obtained in Example 11-4 with the mass $M$ of the
sun obtained in Example 11-5. The ratio of the two masses is

\[ \frac{m_e}{M} = \frac{5.99 \times 10^{24}\ \mathrm{kg}}{1.99 \times 10^{30}\ \mathrm{kg}} = 3.01 \times 10^{-6} \]


However, observations made on the solar system are so accurate that
the approximation cannot be made if full advantage is to be taken of their
accuracy. Moreover, there are many systems (the hydrogen atom at one
end of the size scale and double star systems at the other) which resemble
the solar system in many ways but where the approximation is not at all
justified.


We  now  develop  a method  which  makes  the  approximation  unneces- 
sary. Consider  two  bodies  of  arbitrary  mass,  each  under  the  gravitational 
influence  of  the  other  but  otherwise  isolated.  They  are  revolving  about 
each  other.  Each  follows  an  orbit  which  is  most  easily  analyzed  from  the 
point  of  view  of  an  observer  located  at  the  center  of  mass  of  the  system, 
since that observer is in an inertial frame. This is because there is no
external force applied to the isolated system; see Eq. (9-50a). The problem of
analyzing  this  system  is  called  the  two-body  problem.  It  would  seem  at  first 
glance  to  be  at  least  twice  as  complicated  as  the  one-body  problem,  in  which 
one  of  the  bodies  is  stationary  in  an  inertial  frame.  However,  we  will  now 
derive  a very  direct  method  of  reducing  the  two-body  problem  to  a one- 
body  problem. 

Each of the two bodies 1 and 2, having masses $m_1$ and $m_2$, respectively,
exerts some kind of force (it need not be gravitational or even
inverse-square in nature) on the other. We must make the restriction, however,
that both forces are central forces. If the force exerted on body 1 is $\mathbf{F}$, we
can write

\[ m_1 \mathbf{a}_1 = \mathbf{F} \]

Here $\mathbf{a}_1$ is the acceleration experienced by body 1, as seen by an observer in
an inertial frame, say at the center of mass. According to Newton’s third
law, there must be an equal and opposite force $-\mathbf{F}$ exerted on body 2, so
that

\[ m_2 \mathbf{a}_2 = -\mathbf{F} \]

where $\mathbf{a}_2$ is the acceleration of body 2, as seen by the observer in the inertial
frame. If we now multiply the first of these equations by $m_2$ and the second
one by $m_1$, we obtain the pair of equations

\[ m_1 m_2 \mathbf{a}_1 = \mathbf{F} m_2 \]

and

\[ m_1 m_2 \mathbf{a}_2 = -\mathbf{F} m_1 \]


Subtracting the second of these equations from the first yields

\[ m_1 m_2 (\mathbf{a}_1 - \mathbf{a}_2) = \mathbf{F}(m_1 + m_2) \]

or

\[ \mathbf{F} = \frac{m_1 m_2}{m_1 + m_2}\,(\mathbf{a}_1 - \mathbf{a}_2) \tag{11-18} \]

The quantity $\mathbf{a}_1 - \mathbf{a}_2$ is the acceleration of body 1 as seen by an observer on
body 2. We call this quantity the relative acceleration $\mathbf{a}$. That is,

\[ \mathbf{a} = \mathbf{a}_1 - \mathbf{a}_2 \tag{11-19} \]

[This definition may be compared with Eq. (5-23c), $\mathbf{a}' = \mathbf{a} - \mathbf{A}$, derived in
connection with the fictitious-force approach of Sec. 5-4.]

The fraction $m_1 m_2/(m_1 + m_2)$ in Eq. (11-18) has the dimensions of a
mass and is called the reduced mass. It is represented by the symbol $\mu$,
which is defined by the expression

\[ \mu = \frac{m_1 m_2}{m_1 + m_2} \tag{11-20a} \]

Written in terms of $\mathbf{a}$ and $\mu$, Eq. (11-18) becomes

\[ \mathbf{F} = \mu\mathbf{a} \tag{11-20b} \]

This has the form of Newton’s second law of motion.


Equation (11-20b) is not really Newton’s second law, strictly speaking.
The force $\mathbf{F}$ is indeed the force exerted on body 1 by body 2. But the
acceleration $\mathbf{a}$, being that of body 1 as observed by an observer on body 2, is not
an acceleration relative to an inertial frame. Thus the observer cannot
directly apply Newton’s second law without adding an appropriate
fictitious force (see Sec. 5-4).

However, the use of the reduced mass eliminates the need for the
fictitious force. In Eq. (11-20b), the reduced mass is substituted for the
actual mass of body 1, while body 2 is assigned an infinite mass and thus a
fixed position in an inertial frame in which an observer has no need for
fictitious forces. Equations (11-20a) and (11-20b) show that this substitution is
valid because the motion of body 1 relative to body 2 is exactly the same as
the motion of the hypothetical body 1, with reduced mass $\mu$, relative to the
hypothetical body 2, which has infinite mass and is therefore fixed. That is,
for the actual situation in which two bodies orbit about their common
center of mass, we have substituted a single hypothetical body of mass $\mu$ in
orbit about a fixed point in an inertial frame. We have thus reduced the
two-body problem to the one-body problem, which we have already solved for
circular orbits and will discuss further for other cases below.

The dependence of $\mu$ on the ratio $m_1/m_2$ is depicted in Fig. 11-11. Note
that $\mu$ is in fact “reduced”; $\mu/m_2$ is always less than $m_1/m_2$. That is, assigning
an infinite mass to body 2 requires a reduction in the mass assigned to
body 1.
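
The behavior of the reduced mass as the mass ratio varies can be tabulated with a short calculation. The sketch below is illustrative; the function name `reduced_mass` and the sample ratios are choices made here, not taken from the text.

```python
def reduced_mass(m1, m2):
    """Reduced mass mu = m1*m2/(m1 + m2), Eq. (11-20a)."""
    return m1 * m2 / (m1 + m2)

m2 = 1.0
for ratio in (0.01, 0.1, 1.0, 10.0, 100.0):
    m1 = ratio * m2
    mu = reduced_mass(m1, m2)
    print(f"m1/m2 = {ratio:6.2f}   mu/m2 = {mu / m2:.4f}")
```

For small $m_1/m_2$ the reduced mass is nearly $m_1$; for large $m_1/m_2$ it approaches $m_2$. In every case it is smaller than either mass.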

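Once the relative acceleration $\mathbf{a}$ has been found for the hypothetical
particle of mass $\mu$, it is possible to transform back to the original two-body
system. For body 1, for example, we have $\mathbf{F} = m_1\mathbf{a}_1$. Substituting this value
of $\mathbf{F}$ into Eq. (11-20b) gives

\[ m_1\mathbf{a}_1 = \mu\mathbf{a} = \frac{m_1 m_2}{m_1 + m_2}\,\mathbf{a} \]

or

\[ \mathbf{a}_1 = \frac{m_2}{m_1 + m_2}\,\mathbf{a} \tag{11-21} \]

The acceleration $\mathbf{a}_2$ can be found in terms of $\mathbf{a}$ in the same way. It is

\[ \mathbf{a}_2 = -\frac{m_1}{m_1 + m_2}\,\mathbf{a} \tag{11-22} \]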

Equations (11-21) and (11-22) can be integrated to give the following
transformation equations for the velocity and position vectors:

\[ \mathbf{v}_1 = \frac{m_2}{m_1 + m_2}\,\mathbf{v} \quad\text{and}\quad \mathbf{v}_2 = -\frac{m_1}{m_1 + m_2}\,\mathbf{v} \tag{11-23} \]

\[ \mathbf{r}_1 = \frac{m_2}{m_1 + m_2}\,\mathbf{r} \quad\text{and}\quad \mathbf{r}_2 = -\frac{m_1}{m_1 + m_2}\,\mathbf{r} \tag{11-24} \]

To verify the correctness of those integrations, differentiate Eqs. (11-24)
once to obtain Eqs. (11-23), and again to obtain Eqs. (11-21) and (11-22). In
the actual two-body system, the velocity of body 1 relative to an observer on
body 2 is given by the vector difference

\[ \mathbf{v} = \mathbf{v}_1 - \mathbf{v}_2 \]

which is precisely the difference found by subtracting the second of Eqs.
(11-23) from the first. Thus the velocity of the hypothetical body of mass $\mu$
about the force center, located in the fixed hypothetical body of infinite
mass, is the same as the velocity of the actual body 1 relative to body 2.
Likewise, the vector difference

\[ \mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2 \]

is the vector from the force center to the body of mass $\mu$, and is also the
vector from body 2 to body 1 in the actual system.

The relation between the actual and reduced-mass systems is
illustrated in Fig. 11-12.

Fig. 11-12 The relation between (a) a two-body system and (b) the
reduced-mass system equivalent to it. While the trajectories are shown as
circles, they can be of any shape whatever. The only requirement is that the
force exerted on each body be directed toward (or away from) the other. Note
that the total distance between bodies 1 and 2 is the sum of their distances
from the center of mass. It is equal to the distance between the hypothetical
body of reduced mass $\mu$ and the fixed center about which it moves.


EXAMPLE 11-6

Find the location of the center of rotation of the earth-moon system, relative to the
center of the earth. Then evaluate $\mu$ for the earth-moon system. The mass of the
earth is 81 times that of the moon.


■ Using the second of Eqs. (11-24), you have

\[ \frac{r_2}{r} = \frac{m_m}{m_e + m_m} = \frac{m_m}{82\,m_m} = \frac{1}{82} \]

so that the distance $r_2$ from the center of the earth to the center of rotation is 1/82 of
the distance to the moon. Its numerical value is

\[ r_2 = \frac{3.84 \times 10^5\ \mathrm{km}}{82} = 4700\ \mathrm{km} \]

Since the radius of the earth is 6370 km, the center of rotation is inside the earth.
The value of $\mu$ is

\[ \mu = \frac{m_m \times 81\,m_m}{m_m + 81\,m_m} = \frac{81}{82}\,m_m = \frac{1}{82}\,m_e \]

So $\mu$ is reduced from the value of the moon's mass $m_m = \frac{1}{81}\,m_e$.
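
The numbers in Example 11-6 can be checked with the magnitudes of Eqs. (11-24). In the sketch below the function name `split_about_center_of_mass` is hypothetical, masses are expressed in units of the moon's mass, and the mass ratio and separation are the values given in the example.

```python
def split_about_center_of_mass(m1, m2, r):
    """Distances of bodies 1 and 2 from the center of mass, from the
    magnitudes of Eqs. (11-24), for a separation r between the bodies."""
    total = m1 + m2
    return m2 / total * r, m1 / total * r

m_moon = 1.0               # work in units of the moon's mass
m_earth = 81.0 * m_moon    # mass ratio given in Example 11-6
separation_km = 3.84e5     # earth-moon distance used in Example 11-6

# Body 1 is the moon and body 2 is the earth, as in the example.
r_moon, r_earth = split_about_center_of_mass(m_moon, m_earth, separation_km)
mu = m_moon * m_earth / (m_moon + m_earth)

print(f"earth center to center of rotation: {r_earth:.0f} km")
print(f"reduced mass: {mu:.4f} moon masses")
```

The first result is about 4683 km, which the example rounds to 4700 km; the second is 81/82 of the moon's mass.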


11-5  THE  MECHANICS  OF  ORBITS:  NUMERICAL  TREATMENT


Most of what has been done so far has been restricted to circular orbits, for
which the mathematical analysis is relatively simple. We now tackle the
problem of motion in orbits of any possible shape. The first task is to set up
the general equations of motion for a central force of arbitrary form. Then
we will solve these equations numerically for a variety of important special
cases.

Figure 11-13a depicts a body of mass $m$ moving under the influence of
a central force arising from the presence of the central body of much larger
mass $M$, which is taken to be fixed at the origin. (Even if $M$ is not very much
larger than $m$, the reduced-mass approach of Sec. 11-4 can be used to
transform the problem into the one discussed here.) The central force $\mathbf{F}$ is
always directed toward the origin, although its magnitude will vary as the
magnitude of the position vector $\mathbf{R}$ varies.

The path of the moving body lies in a plane. To see this, consider an instant
when the velocity of the body is $\mathbf{v}$, so that its momentum is $\mathbf{p} = m\mathbf{v}$. The vector $\mathbf{p}$
and the force center determine a plane. (It is the plane depicted in Fig. 11-13a.)
The position vector $\mathbf{R}$ lies in this plane. And since a central force must always be
parallel or antiparallel to the position vector, $\mathbf{F}$ also lies in the plane. But $\mathbf{F}$ is
related to the rate of change of momentum by Newton’s second law, $\mathbf{F} = d\mathbf{p}/dt$.
After an infinitesimal time interval $dt$, the new momentum $\mathbf{p}'$ is given by the
vector sum $\mathbf{p}' = \mathbf{p} + (d\mathbf{p}/dt)\,dt$. Since $\mathbf{p}$ and $d\mathbf{p}/dt$ lie in the same plane, so do
the new momentum $\mathbf{p}'$, the new velocity $\mathbf{v}' = \mathbf{p}'/m$, and the new position
$\mathbf{R}' = \mathbf{R} + \mathbf{v}\,dt$.


We begin the development of the equations of motion by writing
Newton’s second law. Equating the acceleration of the body to $\mathbf{F}/m$ gives

\[ \frac{d^2\mathbf{R}}{dt^2} = \frac{\mathbf{F}}{m} \tag{11-25} \]

We take the plane of the orbit to be the xy plane. Writing Eq. (11-25) in
component form, we have the pair of equations

\[ \frac{d^2x}{dt^2} = \frac{F_x}{m} \tag{11-26a} \]

and

\[ \frac{d^2y}{dt^2} = \frac{F_y}{m} \tag{11-26b} \]

Fig. 11-13 A body of mass $m$ moves in an unspecified orbit under the influence of a central
force $\mathbf{F}$. In the text, equations of motion are developed in a component form suitable for
numerical treatment. Note that $\cos\phi = -\cos\theta = -x/(x^2 + y^2)^{1/2}$ and
$\sin\phi = -\sin\theta = -y/(x^2 + y^2)^{1/2}$.


The force components are shown in Fig. 11-13b, where the angle $\phi$
specifies the direction of $\mathbf{F}$ with respect to the positive x direction. The
quantities $\cos\phi$ and $\sin\phi$ are evaluated in the figure and its caption. Using
these values, we can write

\[ F_x = F\cos\phi = -F\,\frac{x}{(x^2 + y^2)^{1/2}} \]

and

\[ F_y = F\sin\phi = -F\,\frac{y}{(x^2 + y^2)^{1/2}} \]

Can you explain why the negative sign appears in these equations,
regardless of the location of the moving body with respect to the origin?

Inserting the above expressions for the force components into Eqs.
(11-26a) and (11-26b), we obtain

\[ \frac{d^2x}{dt^2} = -\frac{F}{m}\,\frac{x}{(x^2 + y^2)^{1/2}} \tag{11-27a} \]

and

\[ \frac{d^2y}{dt^2} = -\frac{F}{m}\,\frac{y}{(x^2 + y^2)^{1/2}} \tag{11-27b} \]


To be specific, let us assume that $\mathbf{F}$ is a gravitational force. According to Eq.
(11-6a), the magnitude of the gravitational force is given by $F = GMm/R^2$.
Since $R^2 = x^2 + y^2$, that equation can be written

\[ F = \frac{GMm}{x^2 + y^2} \]

Substituting this value of $F$ into Eqs. (11-27a) and (11-27b) leads to the
following form for the equations of motion:

\[ \frac{d^2x}{dt^2} = -\frac{GM\,x}{(x^2 + y^2)^{3/2}} \tag{11-28a} \]

and

\[ \frac{d^2y}{dt^2} = -\frac{GM\,y}{(x^2 + y^2)^{3/2}} \tag{11-28b} \]


This is the pair of differential equations which must be solved in order to
describe the motion of the body of mass $m$. As noted in Sec. 11-3, analytical
solutions do exist; however, a complete discussion involves mathematical
techniques beyond the level of this book. But we can do just as well, as far as
a physical picture of the motion is concerned, by means of numerical
solutions which involve only a slight extension of the procedure developed in
Sec. 6-4. What is equally significant, with almost no extra effort the
procedure can be extended to deal with generalizations of Eqs. (11-28a) and
(11-28b), for which no analytical solution exists. In particular, we will use
numerical solutions to study force laws that deviate from the inverse-square
form. We will also treat a variety of perturbations, such as that
produced on the motion of the earth by the small but nonnegligible gravitational
attraction of Jupiter.

To allow for subsequent generalization, it is most convenient to make a
change in the notation of Eqs. (11-28a) and (11-28b). We define

\[ \alpha = GM \]

and give the name $\beta$ to the exponent of the term $x^2 + y^2$. (In the case of an
inverse-square force such as gravitation, we have $\beta = -\tfrac{3}{2}$.) Also, we use the
notation introduced in Eq. (6-14), in connection with the general method
for obtaining solutions for second-order differential equations. We write

\[ \frac{d^2x}{dt^2} = Q_x \tag{11-29a} \]

and

\[ \frac{d^2y}{dt^2} = Q_y \tag{11-29b} \]

where

\[ Q_x = -\alpha x\,(x^2 + y^2)^{\beta} \tag{11-30a} \]

and

\[ Q_y = -\alpha y\,(x^2 + y^2)^{\beta} \tag{11-30b} \]


Equations (11-27a) and (11-27b) or (11-28a) and (11-28b) are called
coupled differential equations. They are “coupled” from a mathematical
point of view by the fact that each differential equation of the pair contains
both dependent variables. Physically, the acceleration $d^2x/dt^2$ in the x
direction produces a change in $x^2 + y^2$ and thus a change in $F_y$. This changes
the acceleration $d^2y/dt^2$ in the y direction, which influences the change in
$x^2 + y^2$ and therefore $F_x$, and so forth. Thus the x and y motions are
coupled in a physical sense.


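Equations (11-29a), (11-29b), (11-30a), and (11-30b) are used in the
same way that you used Eq. (6-14). The procedure is completely parallel to
that described in Sec. 6-4. To be specific, this is what you do. Always
employing Eqs. (11-30a) and (11-30b) to express $Q_x$ and $Q_y$ in terms of the
quantities on which they depend, you select a small time increment $\Delta t$ and
then carry out the following double set of calculations:

Determine the values of $Q_x$ and $Q_y$ from the values at $t_0 = 0$ of the
quantities on which they depend, and use them to calculate

\[ \left(\frac{dx}{dt}\right)_{1/2} = \left(\frac{dx}{dt}\right)_0 + Q_x\,\frac{\Delta t}{2} \quad\text{and}\quad \left(\frac{dy}{dt}\right)_{1/2} = \left(\frac{dy}{dt}\right)_0 + Q_y\,\frac{\Delta t}{2} \tag{11-31a} \]

from the given values of $(dx/dt)_0$ and $(dy/dt)_0$. Use the results to calculate

\[ x_1 = x_0 + \left(\frac{dx}{dt}\right)_{1/2} \Delta t \quad\text{and}\quad y_1 = y_0 + \left(\frac{dy}{dt}\right)_{1/2} \Delta t \tag{11-31b} \]

from the given values of $x_0$ and $y_0$. Then set $t_1 = \Delta t$.

Next determine the values of $Q_x$ and $Q_y$ from the new values of the
quantities on which they depend. Use them, and the values of $(dx/dt)_{1/2}$ and
$(dy/dt)_{1/2}$ just obtained, to calculate

\[ \left(\frac{dx}{dt}\right)_{3/2} = \left(\frac{dx}{dt}\right)_{1/2} + Q_x\,\Delta t \quad\text{and}\quad \left(\frac{dy}{dt}\right)_{3/2} = \left(\frac{dy}{dt}\right)_{1/2} + Q_y\,\Delta t \tag{11-31c} \]

Use the results, and the values of $x_1$ and $y_1$ just obtained, to calculate

\[ x_2 = x_1 + \left(\frac{dx}{dt}\right)_{3/2} \Delta t \quad\text{and}\quad y_2 = y_1 + \left(\frac{dy}{dt}\right)_{3/2} \Delta t \tag{11-31d} \]

Then set $t_2 = 2\Delta t$.

Next determine the values of $Q_x$ and $Q_y$ from the new values of the
quantities on which they depend. Use them, and the values of $(dx/dt)_{3/2}$ and
$(dy/dt)_{3/2}$ just obtained, to calculate

\[ \left(\frac{dx}{dt}\right)_{5/2} = \left(\frac{dx}{dt}\right)_{3/2} + Q_x\,\Delta t \quad\text{and}\quad \left(\frac{dy}{dt}\right)_{5/2} = \left(\frac{dy}{dt}\right)_{3/2} + Q_y\,\Delta t \tag{11-31e} \]

Use the results, and the values of $x_2$ and $y_2$ just obtained, to calculate

\[ x_3 = x_2 + \left(\frac{dx}{dt}\right)_{5/2} \Delta t \quad\text{and}\quad y_3 = y_2 + \left(\frac{dy}{dt}\right)_{5/2} \Delta t \tag{11-31f} \]

Then set $t_3 = 3\Delta t$.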

Continue  these  calculations  until  t reaches  whatever  value  is  required. 
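
The stepping procedure just described can be sketched in Python as follows. This is not the central-force program of the Numerical Calculation Supplement, which is not reproduced here; the function name `central_force_orbit`, its argument list, and the trial values in the check at the end are assumptions made for illustration. The sketch also assumes that the start-up step advances the velocity by half a time increment, in the manner of Sec. 6-4.

```python
import math

def central_force_orbit(x, y, vx, vy, dt, alpha, beta, steps):
    """Follow the half-step procedure of Eqs. (11-31).

    The accelerations are Qx = -alpha*x*(x^2 + y^2)**beta and
    Qy = -alpha*y*(x^2 + y^2)**beta, as in Eqs. (11-30a) and (11-30b).
    Returns the list of positions [(x0, y0), (x1, y1), ...].
    """
    def q(x, y):
        s = -alpha * (x * x + y * y) ** beta
        return s * x, s * y

    path = [(x, y)]
    qx, qy = q(x, y)
    vx += qx * dt / 2          # velocity components at t = dt/2
    vy += qy * dt / 2
    for _ in range(steps):
        x += vx * dt           # new position from the mid-interval velocity
        y += vy * dt
        path.append((x, y))
        qx, qy = q(x, y)       # re-evaluate Q at the new position
        vx += qx * dt          # advance the velocity by a full step
        vy += qy * dt
    return path

# Trial run (assumed values): unit radius, unit period, alpha = 4*pi^2,
# which should give a nearly circular orbit that closes after 52 steps.
path = central_force_orbit(1.0, 0.0, 0.0, 2 * math.pi, 1 / 52,
                           alpha=4 * math.pi**2, beta=-1.5, steps=52)
x_end, y_end = path[-1]
print(f"after 52 steps: x = {x_end:.3f}, y = {y_end:.3f}")
```

Because the velocity is always evaluated midway between two positions, each position update uses a velocity that is representative of the whole interval, which is the reason for the half-step start.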

The program required to carry out the numerical calculations
described by Eqs. (11-31), called the central-force program, is listed in the
Numerical Calculation Supplement. The remainder of this section is
devoted to examples which carry out the calculations in a number of important
cases. We begin with Example 11-7, which studies the motion of a planet
very much like the earth whose orbit is, however, perfectly circular.

The motion of this earthlike planet is governed by Eqs. (11-28a) and
(11-28b). Hence the value of $Q_x$ given by Eq. (11-30a) and that of $Q_y$ given
by Eq. (11-30b) must be determined by setting $\alpha = GM$, where $M$ is the
mass of the sun, and by setting $\beta = -\tfrac{3}{2}$.




EXAMPLE  11-7 


Run  the  central-force  program  with  the  following  set  of  initial  conditions  and 
parameters.  (The  units  used  are  discussed  immediately  below.) 

$x_0 = 1$ (in AU); $(dx/dt)_0 = 0$; $y_0 = 0$; $(dy/dt)_0 = 2\pi$ (in AU/yr); $t_0 = 0$; $\Delta t = 1/52$
(in yr); $\alpha = 39.5$ [in (AU)$^3$/(yr)$^2$]; $\beta = -1.5$.

Since $y_0 = 0$, the initial conditions represent the earth on the x axis, and you
must set $x_0$ equal to the radius of the earth’s orbit. The results are most transparent
if you do as astronomers usually do, measuring the time in years and the distance
from the sun in astronomical units (AU). The astronomical unit is defined to be the
length of the semimajor axis of the earth’s orbit. That is, you set $x_0 = 1$ AU, where

\[ 1\ \mathrm{AU} = 149.6 \times 10^9\ \mathrm{m} \]

Since the orbit is to be circular, the earth must cross the x axis at right angles.
Therefore the initial velocity must have no x component, and so $(dx/dt)_0 = 0$. This is the
reason for the above choice of the initial condition for $(dx/dt)_0$. Furthermore, since
the speed will be constant in a circular orbit, you must have $(dy/dt)_0$ equal to the speed
of an object which completes an orbit of radius 1 AU, going a distance $2\pi$ AU, in 1
yr. This leads to the initial condition $(dy/dt)_0 = 2\pi$ AU/yr.

In such a numerical calculation, the time increment $\Delta t$ must be chosen small
enough that the orbit segments are reasonably well approximated by straight lines.
(Can you explain why?) Equally important, the gravitational force over any one
segment must be reasonably well approximated by a constant value. The choice
$\Delta t = \tfrac{1}{52}$ yr makes each segment represent an elapsed time of a trifle more than 1 week
(1 yr = 365.26 days = 52.180 weeks).

Fig. 11-14 Results of the numerical orbit calculations carried out in Examples 11-7,
11-8, and 11-9. The dots represent successive positions, separated by equal time
intervals, of an “idealized earth,” which moves in a circular orbit at a distance of 1 AU
from the sun. The x’s represent successive positions of a planet which is initially at the
same position ($x_0 = 1$ AU, $y_0 = 0$) and is moving in the same direction, but with a
speed 5 percent greater than that which leads to a circular orbit. The resulting orbit
is an ellipse with its aphelion at $x = -1.23$ AU, $y = 0$. The variation of the speed of the
planet in this orbit is evidenced by the variation in the distances between adjacent
points. The crosses (+) represent successive positions of a planet which is initially at the
same position and moving in the same direction, but with a speed 50 percent less than
that which leads to a circular orbit. The resulting orbit is an ellipse with its perihelion
at $x = -0.15$ AU, $y = 0$. In this case the noncircular character of the orbit is quite
apparent, and the perihelion and aphelion distances markedly different from each
other. The variation in orbital speed is likewise very marked. As discussed in the text,
the sum of the focal distances $f_1A$ and $Af_2$ is equal to the sum of the focal distances
$f_1B$ and $Bf_2$. Thus the highly noncircular orbit on which the two points A and B lie
is, in fact, an ellipse, and Kepler’s first law is satisfied.

The value of $\alpha = GM$ is that which is consistent with a set of units in which
length is measured in astronomical units and time in years. In converting from SI
units, you have

\[ G = 6.67 \times 10^{-11}\ \mathrm{m^3/(s^2 \cdot kg)} \times \left(\frac{1\ \mathrm{AU}}{149.6 \times 10^{9}\ \mathrm{m}}\right)^{3} \times \left(\frac{8.640 \times 10^{4}\ \mathrm{s}}{1\ \mathrm{day}}\right)^{2} \times \left(\frac{365.3\ \mathrm{days}}{1\ \mathrm{yr}}\right)^{2} = 1.98 \times 10^{-29}\ \mathrm{(AU)^3/[(yr)^2 \cdot kg]} \]

Thus you have

\[ \alpha = GM = 1.98 \times 10^{-29}\ \mathrm{(AU)^3/[(yr)^2 \cdot kg]} \times 1.99 \times 10^{30}\ \mathrm{kg} = 39.5\ \mathrm{(AU)^3/(yr)^2} \]
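
The unit conversion for $G$ and the resulting value of $\alpha$ can be verified with a few lines of Python. The constant and function names below are introduced here for illustration; the numerical values are those quoted in the conversion above and in Example 11-5.

```python
G_SI = 6.67e-11      # m^3/(s^2·kg)
AU_M = 149.6e9       # meters per astronomical unit
DAY_S = 8.640e4      # seconds per day
YEAR_DAYS = 365.3    # days per year, as used in the conversion
M_SUN = 1.99e30      # kg, from Example 11-5

def g_in_au_yr(g_si=G_SI):
    """Convert G from m^3/(s^2·kg) to (AU)^3/[(yr)^2·kg]."""
    return g_si * (1.0 / AU_M) ** 3 * DAY_S ** 2 * YEAR_DAYS ** 2

G_AU = g_in_au_yr()
alpha = G_AU * M_SUN
print(f"G = {G_AU:.3g} (AU)^3/[(yr)^2 kg]")
print(f"alpha = {alpha:.1f} (AU)^3/(yr)^2")
```

The result for $\alpha$ is also very nearly $4\pi^2 \approx 39.48$, as it must be if a body at 1 AU is to complete a circular orbit in 1 yr.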


■ The  result  of  the  calculation  is  the  orbit  represented  by  the  dots  in  Fig.  11-14. 
The  sequence  of  dots  represents  successive  positions  of  the  planet,  beginning  at 
$x_0 = 1$ AU, $y_0 = 0$ and proceeding counterclockwise. Since the time interval between
adjacent  positions  is  always  the  same,  the  graph  is  a “strobe  photo"  of  the  motion 
of  the  planet.  You  can  see  by  inspection  that  the  orbit  is  indeed  circular,  with 
the  sun  at  the  center.  The  uniform  space  between  dots  tells  you  that  the  speed 
of  the  planet  is  constant,  as  expected.  Also  as  expected,  the  52d  dot  almost 
coincides  with  the  zeroth  at  x = 1 AU,  y = 0,  so  the  period  is  1 yr. 


On the scale of Fig. 11-14, the deviation of the earth’s actual orbit from
perfect  circularity  would  be  just  visible,  with  a difference  of  about  IF  grid 
divisions  between  the  orbit  radius  at  aphelion,  where  the  earth  is  farthest 
from  the  sun,  and  the  radius  at  perihelion,  where  it  is  closest. 

A noncircular orbit is considered in Example 11-8. In it, the initial
conditions of Example 11-7 are modified by increasing the initial tangential
speed by 5 percent.


EXAMPLE  11-8 

Run  the  central-force  program  with  the  following  set  of  initial  conditions  and 
parameters. 

$x_0 = 1$ (in AU); $(dx/dt)_0 = 0$; $y_0 = 0$; $(dy/dt)_0 = 2.1\pi$ (in AU/yr); $t_0 = 0$; $\Delta t = 1/52$
(in yr); $\alpha = 39.5$ [in (AU)$^3$/(yr)$^2$]; $\beta = -1.5$.

■ The result of the calculation is the orbit represented by x’s in Fig. 11-14. Now
the  gravitational  force  at  x0  = 1 AU  is  not  strong  enough  to  constrain  the  planet  to 
an  orbit  of  constant  radius,  because  of  its  larger  initial  momentum.  So  it  moves 
along  a curve  of  smaller  curvature,  to  a greater  distance  from  the  sun.  To  be  more 
specific,  the  force  F = dp/dt  acting  on  the  planet  at  its  initial  position  changes  its 
momentum  by  the  same  amount  per  unit  time  as  in  Example  11-7.  However,  the 
initial  momentum  is  greater  in  magnitude,  so  that  the  change  per  unit  time  is  not 
great  enough  to  bend  the  trajectory  into  a circular  orbit.  A somewhat  different 
explanation  of  why  the  planet  starts  moving  outside  the  circular  orbit  is  given  in  Ex- 
ample 11-9. 

You  can  tell  by  measuring  the  distance  between  successive  points  on  the  orbit 
that  the  planet  slows  down  as  it  moves  to  greater  distances  from  the  sun.  Since  the 
orbit, unlike the one in Example 11-7, is not everywhere perpendicular to the
position vector $\mathbf{R}$ along which the central force $\mathbf{F}$ acts, it follows that $\mathbf{F}$ has a component
parallel  to  the  path  of  the  planet.  This  component  acts  to  change  the  magnitude  of 
the  velocity,  that  is,  the  speed.  As  the  planet  comes  closer  to  the  sun  in  the  second 
half  of  its  orbit,  it  speeds  up  again.  It  reaches  the  starting  point  in  about  62.4 
“weeks,”  or  62.4/52  = 1.20  terrestrial  years. 
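
A run of this kind can be sketched in Python to find how far from the sun the planet travels. The code below is not the central-force program itself; the function name `orbit_radii`, the number of steps, and the half-step start are assumptions made here, while the initial speed (5 percent above $2\pi$ AU/yr), the weekly time increment, and the values of $\alpha$ and $\beta$ follow the description of Example 11-8.

```python
import math

def orbit_radii(vy0, dt=1 / 52, alpha=39.5, beta=-1.5, steps=70):
    """Step the orbit from x0 = 1 AU, y0 = 0 with initial velocity (0, vy0)
    and return the distance from the sun after each step."""
    x, y, vx, vy = 1.0, 0.0, 0.0, vy0
    s = -alpha * (x * x + y * y) ** beta
    vx += s * x * dt / 2           # half-step start for the velocity
    vy += s * y * dt / 2
    radii = []
    for _ in range(steps):
        x += vx * dt
        y += vy * dt
        radii.append(math.hypot(x, y))
        s = -alpha * (x * x + y * y) ** beta
        vx += s * x * dt
        vy += s * y * dt
    return radii

radii = orbit_radii(vy0=1.05 * 2 * math.pi)
print(f"greatest distance from the sun: {max(radii):.2f} AU")
print(f"least distance from the sun:    {min(radii):.2f} AU")
```

The greatest distance found this way can be compared with the aphelion of the orbit plotted with x's in Fig. 11-14, and the spacing of successive radii shows the slowing of the planet as it recedes from the sun.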


Although  the  orbit  of  Example  11-8  is  an  ellipse,  you  cannot  tell  this  by 
simple  visual  inspection  of  the  graph.  On  the  basis  of  such  inspection,  you  might 
easily conclude that the orbit is a circle whose center lies at $x = -0.115$ AU, $y = 0$.
The shape of this orbit is very close to that of the orbit of Mars (though the size
is only about two-thirds that of the orbit of Mars). Mars has the most noncircular
orbit of the planets readily observable with the naked eye, and it was largely for that
reason  that  Kepler  chose  Mars  for  his  painstaking  empirical  study.  You  can  see 
why  highly  precise  observations  are  necessary  to  determine  the  shape  of  a plane- 
tary orbit  in  a unique  manner,  and  why  it  would  be  easy  to  guess  at  some  shape 
other  than  an  ellipse.  Indeed,  Kepler  initially  believed  that  the  orbit  was  an  oval, 
that  is,  a curve  composed  of  segments  of  four  circles.  But  computations  based  on 
the  oval  turned  out  to  be  prohibitively  difficult  for  a long  series  of  trial-and-error 
calculations,  and  Kepler  turned  to  the  ellipse  as  a convenient  approximation. 
Kepler  intended,  once  he  had  determined  the  best-fitting  approximate  ellipse,  to 
make  final  calculations  to  determine  the  “exact”  oval.  Imagine  his  surprise  when 
he  found  that  the  ellipse  fitted  perfectly,  within  the  limits  of  observational  error! 

In Example 11-9 we investigate the result of reducing the initial speed
of the planet substantially, to one-half that required for a circular orbit at
1 AU.


EXAMPLE  11-9 

Run  the  central-force  program  with  the  following  set  of  initial  conditions  and 
parameters. 

$x_0 = 1$ (in AU); $(dx/dt)_0 = 0$; $y_0 = 0$; $(dy/dt)_0 = \pi$ (in AU/yr); $t_0 = 0$; $\Delta t = 1/520$ (in yr);
$\alpha = 39.5$ [in (AU)$^3$/(yr)$^2$]; $\beta = -1.5$.

■ As you can see from the resulting orbit, plotted with crosses in Fig. 11-14, the
planet now “falls in” toward the sun, moving in an orbit inside the circular orbit of
radius 1 AU. This is because the planet, initially moving more slowly than in Example
11-7, must travel along a path with a greater curvature than that of the
circle through the initial position, in order for its acceleration to be equal to the
gravitational force divided by the planet’s mass. (How would you modify this explanation
to discuss the case treated in Example 11-8? How would you modify the
explanation given in Example 11-8 to discuss the case treated here?)

With the initial conditions given, the planet moves so close to the sun, and acquires
such a great speed in doing so, that the interval of one “week” used in the
previous examples is much too long to satisfy the condition of nearly constant gravitational
force through each time interval. It is therefore appropriate to choose an
interval  one-tenth  as  long.  But  only  every  tenth  point  is  plotted  in  the  figure,  in 
order  to  make  comparison  with  the  other  two  orbits  easier. 

In  this  example,  the  noncircularity  of  the  orbit  is  evident  on  inspection.  So  is  the 
variation  in  orbital  speed,  which  is  reflected  in  the  dramatic  change  in  the  distance 
between  successive  points.  This  speed  variation  is  in  qualitative  accord  with  Kepler’s 
second law. Since the angular momentum about the origin at the force center, $\mathbf{l} =
m\mathbf{R} \times \mathbf{v}$, must remain constant, the magnitude of v must increase as the magnitude
of  R decreases,  and  vice  versa. 

The  orbit  period  in  this  case  is  only  about  22.5  “weeks,”  or  0.43  yr.  There  are 
two reasons for this. Not only is the length of the orbit smaller than that of the circular
orbit discussed in Example 11-7, but also the average speed is greater even
though  the  initial  speed,  at  aphelion,  is  less. 
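
The calculation of Example 11-9 can be sketched in a few lines of Python. This is not the central-force program of the text, which is not reproduced in this section. It is a leapfrog (half-step velocity) integrator written under the assumption that the acceleration components have the form a_x = -alpha x (x^2 + y^2)^beta and a_y = -alpha y (x^2 + y^2)^beta, which reduces to an inverse-square attraction for beta = -1.5. The function name `integrate_orbit` and its return values are illustrative choices.

```python
import math

def integrate_orbit(x, y, vx, vy, dt, alpha=39.5, beta=-1.5, t_max=2.0):
    """Leapfrog integration of one orbit under an assumed central force.

    Assumed acceleration: a = -alpha * (x^2 + y^2)**beta * (x, y).
    Returns (period, closest approach, list of x*vy - y*vx values).
    """
    def accel(px, py):
        f = -alpha * (px * px + py * py) ** beta
        return f * px, f * py

    # Advance the velocity half an interval ahead of the position.
    ax, ay = accel(x, y)
    vx += 0.5 * dt * ax
    vy += 0.5 * dt * ay

    t = 0.0
    r_min = math.hypot(x, y)
    cross = [x * vy - y * vx]
    period = None
    while t < t_max:
        x_new = x + dt * vx
        y_new = y + dt * vy
        # The orbit closes when y passes from negative to positive at x > 0.
        if y < 0.0 <= y_new and x_new > 0.0:
            period = t + dt * (-y) / (y_new - y)   # linear interpolation
            break
        x, y, t = x_new, y_new, t + dt
        ax, ay = accel(x, y)
        vx += dt * ax
        vy += dt * ay
        r_min = min(r_min, math.hypot(x, y))
        cross.append(x * vy - y * vx)
    return period, r_min, cross

# Initial conditions of Example 11-9, with an interval of one-tenth "week".
period, r_min, cross = integrate_orbit(1.0, 0.0, 0.0, math.pi, dt=1.0 / 520.0)
print(f"period = {period:.3f} yr = {52 * period:.1f} weeks")
print(f"closest approach = {r_min:.3f} AU")
print(f"spread in x*vy - y*vx = {max(cross) - min(cross):.2e}")
```

The printed period and closest approach can be compared with the values quoted in the text, and the spread in x v_y - y v_x shows how nearly constant the angular momentum per unit mass remains around the orbit.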


The orbits calculated in Examples 11-7, 11-8, and 11-9 can be used to
test the validity of all three of Kepler’s laws. Table 11-2 concerns the third
law. The first two rows of the table give the semimajor axis and the period
T for each of the three orbits, measured directly from Fig. 11-14. The values
vary quite widely. Nevertheless, the corresponding values of the ratio
(semimajor axis)^3/T^2, given in the third row of the table, are equal to
1 (AU)^3/(yr)^2 within the limits of accuracy of the procedure.

The noncircular orbit of Example 11-9 makes it a useful test of Kepler’s
first and second laws. Let us verify the first law by showing that the orbit is



Table 11-2  Test of Kepler’s Third Law Using the Results of Examples 11-7, 11-8, and 11-9

Orbit of Example                               11-7     11-8      11-9
Semimajor axis (in AU)                         1.000    1.115     0.575
Orbital period T (in yr)                       1.000    62.4/52   22.5/52
(Semimajor axis)^3/T^2 [in (AU)^3/(yr)^2]      1.00     1.00      1.02

indeed an ellipse. According to the definition of an ellipse, the sum of the
distances from the two foci to any point on the ellipse is the same. The sun
lies at one focus $f_1$, whose coordinates are x = 0, y = 0. Compare with Fig.
11-8. The planet crosses the negative x axis at perihelion (the point labeled
A in Fig. 11-14) with x = -0.14 AU. By symmetry, the other focus
must be 0.14 AU from aphelion at x = 0.86 AU, y = 0, as indicated by
the point marked $f_2$ in the figure.

Let  us  take  as  test  points  the  perihelion  point  A and  the  point  labeled  B 
in  the  figure,  whose  coordinates  are  x = 0.61  AU,  y = 0.36  AU.  The  sum  of 
the  distances  from  point  A to  the  two  foci  is  found  by  reading  directly  off 
the  x axis,  and  it  is 

$$d_1 = 0.14\ \text{AU} + 1.00\ \text{AU} = 1.14\ \text{AU}$$

We can find the distances from point B to the two foci by using the
Pythagorean theorem twice. The distance to the focus $f_1$ at the origin is
$[(0.61\ \text{AU})^2 + (0.36\ \text{AU})^2]^{1/2}$, where the two numbers are the x and y coordinates
of the point B, read directly off Fig. 11-13. The distance to the
other focus $f_2$ is found from both the difference between the x coordinates
for $f_2$ and B and the difference between their y coordinates. Using these
numbers in the Pythagorean theorem gives
$[(0.86\ \text{AU} - 0.61\ \text{AU})^2 + (0.00 - 0.36\ \text{AU})^2]^{1/2}$. Thus the sum of the distances to the foci for point B
is

$$d_2 = [(0.61\ \text{AU})^2 + (0.36\ \text{AU})^2]^{1/2} + [(0.25\ \text{AU})^2 + (-0.36\ \text{AU})^2]^{1/2} = 1.14\ \text{AU}$$

Since $d_1 = d_2$ within the accuracy of the procedure, Kepler’s first law is
satisfied. 
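
The focal-distance test is easy to repeat for any point read off the graph. The short sketch below uses only the coordinates quoted in the text. The helper name `focal_distance_sum` is an illustrative choice, and the comparison is only as good as the precision with which the coordinates can be read from the figure.

```python
import math

def focal_distance_sum(point, focus_1, focus_2):
    """Sum of the distances from a point to the two foci (all in AU)."""
    d_1 = math.hypot(point[0] - focus_1[0], point[1] - focus_1[1])
    d_2 = math.hypot(point[0] - focus_2[0], point[1] - focus_2[1])
    return d_1 + d_2

f1 = (0.0, 0.0)        # focus occupied by the sun
f2 = (0.86, 0.0)       # second focus, located by symmetry
point_A = (-0.14, 0.0) # perihelion
point_B = (0.61, 0.36) # test point B

sum_A = focal_distance_sum(point_A, f1, f2)
sum_B = focal_distance_sum(point_B, f1, f2)
print(f"point A: {sum_A:.3f} AU, point B: {sum_B:.3f} AU")
```

The two sums differ by less than 0.01 AU, which is about the precision with which two-digit coordinates can be read from the graph.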

Figure  11-15  is  a copy  from  Fig.  11-14  of  the  same  orbit.  In  order  to 
verify  Kepler’s  second  law,  two  sectors  have  been  marked  out,  each  one 
swept  out  in  a time  interval  of  one  “week.”  To  measure  the  areas  swept  out, 
chords  uv  and  wz  of  the  two  sectors  are  drawn,  dividing  each  into  a triangle 
containing  most  of  the  area  and  a smaller  region  bounded  by  the  orbit  and 
the  chord.  The  area  of  the  triangles  can  be  found  by  measuring  their  bases 
and  altitudes  in  terms  of  the  unit  of  the  graph  grid.  This  can  be  done  with 
either  a ruler  or  a pair  of  dividers.  The  smaller  areas  are  best  measured  by 
counting grid squares, with estimates made of the contributions of partial
squares. The error in estimation is not too serious, since the total contribution
of the area involved is relatively small.

When this is done, the area labeled A in Fig. 11-15 is found to contain
78  grid  squares,  while  the  area  labeled  B contains  77.  Thus  in  spite  of  their 




Fig.  11-15  Test  of  Kepler’s  second  law.  The  orbit  is  the  same 
one  denoted  by  crosses  in  Fig.  11-14.  The  areas  A and  B 
are each swept out in 1/52 yr. In spite of their quite different
shapes  and  the  quite  different  speeds  of  the  planet  in  the  two 
regions  of  the  orbit,  the  two  areas  contain  78  and  77  grid  squares, 
respectively,  and  are  thus  equal  within  the  limits  of  accuracy  of 
this  method  of  area  determination. 


very  different  shapes,  the  areas  are  substantially  equal,  in  accord  with 
Kepler’s  second  law. 


Fig. 11-16 Diagram for the proof,
given in the text, that $|\mathbf{R} \times \mathbf{v}| = xv_y - yv_x$.


You can make a more accurate verification of Kepler’s second law by using
the central-force program. In the course of each cycle of calculation the velocity
components $(dx/dt)_{j+1/2} = v_x$ and $(dy/dt)_{j+1/2} = v_y$ are computed [see Eqs. (11-31)].
You can obtain these values by stopping the calculating device during any cycle
and recalling the current values from memory. You can then use them, together
with corresponding values of $x_j$ and $y_j$, the most recent position coordinates
of the planet, to calculate the magnitude of the vector product R × v.

As  seen  in  Sec.  11-3,  both  the  law  of  equal  areas  and  the  constancy  of  R x v 
are  direct  consequences  of  the  law  of  conservation  of  angular  momentum.  Hence 
if  R x v is  found  to  have  a constant  magnitude,  the  law  of  equal  areas  is  satisfied. 
If  you  compare  values  of  |R  x v|  obtained  at  different  points  on  the  orbit,  you  will 
find  them  to  be  quite  close  to  equal. 

The easiest way to calculate $|\mathbf{R} \times \mathbf{v}|$ is directly from the components of R and v.
This avoids the necessity of calculating the angle between R and v explicitly. For
two vectors R and v which lie in the xy plane, the magnitude of the vector product
is given by

$$|\mathbf{R} \times \mathbf{v}| = xv_y - yv_x \tag{11-32}$$

The proof of Eq. (11-32) is as follows. In Fig. 11-16, the angle between the positive
x axis and R is called $\phi$, and the angle between the positive x axis and v is called $\psi$.
The angle between R and v is called $\theta$. It is evident from the diagram that

$$\psi = \phi + \theta$$

so that

$$\theta = \psi - \phi$$

According to the definition of the vector product, we have

$$|\mathbf{R} \times \mathbf{v}| = Rv \sin\theta$$

and thus

$$|\mathbf{R} \times \mathbf{v}| = Rv \sin(\psi - \phi)$$

Using the trigonometric identity for the sine of the difference of two angles yields

$$|\mathbf{R} \times \mathbf{v}| = Rv(\sin\psi \cos\phi - \cos\psi \sin\phi) = (R\cos\phi)(v\sin\psi) - (R\sin\phi)(v\cos\psi)$$




Fig. 11-17 Results of the numerical orbit calculation carried out in Example
11-10. The conditions are the same as those which led to the circular orbit of
Example 11-7, except that the initial velocity is greater by the factor $\sqrt{2}$.
The resulting orbit is parabolic. The fundamental geometric definition of
the parabola can be used to verify this statement. According to this definition,
a parabola is the locus of all points equidistant from the focus f and a
straight line called the directrix. Kepler’s first law states that the force
center is located at one of the two foci of an elliptical orbit. Since the orbit in this
figure is a limiting case of the elliptical orbit, the focus of interest must be at
the origin, where the force center is located. (The other focus lies at $x = -\infty$.)
If the points in the figure do lie on a parabola, the definition of a parabola
requires that the directrix cross the x axis at x = 2.0, so that the point on the
parabola at (x = 1.0, y = 0) can be equidistant from the focus and the directrix. If
you imagine the other half of the curve described by the points below the x axis,
symmetry dictates that the directrix lie parallel to the y axis, since the curve (if
it is a parabola) is symmetrical with respect to the x axis. You can see
from the figure that the curve crosses the y axis at y = 2.0, that is, 2.0 units
from the focus. It is also 2.0 units from the directrix and thus satisfies the
criterion for a parabola. You can verify that other points on the curve satisfy
this criterion as well, by using a ruler.


According to the fundamental definition of the x and y components of a vector,
this can be written

$$|\mathbf{R} \times \mathbf{v}| = xv_y - yv_x$$

which is what we set out to prove.
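
A quick numerical check of Eq. (11-32) compares the component expression with the definition Rv sin θ. In the sketch below the position is point B of Fig. 11-14, while the velocity components are hypothetical values chosen only for illustration. The function names are likewise illustrative.

```python
import math

def cross_from_components(x, y, vx, vy):
    """Right side of Eq. (11-32): x*vy - y*vx."""
    return x * vy - y * vx

def cross_from_angle(x, y, vx, vy):
    """R v sin(theta), with theta = psi - phi as in the proof."""
    R = math.hypot(x, y)
    v = math.hypot(vx, vy)
    phi = math.atan2(y, x)     # angle from the positive x axis to R
    psi = math.atan2(vy, vx)   # angle from the positive x axis to v
    return R * v * math.sin(psi - phi)

x, y = 0.61, 0.36       # position of point B, in AU
vx, vy = -3.0, 5.0      # hypothetical velocity components, in AU/yr

by_components = cross_from_components(x, y, vx, vy)
by_angle = cross_from_angle(x, y, vx, vy)
print(by_components, by_angle)
```

The two results agree to rounding error, and the component form avoids evaluating any angles.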

Example 11-8 investigated the consequences of increasing the initial
speed of a planet by 5 percent over that required to produce a circular
orbit. The result was an elliptical orbit, albeit one whose deviation from circularity
was not dramatic. In Example 11-10 we retain the initial conditions
and parameters leading to the circular orbit of Example 11-7, except that
we increase the initial speed over that required for a circular orbit by a
factor of $\sqrt{2}$.


EXAMPLE  11-10 

Run  the  central-force  program  with  the  following  set  of  initial  conditions  and 
parameters. 

$x_0 = 1$ (in AU); $(dx/dt)_0 = 0$; $y_0 = 0$; $(dy/dt)_0 = \sqrt{2}(2\pi)$ (in AU/yr); $t_0 = 0$;
$\Delta t = 1/52$ (in yr); $\alpha = 39.5$ [in (AU)$^3$/(yr)$^2$]; $\beta = -1.5$.

■ The resulting motion of the “planet” is plotted in Fig. 11-17. The initial momentum
is now so great that the gravitational force is inadequate to bend the path
into an orbit which closes on itself as an ellipse does. Thus the path is open. As the
planet moves away from the sun, the path’s curvature continually decreases, approaching
a straight line. Once past the point of closest approach to the force center
(which is the initial point in this example), the planet forever increases its distance
from the force center. As the force F decreases with increasing distance, dp/dt decreases
and the path approaches the straight-line path for which p is constant.

As  the  planet  moves  away  from  the  sun,  the  force  acting  on  it  continually  slows 
it  down.  You  can  see  this  from  the  spacing  of  the  points  on  the  path. 


The  path  of  Example  11-10,  in  fact,  is  a parabola.  This  statement  is 
verified  in  Fig.  11-17  and  its  caption.  The  parabola  forms  the  boundary  in 
the  hierarchy  of  conic  sections  between  the  family  of  closed  ellipses  and 
that of open hyperbolas, as can be seen in Fig. 11-9.

The  considerable  variations  in  the  paths  of  the  last  four  examples  were 
produced  by  varying  the  initial  speed — and  hence  the  initial  kinetic 
energy — of  the  moving  body.  But  the  initial  position  of  the  moving  body 
with  respect  to  the  body  fixed  at  the  force  center  was  the  same  in  all  cases. 
You  may  well  guess  on  this  basis  that  the  system  always  had  the  same  initial 
potential energy. In Sec. 11-6 we consider in detail the kinetic and potential
energies of a system containing a moving body acted on by a central force,
and we develop criteria for determining the form of paths in terms of energy.


11-6 ENERGY IN GRAVITATIONAL ORBITS

The very nearly constant properties of the orbits in which the planets
move, over time spans of hundreds of millions of years, suggest that mechanical
energy is very nearly conserved as a planet orbits its sun. However,
you saw in Sec. 11-5 that the speed of a planet in a noncircular orbit
varies significantly through each orbital circuit. Thus in each circuit there
must be an interchange of mechanical energy between its kinetic and potential
forms, much as there is such an interchange for oscillating systems
in each cycle of oscillation. The kinetic energy of the sun-planet system
(which resides almost entirely in the planet, since the much more massive
sun is nearly motionless) decreases because the speed of the planet decreases
as the planet “climbs outward” toward aphelion. As this happens,
the system must gain potential energy in order for its total mechanical energy
to remain constant. The inverse process takes place as the planet “falls
inward” toward perihelion, with the system losing potential energy and
gaining kinetic energy.

The  conservation  of  energy  which  is  observed  to  hold  in  the  solar 
system  is  attributable  to  the  isolation  of  the  system  and  the  fact  that  the  only 
significant  forces  operating  within  the  system  are  conservative  gravitational 
forces.  As  far  as  any  particular  planet  is  concerned,  the  solar  gravitational 
force  is  by  far  the  largest  of  these.  We  therefore  consider  the  principle  of 
conservation of mechanical energy as it applies to an idealized system consisting
of a single planet of mass m and a very much more massive sun of
mass  M.  The  results  will  apply  directly,  however,  to  any  system  in  which  a 
body  is  acted  on  by  a central  gravitational  force  (including  systems  treated 
by  means  of  the  reduced-mass  method).  In  Chap.  21  you  will  see  that  only  a 
trivial  modification  is  required  to  apply  the  results  to  a system  in  which  the 
gravitational  force  is  replaced  by  any  other  central  inverse-square  force. 




Fig. 11-18 Diagram for evaluating the change in potential energy of a sun-planet
system as the planet moves along a segment of its orbit from initial position
$s_i$ to final position $s_f$. The coordinates $s_i$ and $s_f$ are measured along the
orbit from some fixed origin lying on the orbit. At any point along the orbit,
the direction of the gravitational force is $\hat{\mathbf{F}} = -\hat{\mathbf{r}}$, where $\mathbf{r}$ is the position vector
from the sun to the planet. As the planet moves through the infinitesimal
displacement $d\mathbf{s}$, the gravitational force does work $dW = \mathbf{F} \cdot d\mathbf{s}$.


We  want  to  apply  the  conservation  principle  of  mechanical  energy  in 
the  form  of  Eq.  (7-52),  which  is 

$$E = K + U = \text{constant} \tag{11-33}$$


To do this, we must develop an explicit expression for the gravitational potential
energy U. Figure 11-18 depicts a planet as it passes along a segment
of its orbit. Specifically, let the planet move from the initial position, specified
by the coordinate $s_i$ measured along its path from an arbitrary point on
the path, to the final position specified by the coordinate $s_f$. According to
Eq. (7-35), the work W done by the gravitational force F is

$$W = \int_{s_i}^{s_f} \mathbf{F} \cdot d\mathbf{s} \tag{11-34}$$


Here  ds  represents  an  infinitesimal  path  element,  as  shown  in  the  figure. 

The work integral defined by Eq. (11-34) can be evaluated by substituting
into that equation the value of F given by Eq. (11-6b). This yields

$$W = \int_{s_i}^{s_f} -\frac{GMm}{r^2}\, \hat{\mathbf{r}} \cdot d\mathbf{s}$$

As can be seen from Fig. 11-18, the scalar product is $\hat{\mathbf{r}} \cdot d\mathbf{s} = 1 \cos\theta\, ds = dr$.
That is, $\hat{\mathbf{r}} \cdot d\mathbf{s}$ is the value dr of the radial component of $d\mathbf{s}$. Thus
by using the rule of Eq. (7-20) to perform the actual integration, we obtain

$$W = -GMm \int_{r_i}^{r_f} \frac{dr}{r^2} = GMm\left(\frac{1}{r_f} - \frac{1}{r_i}\right) \tag{11-35}$$

In this expression, $r_i$ and $r_f$ are the magnitudes of the vectors $\mathbf{r}_i$ and $\mathbf{r}_f$,
which describe the initial and final positions of the planet with respect to
the sun. The vectors correspond to the coordinates $s_i$ and $s_f$ measured along
the path. The fact that the work W depends only on the distances $r_i$ and $r_f$,
and not on the specific positions $s_i$ and $s_f$, arises directly from the fact that
the gravitational force under consideration is a central force.

Equation (11-35) can be used to confirm that the gravitational force is a
conservative force. If the planet follows an arbitrary closed path (for example,
once around its orbit) so that $r_f = r_i$, then the work done by the
force is zero. As was discussed in Sec. 7-4, this is a necessary and sufficient
condition for a conservative force.
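
The path independence expressed by Eq. (11-35) can be illustrated numerically. The sketch below sums F · ds over many short segments of two hypothetical paths joining the same pair of distances, one straight and radial, the other a spiral, and compares both with the closed-form result. The units are arbitrary (GMm = 1), and the function names and the choice of paths are illustrative assumptions.

```python
import math

def gravitational_work(path, GMm=1.0):
    """Approximate the integral of F . ds along a list of (x, y) points,
    with F = -(GMm / r^2) r_hat evaluated at each segment midpoint."""
    W = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
        r = math.hypot(xm, ym)
        W += -GMm * (xm * (x1 - x0) + ym * (y1 - y0)) / r**3
    return W

def spiral_path(r_i, r_f, total_angle, n=20000):
    """Points on a path whose radius grows linearly from r_i to r_f
    while the polar angle grows from 0 to total_angle (radians)."""
    points = []
    for k in range(n + 1):
        s = k / n
        r = r_i + (r_f - r_i) * s
        points.append((r * math.cos(total_angle * s),
                       r * math.sin(total_angle * s)))
    return points

r_i, r_f = 1.0, 3.0
W_radial = gravitational_work(spiral_path(r_i, r_f, 0.0))  # straight outward
W_spiral = gravitational_work(spiral_path(r_i, r_f, 4.0))  # winding outward
W_formula = 1.0 * (1.0 / r_f - 1.0 / r_i)                  # Eq. (11-35)
print(W_radial, W_spiral, W_formula)
```

Both numerical values agree closely with the formula, so going out along one path and back along the other gives very nearly zero net work, as expected for a conservative force.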

We now use the work-potential energy relation of Eq. (7-46) to express
the change $\Delta U$ in the potential energy of the sun-planet system as the
planet passes from its initial to its final position. In the present notation,
that relation is

$$\Delta U = -W \tag{11-36}$$

Hence the potential energy change is

$$\Delta U = GMm\left(\frac{1}{r_i} - \frac{1}{r_f}\right) \tag{11-37}$$

This expression tells you that the potential energy increases as the planet
moves outward. For outward motion, $r_f > r_i$ and hence $1/r_i > 1/r_f$, leading
to a positive value for $\Delta U$. This is in accord with the intuitive notion that
such  outward  motion  is  “uphill” — that  is,  in  a direction  opposed  by  the 
attractive  gravitational  force.  Conversely,  the  potential  energy  decreases  as 
the  planet  moves  inward. 




As is true of any potential energy, the potential energy U of the
sun-planet system is defined with respect to some agreed-upon reference
position at which U has the value 0. It is most often convenient to choose
this reference position as one where $r = \infty$, that is, when the planet is separated
from the sun by an infinite distance. Suppose that the planet is initially
at an infinite distance from the sun (at $r_i = \infty$) and moves to a final position
whose distance from the sun has the arbitrary value r (at $r_f = r$).
Then Eq. (11-37) yields


$$\Delta U = -\frac{GMm}{r}$$

But U = 0 at the initial position because the planet is then at the reference
position. Thus the value of U at the final position is equal to the change $\Delta U$.
So we have $U = \Delta U$, or

$$U = -\frac{GMm}{r} \tag{11-38}$$


It may seem awkward that the potential energy of a sun-planet system
always has a negative value. But you will soon see that this is outweighed by
the advantages when the total mechanical energy E of the system is considered.
In any case, note that the potential energy of the system increases (becomes
less negative) as r increases. Note also that while the (hypothetical)
path through which the planet was moved in deriving Eq. (11-38) was of infinite
length, the result is not an infinite change in potential energy. Can
you explain why?

The value of U would be $-\infty$ if the planet were to move precisely to the
force  center,  where  r is  0.  However,  this  is  not  a physical  possibility.  If 
there  is  to  be  a gravitational  force,  the  center  of  force  must  be  occupied  by  a 
body  possessing  mass.  Such  a body  always  occupies  space,  and  that  space  is 
not  available  to  the  body  on  which  the  gravitational  force  is  exerted. 

In Example 11-11 the concept of gravitational potential energy is applied
to an artificial earth satellite.


EXAMPLE  11-11 

How  much  work  is  necessary  to  raise  a satellite  of  mass  m = 100.0  kg  slowly  from 
the surface of the earth to the altitude of a synchronous orbit? How much work
would  be  required  to  remove  the  satellite  entirely  from  the  earth's  gravitational 
influence? 

■ Referring to Example 11-2, you have $r_i = R_e = 6367$ km (the radius of the
earth) and $r_f = R_s = 42{,}180$ km (the orbit radius). Since the center of the earth is the
center of force for the satellite, you can set the mass of the earth equal to M in Eq.
(11-37) and use that equation to determine the potential energy increase $\Delta U$ of the
earth-satellite system as the satellite is raised. This energy increase must be accomplished
by doing work $W'$ against the gravitational force, using a rocket. Thus $W' =
-W = \Delta U$, and you have

$$W' = GMm\left(\frac{1}{R_e} - \frac{1}{R_s}\right)$$

This equation can be solved directly, by using the numerical values given. However,
you can also proceed as follows. In the present notation, Eq. (11-11) becomes $g =
GM/R_e^2$. Substituting into the equation for $W'$ gives

$$W' = mgR_e - mg\frac{R_e^2}{R_s} = mgR_e\left(1 - \frac{R_e}{R_s}\right)$$




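Inserting the numerical values gives you

$$W' = 100.0\ \text{kg} \times 9.807\ \text{m/s}^2 \times 6.367 \times 10^6\ \text{m} \times \left(1 - \frac{6.367 \times 10^6\ \text{m}}{4.218 \times 10^7\ \text{m}}\right)$$

$$= 6.244 \times 10^9 \times (1 - 0.1509)\ \text{J} = 5.302 \times 10^9\ \text{J}$$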

The term 1 - 0.1509 in the next-to-last line of the calculation suggests that
most of the work required to remove the satellite entirely from the earth has
already been done by the time it has been raised to the altitude of the synchronous
orbit. Indeed, you need only do approximately an additional 15 percent of this work
to complete the removal. The total work $W''$ required to raise the satellite to an infinite
altitude, where $r_f = \infty$ and the satellite is removed entirely from the earth’s gravitational
influence, is given by

$$W'' = mgR_e = 6.244 \times 10^9\ \text{J}$$
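
The arithmetic of Example 11-11 can be reproduced with a few lines of Python. The sketch below uses only the numerical values given in the example. The variable names are illustrative.

```python
m = 100.0        # satellite mass, kg
g = 9.807        # gravitational acceleration at the earth's surface, m/s^2
R_e = 6.367e6    # radius of the earth, m
R_s = 4.218e7    # radius of the synchronous orbit, m

# Work to raise the satellite slowly to the synchronous-orbit altitude.
W_orbit = m * g * R_e * (1.0 - R_e / R_s)

# Work to remove the satellite entirely (final distance infinite).
W_escape = m * g * R_e

print(f"W'  = {W_orbit:.4g} J")
print(f"W'' = {W_escape:.4g} J")
print(f"fraction of W'' already done at the orbit = {W_orbit / W_escape:.4f}")
```

The last line prints the factor 1 - R_e/R_s, which shows directly how much of the total work is done by the time the satellite reaches the synchronous orbit.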


Example  11-12  considers  the  kinetic  as  well  as  the  potential  energy  of  a 
system  bound  by  the  gravitational  force. 


EXAMPLE  11-12 

If  a small  body  of  mass  m is  dropped  to  the  earth  from  an  initial  state  of  rest  at  a 
very  great  altitude,  how  fast  will  it  be  going  when  it  strikes  the  earth?  Ignore  air  re- 
sistance. 

■ The initial kinetic energy of the body is K = 0. If you take its initial distance
from the center of the earth to be infinite, its initial potential energy is U = 0. Thus
its initial total mechanical energy is E = K + U = 0 + 0 = 0. Since the system is a
conservative one, E is constant and so must maintain the value E = 0. Thus the
kinetic energy K = E - U at the moment of collision with the earth is

$$K = 0 + \frac{GMm}{R_e}$$

where $R_e$ is the radius of the earth and M is its mass. Since the mass m of the small
body is very much less than that of the earth, M, it possesses essentially all the
kinetic energy of the system at the moment of collision. In terms of the speed v of
the body at that moment, you have

$$\frac{mv^2}{2} = \frac{GMm}{R_e} \tag{11-39}$$

Solving for v gives you

$$v = \left(\frac{2GM}{R_e}\right)^{1/2} \tag{11-40}$$

Substituting the numerical values of these quantities, you obtain

$$v = \left(\frac{2 \times 6.67 \times 10^{-11}\ \text{N} \cdot \text{m}^2/\text{kg}^2 \times 5.97 \times 10^{24}\ \text{kg}}{6.37 \times 10^6\ \text{m}}\right)^{1/2} = 1.12 \times 10^4\ \text{m/s} = 11.2\ \text{km/s}$$
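
The final step of Example 11-12 can be checked numerically. The sketch below evaluates Eq. (11-40) with the values used in the example. The function name `impact_speed` is an illustrative choice.

```python
import math

G = 6.67e-11     # gravitational constant, N m^2/kg^2
M = 5.97e24      # mass of the earth, kg
R_e = 6.37e6     # radius of the earth, m

def impact_speed(G, M, R):
    """Speed on reaching distance R after falling from rest at infinity,
    from m v^2 / 2 = G M m / R. The mass m of the body cancels."""
    return math.sqrt(2.0 * G * M / R)

v = impact_speed(G, M, R_e)
print(f"v = {v:.3e} m/s = {v / 1000:.1f} km/s")
```

Because m cancels, the result is the same for any small body, whatever its mass.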


The use of energy conservation in Example 11-12 implies that the
process works both ways. If you were able to shoot a projectile upward
from the surface of the earth with a speed of 11.2 km/s, and if there were
no air resistance, the projectile would never come completely to rest, but
would continue outward indefinitely as its kinetic energy and speed decreased
asymptotically to zero. That is, it would never return to earth. But
any projectile with lower initial speed would ultimately do so. The speed
v = 11.2 km/s is therefore called the escape speed from the surface of the
earth. You can see from Eq. (11-40) that for a planet other than the earth
the escape speed is proportional to the square root of the planet’s mass M
and inversely proportional to the square root of its radius R.

The escape speed is often loosely called the escape velocity. This is misleading,
however, since the direction in which the projectile is launched
with escape speed is immaterial (provided, of course, it is not aimed into
the earth). This is implicit in the scalar nature of energy. Only the scalar $v^2$
(and not the vector v) appears in the energy relation, Eq. (11-39), used to
evaluate the escape speed.

Let us again drop a body from $r = \infty$. But this time let us prevent collision
with the earth by giving the body a very small initial velocity perpendicular
to the radial (earth-body) direction at the moment of release. Its
total energy will now be only negligibly greater than zero. But as a result of
this initial tangential velocity, the body will travel along the path displayed
in Fig. 11-19. The body will pass the center of the earth at some minimum
distance R, called the distance of closest approach, or perigee. Continuing
further along this path, the body will ultimately arrive at a very large
distance from the earth with negligible velocity. According to Eqs. (11-39)
and (11-40), the kinetic energy and the speed of the body at perigee have
the values

$$K = \frac{GMm}{R} \qquad \text{for zero total energy path}$$

and

$$v = \left(\frac{2GM}{R}\right)^{1/2} \qquad \text{for zero total energy path}$$

Fig. 11-19 A small body is allowed to fall toward the earth from a large initial distance.
As it is released, it is given a very small initial velocity in the direction perpendicular to that
toward the earth, whose mass is M. As a result, it “falls” in a parabolic orbit, missing the
center of the earth by a distance R, called the distance of closest approach, or perigee.

The kinetic energy may be compared with that of the same body when it is
in a circular orbit about the earth at distance R. For a circular orbit, the
acceleration is purely centripetal and has magnitude $a = v^2/R$. The magnitude
of the necessary centripetal force is $ma = mv^2/R$, and this is provided
by the gravitational attraction $F = GMm/R^2$. Thus we have $GMm/R^2 =
mv^2/R$, or

$$mv^2 = \frac{GMm}{R}$$

Since $mv^2 = 2K$, this leads to the relation

$$K = \frac{GMm}{2R} \qquad \text{for circular orbit}$$

For a circular orbit of radius R, the kinetic energy is just one-half that of
the same body in the path having a perigee distance R and for which the
total energy of the system is essentially zero. The circular-orbit speed is

$$v = \left(\frac{GM}{R}\right)^{1/2} \qquad \text{for circular orbit}$$

Comparing this circular orbit speed with the speed at perigee of the zero
total energy path, we see that the latter is greater by a factor of $\sqrt{2}$. Then
looking again at Example 11-10, we find that the initial conditions used in
the numerical calculation done there are ones in which a body has a speed
at the perigee distance R which is greater by a factor of $\sqrt{2}$ than the speed
it has in a circular orbit of radius R. Thus the path produced in the numerical
calculation, and displayed in Fig. 11-17, is a zero total energy path. An
analysis carried out in the figure caption shows that the path is a parabola.
Hence we can conclude that a path of zero total energy is a parabola.
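
The energy argument can be applied to all four numerical examples at once. The sketch below computes the total energy per unit mass at the common starting point, r = 1 AU, for each initial speed. It assumes GM = 4π² (AU)³/(yr)², the value for which the circular orbit at 1 AU has a period of 1 yr and which the text rounds to 39.5. The function names and the tolerance used to recognize zero energy are illustrative choices.

```python
import math

GM = 4.0 * math.pi**2   # (AU)^3/(yr)^2, assumed; the text uses 39.5

def energy_per_unit_mass(r, v):
    """E/m = v^2/2 - GM/r for a body at distance r moving with speed v."""
    return 0.5 * v**2 - GM / r

def classify(energy, tol=1e-9):
    """Path type from the sign of the total energy."""
    if abs(energy) <= tol:
        return "parabola"
    return "ellipse or circle" if energy < 0.0 else "hyperbola"

v_circular = 2.0 * math.pi   # circular-orbit speed at 1 AU, in AU/yr
examples = [("11-7", 1.0), ("11-8", 1.05), ("11-9", 0.5),
            ("11-10", math.sqrt(2.0))]

for label, factor in examples:
    E = energy_per_unit_mass(1.0, factor * v_circular)
    print(f"Example {label}: E/m = {E:8.3f} (AU/yr)^2 -> {classify(E)}")
```

Only the speed factor $\sqrt{2}$ gives zero total energy. Smaller factors give negative energy and closed orbits, and any larger factor would give positive energy.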

Another  conclusion  that  can  be  drawn  from  these  considerations  is 
that  the  speed  of  a minimum-orbit  earth  satellite,  7.9  km/s  (see  Sec.  3-6), 
is  l/v2  times  the  escape  speed  from  the  surface  of  the  earth,  for  which 
R — Re.  The  numerical  result  for  the  escape  speed,  obtained  in  Example 
1 1-12,  is  11.2  km/s,  in  agreement  with  this  conclusion. 
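The factor of $\sqrt{2}$ between the two speeds can be checked numerically. The short sketch below assumes standard values for the earth's gravitational parameter GM and mean radius, neither of which is quoted in this section; the function names are illustrative only.

```python
import math

# Assumed values (not given in this section): earth's GM and mean radius.
GM_EARTH = 3.986e14   # m^3/s^2
R_EARTH = 6.371e6     # m

def circular_speed(gm, r):
    """Speed in a circular orbit of radius r: v = (GM/r)^(1/2)."""
    return math.sqrt(gm / r)

def zero_energy_speed(gm, r):
    """Speed at distance r on a zero-total-energy path: v = (2GM/r)^(1/2)."""
    return math.sqrt(2.0 * gm / r)

v_circ = circular_speed(GM_EARTH, R_EARTH)
v_esc = zero_energy_speed(GM_EARTH, R_EARTH)
print(f"minimum-orbit speed: {v_circ / 1e3:.1f} km/s")
print(f"escape speed:        {v_esc / 1e3:.1f} km/s")
```

With these assumed inputs the two speeds come out close to the 7.9 km/s and 11.2 km/s quoted in the text.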


For a circular orbit, the total energy E is found by adding the value of
the kinetic energy K just derived to the value of the potential energy U
given by Eq. (11-38). This gives

$$E = K + U = \frac{GMm}{2R} - \frac{GMm}{R} \tag{11-41a}$$

or

$$E = -\frac{GMm}{2R} \qquad \text{for circular orbit} \tag{11-41b}$$

Equation (11-41a) shows that the kinetic energy in the case of a circular
orbit has the value

$$K = -\tfrac{1}{2}U \qquad \text{for circular orbit} \tag{11-42}$$
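As a quick numerical check of Eqs. (11-41a), (11-41b), and (11-42), the sketch below evaluates K, U, and E for a circular orbit. The function name and the sample numbers (arbitrary consistent units) are assumptions made for illustration.

```python
def circular_orbit_energies(gm, m, r):
    """Return (K, U, E) for a body of mass m in a circular orbit of radius r."""
    kinetic = gm * m / (2.0 * r)   # K = GMm/2R
    potential = -gm * m / r        # U = -GMm/R
    return kinetic, potential, kinetic + potential

# Illustrative numbers only, in arbitrary consistent units.
K, U, E = circular_orbit_energies(gm=39.5, m=1.0, r=1.0)
print(f"K = {K}, U = {U}, E = {E}")
```

The total energy is the negative of the kinetic energy, and the kinetic energy is half the magnitude of the potential energy.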

All closed orbits, that is, orbits in which the distance to the moving body
from the center of force increases and decreases periodically, have associated
with them total energies E less than zero. These orbits are the circle
and the ellipses of Fig. 11-9. The open parabolic path is characterized by
zero (or negligible) total energy. Paths with nonnegligible positive total energy
are hyperbolic. Figure 11-9 illustrates the geometric connection among
these paths, all of which are conic sections. A body in a hyperbolic path has
significant speed when it is a very large distance from the central body. A




Fig. 11-20 Representation of the dependence of the potential
energy U of a sun-planet system on the distance r between the
sun and the planet. The curve represents the relation $U \propto -r^{-1}$.
The negative quantities $E_1$, $E_2$, and $E_3$ represent possible total
energies of the system for which the planet is bound, that is,
confined to a closed (elliptical or circular) orbit. The distances
$r_1$, $r_2$, and $r_3$ are the maximum possible aphelia (extreme sun-planet
distances) for the three cases, respectively. In each case,
they correspond to extremely elongated ellipses with essentially
zero perihelion distances (semiminor axes). In each case, orbits
are possible which pass through any lesser distance. (Can you
predict the orbit radii for the circular orbits of total energy
$E_1$, $E_2$, and $E_3$?) But a greater distance (represented in each
case by the shaded extension of the horizontal line denoting
the total energy) would impose the requirement that the
potential energy of the system exceed its total energy. Since
$K = E - U$, this would require that $K = mv^2/2 < 0$, which is
impossible. If the total energy of the system is E = 0, the
sun-“planet” distance r can have any value at all, since the corresponding
potential energy can never exceed the reference value
U = 0. The possible paths are parabolas. If the total energy of
the system has any positive value $E_+$, the system is unbound and
the possible paths are hyperbolic. Compare with Fig. 8-18,
which is the analogous plot for a diatomic molecule.


two-body system with negative total energy is called a bound system; a
system with zero or positive total energy is called an unbound system.
Figure 11-20 illustrates this distinction. This figure should be compared with
Figs. 8-17 and 8-18, which describe the relation between potential energy
and total energy for systems with different force laws.
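The classification of paths by the sign of the total energy, together with the turning-point argument of Fig. 11-20, can be summarized in a small sketch. The function names are hypothetical; the only relations used are $U = -GMm/r$, the condition $U(r) = E$ for the greatest possible separation, and $E = -GMm/2R$ for a circular orbit.

```python
import math

def classify_path(total_energy):
    """Name the kind of path that goes with a given total energy E."""
    if total_energy < 0:
        return "bound (circle or ellipse)"
    if total_energy > 0:
        return "unbound (hyperbola)"
    return "unbound (parabola)"

def max_separation(gm, m, total_energy):
    """Greatest separation allowed by K >= 0, found from U(r) = E."""
    if total_energy >= 0:
        return math.inf               # unbound: any distance is allowed
    return gm * m / (-total_energy)

def circular_radius(gm, m, total_energy):
    """Radius of the circular orbit with total energy E < 0, from E = -GMm/2R."""
    return gm * m / (-2.0 * total_energy)

# Illustrative numbers only, in arbitrary consistent units.
for energy in (-19.75, 0.0, 5.0):
    print(energy, classify_path(energy), max_separation(39.5, 1.0, energy))
```

For a bound system the circular-orbit radius is half the greatest separation allowed at the same total energy.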

In the solar system, practically all (if not all) bodies have elliptical orbits,
either around the sun or (in the case of satellites) around their planets. With a few
exceptions, the planetary and satellite orbits are ellipses of small
“eccentricity”; that is, they are not very different from circles. Some of the
asteroids, and nearly all the comets, have highly eccentric elliptical orbits. In the
case of some comets, the presumably elliptical orbits are indistinguishable from
parabolas on the basis of observation. If the aphelion of a comet is very far from the
sun (beyond the orbit of Pluto), the acceleration and velocity of the comet near
aphelion will be so small that the period may be millions of years.

It is unlikely that any celestial bodies are observed in hyperbolic paths. Such
a path would imply that the body had entered the solar system with nonzero
velocity; that is, it is not bound to, and is not a member of, the solar system. But
the space outside the solar system is so empty (compared to the solar system itself)
that such encounters with bodies of observable size cannot occur with any
frequency.

However,  particles  moving  in  unbound  paths  are  of  great  importance  in 
atomic  and  nuclear  phenomena.  The  attractive  force  in  this  case  is  electric  rather 
than  gravitational,  but  the  general  principles  of  orbit  mechanics  are  the  same 
except  for  scale,  because  the  electric  force  obeys  an  inverse-square  law  just  as 
the  gravitational  force  does.  It  is  quite  easy  to  alter  the  total  energy  of  an  atom 
so  as  to  ionize  it — that  is,  to  supply  energy  to  the  atom  and  transfer  an  atomic 
electron  from  a bound  orbit  to  an  unbound  path  taking  it  away  from  the  atom. 
It  is  equally  simple  to  observe  the  reverse  process,  called  electron  capture,  in 
which  the  system  emits  energy  in  the  form  of  light  or  otherwise  as  an  electron 
moving  along  an  unbound  path  approaching  the  atom  jumps  onto  a bound  orbit. 




A final case, which we discuss in detail in Chap. 20, is the one in which the
inverse-square force is repulsive rather than attractive. This is the case for two
bodies with electric charges of the same sign, such as an alpha particle and a
uranium nucleus. In this case, the total energy is always positive, since the potential
energy increases from a minimum value of zero as one particle approaches the
other from a large distance.

Experiments in which a particle is shot at a target particle with positive total
energy are called scattering experiments. Since the path of the projected particle is
necessarily not a closed one, the particle can be collected by a detector after
undergoing some deflection (as in Fig. 11-19) because of the force exerted on it by the
target particle. A study of the details of the deflection as a function of the energy of
the particle and its distance of closest approach to the target particle can yield a
tremendous amount of information about the details of the force, and hence about
the structure of the target particle.


11-7 PERTURBATIONS AND ORBIT STABILITY

So far our study of orbits has been restricted to the case of central forces, in
which the gravitational interaction takes place between only two bodies. We
made the further simplification in Sec. 11-6 that one of the bodies has a
much greater mass than the other, so that only the motion of the less massive
body need be considered. No great complication is introduced by relaxing
this latter restriction, since the reduced-mass procedure can be used
to transform any two-body system into a system in which one of the bodies
has infinite mass. However, the solar system is not a two-body system, and
the planets exert gravitational forces on one another. These forces are
small compared to the forces due to the sun, but they result in measurable
disturbances of the planetary motions described by the two-body treatment.
These disturbances are called perturbations. Perturbations are especially
significant in certain cases where the forces, which vary periodically as
the planets move around the sun, are in step with the periodic motion of
the body on which they are acting. Such perturbing forces produce large
cumulative effects over long periods. A complication is introduced by the
fact that the orbits of the planets do not lie quite in the same plane. Thus
the analysis, which has been two-dimensional to this point, must be extended
to three dimensions if accurate calculations are required. We will
not consider this extension because it is quite complicated. Nevertheless, a
good deal of insight can be gained into the effects of perturbing forces by
carrying out a quite simple numerical calculation in two dimensions.

The largest perturbations in the solar system are produced by Jupiter,
whose mass is about 0.1 percent that of the sun. To give an example of the
effect of Jupiter on other planets, consider its effect on the earth. It is
about 4 times more distant than the sun when it is at its distance of closest
approach to the earth. Thus the magnitude of the force it exerts on the
earth at that point is about $10^{-3}/4^2 \approx 6 \times 10^{-5}$ times that produced by
the sun.
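The estimate above is a one-line application of the inverse-square law. The sketch below repeats it with the two ratios quoted in the text; the function name is an assumption made for illustration.

```python
def relative_force(mass_ratio, distance_ratio):
    """Ratio of two inverse-square forces on the same body.

    mass_ratio:     (mass of perturbing body) / (mass of central body)
    distance_ratio: (distance to perturbing body) / (distance to central body)
    """
    return mass_ratio / distance_ratio ** 2

# Ratios quoted in the text: Jupiter has about 0.1 percent of the sun's mass
# and, at closest approach, is about 4 times more distant than the sun.
jupiter_over_sun = relative_force(mass_ratio=1.0e-3, distance_ratio=4.0)
print(f"force of Jupiter / force of sun = {jupiter_over_sun:.2e}")
```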

This closest approach of Jupiter to the earth takes place when the sun,
the earth, and Jupiter lie in a straight line in that order. Astronomers call
this alignment opposition. It takes place about every 399 days, which is the
time required for the earth to circle the sun and catch up with Jupiter
again. Hence the distance between the earth and Jupiter varies with this
399-day period, called the synodic period of Jupiter. The perturbing force
exerted on the earth by Jupiter must also vary with the synodic period. But
the perturbing force is relatively small, and it falls off rather rapidly ($\propto r^{-2}$)
with increasing distance. Hence it is possible to make a respectable first
approximation to the effect of Jupiter on the earth by substituting for the
periodically varying force an instantaneous impulse of appropriate magnitude,
applied to the earth at the moment of opposition.
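The 399-day synodic period follows from the two orbital periods: the earth gains one full revolution on Jupiter when the difference of their angular rates has accumulated to one turn. The sketch below assumes an orbital period of 11.86 yr for Jupiter and a year of 365.25 days, values which are not quoted in this section; the function name is illustrative.

```python
def synodic_period(inner_period, outer_period):
    """Time between successive alignments of two planets circling the same sun."""
    return 1.0 / (1.0 / inner_period - 1.0 / outer_period)

T_EARTH = 1.0        # yr
T_JUPITER = 11.86    # yr, assumed value (not quoted in this section)
DAYS_PER_YEAR = 365.25

synodic_days = synodic_period(T_EARTH, T_JUPITER) * DAYS_PER_YEAR
print(f"synodic period of Jupiter: about {synodic_days:.0f} days")
```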

According to Eqs. (8-9) and (8-10), the impulse I can be written

$$I = \int_{t_i}^{t_f} F\,dt = \Delta p$$

Thus the impulse I can be substituted for $\Delta p$, the entire change in the
momentum of the earth produced by the attraction of Jupiter over an entire
synodic period $t_f - t_i = 399$ days. During most of that period, the force F is
negligible, at least in first approximation. Thus F has the character of an
impulsive force, something like the one displayed in Fig. 8-8. To a first
approximation, therefore, the effect of Jupiter on the motion of the earth
can be found by assuming that the impulse I, and hence the change in
momentum $\Delta p$, takes place entirely at the instant of closest approach
between Jupiter and the earth. This method was first used by Newton. It is
typical of his penetrating insight into physical phenomena.

Example 11-13 gives a good general idea of what happens. It employs
an extension of the numerical orbit calculation technique developed in Sec.
11-5. We first allow the earth to make a complete, unperturbed circular
orbit. At week 52, when the earth returns to its starting point, we assume
that it passes Jupiter and experiences an instantaneous impulse directed
away from the sun. (This direction would be precisely correct if the orbits
of the earth and Jupiter lay in exactly the same plane.) The effect of this
instantaneous impulse is represented by adding to the earth's velocity a
new velocity $\Delta v$ directed radially outward. In the interest of clarity in
demonstrating the effect, we exaggerate greatly and let $\Delta v$ have a magnitude
equal to 5 percent of the earth’s unperturbed speed. The radially outward
impulse I is thus represented by giving the earth a radially outward
momentum change $\Delta p = m\,\Delta v$. At week 109, when the earth catches up with
Jupiter again 399 days later, we add a second velocity increment, of magnitude
equal to the first and again directed radially outward (a direction
which is now approximately 35° counterclockwise from the x axis).


EXAMPLE 11-13

Run the central-force program through 1 yr (52 “weeks” or calculation cycles) with
the following set of initial conditions and parameters, which are identical with those
of Example 11-7:

$x_0 = 1$ (in AU); $(dx/dt)_0 = 0$; $y_0 = 0$; $(dy/dt)_0 = 2\pi$ (in AU/yr); $t_0 = 0$; $\Delta t = 1/52$ (in yr); $\alpha = 39.5$ [in (AU)$^3$/(yr)$^2$]; $\beta = -1.5$.

Stop the calculating device when the 52d cycle of calculation is completed. Add
$0.1\pi$ to the dx/dt storage register. (This gives the earth a radial velocity toward
Jupiter whose magnitude is 5 percent of the magnitude of the tangential velocity
$dy/dt = 2\pi$.) Then continue running for 57 more “weeks” or cycles, for a total of
109 “weeks.” At this time the sun, the earth, and Jupiter are again in line.

Stop the calculating device when the 109th cycle is completed. Add $0.1\pi \cos 35°$
to the dx/dt storage register and $0.1\pi \sin 35°$ to the dy/dt storage register. [This
again gives the earth a radial velocity toward Jupiter whose magnitude is 5 percent
of $2\pi$. Now, however, the line from the sun through the earth to Jupiter lies 35°
counterclockwise from the x axis. To see this, note that 109 weeks is 5 weeks more
than 2 yr. And (5 weeks/52 weeks) × 360° = 35°.] Finally, run the calculation for
52 more “weeks” to complete another orbit.




Fig. 11-21 Results of the numerical orbit calculations carried out in Example 11-13. The
first orbit, shown by dots, is identical to the circular orbit of Example 11-7 and represents
an idealized earthlike planet with constant orbit radius 1 AU. As the first orbit is completed
at x = 1 AU, y = 0, the planet is given an outward radial velocity equal to 5 percent of its
tangential velocity, in order to simulate (in exaggerated form) the gravitational attraction
of Jupiter. The subsequent orbit, represented by x's, is both elongated into an ellipse whose
major axis lies along the y axis (perpendicular to the direction of the perturbation) and shifted
in the positive y direction (so that the sun remains at the focus of the ellipse). Fifty-seven
“weeks” later, a second outward radial velocity component is added to simulate the effect
of Jupiter. The third orbit, represented by crosses, is again elongated from that point onward
and shifted in the direction perpendicular to the perturbational effect. The beginning of a
fourth orbit is represented by squares.


■ The results are shown in Fig. 11-21. The initial circular orbit is denoted by
dots, the second orbit by x’s, the third by crosses, and the fourth (partial) orbit by
squares.

It is clear that the first impulsive force, or blow, distorts the circular orbit into an
ellipse. It may seem paradoxical that a blow in the positive x direction, delivered at
x = 1 AU, y = 0, both shifts the orbit in the positive y direction and elongates it
along the y axis. Likewise, the second blow, delivered radially at about a 35° angle
counterclockwise from the x axis, shifts and elongates the ellipse along the
perpendicular direction. Can you explain this effect qualitatively?
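The procedure of Example 11-13 can be sketched in a few lines of Python. The central-force program of Sec. 11-5 is not reproduced in this section, so the stepping scheme used here (a velocity-Verlet, or half-step, update) is an assumption made for this sketch, as are the names `accel` and `run`. The parameters and the two velocity increments are those stated in the example.

```python
import math

def accel(x, y, alpha, beta):
    """Acceleration components Q_x, Q_y = -alpha * (x, y) * (x^2 + y^2)^beta."""
    s = (x * x + y * y) ** beta
    return -alpha * x * s, -alpha * y * s

def run(state, n_steps, alpha, beta, dt):
    """Advance (x, y, vx, vy) by n_steps; return the new state and the radii visited.

    Assumed scheme: velocity Verlet (not necessarily the scheme of Sec. 11-5).
    """
    x, y, vx, vy = state
    ax, ay = accel(x, y, alpha, beta)
    radii = []
    for _ in range(n_steps):
        vx += 0.5 * dt * ax          # first half-step in velocity
        vy += 0.5 * dt * ay
        x += dt * vx                 # full step in position
        y += dt * vy
        ax, ay = accel(x, y, alpha, beta)
        vx += 0.5 * dt * ax          # second half-step in velocity
        vy += 0.5 * dt * ay
        radii.append(math.hypot(x, y))
    return (x, y, vx, vy), radii

ALPHA, BETA, DT = 39.5, -1.5, 1.0 / 52.0   # values from Example 11-13
KICK = 0.1 * math.pi                       # 5 percent of 2*pi, in AU/yr

# Weeks 1-52: unperturbed circular orbit.
state, r_first = run((1.0, 0.0, 0.0, 2.0 * math.pi), 52, ALPHA, BETA, DT)

# Week 52: first blow, added to dx/dt.
x, y, vx, vy = state
state, r_second = run((x, y, vx + KICK, vy), 57, ALPHA, BETA, DT)

# Week 109: second blow, directed 35 degrees counterclockwise from the x axis.
x, y, vx, vy = state
a = math.radians(35.0)
state, r_third = run((x, y, vx + KICK * math.cos(a), vy + KICK * math.sin(a)),
                     52, ALPHA, BETA, DT)

for label, radii in (("first", r_first), ("second", r_second), ("third", r_third)):
    print(f"{label:>6} orbit: r from {min(radii):.3f} to {max(radii):.3f} AU")
```

The printed ranges show how far each blow moves the orbit radius away from 1 AU while leaving it confined to a narrow band.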


Perhaps the most remarkable thing about Example 11-13 is the way it
illustrates the stability of the orbit. A substantial blow does indeed change
the orbit. But the new orbit is not very different from the old one and is
itself a stable orbit which would retrace itself indefinitely if no further
perturbations were applied. In order to emphasize the significance of this
point, and the special quality which it gives to inverse-square forces, we will
again investigate the result of applying an impulse of the same magnitude
and direction as the first one given to the planet in Example 11-13. However,
this time we will apply it to a planet in a circular orbit under the influence
of an attractive inverse-cube force, whose magnitude is given by
$F = G'Mm/R^3$. The calculation will show that the orbit is unstable. That is, the
slightest disturbance disrupts it into an open spiral.

It is quite easy to show that a circular orbit is possible for an
inverse-cube force, in the total absence of perturbations. We have $ma_c = F$,
or

$$\frac{mv^2}{R} = \frac{G'Mm}{R^3}$$

where G' is the proportionality constant in the hypothetical “inverse-cube
law of gravitation.” Solving for v, we find that if the speed of the planet is

$$v = \left(\frac{G'M}{R^2}\right)^{1/2}$$

and its direction of motion is perpendicular to the position vector from the
center of force, the centripetal acceleration $a_c$ will have the proper relation
to the force and the mass, and the motion will be circular.

In Example 11-14 we follow the unperturbed planet through one-quarter
of a revolution to verify the circularity of the orbit, and then we
impose the blow. The centripetal force is

$$\mathbf{F} = -G'Mm\,\frac{1}{R^3}\,\hat{\mathbf{R}} = -G'Mm\,\frac{1}{(x^2 + y^2)^{3/2}}\,\hat{\mathbf{R}}$$

Thus Eq. (11-26a) shows that the x component of the acceleration is

$$\frac{d^2x}{dt^2} = \frac{F_x}{m} = \frac{F}{m}\,\frac{x}{(x^2 + y^2)^{1/2}} = -G'M\,\frac{x}{(x^2 + y^2)^2}$$

In the notation of Eq. (11-29a), this can be written

$$Q_x = -G'M\,\frac{x}{(x^2 + y^2)^2}$$

Likewise, the y component of the acceleration is

$$\frac{d^2y}{dt^2} = \frac{F_y}{m} = \frac{F}{m}\,\frac{y}{(x^2 + y^2)^{1/2}} = -G'M\,\frac{y}{(x^2 + y^2)^2}$$

In the notation of Eq. (11-29b), this becomes

$$Q_y = -G'M\,\frac{y}{(x^2 + y^2)^2}$$

The general definitions of $Q_x$ and $Q_y$ appropriate to all central-force
calculations are given by Eqs. (11-30a) and (11-30b), respectively. These are

$$Q_x = -\alpha x\,(x^2 + y^2)^{\beta} \qquad \text{and} \qquad Q_y = -\alpha y\,(x^2 + y^2)^{\beta}$$

The values of $\alpha$ and $\beta$ applicable to the case of an inverse-cube force can be
found by comparing the specific expressions for $Q_x$ and $Q_y$ immediately
above with the definitions. For the inverse-cube case we have

$$\alpha = G'M \qquad \text{and} \qquad \beta = -2$$
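The role of the exponent $\beta$ can be made concrete with a short sketch. For a force proportional to $1/r^n$, the acceleration has magnitude $\alpha/r^n$ and its components carry an extra factor $x/r$ or $y/r$, so that $\beta = -(n+1)/2$. The function names below are assumptions made for illustration.

```python
import math

def central_force_accel(x, y, alpha, beta):
    """Q_x and Q_y of the general central-force definitions."""
    s = (x * x + y * y) ** beta
    return -alpha * x * s, -alpha * y * s

def beta_for_power(n):
    """Exponent beta that corresponds to a force proportional to 1/r^n."""
    return -(n + 1.0) / 2.0

def accel_magnitude(r, alpha, beta):
    """Magnitude of the acceleration at distance r from the force center."""
    qx, qy = central_force_accel(r, 0.0, alpha, beta)
    return math.hypot(qx, qy)

# beta = -1.5 reproduces an inverse-square law, beta = -2 an inverse-cube law.
r = 2.0
print(accel_magnitude(r, 39.5, beta_for_power(2)), 39.5 / r ** 2)
print(accel_magnitude(r, 39.5, beta_for_power(3)), 39.5 / r ** 3)
```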


The choice of the numerical value of the constant G' in the above inverse-cube
equations is arbitrary. For convenience, we let its numerical value equal that of the
universal gravitational constant G. If this is done, the numerical value of the
constant $\alpha$ required for the calculation will again be 39.5. Since the calculating device
manipulates numbers only, and not units, the substitution of G' for G will not alter
the calculation. However, the units of G' must be different from those of G. Can
you show that for the system of units used in the central-force calculations, the
proper units for G' are (AU)$^4$/(yr)$^2$?


EXAMPLE 11-14

Run the central-force program for 13 “weeks,” or calculation cycles, with the
following set of initial conditions and parameters:

$x_0 = 1$ (in AU); $(dx/dt)_0 = 0$; $y_0 = 0$; $(dy/dt)_0 = 2\pi$ (in AU/yr); $t_0 = 0$; $\Delta t = 1/52$ (in yr); $\alpha = 39.5$ [in (AU)$^4$/(yr)$^2$]; $\beta = -2$.

Stop the calculating device when the 13th cycle is completed. Add $0.1\pi$ to the
dy/dt storage register. (This gives the planet an outward radial velocity increment
whose magnitude is 5 percent of the magnitude of the tangential velocity,
$|dx/dt| = 2\pi$.) Run the calculating device through a sufficient number of further cycles to
show clearly what kind of an effect the perturbation has on the original circular
orbit.

■ The results are plotted in Fig. 11-22. The orbit is evidently unstable. The single
outward blow causes the planet to leave its circular orbit and begin to spiral ever
outward. The speed of the planet decreases as the distance from the center of force
increases, and the curvature of the orbit continues to decrease indefinitely.
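The calculation of Example 11-14 can be sketched as follows. As before, the stepping scheme (velocity Verlet) and the names `accel` and `run` are assumptions made for this sketch, since the program of Sec. 11-5 is not reproduced here; the parameters are those of the example. The number of cycles run after the blow (260, or 5 yr) is also an arbitrary choice. Because the inverse-cube circular orbit has no restoring tendency, the discretization error of any finite-step scheme itself acts as a small perturbation, so the first quarter-revolution is circular only approximately.

```python
import math

def accel(x, y, alpha, beta):
    """Acceleration components Q_x, Q_y = -alpha * (x, y) * (x^2 + y^2)^beta."""
    s = (x * x + y * y) ** beta
    return -alpha * x * s, -alpha * y * s

def run(state, n_steps, alpha, beta, dt):
    """Advance (x, y, vx, vy) by n_steps with an assumed velocity-Verlet scheme."""
    x, y, vx, vy = state
    ax, ay = accel(x, y, alpha, beta)
    radii = []
    for _ in range(n_steps):
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        ax, ay = accel(x, y, alpha, beta)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        radii.append(math.hypot(x, y))
    return (x, y, vx, vy), radii

ALPHA, BETA, DT = 39.5, -2.0, 1.0 / 52.0   # inverse-cube force, Example 11-14

# First 13 "weeks": unperturbed quarter-revolution.
state, r_before = run((1.0, 0.0, 0.0, 2.0 * math.pi), 13, ALPHA, BETA, DT)

# Outward blow: add 0.1*pi to dy/dt, then follow the planet for 5 more years.
x, y, vx, vy = state
state, r_after = run((x, y, vx, vy + 0.1 * math.pi), 260, ALPHA, BETA, DT)

print(f"radius at the blow:  {r_before[-1]:.3f} AU")
for week in (52, 104, 156, 208, 260):
    print(f"{week:3d} weeks after blow: r = {r_after[week - 1]:.3f} AU")
```

The radius keeps growing after the blow instead of oscillating about 1 AU, in contrast to the inverse-square case.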


A stable planetary system would not be possible if the law of gravitation
were an inverse-cube law. Even given perfectly circular orbits to begin
with, the very slightest perturbation destroys the system. In other words,
circular orbits in an inverse-cube force “universe” are not stable in the
sense discussed in Sec. 9-7. If the perturbation is very small, the spiral will
be very tight and it will take some time for the planet to leave the system;
but it will inevitably happen. An inward impulse will produce an inward
spiral which culminates in a collision of the planet with the star.


Fig. 11-22 Results of the numerical calculations of Example 11-14, showing
the instability of the orbit of a planet under the influence of a hypothetical
inverse-cube force. The first quarter-revolution is unperturbed and is circular.
An outward radial velocity increment equal to 5 percent of the tangential
velocity imposed at that point then leads to the open spiral trajectory.




By using the techniques of advanced mechanics, it can be shown quite generally
that the orbit of a body under the influence of a force law of the form
$\mathbf{F} \propto -(1/r^n)\,\hat{\mathbf{r}}$ will be stable if $n < 3$. The orbit will be unstable if $n > 3$. If $n = 3$, the
stability is neutral. It is not difficult to see how this result comes about in the special
case of a planet in a circular orbit. To do this, we place ourselves in the frame
of reference of an observer located on the planet. As was the case for the observer
in the discussion accompanying Fig. 11-4b, it is necessary to allow for the fact that
this frame of reference is not inertial, because the planet is subjected to a centripetal
acceleration. Regardless of the nature of the central force producing this acceleration,
it can be written as the signed scalar $A = -v^2/R$, where the positive direction
is taken to be the direction from the force center to the planet. The quantity v
represents the speed of the planet, and R is its distance from the force center.

In order to apply Newton’s laws in his noninertial frame, the observer must
claim that there is a fictitious force acting on the planet, whose mass is m. He calls
this force the centrifugal force $F_{\text{centrif}}$. According to Eq. (5-29), written in
signed-scalar notation, this force is $F_{\text{centrif}} = m(-A)$. And since $A = -v^2/R$, we
have

$$F_{\text{centrif}} = \frac{mv^2}{R}$$

In a circular orbit, both v and R are constant. From the point of view of the
observer on the planet, the centrifugal force is equal in magnitude and opposite in
direction to the central force $F_{\text{central}}$ when the distance of the planet from the force
center has the equilibrium value $R_e$. That is, the net force $F_{\text{net}}$ exerted on the planet
is

$$F_{\text{net}} = F_{\text{centrif}} + F_{\text{central}} = 0 \qquad \text{for } R = R_e$$

What happens if the planet is disturbed slightly from its equilibrium orbit
radius $R = R_e$? Will the direction of $F_{\text{net}}$ be such as to tend to return the planet to
its original orbit or move it still farther from that orbit? In order to answer this
question, we must express the centrifugal force in the form $F_{\text{centrif}} \propto 1/R^{\nu}$ and
determine the value of the constant $\nu$. Once we have done this, we can compare the
way in which $F_{\text{centrif}}$ varies with R to the way in which the central force
$F_{\text{central}} \propto -1/R^n$ varies with R, and obtain the desired result.

However, the equation $F_{\text{centrif}} = mv^2/R$ is not in the desired form. This is
because the speed v itself depends on the distance R. So we must eliminate v from
the equation. To do this, we multiply the right side of the equation by the quantity
$mR^2/mR^2$, which is equal to 1. This yields

$$F_{\text{centrif}} = \frac{m^2v^2R^2}{mR^3}$$

The quantity in the numerator of the fraction on the right side of this equation is
equal to $l^2$, the square of the magnitude of the angular momentum of the planet
about the force center. The reason is that the orbit is circular, so that the planet’s
momentum p is perpendicular to the vector R from the force center to the planet.
As in Example 9-4, we therefore have

$$l = Rp = Rmv$$

or

$$l^2 = m^2v^2R^2$$

Using this result, we can write the centrifugal force in the form

$$F_{\text{centrif}} = \frac{l^2}{m}\,\frac{1}{R^3}$$



Since the angular momentum of the planet about the force center is constant, the
quantity $l^2/m$ is constant, and we have the desired proportionality

$$F_{\text{centrif}} \propto \frac{1}{R^3}$$

That is, the exponent $\nu$ is equal to 3.

In Fig. 11-23a, b, and c, this centrifugal force is compared with central forces
proportional to $-1/R^2$, $-1/R^3$, and $-1/R^4$, respectively. In each case, the stability
of the orbit is determined by the sign of $F_{\text{net}} = F_{\text{centrif}} + F_{\text{central}}$ when R deviates
slightly from the value $R = R_e$. The details are given in the figure caption.
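The sign test described above is easy to carry out numerically. The sketch below uses units in which $R_e = 1$ and in which the centrifugal and central forces each have magnitude 1 at $R = R_e$, as in Fig. 11-23; the function names and the size of the trial displacement are assumptions made for illustration.

```python
def net_force(r, n, r_e=1.0):
    """F_net = F_centrif + F_central, each of magnitude 1 unit at r = r_e."""
    centrifugal = (r_e / r) ** 3       # outward, proportional to 1/R^3
    central = -((r_e / r) ** n)        # inward, proportional to -1/R^n
    return centrifugal + central

def stability(n, eps=0.01):
    """Classify a circular orbit from the sign of F_net just inside and outside R_e."""
    inside = net_force(1.0 - eps, n)
    outside = net_force(1.0 + eps, n)
    if inside > 0 and outside < 0:
        return "stable"       # net force pushes the planet back toward R_e
    if inside < 0 and outside > 0:
        return "unstable"     # net force pushes the planet farther from R_e
    return "neutral"

for n in (2, 3, 4):
    print(f"central force proportional to -1/R^{n}: {stability(n)}")
```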




Fig. 11-23 Plots of the dependence of the centrifugal force $F_{\text{centrif}} \propto 1/R^3$
as a function of the distance R from the force center to the
planet. The force is plotted in arbitrary units. The scale of force
units has been adjusted so that $F_{\text{centrif}} = 1$ unit when $R = R_e$. (a) In
this case, the central force conforms to the rule $F_{\text{central}} \propto -1/R^2$,
as does Newton’s law of gravitation. Its magnitude has been
adjusted so that $F_{\text{central}} = -1$ unit when $R = R_e$. Thus $F_{\text{net}} = 0$ when
$R = R_e$, in conformity with the condition that the planet be in a
circular orbit of radius $R_e$. Note that $F_{\text{net}} = F_{\text{centrif}} + F_{\text{central}}$ is
positive (outward) when $R < R_e$, and that it is negative (inward)
when $R > R_e$. That is, the net force tends always to restore the
planet to the orbit radius $R_e$ if the planet is disturbed slightly, and
the orbit is stable. (b) Here the central force conforms to the rule
$F_{\text{central}} \propto -1/R^3$. Again, its magnitude has been adjusted so that
$F_{\text{central}} = -1$ unit when $R = R_e$. The net force is zero regardless
of the value of R, and there is no tendency to restore the planet
to the orbit radius $R_e$ if it is disturbed. The orbit is neutral, neither
stable nor unstable. (c) Here the central-force law is $F_{\text{central}} \propto -1/R^4$.
We have again set $F_{\text{central}} = -1$ unit when $R = R_e$. In this
case, $F_{\text{net}}$ is negative (inward) when $R < R_e$ and positive (outward)
when $R > R_e$. Thus any slight disturbance in the orbit radius leads
to a force which tends to remove the planet still farther from its
initial orbit radius $R_e$, and the orbit is unstable.




There is special interest in the behavior of an orbit under the action of
an attractive force whose magnitude is given by

$$F = \frac{GMm}{R^{2+\delta}}$$

where $\delta$ is a number small compared to 1. The behavior induced by the
slight deviation from the inverse-square law of newtonian gravitation is
very important in the general theory of relativity, as is explained briefly in
discussing Example 11-15. In Example 11-15 we exaggerate the deviation
greatly by letting $\delta = 0.1$, so that $F \propto R^{-2.1}$. The corresponding value of $\beta$ is
seen from Eqs. (11-30a) and (11-30b) to be $\beta = -3.1/2 = -1.55$. We also
choose the initial tangential velocity large enough to produce a rather
noncircular orbit, to make the effect of the deviation of $\delta$ from zero more
evident.


Fig. 11-24 Results of the numerical calculations of Example 11-15, showing in exaggerated
form the consequences of a slight deviation from conformity of the law of gravitation to an
inverse-square law. Such a deviation is predicted by the general theory of relativity. The
resulting path does not quite close on itself, but can be thought of as an elliptical orbit
which rotates or precesses slowly as the planet moves along it. The first four orbits are shown
as dots, x’s, crosses, and circles, respectively. Their perihelia are, respectively, $P_1$, $P_2$, $P_3$,
and $P_4$. The precession of the perihelion about the sun is a convenient measure of the
precession of the entire orbit.




EXAMPLE  11-15 



Run the central-force program with the following set of initial conditions and
parameters:

$x_0 = 1$ (in AU); $(dx/dt)_0 = 0$; $y_0 = 0$; $(dy/dt)_0 = 2.2\pi$ (in AU/yr); $t_0 = 0$; $\Delta t = 1/52$ (in yr); $\alpha = 39.5$ [in (AU)$^3$/(yr)$^2$]; $\beta = -1.55$.

■ The results are plotted in Fig. 11-24. Here again, as in Example 11-14, this
orbit does not retrace itself. However, it does not fail to close in the sense that the
planet spirals away from (or into) the sun. Rather, the entire ellipse of the orbit
appears to advance slowly, that is, to rotate in the same (counterclockwise) sense that the
planet moves. This precession of the orbit is conventionally measured by noting the
position of perihelion during each revolution. In Fig. 11-24, successive perihelia
were determined by measurement on the graph, and are denoted by P₁, P₂, P₃, and
P₄. For the orbit parameters chosen, the rate of precession is about 1/18 revolution, or
about 20°, per revolution of the planet.
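
The run described in Example 11-15 can be repeated on a modern computer. What follows is
only a sketch under stated assumptions, not the program of the Numerical Calculation
Supplement. It assumes that the central-force equations have the form
d²x/dt² = −αx(x² + y²)^β (and likewise for y), which gives an inverse-square law for
β = −1.5 and an inverse-2.1-power law for β = −1.55. It reads the initial tangential
velocity as 2.2π AU/yr, steps the motion with a velocity-Verlet scheme, and uses a time
step (1/5200 yr, chosen here) small enough that perihelia can be located numerically
instead of on a graph. The names simulate_orbit and perihelion_angles are illustrative.

```python
import math

def simulate_orbit(x, y, vx, vy, alpha, beta, dt, n_steps):
    """Velocity-Verlet integration of d2x/dt2 = -alpha*x*(x^2+y^2)**beta (same for y).

    Returns the list of (x, y) positions, starting with the initial point.
    """
    def accel(px, py):
        f = -alpha * (px * px + py * py) ** beta
        return f * px, f * py

    ax, ay = accel(x, y)
    path = [(x, y)]
    for _ in range(n_steps):
        vx += 0.5 * dt * ax          # half-step velocity update
        vy += 0.5 * dt * ay
        x += dt * vx                 # full-step position update
        y += dt * vy
        ax, ay = accel(x, y)
        vx += 0.5 * dt * ax          # second half-step velocity update
        vy += 0.5 * dt * ay
        path.append((x, y))
    return path

def perihelion_angles(path):
    """Return the unwrapped polar angle (degrees) at each local minimum of r."""
    r = [math.hypot(px, py) for px, py in path]
    theta = [math.atan2(path[0][1], path[0][0])]
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        # Signed angle swept during one step (small compared with 180 degrees).
        theta.append(theta[-1] + math.atan2(x0 * y1 - y0 * x1, x0 * x1 + y0 * y1))
    angles = []
    if r[0] <= r[1]:                 # the starting point is itself a perihelion
        angles.append(math.degrees(theta[0]))
    for i in range(1, len(r) - 1):
        if r[i] < r[i - 1] and r[i] <= r[i + 1]:
            angles.append(math.degrees(theta[i]))
    return angles

# Initial conditions of Example 11-15 in AU and yr; dt and the run length are assumptions.
path = simulate_orbit(x=1.0, y=0.0, vx=0.0, vy=2.2 * math.pi,
                      alpha=39.5, beta=-1.55, dt=1.0 / 5200, n_steps=8 * 5200)
peri = perihelion_angles(path)
precession = [b - a - 360.0 for a, b in zip(peri, peri[1:])]
print("perihelion advance per revolution (deg):", [round(p, 1) for p in precession])
```

The printed advances can be compared with the figure of about 20° per revolution read from
the graph in the example.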


Every planetary orbit displays precession. Nearly all the effect is due to
the perturbations of the other planets, as seen in exaggerated form in
Example 11-13. In the case of Mercury, the precession rate is about 9.6
minutes of arc (0.16°) per century. When the effects of all the other planets are
subtracted, however, there remains a tiny effect: a precession rate of 43
seconds of arc (0.012°) per century. In his general theory of relativity,
Einstein argued that this arises from “warping of space” in the vicinity of
the sun, owing to its very large mass. The warping makes Mercury move
past the sun along a path which bends more than it would according to the
laws of newtonian physics, with an inverse-square force law and space
remaining unwarped near the sun. The increased bending of the orbit near
the sun makes the orbit “whip around” the sun, leading to a precession that
makes the perihelion advance. The calculation carried out in Example
11-15 uses an inverse-2.1-power law in an unwarped space to simulate (in
exaggerated fashion) the effect of an inverse-square law in a warped space,
and thus to produce a perihelion advance. The simulation works because
the warping of space in Einstein’s theory affects the motion in much the
same way as does making the gravitational attraction of the sun relatively
stronger near the sun than it is for the inverse-square law, and this is just
what is done by modifying the force law to an inverse-2.1-power law.
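
The statement that an inverse-2.1-power law is relatively stronger near the sun can be
checked with a few lines of arithmetic. The sketch below assumes that the two force laws
are given the same strength at r = 1 AU (the same value of α), so that their ratio is
simply r^(−0.1); the function name force_ratio is illustrative.

```python
def force_ratio(r_au, exponent=2.1):
    """Ratio of an inverse-`exponent`-power force to an inverse-square force.

    Both laws are assumed to have equal strength at r = 1 AU.
    """
    return r_au ** (-exponent) / r_au ** (-2.0)

for r in (0.5, 1.0, 1.5):
    print(f"r = {r} AU: F(2.1)/F(2) = {force_ratio(r):.3f}")
```

The ratio exceeds 1 inside 1 AU and falls below 1 outside it, which is the sense in which
the modified law pulls harder near perihelion.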

The orbit precession effect has been observed since 1974 in a system
where its magnitude is very much greater than it is for Mercury. This
system consists of two stars which orbit about their common center of mass.
One of the stars is a so-called radio pulsar, designated PSR 1913+16. The
radio telescope “sees” this star emit brief bursts of radio waves at time
intervals measurable to 1 part in 10¹¹. This makes possible an accurate
description of the pulsar orbit in spite of the very great distance of the
double-star system from the earth.

The observed orbit period of the pulsar is approximately 7.75 h,
compared to the 88 days of the period of Mercury. The distance between the
pulsar and its companion star is considerably smaller than the mean radius
of Mercury’s orbit about the sun. In addition, the orbit of the pulsar is
considerably more eccentric (that is, noncircular) than the orbit of Mercury.
For all these reasons, the precession of the orbit of the pulsar is very much
more rapid than that of the orbit of Mercury. The value is 4.226 ± 0.002
degrees/yr compared to 43 arc seconds per century or 1.2 × 10⁻⁴
degrees/yr.
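
The unit conversion behind this comparison is short enough to verify directly. The
following sketch uses only the two rates quoted in the text; the function name is
illustrative.

```python
ARCSEC_PER_DEGREE = 3600.0
YEARS_PER_CENTURY = 100.0

def arcsec_per_century_to_deg_per_year(rate):
    """Convert a precession rate from arc seconds per century to degrees per year."""
    return rate / ARCSEC_PER_DEGREE / YEARS_PER_CENTURY

mercury = arcsec_per_century_to_deg_per_year(43.0)  # relativistic part of Mercury's precession
pulsar = 4.226                                      # degrees/yr, as quoted for PSR 1913+16
ratio = pulsar / mercury
print(f"Mercury: {mercury:.2e} deg/yr; pulsar rate is about {ratio:.0f} times larger")
```

On these figures the pulsar orbit precesses some tens of thousands of times faster than
the relativistic part of Mercury’s precession.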




EXERCISES 


Group  A 

11-1.  Finding  the  altitude  from  the  period.  If  a satellite  is 
in  a circular  orbit  above  the  earth  with  a period  of  2.00  h, 
how  far  is  it  from  the  center  of  the  earth?  How  high  is  it 
above  the  earth’s  surface? 


11-2. Orbital period of an asteroid. An asteroid (a small
planet) is in a circular orbit about the sun whose radius is
4.0 times the radius of the nearly circular orbit of the
earth. What is the asteroid’s period of revolution? Express
your answer in years.

11-3. Hot g. The radius of the earth’s orbit around
the sun is 1.50 × 10⁸ km, and its orbital period is 3.16 ×
10⁷ s. Taking the radius of the sun to be 6.96 × 10⁵ km,
find the acceleration of gravity at the surface of the sun.

11-4. When the altitude equals the radius.

a. What is the value of g at an altitude equal to the
radius of the earth?

b. What is the period of an earth satellite in a circular
orbit at this height?

c. What is the weight of an astronaut inside the satellite
if his weight on earth is 70 × 9.8 N?

d. What is the astronaut’s mass inside the satellite?

11-5. Acceleration due to lunar gravity. Let g_e be the
acceleration of falling bodies near the earth’s surface and
g_m be that near the moon’s surface. If the mass of the
earth is 81 times the mass of the moon and the radii are
r_e = 6370 km and r_m = 1740 km, respectively, compute
g_e/g_m.

11-6. How high would it go? Kathy finds that if she
throws a baseball directly upward, it falls back to earth
after 6.00 s. (Neglect air resistance throughout this exercise.)

a. What is the initial speed of the baseball? What is its
peak altitude? Take the acceleration of gravity at the
earth’s surface to be g = 9.80 m/s².

b. Suppose that Kathy could throw the ball upward
with the same initial speed from the surface of some other
astronomical object, where the local acceleration due to
gravity is g′. Find the peak altitude and the total flight
time.

c. Evaluate the results of part b for the following objects:

(i) Mars (its mass is 0.107 times that of the earth; its
radius is 0.533 times that of the earth)

(ii) the moon (its mass is 0.0123 times that of the
earth; its radius is 0.272 times that of the earth)

11-7. A transplanted pendulum. The mass of Mars is
0.107 times the mass of the earth. Its radius is 0.533 times
the earth’s radius. What would be the period of a pendulum
on Mars if its period on the earth were 2.00 s?

11-8. Mass ratios in the solar system, I. Io is a satellite of
Jupiter whose distance from the planet is 472,000 km and
whose period is 1.77 days. Rhea is a satellite of Saturn. Its
distance from Saturn is 527,000 km, and its period is 4.52
days. What is the ratio of the mass of Jupiter to the mass of
Saturn?




11-9. Finding the neutral gravity point and the equal gravity
point. The mass of the earth is 81 times the mass of the
moon. At what two points along the line joining the earth
and the moon does the magnitude of the pull of the earth
on a body equal that of the moon? What is the distance d
between the two points?

11-10. Planetary values for the acceleration due to gravity.

a. Prove that the value of g′, the gravitational acceleration
of a falling body at the surface of any planet, is given
by g′ = 4πGρr/3, where ρ is the average density of the
planet and r is its radius. Neglect the effect of rotation.

b. The ratio of the average density of Jupiter to that
of the earth is 0.243. The ratio of their radii is 11.2. What
is the ratio of g′ at the surface of Jupiter to g at the earth’s
surface? Neglect rotational effects.

11-11. Earth satellites obey Kepler! The following data
were published in newspapers for an artificial satellite
launched on March 18, 1958:

Minimum altitude: 655 km
Maximum altitude: 4044 km
Period of revolution: 135 min = 2.25 h

Show that these data satisfy Kepler’s third law. (Hint: The
period of this orbit is the same as the period of a circular
orbit whose radius equals the average of the minimum
and maximum distances from the earth’s center.)
Auxiliary data:

Average radius of the moon’s orbit    3.8… × 10⁵ km
Period of the moon                    27.3 days
Radius of the earth                   … × 10³ km

11-12. Dirty snowball. The nucleus of a typical comet
(that is, the small, relatively dense part of its “head”) is
believed to be a roughly spherical solid body of frozen water
and “gases” (sometimes called a “dirty snowball”) with a
density of about 0.8 g/cm³ and a radius of about 5 km.

a. Find the mass of the comet’s nucleus. Express your
result in kilograms and as a fraction of the mass of the
earth.

b. Find the gravitational acceleration g′ at the surface
of the comet’s nucleus. Express your result in meters per
second squared and as a fraction of the terrestrial
gravitational acceleration g.

c. Find the escape speed at the surface of the comet’s
nucleus. Express your result in meters per second and as a
fraction of the terrestrial escape speed.

11-13. Thrown away. Estimate the size of a rocky
sphere with a density of 3.0 g/cm³ from the surface of
which you could just barely throw away a golf ball (and
have it never return).

11-14. Up and away? A … rifle has a muzzle
velocity of magnitude … m/s.

a. If it were fired vertically upward from the moon’s
surface, would the bullet escape from the moon’s
gravitational field? (Take the moon’s mass to be 7.35 × 10²² kg
and the moon’s radius to be 1738 km.)

b. If you find that the bullet would escape, what
would its final speed be? (Use the principle of the
conservation of mechanical energy, and ignore the other bodies
in the solar system.)

c. If you find that the bullet would not escape, what
would be its maximum distance from the center of the
moon? From the surface of the moon? Find the total flight
time of the bullet.


Group B

11-15. Earth, moon, and sun.

a. The sun’s mass is about 320,000 times the earth’s
mass. The sun is about 400 times as far from the earth as
the moon is. What is the ratio of the magnitude of the pull
of the sun on the moon to that of the pull of the earth on
the moon? (For the purpose of this exercise, it may be
assumed that the sun-moon distance is constant and equal to
the sun-earth distance.)

b. From the result of part a, it is seen that the pull on
the moon is always directed toward the sun. What is the
direction of the curvature of the moon’s orbit as seen from
the sun? A qualitative answer will suffice.

11-16. Restoring the balance. A standard 1-kg mass is
suspended from each side of a sensitive beam balance,
as in Fig. 11E-16. The wire supporting the right mass goes
through an opening in the floor so that it is 10.00 m below
the left mass.

a. What is the fractional excess in the weight of the
right mass over that of the left mass? (This is done most
easily by using differentials.)

b. How many milligrams must be placed on the left
mass to restore the balance?

11-17. No equivalence. Assume that the gravitationally
measured mass m_G of a small body is not necessarily
equal to its inertially measured mass m_I. Show that if a
simple pendulum of length l is made, with the body used
for a bob, the period T for small oscillations of the
pendulum is given by the equation displayed immediately
below Eq. (11-10), which can be written in the form

    T = 2\pi \sqrt{\frac{l}{g}} \sqrt{\frac{m_I}{m_G}}

11-18. Saturn’s rings. The rings of Saturn consist of
myriad small particles, with each particle following its own
circular orbit in Saturn’s equatorial plane. The inner edge
of the innermost ring is about 70,000 km from Saturn’s
center; the outer edge of the outermost ring is about
135,000 km from the center.

a. Find the orbital period of the outermost particles
as a multiple of the orbital period of the innermost
particles.

b. Spectroscopic studies indicate that the outermost
particles have a speed of 17 km/s. Find the mass of Saturn.
Express your result in kilograms and as a multiple of the
earth’s mass.

11-19. Mass ratios in the solar system, II. Newton,
without knowledge of the numerical value of the
gravitational constant G, was nevertheless able to calculate the
ratio of the mass of the sun to the mass of any planet,
provided the planet has a moon.

a. Show that for circular orbits

    \frac{M_s}{M_p} = \left(\frac{R_p}{R_m}\right)^3 \left(\frac{T_m}{T_p}\right)^2

where M_s is the mass of the sun, M_p the mass of the planet,
R_p the distance of the planet from the sun, R_m the distance
of the moon from the planet, T_m the period of the moon
around the planet, and T_p the period of the planet around
the sun.

b. If the planet is the earth, R_p = 1.50 × 10⁸ km,
R_m = 3.85 × 10⁵ km, T_m = 27.3 days, and T_p = 365.2
days. Calculate M_s/M_p.

11-20. Period of a low-altitude satellite.

a. Find the orbital period of a satellite in a circular
orbit just above the surface of a spherical planet of mass M
and radius r.

b. Rewrite the result of part a in terms of the average
density ⟨ρ⟩ = 3M/4πr³ to show that for a given average
planetary density the satellite’s orbital period is
independent of the size of the planet.

11-21. Reduced mass via fictitious force. In the inertial
reference frame O of Fig. 11E-21, two bodies of mass m₁
and m₂ exert gravitational forces on each other along the
line of r. These forces are −Fr = m₁a₁ and Fr = m₂a₂.
Studying the system from a noninertial reference frame
O attached to body 2 requires the introduction of a fictitious
force −m₁a₂ acting on body 1. Show that this leads to
Eq. (11-18).

11-22. Harmonic motion of a two-body system. Two
bodies of unequal masses m₁ and m₂ are attached to the
ends of an unstretched spring of negligible mass, length l
and force constant k. See Fig. 11E-22. The system is placed
on a frictionless table, and the bodies are brought closer
together by compressing the spring. The bodies are
released simultaneously. Since the system is isolated, its
center of mass remains fixed.

Fig. 11E-22

a. What is the force constant k₁ of the part of the
spring between the center of mass and body 1?

b. What is the period of oscillation T₁ of body 1?

c. Repeat parts a and b for body 2.

d. Show that the period of oscillation of the system
can be obtained by using the reduced mass of the system.

11-23. Reduced mass and the internal kinetic energy of a
two-body system. Two particles of mass m₁ and m₂ constitute
an isolated system in rotation with common angular
velocity ω about a fixed center of mass C. The distance from
particle 1 to C is r₁; that from particle 2 to C is r₂.

a. What is the kinetic energy of particle 1? Of particle
2? What is the total kinetic energy of the system?

b. Show that the total kinetic energy equals ½μω²r²,
where μ is the reduced mass of the system and r is the
distance between the two particles. This confirms that the
kinetic energy of the system can be obtained by regarding
one particle as fixed and the other as rotating about the
fixed one, provided the mass assigned to the moving
particle is the reduced mass.

11-24. A central force proportional to 1/r. Consider an
attractive force which is central but inversely proportional
to the first power of the distance. Prove that if a particle
is in a circular orbit with such a force, its speed is
independent of the orbital radius, but its period is proportional
to the radius.

11-25. Energy of motion vs. energy of elevation.

a. If a satellite is raised to a height h above the earth’s
surface and then placed in a circular orbit, what is the
ratio K/ΔU of its kinetic energy to the increase in potential
energy when it is raised from the surface?

b. For an orbit whose height is h = r_e, the radius of
the earth, what is the numerical value of K/ΔU?

11-26. The Bohr atom. A hydrogen atom consists of an
electron of a very small mass and a massive proton. In
Bohr’s model of 1913, the electron revolves about the
proton in an orbit of radius R. To first approximation,
the proton can be considered as fixed. There is an electric
force of attraction between the two of magnitude given
by k/R², where k is a constant.

a. What is the expression for the potential energy of
the system in terms of R?

b. For a circular orbit, what is the kinetic energy K of
the electron? What is the total mechanical energy E of the
system?

c. For a circular orbit, the angular momentum of the
electron about the proton is given by the expression l =
mvR, where m is the mass of the electron and v is its speed.
What is the expression for R in terms of l, m, and k? What
is the value of E in terms of these quantities?

d. Consider two circular orbits for which l₂ = 2l₁.
What is the ratio R₂/R₁? E₂/E₁?

Group  C 

11-27. Tied together. Two earth satellites, with masses
M₁ and M₂, are linked by a tether of length L, as shown in
Fig. 11E-27. The satellites are traveling in circular orbits
of radius R₁ and R₂ = R₁ + L.

Fig. 11E-27

a. Find the (common) orbital period of the linked
pair.

b. Determine the tension in the tether.

c. Evaluate your results for the case of an astronaut
at the end of a 10-m tether attached to a craft comparable
to Skylab: M₁ = 50,000 kg, M₂ = 70 kg, L = 10 m, and
R₁ = 6800 km. Find the orbital period in minutes, and
express the tension in newtons and also as a multiple of
M₂g.

d. Evaluate the required tension for two large craft
linked by a 1.0-km tether: M₁ = M₂ = 50,000 kg, R₁ =
6800 km, L = 1.0 km.

11-28. Roche’s limit. A planet of mass M has a satellite
in an orbit of radius R. The density of the satellite is ρ_s.

a. Show that the satellite will be able to hold itself
together by internal gravitational forces only if the orbit
radius R is greater than a certain minimum distance R_r,
which is given approximately by R_r = …

This minimum orbit radius is called Roche’s limit. If the
orbit radius is smaller than Roche’s limit, the satellite must
hold itself together by nongravitational forces.

b. The orbit radius of the moon is diminishing very
slowly, owing to the effects of friction resulting from tidal
forces. Find the value of Roche’s limit for the moon. Take
the average density of the moon to be ρ_s = 3.4 × 10³
kg/m³.

c. Where in the solar system can you find
observational confirmation for the theory you have developed in
part a?

11-29. A binary star system. A pair of stars close to each
other revolve about their common center of mass.

a. Using their reduced mass, show that

    M + m = \frac{4\pi^2 R^3}{G T^2}

where M is the mass of one star, m the mass of the other, R
the distance between the stars, and T their common
period of revolution.

b. If M_s is the mass of the sun, R_s the radius of the
earth’s orbit, and T_s the earth’s period of revolution, show
that

    \frac{M + m}{M_s} = \left(\frac{R}{R_s}\right)^3 \left(\frac{T_s}{T}\right)^2

c. For the double star 70 Ophiuchi, R = 23R_s and
T = 88 yr. Calculate (M + m)/M_s, the ratio of the mass of
70 Ophiuchi to that of the sun.

11-30. Peak altitude of a satellite. A satellite is launched
from the surface of the earth by firing it horizontally at a
speed of 10 km/s. Assume that the effect of air resistance
is negligible.

a. Apply the laws of conservation of angular
momentum and mechanical energy to find the satellite’s
maximum distance above the earth’s surface.

b. If the satellite were fired vertically upward with
the same speed, what would be its maximum height?

c. Account for the difference between the results of
parts a and b. (Note that GM_e = g r_e², where M_e is the mass
of the earth and r_e is its radius.)

11-31. Elliptical orbits, I.

a. An earth satellite in an elliptical orbit is 250 km
above the earth’s surface at its closest point, and its speed
there is 9 km/s. What is its height above the earth’s surface
at its farthest point? (Use conservation of mechanical
energy and angular momentum, and see the note in Exercise
11-30c.)

b. Let γ and δ be the semimajor and semiminor axes
of the ellipse. Let c be the distance of the focus from the
center of the ellipse. Calculate the values of γ, δ, and c for
the elliptical orbit in part a.

11-32. Elliptical orbits, II. Figure 11E-32 represents
the elliptical orbit of a planet about the sun. The major
axis AA′ has length 2γ, the minor axis BB′ has length 2δ,
and the sun lies at one of the foci, F, a distance c from the
center C of the ellipse.

Fig. 11E-32

a. When the planet is located at aphelion A, its
velocity v is perpendicular to the major axis. The velocity v′ at
perihelion A′ is likewise perpendicular to the major axis.
Call the mass of the planet m, and use the principle of
conservation of angular momentum l to write an equation
relating the quantities m, v, v′, c, and γ.

b. In terms of these quantities and the gravitational
constant G, what is the total mechanical energy of the
sun-planet system when the planet is at perihelion? At
aphelion? What is the relation between these two
energies?

c. From the results of parts a and b, show that the
total energy is E = −GMm/2γ, where M is the mass of
the sun, and that therefore E depends on only the length
of the major axis of the ellipse, and not on its shape.

d. The area of an ellipse is given by πγδ. Use Eq.
(11-15) to express Kepler’s second law in the form
πγδ/T = l/2m, where T is the period of the planet.

e. Another general property of ellipses is the relation
δ² = γ² − c². Show that T² = 4π²γ³/GM, so that the period
of a planet depends on only the major axis of its orbit.
11-33. Transfer orbit. An astronaut who is in a circular
orbit about the earth at 7000 km from its center wishes to
dock with a space station which is in a circular orbit about
the earth at 10,000 km from its center. See Fig. 11E-33. He
fires booster rockets briefly, to increase his speed so as to
arrive at the point S₂ on the space station orbit where it is
on the opposite side of the earth from the astronaut’s
position when he fires the rockets. He also wishes to
approach S₂ along the tangent to the space station orbit.

a. What must be his speed after the completion of the
burning?

b. What was the increase in his speed over the short
rocket burning period?

c. With what speed does he arrive at S₂?

Fig. 11E-33

d. What must he do to dock with the space station?

e. Find the value of θ giving S₁, the position of the
space station relative to S₂, when the astronaut fires his
booster rockets.

(Hint: See Exercise 11-32, parts c, d, and e; also note that
GM_e = g r_e², where M_e is the mass of the earth and r_e its
radius.)

11-34. Satellite watch. A satellite is placed in a circular
orbit traveling from west to east in the plane of the
equator of the earth at an altitude of 800 km. How long will it
remain above the horizon at any one place?

Numerical 

11-35. A numerical test of Kepler’s first law. Run the
central force program with initial conditions and parameters
as in Example 11-9, but use (dy/dt)₀ = 1.5π (in AU/yr).
Sufficient accuracy will be obtained if you take Δt =
1/(4 × 52) = 1/208 (in yr), and plot every fourth point.
(While plotting, pick some calculational cycle arbitrarily
and record the numerical values of x, y, dx/dt, and dy/dt
stored in the registers in that cycle. These data will be
employed in another exercise.) Use your plot to test
Kepler’s first law.

11-36. A numerical test of Kepler’s second law. Use the
plot obtained in Exercise 11-35 to test Kepler’s second law.
Employ the geometrical procedure of Fig. 11-15.

11-37. A numerical test of angular momentum conservation.
Use the values of x, y, dx/dt, and dy/dt from Exercise
11-35 for the initial calculational cycle, and for the
arbitrary calculational cycle for which you recorded these
values, to test the conservation of angular momentum in
central force motion. Employ Eq. (11-32).

11-38. A numerical test of Kepler’s third law. Combine
the results obtained in Exercise 11-35 for the semimajor
axis of the orbit and the orbital period, with those listed in
Table 11-2, to test Kepler’s third law.


11-39. A numerical test of energy conservation. Use the
values of x, y, dx/dt, and dy/dt from Exercise 11-35 for the
initial calculational cycle, and for the arbitrary
calculational cycle for which you recorded these values, to
evaluate the potential and kinetic energy of the planet at the
two points in its orbit. Then use these data to test the
conservation of total mechanical energy. Explain the
significance of the sign of the total energy.

11-40. Orbit of a comet. A comet is located at x = 4
AU, y = 3 AU with velocity components dx/dt = 4.5
AU/yr and dy/dt = 0. The sun is at the origin. Take α =
GM = 39.50 (AU)³/(yr)².

a. Is the total mechanical energy of the sun-comet
system positive, zero, or negative?

b. Is this a bound or unbound system?

c. Taking Δt = 0.08 yr, run the central force
program given in the Numerical Calculation Supplement. At
several points along the orbit record the values of x, y,
dx/dt, and dy/dt stored in the registers. Use these values
and Eqs. (11-13) and (11-32) to verify Kepler’s second law.

d. Plot the orbit.

e. Repeat parts a through d assuming gravitational
repulsion. While there is no evidence that gravitational
repulsion exists, inverse-square electric repulsion is just as
common as attraction, and the gravitational calculation is
essentially the same as the electrical one. To use the
central force program for repulsion, the value of α is taken to
be negative: specifically α = −39.5. Plot the orbit you
obtain on the same graph as part d.

f. Compare the position of the center of force in
relation to the orbit in the two cases.

11-41. Numerical construction of a hyperbolic orbit. Run
the central force program as in Example 11-10, but use
(dy/dt)₀ = 2(2π) (in AU/yr). (While plotting, pick some
calculational cycle arbitrarily and record the numerical
values of x, y, dx/dt, and dy/dt stored in the registers in that
cycle. These data will be employed in another exercise.)
Compare the hyperbolic trajectory you obtain with the
parabolic one obtained in Example 11-10 for (dy/dt)₀ =
√2(2π) (in AU/yr). Can you make a geometrical test to
show that your trajectory actually is a hyperbola?

11-42. Angular momentum conservation in a hyperbolic
orbit. Use the values of x, y, dx/dt, and dy/dt from Exercise
11-41 for the initial calculational cycle, and for the
arbitrary calculational cycle for which you recorded these
values, to test the conservation of angular momentum in
central force motion. Employ Eq. (11-32).

11-43. Energy conservation in a hyperbolic orbit. Use the
values of x, y, dx/dt, and dy/dt from Exercise 11-41 for the
initial calculational cycle, and for the arbitrary
calculational cycle for which you recorded these values, to
evaluate the potential and kinetic energy of the celestial body at
the two points in its trajectory. Then use these data to test
the conservation of total mechanical energy. Explain the
significance of the sign of the total energy.

1 1-44.  Stability  test  for  a circular  orbit  with  a central  force 
of  constant  magnitude.  Test  the  stability  of  circular  orbits  for 


498  Gravitation  and  Central  Force  Motion 


an  attractive  central  force  obeying  the  law  F r°  (that  is,  F 
is  a constant).  Do  this  by  running  the  central  force  pro- 
gram with  a perturbation  given  to  the  orbiting  body  as  in 
Example  11-14,  but  use  the  value  of  the  parameter  (3  cor- 
responding to  this  force  law.  (An  example  of  a system  in- 
volving the  force  law  is  a puck  on  an  air  table  connected  by 
a string  over  a swiveling  pulley  at  the  center  of  the  table  to 
a suspended  weight,  as  in  Fig.  3-30.  The  force  acting  on 
the  puck  is  an  attractive  central  force  of  essentially  con- 
stant magnitude.) 

11-45. Stability test for a circular orbit with a central force
proportional to the distance. Test the stability of circular orbits
for an attractive central force obeying the law $F \propto r^1$. Do
this by running the central force program with a
perturbation given to the orbiting body as in Example 11-14, but
use the value of the parameter $\beta$ corresponding to this
force law. (An example of a system involving the force law
is a puck on an air table connected to one end of a spring
whose other end is attached to a fixed swivel at the center
of the table. If the spring is very extensible, and of
negligible length when unextended, the magnitude of the
attractive central force acting on the puck will obey
Hooke’s law; that is, it will be proportional to the distance r
from the center.)

11-46. Stability test for a circular orbit with an inverse
fourth power central force. Test the stability of circular orbits
for an attractive central force obeying the law $F \propto r^{-4}$. Do
this by running the central force program with a
perturbation given to the orbiting body as in Example 11-14, but use
the value of the parameter $\beta$ corresponding to this force
law. Compare the spiral trajectory obtained with the one
shown in Fig. 11-22, and explain the difference between
them.


11-47. Noncircular orbits with a central force of constant
magnitude. Run the central force program as in Example
11-15, but use the parameter $\beta$ corresponding to an
attractive central force obeying the law $F \propto r^0$. (See
Exercise 11-44.) Compare the precession of the orbit with that
found for the $F \propto r^{-2.1}$ orbit plotted in Fig. 11-24.

11-48. Noncircular orbits for a central force proportional
to distance. Run the central force program as in Example
11-15, but use the parameter $\beta$ corresponding to an
attractive central force obeying the law $F \propto r^1$. (See Exercise
11-45.) Can you explain why there is no precession of the
orbit?

11-49. Effect of solar wind on a satellite orbit. Use the
central force program to study the effect of the “solar
wind.” This is a stream of ionized gas flowing at high
speed outward from the sun. One of its effects is to sweep
comet tails away from the sun. Another is to disturb the
motion of earth satellites. To study the effect on satellites,
modify the program in such a way that the number 0.001
is added to the register holding dx/dt in each calculational
cycle, above and beyond whatever normally happens to
dx/dt in the cycle. This continual addition of a small
velocity in the x direction represents the effect of a weak force
continually exerted on the satellite in that direction, as
seen from a reference frame in which the earth is always
at the origin and the sun is always in the negative x
direction. Run the program with the following set of initial
conditions and parameters: $x_0 = 1$; $(dx/dt)_0 = 0$; $y_0 = 0$;
$(dy/dt)_0 = 1$; $t = 0$; $\Delta t = 0.1$; $\alpha = 1$; $\beta = -1.5$. (These
simple dimensionless values are used since the AU and
the year are not appropriate units of distance and time
here.) Can you explain why the orbit shifts in a direction
perpendicular to the solar wind?


Mechanical 
Traveling  Waves 


12-1 TRAVELING WAVES

Imagine a cork floating on the surface of a calm pond. You throw a pebble
into the pond at some point remote from the cork and then watch the cork.
After  a time,  you  see  the  cork  move  up  and  down.  Some  of  the  mechanical 
energy  you  originally  gave  to  the  pebble  has  appeared  on  the  cork.  This  en- 
ergy is  carried  from  the  pebble  to  the  cork  by  the  ripple  that  spreads  over 
the  surface  of  the  pond  from  where  the  pebble  struck  the  surface.  If  you 
put  other  corks  elsewhere  on  the  surface  of  the  pond,  they  also  will  be  set 
into  motion  by  the  ripple.  A small  fraction  of  the  energy  carried  by  the 
ripple  is  deposited  on  any  cork  that  the  ripple  happens  to  meet.  And  even 
if  there  are  no  corks,  still  the  ripple  itself  carries  energy.  This  transport  of  en- 
ergy by  the  ripple  does  not  involve  the  transport  of  matter.  That  is,  particu- 
lar water  molecules  do  not  continually  move  along  with  the  ripple.  Instead, 
water  molecules  make  a small  up-and-down  motion  as  the  ripple  passes,  re- 
turning to  their  original  positions  after  it  has  passed.  What  is  moving  across 
the  pond,  and  carrying  energy,  is  not  water  molecules  but  a disturbance  in 
the  position  of  water  molecules. 

A ripple  traveling  over  the  surface  of  a pond  is  a particular  case  of  a 
general  phenomenon  called  a traveling  wave.  A traveling  wave  in  a me- 
chanical system  is  a disturbance  in  the  positions  of  the  particles  of  the 
system  from  their  normal  positions,  which  moves  through  the  system.  En- 
ergy is  transported  over  long  distances  by  the  moving  disturbance,  even 
though  the  particles  of  the  system  move  only  very  short  distances  as  the 
wave  passes  them,  and  end  up  where  they  started. 

The  topic  of  this  chapter  is  traveling  waves  in  mechanical  systems. 
Sound  waves  traveling  through  the  air  are  the  most  important  example 
found  in  nature.  But  there  are  many  other  important  examples,  such  as 
seismic  waves  traveling  through  the  earth. 


In  many  cases  our  interest  in  the  energy  carried  by  a mechanical  travel- 
ing wave  seems  secondary  to  our  interest  in  the  information  carried  by  the 
wave.  Consider  sound  waves.  Their  importance  in  carrying  information 
from  the  mouth  of  a speaker  to  the  ear  of  a listener  is  obvious.  Nevertheless, 
if  the  sound  wave  did  not  carry  enough  energy  to  set  the  listener’s  eardrum 
into  vibration,  it  could  carry  no  information.  Geologists  produce  seismic 
waves  by  detonating  explosives  buried  in  the  earth.  From  measurements  of 
the  speeds  at  which  the  waves  travel  through  the  earth  along  different 
paths,  geologists  obtain  information  which  is  invaluable  in  finding  oil. 
However,  the  method  would  not  work  if  seismic  waves  did  not  carry 
enough  energy  to  be  able  to  actuate  the  geologists'  detectors.  In  some  cases 
the  amount  of  energy  transported  by  a mechanical  wave  is  so  large  that  the 
energy  itself  is  of  primary  interest.  A very  loud  sound  wave  is  painful  be- 
cause it  imparts  so  much  energy  to  the  eardrum.  A seismic  wave  produced 
by  an  earthquake  is  dangerous  because  of  the  tremendous  amount  of  en- 
ergy it  carries. 

Later  in  this  chapter  we  apply  newtonian  mechanics  to  a simple  system 
for  the  purpose  of  explaining  the  observed  properties  of  mechanical  travel- 
ing waves. But first we must build up the linguistic and mathematical tools
that are needed in describing these properties. That is, first we must develop
traveling-wave  kinematics.  We  do  so  in  this  section  for  a wave  consisting  of 
a single,  isolated  disturbance  traveling  through  a system.  Such  a traveling 
wave  is  known  as  a wave  pulse.  In  Sec.  12-2  we  extend  our  considerations 
to  a traveling  wave  comprising  a series  of  adjacent  pulses. 

The  familiar  separation  between  kinematics  and  mechanics  is  particularly 
useful  in  connection  with  wave  motion.  The  reason  is  that  the  universe  is,  literal- 
ly, filled  with  waves  that  travel  through  nonmechanical  systems.  These  are  light 
waves,  radio  waves,  and  the  many  other  members  of  the  family  of  electromagnetic 
waves.  In  most  regards  the  “mechanics”  of  electromagnetic  waves  is  completely 
different  from  the  mechanics  of  waves  traveling  through  mechanical  systems. 
This  is  because  the  properties  of  electromagnetic  waves  must  be  explained  in 
terms  of  laws  of  nature  that  are  quite  distinct  from  Newton’s  laws  of  motion.  But 
the  kinematics  of  electromagnetic  waves  is  exactly  the  same  as  the  kinematics  of 
mechanical  traveling  waves,  in  most  of  its  features.  So  we  will  be  able  to  make  use 
of  much  of  what  we  develop  here  on  a number  of  occasions  later  in  the  book  when 
we  study  light  and  other  electromagnetic  waves. 

We  begin  developing  traveling-wave  kinematics  by  considering  a wave 
pulse  traveling  along  a stretched  rope,  or  heavy  string.  One  end  of  the 
string  is  tied  to  a rigid  support.  The  other  end  is  pulled  by  the  hand  of  an 
experimenter,  with  tension  being  applied  by  the  hand  stretching  the  string. 
The  experimenter  gives  the  end  being  held  a single,  sharp  up-and-down 
jerk, forming a bulge in the string as illustrated in the first part of Fig.
12-1.  The  bulge  does  not  stay  in  place.  Instead  it  travels  uniformly  along 
the  string  as  a wave  pulse,  in  the  manner  illustrated  in  subsequent  parts 
of  the  figure.  A description  of  this  situation  is  simpler  than  a descrip- 
tion of  a ripple  traveling  over  the  surface  of  a pond.  As  the  ripple  spreads 
over  the  surface  of  the  pond  in  two  dimensions  from  the  point  where  it  is 
produced,  its  height  diminishes.  This  is  because  the  energy  carried  by  the 
ripple is spread over a larger and larger region. But a wave pulse traveling
in  one  dimension  along  a string  does  not  spread.  In  fact,  experiment  shows 
that  the  pulse  itself  maintains  the  same  shape  as  it  travels  along  the  string, 


Fig.  12-1  A wave  pulse  being  pro- 
duced in  a stretched  string.  The  top  dia- 
gram shows  the  string  stretched  between 
the  fixed  support  at  its  right  end  and  the 
hand  pulling  on  it  at  its  left  end.  The 
diagrams  below  show  the  position  of  the 
hand  and  the  shape  of  the  string  at  sub- 
sequent times.  The  time  interval  be- 
tween successive  diagrams  is  the  same. 


provided  that  frictional  effects  are  not  significant  and  that  the  height  of  the 
pulse  is  not  so  large  that  the  tension  in  the  string  is  affected  significantly  by 
the  presence  of  the  pulse.  (Experimental  evidence  is  presented  in  Fig. 
12-2.)  Another  reason  why  the  pulse  traveling  along  the  stretched  string  is 
simpler  to  describe  is  just  that  there  is  one  less  dimension  that  must  be  dealt 
with,  and  so  one  less  coordinate  that  must  be  used. 

You  should  note  that  what  is  traveling  horizontally  along  the  string  is 
not  the  particles  constituting  the  string.  Each  of  these  particles  only  moves 
up  a little  and  then  back  down.  This  behavior  is  typical  of  wave  motion. 
Although waves transport energy through the system, the individual particles
of  the  system  move  very  little.  In  fact,  the  total  displacement  of  each 
particle  is  zero.  Furthermore,  what  motion  the  particles  do  go  through  is 
not  even  along  the  string  (the  way  the  pulse  moves)  but  perpendicular  to  it. 
When  the  motion  of  the  particles  of  the  system  is  everywhere  perpendic- 
ular to  the  direction  in  which  the  wave  travels,  it  is  said  to  be  a transverse 
wave.  So  the  wave  pulse  moving  along  the  stretched  string  is  a transverse 
traveling  wave. 

Fig. 12-2 Frames from a motion picture of the generation of a wave pulse in a
stretched spring. A spring was used, instead of a string, because it was easier to
photograph. (From PSSC Physics, 2d ed., D. C. Heath, Boston, 1965. Courtesy
Educational Development Center.)

How can we describe the wave pulse traveling along the stretched
string? At any instant the complete shape of the string has a unique and
distinct character. Just after the wave pulse is generated, for instance, the
string has a bulge at its left end and is otherwise straight. At some particular
time, the complete shape of the string can be specified by giving the
position of all the elements of the string in terms of the mathematical function

$$y = f(x) \tag{12-1}$$

In  this  expression  y is  the  vertical  coordinate  of  an  element  of  string,  mea- 
sured from  its  undisturbed  position  at  y = 0.  The  location  of  the  element 
along  the  string  is  denoted  by  its  horizontal  coordinate  x.  The  specific 
mathematical  form  of  the  function  is  whatever  is  required  to  describe  the 
specific  shape  of  the  string  at  the  time  being  considered.  Figure  12-3  de- 
picts such  a function.  Its  graphical  representation  is,  indeed,  nothing  but  a 
“snapshot”  of  the  string,  at  the  time  to  which  the  function  applies,  with  the 
proper  coordinate  axes  superimposed. 

The  function  in  Eq.  (12-1)  describes  the  shape  of  the  string  at  only  one 
instant  of  time.  But  the  vertical  coordinate  y of  an  element  of  the  string 
whose  horizontal  coordinate  is  x also  depends  on  the  time  t,  since  the  wave 
pulse  moves  along  the  string.  Thus,  in  order  to  describe  the  positions  of  all 
the  elements  of  the  string  at  all  times,  we  need  a function  which  depends  on 
both  x and  t.  Symbolically,  such  a function  is  written  as 

$$y = f(x, t) \tag{12-2}$$

and  is  called  a wave  function. 

Fig.  12-3  A plot  of  the  function  y = f(x)  giving,  at  some  instant 
of  time,  the  dependence  on  the  coordinate  x measured  along 
a string  of  the  transverse  coordinate y of  elements  of  the  string. 


Fig. 12-4 (a) A plot of the wave function $y = f(x, t)$ versus x for t = 0.
It shows a profile at that instant of a wave pulse traveling along a
string. (b) A plot for the later instant t > 0 shows the profile of the
wave pulse at that instant. The profile is the same as at the earlier
instant, but it has traveled along the string. It moves an amount vt,
where v is the velocity of the wave pulse.


We  can  find  a more  explicit  form  for  this  abstract  expression  by  con- 
sidering the  physical  situation  in  more  detail.  As  time  passes,  the  wave 
pulse  moves  uniformly  along  the  string  in  some  direction,  say  the  direction 
of  positive  x.  It  makes  perfect  sense  to  describe  the  motion  of  the  pulse  in 
terms  of  a wave  velocity  v,  whose  value  is  given  by 


$$v = \frac{dx}{dt} \tag{12-3}$$


Here  x is  the  horizontal  coordinate  at  time  t of  any  characteristic  point  on 
the  pulse,  say  its  maximum.  Be  sure  to  note  that  v is  not  the  velocity  of  any 
material  object.  The  wave  velocity  is  the  velocity  of  the  moving  disturbance  in  the 
elements  of  the  string,  and  not  the  velocity  of  any  element  of  the  string.  The  signifi- 
cance of  the  wave  velocity  is  illustrated  in  Fig.  12-4.  The  wave  velocity  is  a 
signed-scalar  quantity  v,  whose  value  is  a positive  constant  for  the  case  de- 
picted in  the  figure. 

The  next  step  in  finding  a more  explicit  form  for  a wave  function 
describing  a wave  pulse  traveling  along  the  string  is  to  consider  what  the 
string  looks  like  from  the  points  of  view  of  two  different  observers,  one 
fixed  with  respect  to  the  string  and  the  other  moving  with  the  pulse.  This 
consideration will show us that the variables x and t can enter into the function
$f(x, t)$ only in a certain combination. A wave pulse is moving along the
string  in  the  positive  x direction.  An  observer  O remains  fixed  at  the  end  of 
the  string  where  the  pulse  was  produced.  She  measures  the  vertical  coordi- 
nate y of  each  element  of  the  string  with  horizontal  coordinate  x at  various 
times  t.  The  summary  of  her  observations  is  precisely  what  we  mean  by  the 
wave  function  of  Eq.  (12-2): 

y = f(x,  t) 

Another  observer  O'  is  moving  with  respect  to  O at  a constant  velocity 
which  he  adjusts  to  be  exactly  equal  to  the  wave  velocity  v.  That  is,  O sees  O' 
moving  at  a velocity  just  equal  to  the  wave  velocity  she  observes  for  the 
pulse. The situation at time t = 0 is depicted from the viewpoint of O in
Fig. 12-5a. At that time O' passes O, and the production of the pulse has
just  been  completed. 

Observer  O'  also  watches  the  wave  pulse  in  the  string.  His  observations 
are  made  from  the  primed  coordinate  system,  which  moves  along  with  the 
wave  pulse.  This  moving  coordinate  system  and  the  pulse  are  shown  from 


Fig. 12-5 (a) A wave pulse propagating along a string with
velocity v to the right, as seen at time t = 0 in the reference
frame of an observer O who is stationary with respect to the
string. Also shown is the reference frame of an observer O' who
is moving with velocity v. (b) The situation at a later time t > 0,
as seen in the reference frame of O. Both the wave pulse and
the reference frame of O' have moved by the same amount
vt with respect to O.


the viewpoint of O at the time t > 0 in Fig. 12-5b. Since O' moves along with
the  wave  pulse,  its  position  does  not  change  from  his  viewpoint.  So  the 
function  which  summarizes  his  observations  of  the  vertical  coordinates  y'  of 
elements  of  the  string  depends  only  on  their  horizontal  coordinates  x'  mea- 
sured from  the  origin  of  his  coordinate  system.  There  is  no  time  depend- 
ence because  the  pulse  does  not  move,  as  he  sees  it,  and  also  because  the 
shape  of  the  pulse  itself  does  not  change.  Thus  O'  writes  the  function  as 

$$y' = f(x') \tag{12-4}$$

In  order  to  compare  the  observations  of  O'  with  her  own,  O must  con- 
vert data  expressed  in  terms  of  x'  and  y'  to  data  expressed  in  terms  of  her 
own coordinates x and y. Figure 12-5b shows that the conversion is given by
the  two  equations 

$$x' = x - vt$$

and

$$y' = y$$

Applying the first of these to Eq. (12-4), she has

$$y' = f(x') = f(x - vt)$$

Applying the second, she finds

$$y = y' = f(x - vt)$$

or

$$y = f(x - vt) \tag{12-5}$$

The function on the right side of Eq. (12-2), $y = f(x, t)$, describes the
observations of O. The function on the right side of Eq. (12-5),
$y = f(x - vt)$, describes the observations of O' in terms of the coordinates used by
O. Since O and O' have observed exactly the same phenomenon, these two
functions are describing exactly the same thing, and so can be equated to
obtain

$$f(x, t) = f(x - vt) \tag{12-6}$$


The only possible form for the wave function $y = f(x, t)$, which describes the
wave pulse moving along the string in the positive direction with wave
velocity v, is $y = f(x - vt)$. The variables x and t must always enter into this
function in the combination $x - vt$, no matter what the specific form of the
function is. Any other combination of x and t would lead to the physically
impossible situation where different observers could not reconcile their
observations. For instance, this would be the case for the dimensionally
consistent, but physically impossible, combination $x - 2vt$.

The specific form of the function of $x - vt$ which must be used depends
on the specific shape of the wave pulse that the function describes. An
example of such a function (to be used early in Sec. 12-2 to describe a series of
adjacent pulses extending above and below the undisturbed string) is
$y = A \cos[B(x - vt)]$, where A and B are constants.
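
The requirement that x and t enter only through the combination x - vt can be illustrated numerically. The sketch below is a minimal example under stated assumptions: the Gaussian profile, the values of v and t, and the names `profile` and `wave` are illustrative choices that do not come from the text.

```python
import math

def profile(u):
    """Assumed pulse shape f(u): a Gaussian bump centered at u = 0."""
    return math.exp(-u * u)

def wave(x, t, v):
    """Wave function y = f(x - v t) for a pulse with wave velocity v."""
    return profile(x - v * t)

v = 2.0   # assumed wave velocity (arbitrary units)
t = 1.5   # assumed later time (arbitrary units)

# Sample the string at evenly spaced points from x = -5 to x = 10.
xs = [i * 0.01 for i in range(-500, 1001)]

# Location of the maximum of the pulse at t = 0 and at the later time.
peak_0 = max(xs, key=lambda x: wave(x, 0.0, v))
peak_t = max(xs, key=lambda x: wave(x, t, v))
shift = peak_t - peak_0   # should be close to v * t

# The profile at time t, read at x + v t, should match the profile at t = 0 read at x.
max_mismatch = max(abs(wave(x + v * t, t, v) - wave(x, 0.0, v)) for x in xs)
print(shift, max_mismatch)
```

With these assumed values the maximum should be displaced by about vt, and the shifted profile should agree with the original one to within rounding error, which is what is meant by a pulse that travels without changing its shape.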


Because of the importance of Eq. (12-6), and because its interpretation may
cause you some difficulty at first, it is worthwhile to go through an argument
which verifies that the equation actually does describe a wave pulse that travels in
the positive x direction, at the wave velocity v, without changing its shape. A
particular point on the pulse, say the point marked P in Fig. 12-5b, is a point
where the function $f(x - vt)$ always maintains a particular value y. If the function
is always to maintain a particular value, it is necessary that the argument $x - vt$ of
the function always maintain a particular value. That is, for any particular point
on the wave pulse we must have

$$x - vt = \text{constant}$$

Different points are associated with different values of the constant. For instance,
the point marked M in the figure is associated with the constant whose value leads
to the function $f(x - vt)$ having a maximum value. As the time t increases, the
coordinate x of a particular point on the pulse must become more positive in order
to compensate for the fact that the quantity $-vt$ becomes more negative, since v
has a positive value. This must be so if $x - vt$ is to maintain a constant value. Thus
the point on the pulse moves in the positive x direction. You can calculate its
velocity dx/dt by taking the time derivative of each term in the equation
$x - vt = \text{constant}$, remembering that v is constant, to obtain

$$\frac{dx}{dt} - v = 0$$

or

$$\frac{dx}{dt} = v$$

Thus  the  point  moves  with  a velocity  v that  is,  according  to  the  definition  of  Eq. 
(12-3),  the  wave  velocity.  Since  you  will  obtain  the  same  result  for  dx/dt  no  matter 
what  the  value  of  the  constant,  every  point  on  the  pulse  moves  with  that  same 
velocity.  As  a consequence,  the  pulse  maintains  its  shape  as  it  moves. 
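
The same conclusion can be checked by following several points of fixed height on a pulse and estimating dx/dt for each. The following is a sketch only: the Gaussian profile, the bisection helper `leading_edge_position`, and the numerical values are assumptions made for illustration.

```python
import math

def profile(u):
    """Assumed pulse shape f(u): a Gaussian bump centered at u = 0."""
    return math.exp(-u * u)

def leading_edge_position(height, t, v):
    """Find x ahead of the maximum where f(x - v t) equals `height`, by bisection."""
    lo, hi = v * t, v * t + 10.0   # f falls from 1 to nearly 0 across this interval
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if profile(mid - v * t) > height:
            lo = mid   # still above the target height: move right
        else:
            hi = mid   # at or below the target height: move left
    return 0.5 * (lo + hi)

v = 3.0             # assumed (positive) wave velocity
t1, t2 = 0.4, 0.9   # two assumed observation times

velocities = []
for height in (0.9, 0.5, 0.1):   # three different points on the pulse
    x1 = leading_edge_position(height, t1, v)
    x2 = leading_edge_position(height, t2, v)
    velocities.append((x2 - x1) / (t2 - t1))
print(velocities)
```

Each height corresponds to a different value of the constant in x - vt = constant, yet every estimate of dx/dt should come out equal to v to within rounding error.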

If a wave pulse is moving along a stretched string in the negative
direction, then its wave velocity v will have a negative value. Let us write it as
$v = -|v|$, where $|v|$ represents the speed of the wave, and then substitute into Eq.
(12-6). We obtain the wave function $f(x, t) = f(x + |v|t)$. From the viewpoint
of an observer O fixed with respect to the string, this describes a wave
traveling along the string in the negative x direction at speed $|v|$. You can
check this wave function in either of two ways. First, you can derive it directly
from an argument like the one leading to Eq. (12-6), but with the wave
pulse and the observer O' moving in the negative x direction. Second, you
can go through an argument like the one in small print above.

It is worthwhile to write an expression which is completely equivalent
to Eq. (12-6), but which makes more apparent the fact that the sign of the
numerical value of the second term in the argument of the wave function
depends on the direction in which the wave travels. The expression is

$$f(x, t) = f(x \mp |v|t) \tag{12-7}$$

The minus sign is used for a wave traveling in the positive x direction
(v > 0), and the plus sign is used for a wave traveling in the opposite direction
(v < 0).
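
The sign convention can be tried out directly. In this minimal sketch the Gaussian profile, the `direction` flag (+1 or -1), and the numerical values are hypothetical choices introduced only for illustration.

```python
import math

def profile(u):
    """Assumed pulse shape f(u): a Gaussian bump centered at u = 0."""
    return math.exp(-u * u)

def wave(x, t, speed, direction):
    """f(x - |v| t) for direction = +1, f(x + |v| t) for direction = -1."""
    return profile(x - direction * speed * t)

def peak_position(t, speed, direction):
    """Locate the maximum of the pulse on a grid from x = -10 to x = 10."""
    xs = [i * 0.01 for i in range(-1000, 1001)]
    return max(xs, key=lambda x: wave(x, t, speed, direction))

speed, t = 2.0, 1.5   # assumed speed |v| and elapsed time
right = peak_position(t, speed, +1)   # minus sign in the argument
left = peak_position(t, speed, -1)    # plus sign in the argument
print(right, left)
```

The maximum should be found near +|v|t when the minus sign is used in the argument and near -|v|t when the plus sign is used.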


12-2 WAVE TRAINS

We have spoken so far of a single traveling wave pulse, produced by a
single  cycle  of  agitation  of  the  end  of  the  string.  There  are  two  excellent 
reasons  for  considering  the  case  of  repeated  agitations,  where  an  oscillating 
source  moves  the  end  of  the  string  up  and  down  continually. 

One  reason  is  that  there  are  very  many  important  physical  situations  in 
which  waves  are  excited  by  an  oscillating  system.  For  example,  most  sources 
of  sound  and  of  radio  waves  are  of  this  type. 

Another  reason  is  that  we  will  see  in  Chap.  13  that  any  wave,  no  matter 
how  complicated  its  shape,  can  be  built  up  of  separate  waves,  each  of  which 
is  produced  by  a source  (that  is,  a device)  executing  harmonic  motion  of  a 
particular frequency. Thus an understanding of the relatively simple
situation in which one end of a stretched string is set into motion by such an
oscillator is a long step toward the analysis of waves of arbitrary shape.

For  these  reasons,  we  consider  the  traveling  waves  produced  in  a 
stretched  string  whose  movable  end  is  shaken  repeatedly  up  and  down  by 
a source  which  itself  moves  vertically  in  harmonic  motion.  Using  Eq.  (6-27), 
we  can  write  the  vertical  position  y of  the  end  of  the  string  connected  to  the 
source  as 


$$y = A \cos(2\pi\nu t) \tag{12-8}$$

Here y = 0 corresponds to the harmonic oscillator being at its equilibrium
position, which is also the undisturbed position of the end of the string.
The amplitude A is the maximum displacement of the oscillator from that
position. The frequency $\nu$ is the number of times per second that the
oscillator goes through its cycle of oscillation. And the zero of the time scale is
chosen so as to make y = A at t = 0.


How does the stretched string behave as it is shaken by the source? Let
us  assume  that  the  string  is  very  long,  so  that  we  need  not  worry  about  what 
happens  at  the  other  end.  Each  half-cycle  of  the  harmonic  oscillator  is  very 
much  like  the  single  shake  we  have  already  discussed,  which  put  a wave 
pulse  into  the  string.  Here,  however,  each  upward  shake  is  immediately  fol- 
lowed by  a downward  shake,  and  so  on.  Each  pulse  travels  down  the  string 
just  as  before,  followed  immediately  by  the  next  one.  The  result  is  that 
there  is  a series  of  pulses  traveling  down  the  string,  called  a wave  train.  If 
we  take  a snapshot  of  the  string  at  some  particular  instant,  it  will  look  like 
Fig.  12-6a.  Since  the  excitation  of  the  end  of  the  string  is  a repetitive 
process,  a point  on  the  string  at  some  characteristic  location  on  a wave  — 
say,  a maximum — will  have  exactly  the  same  coordinate  y as  any  other 


Fig. 12-6 (a) A continuous, sinusoidal wave propagating along a
string. The transverse coordinate y of elements of the string is
plotted versus the coordinate x measured along the string at a fixed
time t. This plot represents the result of taking a “snapshot” of the
string at one time t, over a range of values of x. The plot defines the
wavelength $\lambda$ of the wave and its amplitude A. (b) The same wave
with the transverse coordinate y plotted versus the time t for the
element of string at a fixed coordinate x. This plot represents the
result of taking a “movie” of the string at one location x, over a
range of values of t. The plot defines the period T of the wave and
its amplitude A. The reciprocal of T is the frequency $\nu$ of the wave;
that is, $\nu = 1/T$.


point at a corresponding location on the wave train. Furthermore, the
corresponding locations are spaced at equal distances along the string. This
distance is called the wavelength $\lambda$ (Greek lambda). The figure shows that
it  does  not  matter  at  what  point  on  the  wave  we  begin  the  measurement  of 
the  wavelength.  All  that  is  necessary  is  that  we  end  at  the  next  point  having 
a corresponding  location  on  the  wave  train. 

Figure  12-6a  illustrates  what  the  entire  string,  extending  over  space, 
looks  like  at  a fixed  instant  of  time  t.  How  does  a single  particle  of  the  string 
at  a fixed  horizontal  coordinate  x appear  to  move  when  it  is  observed  over 
an  extended  time?  We  can  imagine  making  a motion  picture  of  a little  piece 
of  the  string,  using  a vertical  slot  in  a mask  in  front  of  the  string  to  select  a 
fixed  value  of  x.  Then  we  use  the  film  to  measure  the  y coordinate  of  the 
piece  as  a function  of  the  time  t as  the  piece  oscillates  up  and  down.  A plot 
of the results would look like Fig. 12-6b. Here the horizontal axis is a t axis,
rather than an x axis. But otherwise there is a very strong resemblance
between Fig. 12-6b and 12-6a. The reason is that there is a strong
connection between the snapshot of a wave train and the motion picture of a
particle of the system through which the wave train travels. It takes a time T
(the period of the particle’s oscillation) for a particle of the string to go
through one complete cycle of displacements y in the fixed-x movie of Fig.
12-6b. Analogously, it takes a length $\lambda$ (the wavelength of the wave train)
for the wave train to go through one complete cycle of displacements y in
the fixed-t snapshot of Fig. 12-6a.

Since  the  source  oscillator  is  moving  in  a sinusoidal  fashion,  and  since 
the  wave  it  generates  in  the  string  moves  off  in  the  positive  x direction  at  a 
constant  velocity,  the  string  itself  has  at  any  instant  the  shape  of  a sinusoidal  curve. 
The proof of this statement is as follows. At a time t, the y coordinate,
given by $y = f(x - vt)$, of a particle of the string with a certain x
coordinate is the same as the y coordinate, given by $y = f(0 - vt')$, of a particle
at the origin x = 0 at an earlier time $t' = t - x/v$. This is true because

$$f(x - vt) = f(0 - vt') \tag{12-9a}$$

if

$$t' = t - \frac{x}{v} \tag{12-9b}$$

But the y coordinate of a particle of string at the origin is determined by the
y coordinate of the source oscillator, to which it is firmly attached. At time
t', the latter is given by Eq. (12-8) to be $y = A \cos(2\pi\nu t')$. Thus we have the
relation

$$f(0 - vt') = A \cos(2\pi\nu t')$$

Using Eq. (12-9a) and then Eq. (12-9b), we have from this relation

$$f(x - vt) = A \cos(2\pi\nu t') = A \cos\left[2\pi\nu\left(t - \frac{x}{v}\right)\right]$$

Since multiplying the argument of a cosine by -1 does not affect the value
of the cosine, this equation for $y = f(x - vt)$ can be rewritten in the form

$$y = A \cos\left[\frac{2\pi\nu}{v}(x - vt)\right] \tag{12-10}$$


At any instant (that is, for any fixed value of t) the vertical position y of a
particle of the string is a sinusoidal function of its horizontal position x,
which is what we set out to prove. Incidentally, notice that Eq. (12-10)
provides a specific example of a function which possesses the general form
required for a traveling wave by Eq. (12-6): $f(x, t) = f(x - vt)$.
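
The key step of the proof, that the string at position x repeats what the source did a time x/v earlier, can be checked with a few lines of code. This is a sketch under assumed values: the amplitude, frequency, wave velocity, and the function names `source` and `wave` are illustrative and are not taken from the text.

```python
import math

A = 0.02    # assumed amplitude, m
nu = 5.0    # assumed source frequency, Hz
v = 10.0    # assumed wave velocity, m/s (positive x direction)

def source(t):
    """Eq. (12-8): displacement of the driven end of the string."""
    return A * math.cos(2 * math.pi * nu * t)

def wave(x, t):
    """Eq. (12-10): sinusoidal wave train traveling in the +x direction."""
    return A * math.cos((2 * math.pi * nu / v) * (x - v * t))

x, t = 3.7, 1.23              # an arbitrary point on the string and an arbitrary time
delayed = source(t - x / v)   # what the source was doing a time x/v earlier
here = wave(x, t)             # what the string is doing at (x, t)
print(delayed, here)
```

The two printed numbers should agree to within rounding error, and setting x = 0 in `wave` should reproduce `source` at every time.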


Equation (12-10) can be used to obtain a very important connection
between the frequency $\nu$ of a wave train, its wave velocity $v$, and its
wavelength $\lambda$. As defined by Fig. 12-6a, $\lambda$ is the distance along
the x axis in which the y coordinate passes through one complete cycle of
oscillation while t remains fixed. Consider Eq. (12-10) for any fixed value of
t. For y to go through one complete cycle, the quantity $2\pi\nu x/v$ must
increase or decrease by the amount $2\pi$ as x increases by the amount defined
to be $\lambda$. That is,

$$\frac{2\pi\nu x}{v} \pm 2\pi = \frac{2\pi\nu(x + \lambda)}{v}$$

Multiplying through by $v/2\pi$ and transposing, we have

$$\nu(x + \lambda) = \nu x \pm v$$

Canceling and solving for v produce

$$v = \pm\nu\lambda \tag{12-11a}$$

The dual signs allow for the velocity v to have either a positive value or a
negative value. In terms of the speed $|v|$, the equation can be written as

$$|v| = \nu\lambda \tag{12-11b}$$

The speed $|v|$ of a wave equals the product of its frequency $\nu$ and its
wavelength $\lambda$.

While we have derived Eq. (12-11b) for the particular case of a transverse
sinusoidal wave train, and have considered in particular a wave train
traveling  along  a string,  the  equation  actually  is  valid  for  any  type  of  repeti- 
tive wave  train  traveling  through  any  medium.  This  can  be  proved  by  an 
argument  that  you  may  consider  to  be  simpler  than  the  one  just  given. 
Imagine  you  are  at  a fixed  position,  watching  any  repetitive  wave  train 
moving  by  you  to  the  right.  At  a certain  instant  one  maximum  of  the  wave 
train  is  at  your  position,  and  an  adjacent  maximum  is  a distance  to  the  left 




of your position equal to $\lambda$, the wavelength of the wave. In a time T the
points on the wave at your fixed position go through one complete cycle of
oscillation, where T is the period of the oscillation. And as this happens, the
maximum of the wave that had been to your left moves up to your position.
The maximum moved a distance $\lambda$ in a time T, so its speed is
$|v| = \lambda/T$. This is the speed of the wave itself. But the period T of the
oscillation equals the reciprocal of $\nu$, the frequency of the oscillation.
Setting $T = 1/\nu$, you obtain $|v| = \nu\lambda$, in agreement with
Eq. (12-11b).

Examples 12-1 and 12-2 apply Eq. (12-11b) to a mechanical traveling
wave  and  to  a nonmechanical  traveling  wave. 


EXAMPLE 12-1

The frequency of the musical note called middle C is 261.6 Hz. Find the wavelength
of the sound wave traveling through air when a flutist plays middle C (the lowest
note on the flute). The speed of sound in air at room temperature (20°C) has the
value $|v| = 344$ m/s.

■ From Eq. (12-11b) you have

$$\lambda = \frac{|v|}{\nu}$$

The numerical value is

$$\lambda = \frac{344\ \text{m/s}}{261.6\ \text{s}^{-1}}$$

or

$$\lambda = 1.31\ \text{m}$$


EXAMPLE 12-2

The frequency and wavelength of the radio wave emitted by a broadcasting station
are measured to be $\nu = 1.200 \times 10^6$ Hz and $\lambda = 2.498 \times 10^2$ m.
Evaluate the speed of radio waves.

■ Equation (12-11b) shows the speed to be

$$|v| = \nu\lambda = 1.200 \times 10^6\ \text{s}^{-1} \times 2.498 \times 10^2\ \text{m}$$

or

$$|v| = 2.998 \times 10^8\ \text{m/s}$$
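The arithmetic in Examples 12-1 and 12-2 can be checked with a few lines of
Python. This is only an illustrative sketch: the helper names
`wavelength_from` and `speed_from` are hypothetical, not part of the text, and
both simply rearrange Eq. (12-11b).

```python
def wavelength_from(speed, frequency):
    """Wavelength from |v| = nu * lambda, Eq. (12-11b), solved for lambda."""
    return speed / frequency


def speed_from(frequency, wavelength):
    """Wave speed |v| = nu * lambda, Eq. (12-11b)."""
    return frequency * wavelength


# Example 12-1: middle C (261.6 Hz) in air at 20 C, where |v| = 344 m/s.
lam_middle_c = wavelength_from(344.0, 261.6)

# Example 12-2: radio wave with nu = 1.200e6 Hz and lambda = 2.498e2 m.
v_radio = speed_from(1.200e6, 2.498e2)

print(f"wavelength of middle C: {lam_middle_c:.2f} m")
print(f"speed of radio waves:   {v_radio:.3e} m/s")
```

The same two helpers apply to any repetitive wave train, since the argument
given in the text does not depend on the kind of wave.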


Now we will express the sinusoidal wave function of Eq. (12-10), used
to describe a sinusoidal wave train, in several simpler and very useful
forms. Let us consider such a wave train traveling in the direction of
positive x so that Eq. (12-11a) gives for its velocity v the positive value
$v = \nu\lambda$. Substituting this value into Eq. (12-10), we obtain

$$y = A\cos\left(2\pi\frac{x}{\lambda} - 2\pi\nu t\right) \tag{12-12}$$

If we investigate the wave as it looks frozen in time (that is, with t arbitrary
but fixed), then the term $2\pi\nu t$ will have a fixed value. Writing it as
$2\pi\nu t = -\delta$, we have

$$y = A\cos\left(2\pi\frac{x}{\lambda} + \delta\right) \tag{12-13}$$




This equation has a directly evident physical meaning. When x varies in
such a way as to change the value of the fraction $x/\lambda$ by 1, the argument
of the cosine function changes by $2\pi$, and the cosine function itself passes
through one cycle. This will be true regardless of the initial value of the
argument of the cosine. The arbitrariness is expressed mathematically by
the presence of the phase constant $\delta$. It can have any value at all,
positive or negative. Its value determines the “starting point” of the wave
function, that is, the value of x for which the value of y is A.

Equation  (12-13)  completely  describes  the  y coordinates  of  all  points 
on  the  wave,  provided  it  is  frozen  in  time.  That  is,  it  is  the  algebraic  equiva- 
lent of  the  snapshot  of  Fig.  12-6a.  It  is  therefore  called  a time-indepen- 
dent wave  function. 

We can begin again with Eq. (12-12), this time fixing x and allowing t to
vary. An argument completely analogous to the one which led to Eq.
(12-13) leads to the expression

$$y = A\cos(2\pi\nu t + \delta') \tag{12-14}$$

in which $\delta'$ is a phase constant. Now the frequency $\nu$ is related to the
period T by the definition in Eq. (6.5):

$$\nu = \frac{1}{T}$$

Substituting this value of $\nu$ into Eq. (12-14) yields the space-independent
wave function

$$y = A\cos\left(2\pi\frac{t}{T} + \delta'\right) \tag{12-15}$$

Equation (12-15) completely describes the y coordinate of a particular point
on the string for all times. It is thus the algebraic equivalent of the “movie
through a slot” of Fig. 12-6b. The equation tells you that the cosine
function goes through one complete cycle when t varies so as to change the
fraction t/T by 1. Just as in the discussion leading to Eq. (12-13) for the
time-independent case, this statement does not depend on the initial value
of t.


Let us now return to the more general wave function, Eq. (12-12),

$$y = A\cos\left(2\pi\frac{x}{\lambda} - 2\pi\nu t\right)$$

from which the time- and space-independent wave functions [Eqs. (12-13)
and (12-15), respectively] were derived. We can cast it into a more
symmetrical form by using again the relation $\nu = 1/T$ to obtain

$$y = A\cos\left[2\pi\left(\frac{x}{\lambda} - \frac{t}{T}\right)\right] \tag{12-16}$$


This  expression  still  does  not  possess,  however,  the  most  general  form  pos- 
sible for  a function  representing  sinusoidal  waves  traveling  in  the  positive  x 
direction.  That  is,  it  does  not  represent  mathematically  all  such  possible 
waves.  In  particular,  it  represents  only  those  waves  for  which  the  y coordi- 
nate of  the  point  on  the  string  at  the  origin  happened  to  have  the  max- 
imum possible  value  A at  time  t = 0.  As  we  have  done  before,  we  can  re- 
move this  rather  artificial  constraint  on  the  choice  of  the  zeros  for  x and  t by 
inserting a phase constant $\delta$, the value of which may be adjusted to specify




any desired coordinate y of the string, in the range $-A \le y \le A$, at any
one location and time. We thus arrive at the wave function

$$y = A\cos\left[2\pi\left(\frac{x}{\lambda} - \frac{t}{T}\right) + \delta\right] \qquad \text{for } v > 0 \tag{12-17}$$

With proper choices of A, $\lambda$, T, and $\delta$, this will describe any sinusoidal wave
traveling  in  the  positive  x direction.  Note  the  symmetry  of  Eq.  (12-17)  with 
respect  to  x and  t.  Both  of  them  are  numerators  of  dimensionless  fractions 
whose  denominators  express  constant  physical  properties  of  the  wave. 

For a wave traveling in the negative x direction, the velocity v has the
negative value $v = -\nu\lambda$. If you go again through the argument leading to
Eq. (12-17), you will see that the effect of this sign change is to change the
minus sign in that equation to a plus sign, so that it becomes

$$y = A\cos\left[2\pi\left(\frac{x}{\lambda} + \frac{t}{T}\right) + \delta\right] \qquad \text{for } v < 0 \tag{12-18}$$


There is a more compact way of writing Eqs. (12-17) and (12-18),
which will come in handy later. Note first that in both these equations the
variable t is multiplied by the factor $2\pi/T$. Since $1/T = \nu$, we can rewrite
this factor in the form $2\pi/T = 2\pi\nu$. Then we can introduce the angular
frequency $\omega$, defined in Eq. (6-26) to be $\omega = 2\pi\nu$, so that the
factor becomes

$$\frac{2\pi}{T} = \omega \tag{12-19}$$


Physically, the angular frequency is the number of times that the source (or
any point on the string) goes through its oscillation cycle in $2\pi$ s, just as
the frequency $\nu$ is the number of times that the source goes through its
oscillation cycle in 1 s. Then we define the wave number k by the equation

$$k = \frac{2\pi}{\lambda} \tag{12-20}$$


The wave number is the number of waves contained in $2\pi$ m, just as the
angular frequency is the number of cycles contained in $2\pi$ s. In terms of
the wave number and the angular frequency, Eqs. (12-17) and (12-18) can
be rewritten as

$$y = A\cos(kx \mp \omega t + \delta) \tag{12-21}$$

The  minus  sign  is  used  for  a wave  traveling  in  the  positive  x direction 
(v  > 0),  and  the  plus  sign  is  used  for  a wave  traveling  in  the  opposite  direc- 
tion (v  < 0). 
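The equivalence of the two ways of writing the wave function, and the link
between the sign in front of the time term and the direction of travel, can be
checked numerically. The sketch below is illustrative only: the function names
`wave` and `wave_k_omega`, the `direction` argument, and the numerical values
of A, the wavelength, and the period are assumptions made for this example.

```python
import math


def wave(x, t, A, wavelength, period, delta=0.0, direction=+1):
    """Eq. (12-17) when direction=+1 (v > 0); Eq. (12-18) when direction=-1."""
    phase = 2 * math.pi * (x / wavelength - direction * t / period) + delta
    return A * math.cos(phase)


def wave_k_omega(x, t, A, k, omega, delta=0.0, direction=+1):
    """Eq. (12-21): minus sign for v > 0, plus sign for v < 0."""
    return A * math.cos(k * x - direction * omega * t + delta)


# Illustrative (assumed) values: amplitude in m, wavelength in m, period in s.
A, wavelength, period = 0.01, 2.0, 0.5
k = 2 * math.pi / wavelength       # wave number, Eq. (12-20)
omega = 2 * math.pi / period       # angular frequency, Eq. (12-19)
speed = wavelength / period        # |v| = nu * lambda with nu = 1/T

# The two forms give the same displacement at an arbitrary point and time.
same_value = math.isclose(
    wave(0.37, 0.12, A, wavelength, period),
    wave_k_omega(0.37, 0.12, A, k, omega),
)

# With delta = 0 a crest sits at x = 0 when t = 0. A time t1 later it is found
# at x = +speed*t1 for v > 0 and at x = -speed*t1 for v < 0.
t1 = 0.1
crest_right = wave(+speed * t1, t1, A, wavelength, period, direction=+1)
crest_left = wave(-speed * t1, t1, A, wavelength, period, direction=-1)

print(same_value, crest_right, crest_left)
```

Following the crest in this way is just the statement that the phase of the
cosine stays constant for an observer who moves along with the wave.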


12-3 THE WAVE EQUATION

Up to this point we have concentrated on describing some of the important
properties of traveling waves. Now we undertake the task of explaining the
origin  of  these  properties.  Since  the  waves  we  are  considering  are  waves 
traveling  through  mechanical  systems,  we  can  expect  that  the  explanation 
will  be  based  on  the  laws  governing  the  behavior  of  mechanical 
systems — Newton’s  laws  of  motion.  These  laws  predict  the  motion  of  a par- 
ticle, not  a wave.  But  a wave  in  a mechanical  system — which  will  be  exem- 
plified by  a stretched  string — is  just  an  organized  motion  traveling  through 
the  particles  comprising  the  string.  So  we  can  use  Newton’s  laws  to  treat  the 




motion  of  these  particles  or,  more  practically,  of  sets  of  adjacent  particles 
that  form  very  short  segments  of  the  string.  Each  segment  moves  under  the 
influence  of  the  forces  exerted  on  it  by  the  segments  on  either  side.  By 
studying  their  motion  we  should  be  able  to  obtain  predictions  concerning 
the  wave. 

First  let  us  look  at  what  happens  in  a qualitative  way.  An  experimenter 
holds  the  end  of  a long,  uniform  string  extending  to  the  right  along  the  x 
axis,  maintaining  tension  in  the  string  by  pulling  on  its  end.  The  experi- 
menter then  begins  to  generate  a wave  pulse  in  the  end  of  the  string  by  rap- 
idly moving  his  or  her  hand  upward  in  the  y direction.  This  applies  a force 
in  that  direction  to  the  left  end  of  the  segment  of  string  next  to  the  hand. 
But  since  the  string  segment  has  mass,  and  therefore  inertia,  its  accelera- 
tion is  finite,  and  so  it  can  begin  to  move  upward  only  gradually,  in 
response  to  the  force  applied  to  it.  So  some  time  must  pass  before  the  seg- 
ment is  displaced  upward  an  appreciable  amount.  When  it  has  been,  it  in 
turn  exerts  a force  in  the  upward  direction  on  the  left  end  of  the  next  seg- 
ment of  string.  After  more  time  passes,  this  segment  is  displaced  upward 
and  then  exerts  an  upward  force  on  the  left  end  of  the  next  segment.  The 
process continues. Its net effect is that the “leading edge” of a wave travels
at  a certain  speed  along  the  string.  If  the  experimenter’s  hand  starts  to 
move  down  to  the  x axis  immediately  after  it  finishes  moving  up,  the  hand 
starts  to  exert  a force  on  the  segment  of  string  next  to  the  hand  in  the 
downward  direction.  After  enough  time  has  passed  for  the  inertia  of  that 
segment  to  be  overcome,  the  segment  is  displaced  downward  and  then 
exerts  a downward  force  on  the  next  segment.  After  more  time  passes,  this 
segment  is  displaced  downward,  and  then  it  exerts  a downward  force  on 
the  next  segment.  The  downward  motion  propagates  from  segment  to  seg- 
ment at  the  same  speed  as  the  speed  at  which  the  leading  edge  of  the  wave 
travels.  It  constitutes  the  “trailing  edge”  of  the  single  wave  pulse  which  the 
experimenter  generates  in  the  string. 

It  is  said  that  the  wave  propagates  through  its  propagation  medium, 
the  string.  The  process  involves  two  key  factors:  the  force  that  each  particle 
of  the  medium  exerts  on  its  neighbor,  which  tends  to  make  the  neighbor 
follow  its  own  motion,  and  the  mass  of  each  particle,  which  tends  to  prevent 
the  neighbor  from  following  the  motion  instantly. 

Our  qualitative  analysis  suggests  that  the  speed  of  the  pulse  traveling 
along  the  string  will  depend  on  the  tension  in  the  string,  since  the  strength 
of  the  force  which  each  segment  of  the  string  exerts  on  the  neighboring 
segment  increases  as  the  tension  increases.  Also  the  speed  of  the  pulse  will 
depend  on  the  mass  per  unit  length  of  the  string,  since  the  mass  of  each 
segment  will  be  proportional  to  this  quantity.  Would  you  guess  that  in- 
creasing the  tension  will  increase  or  decrease  the  speed  of  the  pulse?  What 
about  the  mass  per  unit  length?  The  quantitative  analysis  we  go  through 
next  will  develop  the  exact  relation  among  these  quantities. 


Figure  12-7 a shows  the  uniform  string  extending  along  the  x axis.  Its 
right  end  is  fixed  to  a rigid  support,  and  the  experimenter’s  hand  is  pulling 
on  its  left  end.  The  force  applied  by  the  hand  stretches  the  string  a certain 
amount  because  it  puts  the  string  under  tension.  That  is,  each  segment  of 
the  string  applies  a force  to  the  neighboring  segment  directed  in  such  a way 
as  to  stretch  the  neighbor.  These  forces  are  all  of  the  same  magnitude,  and 
the  common  magnitude  equals  that  of  the  force  exerted  by  the  hand.  (The 


Fig. 12-7 (a) By pulling horizontally on the end of a long string, whose other
end is attached to a rigid support, an experimenter produces a tension force
of magnitude F in the string. (b) The experimenter's hand executes a quick
up-and-down motion, thereby inducing a single wave pulse in the end of the
string being held. The wave pulse propagates along the string and is shown at
a time when it has moved away from the hand. (c) A very short segment of the
string, in the region where the segments have transverse displacements, at the
same instant depicted in part b. If all the transverse displacements are small,
the tension force acting on both ends of the segment will have a magnitude F
which is the same as the tension in the undisturbed string.

Fig. 12-8 An enlarged view of the segment of string shown in Fig. 12-7c. The
sizes of the displacements, and therefore the sizes of the angles, have been
exaggerated for the sake of clarity.


situation  is  the  same  as  the  one  for  the  system  of  connected  springs,  illus- 
trated in  Fig.  4-26.)  The  forces  are  called  the  tension  forces  in  the  string. 
We  will  use  the  symbol  F to  represent  their  magnitude. 

If  the  experimenter  gives  the  end  of  the  string  a quick  up-and-down 
motion  in  the  y direction,  while  maintaining  the  tension,  a single  wave  pulse 
will be generated in the string. Figure 12-7b shows the pulse after it has
traveled some distance along the string. Figure 12-7c depicts the situation
at the same instant as in Fig. 12-7b, but it shows only a very short segment
of  the  string  at  a location  where  the  pulse  is  passing.  An  enlarged  view  of 
this  segment  is  given  in  Fig.  12-8. 

The profile of the string shown in Fig. 12-7b is just a plot of the x
dependence of the wave function y = f(x, t) describing the transverse
displacements along the string, at the particular fixed value of time t
illustrated in the figure. And the segment of that profile shown in Figs. 12-7c
and 12-8 is a plot of a segment of that function versus x, for that fixed t,
extending from the coordinate x of one end of the segment of string to the
coordinate x + dx of the other end. These plots are like the one considered
in Fig. 12-5b. But there we concentrated our attention on describing the
motion  of  the  wave  pulse.  Now  we  must  explain  the  motion  by  applying 
Newton's  second  law.  Thus  we  must  now  stipulate  that  the  reference  frame 
containing  the  string,  and  used  to  measure  x and  y,  be  an  inertial  frame  so 
that  the  second  law  can  be  applied. 

The  first  step  in  applying  Newton’s  second  law  is  to  find  the  forces 
acting  on  a typical  segment  of  the  string,  such  as  the  one  shown  in  Fig.  12-8. 
If  we  ignore  gravity,  these  are  the  forces  exerted  on  each  of  its  ends  by  the 
adjacent  parts  of  the  string.  The  part  of  the  string  to  the  left  of  the  segment 
pulls  on  it  with  a force  of  magnitude  F.  The  part  of  the  string  to  the  right  of 
the  segment  pulls  on  it  with  a force  of  the  same  magnitude,  acting  in  a 
direction  which  is  not  quite  opposite  to  the  direction  of  the  other  force  be- 
cause of  the  curvature  of  the  string.  We  will  treat  the  segment  of  string  as  a 
particle  that  moves  under  the  influence  of  these  two  applied  forces. 

Also,  we  will  treat  only  waves  of  small  transverse  displacement  in  a string 
which,  when  undisturbed,  is  stretched  by  a large  tension.  The  effect  of  these 
two  restrictions  is  to  ensure  that  the  further  stretch  occasioned  by  the 
transverse  wave  in  the  string  does  not  increase  the  tension  significantly. 
Thus  we  deal  with  a case  in  which  we  can  take  the  magnitude  F of  the 
forces  acting  at  the  ends  of  the  segment  of  string  to  be  constant. 




When  this  is  not  the  case,  the  wave  itself  affects  the  mechanical  properties  of 
the  system  through  which  it  travels  by  increasing  the  tension.  This  leads  to  con- 
siderable mathematical  complications,  as  well  as  to  much  more  complicated  phys- 
ical behavior.  In  many  practical  circumstances  (such  as  most  waves  in  stringed 
musical  instruments)  the  tension  in  the  undisturbed  string  is  large  enough,  and 
the  transverse  displacements  in  the  wave  small  enough,  to  satisfy  very  well  the 
condition  of  constant  tension. 
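To get a feeling for how little a small transverse wave lengthens the string,
one can compare the arc length of a sinusoidal profile with the length of the
undisturbed string. The sketch below is only an illustration under stated
assumptions: the profile y = A cos(2πx/λ), the amplitude of 1 mm, and the
wavelength of 0.5 m are invented values, and the helper `fractional_stretch`
is hypothetical. The fractional lengthening is only a proxy for the change in
tension; how much the tension actually rises depends on the elasticity of the
string, which is not modeled here.

```python
import math


def fractional_stretch(amplitude, wavelength, n=10_000):
    """(arc length over one wavelength) / wavelength - 1 for the assumed
    profile y = A cos(2 pi x / lambda), using the midpoint rule."""
    k = 2 * math.pi / wavelength
    dx = wavelength / n
    arc_length = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        slope = -amplitude * k * math.sin(k * x)   # dy/dx of the profile
        arc_length += math.sqrt(1.0 + slope * slope) * dx
    return arc_length / wavelength - 1.0


# Assumed illustrative values: 1 mm amplitude, 0.5 m wavelength.
stretch = fractional_stretch(amplitude=1.0e-3, wavelength=0.5)

# Small-slope estimate: sqrt(1 + s^2) - 1 is about s^2 / 2, whose average over
# one wavelength is (pi * A / lambda)^2.
estimate = (math.pi * 1.0e-3 / 0.5) ** 2

print(f"fractional stretch: {stretch:.3e}, small-slope estimate: {estimate:.3e}")
```

For these values the string is lengthened by only a few parts in 100,000,
which is the sense in which the extra stretch can be neglected.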

To  apply  Newton’s  second  law  to  the  segment  of  string,  we  must  deter- 
mine the  net  force  acting  on  the  segment.  The  net  force  is  the  vector  sum  of 
the  forces  acting  on  the  left  and  right  ends.  The  x component  of  the  net 
force  has  the  value 


$$F\cos(\theta)_{x+dx} - F\cos(\theta)_x$$

Here $(\theta)_{x+dx}$ is the angle between a line tangent to the segment at its
right end and a line parallel to the x axis, and $(\theta)_x$ is the angle at its
left end. Since we have assumed small transverse displacements, both angles are
small (although they are exaggerated in the figure for clarity), and so the
values of both cosines are very nearly equal to 1. Therefore

$$F\cos(\theta)_{x+dx} - F\cos(\theta)_x \simeq F - F = 0 \tag{12-22}$$

The negligibly small magnitude of the component of the net force acting
on the segment in the direction parallel to the string is consistent with the
fact that it does not move appreciably in that direction when the wave
propagates through it. The segment, and all other segments of the string, move
only in the perpendicular direction: the wave is transverse.

To  treat  the  transverse  motion  of  the  segment,  we  determine  the  y 
component  of  the  net  force  acting  on  it.  This  perpendicular  component  of 
the  net  force  is 

$$F\sin(\theta)_{x+dx} - F\sin(\theta)_x$$

Since each angle $\theta$ is small, it is an excellent approximation to replace
$\sin\theta$ by $\tan\theta$. But $\tan\theta$ is just the slope of the segment.
The slope is the derivative of y = f(x, t) with respect to x, evaluated for the
fixed value of t used in the figure. Such a derivative is, by a definition
analogous to that of Eq. (7-60), the partial derivative of f(x, t) with respect
to x. That is,

$$\sin\theta \simeq \tan\theta = \left[\frac{df(x,t)}{dx}\right]_{\text{evaluated by treating } t \text{ as a constant}} \equiv \frac{\partial f(x,t)}{\partial x}$$

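Thus the net perpendicular force on the string segment is approximately

$$F\sin(\theta)_{x+dx} - F\sin(\theta)_x \simeq F\left\{\left[\frac{\partial f(x,t)}{\partial x}\right]_{x+dx} - \left[\frac{\partial f(x,t)}{\partial x}\right]_x\right\}$$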


The quantity in braces is the difference between $\partial f(x,t)/\partial x$ at
the right end of the segment, where the x coordinate has the value x + dx, and
$\partial f(x,t)/\partial x$ at the left end, where that coordinate has the value
x. It can be expressed as the rate of change of $\partial f(x,t)/\partial x$ with
respect to the coordinate x, evaluated for fixed t, times the change dx in the
coordinate. Thus

$$\left[\frac{\partial f(x,t)}{\partial x}\right]_{x+dx} - \left[\frac{\partial f(x,t)}{\partial x}\right]_x = \frac{\partial}{\partial x}\left[\frac{\partial f(x,t)}{\partial x}\right] dx$$



The term multiplying dx is the partial derivative with respect to x of the
quantity $\partial f(x,t)/\partial x$. In the concise notation of calculus, it is
written as the second partial derivative with respect to x of f(x, t). That is,
we define

$$\frac{\partial}{\partial x}\left[\frac{\partial f(x,t)}{\partial x}\right] \equiv \frac{\partial^2 f(x,t)}{\partial x^2} \tag{12-23}$$

Hence we have

$$F\sin(\theta)_{x+dx} - F\sin(\theta)_x \simeq F\,\frac{\partial^2 f(x,t)}{\partial x^2}\,dx \tag{12-24}$$

This result gives a good approximation to the net perpendicular force
acting on the segment of string.
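The quality of the approximation for the net perpendicular force can be
checked numerically by comparing the exact difference of the two sine terms
with the second-derivative expression. The sketch below is illustrative only:
the Gaussian profile, the helper names `profile_slope` and
`profile_curvature`, and all numerical values (tension, amplitude, width,
segment length) are assumptions made for this example.

```python
import math


def profile_slope(x, a, s):
    """df/dx for the assumed profile f = a * exp(-x^2 / (2 s^2))."""
    return -x / s**2 * a * math.exp(-x**2 / (2 * s**2))


def profile_curvature(x, a, s):
    """d2f/dx2 for the same assumed profile."""
    return (x**2 / s**4 - 1 / s**2) * a * math.exp(-x**2 / (2 * s**2))


# Assumed illustrative values.
F = 100.0      # tension, N
a = 1.0e-3     # pulse amplitude, m (small compared with its width)
s = 0.10       # pulse width parameter, m
x = 0.05       # left end of the segment, m
dx = 1.0e-4    # length of the segment along x, m

# Exact perpendicular force: F sin(theta) at the right end minus the left end,
# where tan(theta) is the local slope of the string.
theta_left = math.atan(profile_slope(x, a, s))
theta_right = math.atan(profile_slope(x + dx, a, s))
exact = F * (math.sin(theta_right) - math.sin(theta_left))

# Approximate form used in the text: F * (second partial derivative) * dx.
approx = F * profile_curvature(x, a, s) * dx

rel_err = abs(exact - approx) / abs(approx)
print(f"exact = {exact:.6e} N, approx = {approx:.6e} N, rel. error = {rel_err:.1e}")
```

With slopes this small the two values agree to a fraction of a percent;
making the amplitude comparable to the width of the pulse makes the
small-angle step visibly worse.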


Newton's second law requires that this net force equal the mass of the
segment times its acceleration in the perpendicular direction. The mass is
$dm = \mu\,dl$. The quantity $\mu$ is the linear density of the uniform string,
that is, its mass per unit length. Since the segment of string is not parallel
to the x axis, we have $dx = dl\cos\theta$. But $\theta$ is small. So we can take
$\cos\theta \simeq 1$ and write the mass of the segment as

$$dm \simeq \mu\,dx \tag{12-25}$$

to a good approximation.

The acceleration of the segment of string is the second time derivative
of the coordinate y, giving its transverse displacement from its undisturbed
position on the x axis. The value of x must be held fixed when the second
time derivative of y = f(x, t) is computed since the value of that derivative
varies significantly with the location along the string of the segment being
considered. Therefore we compute the perpendicular acceleration of the
segment at a fixed location x by taking the second partial derivative with
respect to t of f(x, t):

$$\frac{\partial^2 f(x,t)}{\partial t^2}$$


Using this with Eqs. (12-24) and (12-25) in Newton’s second law, we
obtain

$$F\,\frac{\partial^2 f(x,t)}{\partial x^2}\,dx = (\mu\,dx)\,\frac{\partial^2 f(x,t)}{\partial t^2}$$


The  term  on  the  left  side  of  this  equation  is  the  net  force  applied  to  the  seg- 
ment of  the  string,  which  we  have  seen  to  be  in  the  direction  perpendicular 
to  the  undisturbed  string.  The  term  in  parentheses  on  the  right  side  is  the 
mass  of  the  segment.  The  remaining  term  on  the  right  side  is  its  accelera- 
tion, which  is  also  in  the  perpendicular  direction.  (In  the  particular  situa- 
tion illustrated  in  Fig.  12-8  the  acceleration  of  the  segment  is  in  the  positive 
y direction  since  the  net  force  acting  on  it  is  in  that  direction.)  Canceling  the 
common  factor  dx,  and  then  dividing  through  by  F,  yields  the  final  result 

$$\frac{\partial^2 f(x,t)}{\partial x^2} = \frac{\mu}{F}\,\frac{\partial^2 f(x,t)}{\partial t^2} \tag{12-26}$$

This is the wave equation. The equation applies in an inertial reference
frame to small transverse waves in the string. It says that the coordinate in
the transverse direction of an element of the string, y = f(x, t), has a
second partial derivative with respect to its coordinate x along the axis of
the undisturbed string that is proportional to its second partial derivative
with respect to the time t. The proportionality constant depends on the
mechanical properties of the system, being the mass per unit length $\mu$ of the
string  divided  by  the  tension  F in  the  string.  The  wave  equation  is  the 
fundamental  equation  governing  the  behavior  of  transverse  waves  in  the 
string.  All  such  waves  can  be  studied  by  analyzing  the  various  solutions  to 
this  equation. 

An  equation  of  exactly  the  same  mathematical  form  as  Eq.  (12-26)  governs  the 
propagation  of  all  types  of  mechanical  waves  in  essentially  one-dimensional 
systems  which,  like  the  string,  have  uniform  mass  distributions  before  being  dis- 
turbed and  whose  resistance  to  being  disturbed  is  proportional  to  the  disturbance. 
For  pressure  waves  in  air  enclosed  in  a tubing  (the  common  feature  of  most  non- 
stringed  musical  instruments)  the  second  partial  derivative  of  the  pressure  with 
respect  to  position  along  the  tube  equals  a constant,  depending  on  the  mechanical 
properties  of  air,  times  the  second  partial  derivative  of  the  pressure  with  respect  to 
time.  And  the  behavior  of  the  electric  or  magnetic  field  in  a one-dimensional  wave 
of  electromagnetic  radiation  (such  as  a light  wave)  obeys  a mathematically  equiva- 
lent equation. 

For  mechanical  waves,  the  wave  equation  is  obtained  from  Newton’s  second 
law.  But  for  electromagnetic  waves,  essentially  the  same  wave  equation  is  ob- 
tained from  laws  specifying  the  properties  of  electric  and  magnetic  fields.  Since 
these  laws  are  completely  unrelated  to  Newton’s  laws,  it  is  not  universally  true 
that  the  wave  equation  is  only  a reexpression  of  Newton’s  equation.  From  the  per- 
spective of  physics  as  a whole,  it  is  reasonable  to  say  that  the  fundamental  laws 
of  particle  motion  and  wave  motion — Newton's  equation  and  the  wave 
equation — stand  on  an  equal  footing.  In  its  own  domain,  each  is  the  basic  govern- 
ing relation. 

Example  12-3  employs  the  wave  equation  to  help  you  develop  a physi- 
cal understanding  of  the  mechanics  underlying  the  motion  of  a traveling 
wave  in  a stretched  string. 


EXAMPLE  12-3 

Make  direct  use  of  the  wave  equation  to  discuss  qualitatively  both  the  motion  of  a 
segment  of  a string  along  which  a wave  pulse  is  propagating  to  the  right  and  the  re- 
lation of  this  motion  to  the  motion  of  the  wave  pulse.  First  relate  the  acceleration  of 
the  segment  to  its  curvature.  Then  use  the  acceleration  to  describe  the  motion  of 
the  segment  and  the  motion  of  the  wave  pulse. 

■ The wave equation tells you that the acceleration [measured by
$\partial^2 f(x,t)/\partial t^2$] of a segment at any location along the string at
any time is proportional to the curvature [measured by
$\partial^2 f(x,t)/\partial x^2$] of that segment. The physical reason is that the
curvature is proportional to the net force acting on the segment. For instance,
if the segment is perfectly straight, so that there is no curvature, a
construction analogous to that in Fig. 12-8 shows that there is a perfect
cancellation of the perpendicular force components acting on its two ends.

In Fig. 12-9a the pulse is just beginning to move over the segment under
consideration (the shaded part of the string). The segment is concave upward
[$\partial^2 f(x,t)/\partial x^2 > 0$], and so its acceleration is upward
[$\partial^2 f(x,t)/\partial t^2 > 0$]. The direction of this acceleration is
indicated by the small arrow. As a result of its upward acceleration, the
segment under consideration moves away from its initial location on the
x axis, with an upward-directed velocity.

But as the wave continues to advance to the right, the segment is soon in the
situation illustrated in Fig. 12-9b. It becomes concave downward and so
experiences a


Fig. 12-9 A wave pulse traveling past a particular segment of a string. The
segment is indicated by shading. Parts a, b, c, and d show its position at
four successively later times. The arrows give the direction of the
acceleration of the segment. The displacements have been exaggerated.


downward  acceleration.  This  reduces  its  upward  velocity  to  zero  at  the  instant  when 
the  segment  has  its  maximum  transverse  displacement. 

As  the  wave  continues  its  motion  to  the  right,  the  segment  being  considered  is 
next  in  the  situation  depicted  in  Fig.  12-9c.  It  continues  to  have  a concave- 
downward  curvature  and  therefore  a downward  acceleration.  This  acceleration 
develops  a downward  velocity,  and  the  segment  of  string  begins  to  return  to  its 
undisturbed  location  on  the  x axis. 

In Fig. 12-9d the wave has advanced to such a point that the segment is concave
upward.  So  it  has  an  upward  acceleration,  which  reduces  its  downward  velocity  to 
zero  just  as  it  comes  back  to  the  x axis. 

Of  course  the  wave  pulse  could  have  been  moving  to  the  left,  instead  of  to  the 
right.  You  should  repeat  the  discussion  for  such  a case. 
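The sequence of accelerations described in this example can be reproduced with
a short calculation. The sketch below is only an illustration: it assumes a
Gaussian pulse shape (the text does not specify one), and the names
`curvature` and `acceleration_direction`, the pulse speed, the segment
location, and the four sample times are all hypothetical choices. Because the
wave equation makes the acceleration proportional to the curvature with a
positive constant, the sign of the curvature gives the direction of the
acceleration.

```python
import math


def curvature(x, t, v, a=1.0, s=1.0):
    """d2f/dx2 for the assumed pulse f = a * exp(-(x - v t)^2 / (2 s^2))."""
    u = x - v * t
    return (u**2 / s**4 - 1 / s**2) * a * math.exp(-u**2 / (2 * s**2))


def acceleration_direction(x, t, v):
    """Direction of the segment's acceleration, from the sign of the curvature."""
    c = curvature(x, t, v)
    if c > 0:
        return "up"
    if c < 0:
        return "down"
    return "zero"


v = 1.0            # pulse moving to the right (assumed units)
x_segment = 5.0    # fixed location of the segment being watched

# Four successively later times, chosen to mimic parts a to d of Fig. 12-9:
# leading edge, just before the peak, just after the peak, trailing edge.
times = [3.0, 4.5, 5.5, 7.0]
directions = [acceleration_direction(x_segment, t, v) for t in times]
print(directions)
```

Changing `v` to a negative value and starting the watch at a suitable time
gives the corresponding sequence for a pulse moving to the left.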


12-4  TRAVELING-WAVE 
SOLUTIONS  TO  THE 
WAVE  EQUATION 


In  this  section  we  use  analytical  methods  to  prove  that  the  wave  equation 
has  solutions  describing  traveling  waves.  In  other  words,  we  show  that  it 
has  solutions  of  the  form 

$$f(x, t) = f(x - vt) \tag{12-27}$$


We  obtained  this  form  in  Sec.  12-1  by  analyzing  observations  of  waves  trav- 
eling along  a stretched  string.  So  we  know  already  that  waves  of  this  form 
actually  can  travel  along  the  string.  In  what  follows,  we  use  the  wave  equa- 
tion to  show  that  waves  of  this  form  should  be  able  to  travel  along  the  string. 
There  are  two  reasons  why  this  is  very  much  worth  doing.  First,  since  the 
wave  equation  applies  to  a wide  variety  of  mechanical  systems,  showing  that 
it has solutions of the form given by Eq. (12-27) amounts to showing that
waves of this form can propagate through each of these systems. Second, in
the course of the calculation we will obtain a very important relation which
tells us how to evaluate the speed of waves traveling along a stretched string
in  terms  of  the  tension  in  the  string  and  its  linear  density. 

The  wave  equation 

$$\frac{\partial^2 f(x,t)}{\partial x^2} = \frac{\mu}{F}\,\frac{\partial^2 f(x,t)}{\partial t^2} \tag{12-28}$$




is  a partial  differential  equation,  that  is,  a differential  equation  containing 
partial  derivatives.  Although  it  is  likely  that  you  have  not  yet  studied  such 
equations  in  a mathematics  course,  this  will  cause  no  difficulty.  All  that  you 
will  need  to  know  about  the  analytical  methods  that  we  will  use  for  solving 
partial  differential  equations  will  be  developed  fully  here.  (It  is  also  possible 
to  solve  partial  differential  equations  numerically.  An  example  is  given  in 
Chap.  21.)  Actually,  this  book  has  already  introduced  you  to  the  basic  idea 
of  the  analytical  method.  It  is  the  same  for  differential  equations  con- 
taining partial  derivatives  as  it  is  for  those  containing  ordinary  derivatives. 

As  was  explained  in  Sec.  6-5,  the  idea  is  that  you  use  whatever  prior 
knowledge  you  have  to  guess  at  the  form  of  the  solution  to  the  differential 
equation.  Then  you  substitute  the  form  into  the  equation  and  see  whether 
it  is  possible  to  obtain  a consistent  result.  The  assumed  form  of  the  solution 
can  be  based  on  observation  and  qualitative  consideration  of  the  behavior 
of  the  system  whose  physical  properties  are  represented  by  the  partial  dif- 
ferential equation.  This  is  just  what  we  have  done  in  Secs.  12-1  and  12-2. 
The  discussion  there  leads  us  to  assume  that  Eq.  (12-27)  is  a solution  to  the 
partial  differential  equation.  We  will  verify  the  assumption  by  substituting 
the  second  partial  derivatives  of  Eq.  (12-27)  into  Eq.  (12-28). 


To  facilitate  computing  the  required  derivatives  of  the  function  in  Eq. 
(12-27), 


f(x,  t)  = f(x  - vt) 


we define a quantity h to be

$$h = x - vt$$  (12-29)

Then we can write, for example,

$$\frac{\partial f(x - vt)}{\partial t} = \frac{df(h)}{dh}\,\frac{\partial h}{\partial t}$$  (12-30)

This is a form of the "chain rule" of differential calculus. Its validity is
almost self-evident if you express it in words: The rate of change of f with
respect to t equals the rate of change of f with respect to h times the rate of
change of h with respect to t. Note that when f is considered to be a function
of x - vt, its derivative with respect to t must be written as a partial
derivative, with x held constant. But when it is considered to be a function of the
single quantity h, its derivative with respect to that quantity is an ordinary
derivative. The derivative of h with respect to t is a partial derivative.

If we assume that the wave is traveling along a uniform string, its
velocity v will be a constant. Thus Eq. (12-29) gives

$$\frac{\partial h}{\partial t} = -v$$

and Eq. (12-30) produces

$$\frac{\partial f(x - vt)}{\partial t} = -v\,\frac{df(h)}{dh}$$  (12-31)

The same procedure, applied to calculating the partial derivative with
respect to t of the quantity $\partial f(x - vt)/\partial t$, yields




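$$\frac{\partial}{\partial t}\left[\frac{\partial f(x - vt)}{\partial t}\right] = \frac{d}{dh}\left[-v\,\frac{df(h)}{dh}\right]\frac{\partial h}{\partial t} = -v\,\frac{d}{dh}\left[-v\,\frac{df(h)}{dh}\right] = v^2\,\frac{d}{dh}\left[\frac{df(h)}{dh}\right]$$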


Using second derivative notation, we can write this as

$$\frac{\partial^2 f(x - vt)}{\partial t^2} = v^2\,\frac{d^2 f(h)}{dh^2}$$  (12-32)


Now  that  we  have  found  the  required  partial  derivative  with  respect  to 
t,  we  use  the  chain  rule  in  the  same  way  to  find  the  partial  derivative  with 
respect  to  x.  It  reads 

$$\frac{\partial f(x - vt)}{\partial x} = \frac{df(h)}{dh}\,\frac{\partial h}{\partial x}$$


Since  Eq.  (12-29)  shows  that 


$$\frac{\partial h}{\partial x} = 1$$


we  obtain 


$$\frac{\partial f(x - vt)}{\partial x} = \frac{df(h)}{dh}$$  (12-33)

Differentiating with respect to x again produces


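$$\frac{\partial}{\partial x}\left[\frac{\partial f(x - vt)}{\partial x}\right] = \frac{d}{dh}\left[\frac{df(h)}{dh}\right]\frac{\partial h}{\partial x} = \frac{d}{dh}\left[\frac{df(h)}{dh}\right]$$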

or 


$$\frac{\partial^2 f(x - vt)}{\partial x^2} = \frac{d^2 f(h)}{dh^2}$$  (12-34)

Having  evaluated  the  second  partial  derivatives  of  the  function 

$$f(x, t) = f(x - vt)$$

which  we  guessed  to  be  a solution  to  the  wave  equation,  we  next  substitute 
them  into  the  wave  equation, 

$$\frac{\partial^2 f(x - vt)}{\partial x^2} = \frac{\mu}{F}\,\frac{\partial^2 f(x - vt)}{\partial t^2}$$


The  purpose  is  to  see  whether  the  function  actually  is  a solution  to  the 
equation.  By  substituting  Eqs.  (12-32)  and  (12-34)  into  the  wave  equation, 
we  obtain 

$$\frac{d^2 f(h)}{dh^2} = \frac{\mu}{F}\,v^2\,\frac{d^2 f(h)}{dh^2}$$

If this can be satisfied, then we have proved that the wave equation has the
traveling-wave solution. Can it be? Certainly, provided the velocity v of the
traveling wave is such that

$$\frac{\mu}{F}\,v^2 = 1$$

or

$$v = \pm\sqrt{\frac{F}{\mu}}$$  (12-35a)




Fig. 12-10  A geometrical interpretation of the equation

$$\frac{\partial^2 f(x, t)}{\partial x^2} = \frac{1}{v^2}\,\frac{\partial^2 f(x, t)}{\partial t^2}$$

obtained by using Eq. (12-35a) to write the factor $\mu/F$ in the wave equation for a stretched
string as $1/v^2$. The surface plots the function f(x, t) versus x and t for the simple case of a single
pulse traveling in the direction of increasing x. It represents the same thing shown by the
last seven diagrams in Fig. 12-1. That is, the "wrinkle"-shaped surface intersects a plane
perpendicular to the t axis in a curve that gives the shape and location of the pulse at the value
of t for that plane, and as t increases, the pulse moves in the positive x direction. Its velocity
v specifies the angle between the wrinkle and the t axis: the higher the velocity, the greater
the angle. For any point P with a particular set of x and t values, the quantity $\partial^2 f(x, t)/\partial x^2$
measures the curvature at that point of the intersection of the surface with the plane
perpendicular to the t axis passing through the point. This "curvature in the x direction" is indicated
by the dashed line. The quantity $\partial^2 f(x, t)/\partial t^2$ measures the curvature at P of the surface's
intersection with a plane perpendicular to the x axis passing through the point. The dotted
line indicates this "curvature in the t direction." The wave equation requires that at each point
the curvature in the x direction be the product of $1/v^2$ and the curvature in the t direction.
If you consider several points other than P, you will see qualitatively that the two curvatures
are everywhere proportional. And you can verify qualitatively the role played by $1/v^2$ if you
visualize the surface formed by keeping the shape of the pulse the same (thus keeping
the curvatures in the x direction the same) while increasing v, so as to increase the angle
between the wrinkle and the t axis and decrease $1/v^2$. The greater the angle, the more abrupt
the changes in slope of the intersection of the surface with any plane perpendicular to the
x axis. Hence the curvatures in the t direction increase as $1/v^2$ decreases, in agreement with the
fact that their product (the curvatures in the x direction) is unchanged.


Either sign is allowed by the wave equation. The positive sign corresponds
to a wave traveling in the direction of positive x, and the negative sign
corresponds to a wave traveling in the opposite direction. Thus what the wave
equation actually determines is the speed |v| of the wave. In these terms,
Eq. (12-35a) can be written

$$|v| = \sqrt{\frac{F}{\mu}}$$  (12-35b)

In addition to verifying that the wave equation has traveling-wave
solutions of the form f(x, t) = f(x - vt), we have found something new and
important. This is Eq. (12-35b), which shows how the speed |v| of the
traveling wave depends on the tension F and the mass per unit length $\mu$ of the
string along which the wave is traveling. Be sure to remember that the
speed |v| we have calculated is the speed of the traveling wave measured
with respect to an inertial reference frame containing the string. Thus |v| is
the speed of the wave with respect to the inertial reference frame of the medium
through which it propagates.
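
As a rough numerical cross-check of this result, the following sketch evaluates both sides of the wave equation by finite differences for a traveling pulse f(x - vt). The Gaussian pulse shape, the sample values of F and $\mu$, and the helper name `wave_equation_residual` are illustrative assumptions, not part of the text. The residual should be essentially zero when v = ±sqrt(F/mu) and clearly nonzero otherwise.

```python
import math

def pulse(h):
    """Hypothetical pulse shape f(h): a Gaussian bump (an assumed example)."""
    return math.exp(-h * h)

def wave_equation_residual(x, t, F, mu, v, dx=1e-3):
    """Return d2f/dx2 - (mu/F) d2f/dt2 for f(x, t) = pulse(x - v*t),
    estimated with central finite differences."""
    f = lambda xx, tt: pulse(xx - v * tt)
    dt = dx / abs(v)  # time step chosen so the wave advances dx per step
    d2f_dx2 = (f(x + dx, t) - 2 * f(x, t) + f(x - dx, t)) / dx**2
    d2f_dt2 = (f(x, t + dt) - 2 * f(x, t) + f(x, t - dt)) / dt**2
    return d2f_dx2 - (mu / F) * d2f_dt2

F, mu = 49.0, 0.081            # assumed tension (N) and linear density (kg/m)
v_right = math.sqrt(F / mu)    # speed given by Eq. (12-35b)

residual_right = wave_equation_residual(0.3, 0.01, F, mu, v_right)
residual_left = wave_equation_residual(0.3, 0.01, F, mu, -v_right)
residual_wrong = wave_equation_residual(0.3, 0.01, F, mu, 2 * v_right)

print("v = +sqrt(F/mu):", residual_right)
print("v = -sqrt(F/mu):", residual_left)
print("v = 2 sqrt(F/mu):", residual_wrong)
```

Both signs of v give a residual at the level of rounding error, while doubling the speed does not, which is the numerical counterpart of Eq. (12-35a).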

Figure 12-10 provides a geometrical interpretation of the result
obtained in the calculation just carried out. And Example 12-4 is intended to
help you understand the calculation by going through a similar one in
which the function f(x, t) = f(x - vt) is given an explicit form.




EXAMPLE  12-4 


In Eq. (12-16) a sinusoidal wave traveling in the direction of positive x on a long
string was represented by the function

$$f(x, t) = A\cos\left[2\pi\left(\frac{x}{\lambda} - \frac{t}{T}\right)\right]$$

where A is the amplitude of the wave, $\lambda$ is its wavelength, and T is its period. (For
convenience the phase constant has been chosen to be $\delta = 0$.) By direct substitution
in the wave equation, show that this particular traveling wave is, in fact, a solution to
the equation if

$$\frac{\lambda}{T} = \sqrt{\frac{F}{\mu}}$$  (12-36)

where F and $\mu$ are the tension and linear density of the string.

■ In principle, the calculation asked for is unnecessary because the form to be
verified can be written

$$f(x, t) = A\cos\left[\frac{2\pi}{\lambda}\left(x - \frac{\lambda}{T}\,t\right)\right]$$

Using Eq. (12-11a) with a positive sign, we have $|v| = \lambda/T = v$, where v is the
positive velocity of the wave. Thus we can write

$$f(x, t) = A\cos\left[\frac{2\pi}{\lambda}\,(x - vt)\right]$$

or

$$f(x, t) = f(x - vt)$$

Since we have given a general proof that f(x - vt), with $v = \sqrt{F/\mu}$, is a solution for
any form of the function f, it surely is for the case where the general symbol f
represents the particular operation of taking the cosine of the constant $2\pi/\lambda$ times the
quantity x - vt. But it is worthwhile to make an independent verification of this
important particular case since it is easy to do and may clarify the general proof.
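First you evaluate

$$\frac{\partial f}{\partial t} = \frac{2\pi}{T}\,A\sin\left[2\pi\left(\frac{x}{\lambda} - \frac{t}{T}\right)\right]$$  (12-37a)

and

$$\frac{\partial^2 f}{\partial t^2} = -\left(\frac{2\pi}{T}\right)^2 A\cos\left[2\pi\left(\frac{x}{\lambda} - \frac{t}{T}\right)\right]$$

Then you evaluate

$$\frac{\partial f}{\partial x} = -\frac{2\pi}{\lambda}\,A\sin\left[2\pi\left(\frac{x}{\lambda} - \frac{t}{T}\right)\right]$$  (12-37b)

and

$$\frac{\partial^2 f}{\partial x^2} = -\left(\frac{2\pi}{\lambda}\right)^2 A\cos\left[2\pi\left(\frac{x}{\lambda} - \frac{t}{T}\right)\right]$$

You can now substitute into the wave equation,

$$\frac{\partial^2 f}{\partial x^2} = \frac{\mu}{F}\,\frac{\partial^2 f}{\partial t^2}$$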




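to get

$$-\frac{4\pi^2}{\lambda^2}\,A\cos\left[2\pi\left(\frac{x}{\lambda} - \frac{t}{T}\right)\right] = -\frac{\mu}{F}\,\frac{4\pi^2}{T^2}\,A\cos\left[2\pi\left(\frac{x}{\lambda} - \frac{t}{T}\right)\right]$$

Cancelling, you obtain

$$\frac{1}{\lambda^2} = \frac{\mu}{F}\,\frac{1}{T^2} \qquad\text{or}\qquad \frac{\lambda}{T} = \sqrt{\frac{F}{\mu}}$$

The positive sign is taken in the square root because all four quantities involved are
intrinsically positive. If this equation is satisfied, then the sinusoidal traveling wave
certainly satisfies the wave equation. Comparison with Eq. (12-36) shows you have
proved what you were required to prove.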


Example 12-5 uses Eq. (12-35b), $|v| = \sqrt{F/\mu}$.


EXAMPLE  12-5 


Fig. 12-11  An apparatus used to demonstrate transverse wave pulses.


An  apparatus  for  lecture  demonstration  of  transverse  wave  pulses  is  shown  in  Fig. 
12-11.  One  end  of  a long  piece  of  rubber  tubing  is  fastened  to  a wall,  and  the  other 
end  passes  over  a pulley  and  then  down  to  a suspended  weight.  Plucking  the  tubing 
near  one  end  produces  a transverse  pulse  which  travels  along  the  tubing  to  the 
other end. If the distance from the wall to the pulley is l = 8.0 m, the mass of the
tubing  in  that  length  is  m = 0.65  kg,  and  the  mass  of  the  suspended  weight  is  M = 
5.0  kg,  what  is  the  time  t required  for  the  pulse  to  travel  from  the  wall  to  the 
pulley? 

■ The time t for the pulse to travel a distance l has the value

$$t = \frac{l}{|v|}$$

where the speed of the pulse is |v|. According to Eq. (12-35b),

$$|v| = \sqrt{\frac{F}{\mu}}$$

The tension F in the tubing is the force applied by the weight, or

$$F = Mg$$

The linear density $\mu$ of the tubing is

$$\mu = \frac{m}{l}$$

since a length l contains mass m. Combining these equations, you have

$$t = \frac{l}{\sqrt{F/\mu}} = \frac{l\,\sqrt{m/l}}{\sqrt{Mg}} = \sqrt{\frac{ml}{Mg}}$$

Inserting the numerical values gives you

$$t = \sqrt{\frac{0.65\ \text{kg} \times 8.0\ \text{m}}{5.0\ \text{kg} \times 9.8\ \text{m/s}^2}} = 0.33\ \text{s}$$
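
The arithmetic of Example 12-5 can be reproduced in a few lines. This is only a sketch: the function names `pulse_speed` and `travel_time` are hypothetical, and the value g = 9.8 m/s^2 is the one used in the example.

```python
import math

def pulse_speed(tension, linear_density):
    """Speed of a transverse pulse, |v| = sqrt(F/mu), as in Eq. (12-35b)."""
    return math.sqrt(tension / linear_density)

def travel_time(length, tube_mass, hanging_mass, g=9.8):
    """Time for a pulse to travel the length of the tubing (SI units)."""
    tension = hanging_mass * g           # F = M g
    linear_density = tube_mass / length  # mu = m / l
    return length / pulse_speed(tension, linear_density)

t = travel_time(length=8.0, tube_mass=0.65, hanging_mass=5.0)
print(f"t = {t:.2f} s")  # prints "t = 0.33 s"
```

Writing the tension and the linear density as separate steps mirrors the order of the solution above and makes it easy to try other masses or lengths.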




12-5  ENERGY IN WAVES


Fig. 12-12  A short segment of string at a time when a wave pulse is passing
the segment. The displaced string is the curve y = f(x, t).


One  of  the  fundamental  properties  of  a wave  is  that  it  contains  energy.  If  it 
is  a traveling  wave,  the  energy  is  carried  along  by  the  wave  as  it  moves. 
Thus  energy  can  be  transported  through  the  system  in  which  the  wave 
travels,  from  the  location  at  which  the  energy  is  put  into  the  system  in  the 
process  of  producing  the  wave  to  a location  at  which  energy  is  absorbed 
from  the  system  by  an  interaction  between  the  wave  and  some  object.  An 
example  discussed  qualitatively  in  Sec.  12-1  is  the  transport  of  energy  from 
a pebble  dropped  into  a pond  to  a distant  cork  floating  on  the  pond,  by 
means  of  the  water  wave  produced  by  the  pebble.  A more  important  ex- 
ample is  found  in  the  transport  of  energy  by  seismic  waves  in  an  earth- 
quake. Even  more  important  is  the  transport  of  energy  from  the  sun  to  the 
earth  by  the  electromagnetic  waves  called  sunlight.  This  process  is  the  orig- 
inal source  of  almost  all  the  energy  available  to  the  earth. 

Let us evaluate quantitatively the energy content, and then the energy
transport, in the simplest type of mechanical wave, a transverse wave
traveling along a stretched string. Figure 12-12 shows a short segment of the
string at a time and place for which the wave has displaced it from its
normal location on the x axis. The length of the segment is dl, and its mass
per unit length is $\mu$. Thus its mass dm is $\mu\,dl$. But since we assume, as
before, that transverse displacements are small, so that the angle $\theta$ between
the segment and the x axis is also small, for the purpose of calculating dm
we again use the approximation $\cos\theta \simeq 1$. Then

$$dl = dx$$  (12-38)

and

$$dm = \mu\,dx$$  (12-39)

The kinetic energy dK of the segment can be calculated from its mass
dm and its velocity. Since the motion of the segment is entirely in the
transverse direction, the velocity is the time rate of change of its transverse
coordinate y = f(x, t), for fixed x. Thus the velocity of the segment at the
particular location x is

$$\frac{dy}{dt} = \frac{\partial f(x, t)}{\partial t}$$

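Its kinetic energy is one-half its mass times the square of its velocity, or

$$dK = \frac{dm}{2}\left[\frac{\partial f(x, t)}{\partial t}\right]^2$$

Evaluating the mass dm from Eq. (12-39), we obtain

$$dK = \frac{\mu\,dx}{2}\left[\frac{\partial f(x, t)}{\partial t}\right]^2$$

Dividing through by dx yields

$$\frac{dK}{dx} = \frac{\mu}{2}\left[\frac{\partial f(x, t)}{\partial t}\right]^2$$

The quantity dK/dx is the kinetic energy per unit length along the x axis. It
is called the kinetic energy density and is written $\rho_K$. Using this symbol, we
have

$$\rho_K = \frac{\mu}{2}\left[\frac{\partial f(x, t)}{\partial t}\right]^2$$  (12-40)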




The velocity imparted to the segment of string as the wave passes it is
proportional to its maximum transverse displacement. [You can see this for
a sinusoidal traveling wave from Eq. (12-37a). Note that the velocity $\partial f/\partial t$ is
directly proportional to the amplitude A, which is the maximum transverse
displacement.] Since the displacements are assumed to be small, the
proportionality means that the velocity of the segment is also a small quantity.
Thus the factor $[\partial f(x, t)/\partial t]^2$ in Eq. (12-40) is the square of a small quantity,
and therefore $\rho_K$ is very small. We must take this fact into account in the
next step of the calculation.


The next step is to evaluate the potential energy dU of the segment of
the string, so that we can find the potential energy density $\rho_U = dU/dx$
associated with the passage of the wave pulse through the segment. Because the
kinetic  energy  density  involves  the  square  of  a small  quantity,  it  seems  likely 
that  the  same  will  be  true  of  this  potential  energy  density.  So  in  the  calcula- 
tion we  must  be  careful  to  keep  track  of  terms  involving  even  the  squares  of 
small  quantities.  The  very  small  potential  energy  produced  in  the  segment 
by  the  passage  of  the  wave  results  from  the  very  small  stretch  occurring  in 
the  segment  when  this  happens.  Since  the  segment  is  under  the  tension  F 
even  before  the  wave  passes,  it  is  already  longer  than  its  relaxed  length. 
The  situation  is  the  same  as  when  a spring  which  is  already  longer  than  its 
relaxed  length,  because  a force  F is  applied  to  it,  stretches  a very  small 
amount  with  that  force  applied.  During  the  process  the  magnitude  of  the 
force  remains  essentially  constant  because  the  length  of  the  string  changes 
only  a very  small  amount.  The  force  does  work  equal  to  its  magnitude  times 
the  stretch,  since  the  force  acts  through  a distance  equal  to  the  stretch.  This 
work  is  stored  as  potential  energy. 

Before  the  wave  arrives,  the  segment  of  string  illustrated  in  Fig.  12-12 
lies  along  the  x axis  from  x to  x + dx.  So  its  length  is  then  dx.  As  its  length  is 
stretched  to  dl  by  the  passage  of  the  wave,  the  amount  of  stretch  is  dl  — dx. 
To  evaluate  this,  we  use  the  pythagorean  theorem 

$$dl = \left[(dx)^2 + (dy)^2\right]^{1/2}$$


where  dy  is  the  change  in  the  y coordinate  from  one  end  of  the  segment  to 
the  other,  at  the  particular  instant  t illustrated  in  the  figure.  Now, 


$$dy = \frac{\partial f(x, t)}{\partial x}\,dx$$


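That is, dy can be calculated by taking the rate of change of y = f(x, t) with
respect to x for fixed t times dx, the change in x along the segment.
Therefore we have

$$dl = \left\{(dx)^2 + \left[\frac{\partial f(x, t)}{\partial x}\,dx\right]^2\right\}^{1/2}$$

or

$$dl = dx\left\{1 + \left[\frac{\partial f(x, t)}{\partial x}\right]^2\right\}^{1/2}$$  (12-41)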


Because we assume small transverse displacements, the slope $\partial f(x, t)/\partial x$
of the string is always small. This suggests we simplify the expression for dl
by using an approximation obtained from the binomial expansion. The
approximation has been used before; it is Eq. (3-36),

$$(1 + z)^{1/2} \simeq 1 + \tfrac{1}{2}z \qquad \text{where } z \ll 1$$



with z representing any quantity. We apply the approximation to Eq. (12-41)
by setting $z = [\partial f(x, t)/\partial x]^2$. This gives us

$$dl \simeq dx\left\{1 + \frac{1}{2}\left[\frac{\partial f(x, t)}{\partial x}\right]^2\right\}$$  (12-42)


Using this result to compute the amount of stretch, dl - dx, we find that to
a good approximation it is equal to

$$dl - dx = \frac{1}{2}\left[\frac{\partial f(x, t)}{\partial x}\right]^2 dx$$

Thus the work done during the stretch is

$$F(dl - dx) = \frac{F}{2}\left[\frac{\partial f(x, t)}{\partial x}\right]^2 dx$$


This is also the potential energy dU produced by the work. Setting the left
side of this equation equal to dU, dividing through by dx, and then writing
dU/dx as $\rho_U$, we have

$$\rho_U = \frac{F}{2}\left[\frac{\partial f(x, t)}{\partial x}\right]^2$$  (12-43)

The quantity $\rho_U$ is the potential energy density in the wave.
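
To get a feeling for how good the binomial approximation behind Eq. (12-43) is, the sketch below compares the exact stretch dl - dx from Eq. (12-41) with the approximate value from Eq. (12-42) for a few slopes. The function names and the sample slopes are illustrative assumptions.

```python
import math

def exact_stretch(slope, dx):
    """Stretch dl - dx from Eq. (12-41): dx * (sqrt(1 + slope^2) - 1)."""
    return dx * (math.sqrt(1.0 + slope**2) - 1.0)

def approx_stretch(slope, dx):
    """Stretch dl - dx from Eq. (12-42): (1/2) * slope^2 * dx."""
    return 0.5 * slope**2 * dx

relative_errors = {}
for slope in (0.01, 0.1, 0.3):  # assumed sample values of the slope df/dx
    exact = exact_stretch(slope, 1.0)
    approx = approx_stretch(slope, 1.0)
    relative_errors[slope] = abs(approx - exact) / exact
    print(f"slope = {slope}: exact = {exact:.6e}, approx = {approx:.6e}, "
          f"relative error = {relative_errors[slope]:.2e}")
```

The relative error grows roughly as one quarter of the slope squared, so the approximation is good as long as the transverse displacements, and hence the slopes, stay small.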


Can  you  explain  why  we  were  justified  in  ignoring  the  squared  term  in 
Eq.  (12-42)  when  we  evaluated  the  dl  in  Eq.  (12-38),  although  we  could  not 
ignore  it  in  obtaining  Eq.  (12-43)?  Why  is  it  that  in  obtaining  Eq.  (12-43)  we 
could  ignore  the  fact  that  the  tension  F in  the  segment  of  string  increases  as 
the  segment  stretches? 


It is easy to show that for any traveling wave the value of the kinetic
energy density equals the value of the potential energy density. To do so, we
consider the fact that for such a wave f(x, t) = f(x - vt), with v the velocity
of the wave. Equations (12-31) and (12-33) state that for the traveling wave

$$\frac{\partial f(x - vt)}{\partial t} = -v\,\frac{df(h)}{dh}$$

and

$$\frac{\partial f(x - vt)}{\partial x} = \frac{df(h)}{dh}$$

where h = x - vt. Therefore Eqs. (12-40) and (12-43) give

$$\rho_K = \frac{\mu}{2}\left[\frac{\partial f(x - vt)}{\partial t}\right]^2 = \frac{\mu v^2}{2}\left[\frac{df(h)}{dh}\right]^2$$  (12-44a)

and

$$\rho_U = \frac{F}{2}\left[\frac{\partial f(x - vt)}{\partial x}\right]^2 = \frac{F}{2}\left[\frac{df(h)}{dh}\right]^2$$  (12-44b)

Dividing Eq. (12-44a) by Eq. (12-44b) produces

$$\frac{\rho_K}{\rho_U} = \frac{\mu v^2}{F}$$




But Eq. (12-35) says that $v^2 = F/\mu$. So

$$\rho_K = \rho_U$$  (12-45)

The total mechanical energy per unit length of the wave is the sum of
its kinetic energy per unit length and its potential energy per unit length.
Thus the total energy density $\rho_E$ of the wave is

$$\rho_E = \rho_K + \rho_U$$  (12-46)

Using Eq. (12-45), we have

$$\rho_E = 2\rho_K = 2\rho_U$$  (12-47)

We  therefore  can  determine  the  total  energy  density  of  a traveling  wave  by 
taking  either  twice  its  kinetic  energy  density  or  twice  its  potential  energy 
density. 
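
The equality of the two energy densities can be checked numerically for a pulse of arbitrary shape. The sketch below is a minimal illustration, assuming a small Gaussian pulse and sample values of F and $\mu$; the function name `energy_densities` is hypothetical. It estimates the derivatives in Eqs. (12-40) and (12-43) by central differences.

```python
import math

def pulse(h):
    """Assumed pulse shape: a Gaussian of small amplitude (0.01 m)."""
    return 0.01 * math.exp(-h * h)

def energy_densities(x, t, F, mu, eps=1e-5):
    """Return (rho_K, rho_U) at (x, t) for f(x, t) = pulse(x - v*t),
    with v = sqrt(F/mu), using central differences for the derivatives."""
    v = math.sqrt(F / mu)
    f = lambda xx, tt: pulse(xx - v * tt)
    dt = eps / v
    df_dt = (f(x, t + dt) - f(x, t - dt)) / (2 * dt)
    df_dx = (f(x + eps, t) - f(x - eps, t)) / (2 * eps)
    rho_K = 0.5 * mu * df_dt**2   # Eq. (12-40)
    rho_U = 0.5 * F * df_dx**2    # Eq. (12-43)
    return rho_K, rho_U

rho_K, rho_U = energy_densities(x=0.5, t=0.0, F=49.0, mu=0.081)
print("rho_K =", rho_K, "J/m")
print("rho_U =", rho_U, "J/m")
print("rho_E =", rho_K + rho_U, "J/m")
```

The two printed densities agree to within rounding error, as Eq. (12-45) requires, and their sum is twice either one, as in Eq. (12-47).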

Example  12-6  applies  the  theory  just  developed  to  a sinusoidal  travel- 
ing wave. 


EXAMPLE  12-6 

A sinusoidal wave

$$f(x, t) = A\cos\left[2\pi\left(\frac{x}{\lambda} - \frac{t}{T}\right)\right]$$

of amplitude A, wavelength $\lambda$, and period T is traveling along a stretched string.
Calculate the kinetic, potential, and total energies in a length of the wave equal to
one wavelength. Interpret the results physically.

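■ Since calculations involving this wave were carried out in Example 12-4, you
already have expressions for $\partial f/\partial t$ and $\partial f/\partial x$. According to Eqs. (12-37a) and
(12-37b),

$$\frac{\partial f}{\partial t} = \frac{2\pi}{T}\,A\sin\left[2\pi\left(\frac{x}{\lambda} - \frac{t}{T}\right)\right]$$

and

$$\frac{\partial f}{\partial x} = -\frac{2\pi}{\lambda}\,A\sin\left[2\pi\left(\frac{x}{\lambda} - \frac{t}{T}\right)\right]$$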
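So the kinetic and potential energy densities are

$$\rho_K = \frac{\mu}{2}\left(\frac{\partial f}{\partial t}\right)^2 = \frac{2\pi^2\mu A^2}{T^2}\,\sin^2\left[2\pi\left(\frac{x}{\lambda} - \frac{t}{T}\right)\right]$$  (12-48a)

and

$$\rho_U = \frac{F}{2}\left(\frac{\partial f}{\partial x}\right)^2 = \frac{2\pi^2 F A^2}{\lambda^2}\,\sin^2\left[2\pi\left(\frac{x}{\lambda} - \frac{t}{T}\right)\right]$$  (12-48b)

Equation (12-36) shows that

$$\frac{\mu}{T^2} = \frac{F}{\lambda^2}$$

If you substitute this expression for $\mu/T^2$ into the right side of Eq. (12-48a) and
compare the result with the right side of Eq. (12-48b), it becomes apparent
immediately that

$$\rho_K = \rho_U$$  (12-49)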




For  this  traveling  wave  the  kinetic  and  potential  energy  densities  are,  indeed,  equal. 

You can evaluate the kinetic energy K for one wavelength of the wave by
integrating $dK/dx = \rho_K$ over one wavelength $\lambda$. That is,

$$K = \int_0^{\lambda} \frac{dK}{dx}\,dx = \int_0^{\lambda} \rho_K\,dx$$

For simplicity, make the integration at t = 0. Then Eq. (12-48a) gives

$$K = \frac{2\pi^2\mu A^2}{T^2}\int_0^{\lambda} \sin^2\left(\frac{2\pi x}{\lambda}\right)dx$$

The value of the integral is

$$\int_0^{\lambda} \sin^2\left(\frac{2\pi x}{\lambda}\right)dx = \frac{\lambda}{2}$$  (12-50)

[You can verify this by performing the integration. You can also do it by noting that
the value of the integral is the average value over one wavelength of $\sin^2(2\pi x/\lambda)$
times the integral from 0 to $\lambda$ of dx. The latter factor equals $\lambda$. The former is easy
to evaluate: Since $\langle\sin^2(2\pi x/\lambda)\rangle = \langle\cos^2(2\pi x/\lambda)\rangle$ and since
$\sin^2(2\pi x/\lambda) + \cos^2(2\pi x/\lambda) = 1$, it follows that $\langle\sin^2(2\pi x/\lambda)\rangle = \tfrac{1}{2}$.]
With the integral evaluated, the kinetic energy for one wavelength of the wave is seen to be

$$K = \frac{\pi^2\mu\lambda A^2}{T^2}$$  (12-51)


The potential energy U for one wavelength is

$$U = \int_0^{\lambda} \frac{dU}{dx}\,dx = \int_0^{\lambda} \rho_U\,dx$$

But Eq. (12-49) shows that $\rho_U = \rho_K$. So you have

$$U = \int_0^{\lambda} \rho_K\,dx = K$$

and Eq. (12-51) gives

$$U = \frac{\pi^2\mu\lambda A^2}{T^2}$$  (12-52)

The total energy content E = K + U in one wavelength of the wave is

$$E = \frac{2\pi^2\mu\lambda A^2}{T^2}$$  (12-53a)

Expressed in terms of the frequency $\nu = 1/T$, this is

$$E = 2\pi^2\mu\lambda\nu^2A^2$$  (12-53b)

You can give a physical interpretation of Eqs. (12-53a) and (12-53b) by noting
that $\mu$ is the mass per unit length of the string. The quantity $\mu\lambda$ in, say, Eq. (12-53a),
is therefore the mass involved in one wavelength. The amplitude A is proportional
to the transverse distance traveled by that mass in the period of time T required for
one oscillation. So A/T is a measure of the speed of transverse motion of the
mass. Thus Eq. (12-53a) is seen to be of the form of any kinetic energy expression:
kinetic energy $\propto$ mass × (speed)$^2$. The left side of the equation contains a total
energy E, not a kinetic energy K. But the interpretation remains valid since Eq.
(12-47) shows these energies are proportional.
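
As a check on Eq. (12-53a), the total energy in one wavelength can also be found by integrating the energy density numerically. The sketch below assumes sample values for $\mu$, A, $\lambda$, and T; the function names are hypothetical. It uses a simple midpoint rule at t = 0 together with $\rho_E = 2\rho_K$ from Eq. (12-47).

```python
import math

def energy_per_wavelength(mu, A, wavelength, period, n=10_000):
    """Midpoint-rule integral of rho_E = 2*rho_K over one wavelength at t = 0."""
    dx = wavelength / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        # Transverse velocity from Eq. (12-37a), evaluated at t = 0
        df_dt = (2 * math.pi / period) * A * math.sin(2 * math.pi * x / wavelength)
        rho_K = 0.5 * mu * df_dt**2      # Eq. (12-40)
        total += 2.0 * rho_K * dx        # rho_E = 2 rho_K, Eq. (12-47)
    return total

def energy_formula(mu, A, wavelength, period):
    """Closed-form result, Eq. (12-53a)."""
    return 2 * math.pi**2 * mu * wavelength * A**2 / period**2

# Assumed sample values (SI units)
mu, A, wavelength, period = 0.081, 0.04, 5.0, 0.2
E_numeric = energy_per_wavelength(mu, A, wavelength, period)
E_exact = energy_formula(mu, A, wavelength, period)
print("numerical integral:", E_numeric, "J")
print("Eq. (12-53a):      ", E_exact, "J")
```

Changing the amplitude or the period in this sketch shows the dependence on $A^2$ and on $1/T^2$ directly.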


Now  we  will  evaluate  the  energy  transport  in  a wave  traveling  along  a 
stretched  string.  This  is  the  energy  carried  per  second  by  the  wave  past  a 




Fig. 12-13  (a) A wave pulse on a string at two successive instants separated by a
time interval dt. During this interval the wave moves an amount dx = v dt, where
v is its velocity. All the energy contained in a region of the wave of length equal
to this amount is carried past a fixed location. Thus in the time interval dt the
energy in the shaded region flows past the fixed location. (b) A schematic
representation of energy flowing at velocity v past some fixed location. In time dt
the energy contained in a region of length v dt flows past this location. This
energy is the energy per unit length $\rho_E$ multiplied by the length v dt. So it is
$\rho_E v\,dt$. The energy flow per unit time is $\rho_E v$, which is defined to be the energy
flux S. Therefore $S = \rho_E v$. Since in this argument no specification has been
made of how the energy is transported, the result $S = \rho_E v$ is of general
applicability. It applies not only to the energy carried by all types of waves, but also to
situations where energy is being carried by a completely different mechanism.
For instance, the schematic could represent a conveyor belt carrying a
uniformly distributed set of charged automobile batteries at velocity v past a fixed
location, with $\rho_E$ being the energy content of the batteries per unit length of the
conveyor belt and S the associated energy flux.


fixed  location.  When  undisturbed,  the  string  lies  along  the  x axis.  Consider 
some  fixed  location  on  that  axis  at  an  initial  time  t when  the  wave  is  passing 
the  location.  In  the  short  time  interval  from  t to  t + dt  the  wave  moves  an 
amount 

dx  = v dt 


where v is its velocity. As shown in Fig. 12-13a, the part of the wave initially
contained within an adjacent region of length dx travels past the fixed
location during the time dt. As this happens, the total energy dE of that part of
the wave is carried past the location. The energy dE has the value

$$dE = \frac{dE}{dx}\,dx = \frac{dE}{dx}\,\frac{dx}{dt}\,dt$$

This is

$$dE = \rho_E v\,dt$$  (12-54)


where $\rho_E = dE/dx$ is the total energy density of the wave. The rate at which
energy is transported by the wave past the fixed location is dE/dt. This rate
at which energy flows by the fixed location is called the energy flux S. (Flux
is the Latin word for flow.) Thus, by definition, the energy flux is

$$S = \frac{dE}{dt}$$  (12-55)


Dividing both sides of Eq. (12-54) by dt and then applying this definition,
we obtain

$$S = \rho_E v$$  (12-56)

The energy flux equals the energy density times the velocity at which energy is being
transported.


Equation  (12-56)  is  a general  relation  applying  to  the  flow  of  energy  in  all 
waves — and  in  all  other  situations  in  which  energy  is  flowing.  Figure  12-13b  is  a 
schematic  representation  of  the  energy  flow.  In  the  caption  of  the  figure  the  repre- 
sentation is  used  to  show  that  Eq.  (12-56)  has  general  applicability.  While  energy 
is  not  a material  substance,  it  is  still  quite  useful  to  think  of  it  as  flowing  past  a 
fixed  location,  just  as  we  think  of  water  flowing  past  a fixed  point  on  the  bank  of  a 
river. 


If the traveling wave consists of a single pulse, then there is a nonzero
energy flux S at a given location only when there is a nonzero energy
density $\rho_E$ at that location. Before or after the wave pulse passes the location,
there is zero energy density there and hence zero energy flux. But if the
wave is a sinusoidal traveling wave, there is almost always a flux of energy
passing a given location. The energy flux S is not constant in such a case,
however, because the energy density $\rho_E$ is not the same at all points in a
sinusoidal traveling wave. On the other hand, the energy flux will be
constant if it is averaged over the time required for one wavelength of the
wave to pass the location. That is, the average flux $\langle S\rangle$ will be the same from
one oscillation to the next.

Examples 12-7 and 12-8 involve evaluating $\langle S\rangle$.




EXAMPLE  12-7 


Calculate the average, over the passage of one wavelength, of the energy flux
carried by a sinusoidal wave of amplitude A, wavelength $\lambda$, and frequency $\nu$ which is
traveling along a stretched string.

■ This is easy to do because the total energy content E in one wavelength of the
wave was obtained in Example 12-6. All you have to do is to take that energy from
Eq. (12-53b), which is

$$E = 2\pi^2\mu\lambda\nu^2A^2$$

where $\mu$ is the linear density of the string. Dividing the total energy E for one
wavelength by the wavelength $\lambda$, you obtain the average energy density $\langle\rho_E\rangle$ in a
wavelength. It is

$$\langle\rho_E\rangle = \frac{E}{\lambda} = 2\pi^2\mu\nu^2A^2$$  (12-57)

The average flux is this quantity times the constant velocity v. Thus

$$\langle S\rangle = \langle\rho_E\rangle\,v$$

And so $\langle S\rangle$ has the value

$$\langle S\rangle = 2\pi^2\mu\nu^2A^2v$$  (12-58)

You can make an interesting comparison between this result and the
expression for the total energy of a single particle of mass m that is executing harmonic
oscillations of amplitude A and frequency $\nu$. Writing Eq. (8-28b) in terms of $\nu$ (instead
of $\omega = 2\pi\nu$), you will find that the total energy E of the oscillator is

$$E = 2\pi^2 m\nu^2A^2$$  (12-59)

Then use the fact that each segment of the string acts like a transverse harmonic
oscillator, when a sinusoidal wave travels along it, to explain the relation between Eqs.
(12-58) and (12-59).


EXAMPLE  12-8 

Using the demonstration apparatus considered in Example 12-5, a lecturer sends a
sinusoidal wave of amplitude A = 4.0 cm and frequency $\nu$ = 5.0 Hz traveling down
the 8.0-m-long rubber tubing by touching a vibrator to it near one end. Evaluate the
average energy flux $\langle S\rangle$ passing the center of the tubing during the time before
waves reflected from its other end return to the center and complicate the issue.

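■ Using the equations developed in Example 12-5 and the numerical values
given there, you have

$$\mu = \frac{m}{l} = \frac{0.65\ \text{kg}}{8.0\ \text{m}} = 8.1 \times 10^{-2}\ \text{kg/m}$$

and

$$v = \sqrt{\frac{5.0\ \text{kg} \times 9.8\ \text{m/s}^2}{8.1 \times 10^{-2}\ \text{kg/m}}} = 2.5 \times 10^{1}\ \text{m/s}$$

So

$$\langle S \rangle = 2\pi^2 \mu \nu^2 A^2 v$$

$$= 2\pi^2 \times 8.1 \times 10^{-2}\ \text{kg/m} \times (5.0\ \text{s}^{-1})^2 \times (4.0 \times 10^{-2}\ \text{m})^2 \times 2.5 \times 10^{1}\ \text{m/s}$$

$$= 1.6\ \text{J/s} = 1.6\ \text{W}$$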
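
The arithmetic of this example can be repeated in a few lines of Python. The sketch
below uses only the numbers quoted in the example; the variable names are chosen here
for illustration. Because it carries unrounded intermediate values, its result differs
slightly from the hand calculation before rounding to two significant figures.

```python
import math

m_tubing = 0.65          # mass of the tubing, kg
length = 8.0             # length of the tubing, m
tension = 5.0 * 9.8      # N, the product 5.0 kg x 9.8 m/s^2 used in the example
A = 4.0e-2               # amplitude, m
nu = 5.0                 # frequency, Hz

mu = m_tubing / length                        # linear density, kg/m
v = math.sqrt(tension / mu)                   # wave velocity, m/s
S = 2 * math.pi**2 * mu * nu**2 * A**2 * v    # average energy flux, Eq. (12-58), W

print(f"mu = {mu:.2g} kg/m, v = {v:.2g} m/s, <S> = {S:.2g} W")
```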




The results obtained in Example 12-7 for the average energy density
⟨ρ_E⟩ and average energy flux ⟨S⟩ of a sinusoidal wave traveling along a
stretched string, and applied in Example 12-8 to a particular case, are actually
of much wider validity. For all one-, two-, or three-dimensional waves
continually traveling through any type of mechanical medium, the average
energy density ⟨ρ_E⟩ and average energy flux ⟨S⟩ are both proportional to
the square of the amplitude of the wave at any location, and also proportional
to the square of the frequency of the wave. Furthermore, in Chap. 27
we will see that closely analogous relations apply to electromagnetic traveling
waves.


12-6  LONGITUDINAL WAVES AND MULTIDIMENSIONAL WAVES


In Sec. 12-3 we derived for the case of a stretched string an equation governing
the propagation of transverse waves (waves in which the particles of
the propagation medium move perpendicular to the line of motion of the
wave). This wave equation is Eq. (12-26):

$$\frac{\partial^2 f(x, t)}{\partial x^2} = \frac{\mu}{F}\,\frac{\partial^2 f(x, t)}{\partial t^2}$$

The mass per unit length μ of the string and the tension F in the string are
quantities which have physical meaning only in connection with a stretched
string. However, Eq. (12-35a) tells us that μ/F = 1/v², where v is the velocity
of the waves with respect to their propagation medium, the string. Thus
the wave equation can be written in the form

$$\frac{\partial^2 f(x, t)}{\partial x^2} = \frac{1}{v^2}\,\frac{\partial^2 f(x, t)}{\partial t^2} \tag{12-60}$$

where the wave velocity v does not depend on x or t. Even though v does
depend on the specific properties of the stretched string, it has a meaning
independent of them. An observer can measure the value of v directly and
reach substantial conclusions about the behavior of the wave without a
knowledge of either F or μ.
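
As an illustration that Eq. (12-60) involves only the wave velocity, the following
Python sketch checks numerically that a sinusoidal traveling wave satisfies it. The
sketch assumes a wave of the form f(x, t) = A sin[2π(x/λ − νt)] with v = λν, and it
estimates the second derivatives by central differences. The sample values, the step
sizes, and the function names are assumptions chosen for the illustration.

```python
import math

A, wavelength, nu = 0.040, 5.0, 5.0    # m, m, Hz (assumed sample values)
v = wavelength * nu                    # wave velocity, m/s

def f(x, t):
    """Sinusoidal wave traveling in the +x direction."""
    return A * math.sin(2 * math.pi * (x / wavelength - nu * t))

def second_derivative(g, u, h):
    """Central-difference estimate of the second derivative of g at u."""
    return (g(u + h) - 2 * g(u) + g(u - h)) / h**2

x0, t0 = 1.3, 0.27                     # an arbitrary position and time

d2f_dx2 = second_derivative(lambda x: f(x, t0), x0, 1e-3)
d2f_dt2 = second_derivative(lambda t: f(x0, t), t0, 1e-4)

left_side = d2f_dx2                    # left side of Eq. (12-60)
right_side = d2f_dt2 / v**2            # right side of Eq. (12-60)
print(left_side, right_side)
```

The two printed values agree to within the small error of the finite-difference
estimate, and neither F nor μ appears anywhere in the calculation.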

These  considerations  suggest  that  Eq.  (12-60)  is  more  general  than  the 
system  from  which  we  derived  it,  and  this  is  indeed  the  case.  The  equation 
governs  the  propagation  in  any  one-dimensional  mechanical  medium  not 
only  of  transverse  waves  but  also  of  longitudinal  waves.  Longitudinal 
waves  are  waves  in  which  the  particles  of  the  propagation  medium  oscillate 
parallel  to  the  line  of  motion  of  the  wave.  For  mechanical  waves  in  one  di- 
mension, Eq.  (12-60)  is  applicable  in  all  circumstances,  providing  two 
simple  criteria  are  met: 


1.  All  regions  of  the  medium  resist  disturbance  in  the  same  linear  (that 
is,  “Hooke’s  law”)  fashion.  In  the  case  of  the  uniform  stretched  string,  the 
force  required  to  give  a small  transverse  displacement  to  a segment  of  the 
string  is  proportional  to  the  displacement,  and  the  proportionality  constant 
is  everywhere  the  same.  But  there  are  many  mechanical  systems  which  con- 
form to  this  criterion  in  quite  different  ways. 

2.  The  mass  in  the  medium  is  uniformly  distributed,  before  the 
medium  is  disturbed.  This  means  the  inertial  tendency  of  each  region  of 
the  medium  to  avoid  changing  its  state  of  motion  is  the  same  as  that  of 
every  other  region  of  equal  size.  In  the  string  this  criterion  is  met  when  the 
mass  per  unit  length  is  constant. 




The  propagation  of  sound  through  gases  or  liquids  provides  a very  im- 
portant example  of  the  applicability  of  Eq.  (12-60)  to  propagation  of  longi- 
tudinal mechanical  waves  in  one  dimension.  It  is  possible  to  derive  a wave 
equation  for  sound  traveling  in  one  dimension  directly  from  the  elastic  and 
thermodynamic  properties  of  fluids.  The  equation  proves  to  be  mathemati- 
cally identical  to  Eq.  (12-60).  But  the  derivation  is  quite  complicated,  so 
here  we  consider  the  matter  qualitatively. 

Figure 12-14a shows a long, air-filled tube, with a metal diaphragm (a
thin  plate)  at  one  end  that  can  be  made  to  move  back  and  forth,  into  and 
out  of  the  tube.  A disturbance  can  be  generated  in  the  air  by  pushing  the 
diaphragm  in  once  rapidly  and  letting  it  return  equally  rapidly  to  its  origi- 
nal position.  The  air  in  the  tube  satisfies  the  two  criteria  just  enumerated: 
(1)  The  rapid  inward  motion  of  the  diaphragm  causes  a decrease  in  the  vol- 
ume available  to  the  air  in  the  region  near  the  diaphragm  and  thus  an  in- 
crease in  the  air  pressure  in  this  region.  The  resistance  of  the  air  to  this 
disturbance  is  directly  proportional  to  the  disturbance.  (2)  The  air  is  uni- 
formly distributed  before  it  is  disturbed. 

The  motion  of  a wave  of  slightly  higher  air  pressure  along  the  tube  is 
quite  analogous  to  the  motion  of  a wave  of  small  transverse  displacement 
along  a stretched  string.  When  the  diaphragm  has  completed  its  inward 
travel,  the  higher  air  pressure  in  the  first  region  beyond  the  diaphragm 
begins  to  compress  the  air  in  the  next  region  by  forcing  air  molecules  to 
move.  The  molecules  in  this  second  region  cannot  respond  by  changing 
their  positions  instantaneously,  however,  because  molecules  have  inertia. 




Fig.  12-14  Production  of  a sound  wave  pulse  in  a gas.  On  the  left  is  the  apparatus  producing 
the pulse: a tube filled with gas and closed on one end by a movable diaphragm. The motion
of  the  diaphragm  and  its  effect  on  the  density  of  the  gas  in  its  vicinity  are  shown  in  parts 
a,  b,  c,  and  d at  four  successively  later  times.  The  plots  on  the  right  sides  of  these  figures 
represent  the  increase  in  gas  pressure  above  the  undisturbed  pressure  versus  position  at 
each  of  the  times  illustrated.  That  is,  part  e corresponds  to  part  a,  and  so  forth. 




But  soon  they  do  change  their  positions,  and  the  air  pressure  in  the  second 
region  increases.  Meanwhile,  the  diaphragm  has  returned  to  its  original 
position,  and  the  air  pressure  in  the  first  region  has  returned  to  its  undis- 
turbed value.  The  increase  in  air  pressure  next  propagates  to  the  third 
region,  then  to  the  fourth,  and  so  on  along  the  tube.  The  analogy  between 
this  pulse  of  increased  air  pressure  traveling  along  the  air-filled  tube  and 
the  pulse  of  transverse  displacement  traveling  along  a stretched  string  is 
made evident in Fig. 12-14c through 12-14g. The traveling air-pressure
pulse  is  a rudimentary  sound  wave. 

The generalization to the propagation of periodic sound waves is
straightforward. Although it is not directly visible, the air in the tube behaves
very much like the spring in Fig. 12-15a. The corresponding wave in
air is visualized in Fig. 12-15b.

There are several different but closely related ways to describe the air
through which a sound wave passes. The wave function may describe the
change p in the pressure of the air as a function of position and time; that is,

$$f_1(x, t) = p(x, t) \tag{12-61a}$$

Or it may describe the longitudinal displacement s of small volume elements
of the air from their undisturbed positions as they oscillate:

$$f_2(x, t) = s(x, t) \tag{12-61b}$$

A sound wave is a longitudinal wave because the volume elements of air
move in directions parallel to the line of motion of the wave.

If the two wave functions of Eqs. (12-61a) and (12-61b) describe the
same wave, there must be a relation between them. From a physical point
of view, it is not surprising that the change p in the air pressure is related to
the change s in the longitudinal position of the volume elements of air. It is
the latter which leads to the “squeezing” of the air and thus to the change in
pressure. However, the two wave functions p(x, t) and s(x, t) are not in step
with each other. This can be seen from considering Fig. 12-16. Shown in



Fig.  12-15  (a)  A view  at  a particular  instant  of  a longitudinal  periodic  wave  traveling  along 

a spring.  (Note  that  a spring  can  act  as  a propagation  medium  for  longitudinal  waves  as  well 
as  for  transverse  waves,  as  in  Fig.  12-2.  Which  type  of  wave  occurs  depends  on  how  the  spring 
is  disturbed  by  the  source  inducing  the  wave.)  (b)  A longitudinal  periodic  wave  traveling  along 
a gas-filled  tube  at  a particular  instant.  (Only  longitudinal  waves  can  propagate  through  a 
gas-filled  tube,  just  as  only  transverse  waves  can  propagate  through  a stretched  string.  What 
are  the  physical  reasons  for  these  observations?) 




Fig. 12-16 The relation between longitudinal displacement of volume elements
of air in a tube through which a sound wave propagates and the pressure
changes in the tube.


Fig. 12-16a are evenly spaced points representing small volume elements in
the undisturbed air. In Fig. 12-16b a sinusoidal longitudinal displacement
wave is passing through the air, and the position of each element is shown
at a given instant of time. Each element is displaced by an amount s from its
own equilibrium position x. You can see that the density change (and thus
the pressure change p) is largest at the locations where the displacement is
zero. Just at these locations the neighboring elements have crowded in
from both left and right, resulting in a maximum pressure, or else have
moved away to both left and right, resulting in a minimum pressure. The
position change s is plotted versus x in Fig. 12-16c, and the pressure change
p is plotted versus x in Fig. 12-16d.
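
The construction of Fig. 12-16 is easy to imitate numerically. The Python sketch below
places evenly spaced elements along a line, displaces each one sinusoidally, and then
finds where the neighbors are most crowded and where they are most spread out. The
wavelength, displacement amplitude, and element spacing are assumed values chosen for
the illustration, with the amplitude kept small enough that elements never pass one
another.

```python
import math

wavelength = 1.0     # m (assumed)
amplitude = 0.05     # displacement amplitude, m (assumed)
spacing = 0.01       # undisturbed spacing between elements, m (assumed)
k = 2 * math.pi / wavelength

x = [i * spacing for i in range(201)]             # equilibrium positions (two wavelengths)
s = [amplitude * math.sin(k * xi) for xi in x]    # displacements at one instant
position = [xi + si for xi, si in zip(x, s)]      # displaced positions

# Distance between the two neighbors of each interior element.
# A small gap means crowding (higher pressure); a large gap means the reverse.
gap = {i: position[i + 1] - position[i - 1] for i in range(1, len(x) - 1)}

most_crowded = min(gap, key=gap.get)
most_spread = max(gap, key=gap.get)

print("most crowded at x =", x[most_crowded], "where s =", s[most_crowded])
print("most spread at x =", x[most_spread], "where s =", s[most_spread])
```

In both cases the displacement s at the location found is zero to within rounding
error, in agreement with the statement that the pressure change is largest where the
displacement vanishes.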

Longitudinal waves carry energy just as transverse waves do. In Sec.
12-5 we considered in detail the energy content and energy transport in
waves of transverse displacement traveling along a stretched string. Nearly
everything we learned there applies directly to waves of longitudinal displacement
traveling along an air-filled tube. We can define an average energy
density ⟨ρ_E⟩ for these one-dimensional sound waves to be their total
energy content per unit length, averaged over one wavelength. Also, we
can define the average energy flux ⟨S⟩ as the energy the waves carry past a
fixed point per unit time, averaged over one wavelength. These two quantities
are related to the sound wave velocity v just as in Eq. (12-56). That is,

$$\langle S \rangle = \langle \rho_E \rangle\, v \tag{12-62}$$

As we have explained, the relation energy flux equals energy density times the
velocity at which energy is transported applies no matter how the energy is
being transported. And since it applies to the unaveraged quantities S and
ρ_E, it certainly applies to their averages ⟨S⟩ and ⟨ρ_E⟩. You will soon see that
Eq. (12-62) holds even when it is necessary to change the precise definitions
of ⟨S⟩ and ⟨ρ_E⟩ in dealing with waves traveling in two or three dimensions.


Sound waves, and certain other types of longitudinal waves, typically
travel in three dimensions. The same is true of some types of transverse
waves. But we will introduce a discussion of multidimensional waves by considering
first a wave traveling in two dimensions. An example is shown in
the ripple-tank photo of Fig. 12-17. A vertically oscillating rod of circular
cross section touches the surface of a shallow layer of water in a transparent
tank and produces transverse waves in the form of ripples in the surface.
These waves are illuminated from below by a light source and photographed
from above by a camera. The waves propagate outward from
the source in all directions.

Fig. 12-17 Photograph of circular water waves propagating
away from a point where a vertically oscillating rod touches
the surface of a shallow tank of water. The apparatus is called
a ripple tank. (From PSSC Physics, 2d ed., D. C. Heath, Boston,
1965. Courtesy Education Development Center.)

The  most  obvious  feature  of  these  waves  traveling  in  two  dimensions 
can  be  specified  by  the  shapes  of  curves  which,  at  any  instant,  connect  all 
comparable  points  of  the  waves.  Such  curves  are  called  wave  fronts.  For  ex- 
ample, a curve  that  passes  through  the  points  where  the  crest  of  a particu- 
lar ripple  is  located  at  any  instant  is  a wave  front.  At  the  same  instant  there 
is  also  a wave  front,  lying  outside  the  one  just  mentioned,  which  connects 
all  the  points  where  a trough  is  located.  There  is  another  wave  front,  lying 
inside,  where  another  trough  is  located,  and  so  forth.  Since  all  directions 
are  equivalent,  the  waves  travel  away  from  their  source  at  a speed  which  is 
the  same  in  every  direction.  As  a consequence,  all  the  wave  fronts  are 
circles  centered  on  the  source.  For  this  reason  the  waves  are  called  circular 
waves. 

Careful  inspection  of  the  photo  will  show  you  that  the  amplitude  of  the 
waves  (the  “height”  of  the  ripples)  decreases  slowly  with  increasing  distance 
from  the  source.  A small  part  of  this  decrease  is  due  to  frictional  energy 
loss.  But  for  the  most  part  the  decrease  in  amplitude  is  explained  by  ap- 
plying energy  conservation  to  the  geometry  of  circular  waves  in  an  argu- 
ment, starting  in  the  next  paragraph,  that  pertains  to  water  waves,  sound 
waves,  or  waves  of  any  other  type. 

Consider  the  circular  wave  system  sketched  in  Fig.  12-18.  The  source 
applies  a force  to  the  water  at  the  surface  of  the  ripple  tank,  and  this  water 




Fig. 12-18 The set of circular wave crests pictured in Fig. 12-17. The
dashed circle of radius r is used in an argument concerning the energy flowing
away from the source of circular waves.


is displaced by the action of the force. Thus the source does work on the
water, supplying energy to it. The average rate at which the source continually
supplies energy is the source power ⟨P⟩. The energy is not accumulated
in the region near the source. Instead, the energy flows outward
through this region because it is transported by the waves that the source
produces. This means that the average energy per unit time crossing any
closed curve surrounding the source must be equal to the average source
power ⟨P⟩. Consider such a curve in the form of a circle of radius r centered
on the source. For the two-dimensional situation we are dealing with here,
the average energy flowing per unit time across a unit length of the periphery
of the circle is defined as the average energy flux ⟨S⟩. Since by symmetry the
value of the average energy flux is the same everywhere on the circle, the
value of ⟨S⟩ is the source power ⟨P⟩ divided by the circumference 2πr of
the circle. Thus

$$\langle S \rangle = \frac{\langle P \rangle}{2\pi}\,\frac{1}{r} \qquad \text{for uniform circular waves} \tag{12-63}$$

In this case the flux is an energy flow per unit time per unit length. Its value
is the constant ⟨P⟩/2π multiplied by the reciprocal of the distance r from
the source. The flux is inversely proportional to the distance from the source because
the same energy per unit time flows across a greater periphery the
greater the distance.

Next we employ Eq. (12-62), ⟨S⟩ = ⟨ρ_E⟩v. Here the average energy
density ⟨ρ_E⟩ of the waves is their energy content per unit area, and v is their
speed. A justification of Eq. (12-62) for a two-dimensional case is given in
Fig. 12-19 and its caption. Solving for ⟨ρ_E⟩ and using Eq. (12-63), we obtain

$$\langle \rho_E \rangle = \frac{\langle P \rangle}{2\pi v}\,\frac{1}{r} \qquad \text{for uniform circular waves} \tag{12-64}$$



Fig. 12-19 A schematic representation of the energy, flowing at speed v, which in the
infinitesimal time interval dt passes across an infinitesimal length dl of a fixed marker line
that is perpendicular to the direction of flow. In that time all the energy in the shaded
region flows past the indicated length of the marker line. The shaded region extends a
distance along the direction of flow equal to v dt. Its area is v dt dl. The average total energy
content in the region is the product of its average total energy per unit area ⟨ρ_E⟩ and its
area v dt dl. Since all this energy flows past the length dl of the perpendicular marker line,
the average energy flow per unit time per unit length of the marker line is ⟨ρ_E⟩v. This
quantity is the average energy flux ⟨S⟩. Therefore ⟨S⟩ = ⟨ρ_E⟩v. Why is the result valid
when applied to waves that spread, like circular waves, even though no indication of spreading
is seen in the figure? When you come to the discussion of waves in three dimensions, read the
remainder of this caption. For a three-dimensional situation, the shaded region could represent
the projection on the page of a rectangular volume of infinitesimal cross-sectional area da,
extending a distance v dt along the direction of flow. Its volume is v dt da. The average total
energy content in the region is the product of its average total energy per unit volume ⟨ρ_E⟩ and
its volume v dt da. Since all this energy flows past the area da of the perpendicular marker
surface, the average energy flow per unit time per unit area of the marker surface is ⟨ρ_E⟩v.
This quantity is the average energy flux ⟨S⟩. Therefore ⟨S⟩ = ⟨ρ_E⟩v.




Since everywhere on the circular wave v has the same magnitude, the quantity
⟨P⟩/2πv is constant. Thus this equation leads to the proportionality
⟨ρ_E⟩ ∝ 1/r. The energy density is inversely proportional to the distance
from the source because the greater the distance, the more thinly spread is
the energy. Why is it inversely proportional to v?

Finally, we use the proportionality exhibited in Eq. (12-57) between
⟨ρ_E⟩ and A², the square of the amplitude of the waves. As discussed briefly
after Example 12-8, this proportionality applies to waves of any nature.
Since A² ∝ ⟨ρ_E⟩, while Eq. (12-64) shows that ⟨ρ_E⟩ ∝ 1/r, it follows that
A² ∝ 1/r, or

$$A \propto \frac{1}{\sqrt{r}} \qquad \text{for uniform circular waves} \tag{12-65}$$

The gradual decrease in the amplitude of the circular waves in Fig. 12-17 is
thus seen to be a consequence of energy conservation and geometry.
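
The chain of reasoning for circular waves can be summarized in a short Python sketch.
It evaluates Eqs. (12-63) and (12-64) at several distances and confirms that the energy
crossing the whole circle per unit time is the same at every radius. The source power
and ripple speed used here are assumed values, not measurements from Fig. 12-17, and
the function names are chosen only for this illustration.

```python
import math

def circular_flux(P, r):
    """Eq. (12-63): average energy flux (W/m) at distance r from the source."""
    return P / (2 * math.pi * r)

def circular_energy_density(P, r, v):
    """Eq. (12-64): average energy content per unit area (J/m^2)."""
    return circular_flux(P, r) / v

def amplitude_ratio(r1, r2):
    """A(r2)/A(r1) from Eq. (12-65), with A proportional to 1/sqrt(r)."""
    return math.sqrt(r1 / r2)

P, v = 0.010, 0.25     # assumed source power (W) and ripple speed (m/s)

for r in (0.05, 0.10, 0.20):
    S = circular_flux(P, r)
    crossing_whole_circle = S * 2 * math.pi * r   # always equals P
    print(r, S, crossing_whole_circle, amplitude_ratio(0.05, r))
```

Doubling the distance halves the flux and the energy density but reduces the amplitude
only by the factor 1/√2, which is why the ripples in the photo fade slowly.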


If a source of power ⟨P⟩ is emitting waves which propagate away from
it in all directions in a uniform, three-dimensional medium, the wave fronts
form spheres centered on the source. Consequently, the traveling waves
are called spherical waves. The energy in any given spherical wave will be
distributed uniformly over the wave if the source is injecting energy into
the system uniformly in all directions. (This is what the symmetrical
ripple-tank source is doing in the two-dimensional case shown in Fig.
12-17.) In these circumstances the arguments just gone through for circular
waves can be modified immediately to apply to spherical waves. In
three dimensions, the average energy flux ⟨S⟩ is the energy flowing per
unit time across a unit area, and the average energy density ⟨ρ_E⟩ is the energy
content per unit volume. The latter part of the caption of Fig. 12-19
shows that these quantities are still related by Eq. (12-62), ⟨S⟩ = ⟨ρ_E⟩v,
where v is the speed at which the waves transport energy.

If you repeat the arguments, using a sphere of radius r and surface
area 4πr² centered on the source instead of a circle of radius r and circumference
2πr, you will have no difficulty in showing that

$$\langle S \rangle = \frac{\langle P \rangle}{4\pi}\,\frac{1}{r^2} \qquad \text{for uniform spherical waves} \tag{12-66}$$

and

$$\langle \rho_E \rangle = \frac{\langle P \rangle}{4\pi v}\,\frac{1}{r^2} \qquad \text{for uniform spherical waves} \tag{12-67}$$

and that the amplitude A obeys the relation

$$A \propto \frac{1}{r} \qquad \text{for uniform spherical waves} \tag{12-68}$$

What is the physical interpretation of these results?

The  relations  quoted  in  Eqs.  (12-65)  and  (12-67)  show  that  waves  trav- 
eling in  two  and  three  dimensions  through  a medium  in  which  frictional 
losses  are  negligible  (so  that  mechanical  energy  is  conserved)  do  not  main- 
tain constant  amplitudes.  This  is  in  contrast  to  the  constant  amplitude  of 
the  one-dimensional  traveling  waves  found  by  solving  the  one-dimensional 




wave  equation,  Eq.  (12-60).  The  wave  equations  in  two  and  three  dimen- 
sions are  similar  to  Eq.  (12-60),  but  not  identical  to  it  or  to  each  other.  The 
differences  in  the  geometry  in  the  three  cases  lead  to  differences  in  the 
exact  forms  of  the  wave  equations.  In  fact,  they  make  the  multidimensional 
wave  equations  considerably  more  complicated  than  the  one-dimensional 
wave  equation.  So  we  will  do  nothing  with  them  other  than  to  state  that 
direct  solutions  of  these  wave  equations  lead  directly  to  the  dependence  of 
A on  r seen  in  Eqs.  (12-65)  and  (12-68). 

The relations given by Eqs. (12-63) through (12-68) are valid for waves
of  any  nature,  because  they  are  based  on  considerations  of  geometry  and 
energy  conservation  that  pertain  to  all  waves.  Example  12-9  applies  Eq. 
(12-66)  to  light  waves. 


EXAMPLE  12-9 

A 100-W  light  bulb  uses  about  that  much  electric  power.  But  the  power  in  the  visible 
light  produced  by  the  bulb  is  no  more  than  about  10  W. 

a.  Estimate  how  much  power  in  the  form  of  visible  light  flows  onto  the  pupil  of 
an eye, of radius 1.0 mm, located a distance of 5.0 m from a bulb emitting 10 W of
visible  light,  assuming  light  is  emitted  from  the  bulb  uniformly  in  all  directions. 

b.  What  would  be  the  effect  of  increasing  the  distance  from  the  bulb  to  the  eye 
to  10.0  m,  without  changing  the  orientation  of  the  bulb? 

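■ a. Using Eq. (12-66), you first evaluate the energy flux

$$\langle S \rangle = \frac{\langle P \rangle}{4\pi r^2}$$

The numerical value is

$$\langle S \rangle = \frac{10\ \text{W}}{4\pi \times (5.0\ \text{m})^2} = 0.032\ \text{W/m}^2$$

Multiplying this energy flow per unit time per unit area by the area πR² of the pupil
of radius R, you have

$$\text{Energy flow per second} = \langle S \rangle\, \pi R^2 = 0.032\ \text{W/m}^2 \times \pi \times (1.0 \times 10^{-3}\ \text{m})^2 = 1.0 \times 10^{-7}\ \text{W}$$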


This can be only an estimate because light is not emitted uniformly in all directions
by a light bulb. (For instance, no light comes past the base of the bulb.)

b. If the orientation of the bulb is held constant while its distance from the
pupil is made larger by a factor of 2, the inverse-square law of Eq. (12-67) predicts,
with good accuracy, that the energy flowing onto the pupil of fixed radius will become
smaller by a factor of 1/4. Why are nonuniformities in the emission of light into
various directions not of importance here?
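
Both parts of this example can be checked with a brief Python sketch that applies
Eq. (12-66) and multiplies by the pupil area. It uses only the numbers given in the
example and makes the same assumption of uniform emission in all directions; the
function names are chosen here for illustration.

```python
import math

def spherical_flux(P, r):
    """Eq. (12-66): average energy flux (W/m^2) at distance r from a uniform source."""
    return P / (4 * math.pi * r**2)

def power_onto_pupil(P, r, R):
    """Average flux at distance r times the area pi R^2 of a pupil of radius R."""
    return spherical_flux(P, r) * math.pi * R**2

P_visible = 10.0      # W of visible light
R_pupil = 1.0e-3      # m

at_5m = power_onto_pupil(P_visible, 5.0, R_pupil)
at_10m = power_onto_pupil(P_visible, 10.0, R_pupil)

print(f"flux at 5.0 m: {spherical_flux(P_visible, 5.0):.2g} W/m^2")
print(f"onto pupil: {at_5m:.2g} W at 5.0 m, {at_10m:.2g} W at 10.0 m")
print(f"ratio: {at_10m / at_5m:.2f}")
```

The sketch reproduces the flux of 0.032 W/m² and the power of 1.0 × 10⁻⁷ W found in
part a, and it shows the power onto the pupil falling to one-fourth when the distance
is doubled.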


12-7  THE DOPPLER EFFECT

The Doppler effect concerns properties of mechanical traveling waves
which result from the fact that when the waves are propagating through a
particular medium fixed in an inertial frame, they maintain a constant speed
with respect to the propagation medium. We proved this in Eq. (12-35b) for
waves in a stretched string. But it is true of all mechanical traveling waves.

You have met one example of the Doppler effect if you have noticed
how the pitch, or frequency, of the horn of an approaching automobile appears
to drop as the automobile passes. (The “pitch” of a musical tone, the
“highness” or “lowness” you perceive, is determined by the frequency of
the sound wave striking your ear. The higher the frequency, the higher
the pitch.) The downward shift in the frequency that you detect when stationary
with respect to the air through which the sound waves travel arises
because when their source is approaching you, the detected frequency is
higher than the frequency of the source, and when the source is receding
from you, the detected frequency is lower than that of the source. This
property of mechanical traveling waves, and related properties, are called
the Doppler effect after the physicist Christian Doppler (1803-1853), who
first worked out the theory in detail for the case of sound and pointed out
its implications for the case of light.

Let  us  begin  our  study  of  the  Doppler  effect  by  explaining  qualitatively 
what  happens  when  the  automobile  mentioned  above  approaches  a sta- 
tionary observer  through  still  air.  At  a certain  moment  the  oscillating  dia- 
phragm of  the  horn  produces  a region  of  maximum  air  pressure.  This  re- 
gion immediately  begins  to  propagate  as  a wave  outward  in  all  directions. 
Since  the  medium  through  which  it  propagates  is  uniform  and  stationary 
with  respect  to  the  observer  fixed  in  an  essentially  inertial  frame,  the  speed 
of  the  wave  with  respect  to  the  observer  is  the  same  in  all  directions.  From 
the  observer’s  viewpoint,  the  wave  must  therefore  propagate  as  a spherical 
wave.  That  is,  the  wave  front  passing  through  all  points  where  the  air  pres- 
sure maximizes  is  an  expanding  sphere.  This  sphere  is  centered  on  where 
the  horn  was  when  it  emitted  the  wave  front.  The  position  of  this  wave 
front  at  some  subsequent  time  is  indicated  by  the  curve  labeled  1 in  Fig. 
12-20a.  It  is  a circle  centered  on  the  position  labeled  1 and  is  supposed  to 
represent  a sphere  centered  on  that  position. 

Later  the  horn  emits  the  next  region  of  maximum  air  pressure,  which 
forms  wave  front  2.  But  from  the  viewpoint  of  the  observer,  the  horn  has 
moved  from  position  1 to  position  2 by  the  time  this  happens.  Thus  from  the 
observer’s  viewpoint  wave  front  2 propagates  as  a spherical  wave  whose 
center  is  at  position  2.  The  figure  shows  wave  front  2 at  the  same  instant  as 
it  shows  wave  front  1.  The  process  continues,  and  the  result  is  the  pattern 
of  spherical  waves  indicated  in  Fig.  12-20a.  The  motion  of  the  car  has  led  to 
a spacing  of  the  wave  fronts  which  is  different  in  different  directions.  This 
is  shown  for  circular  water  waves  emitted  by  a moving  source  in  the  ripple 
tank photo of Fig. 12-20b.

If  the  observer  is  at  the  location  labeled  A in  Fig.  12-20a,  to  which  the 
car  is  approaching,  then  consecutive  wave  fronts  sweep  by  the  observer’s 
ear  at  a rate  greater  than  that  at  which  they  were  emitted  because  they  are 
“bunched  up.”  The  observer  consequently  hears  sound  of  frequency 
higher  than  the  frequency  of  oscillation  of  the  diaphragm  of  the  horn.  But 
if  the  observer  is  at  location  R,  from  which  the  car  is  receding,  then  consec- 
utive wave  fronts  sweep  by  at  a rate  less  than  that  at  which  they  were 
emitted  because  they  are  “spread  out,”  and  the  observer  hears  sound  of 
frequency  lower  than  the  frequency  of  oscillation  of  the  diaphragm. 
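
The bunching and spreading of the wave fronts can be made quantitative by following
the construction just described. In the Python sketch below each wave front is a
sphere that expands at the speed of sound from the point where the horn was when the
front was emitted; only the points of each front on the line of motion are computed.
The rate at which fronts sweep past a stationary observer is taken to be the wave
speed divided by the spacing of the fronts. The speed of sound, the speed of the car,
and the horn frequency are assumed values chosen for the illustration.

```python
def front_positions(v_wave, v_source, nu_source, t, n_fronts, direction):
    """Positions on the line of motion, at time t, of fronts emitted at 0, T, 2T, ...

    direction = +1 gives the points ahead of the source (toward location A),
    direction = -1 the points behind it (toward location R).
    The source starts at x = 0 and moves in the +x direction.
    """
    T = 1.0 / nu_source
    positions = []
    for n in range(n_fronts):
        emitted_at = n * T
        center = v_source * emitted_at        # where the horn was when it emitted front n
        radius = v_wave * (t - emitted_at)    # how far front n has expanded since then
        positions.append(center + direction * radius)
    return positions

v_wave, v_source, nu_source = 340.0, 30.0, 400.0   # m/s, m/s, Hz (assumed values)
t = 0.05                                            # s, after all five fronts are emitted

ahead = front_positions(v_wave, v_source, nu_source, t, 5, +1)
behind = front_positions(v_wave, v_source, nu_source, t, 5, -1)

spacing_ahead = ahead[0] - ahead[1]       # fronts are "bunched up" here
spacing_behind = behind[1] - behind[0]    # fronts are "spread out" here

heard_at_A = v_wave / spacing_ahead       # fronts per second passing an observer at A
heard_at_R = v_wave / spacing_behind      # fronts per second passing an observer at R
print(spacing_ahead, spacing_behind)
print(heard_at_A, nu_source, heard_at_R)
```

With these values the frequency heard at A is higher than the horn frequency, and the
frequency heard at R is lower, as the qualitative argument requires.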

In general, the observed frequency of the sound depends not only on
the velocity of the source relative to a reference frame fixed to the ground,
but also on the velocities with respect to that reference frame of the propagation
medium and of the observer. Let us restrict ourselves to one-dimensional
motion, that is, to the case where the source, the medium, and
the observer all move with respect to a reference frame fixed to the ground
along a common line, though at different speeds and/or directions. We will
derive a general expression for the Doppler effect in this case. The derivation
makes repeated use of the Galilean transformations developed in
Sec. 3-8.

Fig. 12-20 (a) A constant-frequency sound source is moving with uniform
velocity with respect to the air through which the sound propagates.
The figure represents the locations of the wave fronts at a particular
instant of time. Each wave front is a sphere centered on the location of
the sound source at the instant when the wave was emitted. Since the
source is moving to the right, so are the centers of the successive spherical
wave fronts. (b) A ripple-tank photo showing circular wave fronts produced
by a constant-frequency source moving uniformly to the right over
the surface of the water in the tank. (Courtesy Educational Development
Center.)

Figure 12-21 depicts the situation as seen from the essentially inertial
reference frame of the ground. The wind blows (that is, the air which is
the propagation medium moves) at a constant velocity with respect to the
ground given by the signed scalar v_m. Because v_m is constant, the propagation
medium itself is at rest in some other essentially inertial reference
frame. Thus the speed of sound with respect to the propagation medium has the
fixed value |v|. The source of the sound waves moves at a constant velocity
v_s, and the observer who detects the waves moves at a constant velocity v_o.
Both v_s and v_o are signed scalars, and both are measured with respect to the
ground. We consider only the parts of the wave fronts that are near the
common line of motion. And we use their direction of motion to define the
positive direction of the x axis extending along the line of motion and fixed
to the ground. (Note carefully how the positive direction is defined, and be
sure to conform to it when assigning numerical values to v_m, v_s, and v_o in
applying the formula we will obtain to a specific calculation.) For the sake
of convenience, the figure shows the velocities of the medium, the source,
and the observer to be all in the positive x direction.

Fig. 12-21 A view, from a reference frame fixed to the ground, of a moving
medium, moving source, and moving observer considered in deriving the
general one-dimensional Doppler-effect formula. Only the solid parts of wave
fronts 1 and 2 are taken into account. Their direction of motion through this
reference frame defines the positive direction of its x axis.

Let the frequency of the source be ν. It emits wave front 1, and at a
time T = 1/ν later it emits wave front 2. In that time the displacement of
wave front 1 with respect to the propagation medium is |v|T. But the
propagation medium is itself moving with respect to the ground at velocity
v_m and so has been displaced with respect to the ground by an amount v_m T
in the same time. Thus wave front 1 has been displaced with respect to the
ground by the amount (|v| + v_m)T. In the same time, the source has been
displaced with respect to the ground, by the amount v_s T, to the location
where it emits wave front 2. Thus the separation between wave front 1 and
wave front 2 is

$$(|v| + v_m)T - v_s T = [\,|v| - (v_s - v_m)\,]\,T$$

But this separation is precisely what is meant by the wavelength λ of the
waves. So

$$\lambda = [\,|v| - (v_s - v_m)\,]\,T$$

Since T = 1/ν, this is

$$\lambda = \frac{|v| - (v_s - v_m)}{\nu} \tag{12-69}$$

The parts of the wave fronts traveling in the positive x direction along
the ground are measured from the ground to move with a velocity given by
|v| + v_m. But the observer is moving with respect to the ground as well, at
velocity v_o. So the velocity of the wave fronts with respect to the
observer is |v| + v_m − v_o = |v| − (v_o − v_m). We must assume that the
speed |v| of the waves with respect to the medium is greater than the speed
|v_o − v_m| of the observer with respect to the medium. Then the velocity
|v| − (v_o − v_m) of the wave fronts with respect to the observer will have
a positive value. This means that the wave fronts will catch up with the
observer, moving at a speed with respect to him whose numerical value is
given by |v| − (v_o − v_m). After wave front 1 has caught up with the
observer, it takes a time T' for wave front 2 to catch up. This time is just
the separation λ between consecutive wave fronts, whose value is given by
Eq. (12-69), divided by the speed whose value is given by |v| − (v_o − v_m).
Thus

$$T' = \frac{1}{\nu}\,\frac{|v| - (v_s - v_m)}{|v| - (v_o - v_m)}$$

The frequency ν' = 1/T' measured by the observer is therefore obtained
by inverting numerator and denominator of this equation, to yield

$$\nu' = \nu\,\frac{|v| - (v_o - v_m)}{|v| - (v_s - v_m)} \tag{12-70}$$


For this general Doppler-effect formula to be valid, we must assume also
that the speed |v| of the wave with respect to the medium is greater than the
speed |v_s − v_m| of the source with respect to the medium. Then we can be
sure that the denominator in Eq. (12-70) will never be zero, so that ν' is
never infinite, and that the denominator will never be negative, so that ν'
is never negative.
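
As a rough numerical companion to Eq. (12-70), the formula and its two validity conditions can be sketched in Python. The function name `doppler_frequency` and its argument names are illustrative choices made here, not notation from the text. All velocities are signed scalars measured from the ground, positive in the direction in which the wave fronts travel from source to observer.

```python
def doppler_frequency(nu, v_sound, v_source=0.0, v_observer=0.0, v_medium=0.0):
    """Observed frequency according to Eq. (12-70).

    nu         -- frequency emitted by the source
    v_sound    -- speed |v| of the waves with respect to the medium (positive)
    v_source   -- v_s, v_observer -- v_o, v_medium -- v_m
    Velocities are signed scalars relative to the ground, positive along
    the direction of travel of the wave fronts that reach the observer.
    """
    if v_sound <= 0:
        raise ValueError("v_sound must be positive")
    # Validity conditions stated in the text: source and observer must each
    # move through the medium more slowly than the waves do.
    if abs(v_source - v_medium) >= v_sound:
        raise ValueError("source speed relative to medium must be below |v|")
    if abs(v_observer - v_medium) >= v_sound:
        raise ValueError("observer speed relative to medium must be below |v|")
    numerator = v_sound - (v_observer - v_medium)
    denominator = v_sound - (v_source - v_medium)
    return nu * numerator / denominator


if __name__ == "__main__":
    # Nothing moves: no shift.
    print(doppler_frequency(440.0, 344.0))
    # Wind alone, with source and observer fixed to the ground: still no shift.
    print(doppler_frequency(440.0, 344.0, v_medium=5.0))
```

The second call illustrates a point implicit in the formula: with v_s = v_o = 0 the numerator and denominator are both |v| + v_m, so a steady wind along the line of motion does not by itself change the frequency.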




If only the source is moving with respect to the ground, then v_o and
v_m are zero, and the formula reduces to the simpler form

$$\nu' = \nu\,\frac{|v|}{|v| - v_s} \qquad \text{for only source moving} \tag{12-71}$$

If only the observer is moving, the corresponding expression is

$$\nu' = \nu\,\frac{|v| - v_o}{|v|} \qquad \text{for only observer moving} \tag{12-72}$$


Even  though  the  propagation  medium  is  not  moving  with  respect  to  the 
ground  in  the  cases  to  which  Eqs.  (12-71)  and  (12-72)  pertain,  its  presence  is  still 
essential.  Not  only  is  the  medium  responsible  for  the  very  existence  of  the  waves, 
but  also  it  specifies  the  frame  of  reference  with  respect  to  which  |v|  is  measured.  It 
was  thinking  about  this  “privileged”  frame  of  reference  in  connection  with  the 
propagation  of  light  that  led  Einstein  to  the  theory  of  relativity,  as  you  will  see  in 
Chap.  14.  There  is  a Doppler  effect  for  light,  and  other  electromagnetic  waves, 
which  is  qualitatively  similar  to  the  Doppler  effect  for  mechanical  waves.  But  the 
Doppler-effect  formula  for  electromagnetic  waves  is  not  the  same  as  Eq.  (12-70), 
and  the  physical  explanation  of  the  effect  is  completely  different,  because  there  is 
no  privileged  reference  frame  for  electromagnetic  waves. 
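
The role of the medium as a privileged frame can be made concrete with a small numerical sketch. It compares a source approaching a fixed observer at 30 m/s with an observer approaching a fixed source at the same speed. The function names, the 30 m/s closing speed, and the 440 Hz tone are assumptions chosen for illustration; the 344 m/s sound speed is the value used in Example 12-10.

```python
V_SOUND = 344.0  # m/s, speed of sound relative to still air (assumed value)


def source_moving(nu, v_s, v=V_SOUND):
    """Eq. (12-71): only the source moves; v_s > 0 is toward the observer."""
    return nu * v / (v - v_s)


def observer_moving(nu, v_o, v=V_SOUND):
    """Eq. (12-72): only the observer moves; v_o > 0 is away from the source."""
    return nu * (v - v_o) / v


nu = 440.0            # Hz, emitted frequency (illustrative)
closing_speed = 30.0  # m/s, same relative speed in both cases (illustrative)

f_source = source_moving(nu, +closing_speed)      # source approaches observer
f_observer = observer_moving(nu, -closing_speed)  # observer approaches source

print(f"source moving:   {f_source:.2f} Hz")
print(f"observer moving: {f_observer:.2f} Hz")
```

The two results differ by a few hertz even though the relative speed of source and observer is identical, because what matters for mechanical waves is each party's motion through the medium.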


A problem  in  which  none  of  the  terms  in  Eq.  (12-70)  is  zero  is  worked 
out  in  Example  12-10. 


EXAMPLE  12-10 

Excursion  boats  A and  B are  moving  toward  each  other,  both  traveling  at  speeds  of 
7 m/s  with  respect  to  the  reference  frame  fixed  to  the  earth.  The  wind  is  blowing  at 
a speed  of  5 m/s  parallel  to  the  direction  of  motion  of  boat  A,  as  seen  from  the  ref- 
erence frame  fixed  to  the  earth.  There  is  a band  on  each  boat.  The  oboist  on  boat  A 
plays  the  musical  note  A,  which  corresponds  to  the  frequency  440  Hz.  The  musi- 
cians on  boat  B tune  to  that  note  and  then  play  it  back.  What  frequency  does  the 
bandleader  on  boat  A hear?  Take  the  speed  of  sound  in  air  to  be  344  m/s. 

■ You must be very careful in applying Eq. (12-70) to assign the proper signs to
the various velocities. To help in doing this, you first draw a sketch like that in
Fig. 12-22a. It represents the first part of the process, in which the source is on
boat A and the observers are on boat B, and takes the positive x direction in the
direction of motion of the parts of the wave fronts that carry the sound received
on boat B. You can then calculate ν'_B, the frequency heard on boat B. It is

$$\nu'_B = \nu\,\frac{|v| - (v_o - v_m)}{|v| - (v_s - v_m)}
= 440\ \text{Hz} \times \frac{344\ \text{m/s} - [(-7\ \text{m/s}) - (+5\ \text{m/s})]}{344\ \text{m/s} - [(+7\ \text{m/s}) - (+5\ \text{m/s})]}
= 458\ \text{Hz}$$

The Doppler shift Δν = ν'_B − ν has the value Δν = 458 Hz − 440 Hz = +18 Hz.

Next you sketch the situation for the second part of the process, in which the
musicians on boat B act as the source and the bandleader on boat A is the observer.
In so doing, you must remember that in the analysis leading to Eq. (12-70), the
positive x direction was defined to be in the direction of motion of the parts of the
wave fronts which carry the sound from source to observer. Hence the positive x
direction must now be reversed from that in the first part of the analysis, and you
make your sketch as in Fig. 12-22b. Then you use Eq. (12-70) again, this time
setting ν = ν'_B = 458 Hz and calculating ν'_A, the frequency heard on boat A. You
find

$$\nu'_A = \nu\,\frac{|v| - (v_o - v_m)}{|v| - (v_s - v_m)}
= 458\ \text{Hz} \times \frac{344\ \text{m/s} - [(-7\ \text{m/s}) - (-5\ \text{m/s})]}{344\ \text{m/s} - [(+7\ \text{m/s}) - (-5\ \text{m/s})]}
= 477\ \text{Hz}$$

Fig. 12-22 Illustration for Example 12-10. (a) The musicians on boat B hear the
oboist on boat A. The positive x direction is defined as shown, and the wind
velocity is v_m = +5 m/s. (b) The bandleader on boat A hears the musicians on
boat B. The positive x direction is redefined as shown, and the wind velocity is
v_m = −5 m/s.

The second Doppler shift is Δν = 477 Hz − 458 Hz = +19 Hz. The bandleader
thus hears a frequency differing from that of his oboist by 477 Hz − 440 Hz = +37
Hz. In musical terminology, the bandleader would say that the note he hears from
the other boat is more than half a tone sharp. The difference is very easily
detectable, even though the speeds are all quite modest.
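
The arithmetic of Example 12-10 can be checked with a short Python sketch. The helper name `doppler` is an illustrative choice, and the comparison with a semitone assumes equal temperament (a frequency ratio of 2^(1/12)), which the text does not specify. As in the worked solution, the frequency heard on boat B is rounded to the nearest hertz before it is used as the source frequency for the return trip.

```python
def doppler(nu, v, v_s, v_o, v_m):
    """Eq. (12-70); velocities are positive along the direction of wave travel."""
    return nu * (v - (v_o - v_m)) / (v - (v_s - v_m))


V = 344.0  # m/s, speed of sound in air given in the example

# Leg 1: oboist on boat A is the source, musicians on boat B observe.
# Positive x points from A toward B, so the wind (parallel to A's motion) is +5.
nu_B = doppler(440.0, V, v_s=+7.0, v_o=-7.0, v_m=+5.0)

# Leg 2: boat B is the source, bandleader on boat A observes.
# Positive x is reversed (from B toward A), so the wind is now -5.
nu_A = doppler(round(nu_B), V, v_s=+7.0, v_o=-7.0, v_m=-5.0)

print(round(nu_B), "Hz heard on boat B")
print(round(nu_A), "Hz heard on boat A")
print("total shift:", round(nu_A) - 440, "Hz")

# Assumption: equal-tempered semitone ratio, used only to check the
# "more than half a tone sharp" remark.
semitone = 2 ** (1 / 12)
print("sharper than a semitone:", nu_A / 440.0 > semitone)
```

In each leg the source and observer velocities keep the same signs, +7 and −7 m/s, because the axis is flipped along with the roles; only the sign of the wind velocity changes between the two legs.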


EXERCISES 

Group  A 

12-1. Triangular wave pulse, I. A transverse wave pulse
is traveling in the positive x direction along a long
stretched string. The origin is taken at a point on the
string which is far from the ends. The speed of the wave
is |v|. At time t = 0, the displacement of the string is
described by the function y = f(x, 0) = A(1 − |x|/l) for
|x| ≤ l, and y = f(x, 0) = 0 for |x| > l.


a.  Construct  a graph  that  portrays  the  actual  shape  of 
the  string  at  t = 0 for  the  case  A = 1/2. 

b. Sketch a graph of the wave form at the following
times:

1. t = l/|v|

2. t = 2l/|v|

3. t = −3l/2|v|


Exercises  543 


12-2. Triangular wave pulse, II. Carry out the construction
of a wave pulse moving in the negative x direction
but otherwise as described in Exercise 12-1.

12-3.  Bird  on  a clothesline.  A child  plucks  one  end  of 
a taut  clothesline,  sending  a wave  pulse  toward  a sparrow 
perched  6.0  m down  the  line.  The  speed  of  the  wave  is  8.0 
m/s.  How  much  time  does  the  sparrow  have  to  fly  away  in 
order  to  avoid  being  shaken  by  the  pulse? 

12-4. Tsunami warning! A burst of seismic waves is detected
at a seismic station located at point S in Fig. 12E-4.
Seismologists analyze the record and are able to determine
that a submarine earthquake has occurred at point
Q, 900 km from the station and 200 km offshore from a
seaport P.


Fig.  12E-4 


a. How much time was required for the seismic waves
from the quake to reach the seismic station? Assume that
the average speed of seismic waves is 10 km/s, and that
the actual path followed by the waves in going from Q to S
is 900 km in length.

b.  The  seismologists  realize  that  the  quake  must  have 
produced  a potentially  destructive  tsunami  (often  inac- 
curately called  a “tidal  wave”).  This  is  a water  wave  propa- 
gating along  the  ocean  surface  with  a speed  of  150  m/s. 
Once  the  seismic  waves  have  been  detected  at  point  S,  how 
much  time  is  available  for  evacuation  of  low-lying  areas  of 
the  port  P?  (Assume  that  the  seismic  record  is  interpreted 
with  a negligible  loss  of  time.) 


12-5. A prodigious explosion! On June 30, 1908, a huge
explosion occurred over Siberia. The sound from this
explosion (which was apparently due to a very large
meteorite) was heard 500 km away! How much time elapsed
between the actual explosion and the arrival of the sound
waves at such distant locations? Make the assumption that
the speed of sound was a uniform 340 m/s.

12-6. Mathematical description of a wave. A transverse
sinusoidal wave traveling in the negative direction has
amplitude 0.10 m, wavelength 2.0 m, and period 0.50 s.
Write its wave function in the forms of both Eq. (12-18)
and Eq. (12-21), assuming that δ = 0.

12-7. What's the wavelength? The frequency of electromagnetic
waves used to transmit television signals is about
1 × 10⁸ Hz. What is the wavelength of these waves? The
speed of electromagnetic waves is 3.00 × 10⁸ m/s.

12-8. Reading a wave function. The equation for a sinusoidal
traveling wave is given by

$$y = (0.00500\ \text{m})\cos[(10.0\ \text{m}^{-1})\pi x - (40.0\ \text{s}^{-1})\pi t]$$

a. What are the direction of propagation, amplitude,
wavelength, and velocity of the wave?

b. What is the displacement y at x = 0.0500 m at time
t = 0 s; t = 0.0125 s; and t = 0.0375 s?

c. What is the displacement y at t = 0.0125 s at position
x = 0 m; x = 0.0500 m; and x = 0.150 m?

12-9. A traveling wave, I. A sinusoidal wave is traveling
along a stretched string. The point on the string at the
origin (x = 0) oscillates in a manner described by the
equation y = A cos(ωt), where A = 0.10 m and ω = 20π s⁻¹.
The point at x = 0.050 m oscillates in a manner described
by the equation y = A cos(ωt − π/4).

a. What is the frequency of the wave?

b. Assume that the wave is traveling in the positive x
direction and that its wavelength is greater than 0.050 m.
Find the wavelength and the wave velocity. Write an equation
y = f(x, t) that describes this traveling wave.

12-10.  To  preserve  the  wave  speed.  A stretched  string  is 
replaced  by  another  of  the  same  material  but  with  twice 
the  diameter.  What  should  be  the  ratio  of  the  new  tension 
to  the  old  if  the  wave  speed  is  to  remain  the  same? 

12-11. Transverse motion of the string itself. The equation
of a certain one-dimensional transverse traveling
wave along a string is given by y = A cos(kx − ωt), where
A = 0.0050 m, k = 10π m⁻¹, and ω = 40π s⁻¹.

a. What is the expression for the transverse velocity
∂y/∂t of any part of the string?

b. At t = 0, what is the value of this velocity at x =
0.050 m? What is the value of the displacement at the same
point at t = 0?

12-12.  Waves  on  a nylon  cord.  A nylon  cord  whose 
diameter  is  0.50  cm  is  held  under  a tension  of  500  N.  The 
density  of  nylon  is  1.1  g/cm3.  Find  the  speed  at  which  a 
transverse  wave  will  propagate  along  the  cord. 

12-13. Energy flow in a sinusoidal traveling wave. A
transverse sinusoidal wave is traveling at a speed of 20 m/s
along a cable whose linear density is μ = 0.20 kg/m. The
wave amplitude is 0.10 m and the frequency is 5.0 Hz.

a. Determine the wavelength.

b. Determine the tension in the cable.

c. Find the average energy density in the wave.

d. Find the average energy flux carried by the wave.

12-14. Across the bay. A wave source of power 100 W,
located in an otherwise calm bay, creates spreading circular
water waves of frequency 1.0 Hz.

a. What average energy flux (in W/m) is carried past
an observation point P1 located 30 m from the source?

b. What is the average energy flux passing point P2,
which is located 60 m from the source?

c. Express the wave amplitude at point P2 as a fraction
of the amplitude at point P1.

12-15. Albatross. A shipwrecked sailor adrift in a lifeboat
can just barely hear the cries of an albatross. The bird
is flying high above the ocean, and the straight-line distance
between the bird and the sailor is 3 km. As you will
learn in Chapter 13, the threshold (minimum energy flux)
for human hearing is about 1 × 10⁻¹² W/m².

a.  Assuming  that  the  albatross  emits  sound  isotropi- 
cally (equally  in  all  directions),  estimate  the  power  it  emits 
during  its  cries. 

b.  Suppose  that  the  albatross  cries  1000  times  daily 
and  that  each  cry  lasts  1 s.  How  much  energy  does  it  emit 
in  sound  waves  each  day? 

12-16.  Beep  beep.  Two  cars  are  traveling  in  the  same 
direction  with  the  same  speed,  30  m/s.  The  driver  of  the 
rear  car  blows  her  horn,  which  has  a frequency  of  200  Hz. 
What  frequency  does  the  driver  of  the  front  car  hear? 


12-17. Doppler demonstration. In a lecture demonstration
a whistle is mounted at the end of a 0.500 m arm of
a rotator which is turning at 2.00 revolutions per second.
The whistle emits a sound whose frequency is 300 Hz.
What are the highest and lowest frequencies that the
audience hears? Take the velocity of sound to be 340 m/s.


Group B

12-18. A leftward-moving wave pulse, I. Derive directly
the wave function f(x, t) = f(x + |v|t) for a wave pulse
traveling at speed |v| in the negative x direction. Use an
argument similar to the one leading to Eq. (12-6), but with
observer O' moving in the negative x direction.

12-19. A leftward-moving wave pulse, II. Verify directly
that the wave function f(x, t) = f(x + |v|t) describes a wave
pulse traveling at speed |v| in the negative x direction. Use
an argument like the one in small print following Eq.
(12-6).

12-20. Wave function and wave shape. The wave function

$$f(x, t) = A e^{-B(x - vt)^2}$$

describes a single wave pulse, traveling at velocity v, which
has a shape quite like the one shown in Fig. 12-4. To verify
this, choose whatever dimensionally consistent values you
wish for the arbitrary constants A and B, and for the wave
velocity v, and then plot the wave function versus x for
two different values of t. How do you determine the wave
velocity from your plots?

12-21. A leftward-moving sinusoidal wave. Derive the
wave function

$$f(x, t) = A \cos\left[2\pi\left(\frac{x}{\lambda} + \frac{t}{T}\right) + \delta\right]$$

for a sinusoidal wave traveling with velocity v = −νλ =
−λ/T. Use an argument like the one leading to Eq. (12-17).

12-22. Slope and velocities. Suppose that a transverse
sinusoidal wave is traveling along a string. Prove that at
any time t the slope ∂y/∂x of the string at any point x is equal
to the negative of the instantaneous transverse velocity
∂y/∂t of the string at x divided by the wave velocity v. That is,
show that

$$\frac{\partial y}{\partial x} = -\frac{1}{v}\,\frac{\partial y}{\partial t}$$

12-23. Find the wave function. A sinusoidal wave traveling
in the positive direction on a stretched string has
amplitude 2.0 cm, wavelength 1.0 m, and wave velocity 5.0
m/s. At x = 0 and t = 0, you have y = 0 and ∂y/∂t < 0.
Find the wave function y = f(x, t) which describes the wave.

12-24. Confirming a solution. By direct substitution
into the wave equation for a stretched string, Eq. (12-28),
show that the wave function

$$f(x, t) = A e^{-B(x - vt)^2}$$

is a solution to the equation if v = ±√(F/μ), where F is the
tension in the string and μ is its linear density.

12-25. Waves on a steel cable. A gondola that is used to
carry sightseers out over a deep gorge is suspended from
a number of steel cables. Each cable is 3.0 cm in diameter
and is kept under a tension of 1.0 × 10⁴ N. The density of
steel is 7.8 g/cm³. With what speed would transverse waves
propagate along the cables?

12-26.  Railroad  crossing.  A train  sounds  its  whistle  as  it 
approaches  and  leaves  a railroad  crossing.  An  observer  at 
the  crossing  measures  a frequency  of  219  Hz  as  the  train 
approaches  and  a frequency  of  184  Hz  as  the  train  leaves. 
The speed of sound is 340 m/s. Find the speed of the train
and the frequency of its whistle.

12-27.  On  wings  of  song.  A hawk  is  flying  directly  away 
from  a birdwatcher  and  directly  toward  a distant  cliff  at  a 
speed  of  15  m/s.  The  hawk  produces  a shrill  cry  whose 
frequency  is  800  Hz. 

a. What is the frequency of the sound that the birdwatcher
hears directly from the bird?

b. What is the frequency that the birdwatcher hears
in the echo that is reflected from the cliff?




Fig.  12E-32 


12-28. On the track. As shown in Fig. 12E-28, an observer
P is standing between two parallel train tracks when
two trains approach from opposite directions. Locomotive
A has a speed |v_A| = 15 m/s. It toots its whistle, which has a
frequency ν_0 = 200 Hz. Locomotive B has a speed |v_B| = 30
m/s. The speed of sound in the air is 340 m/s, and no
breeze is blowing.

Fig. 12E-28

a. Find the wavelength λ_1 and frequency ν_1 of the
sound waves observer P receives from locomotive A.

b. What frequency ν_2 is heard by the engineer on
locomotive B?

Some of the sound waves reaching locomotive B are
reflected back toward observer P and locomotive A.

c. Find the wavelength λ_3 and the frequency ν_3 of the
reflected sound waves that observer P hears.

d. What frequency ν_4 does the engineer on locomotive
A hear in the reflected waves?

12-29.  Songs  of  the  whales.  Several  whale  species  are 
apparently  able  to  communicate  over  distances  of  up  to  50 
km  or  more.  The  speed  of  sound  in  water  is  1400  m/s. 

a.  If  two  whales  located  50  km  apart  begin  to  com- 
municate, what  is  the  minimum  time  lag  experienced  by 
each  whale  between  the  emission  of  its  call  and  its  re- 
ception of  a response? 

b.  The  maximum  swimming  speed  for  a whale  is 
about  8 m/s.  Suppose  that  one  whale  emits  a sound  of  fre- 
quency precisely  100  Hz.  Find  the  maximum  and  mini- 
mum possible  frequencies  heard  by  another  whale. 

12-30. Moving medium. A wave source of frequency ν
and an observer are located a fixed distance apart. Both
the source and the observer are stationary. However, the
propagation medium (through which the waves travel at
speed v) is moving at a uniform velocity v_m in an arbitrary
direction. Find the frequency ν' received by the observer.
Explain your result physically.

12-31.  A traveling  wave,  II.  Consider  once  again  the 
sinusoidal  traveling  wave  described  in  Exercise  12-9. 

a.  Find  the  frequency  of  the  wave. 

b. Suppose that no information is given about the
direction of propagation or the wavelength. Show that for
each direction of propagation, there are an infinite
number of possible wavelengths. Determine those wavelengths
and the corresponding wave speeds. [Hint: cos φ =
cos(φ + 2nπ), where n is an integer.]

12-32. Rope trick. A loop of rope is whirled at a high
angular velocity, ω, so that it becomes a taut circle of
radius R. A kink develops in the whirling rope; see Fig.
12E-32.

a. Show that the tension in the rope is F = μω²R²,
where μ is the linear density of the rope.

b. Under what conditions does the kink remain stationary
relative to an observer on the ground?

Group  C 

12-33. The wave equation for a moving observer, I. The
wave equation for a stretched string can be written

$$\frac{\partial^2 f(x, t)}{\partial x^2} = \frac{1}{v^2}\,\frac{\partial^2 f(x, t)}{\partial t^2}$$

where v = ±√(F/μ) is the velocity of waves measured by
an observer O in an inertial reference frame motionless
with respect to the string. An observer O' is stationed in an
inertial reference frame moving at a constant velocity V
with respect to the frame of observer O. The position and
time variables used by O' are x' and t'. Their relations to
the variables x and t used by O are given by the Galilean
equations for transforming position and time, x' = x − Vt
and t' = t. Use these relations to show that when transformed
into the variables x' and t' the wave equation becomes

$$\left(1 - \frac{V^2}{v^2}\right)\frac{\partial^2 f(x', t')}{\partial x'^2}
= \frac{1}{v^2}\,\frac{\partial^2 f(x', t')}{\partial t'^2}
- \frac{2V}{v^2}\,\frac{\partial^2 f(x', t')}{\partial x'\,\partial t'}$$

(Hint: You will need to make repeated use of the “chain
rules”

$$\frac{\partial g}{\partial x} = \frac{\partial g}{\partial x'}\,\frac{\partial x'}{\partial x} + \frac{\partial g}{\partial t'}\,\frac{\partial t'}{\partial x}$$

and

$$\frac{\partial g}{\partial t} = \frac{\partial g}{\partial x'}\,\frac{\partial x'}{\partial t} + \frac{\partial g}{\partial t'}\,\frac{\partial t'}{\partial t}$$

where g is any function of x and t, or of x' and t'.) An exercise
in Chapter 14 calls for a calculation which is similar
to the one done here, except that the Lorentz equations
for transforming position and time in the relativistic domain
are applied to the wave equation for light propagation.

12-34. The wave equation for a moving observer, II. By
direct substitution into the transformed wave equation
found in Exercise 12-33, prove that the function

$$f(x', t') = A \cos\{k[x' - (v - V)t']\}$$

is a solution of the wave equation. Explain why the form
of this wave function shows immediately that it describes a
wave traveling with a velocity that observer O' measures to
be v − V. How does the Galilean velocity transformation
show immediately that this is just the velocity that you
would expect observer O' to measure? In an exercise of
Chap. 14 a result is obtained that is in striking contrast to
the one obtained here.

12-35. Spherical waves: the implications of geometry.
Modify the arguments leading to Eqs. (12-63), (12-64),
and (12-65) so that they apply to uniform spherical waves.
Show thereby that for such waves the average energy flux,
average energy density, and amplitude obey the relations
⟨S⟩ = ⟨P⟩/4πr², ⟨ρ_E⟩ = ⟨P⟩/4πvr², and A ∝ 1/r.

12-36. Energy in a triangular wave pulse. A triangular
transverse wave pulse of amplitude A and wavelength λ =
2l travels in the positive x direction along a string with
tension F and linear density μ. It is described by the wave
function y = f(x, t), where f(x, t) = A(1 − |x − vt|/l) for
|x − vt| < l and f(x, t) = 0 otherwise. Here v = (F/μ)^(1/2).

a. Find the total energy density ρ_E(x, t) in this wave
pulse. Express your result in terms of F, μ, A, and l.

b. Integrate the expression found in part a over all
values of x to show that the total energy E carried by the
traveling wave pulse is 2FA²/l.

c. Use the result of a to determine the energy flux
S(x, t) = ρ_E(x, t) v.

d. For any fixed value of x, integrate S(x, t) from t =
−∞ to t = +∞ in order to verify that the wave pulse
carries (past each point x) the total energy E found in
part b.

12-37. Wave energy in a nylon cord. Use the results of
Exercise 12-36 to evaluate the energy density ρ_E, the energy
flux S, and the total energy E carried by a triangular
wave with amplitude A = 1.0 cm and overall length 2l =
20 cm traveling along the nylon cord described in Exercise
12-12. Express your results in SI units.

12-38. Wave energy in a steel cable. Use the results of
Exercise 12-36 to evaluate the energy density ρ_E, the energy
flux S, and the total energy E carried by a triangular
wave with amplitude A = 0.10 cm and overall length 2l =
20 cm traveling along the steel cable described in Exercise
12-25. Express your results in SI units.

12-39. A second flight for the bumblebee? The following
quotation is due to British physicist Lord Rayleigh [The
Theory of Sound, 2d ed. (1896; reprinted by Dover, New
York, 1945), 2:154]:

“The pitch of a sound is liable to modification when
the source and the recipient are in relative motion ... if
v be the velocity of the observer and a that of sound, the
frequency is altered in the ratio (a ± v):a, according as
the motion is towards or from the source. [If the motion is
from the source and] we could suppose v to be greater
than a, a sound produced after the motion had begun
would never reach the observer, but sounds previously excited
would be gradually overtaken and heard in the reverse
of the natural order. If v = 2a, the observer would
hear a musical piece in correct time and tune, but backwards.”


a.  Carefully  explain  why  Lord  Rayleigh’s  statement  is 
true  in  principle. 

b. At the conclusion of an outdoor performance of
Rimsky-Korsakov’s The Flight of the Bumblebee, a fanatical
music lover (having read Lord Rayleigh's statement)
speeds away at twice the speed of sound! How far from
the bandstand will he be by the time he has heard the complete
backward performance? (A typical performance
time for the composition is 80 s.)


12-40. Doppler effect in two dimensions: moving source.
Figure 12E-40 shows a wave source S of intrinsic frequency
ν passing an observer O. Both the observer and the propagation
medium are at rest. The source is moving with constant
velocity v_s. At the instant shown, the source is located
at r_s = r_s r̂_s with respect to O. The wave speed in the
medium is v.

Fig. 12E-40

a. Show that when the waves emitted by S at the instant
shown are later received at O, the received frequency
ν' will be given by

$$\nu' = \nu\,\frac{v}{v + \mathbf{v}_s \cdot \hat{\mathbf{r}}_s}$$

b. Show that the result of part a reduces to Eq.
(12-71) in the particular case that the source moves along
a line through the observer.

12-41. Doppler effect in two dimensions: moving observer.
Figure 12E-41 shows a wave source S of intrinsic frequency
ν. It is immersed in a medium in which the wave speed is v.
Both the source and the medium are at rest. An observer
O moves past the source with constant velocity v_o. At the
instant shown, the observer is located at r_o = r_o r̂_o with
respect to the source.

Fig. 12E-41

a. Show that the frequency ν' of the wave being received
by O at the instant shown is given by

$$\nu' = \nu\,\frac{v - \mathbf{v}_o \cdot \hat{\mathbf{r}}_o}{v}$$

b. Show that the result of part a reduces to Eq.
(12-72) when the observer moves along a line through the
source.

12-42. Waves from a supersonic source.

a. Modify Fig. 12-20a so that it pertains to a case in
which the speed |v_s| of the source through the propagation
medium is greater than the speed |v| of the waves with
respect to the medium, say twice as large.

b. Use the modified figure to show that all the waves
emanating from the source are tangent to a cone of apex
angle 2α, whose apex moves along with the source (assumed
to be a point source). This conical envelope is the
“shock wave” produced, for example, by a supersonic airplane.
It is called the Mach cone after the Austrian philosopher
and physicist Ernst Mach (1838-1916), who was the
first to obtain photographs of the shock waves of supersonic
projectiles. The half-angle α is called the Mach
angle.

c. Find an equation which gives the Mach angle in
terms of the ratio M = |v_s|/|v| of the two speeds. This ratio
is the famous Mach number.




13 

Superposition  of 
Mechanical  Waves 


13-1 SUPERPOSITION OF WAVES

When a wave pulse or a wave train is passing through a certain region of
space, it does not exclude other waves from that region. That is, as many
waves as you like can pass simultaneously through the same location in a
medium. This property of waves is diametrically opposite to a property we
take for granted in studying the mechanics of rigid bodies, namely, that no
more than one body can be present at a given point in space at a given moment.

This  chapter  is  devoted  to  a study  of  what  happens  when  two  or  more 
mechanical  waves  pass  through  the  same  region  simultaneously.  An  under- 
standing of  many  very  important  physical  phenomena  will  emerge  from 
this  study. 

Everyone  has  seen  the  complex  pattern  which  is  set  up  when  two  or 
more  sets  of  ripples,  generated  on  the  surface  of  water,  interpenetrate.  A 
naturally  occurring  example  and  a carefully  controlled  laboratory  demon- 
stration of  this  phenomenon  are  illustrated  in  Fig.  13-la  and  b,  respec- 
tively. The  primary  question  suggested  by  these  pictures  can  be  phrased  in 
physical  terms  by  returning  to  the  example  with  which  Chap.  12  began — a 
cork  bobbing  on  the  surface  of  a pond.  The  pond  is  covered  with  a circular 
pattern  of  ripples  radiating  from  the  spot  where  the  pebble  was  dropped. 
If another pebble is dropped into the pond at a different point, a new set of
ripples  will  come  into  being,  centered  on  that  point.  How  will  the  cork  be- 
have under  the  combined  influence  of  the  two  sets  of  circular  ripples?  Or, 
more  generally,  how  will  any  cork,  located  anywhere  on  the  pond,  behave 
when  any  number  of  pebbles  are  dropped  into  the  pond  at  arbitrary  loca- 
tions and  at  arbitrary  times? 

This  physical  question  must  be  rephrased  in  mathematical  terms  so 
that  we  can  undertake  an  analysis.  In  these  terms,  the  question  becomes: 




Fig. 13-1 (a) Raindrops falling on a pond. (Courtesy Elinor S. Beckwith.) (b) A ripple-tank
photo in which two oscillating rods are producing two interpenetrating circular wave trains.
(From PSSC Physics, 2d ed., Boston, D. C. Heath, 1965. Courtesy Education Development Corporation.)


What  is  the  form  of  the  complete  wave  function  produced  by  the  superposi- 
tion of  two  or  more  wave  functions? 

The  answer  to  this  question  is  the  simplest  conceivable  one,  provided 
the  amplitude  of  the  waves  is  not  too  great.  (We  will  see  what  this  restriction 
means  later  in  this  section.)  If  this  condition  is  met,  the  resultant  wave  func- 
tion is  simply  the  algebraic  sum  of  the  individual  wave  functions.  That  is,  at  any 
particular  location  and  at  any  particular  moment,  the  displacement  pro- 
duced by  the  resultant  wave  is  the  algebraic  sum  of  the  displacements 
which  would  be  produced  by  the  individual  waves  if  each  were  acting  sepa- 
rately. 

This  statement  is  true  of  waves  propagating  in  one  dimension  or  in 
three  dimensions,  as  well  as  in  the  two-dimensional  situation  exemplified 
by  the  ripples.  In  the  most  general  case,  n different  wave  trains  propagate 
through  three-dimensional  space.  An  observer  fixed  in  an  inertial  frame 
can  specify  the  location  of  any  point  by  means  of  a set  of  coordinates 
(x, y, z). The jth wave train is described by the wave function f_j(x, y, z, t), where
t denotes the time elapsed since a moment arbitrarily designated t = 0. In
terms  of  these  quantities,  the  rule  for  determining  the  resultant  wave  func- 
tion, stated  in  the  previous  paragraph,  can  be  expressed  mathematically  in 
the  form 

\[
f(x, y, z, t) = f_1(x, y, z, t) + f_2(x, y, z, t) + f_3(x, y, z, t) + \cdots + f_j(x, y, z, t) + \cdots + f_n(x, y, z, t)
= \sum_{j=1}^{n} f_j(x, y, z, t) \tag{13-1}
\]

This  relation  is  called  the  principle  of  superposition.  Note  that  it  is  based 
on  two  related  but  distinct  assumptions: 

1. Each individual wave, described by the wave function f_j(x, y, z, t), is
unaltered by the simultaneous presence of the other waves.

2. The resultant wave, described by the wave function f(x, y, z, t), is the
algebraic sum of the functions f_j(x, y, z, t), as already stated.




The  ultimate  justification  of  the  principle  of  superposition  must  be  in 
experimental  observation. 

One  practical  consequence  of  this  principle  is  quite  familiar.  In  lis- 
tening to  an  orchestra,  your  ears  are  immersed  in  the  resultant  wave  train 
produced  by  the  superposition  of  the  wave  trains  whose  sources  are  many 
instruments.  The  ear  and  the  mind  are  responding  to  a very  complex  wave 
function.  But  it  is  not  at  all  difficult,  with  a little  experience,  to  pick  out  the 
sounds  of  the  individual  instruments.  In  doing  this,  the  ear  and  the  mind 
act  together  as  a “filter”  to  single  out  the  wave  function  typical  of  the  partic- 
ular instrument.  This  is  possible  because  wave  1 (say  that  produced  by  the 
flute)  is  not  altered  by  the  simultaneous  presence  of  wave  2 (that  produced 
by  the  violin)  or  wave  3 (that  produced  by  the  harp). 

In  the  case  of  waves  propagating  on  a string,  the  superposition  princi- 
ple simplifies  to  the  one-dimensional  statement 

\[
f(x, t) = f_1(x, t) + f_2(x, t) + f_3(x, t) + \cdots + f_j(x, t) + \cdots + f_n(x, t)
= \sum_{j=1}^{n} f_j(x, t) \tag{13-2}
\]

Figure  13-2  illustrates  the  superposition  of  two  one-dimensional  wave 
pulses  which  arrive  at  the  same  point  while  moving  in  opposite  directions. 
In Fig. 13-2a, both pulses displace the long spring from its equilibrium position
in the same direction; that is, y has the same sign for both pulses. The
two  pulses,  moving  independently,  come  together,  momentarily  merge 
into  a single  large  pulse,  and  then  separate.  Each  one  continues  on  its  way 
as  though  the  other  had  not  been  there. 

In Fig. 13-2b, the pulses are of equal magnitude but opposite sign. The
addition  process  is  nonetheless  identical  to  the  previous  one,  if  due  atten- 
tion is  paid  to  signs  in  adding  amplitudes.  In  this  case,  the  result  is  a mo- 
mentary “disappearance”  of  the  pulses  as  they  merge. 

A natural  question  is:  At  the  moment  when  the  spring  is  straight,  how  does  it 
“know”  that  it  must  not  remain  at  rest,  but  must  again  distort  as  the  pulses  move 
apart? The answer is that while the displacement f(x, t) is zero for all parts of the
spring at this instant, the transverse velocity ∂f(x, t)/∂t is not zero.
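
The point about the momentarily straight spring can be checked numerically. The Python sketch below is illustrative only: the Gaussian pulse shape, the unit wave speed, and the helper names `pulse`, `f`, and `transverse_velocity` are assumptions made here, not taken from the text. It superposes a positive pulse moving right and an equal negative pulse moving left, then evaluates the displacement and a finite-difference estimate of ∂f/∂t at the instant of complete overlap.

```python
import math

WAVE_SPEED = 1.0  # assumed pulse speed, arbitrary units


def pulse(u, width=1.0):
    """Assumed Gaussian pulse shape; any smooth localized shape would do."""
    return math.exp(-(u / width) ** 2)


def f(x, t):
    """Resultant displacement: a positive pulse moving toward +x plus an equal
    negative pulse moving toward -x. Both are centered on x = 0 at t = 0."""
    return pulse(x - WAVE_SPEED * t) - pulse(x + WAVE_SPEED * t)


def transverse_velocity(x, t, dt=1e-6):
    """Central-difference estimate of the partial derivative of f with respect to t."""
    return (f(x, t + dt) - f(x, t - dt)) / (2 * dt)


if __name__ == "__main__":
    # At t = 0 the two pulses overlap completely.
    for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(f"x = {x:+.1f}  f = {f(x, 0.0):+.3f}  "
              f"df/dt = {transverse_velocity(x, 0.0):+.3f}")
```

In this model the displacement vanishes at every x when t = 0, while the transverse velocity is nonzero on either side of the midpoint, so the spring is straight but not at rest.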

While  Fig.  13-2a  and  b illustrates  two  cases  of  the  same  phenomenon, 
their  appearance  is  so  different  that  they  are  often  given  the  separate 
names  constructive  interference  and  destructive  interference,  respec- 
tively. In  the  more  general  case  of  repetitive  wave  trains  superposing,  how- 
ever, both  kinds  of  interference  take  place  simultaneously.  Therefore  we 
will  usually  use  the  more  general  term  superposition  in  discussing  the  sub- 
ject, so  as  to  avoid  the  confusion  of  unnecessary  terminology. 

Wave  trains  as  well  as  pulses  can  be  made  to  superpose  so  as  to  demon- 
strate constructive  and  destructive  interference.  Figure  13-3  illustrates  the 
apparatus  called  the  acoustical  interferometer.  It  is  a simple  analogue  of 
the  Michelson  interferometer,  a very  important  instrument  described  in  Sec. 
14-2. A loudspeaker, which emits a pure sinusoidal sound wave of wavelength
λ, is fitted to one end of the two-branched “trombone.” The length
of the trombone, measured via the upper arm, is fixed. In particular, the
length of the path PRQ is fixed. The length of the lower arm PSQ can be
adjusted by moving the close-fitting slide, as suggested by the double-headed
arrow.




Fig. 13-2 (a) Two equal wave pulses of the same sign produce a large net displacement of the
spring at the instant that they move through each other. (b) Two equal pulses of the opposite
sign produce essentially no net displacement at the instant that they move through each other.
(Courtesy Education Development Corporation.)


As  the  slide  is  gradually  pulled  out  so  as  to  lengthen  the  lower  arm,  the 
sound  heard  at  the  horn  fades  and  disappears,  then  increases  to  a max- 
imum, fades  again,  and  so  on,  as  far  as  the  length  of  the  slide  permits  the 
experiment  to  proceed.  These  observations  can  be  explained  in  terms  of 
the  superposition  principle. 

The  incoming  sinusoidal  sound  wave  may  be  described  by  the  wave 
function  of  Eq.  (12-21).  With  a slight  change  in  notation,  this  can  be  written 

\[ f(x, t) = A \cos(kx - \omega t + \delta) \]

If we choose the origin at P, the sound wave arriving there is described by
the space-independent wave function

\[ f(t) = A \cos(-\omega t + \delta) \]




Fig. 13-3 The acoustical interferometer. A pure sinusoidal sound wave produced
by the loudspeaker passes down a tube and splits into two parts at P. The
two parts recombine at Q according to the superposition principle. The loudness
of the sound heard at the horn depends on the relative phase of the
two partial waves at Q, which in turn is determined by the difference in path
length along the two arms PRQ, of length L, and PSQ, of length L + d.


The sound wave divides at P into two parts whose amplitudes are A1 and A2.
The  two  separate  wave  trains  pass  through  the  arms  PRQ  and  PSQ  and  su- 
perpose at  Q.  In  order  to  describe  the  total  wave  function  resulting  from 
this  superposition,  we  must  describe  the  separate  wave  trains  by  the  proper 
wave  functions.  The  two  sinusoidal  waves  may  also  be  represented  by  wave 
functions  of  the  form  of  Eq.  (12-21).  What  is  important  here,  however,  is 
not the horizontal distance x but the distances s1 and s2 measured along the
lengths of the fixed and movable arms, respectively. We choose the origin
for both s1 and s2 at the branch point P. The wave functions can thus be
written

\[ f_1(s_1, t) = A_1 \cos(k s_1 - \omega t + \delta) \]

and

\[ f_2(s_2, t) = A_2 \cos(k s_2 - \omega t + \delta) \]

(We assume for the sake of generality that the amplitudes A1 and A2 of the
two wave trains are not equal.)

Since  the  two  waves  were  obtained  by  “splitting”  the  original  wave  into 
two  parts,  the  space-independent  wave  functions  describing  the  two  waves 
at  point  P must  add  to  yield  the  space-independent  wave  function  describ- 
ing the  original  wave  at  that  point.  We  thus  have 

\[ f_1(t) + f_2(t) = f(t) \]

This gives

\[ A_1 \cos(-\omega t + \delta) + A_2 \cos(-\omega t + \delta) = A \cos(-\omega t + \delta) \]

or

\[ A_1 + A_2 = A \]

That is, the amplitude of the original space-independent wave function f at
point P is the sum of the amplitudes of the space-independent wave functions
f1 and f2 at the same point. (The equation A1 + A2 = A can have
meaning only at that point and at the symmetrical point Q, since these are
the only locations at which all three wave functions are defined.)

Consider  what  happens  when  both  arms  of  the  apparatus  have  the 
same  length  L.  The  path  length  from  P to  Q measured  along  the  fixed  arm 
PRQ is s1 = L. The path length from P to Q measured along the movable
arm PSQ is s2 = L. Because of the symmetrical geometry in the vicinities of
points P and Q, the two waves combine at Q under the same circumstances
as  those  under  which  they  split  at  P.  Thus  at  Q,  the  two  waves  superpose  to 
yield  the  total  space-independent  wave  function  g(t)  given  by  the  sum 

\[ g(t) = f_1(L, t) + f_2(L, t) = A_1 \cos(kL - \omega t + \delta) + A_2 \cos(kL - \omega t + \delta) \]

or

\[ g(t) = (A_1 + A_2) \cos(kL - \omega t + \delta) = A \cos(kL - \omega t + \delta) \]

Rearranging terms to put together all the constants in the argument of the
cosine gives

\[ g(t) = A \cos[-\omega t + (\delta + kL)] \]

That is, the total space-independent wave function g(t) at Q is identical to
the original wave function at P, f(t) = A cos(−ωt + δ), except for an additional
phase constant kL. The amplitude of g(t), like that of f(t), is equal to




Fig. 13-4 (a) Two sinusoidal waves in phase. One is
represented by the wave function f1(t) = A1 cos(−ωt + δ1)
and the other by the wave function f2(t) = A2 cos(−ωt + δ2).
The amplitudes A1 and A2 may be different, but the angular
frequency ω is the same for both functions and the phase
constants satisfy the equality δ1 = δ2. The two waves superpose
constructively, as can be seen from the dark curve
which represents the function g(t) = f1(t) + f2(t), whose amplitude
is A1 + A2. (b) Two sinusoidal waves out of phase.
The only difference between the wave functions represented
here and those in part a is that the phase constant δ2
is given by δ2 = δ1 + π. The amplitudes A1 and A2, the angular
frequency ω, and the phase constant δ1 are the same as
those in part a. The two waves superpose destructively, as
can be seen from the dark curve which represents the function
g'(t) = f1(t) + f2(t), whose amplitude is |A1 − A2|.


the sum of the amplitudes of f1 and f2. Figure 13-4a depicts the superposition
of the space-independent wave functions at point Q. The two functions
f1 and f2 are said to be in phase.

Now  consider  what  happens  when  the  slide  is  pulled  out  so  that  the 
length  L + d of  the  movable  arm  is  greater  than  the  length  L of  the  fixed  arm 
by one-half the wavelength of the sound wave. That is, we have d = λ/2.
Then the path lengths in the two arms are s1 = L and s2 = L + λ/2, respectively.
At Q, where the two waves superpose, the total space-independent
wave function is given by the sum

\[ g'(t) = f_1(L, t) + f_2(L + \lambda/2, t) \]

or

\[ g'(t) = A_1 \cos(kL - \omega t + \delta) + A_2 \cos[k(L + \lambda/2) - \omega t + \delta] \]

But the wave number k is defined to be k = 2π/λ, so that kλ = 2π and we
have k(L + λ/2) = kL + π. The second term on the right side of the last
displayed equation thus becomes A2 cos(kL − ωt + δ + π). But for any
argument θ, we have cos(θ + π) = −cos θ, so g'(t) can be written

\[ g'(t) = A_1 \cos(kL - \omega t + \delta) - A_2 \cos(kL - \omega t + \delta) \]

or

\[ g'(t) = (A_1 - A_2) \cos[-\omega t + (\delta + kL)] \]

That is, the total space-independent wave function g'(t) has an amplitude
equal to the magnitude of the difference of the amplitudes of f1 and f2. Figure
13-4b depicts the superposition of the space-independent wave functions
at point Q. The two functions f1 and f2 are said to be out of phase.

Further  increase  in  length  of  the  movable  arm,  until  d is  equal  to  the 
wavelength  A.  of  the  sound  wave,  leads  again  to  constructive  superposition 




of waves which are in phase. Can you explain why? As the length of the
movable arm is increased steadily, the two waves which superpose at Q shift
gradually out of, into, and out of phase. The amplitude of the total outgoing
wave therefore varies between A1 + A2 and |A1 − A2|. If it happens that
A1 = A2, the sound will disappear completely when the waves are out of
phase at Q.
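
The variation of loudness with slide position can be tabulated with a short Python sketch. It is a hedged illustration, not part of the text: the function name `resultant_amplitude` and the numerical values are assumptions, and the closed-form amplitude for an arbitrary path difference d is the standard result for adding two sinusoids of equal frequency, which the text evaluates only at d = 0, λ/2, and λ.

```python
import math


def resultant_amplitude(a1, a2, d, wavelength):
    """Amplitude at Q of the sum
    A1 cos(kL - wt + delta) + A2 cos(k(L + d) - wt + delta).

    The closed form used here is the standard expression for adding two
    sinusoids of equal frequency; it is an addition, not an equation from the text.
    """
    k = 2.0 * math.pi / wavelength      # wave number
    phase_difference = k * d            # extra phase picked up in the longer arm
    squared = a1**2 + a2**2 + 2.0 * a1 * a2 * math.cos(phase_difference)
    return math.sqrt(max(squared, 0.0))  # guard against tiny negative rounding


if __name__ == "__main__":
    wavelength = 1.0       # assumed value, arbitrary units
    a1, a2 = 0.6, 0.4      # assumed unequal partial amplitudes
    for step in range(9):
        d = step * wavelength / 8.0
        amplitude = resultant_amplitude(a1, a2, d, wavelength)
        print(f"d = {d:.3f}  amplitude = {amplitude:.3f}")
```

With these assumed values the amplitude runs from A1 + A2 at d = 0 down to |A1 − A2| at d = λ/2 and back to A1 + A2 at d = λ, matching the two limiting cases worked out above.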

The  superposition  principle  can  be  justified  mathematically,  as  well  as  on  the 
physical  grounds  discussed  earlier  in  this  section.  Consider  two  wave  functions 
f1(x, t) and f2(x, t), each of which describes a possible wave propagating down a
stretched string of linear density μ in which there is a tension F. If the function
f1(x, t) represents a possible wave, then it must be a solution to Eq. (12-26), the
wave equation for a stretched string. So we have

\[ \frac{\partial^2 f_1(x, t)}{\partial x^2} = \frac{\mu}{F} \frac{\partial^2 f_1(x, t)}{\partial t^2} \tag{13-3} \]

Similarly,  f2(x,  t)  must  also  be  a solution  to  the  same  wave  equation,  so  that  we 
have 


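\[ \frac{\partial^2 f_2(x, t)}{\partial x^2} = \frac{\mu}{F} \frac{\partial^2 f_2(x, t)}{\partial t^2} \tag{13-4} \]

What must be proved is that the form for f(x, t) given by Eq. (13-2) is also a solution
to the wave equation for the stretched string. That is, we must prove that the
equation

\[ \frac{\partial^2 f(x, t)}{\partial x^2} = \frac{\mu}{F} \frac{\partial^2 f(x, t)}{\partial t^2} \tag{13-5} \]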

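is valid when f(x, t) is given by

\[ f(x, t) = f_1(x, t) + f_2(x, t) \tag{13-6} \]

To do so, we substitute this expression for f(x, t) into the equation it is supposed
to solve, Eq. (13-5). We obtain

\[ \frac{\partial^2}{\partial x^2} \left[ f_1(x, t) + f_2(x, t) \right] = \frac{\mu}{F} \frac{\partial^2}{\partial t^2} \left[ f_1(x, t) + f_2(x, t) \right] \]

Evaluating the derivatives of the sums on both sides of the last equation gives us

\[ \frac{\partial^2 f_1(x, t)}{\partial x^2} + \frac{\partial^2 f_2(x, t)}{\partial x^2} = \frac{\mu}{F} \frac{\partial^2 f_1(x, t)}{\partial t^2} + \frac{\mu}{F} \frac{\partial^2 f_2(x, t)}{\partial t^2} \]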

If  this  equation  is  satisfied,  then  Eq.  (13-6)  is  a solution  to  Eq.  (13-5).  Applying  Eqs. 
(13-3)  and  (13-4)  shows  immediately  that  it  is  satisfied,  and  so  we  have  justified 
Eq.  (13-6),  the  superposition  principle. 

The  key  feature  used  in  our  proof  is  the  linearity  of  a derivative.  That  is,  the 
derivative  of  the  sum  of  two  functions  is  equal  to  the  sum  of  the  derivatives  of  the 
functions.  In  different  words,  a derivative  is  linear  because  it  is  proportional  to  the 
first  power  of  the  dependent  variable,  which  is  the  function  being  differentiated. 
Because  each  term  in  the  wave  equation  is  linear,  it  is  said  to  be  a linear  partial  dif- 
ferential equation.  The  superposition  principle  is  a necessary  and  sufficient  conse- 
quence of  the  linearity  of  the  wave  equation. 

When  does  an  actual  physical  system  obey  the  superposition  principle?  To  do 
so,  it  must  be  correctly  described  by  a linear  wave  equation.  This  in  turn  will  be 
true  only  if  the  system  is  itself  linear — that  is,  if  it  obeys  Hooke’s  law  in  the  gener- 
alized sense  discussed  in  criterion  1 toward  the  beginning  of  Sec.  12-6. 

Hooke’s  law  must  ultimately  fail  to  describe  any  real  medium  if  the  imposed 
distortions  become  too  large.  Waves  will  still  propagate  through  the  medium,  but 




the  superposition  principle  will  not  be  obeyed.  In  this  case,  the  behavior  of  the 
waves  is  much  more  complicated,  since  each  wave  train  interacts  with  and  is  af- 
fected by  the  other  wave  trains  which  simultaneously  pass  through  the  same 
region  of  space.  Such  waves  are  called  nonlinear.  We  discuss  nonlinearity  only 
qualitatively  and  briefly  in  connection  with  the  behavior  of  the  ear  in  Sec.  13-7. 

Example  13-1  applies  the  superposition  principle  to  the  relatively 
simple  case  of  two  one-dimensional  sinusoidal  waves. 


EXAMPLE  13-1 

Show that the superposition principle holds true for the special case of two sinusoidal
waves having the same wave number k and angular frequency ω and traveling
along a stretched string in opposite directions. One of them has amplitude A and is
moving in the positive x direction. It is described by the wave function

\[ f_1(x, t) = A \cos(kx - \omega t) \]

This is Eq. (12-21) for the case of motion in the positive x direction, with the phase
constant δ set equal to zero for simplicity. The other wave has amplitude B and is
moving in the negative x direction. It is described by the wave function

\[ f_2(x, t) = B \cos(kx + \omega t) \]

This is Eq. (12-21) for the case of motion in the negative x direction, with the phase
constant δ again set equal to zero.

■ You have already seen in Example 12-4 that the first of these two functions
(there written in terms of the wavelength λ and period T instead of k and ω) satisfies
the wave equation. You use much the same procedure as in Example 12-4 to show
that the sum

\[ f(x, t) = A \cos(kx - \omega t) + B \cos(kx + \omega t) \tag{13-7} \]

also  satisfies  the  wave  equation.  First  you  evaluate  the  necessary  partial  derivatives, 
obtaining 


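\[ \frac{\partial f(x, t)}{\partial x} = -kA \sin(kx - \omega t) - kB \sin(kx + \omega t) \]

\[ \frac{\partial^2 f(x, t)}{\partial x^2} = -k^2 A \cos(kx - \omega t) - k^2 B \cos(kx + \omega t) \]

and

\[ \frac{\partial f(x, t)}{\partial t} = -(-\omega) A \sin(kx - \omega t) - \omega B \sin(kx + \omega t) \]

\[ \frac{\partial^2 f(x, t)}{\partial t^2} = -\omega^2 A \cos(kx - \omega t) - \omega^2 B \cos(kx + \omega t) \]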


Inserting  the  values  of  the  second  partial  derivatives  into  Eq.  (13-5), 

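\[ \frac{\partial^2 f(x, t)}{\partial x^2} = \frac{\mu}{F} \frac{\partial^2 f(x, t)}{\partial t^2} \]

you have

\[ -k^2 \left[ A \cos(kx - \omega t) + B \cos(kx + \omega t) \right] = -\frac{\mu}{F} \omega^2 \left[ A \cos(kx - \omega t) + B \cos(kx + \omega t) \right] \]

This equation is satisfied if k² = μω²/F, or

\[ \frac{\omega}{k} = \sqrt{\frac{F}{\mu}} \]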




But according to the definitions of ω and k given by Eqs. (12-19) and (12-20), you
have

\[ \frac{\omega}{k} = \frac{2\pi\nu}{2\pi/\lambda} = \nu\lambda \]

where ν is the frequency of the wave. You know from Eq. (12-11b) that νλ = |v|, the
speed of the wave. Furthermore, Eq. (12-35b) shows that |v| = √(F/μ). Consequently,
ω/k is indeed equal to √(F/μ), and you have proved that the wave function
f(x, t) given by Eq. (13-7) does satisfy the wave equation, Eq. (13-5). Therefore
the superposition principle is obeyed by the particular functions f1(x, t) =
A cos(kx − ωt) and f2(x, t) = B cos(kx + ωt).
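
The result of Example 13-1 can also be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' procedure: it replaces the analytic derivatives with central finite differences, and the helper names and all numerical values (A, B, k, F, μ, and the sample point) are hypothetical choices. It evaluates the difference between the two sides of Eq. (13-5) for the function of Eq. (13-7), once with ω/k = √(F/μ) and once with a deliberately wrong ω.

```python
import math


def f(x, t, A, B, k, omega):
    """Eq. (13-7): sum of two oppositely traveling sinusoidal waves."""
    return A * math.cos(k * x - omega * t) + B * math.cos(k * x + omega * t)


def second_derivative(g, u, h=1e-4):
    """Central-difference estimate of the second derivative of g at u."""
    return (g(u + h) - 2.0 * g(u) + g(u - h)) / h**2


def wave_equation_residual(x, t, A, B, k, omega, F, mu):
    """Left side minus right side of Eq. (13-5) for the function f above."""
    d2f_dx2 = second_derivative(lambda u: f(u, t, A, B, k, omega), x)
    d2f_dt2 = second_derivative(lambda u: f(x, u, A, B, k, omega), t)
    return d2f_dx2 - (mu / F) * d2f_dt2


if __name__ == "__main__":
    # All numerical values are assumed for illustration.
    A, B, k = 1.0, 0.5, 2.0
    F, mu = 10.0, 0.1
    omega = k * math.sqrt(F / mu)   # the condition omega/k = sqrt(F/mu)
    print("matched omega:   ", wave_equation_residual(0.3, 0.7, A, B, k, omega, F, mu))
    print("mismatched omega:", wave_equation_residual(0.3, 0.7, A, B, k, 2.0 * omega, F, mu))
```

The residual is small, limited by the finite-difference error, only when ω and k satisfy the condition found in the example. A mismatched ω leaves a residual comparable in size to the individual terms.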


13-2 REFLECTION OF WAVES

When a pebble is dropped into a small pond, the wave pattern appears
quite simple at first, as an array consisting of a number of concentric circular
ripples moves outward from the original disturbance. As soon as the
first  ripples  strike  the  bank,  the  pattern  becomes  more  complicated  since 
ripples  are  reflected.  After  a while  the  pattern  appears  quite  confused  as  a 
result  of  multiple  reflections.  Figure  13-5  shows  this  effect  in  its  simplest 
form  in  a ripple  tank.  A short-term  disturbance  has  produced  a sequence 
of  a few  circular  ripples,  which  are  being  reflected  by  the  plane  surface  at 
the  right.  Many  and  varied  physical  phenomena  have  their  bases  in  the  re- 
flection of  waves.  A few  examples  are  the  vibration  of  a violin  string,  the 
possibility  of  radio  communication  between  distant  stations  on  the  earth, 
and  the  operation  of  many  optical  instruments.  We  will  now  discuss  briefly 
the  interaction  of  waves  with  boundaries. 

First  consider  the  analogous  situation  in  particle  motion.  When  a par- 
ticle strikes  a barrier,  what  happens  depends  on  the  details  of  the  collision. 
If  the  barrier  and  the  particle  are  perfectly  elastic,  the  particle  bounces  off 
without  loss  of  mechanical  energy. 

Waves  are  reflected  without  loss  of  energy  when  the  medium  through 
which  they  propagate  terminates  abruptly  at  a barrier  which  cannot  move. 
A three-dimensional  example  is  a gas  confined  in  a closed  container,  the 
walls  of  which  will  reflect  sound  waves  produced  inside.  A two-dimensional 


Fig.  13-5  Ripple-tank  circular  wave  re- 
flected by  plane  surface.  (Courtesy  Educa- 
tion Development  Corporation.) 



Fig. 13-6 (a) An idealized experiment in
which a pulse propagates through a long
string subjected to a tension F. The left
end of the string is not shown. The right
end is rigidly fixed. (b) A physical realization
of the idealized experiment of part a,
using a spring instead of a string. Note the
inversion of the pulse upon reflection from
the fixed end of the spring. (From PSSC
Physics, 2d ed., Boston, D. C. Heath, 1965.
Courtesy Education Development Corporation.)


example  (shown  in  Fig.  13-5)  is  the  surface  of  a pool  of  water  in  a container 
with  vertical  sides.  The  one-dimensional  example  on  which  we  will  concen- 
trate is  a stretched  string. 



We begin with the case shown in ideal fashion in Fig. 13-6a, where
the  right  end  of  the  string  is  tied  down  so  that  it  cannot  move.  (This  is  not 
the  same  as,  but  is  more  stringent  than,  the  condition  that  the  barrier 
cannot  move.  In  the  case  of  water  surface  waves,  for  example,  the  sides  of 
the  container  are  rigid,  but  the  water  in  contact  with  them  can  oscillate 
freely  up  and  down.  We  will  consider  the  analogous  case  for  a stretched 
string  later.) 

Figure 13-6b is a series of photos of a pulse traveling through a
stretched  spring.  (This  spring  behaves  in  essentially  the  same  way  as  a 
stretched  string.  But  it  is  easier  to  photograph  because  its  flexibility  makes 




practical  a smaller  tension;  hence  a large  pulse  amplitude  is  attainable  and 
also  the  pulse  travels  relatively  slowly.)  The  right  end  of  the  spring  is  rigidly 
fixed.  A positive  pulse  — a pulse  involving  displacements  only  in  the  posi- 
tive y direction — moves  into  the  picture  from  the  left.  It  reaches  the  fixed 
end  and  is  reflected  as  a negative  pulse — a pulse  involving  displacements 
only  in  the  negative  y direction  — moving  toward  the  left. 

To  see  why  this  happens,  compare  the  force  exerted  on  an  arbitrary 
element  of  the  string  by  the  element  immediately  to  its  right  with  the  force 
exerted  on  the  element  at  the  extreme  right  end  of  the  string  by  the  object 
to  which  it  is  rigidly  fixed.  The  arbitrary  element  experiences  at  its  right 
end  a force  having  a downward  component  during  the  first  half  of  the 
pulse.  (It  nevertheless  moves  upward  because  it  experiences  a force  at  its 
left  end  having  an  upward  component  of  greater  magnitude.)  When  the 
peak  of  the  pulse  reaches  the  arbitrary  element,  the  force  on  its  right  end  is 
horizontal  and  has  no  vertical  component.  Subsequently,  the  force  on  the 
right end has an upward component, whose magnitude first increases and
then  decreases  to  zero  as  the  pulse  passes.  The  time  average  of  the  force 
over  the  entire  pulse  is  zero,  and  the  element  is  finally  at  rest  in  its  undis- 
turbed position. 

But the element at the extreme right end of the spring experiences a
downward  force  throughout  the  entire  time  during  which  the  arriving 
pulse  passes  it.  The  result  is  a net  downward  acceleration  which  the  ele- 
ment transmits  to  its  neighbor  to  the  left,  leading  to  a negative  pulse.  The 
sixth photo of Fig. 13-6b shows the spring at the instant when the peak of
the pulse has arrived at the right end of the spring. The first half of the
nearly  symmetric  pulse  has  already  been  reflected  to  the  left  as  a negative 
pulse.  It  has  superposed  with  the  still-arriving  second  half  of  the  original 
pulse,  leading  to  essentially  zero  displacement  of  the  spring.  But  the  spring 
is  moving  downward,  as  can  be  seen  in  subsequent  photos. 


It  is  a help  to  understanding  what  happens  on  reflection  to  mimic  the  process 
in  the  following  way.  Imagine  that  the  spring,  instead  of  terminating  at  the  point 
where  it  is  fixed,  extends  indefinitely  to  the  right.  And  imagine  that  a negative 
“ghost  pulse,”  or  virtual  pulse,  identical  to  the  actual  pulse  but  inverted  and 
moving  to  the  left  along  the  imaginary  extension  of  the  spring,  arrives  at  the  point 
where  the  spring  actually  ends  at  the  same  time  as  the  actual  pulse.  The  two 
pulses will then superpose on the actual spring, just as is shown in the photos of
Fig.  13-6b.  Just  as  in  the  actual  situation,  the  endpoint  (which  is  not  fixed  in  the 
situation  we  are  imagining)  will  never  move.  The  actual  pulse  continues  to  the 
right,  becoming  a virtual  pulse.  And  the  negative  virtual  pulse,  moving  to  the  left, 
becomes  the  actual  reflected  pulse. 
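
The virtual-pulse picture lends itself to a short numerical sketch. Everything specific in the Python code below is an assumption made for illustration: the Gaussian pulse shape, the constants `END`, `WAVE_SPEED`, and `START`, and the function names. The code superposes an actual pulse moving right with an inverted mirror-image pulse moving left and reports the displacement at the endpoint and at a point on the real spring.

```python
import math

END = 0.0          # location of the fixed end; the real spring occupies x <= END
WAVE_SPEED = 1.0   # assumed pulse speed, arbitrary units
START = -5.0       # assumed center of the actual pulse at t = 0


def pulse(u, width=0.5):
    """Assumed symmetric Gaussian pulse shape."""
    return math.exp(-(u / width) ** 2)


def displacement(x, t):
    """Actual pulse moving right plus inverted virtual pulse moving left."""
    center = START + WAVE_SPEED * t
    actual = pulse(x - center)                   # centered at `center`
    virtual = -pulse(x - (2.0 * END - center))   # mirror image about END, inverted
    return actual + virtual


if __name__ == "__main__":
    for t in (0.0, 2.5, 5.0, 7.5, 10.0):
        print(f"t = {t:4.1f}  y(end) = {displacement(END, t):+.3f}  "
              f"y(x = -5) = {displacement(-5.0, t):+.3f}")
```

In this model the endpoint never moves, and the point at x = −5 is displaced upward as the pulse first passes and downward when the reflected pulse returns, which is the inversion seen in Fig. 13-6b.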


Let  us  now  consider  the  opposite  extreme  case,  in  which  the  right  end 
of  a stretched  string,  instead  of  being  fixed,  is  completely  free  to  move  in 
the  y direction.  In  order  for  there  to  be  tension  in  the  string,  the  end  of  the 
string  must  still  be  constrained  so  that  it  cannot  move  to  the  left  or  to  the 
right.  The  situation  is  shown  in  idealized  form  in  Fig.  13-7a,  where  the 
string  terminates  in  a massless  ring  which  slides  freely  on  a frictionless  post. 
(This  is  the  one-dimensional  analogue  of  the  surface  of  water  confined  in  a 
container  with  vertical  sides.) 

It  is  not  practical  to  attach  a string  to  a ring  sliding  on  a frictionless 
post. However, the situation can be reasonably well approximated by attaching
the end of a stretched spring to a very light thread, as in Fig. 13-7b.



Fig. 13-7 (a) An idealized experiment
in which a pulse propagates
through a long string subjected to a
tension F. The left end of the string
is not shown. The right end, attached
to a massless ring looped over a frictionless
post, is free to move in the
y direction. (b) A physical realization
of the idealized experiment of part a
using a spring whose end is attached
to a very light thread. Note that the
pulse is reflected from the end of the
spring without change in sign, and the
maximum displacement of the end of
the spring is greater than the maximum
displacement of any other part of
the spring. (From PSSC Physics, 2d ed., D.
C. Heath, Boston, 1965. Courtesy Education
Development Corporation.)

The  thread  is  strong  enough  to  maintain  the  rather  small  tension  in  the 
spring.  But  its  mass  is  negligible  compared  to  that  of  the  spring,  and  it 
offers  negligible  inertial  resistance  to  transverse  displacement. 

Just as in Fig. 13-6b, a positive pulse arrives at the end of the spring
from the left. In this case, however, the reflected pulse moving back toward
the left is not inverted, but is identical to the original pulse except for its
direction of motion. Note also that the maximum displacement of the end
of the spring (where it is attached to the thread) is greater than the
maximum displacement of other parts of the spring as the pulse passes through
them.

These observations can again be understood by comparing the end
element of the string to an arbitrarily located element. As noted in Sec.
12-3, the net force exerted in the y direction on an arbitrary element by its
neighboring elements depends on its curvature. It is therefore accelerated
upward during the arrival of the first quarter of the symmetrical positive
pulse, downward during the second and third quarters, and upward
during the fourth quarter. But the force exerted on the end element by
the frictionless post cannot have a y component. It is therefore accelerated
upward as long as the arriving pulse has a negative slope, that is, during the
first half of the pulse. When the peak of the pulse arrives, the end element
has a displacement equal to the pulse amplitude. Subsequently, the end
element is accelerated in the negative y direction. But it has a positive velocity
and therefore overshoots. The symmetry of the situation suggests that the
end element comes to rest when its displacement is equal to twice the
amplitude of the pulse. The slope of the string is now positive, and the end
element is accelerated back toward its undisturbed position. The upward
reaction force it exerts on its neighboring element to the left leads to the
propagation of a positive pulse to the left. This is the reflected pulse.

In  the  preceding  small-print  section,  the  reflection  of  a pulse  at  the  fixed  end 
of  a stretched  string  was  interpreted  in  terms  of  an  inverted  virtual  pulse  traveling 
in  the  direction  opposite  to  that  of  the  actual  pulse  on  an  imaginary  extension  of 
the  string.  The  same  approach  can  be  used  in  considering  the  reflection  of  a pulse 
from a string whose end is free to move in the y direction (the case represented in
Fig.  13-7b).  In  this  case,  however,  the  virtual  pulse  is  not  inverted,  but  has  the 
same  sign  as  the  actual  pulse.  What  will  be  the  maximum  displacement  of  the 
point  on  the  imaginary  string  of  indefinite  length,  at  which  the  two  pulses  come 
together? 
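
The virtual-pulse picture for the two extreme cases can be explored numerically. The short Python sketch below is only an illustration under stated assumptions: the pulse is given a Gaussian shape, and the names `pulse` and `end_displacement` and all numerical values are hypothetical choices, not taken from the text. The end of the actual string is placed at x = 0, the actual pulse approaches from the left, and the virtual pulse approaches from the right with sign −1 (fixed end) or +1 (free end).

```python
import math

def pulse(x, center, amplitude=1.0, width=0.5):
    """Gaussian pulse shape (an assumed shape; any symmetric pulse would do)."""
    return amplitude * math.exp(-((x - center) / width) ** 2)

def end_displacement(t, sign, speed=1.0, start=5.0):
    """Displacement of the end point x = 0 at time t.

    The actual pulse starts centered at x = -start and moves to the right;
    the virtual pulse starts centered at x = +start and moves to the left.
    sign = -1 models a fixed end, sign = +1 models a free end.
    """
    center = -start + speed * t
    return pulse(0.0, center) + sign * pulse(0.0, -center)

times = [i * 0.01 for i in range(1001)]  # t from 0 to 10
fixed_max = max(abs(end_displacement(t, -1)) for t in times)
free_max = max(abs(end_displacement(t, +1)) for t in times)
print(f"fixed end: largest |y| at the end point = {fixed_max:.3f}")
print(f"free end:  largest |y| at the end point = {free_max:.3f}")
```

With the inverted virtual pulse the two contributions cancel at the end point at every instant, while with the uninverted virtual pulse they add there, in agreement with the overshoot argument given above for the free end.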


13-3 STANDING WAVES


Fig. 13-8 A length of stretched rubber tubing is disturbed by a vibrator.
A standing wave is set up in the fundamental mode, for which the wavelength
is twice the length of the tubing. There must be nodes at the two ends, since
they are fixed. There is a single antinode in the center.


The  behavior  of  musical  instruments,  lasers,  electrons  in  atoms,  and  many 
other diverse systems depends on the phenomenon of standing waves. All
the  general  characteristics  of  this  special  class  of  waves  can  be  observed  in  a 
stretched  string,  the  relatively  simple  system  already  studied. 

Like  all  periodic  mechanical  waves,  a standing  wave  involves  a periodic 
oscillation  of  the  elements  comprising  the  medium.  The  elements  of  a 
stretched  string,  for  example,  oscillate  in  a direction  transverse  to  the 
length  of  the  string.  But  the  most  evident  property  of  a standing  wave  is 
that  there  are  certain  fixed  locations,  called  nodes,  where  there  is  never  any 
motion.  This  is  in  contrast  with  the  traveling-wave  case,  where  every  part 
of  the  medium  sooner  or  later  participates  in  the  wave  motion.  But 
standing  waves  have  an  intimate  connection  with  traveling  waves.  In  the 
one-dimensional  case  typified  by  the  string,  for  example,  standing  waves 
can  be  produced  when  traveling  waves  of  equal  amplitude,  frequency,  and 
wavelength  move  in  opposite  directions  along  the  string. 

In  a popular  lecture  demonstration,  a length  of  rubber  tubing  is 
stretched between two supports, as shown in Fig. 13-8. As in Fig. 12-11, the
tension  in  the  tubing  is  determined  by  the  weight  hanging  at  the  left  end.  A 
vibrator,  which  acts  as  the  source,  is  touched  to  the  tubing  at  a point  near 
the  left  end,  and  a wave  immediately  begins  to  travel  to  the  right  along  the 
tubing. When the wave reaches the rigidly anchored right end, it is reflected.
As we noted in Sec. 13-2, this reflection involves a sign inversion of
the  transverse  displacement  and  a reversal  of  the  direction  of  travel,  but 
does  not  change  the  amplitude,  frequency,  or  wavelength  of  the  wave. 




The reflected wave travels back to the left end of the tubing and is
reflected a second time. There is a second sign inversion, so that the net
change in sign of the transverse displacement after two reflections is zero.
The wave then passes the vibrator again. If it happens that the length of
the tubing is exactly half the wavelength of the wave, the twice-reflected
wave will have traveled a total distance of one wavelength when it reaches
the vibrator. It will therefore produce a transverse motion of the tubing at
the vibrator which is exactly in synchronism with the motion which the
vibrator is itself inducing in the tubing. The wave excited by the vibrator has
the same frequency, wavelength, and propagation speed as the wave
already moving down the tubing. Thus the displacement maxima, minima,
and zeros of the two waves coincide everywhere, and the waves are in
phase. This is because the wave functions describing the two waves have in
their arguments the same phase constant $\delta$. Figure 13-4a illustrates two
sinusoidal waves which are in phase.

Since the wave already traveling down the tubing and the wave produced
by the vibrator superpose constructively, there comes to be a wave of
larger  amplitude  moving  to  the  right.  This  process  continues  with  each 
round  trip,  and  a total  wave  is  built  up  composed  of  two  oppositely  moving 
traveling  waves  of  equal  frequency  and  wavelength.  A steady  state  is  soon 
reached  in  which  the  two  traveling  waves  have  the  same  amplitude,  whose 
value  is  such  that  the  rate  at  which  energy  is  lost  to  friction  in  the  rubber 
tubing  and  to  air  resistance  equals  the  rate  at  which  energy  is  supplied  by 
the  vibrator. 


Fig.  13-9  The  standing  wave  set  up 
in  the  tubing  has  a wavelength  equal 
to  the  length  of  the  tubing,  or  half 
the  wavelength  of  the  fundamental 
mode.  There  is  a third  node  at  the 
center  of  the  tubing,  with  two  antinodes 
spaced  evenly  between  the  nodes. 


The  two  oppositely  moving  traveling  waves  constitute  a total  wave 
which  is  a standing  wave.  The  standing  wave  has,  as  it  must,  nodes  at  the 
two immovable ends of the tubing (see Fig. 13-8). The segment of the
tubing located at the center executes transverse oscillations of the greatest
amplitude. This amplitude is the sum of the amplitudes of the two traveling
waves, that is, twice the amplitude of either. The point of greatest
amplitude is called an antinode.

You can see from inspection of Fig. 13-8 that the condition for
producing the standing wave illustrated is that half a wavelength of the
traveling wave must be equal to the length of the tubing. This can be satisfied
most conveniently by adjusting the frequency of the vibrator, which is also
the frequency of the wave the vibrator induces. [For a given linear density
and tension of the tubing, the speed $|v|$ of the traveling wave will be fixed
according to Eq. (12-35b), $|v| = \sqrt{F/\mu}$. Thus Eq. (12-11b), $\nu\lambda = |v|$,
specifies that the product of the frequency $\nu$ and the wavelength $\lambda$ will be
constant. Therefore, adjusting the frequency adjusts the wavelength.] The
critical frequency for producing the standing wave of Fig. 13-8 is called the
fundamental frequency. Since the standing wave is produced by
superposition of two traveling waves moving in opposite directions, its frequency
and wavelength are the same as those of the traveling waves. (This
assertion is justified quantitatively later in this section.)

If the frequency of the vibrator is doubled, the wavelength of the wave
it produces will be halved, and a standing wave will be set up with two
half-wavelengths contained in the length of the tubing. Such a standing
wave has a node at the center of the tubing, in addition to the two
mandatory nodes at its rigidly supported ends. See Fig. 13-9. For a frequency
which is three times the fundamental, a standing wave will be formed as in
Fig. 13-10, with two nodes in addition to those at the ends of the tubing. Thus
it is divided into three equal parts, each half a wavelength long. In general,
the condition for setting up a standing wave is that the frequency be an
integral multiple of the fundamental frequency.

Fig. 13-10 Here the standing wave has a wavelength equal to two-thirds
the length of the tubing, or one-third the wavelength of the fundamental
mode. There are four nodes spaced evenly along the tubing and three
antinodes between them.

For frequencies which are not integral multiples of the fundamental
frequency of the stretched tubing, it is not possible to form a standing
wave. You can look at this from two different points of view. One is that at
all frequencies except those for which an integral number of
half-wavelengths of the standing wave fit exactly into the length of the tubing it
is impossible to have nodes separated by half a wavelength and also to have
nodes at the two ends. The other way to look at the situation is that the
twice-reflected traveling wave passing the vibrator will not produce a
transverse motion in the tubing which is in phase with that induced by the
vibrator unless the frequency is such that an integral number of
half-wavelengths are contained in the length of the tubing. At all other
frequencies, the superposition of the traveling wave being produced by the
vibrator and the multiply reflected traveling waves already in the tubing
sometimes increases the transverse displacements of the total wave and
sometimes decreases them. The net result is confused and complicated but
generally small transverse motions in the tubing.
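
The second point of view can be made semi-quantitative with a rough model. The Python sketch below is an illustration resting on assumptions that are not in the text: each round trip of length 2L is taken to multiply the wave amplitude by a fixed factor `retain` (a stand-in for the losses to friction and air resistance), and the numerical values are merely illustrative. Each round trip adds a phase 2kL, and the two sign inversions at the ends cancel, so the multiply reflected waves passing the vibrator add up as a geometric series.

```python
import cmath
import math

def steady_amplitude(freq, length, speed, retain=0.9):
    """Relative amplitude of the right-moving wave after many round trips.

    Each round trip covers 2*length and adds a phase 2*k*length; the two
    sign inversions at the ends cancel. `retain` is an assumed fraction of
    the amplitude surviving one round trip (a stand-in for friction losses).
    """
    k = 2 * math.pi * freq / speed            # wave number, k = 2*pi/lambda
    round_trip = retain * cmath.exp(-1j * 2 * k * length)
    return abs(1 / (1 - round_trip))          # sum of the geometric series

speed, length = 24.6, 8.00                    # illustrative values, m/s and m
fundamental = speed / (2 * length)
for freq in (fundamental, 1.5 * fundamental, 2 * fundamental, 5.00):
    amp = steady_amplitude(freq, length, speed)
    print(f"{freq:5.2f} Hz -> relative amplitude {amp:5.2f}")
```

In this model the amplitude builds up strongly only when 2kL is an integral multiple of 2π, that is, when an integral number of half-wavelengths fits into the length of the tubing. At other frequencies the successive contributions partly cancel and the motion stays small.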


Let us develop an equation that relates the particular frequencies $\nu$, at
which there can be a standing wave, to the linear density $\mu$, the tension $F$,
and the length $L$ of the stretched rubber tubing. This will relate the
standing-wave frequencies directly to the physical properties of the system
under consideration. The discussion in the preceding paragraph, and Figs.
13-8 and 13-10, show that the condition for a standing wave is

$$n\,\frac{\lambda}{2} = L \tag{13-8a}$$

where $\lambda$ is the wavelength of a wave traveling on the tubing and n is any of
the integers

$$n = 1, 2, 3, \ldots \tag{13-8b}$$

That is, the condition for formation of a standing wave is that an integral
number of half-wavelengths of the standing wave fit exactly into the length
of the tubing.

The frequency of the standing wave is equal to $\nu$, the frequency of the
two traveling waves which superpose to produce it. Equation (12-11b)
connects $\nu$ to the speed of the traveling waves $|v|$ and their wavelength $\lambda$ (which
is also the wavelength of the standing wave) by the relation $\nu = |v|/\lambda$.
Inserting into this relation the value of $|v|$ given by Eq. (12-35b), $|v| = \sqrt{F/\mu}$,
yields

$$\nu = \frac{1}{\lambda}\sqrt{\frac{F}{\mu}}$$

From Eqs. (13-8a) and (13-8b), we have

$$\lambda = \frac{2L}{n} \qquad \text{where } n = 1, 2, 3, \ldots \tag{13-8c}$$

So the equation determining the frequencies for standing waves in a
stretched tubing, or string, is

$$\nu = \frac{n}{2L}\sqrt{\frac{F}{\mu}} \tag{13-9a}$$

where

$$n = 1, 2, 3, \ldots \tag{13-9b}$$


Example  13-2  applies  this  frequency  condition  to  a particular  system 
already  considered  in  two  examples  in  Chap.  12. 


EXAMPLE  13-2 

Predict whether a standing wave will be set up in the apparatus considered in
Examples 12-5 and 12-8 for the vibrator frequency $\nu$ = 5.00 Hz. If not, determine the
nearest higher and lower frequencies for which there will be a standing wave, and
describe the wave in each case.

■ Using the numerical values from Example 12-8 to evaluate the coefficient of n
in Eq. (13-9a), you obtain

$$\frac{1}{2L}\sqrt{\frac{F}{\mu}} = \frac{24.6\ \text{m/s}}{2 \times 8.00\ \text{m}} = 1.54\ \text{Hz}$$

The frequencies for which standing waves will be produced in the tubing, in this
particular case, are therefore given by $\nu = 1.54\,n$ Hz, with n = 1, 2, 3, . . . . You
thus have

$$\nu = 1.54,\ 3.08,\ 4.61,\ 6.15,\ \ldots\ \text{Hz}$$

The vibrator frequency 5.00 Hz will not lead to a standing wave. But if the
frequency is raised to 6.15 Hz, the standing wave corresponding to n = 4 will develop
on the tubing. This wave has 3 nodes located at points on the tubing spaced equally
between the 2 nodes at its ends. The 5 nodes divide the tubing into 4 parts, each
a half-wavelength long. Alternatively, the vibrator frequency could be reduced to
4.61 Hz to produce a standing wave with one less node, corresponding to n = 3.
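
The arithmetic of this example is easily reproduced by machine. The following Python sketch uses only the two numbers quoted in the example (the wave speed 24.6 m/s and the length 8.00 m); the function name `standing_wave_frequencies` and the variable names are hypothetical, chosen for illustration.

```python
def standing_wave_frequencies(speed, length, count):
    """First `count` standing-wave frequencies, nu_n = n * |v| / (2 L)."""
    fundamental = speed / (2 * length)
    return [n * fundamental for n in range(1, count + 1)]

speed = 24.6    # m/s, traveling-wave speed quoted in the example
length = 8.00   # m, length of the tubing quoted in the example
drive = 5.00    # Hz, vibrator frequency

freqs = standing_wave_frequencies(speed, length, 6)
lower = max(f for f in freqs if f <= drive)    # nearest harmonic below
higher = min(f for f in freqs if f >= drive)   # nearest harmonic above

print("harmonics (Hz):", ", ".join(f"{f:.4f}" for f in freqs[:4]))
print(f"nearest lower: n = {freqs.index(lower) + 1}, {lower:.4f} Hz")
print(f"nearest higher: n = {freqs.index(higher) + 1}, {higher:.4f} Hz")
```

The unrounded values differ slightly in the fourth figure from those quoted above, which are rounded to three significant figures.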


We  are  now  ready  to  express  the  general  physical  ideas  underlying 
standing  waves  in  a precise  mathematical  way.  We  do  so  first  for  the  simple 
and  very  important  case  of  sinusoidal  waves.  In  Sec.  13-4,  we  develop  a 
general  treatment  which  is  valid  for  any  wave  function  at  all. 

Suppose  that  a stretched  string  of  length  L is  tied  to  rigid  supports  at 
both  ends,  and  a sinusoidal  wave  train  is  excited  in  it  by  a vibrator  placed 
near  its  left  end,  as  in  Figs.  13-8,  13-9,  and  13-10.  The  wave  train  travels  to 
the  right,  which  we  take  to  be  the  direction  of  positive  x,  the  quantity  x 
being measured from the extreme left end of the string. We have
represented such a sinusoidal wave by means of Eq. (12-21) which, with a slight
change in notation, can be written

$$f(x, t) = A\cos(kx - \omega t + \delta)$$

The minus sign specifies that the wave is traveling in the positive x
direction, as desired. The value of the phase constant $\delta$ is determined by the
displacement of the extreme left end of the string at the particular moment
chosen as t = 0. It will turn out to be convenient to have this displacement
be zero, that is, $f(0, 0) = 0$. (The advantage of making this choice will
become evident shortly.) We thus wish to have

$$f(0, 0) = A\cos\delta = 0$$




One choice of $\delta$ which satisfies this equation is $\delta = -\pi/2$. So we can write
$f(x, t)$ for arbitrary values of x and t in the form

$$f(x, t) = A\cos(kx - \omega t - \pi/2)$$

To save the trouble of carrying along the phase constant $-\pi/2$ in the
calculations which follow, we note that for any argument $\theta$ we can write
$\cos(\theta - \pi/2) = \sin\theta$. So the particular sinusoidal wave function we wish to
use to represent the wave traveling along the string to the right can be
written $f(x, t) = A\sin(kx - \omega t)$. Finally, using the shorthand notation $f_+ =
f(x, t)$ to emphasize the positive direction of motion, we have

$$f_+ = A\sin(kx - \omega t) \tag{13-10a}$$

You  can  satisfy  yourself  that  this  function  is  indeed  a solution  to  the  wave 
equation  by  substituting  it  into  the  general  form  given  by  Eq.  (12-26),  just 
as  was  done  in  Example  12-4  for  a sinusoidal  wave  function  written  in 
terms  of  the  cosine  function. 

When  the  wave  train  comes  to  the  rigid  support  at  the  right  end  of  the 
string,  it  is  reflected  as  it  arrives.  The  reflected  wave  is  a sinusoidal  wave 
train of equal amplitude traveling to the left, that is, in the negative x
direction. Just as in Eq. (12-21), the reversal in direction can be represented in
the wave function by changing the sign of the term $\omega t$ in the argument.
Thus the wave function for the reflected wave is

$$f_- = A\sin(kx + \omega t + \delta) \tag{13-10b}$$

where  the  minus  sign  in  the  subscript  refers  to  the  direction  of  motion. 

Why cannot the phase constant $\delta$ be set immediately to zero in Eq. (13-10b) as
it was in Eq. (13-10a)? We wrote Eq. (13-10a) in such a way as to eliminate the
need for a phase constant by choosing the location x = 0 and the instant t = 0 in a
certain way. There are no more such free choices available, and $f_-$ bears a fixed
phase relationship (not yet known) to $f_+$. This phase relationship is specified by
the phase constant $\delta$.

The complete wave function $f = f(x, t)$ is obtained by superposition. It
is

$$f = f_+ + f_- = A\sin(kx - \omega t) + A\sin(kx + \omega t + \delta) \tag{13-11a}$$

What is the value of the phase constant $\delta$? To find out, we apply the
boundary condition at the left end of the string (x = 0), which is that the string at
that point must always be motionless. When this condition is applied to Eq.
(13-11a), the space-independent equation at x = 0 becomes

$$f(0, t) = A\sin(0 - \omega t) + A\sin(0 + \omega t + \delta) = 0$$

Since $\sin(-\omega t) = -\sin(\omega t)$, we can use the right-hand equality to obtain

$$\sin(\omega t) = \sin(\omega t + \delta)$$

This equation is satisfied if

$$\delta = 0 \tag{13-11b}$$

(It is also satisfied if $\delta = \pm 2\pi, \pm 4\pi, \pm 6\pi, \ldots$. But these values of $\delta$ yield
nothing new. Why?)




The  boundary  condition  used  immediately  above  is  the  simplest  example  of  a 
large  class  of  such  conditions  encountered  in  the  study  of  waves  in  physical 
systems.  In  general,  a boundary  condition  is  a mathematical  expression  of  some 
physical  restriction  placed  on  the  motion  of  a system.  Such  restrictions  are  most 
commonly  imposed  by  the  abrupt  change  in  the  nature  of  the  physical  conditions 
at  the  place  where  the  system  “ends.”  But  other,  more  subtle,  restrictions  do  occur 
in  more  complex  systems,  and  they  are  not  always  imposed  at  an  obvious  physical 
boundary.  They  are  nevertheless  called  boundary  conditions  because  of  their 
mathematical resemblance to the simple boundary condition leading to Eq.
(13-11b). We encounter a variety of boundary conditions in Sec. 13-5 and later in
this  book. 


The restriction on the value of $\delta$ given by Eq. (13-11b) means that the
wave function given by Eq. (13-11a) is too general to describe the wave on
the stretched string of length L unless $\delta = 0$. We apply this restriction so
that Eq. (13-11a) becomes

$$f = A\sin(kx - \omega t) + A\sin(kx + \omega t) \tag{13-12}$$

In order to bring out the physical significance of Eq. (13-12), we
perform some trigonometric manipulation. We use the identity

$$\sin F + \sin G = 2\sin\left[\tfrac{1}{2}(F + G)\right]\cos\left[\tfrac{1}{2}(F - G)\right] \tag{13-13}$$

Setting $F = kx - \omega t$ and $G = kx + \omega t$, we obtain

$$f = 2A\sin(kx)\cos(-\omega t)$$

or, since $\cos(-\omega t) = \cos(\omega t)$,

$$f = 2A\sin(kx)\cos(\omega t) \tag{13-14}$$

Look at this equation with care. We started with two wave trains
traveling in opposite directions. Each wave was described by a wave function
which depended on both space (that is, position) and time through the
argument $kx \pm \omega t$, as would be expected of traveling waves. The total wave
function of Eq. (13-14) also depends on both space and time. But the
dependences are separated, so that the sine function is time-independent and
the cosine function is space-independent. This so-called separation of variables
is a great convenience in understanding the behavior of the wave, as well as
in mathematical manipulations.
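
The equality of the two forms, the sum of oppositely moving traveling waves and the separated product, can be spot-checked numerically. The Python sketch below is only a check on a grid of sample points; the values chosen for A, k, and ω are arbitrary illustrative numbers, and the function names are hypothetical.

```python
import math

def traveling_sum(x, t, amplitude, k, omega):
    """f_+ + f_- with delta = 0: two oppositely moving sinusoidal waves."""
    return (amplitude * math.sin(k * x - omega * t)
            + amplitude * math.sin(k * x + omega * t))

def standing(x, t, amplitude, k, omega):
    """Separated form: 2A sin(kx) cos(omega t)."""
    return 2 * amplitude * math.sin(k * x) * math.cos(omega * t)

A, k, omega = 0.01, 2.5, 40.0   # arbitrary illustrative values
worst = max(
    abs(traveling_sum(x, t, A, k, omega) - standing(x, t, A, k, omega))
    for x in [0.1 * i for i in range(50)]
    for t in [0.01 * j for j in range(50)]
)
print(f"largest difference on the grid: {worst:.2e}")
```

The difference is at the level of floating-point rounding, as the trigonometric identity requires.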


We now apply the boundary condition at the right end of the string
(x = L), which is that the string at that point must always be motionless, just
as is the case at the left end of the string. It is particularly easy to apply this
condition to the wave function in the form of Eq. (13-14). Whatever the
value of $\cos(\omega t)$ may be in that equation, the value of $f$ must be zero at the
point x = L; that is, $f(L, t) = 0$. This can be true only if

$$\sin(kL) = 0$$

To satisfy this equation, we must have

$$kL = n\pi \qquad \text{where } n = 1, 2, 3, \ldots$$

or

$$k = \frac{n\pi}{L} \qquad \text{where } n = 1, 2, 3, \ldots \tag{13-15a}$$



Negative  integral  values  of  n,  and  the  value  n = 0,  also  satisfy  the  equation 
sin(kL)  = 0 mathematically.  However,  they  are  excluded  on  physical  grounds. 
The wave number $k = 2\pi/\lambda$ has as its physical meaning $2\pi$ times the number of
waves per unit length of the string at a given instant, which cannot be negative.
And L is the length of the string, which likewise cannot be negative. Hence the
product $kL$ cannot be negative. As for the value n = 0, this implies that k = 0. But
k = 0 corresponds  to  an  infinite  wavelength — that  is,  no  wave  at  all. 

Equation (13-15a) can also be written in terms of the wavelength $\lambda$.
Again using the definition $k = 2\pi/\lambda$ given by Eq. (12-20), we have
$(2\pi/\lambda)L = n\pi$, or

$$\lambda = \frac{2L}{n} \qquad \text{where } n = 1, 2, 3, \ldots \tag{13-15b}$$

This is identical to Eq. (13-8c), which was derived by means of a more direct
but less general and less quantitative argument. Its physical meaning bears
reiteration: Only those values of $k$ or $\lambda$ given by Eqs. (13-15a) or (13-15b) lead
to standing waves. These are the values for which an integral number n of
half-wavelengths fit exactly into the length of the string. Whether they are propagated
on a string or in a more complicated system, standing waves are invariably
characterized by a restriction analogous to this one.

Consider a particular standing wave on a string. As is always true, the
standing wave is described by the wave function of Eq. (13-14),
$f = 2A\sin(kx)\cos(\omega t)$. But the particular wave is specified by a particular value of
the integer n in Eq. (13-15a), so that the wave function is

$$f(x, t) = 2A\sin\left(\frac{n\pi}{L}x\right)\cos(\omega t) \qquad \text{where } n \text{ is a specific integer}$$

We now define an integer j such that j ranges over the values 0, 1, 2,
3, . . . , n. The locations along the string specified by

$$x = \frac{j}{n}L \qquad \text{where } j = 0, 1, 2, 3, \ldots, n$$

lie at the left end of the string and at fractions of the total length of the
string equal to 1/n, 2/n, 3/n, . . . , 1. Since the entire string contains n
half-wavelengths, these points are thus spaced a half-wavelength apart. At
these special points, the wave function has the value

$$f = 2A\sin\left(\frac{n\pi}{L}\,\frac{j}{n}L\right)\cos(\omega t) = 2A\sin(j\pi)\cos(\omega t) \qquad \text{where } j = 0, 1, 2, 3, \ldots, n$$

But for these values of the argument of the sine function, the function
itself always has the value zero. Thus at these points we always have

$$f = 0$$

regardless of the time t. These points are the nodes. Note that the boundary
conditions require that the ends of the string, x = 0 and x = L, be nodes.
They permit the existence of other nodes. For any particular value of n,
there are n + 1 nodes on the string, including the two at the ends. Each
value of n thus leads to a wave function which describes a particular way in
which the string can vibrate. These particular ways are called
standing-wave modes, or sometimes simply modes for short. The wave function for
a particular mode is given by the equations

$$f = 2A\sin\left(\frac{n\pi x}{L}\right)\cos(\omega t) \qquad \text{where } n \text{ is a positive integer} \tag{13-16a}$$

or by using Eq. (13-15b), $\lambda = 2L/n$,

$$f = 2A\sin\left(2\pi\frac{x}{\lambda}\right)\cos(\omega t) \qquad \text{where } \lambda = \frac{2L}{n} \text{ and } n \text{ is a positive integer} \tag{13-16b}$$
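
The location of the nodes can be confirmed by direct evaluation of the mode wave function. The Python sketch below is a minimal illustration; the mode number, string length, amplitude, and angular frequency are arbitrary assumed values, and the function names `mode` and `node_positions` are hypothetical.

```python
import math

def mode(x, t, n, length, amplitude=1.0, omega=1.0):
    """Standing-wave mode n: f = 2A sin(n*pi*x/L) cos(omega*t)."""
    return 2 * amplitude * math.sin(n * math.pi * x / length) * math.cos(omega * t)

def node_positions(n, length):
    """Nodes at x = (j/n) L for j = 0, 1, ..., n."""
    return [j * length / n for j in range(n + 1)]

n, length = 4, 8.00            # illustrative mode number and string length
nodes = node_positions(n, length)
print("nodes at x =", nodes)
for t in (0.0, 0.3, 1.7):      # arbitrary instants
    largest = max(abs(mode(x, t, n, length)) for x in nodes)
    print(f"t = {t}: largest |f| at a node = {largest:.1e}")
```

There are n + 1 positions in the list, and the displacement evaluated at each of them is zero to within rounding error at every instant sampled.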

Systems more complicated than the vibrating string also possess
standing-wave modes, although they are not usually specified by conditions
as simple as those given by Eqs. (13-15a), (13-15b), (13-16a), and (13-16b).
But in cases where the modes are described by those equations, they are
often given the special name harmonic modes, or (more commonly)
harmonics for short. It was supposedly Pythagoras (in the sixth century B.C.)
who first found the simple numerical relationship among the modes of a
vibrating string and connected them with the pleasing sounds of musical
harmony.

What happens between the nodes when a string is vibrating in a
particular standing-wave mode? If we start at x = 0 and move along the string,
the term $\sin(kx)$ increases gradually until it reaches the value +1, then
decreases through 0 to −1, next increases through 0 to +1 again, and so on.
As time passes, the term $\cos(\omega t)$ swings through all possible values between
+1 and −1, but never passes beyond these limits. As a result, the point on
the string located at some arbitrary value x oscillates between the extreme
displacements $2A\sin(kx)$ and $-2A\sin(kx)$. Since t is the same for all points
on the string at any particular moment, the value of $\cos(\omega t)$ is also the same
for all points on the string, and the entire string oscillates together. All
points on the string reach their extreme displacements at the same
moment, and all pass through zero at the same moment. Figure 13-11 is a
series of “snapshots” of the string, covering a half-cycle; that is, the time
elapsed from the first sketch to the last is $T/2$, where the period T is defined
by $T = 2\pi/\omega$.

The points on the string which undergo the largest excursions from
y = 0 are those for which $\sin(kx) = \pm 1$. They lie midway between the nodes,
and are the antinodes. All points on the string other than the nodes
oscillate between limits determined by the value of $\sin(kx)$ at their locations. The
nature of the motion is suggested by Fig. 13-12. It should be clear why such
waves are called standing waves.
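
A set of numerical "snapshots" makes the same point. The Python sketch below evaluates the mode wave function at the antinodes at five instants spanning a half-period; the mode number n = 3 and the other numerical values are arbitrary assumptions made for illustration.

```python
import math

def mode(x, t, n, length, amplitude, omega):
    """Standing-wave mode n: f = 2A sin(n*pi*x/L) cos(omega*t)."""
    return 2 * amplitude * math.sin(n * math.pi * x / length) * math.cos(omega * t)

n, length, A, omega = 3, 1.0, 1.0, 2 * math.pi   # illustrative values
period = 2 * math.pi / omega
antinodes = [(j + 0.5) * length / n for j in range(n)]   # midway between nodes

for fraction in (0.0, 0.125, 0.25, 0.375, 0.5):
    t = fraction * period
    shape = [mode(x, t, n, length, A, omega) for x in antinodes]
    print(f"t = {fraction:5.3f} T:", " ".join(f"{y:+.3f}" for y in shape))
```

At t = 0 each antinode is displaced by 2A in magnitude, neighboring antinodes being displaced in opposite directions. All of them pass through zero together at a quarter-period, and at a half-period every displacement is reversed.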

Figure  13-13  illustrates  the  first  three  modes  of  oscillation  of  a 
stretched  string.  These  are  the  oscillation  modes  shown  separately  in  Figs. 
13-8,  13-9,  and  13-10. 

So far we have considered the restrictions on the wave number k in the
term $\sin(kx)$ of Eq. (13-14) if there are to be standing waves in a stretched
string. As already noted at the beginning of this section, these restrictions
lead to restrictions on the permissible values of the angular frequency $\omega$ (or
the frequency $\nu = \omega/2\pi$). This is because there is a fixed relation between
$k$ and $\omega$ (or between $\lambda$ and $\nu$) given by Eq. (12-11b) and by the discussion in
Example 13-1. That relation is

$$\frac{\omega}{k} = |v| \qquad \text{or} \qquad \nu\lambda = |v|$$

Fig. 13-11 A sequence of “snapshots” of a stretched string oscillating in a
standing-wave mode. The wave does not move; that is why it is called
“standing.” The time elapsed from the first to the last of the snapshots is a
half-period. The shape of the string is always sinusoidal, but the height of the
sinusoid varies from 2A to −2A, where A is the amplitude of each of the two
sinusoidal traveling waves, moving in opposite directions, which may be
considered to make up the standing wave.


Fig.  13-12  Another  way  of  looking  at  the  motion 
of  a stretched  string  oscillating  in  a standing-wave 
mode.  Each  point  on  the  string  oscillates  up  and 
down  with  harmonic  motion,  as  indicated  by  the 
arrows.  The  amplitude  of  the  oscillation  of  any 
point  depends  on  its  location  along  the  string. 


Fig.  13-13  The  first  three  standing-wave  modes  (n  = 1,  2,  3)  for  a stretched  string.  The 
modes  are  shown  as  having  the  same  amplitude  A.  The  vertical  scale  is  grossly  exaggerated 
for clarity. If a real string could actually oscillate with the amplitude shown, without
breaking, it would almost certainly not constitute a linear system. That is, the restoring force
exerted  on  an  element  of  the  string  would  not  be  proportional  to  its  displacement,  and  the 
superposition  principle  would  not  be  valid. 



where $|v|$ is the speed of propagation of a traveling wave in the same string
(or other medium). The quantity $|v|$, in turn, is determined by the
mechanical properties of the medium. Thus, if the wave number or wavelength of
a standing wave is given, the corresponding harmonic frequency is
determined. This point is explored further in Example 13-3.


EXAMPLE  13-3 

A string is fixed at both ends under tension. One of its harmonic frequencies is 360
Hz, and the next higher harmonic frequency is 420 Hz. What is the fundamental
frequency, that is, the frequency corresponding to n = 1?

■ You can write Eqs. (13-9) in the form

$$\nu = \frac{n}{2L}\sqrt{\frac{F}{\mu}} \qquad \text{where } n = 1, 2, 3, \ldots \tag{13-17}$$

The string tension F, the linear density $\mu$, and the length L are all fixed for the
particular string. Thus the quantity $(1/2L)\sqrt{F/\mu}$ on the right side of the equation is a
constant. Since n is a dimensionless number, the dimensions of that quantity must
be the same as those of the quantity $\nu$ on the left side of the equation, that is, hertz.
(Can you verify this, beginning with the proper units for F, $\mu$, and L?) Indeed, the
quantity $(1/2L)\sqrt{F/\mu}$ is just $\nu_1$, the fundamental frequency. Consequently, Eq.
(13-17) can be written in the form

$$\nu = n\nu_1 \qquad \text{where } n = 1, 2, 3, \ldots$$

From this, you can see that the difference in frequency between two consecutive
harmonics is equal to the fundamental frequency. So you have

$$\nu_1 = 420\ \text{Hz} - 360\ \text{Hz} = 60\ \text{Hz}$$

What would be wrong if the example stated that the consecutive harmonics
were at 360 Hz and 440 Hz?
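
The reasoning of this example can be sketched in a few lines of Python. The helper name `fundamental_from_consecutive` is hypothetical, introduced only for illustration. It takes the spacing between two consecutive harmonics as the fundamental frequency and then checks that the lower harmonic is a whole-number multiple of that spacing, which is the consistency condition the closing question hints at.

```python
from fractions import Fraction


def fundamental_from_consecutive(lower_hz, upper_hz):
    """Return (fundamental frequency, harmonic number of the lower frequency).

    Assumes the two inputs are consecutive harmonics n and n + 1 of a string
    fixed at both ends, so that their difference is the fundamental.
    Raises ValueError if the pair is inconsistent with that assumption.
    """
    spacing = Fraction(upper_hz) - Fraction(lower_hz)
    if spacing <= 0:
        raise ValueError("upper harmonic must exceed lower harmonic")
    n_lower = Fraction(lower_hz) / spacing
    if n_lower.denominator != 1:
        raise ValueError(
            f"{lower_hz} Hz is not a whole multiple of the spacing {float(spacing)} Hz"
        )
    return float(spacing), int(n_lower)


print(fundamental_from_consecutive(360, 420))  # (60.0, 6)
```

For the pair 360 Hz and 440 Hz the spacing would be 80 Hz, and 360/80 = 4.5 is not an integer, so the function raises `ValueError`: those two frequencies cannot be consecutive harmonics of the same string.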


13-4  STANDING-WAVE SOLUTIONS TO THE WAVE EQUATION


Standing waves are encountered in physical situations of many different
kinds. It is therefore important to study them in further depth. For the
simplest case, that of one-dimensional sinusoidal standing waves in a
stretched string, we have discussed the salient features in Sec. 13-3. These
features include the form of the wave function, the permissible wavelengths
or wave numbers, and the permissible frequencies. We obtained
these results by combining several properties of traveling waves: (1) their
superposition properties; (2) their reflection properties; and (3) the relations
among the frequency and wavelength $\nu$ and $\lambda$, the traveling-wave
speed $|v|$, the string tension F, and the mass per unit length $\mu$ of the string.

It is possible, however, to begin with the wave equation for a vibrating
string in its most general form and to obtain directly the form of the wave
function for standing waves, together with the restrictions on permissible
wavelengths and the frequency condition $\nu = (n/2L)\sqrt{F/\mu}$. This will be
done in this section. The general derivation will serve as a check on the results
already obtained. More importantly, it will introduce techniques
which will be used in Sec. 13-5 to study the vibration of a circular drumhead.
It is not possible to obtain results for this more complicated system by
simply applying features 1, 2, and 3 listed in the previous paragraph. The
techniques developed in this section will also find application in the study
of electromagnetic waves and in the quantum-mechanical systems considered
in Chap. 31.




Fig. 13-14 At a particular time t a
wave function f(x, t) satisfies the definition
of a standing wave.


In Fig. 13-14, the x axis lies along the undisturbed string, with the positive
x direction chosen to the right. As before, we choose the origin x = 0 at
the left end of the string, and the right end is at x = L. The transverse displacement
of a segment of the string at a certain location and time is given
by y = f(x, t). In general, the function f(x, t) is a solution to the wave equation
for a stretched string. That is, it is capable of representing any wave in
any part of any string. If it has the particular form $f(x, t) = f(x \pm |v|t)$, it
represents a traveling wave. That is, the function represents a wave whose
characteristic features, such as the locations where f(x, t) is equal to zero,
travel along the string in the positive or the negative direction at speed $|v|$.
A different dependence on the variables x and t is required if f(x, t) is to
represent a standing wave, in other words, a wave in which the values
f(x, t) = 0, and other characteristic features, occur at fixed locations.

Equation (13-14), $f(x, t) = 2A\sin(kx)\cos(\omega t)$, was obtained by considering
the kinematics of standing waves. It suggests that for such waves f(x, t)
is a product of a function of x alone and a function of t alone. In general,
f(x, t) is not necessarily a sinusoidal standing wave like that described by Eq.
(13-14). Nevertheless, there is a strong suggestion that a standing wave can
be represented symbolically in the form

$$f(x, t) = g(x)h(t) \tag{13-18}$$


where g(x) is some function of x and h(t) is some function of t. At all locations
where g(x) happens to have the value zero, the displacement f(x, t) of
the string will be zero for all values of t. Thus the wave will have nodes at
fixed locations. For instance, the form given in Eq. (13-18) makes it possible
for f(x, t) to have the property f(x, t) = 0 at both rigidly supported ends of
the string, no matter what the value of t. All that is required is to have
g(x) = 0 at both ends. These considerations would make it possible to guess
that a standing wave must have the form of Eq. (13-18), even if Eq. (13-14)
were not available to provide a hint.
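
The contrast between the product form and the traveling-wave form can be illustrated numerically. The sketch below is only an illustration under assumed values: the wave number `K`, the angular frequency `OMEGA`, and the sample times are arbitrary choices, not quantities from the text. It evaluates a standing wave of the form of Eq. (13-14) and a traveling wave at a point where sin(kx) = 0, at several instants.

```python
import math

# Assumed, purely illustrative parameters.
K = 2.0 * math.pi       # wave number in rad/m (wavelength of 1 m)
OMEGA = 4.0 * math.pi   # angular frequency in rad/s (period of 0.5 s)


def standing(x, t, amplitude=1.0):
    """Product form: a function of x alone times a function of t alone."""
    return 2.0 * amplitude * math.sin(K * x) * math.cos(OMEGA * t)


def traveling(x, t, amplitude=1.0):
    """Traveling sinusoid moving toward positive x."""
    return amplitude * math.sin(K * x - OMEGA * t)


node_x = 0.5                      # sin(K * 0.5) = sin(pi) = 0
times = [0.0, 0.05, 0.1, 0.2]     # arbitrary sample instants in seconds

standing_at_node = [standing(node_x, t) for t in times]
traveling_at_node = [traveling(node_x, t) for t in times]

print("standing :", [round(value, 6) for value in standing_at_node])
print("traveling:", [round(value, 6) for value in traveling_at_node])
```

The standing-wave displacement stays at zero (to within rounding) at the chosen point for every sample time, while the traveling-wave displacement at the same point changes from instant to instant because its zeros move along the string.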


Let us substitute Eq. (13-18) into the wave equation for a stretched
string and see what happens. To do this, we must evaluate the second partial
derivatives of f(x, t) with respect to space and time, both of which appear
in the wave equation. We have

$$\frac{\partial^2 f(x, t)}{\partial x^2} = \frac{\partial^2}{\partial x^2}\,[g(x)h(t)]$$

Since t is held constant in the partial differentiation, the function h(t) is also
held constant. Therefore we obtain

$$\frac{\partial^2 f(x, t)}{\partial x^2} = h(t)\,\frac{\partial^2 g(x)}{\partial x^2}$$

Since the partial derivative of a function of a single variable, such as g(x), is
not different from an ordinary derivative, there is no need to use
partial-derivative notation on the right side of this equation. Thus we have
$\partial^2 g(x)/\partial x^2 = d^2 g(x)/dx^2$, and

$$\frac{\partial^2 f(x, t)}{\partial x^2} = h(t)\,\frac{d^2 g(x)}{dx^2} \tag{13-19}$$




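Similarly, we find

$$\frac{\partial^2 f(x, t)}{\partial t^2} = \frac{\partial^2}{\partial t^2}\,[g(x)h(t)] = g(x)\,\frac{\partial^2 h(t)}{\partial t^2}$$

or

$$\frac{\partial^2 f(x, t)}{\partial t^2} = g(x)\,\frac{d^2 h(t)}{dt^2} \tag{13-20}$$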


Now  we  substitute  Eqs.  (13-19)  and  (13-20)  into  Eq.  (12-26),  which  is 
the  general  equation  for  waves  on  a stretched  string.  That  equation  is 


$$\frac{\partial^2 f(x, t)}{\partial x^2} = \frac{\mu}{F}\,\frac{\partial^2 f(x, t)}{\partial t^2} \tag{13-21}$$


The substitution yields

$$h(t)\,\frac{d^2 g(x)}{dx^2} = \frac{\mu}{F}\,g(x)\,\frac{d^2 h(t)}{dt^2}$$

Dividing through by g(x)h(t) isolates all functions of x on one side of the
equation and all functions of t on the other:

$$\frac{1}{g(x)}\,\frac{d^2 g(x)}{dx^2} = \frac{\mu}{F}\,\frac{1}{h(t)}\,\frac{d^2 h(t)}{dt^2} \tag{13-22}$$

Equation (13-22) requires very careful analysis. First, we must realize that
the symbol “=” really means that the quantities on the two sides of the
symbol have a common value. It will be useful to show this explicitly by designating
the common value as C and rewriting the equation in the form

$$\frac{1}{g(x)}\,\frac{d^2 g(x)}{dx^2} = C = \frac{\mu}{F}\,\frac{1}{h(t)}\,\frac{d^2 h(t)}{dt^2}$$


Next look at the pair of equations

$$\frac{1}{g(x)}\,\frac{d^2 g(x)}{dx^2} = C \tag{13-23a}$$

and

$$\frac{\mu}{F}\,\frac{1}{h(t)}\,\frac{d^2 h(t)}{dt^2} = C \tag{13-23b}$$


Since the left side of Eq. (13-23a) does not depend on the independent
variable t, it is apparent that C cannot depend on t. And since the left side
of Eq. (13-23b) does not depend on the independent variable x, it is also
apparent that C cannot depend on x. Thus C depends on neither of the two
independent variables, and so it must be a constant. Think about it!


In so doing, you may well ask yourself the following question: If C is a constant,
it follows that the left side of Eq. (13-23a) is not a function of x either; how
can this be? A similar question can be asked about Eq. (13-23b) with respect to the
variable t. We will see very shortly that the answers are simple enough. The function
g(x) on the left side of Eq. (13-23a) will turn out to have the property that
$d^2 g(x)/dx^2$ is proportional to g(x). Consequently, the x dependence of 1/g(x) will
cancel the x dependence of $d^2 g(x)/dx^2$. A similar thing will happen in Eq.
(13-23b).


Transposing terms in Eqs. (13-23a) and (13-23b) yields

$$\frac{d^2 g(x)}{dx^2} = C\,g(x) \tag{13-24a}$$

and

$$\frac{d^2 h(t)}{dt^2} = \frac{F}{\mu}\,C\,h(t) \tag{13-24b}$$


We started with a second-order partial differential equation involving the
two independent variables x and t, namely the wave equation. We have separated
it into two second-order ordinary differential equations, a time-independent
one involving x and a space-independent one involving t. This
was done by writing the solution to the partial differential equation as the
product of the solutions to the ordinary differential equations. This is a significant
accomplishment because we know how to find solutions to the ordinary
differential equations, and we will therefore be able to solve the
partial differential equation. The procedure we have followed is called separation
of variables, and the constant C that arises in carrying out the procedure
is called the separation constant. Separation of variables is one of
the principal methods used for finding analytical solutions to partial differential
equations of many kinds.
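
The key claim of the separation procedure, that each side of the separated equation equals the same constant wherever it is evaluated, can be checked numerically. The following is a minimal sketch under stated assumptions: the tension, linear density, and wave number are arbitrary illustrative values, the trial factors are the sinusoids suggested by Eq. (13-14), and the derivatives are estimated by central differences rather than computed exactly.

```python
import math

# Illustrative values only (assumptions, not taken from the text).
TENSION = 100.0          # F, in newtons
LINEAR_DENSITY = 0.01    # mu, in kg/m
WAVE_NUMBER = 3.0        # k, in rad/m
# Angular frequency tied to k through omega / k = sqrt(F / mu).
ANGULAR_FREQUENCY = WAVE_NUMBER * math.sqrt(TENSION / LINEAR_DENSITY)


def g(x):
    """Space factor of the trial standing wave."""
    return math.sin(WAVE_NUMBER * x)


def h(t):
    """Time factor of the trial standing wave."""
    return math.cos(ANGULAR_FREQUENCY * t)


def second_derivative(func, u, step):
    """Central-difference estimate of the second derivative of func at u."""
    return (func(u + step) - 2.0 * func(u) + func(u - step)) / step**2


def separated_sides(x, t):
    """Return the x-only side and the t-only side of the separated equation."""
    left = second_derivative(g, x, 1e-4) / g(x)
    right = (LINEAR_DENSITY / TENSION) * second_derivative(h, t, 1e-6) / h(t)
    return left, right


# Sample points chosen away from the zeros of g and h to avoid dividing by zero.
for x, t in [(0.2, 0.001), (0.7, 0.004), (1.3, 0.0075)]:
    left, right = separated_sides(x, t)
    print(f"x = {x}, t = {t}: left = {left:.4f}, right = {right:.4f}")
```

For these assumed sinusoidal factors both sides come out close to the same number, −k² = −9 in the chosen units, at every sample point. That common value plays the role of the separation constant C.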

We will have no trouble in solving Eqs. (13-24a) and (13-24b). They
both have the same basic structure as Eq. (6-16), the familiar ordinary differential
equation for the harmonic oscillator, which is $d^2x/dt^2 = -\alpha x$. Different
mathematical symbols (and physical meanings) are attached to the
independent and dependent variables and the constant factors in each of
these equations. But in each case, the second derivative of the dependent
variable is proportional to the dependent variable itself. Equation (6-16)
has a solution of the form given by Eq. (6-17), $x = A\cos(\omega t + \delta)$. In like
manner, every such differential equation has a solution in which the
dependent variable is equal to the product of some constant with a sinusoidal
function. And the argument of the sinusoidal function is the independent
variable multiplied by some other constant. The function can be a
sine, or a cosine, or any other sinusoidal expression. The only distinction
among these three possibilities is the choice of the phase constant $\delta$ whose
value specifies the zero point of the independent variable.

Applying these observations to the time-independent Eq. (13-24a), we
write its solution in the form

$$g(x) = A\sin(kx) \tag{13-25a}$$

where k is a constant whose value has yet to be determined. (It will turn out
to be the wave number.) The solution is taken to be a sine function rather
than a cosine by making the proper choice of the phase constant $\delta$. This is
done because of the boundary condition

$$g(x) = 0 \quad \text{at } x = 0 \tag{13-25b}$$

This condition states that there can be no displacement of the rigidly supported
string at its left end. By choosing the sine function for the required
sinusoidal, we satisfy this condition automatically with the convenient value
$\delta = 0$ for the phase constant.

The constant k multiplying x in Eq. (13-25a) must have the value $k = 2\pi/\lambda$,
where $\lambda$ is the wavelength of the standing wave. The justification is
that the time-independent function g(x) controls the wavelength of the
standing wave. Thus when x increases by one wavelength (by the amount
$\lambda$) the function g(x) must go through one oscillation. This requires that the
argument of the sine function in Eq. (13-25a) increase by $2\pi$ when x increases
by $\lambda$. That is precisely what happens if the argument is

$$kx = 2\pi\,\frac{x}{\lambda}$$

The function g(x) thus assumes the specific form

$$g(x) = A\sin\left(\frac{2\pi x}{\lambda}\right) \tag{13-26}$$

The value of the wavelength $\lambda$ is not yet known. It will be determined later
by applying the proper boundary condition at the other end of the string,
where x = L.

There is one more constant, A, in Eq. (13-26) whose value has not yet
been determined. The quantity A is the amplitude of the wave, and its
value is the magnitude of the maximum displacement of the string from its
undisturbed position.


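Let us verify that Eq. (13-26) actually does satisfy Eq. (13-24a). Differentiating
both sides of Eq. (13-26), we find

$$\frac{dg(x)}{dx} = \frac{2\pi}{\lambda}\,A\cos\left(\frac{2\pi x}{\lambda}\right)$$

Differentiating a second time gives

$$\frac{d^2 g(x)}{dx^2} = -\left(\frac{2\pi}{\lambda}\right)^2 A\sin\left(\frac{2\pi x}{\lambda}\right)$$

Substitution of the values of g(x) and its second derivative into Eq. (13-24a)
produces

$$-\left(\frac{2\pi}{\lambda}\right)^2 A\sin\left(\frac{2\pi x}{\lambda}\right) = C A\sin\left(\frac{2\pi x}{\lambda}\right)$$

This is valid, and therefore Eq. (13-26) is a solution to Eq. (13-24a), provided
that

$$C = -\left(\frac{2\pi}{\lambda}\right)^2 \tag{13-27}$$

Thus we have identified the separation constant C in Eqs. (13-24a) and
(13-24b). It is a negative quantity whose value depends on the wavelength
$\lambda$.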

To determine the value of the wavelength, we apply the boundary condition
at the right end of the string:

$$g(x) = 0 \quad \text{at } x = L \tag{13-28}$$

The condition states that there can be no displacement of the string at that
end either. Using Eq. (13-26), we have

$$A\sin\left(\frac{2\pi L}{\lambda}\right) = 0$$



Since the amplitude A is not zero if there is a standing wave in the string, we
must satisfy this equation by having

$$\sin\left(\frac{2\pi L}{\lambda}\right) = 0$$

This can be achieved only if $2\pi L/\lambda$ has one of the values

$$\frac{2\pi L}{\lambda} = \pi,\ 2\pi,\ 3\pi,\ \ldots \tag{13-29}$$

because $\sin\pi = \sin(2\pi) = \sin(3\pi) = \cdots = 0$. We do not list $2\pi L/\lambda = 0$,
even though $\sin 0 = 0$, because it corresponds to the statement $\lambda = \infty$, and
this means there is no standing wave.

A compact way to write Eq. (13-29) is

$$\frac{2\pi L}{\lambda} = n\pi \qquad \text{where } n = 1, 2, 3, \ldots \tag{13-30a}$$

or

$$n\,\frac{\lambda}{2} = L \qquad \text{where } n = 1, 2, 3, \ldots \tag{13-30b}$$

Compare this condition with Eqs. (13-8a) and (13-8b). The functions

$$g(x) = A\sin\left(\frac{2\pi x}{\lambda}\right)$$

are plotted in Fig. 13-15 for the first three values of the integer n. Each
integer n produces a particular function g(x). But all the functions g(x) satisfy
the boundary conditions g(x) = 0 at x = 0 and x = L, because for each
function an integral number of half-wavelengths just fit into the length of
the string. You can see this directly from Eq. (13-30b).
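
The effect of the second boundary condition can be checked with a short numerical sketch. The string length used here is an arbitrary illustrative value, and the function names are hypothetical. The code lists the wavelengths allowed by Eq. (13-30b), confirms that g(x) of Eq. (13-26) vanishes at both ends for each of them, and shows that a wavelength not on the list fails at x = L.

```python
import math


def allowed_wavelengths(length, n_max):
    """Wavelengths 2L/n for n = 1 .. n_max, from n * (lambda / 2) = L."""
    return [2.0 * length / n for n in range(1, n_max + 1)]


def g(x, wavelength, amplitude=1.0):
    """Time-independent factor g(x) = A sin(2 pi x / lambda)."""
    return amplitude * math.sin(2.0 * math.pi * x / wavelength)


STRING_LENGTH = 0.684   # assumed length in meters, for illustration only

for n, wavelength in enumerate(allowed_wavelengths(STRING_LENGTH, 3), start=1):
    left_end = g(0.0, wavelength)
    right_end = g(STRING_LENGTH, wavelength)
    print(f"n = {n}: wavelength = {wavelength:.4f} m, "
          f"g(0) = {left_end:.1e}, g(L) = {right_end:.1e}")

# A wavelength that is not of the form 2L/n violates the condition at x = L.
forbidden = 1.5 * STRING_LENGTH
print(f"wavelength = 1.5 L: g(L) = {g(STRING_LENGTH, forbidden):.3f}")
```

For the allowed wavelengths the end values are zero to within floating-point rounding. For the wavelength 1.5L the value at x = L is sin(4π/3), about −0.866, so that function cannot describe a string fixed at both ends.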


Now we attack the space-independent differential equation, whose solutions
are the functions h(t) which satisfy Eq. (13-24b):

$$\frac{d^2 h(t)}{dt^2} = \frac{F}{\mu}\,C\,h(t)$$

Fig. 13-15 Plots of the time-independent wave function
$g(x) = A\sin(2\pi x/\lambda)$ with $\lambda = 2L/n$. The
wave functions shown correspond to
the values n = 1, 2, 3. Compare with
Fig. 13-13.


From Eqs. (13-27) and (13-30a) we have

$$C = -\left(\frac{2\pi}{\lambda}\right)^2 = -\left(\frac{n\pi}{L}\right)^2$$

Using this value of C gives us

$$\frac{d^2 h(t)}{dt^2} = -\left(\frac{n\pi}{L}\right)^2 \frac{F}{\mu}\,h(t) \tag{13-31}$$

Again, we have a differential equation which we know to have sinusoidal
solutions. We choose to write them in the form

$$h(t) = \cos(2\pi\nu t) \tag{13-32}$$

Since cos 0 = 1, this is tantamount to choosing the zero of time to be the instant
when the displacements f(x, t) = g(x)h(t) of all the segments of the
string have their extreme values. The constant multiplying t has been
written in terms of the frequency $\nu$. Using the discussion preceding Eq.
(13-26) as a guide, can you justify in detail the choice of the constant $2\pi\nu$ in
Eq. (13-32)? What about the phase constant $\delta$?

We do not multiply the cosine in Eq. (13-32) by an amplitude factor A.
This is because the constant A which is part of the time-independent function
g(x) provides all the adjustability needed in the complete wave function
f(x, t) = g(x)h(t).

To verify Eq. (13-32), and also to determine the frequency $\nu$, we differentiate
once to obtain

$$\frac{dh(t)}{dt} = -2\pi\nu\sin(2\pi\nu t)$$

Differentiating a second time gives

$$\frac{d^2 h(t)}{dt^2} = -(2\pi\nu)^2\cos(2\pi\nu t)$$

Substitution of the values of h(t) and its second derivative into Eq. (13-31)
produces

$$-(2\pi\nu)^2\cos(2\pi\nu t) = -\left(\frac{n\pi}{L}\right)^2 \frac{F}{\mu}\cos(2\pi\nu t)$$

This equality is valid, and therefore Eq. (13-32) is a solution to Eq. (13-24b),
provided that

$$(2\pi\nu)^2 = \left(\frac{n\pi}{L}\right)^2 \frac{F}{\mu}$$

Thus the frequency must have one of the values

$$\nu = \frac{n}{2L}\sqrt{\frac{F}{\mu}} \qquad \text{where } n = 1, 2, 3, \ldots \tag{13-33}$$

We now have a complete solution to Eq. (13-18), f(x, t) = g(x)h(t). The
form of g(x) is given by Eq. (13-26), and the permissible values of the wavelength
$\lambda$ which appears in that equation by Eq. (13-30b). The form of h(t) is
given by Eq. (13-32), and the permissible values of the frequency $\nu$ which
appears in that equation by Eq. (13-33). Combining all this information, we
have the mathematical expression for standing-wave solutions to the wave
equation for a string with linear density $\mu$, tension F, and length L:

$$f_n(x, t) = A_n\sin\left(\frac{2\pi x}{\lambda_n}\right)\cos(2\pi\nu_n t) \qquad \text{where } n = 1, 2, 3, \ldots \tag{13-34a}$$

with

$$\frac{1}{\lambda_n} = \frac{n}{2L} \tag{13-34b}$$

and

$$\nu_n = \frac{n}{2L}\sqrt{\frac{F}{\mu}} = \frac{1}{\lambda_n}\sqrt{\frac{F}{\mu}} \tag{13-34c}$$

We have used the subscript n as a label on the $f_n(x, t)$, and on the amplitude $A_n$,
the wavelength $\lambda_n$, and the frequency $\nu_n$ of each of these functions, to emphasize
that for every value of the integer n there is a different standing
wave. Every one represents, individually, a possible mode of the string. Note
that in choosing a particular value of n to specify a solution of Eq. (13-34a),
you must use the same value of n in evaluating $1/\lambda_n$ from Eq. (13-34b) and
$\nu_n$ from Eq. (13-34c).

In the first mode, n = 1, the solution to the wave equation is

$$f_1(x, t) = A_1\sin\left(\frac{\pi x}{L}\right)\cos\left(\frac{\pi}{L}\sqrt{\frac{F}{\mu}}\;t\right)$$

This describes the string oscillating at its fundamental frequency
$\nu_1 = (1/2L)\sqrt{F/\mu}$ in a standing wave with wavelength $\lambda_1 = 2L$ and nodes at each
end. At the start of every cycle of oscillation, it has the shape shown in Fig.
13-15 by the curve labeled n = 1 and an amplitude determined by whatever
value $A_1$ happens to have. In the second mode, the solution is

$$f_2(x, t) = A_2\sin\left(\frac{2\pi x}{L}\right)\cos\left(\frac{2\pi}{L}\sqrt{\frac{F}{\mu}}\;t\right)$$

Here the shape of the string at the beginning of every cycle is given by the
n = 2 curve, for the appropriate value of the amplitude $A_2$. The wavelength
is $\lambda_2 = L$, and the frequency is $\nu_2 = (1/L)\sqrt{F/\mu}$.
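
The mode formulas of Eqs. (13-34a) to (13-34c) translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the string length, tension, and linear density are assumed round numbers rather than values from the text. It evaluates the mode frequencies and the wave function for the first few modes.

```python
import math


def mode_wavelength(n, length):
    """Wavelength of mode n, from 1 / lambda_n = n / (2 L)."""
    return 2.0 * length / n


def mode_frequency(n, length, tension, linear_density):
    """Frequency of mode n, nu_n = (n / 2L) * sqrt(F / mu)."""
    return n / (2.0 * length) * math.sqrt(tension / linear_density)


def standing_wave(n, x, t, length, tension, linear_density, amplitude=1.0):
    """Displacement f_n(x, t) = A_n sin(2 pi x / lambda_n) cos(2 pi nu_n t)."""
    wavelength = mode_wavelength(n, length)
    frequency = mode_frequency(n, length, tension, linear_density)
    return (amplitude
            * math.sin(2.0 * math.pi * x / wavelength)
            * math.cos(2.0 * math.pi * frequency * t))


# Assumed illustrative string: 1 m long, 100 N tension, 0.01 kg/m.
LENGTH, TENSION, LINEAR_DENSITY = 1.0, 100.0, 0.01

for n in (1, 2, 3):
    frequency = mode_frequency(n, LENGTH, TENSION, LINEAR_DENSITY)
    wavelength = mode_wavelength(n, LENGTH)
    midpoint = standing_wave(n, LENGTH / 2, 0.0, LENGTH, TENSION, LINEAR_DENSITY)
    print(f"n = {n}: frequency = {frequency:.1f} Hz, "
          f"wavelength = {wavelength:.3f} m, f(L/2, 0) = {midpoint:.3f}")
```

With these assumed values the frequencies are whole-number multiples of the fundamental (50 Hz, 100 Hz, 150 Hz). The midpoint of the string is an antinode for n = 1 and a node for n = 2, in agreement with the shapes described for the first two modes.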

By using mathematical arguments to investigate standing waves in a
string, we have obtained a description of these waves that agrees with the
description we obtained earlier from simpler and more physical arguments.
Compare Eqs. (13-9) and (13-34c). One advantage of the mathematical
arguments is that they do not involve detailed assumptions about how
the standing waves are excited, and thus they make it clear that the
standing waves possible in a system are characteristic of the system, instead
of the excitation process. Another advantage is that the mathematical arguments
can be extended much more easily than the physical ones to the
treatment of standing waves in systems of two or three dimensions. We
make such an extension in Sec. 13-5.


Equation (13-34c) relates the frequency and wavelength of a standing
wave on a string by means of the equation $\nu_n\lambda_n = \sqrt{F/\mu}$. According to Eq.
(12-35b), $|v| = \sqrt{F/\mu}$, the quantity $\sqrt{F/\mu}$ occurring in Eq. (13-34c) is the
speed $|v|$ of a traveling wave in the same string. We thus have the relation

$$\nu_n\lambda_n = |v|$$

between the standing-wave frequency and wavelength and the traveling-wave
speed. This equation is identical in appearance to Eq. (12-116), which relates
the frequency and wavelength of a traveling wave to its speed; however, its
meaning is different since a standing wave has zero speed.

While the equation $\nu_n\lambda_n = |v|$ has been derived for the special case of
standing waves on a string, it is valid for all standing waves. Since the frequency
and wavelength of standing waves can very often be measured with
great accuracy, the relation is the basis for one of the most accurate
methods for measuring the propagation speed of traveling waves. This
method is exploited in Example 13-4.


EXAMPLE  13-4 

A cellist  tunes  the  “A  string”  of  a cello  by  adjusting  its  tension  until  the  fundamental 
frequency  of  the  standing  wave  which  he  sets  up  in  the  string  by  bowing  it  is  220 
Hz.  The  distance  between  the  supports  at  the  two  ends  of  the  string  is  68.4  cm,  and 
the  mass  of  the  string  extending  between  the  supports  is  1.31  g. 

a.  What  is  the  tension  in  the  string? 

■ Setting n = 1 in Eq. (13-34c) gives the fundamental frequency

$$\nu_1 = \frac{1}{2L}\sqrt{\frac{F}{\mu}}$$

Solving for the tension F, you find

$$F = 4L^2\mu\nu_1^2$$

When you write the linear density $\mu$ in terms of the string’s mass m and length L, the
expression becomes

$$F = 4L^2\,\frac{m}{L}\,\nu_1^2 = 4Lm\nu_1^2$$

Inserting the numerical values, you obtain a tension

$$F = 4 \times 0.684\ \text{m} \times 1.31 \times 10^{-3}\ \text{kg} \times (220\ \text{s}^{-1})^2 = 173\ \text{N}$$

which  pulls  the  supports  together.  To  withstand  this  tension  and  that  of  the  other 
three  strings,  the  neck  of  the  cello  must  be  made  reasonably  strong.  The  total  force 
exerted  on  the  frame  of  a piano  by  the  more  than  200  strings  stretched  on  it  is  very 
large;  the  frame  must  be  made  of  heavy  steel. 

b. The cellist now plucks the string near one end. What is the speed of the traveling
wave which propagates down the string?

■ You know the fundamental frequency; it is $\nu_1 = 220$ Hz. If you use Eq.
(13-34b) to find the corresponding standing wavelength $\lambda_1$ from the known value of
the length of the string L = 68.4 cm, you can use the equation $\nu_1\lambda_1 = |v|$ to determine
the traveling-wave speed. Setting n = 1, you have from Eq. (13-34b)

$$\lambda_1 = 2L$$

The speed is thus given by

$$|v| = \nu_1\lambda_1 = 2\nu_1 L$$

Inserting the numerical values, you have

$$|v| = 2 \times 220\ \text{Hz} \times 0.684\ \text{m} = 301\ \text{m/s}$$

which is somewhat greater than the speed of a commercial jet plane.
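
The arithmetic of both parts of this example can be reproduced with a short script. This is a sketch for checking the numbers only; the function names are hypothetical. It also cross-checks the speed from part b against $\sqrt{F/\mu}$ computed from the tension found in part a.

```python
import math

# Data given in Example 13-4.
length = 0.684        # distance between supports, in meters
mass = 1.31e-3        # mass of the vibrating part of the string, in kilograms
fundamental = 220.0   # fundamental frequency, in hertz


def string_tension(length, mass, fundamental):
    """Tension from F = 4 L m nu_1^2 (part a)."""
    return 4.0 * length * mass * fundamental**2


def wave_speed_from_fundamental(length, fundamental):
    """Traveling-wave speed from |v| = nu_1 * lambda_1 = 2 nu_1 L (part b)."""
    return 2.0 * fundamental * length


tension = string_tension(length, mass, fundamental)
speed = wave_speed_from_fundamental(length, fundamental)

# Independent check: |v| = sqrt(F / mu) with mu = m / L.
speed_check = math.sqrt(tension / (mass / length))

print(f"tension = {tension:.0f} N")              # tension = 173 N
print(f"speed = {speed:.0f} m/s")                # speed = 301 m/s
print(f"speed from sqrt(F/mu) = {speed_check:.0f} m/s")
```

The two routes to the speed agree, as they must, because the tension in part a was itself obtained from the same relation between frequency, length, and $\sqrt{F/\mu}$.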




13-5  STANDING WAVES ON A CIRCULAR MEMBRANE

For a one-dimensional system, such as a string, the frequencies of possible
standing waves are related to each other in a very simple way. For a
two-dimensional system, such as a drumhead, the relation is not so simple. In this
section we extend the methods of Sec. 13-4 to study standing waves on a
drumhead. In spite of some complications arising from the geometry of the
two-dimensional case, the fundamental ideas carry over with little change.
The results obtained for standing waves in both circular drumheads and
strings are used in subsequent sections to understand some of the basic
properties of musical instruments.

Just as for a stretched string, the properties of waves on a stretched
membrane comprising a drumhead are determined by the relationship
that Newton’s second law imposes between the force exerted on each small
segment of the membrane and its mass and acceleration. We will assume
that the membrane is of uniform composition and is uniformly stretched
across a circular rim, as a drumhead is. The force acting on any segment results
from the tension in the membrane. We will again use the symbol F for
the tension. However, here we have a two-dimensional membrane instead of
a one-dimensional string, and F is now defined as the magnitude of the
force per unit length acting across any line cutting the drumhead in any direction.
This definition is illustrated in Fig. 13-16. The tension F is the same,
no matter what the location or orientation of the line involved in defining
its magnitude, because the membrane is stretched uniformly. The mass of
each segment of the membrane is expressed in terms of its mass per unit
area, called the areal density $\mu$. We will assume that the membrane is of uniform
composition, so that $\mu$ has the same value everywhere.

To recapitulate, we have redefined the quantities F and $\mu$ for the
two-dimensional case now under consideration. You may feel that there is
some awkwardness involved in using the same symbols for quantities
slightly different from those used in discussion of the one-dimensional
stretched string. There is a compensating advantage, however, in the similarity
between the two-dimensional wave equation which we are about to
develop and the one-dimensional equation which describes wave motion in

Fig. 13-16 Illustrating the tension F in a uniformly stretched
membrane. If a line of unit length (l = 1) is constructed with any
location and orientation, the total force exerted on the membrane
at one side of the line by the membrane at the other side has
magnitude F. If the length of the line is l ≠ 1, each of these forces
has magnitude Fl. If the membrane were actually cut along a line
of length l, the two edges of the cut would pull apart because of
the tension. To prevent this, a total force of magnitude Fl would
have to be applied to each edge to balance the tension force.


[Figure 13-17, panel labels: (a) $\nu = \nu_1$, nodal circle at fixed rim; (b) $\nu = 2.30\nu_1$, nodal circle at fixed rim and one interior nodal circle; (c) $\nu = 3.61\nu_1$, nodal circle at fixed rim and two interior nodal circles.]


Fig. 13-17 "Snapshots" of a circular drumhead
vibrating in the first three standing-wave modes
possessing circular symmetry. A grid is drawn on
the drumhead to facilitate visualization. As in the
case of the vibrating string, each point on the
drumhead oscillates in harmonic motion up and
down with frequency $\nu$ and with an amplitude
which depends on its position. But both the shape
of the waveforms and the frequency ratios of the
modes are more complicated than those for the
vibrating string. (a) The fundamental mode, with
frequency $\nu = \nu_1$. The value of $\nu_1$ is determined
by the radius, density, and tension of the drumhead,
as discussed in the text. All points on the
periphery of the drumhead are always nodes
because the drumhead is fixed to the rim. In the
fundamental mode they are the only nodes. Taken
together, they comprise a nodal circle. (b) The
second mode with circular symmetry. Its frequency
is 2.30 times the fundamental frequency
$\nu_1$. There is a second nodal circle, located approximately
0.44 (less than half) of the distance
from the center to the rim. At any instant, the
displacement and direction of motion of points
on the drumhead within this nodal circle are
opposite to those of points outside it. The amplitude
of the central antinode is greater than that
of the second antinode. (c) The third mode with
circular symmetry. Its frequency is 3.61 times the
fundamental frequency $\nu_1$. There are now two
nodal circles in addition to the rim, located
approximately 0.28 (less than one-third) and 0.64
(less than two-thirds) of the distance from the
center to the rim, and the drumhead is divided
into three regions which at any instant have displacements
of alternate sign and are moving in
alternate directions. The amplitudes of the three
antinodes are successively smaller from the center
outward.


a stretched string. Note that in going from one dimension to two dimensions,
we give each of the quantities F and $\mu$ a corresponding dimensional
change, dividing in each case by a length.

For  simplicity,  we  will  give  a mathematical  treatment  only  for  standing 
waves  which  are  symmetric  about  the  center  of  the  circular  membrane. 
These  are  the  ones  that  are  excited  in  a drumhead  if  it  is  struck  at  its  center. 
Three  such  standing  waves  are  illustrated  in  Fig.  13-17.  At  the  end  of  this 
section  we  will  describe  other  types  of  standing  waves  that  can  exist  on  a 
drumhead. 

As we did in studying waves on a stretched string, we study waves on a
stretched drumhead by concentrating on a typical small segment of the
drumhead. Because of the circular symmetry of the waves that we are considering,
the appropriate shape of the segment is that of a thin ring concentric
with the rim of the drumhead, as in Fig. 13-18. The ring extends from
an inner radius r to an infinitesimally greater outer radius r + dr. The symmetry
of the waves being treated implies that all points on the ring lying on
a circle of a given radius have the same displacement z from the plane of
the undisturbed drumhead. Note the strong analogy between the structure
of Fig. 13-18 along any radial direction and the structure of Fig. 12-8, used
in deriving the differential equation for any type of waves on a string. Here
we derive the differential equation for symmetrical waves on a drumhead.




Fig. 13-18 Sketch for deriving the wave equation for the vibrating circular
drumhead. Shown is a thin, ring-shaped element of the drumhead, at an instant when
its outer periphery is distorted further in the positive vertical z direction than its
inner periphery, and its shape is concave upward. The arrows schematically
represent force vectors exerted on local regions of the ring by neighboring parts of the
drumhead. They are everywhere tangent to the drumhead at their points of
application, and their horizontal components (in the xy plane) lie along radii of the
undisturbed drumhead. The tension F, or force per unit length, is everywhere the same
in the drumhead. Consequently, more force vectors are shown along the outer
periphery of the ring than along the inner periphery, because the outer periphery
is longer.


But the derivation of this wave equation is very similar to the earlier
derivation, so we can proceed relatively quickly through familiar territory.

At any point on the inner or outer periphery of the ring-shaped segment,
the forces exerted on it by adjacent parts of the drumhead are locally
tangent to the surface of the segment and locally perpendicular to its
periphery. (What analogous statement can you make, based on Fig. 12-8,
for the forces exerted on a segment of a stretched string?) Consider the
inner periphery. Acting on it is the set of force vectors indicated in Fig. 13-18.
Although they vary in direction, their z components (which are the ones
associated with the z displacements of the segment) do not vary in magnitude.
So the z component of the total force acting on a unit length of the inner
periphery is just F times the slope, measured radially outward, of the
drumhead at that periphery. (We are assuming here, as in Sec. 12-3, that
the slope is small so that we can equate sines to tangents.) The slope is given
by the partial derivative of z = f(r, t) with respect to r, that is, ∂f(r, t)/∂r.
Thus the z component of the total force acting on a unit length of the inner
periphery is F, the tension force per unit length, times the slope ∂f(r, t)/∂r.
Multiplying this z component per unit length by 2πr, the total length of the
inner periphery, we find that the magnitude of the total force in the z
direction acting on the inner periphery is

$$2\pi r F\,\frac{\partial f(r,t)}{\partial r}$$


The same expression, evaluated at r + dr, gives the magnitude of the
total force in the z direction acting on the outer periphery of the segment,
even though the forces in the z direction have opposite signs at the two
peripheries. These two forces do not cancel completely because their
magnitudes are not equal. In fact, the net force acting on the segment in the z
direction is just the difference between the value of 2πrF ∂f(r, t)/∂r at the
outer periphery and its value at the inner periphery. This difference is
given by the rate of change of the quantity with respect to r times dr, the
change in r. So the net force acting on the segment in the z direction is

$$\frac{\partial}{\partial r}\left[2\pi r F\,\frac{\partial f(r,t)}{\partial r}\right] dr$$


Compare  this  with  the  equation  just  above  Eq.  (12-23),  which  gives  the  net 
force  on  a segment  of  a stretched  string. 

The mass of the ring-shaped segment is its area 2πr dr times the mass
per unit area μ of the drumhead. So the mass is

$$2\pi\mu r\,dr$$




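The acceleration in the z direction of the segment is

$$\frac{\partial^2 f(r,t)}{\partial t^2}$$

Equating force to mass times acceleration gives

$$\frac{\partial}{\partial r}\left[2\pi r F\,\frac{\partial f(r,t)}{\partial r}\right] dr = 2\pi\mu r\,dr\,\frac{\partial^2 f(r,t)}{\partial t^2}$$

Since the tension F is the same everywhere in the uniformly stretched
drumhead, it is a constant, just as is 2π. Thus we have

$$2\pi F\,\frac{\partial}{\partial r}\left[r\,\frac{\partial f(r,t)}{\partial r}\right] dr = 2\pi\mu r\,dr\,\frac{\partial^2 f(r,t)}{\partial t^2}$$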


Canceling and transposing, we obtain

$$\frac{1}{r}\frac{\partial}{\partial r}\left[r\,\frac{\partial f(r,t)}{\partial r}\right] = \frac{\mu}{F}\,\frac{\partial^2 f(r,t)}{\partial t^2} \qquad \text{(13-35)}$$

This is the wave equation for circularly symmetrical waves on the drumhead.
[For the general case of waves that do not necessarily have this symmetry,
the displacement of the drumhead must be written z = f(r, θ, t), and
the left side of the wave equation contains partial derivatives with respect
to θ.]


Because of the similarity of the right sides of the wave equations for a
drumhead and a string (the sides containing the time derivative), it
seems likely that for both the standing-wave solutions will have similar time
dependences. With this in mind, we save ourselves some effort by assuming
from the beginning that the standing-wave solutions to Eq. (13-35),
f(r, t) = g(r)h(t), can be written as

$$f(r,t) = g(r)\cos(2\pi\nu t) \qquad \text{(13-36)}$$

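Then we prepare to validate this assumption by calculating the partial
derivatives. We obtain

$$\frac{1}{r}\frac{\partial}{\partial r}\left[r\,\frac{\partial f(r,t)}{\partial r}\right] = \cos(2\pi\nu t)\,\frac{1}{r}\frac{d}{dr}\left[r\,\frac{dg(r)}{dr}\right]$$

and

$$\frac{\partial^2 f(r,t)}{\partial t^2} = -g(r)\,(2\pi\nu)^2\cos(2\pi\nu t)$$

Substituting these into the partial differential equation produces

$$\cos(2\pi\nu t)\,\frac{1}{r}\frac{d}{dr}\left[r\,\frac{dg(r)}{dr}\right] = -g(r)\,(2\pi\nu)^2\,\frac{\mu}{F}\cos(2\pi\nu t)$$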


This will be satisfied, and so Eq. (13-36) will be verified, if the following
ordinary differential equation is satisfied:

$$\frac{1}{r}\frac{d}{dr}\left[r\,\frac{dg(r)}{dr}\right] = -\frac{4\pi^2\nu^2\mu}{F}\,g(r) \qquad \text{(13-37)}$$

This is the time-independent equation corresponding to Eq. (13-35). The
solutions g(r) to Eq. (13-37) are the functions specifying the shapes of
possible symmetrical standing waves on the drumhead. In finding these
solutions, we will also determine for each of them the value of its frequency ν.




We will solve Eq. (13-37) numerically. (One reason for doing so is that
it will introduce procedures of which we will make important use in Chap.
31 to solve the quantum-mechanical wave equation.) The first step is to
work out the derivative of the term in brackets, so that the equation
becomes

$$\frac{1}{r}\left[r\,\frac{d^2g(r)}{dr^2} + \frac{dg(r)}{dr}\right] = -\frac{4\pi^2\nu^2\mu}{F}\,g(r)$$

or

$$\frac{d^2g(r)}{dr^2} = -\frac{4\pi^2\nu^2\mu}{F}\,g(r) - \frac{1}{r}\frac{dg(r)}{dr} \qquad \text{(13-38)}$$


Now the differential equation is in a form that allows us to apply the
numerical method developed in Chap. 6. But before we do this, it is
worthwhile to introduce the dimensionless variable u, defined as

$$u \equiv \frac{r}{a} \qquad \text{(13-39)}$$

where a is the radius of the drumhead. The quantity u gives the distance
from the center of the drumhead to some other location, expressed as a
fraction ranging from 0 to 1 as that location ranges from the center to the
rim. This will lead to an equation which can be applied to a drumhead of
any radius. Employing the chain rule to convert the derivatives in Eq.
(13-38) to the new variable, we have

$$\frac{dg(r)}{dr} = \frac{dg(u)}{du}\,\frac{du}{dr} = \frac{dg(u)}{du}\,\frac{1}{a}$$

and

$$\frac{d^2g(r)}{dr^2} = \frac{d}{du}\left[\frac{dg(r)}{dr}\right]\frac{du}{dr} = \frac{d}{du}\left[\frac{dg(u)}{du}\,\frac{1}{a}\right]\frac{1}{a} = \frac{d^2g(u)}{du^2}\,\frac{1}{a^2}$$

Substituting these derivatives into the differential equation and then
multiplying through by a², we obtain

$$\frac{d^2g(u)}{du^2} = -\frac{4\pi^2\nu^2\mu a^2}{F}\,g(u) - \frac{1}{u}\frac{dg(u)}{du}$$

It is convenient to introduce the parameter α, defined as

$$\alpha \equiv \frac{4\pi^2\nu^2\mu a^2}{F} \qquad \text{(13-40)}$$

If you insert the units for ν, μ, a, and F, you will see that α is a
dimensionless number. In terms of α, the differential equation becomes

$$\frac{d^2g(u)}{du^2} = -\alpha\,g(u) - \frac{1}{u}\frac{dg(u)}{du} \qquad \text{(13-41)}$$

The purpose of going from Eq. (13-38) to Eq. (13-41) is that the latter,
since u and α are dimensionless quantities, has universal applicability. Once
we find its solutions, they can be applied to any drumhead by simply using
the appropriate value of a in evaluating the independent variable u and of
μ, a, and F in evaluating the parameter α.
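As a small illustration of Eqs. (13-39) and (13-40), the sketch below converts physical quantities into the dimensionless pair (u, α). The function names and the sample numbers are assumptions chosen for illustration; they are not part of the text's program.

```python
import math


def alpha_from_frequency(nu, mu, a, F):
    """Dimensionless parameter of Eq. (13-40): 4 pi^2 nu^2 mu a^2 / F.

    nu in Hz, mu in kg/m^2, a in m, F in N/m (SI units assumed).
    """
    return 4.0 * math.pi**2 * nu**2 * mu * a**2 / F


def scaled_radius(r, a):
    """Dimensionless radial coordinate u = r / a of Eq. (13-39)."""
    return r / a


# Illustrative (assumed) numbers: a 0.200 m drumhead with 0.500 kg/m^2
# and 1000 N/m, vibrating at 85.9 Hz.
alpha = alpha_from_frequency(nu=85.9, mu=0.500, a=0.200, F=1000.0)
u_half = scaled_radius(r=0.100, a=0.200)
print(f"alpha = {alpha:.3f}, u = {u_half:.2f}")
```

Because the units cancel, the same α describes any drumhead whose combination ν²μa²/F has the same value.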




Writing Eq. (13-41) as

$$\frac{d^2g(u)}{du^2} = Q \qquad \text{(13-42a)}$$

where

$$Q = -\alpha\,g(u) - \frac{1}{u}\frac{dg(u)}{du} \qquad \text{(13-42b)}$$

we see that Eq. (13-42a) is mathematically identical to Eq. (6-14), d²x/dt² = Q.
This is the form we have used to solve second-order differential equations
numerically. The numerical method for solving Eq. (13-42a) according to
Eqs. (6-15) can be applied immediately. Doing this involves nothing more
than adding to the harmonic oscillator program several steps which will
generate the value of Q specified in Eq. (13-42b) and then running the
calculator or computer with this new program. The vibrating drumhead
program is listed in the Numerical Calculation Supplement.

As  is  always  the  case  in  numerical  calculations,  it  is  necessary  here  to 
specify  the  initial  values  of  the  quantity  g(u)  and  its  first  derivative  dg(u)/du. 
The  following  points  must  be  considered  in  determining  these  values: 


1. In earlier numerical calculations, we have always begun by setting
the initial value of the independent variable to zero, and we have then
proceeded in small positive increments. But in the present case the quantity Q
cannot be evaluated at u = 0 (the center of the drumhead) because at that
point the factor 1/u in its second term goes to infinity. Fortunately, it is easy
to circumvent this difficulty. The numerical calculation is started at u = 1
(the edge of the drumhead). Negative increments Δu are then taken and the
calculation cycle is carried out repeatedly, each cycle yielding a value of g(u)
at a location closer to the center of the drumhead than its predecessor. This
process is continued until g(u) is evaluated very near, but not at, u = 0.

2. Since the drumhead is fixed at its outer edge, the “initial” condition
on g(u) at u = 1 is [g(u)]₁ = 0. In the terminology of Sec. 13-3, this
condition is the boundary condition at one end of the range of the coordinate u.

3. The numerical method requires that we choose also an initial value
for dg(u)/du at u = 1. This quantity gives the slope of the drumhead
immediately at the rim. But since the differential equation is linear in g(u), the
choice will affect only the vertical scale of a plot of the displacement g(u)
versus the coordinate u, and not the shape of g(u). The vertical scale is of no
real consequence, since it has to do only with the maximum displacement
of the standing wave. Hence we can take any value we wish for [dg(u)/du]₁.
To put it another way, we choose the units used to measure the
displacement g(u) so that the value of the slope at u = 1, in these units, is either −1 or +1.

4. The trickiest point is that the calculation of Q from Eq. (13-42b)
requires a knowledge of the parameter α, which we do not know.
Furthermore, Eq. (13-40) shows that α depends on the value of the standing-wave
frequency ν, and ν is what we are trying to determine. To deal with this
difficulty, we try various “guess” values of α and use them to calculate g(u)
inward from u = 1, as outlined in point 1. Through this trial-and-error
method, we search for those values of α which make it possible to satisfy the




boundary conditions at the inner end of the range of the coordinate u, that
is, at the center of the drumhead, where u = 0. Only those values of α can
lead to standing waves. Therefore, only those values can lead to valid plots
of the standing-wave functions g(u). And once the values of α are known,
Eq. (13-40) can be used, together with a knowledge of the constants μ, a,
and F, to find the corresponding standing-wave frequencies ν.

What are the boundary conditions at u = 0? There is no restriction on
the value of the displacement g(u) at this point, because the drumhead is
unconstrained at its center. But we must have dg(u)/du = 0 at u = 0. If this
were not so, the value of Q given by Eq. (13-42b) would be infinite. The
curvature of the drumhead given by Eq. (13-42a) would then be d²g(u)/du² =
±∞ at u = 0, and the drumhead would have an infinitely sharp peak or
depression at its center, as shown in Fig. 13-19. This is physically impossible,
and the central boundary condition [dg(u)/du]₀ = 0 is thus justified.

In Examples 13-5 through 13-7, the vibrating drumhead program is
used to find the values of the parameter α corresponding to the first three
circularly symmetrical standing-wave modes of the drumhead.
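The procedure in points 1 through 4 can be sketched in a few lines of Python. This is not the program from the Numerical Calculation Supplement: it assumes a classical fourth-order Runge-Kutta step in place of the Chap. 6 method, and the names `q_value`, `integrate_inward`, and `u_stop` are hypothetical. It starts at the rim with g = 0, steps inward with Δu = −0.01, and stops at u = 0.01 so that 1/u is never evaluated at the center.

```python
def q_value(alpha, u, g, dg):
    """Q of Eq. (13-42b): the second derivative of g with respect to u."""
    return -alpha * g - dg / u


def integrate_inward(alpha, slope_at_rim=-1.0, du=-0.01, u_stop=0.01):
    """Step d2g/du2 = Q from the rim (u = 1) toward the center.

    Returns (us, gs, dgs). Classical fourth-order Runge-Kutta is an
    assumed stand-in for the text's Chap. 6 procedure.
    """
    n_steps = round((u_stop - 1.0) / du)
    g, dg = 0.0, slope_at_rim            # point 2: g = 0 at the fixed rim
    us, gs, dgs = [1.0], [g], [dg]
    for i in range(n_steps):
        u = 1.0 + i * du                 # recomputed each cycle to avoid drift
        k1g, k1d = dg, q_value(alpha, u, g, dg)
        k2g = dg + 0.5 * du * k1d
        k2d = q_value(alpha, u + 0.5 * du, g + 0.5 * du * k1g, k2g)
        k3g = dg + 0.5 * du * k2d
        k3d = q_value(alpha, u + 0.5 * du, g + 0.5 * du * k2g, k3g)
        k4g = dg + du * k3d
        k4d = q_value(alpha, u + du, g + du * k3g, k4g)
        g += du * (k1g + 2.0 * k2g + 2.0 * k3g + k4g) / 6.0
        dg += du * (k1d + 2.0 * k2d + 2.0 * k3d + k4d) / 6.0
        us.append(1.0 + (i + 1) * du)
        gs.append(g)
        dgs.append(dg)
    return us, gs, dgs


# Trial values of alpha; the slope nearest the center shows which
# guesses come close to the central boundary condition.
for alpha in (5.750, 5.800, 5.825, 5.850, 5.900):
    us, gs, dgs = integrate_inward(alpha)
    print(f"alpha = {alpha:.3f}  g = {gs[-1]:+.4f}  "
          f"dg/du = {dgs[-1]:+.4f}  at u = {us[-1]:.2f}")
```

A change of sign in the printed slope between two neighboring trial values brackets a value of α that satisfies the central boundary condition. Because the integrator differs from the text's, the best trial value found this way need not coincide exactly with the one the text reports.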




Fig. 13-19 Justification for the boundary condition
[dg(u)/du]₀ = 0. The graph shows three possible
behaviors of the wave function g(u) in the immediate
vicinity of the center of the drumhead, u = 0. We
are considering only circularly symmetric wave
functions. Symmetry therefore dictates that g(u) must
approach the vertical axis, at u = 0, in the same
way from all directions. The curve representing g(u)
must approach the axis either with nonzero slope,
as in the curves labeled a and b, or with zero slope,
as shown by the curve labeled c. But curves a and b
imply the correspondingly labeled drumhead shapes
shown in perspective. In each of these, the
drumhead is distorted at its center into an infinitely sharp
point, which is impossible for a real drumhead.
Curve c in the graph implies the correspondingly
labeled drumhead shape, which is flat and parallel
to the horizontal plane in the immediate vicinity of
the center. This is the only physical possibility for
circular symmetry. The flatness, or zero slope, is
specified by the boundary condition [dg(u)/du]₀ = 0.




EXAMPLE  13-5 


Run the vibrating drumhead program with the following set of initial conditions
and parameters:

[g(u)]₁ = 0; [dg(u)/du]₁ = −1; u = 1; Δu = −0.01; α = 5.750, 5.800, 5.825,
5.850, 5.900.

■ The program displays the value of g(u) at the end of every calculation cycle.
Your concern centers on the slope dg(u)/du of a plot of these values as u
approaches zero, since the condition [dg(u)/du]₀ = 0 must be satisfied. So you begin by
making a preliminary search using values of α more widely separated than those
given above. You find that for values of α around 5.8, the slope dg(u)/du is fairly
small as u approaches zero. Then you use the values of α listed above to make
a fine-grained search. In Fig. 13-20 the values of g(u) are plotted versus u from the
center of the drumhead to its periphery for each value of α.

The plot shows that when

α = 5.825 (13-43)

the differential equation has a solution which satisfies the boundary condition
dg(u)/du = 0 at u = 0, as well as the boundary condition g(u) = 0 at u = 1. The
results for this value of α constitute a plot of the function g(u) that solves the equation
and satisfies the boundary conditions. The function describes the radial
dependence of the standing-wave mode for which α = 5.825. Physically, the plot shows a
radial profile (or cross section) of the shape of the drumhead at a time when its
displacements have a maximum positive value. The vertical scale in Fig. 13-20 is
greatly exaggerated. The scale depends on the numerical value chosen for
[dg(u)/du]₁. A value −1 was chosen simply to make the numerical values of g(u)
large enough to be plotted easily, and to lead to a positive displacement of the
drumhead at the antinode u = 0. In reality, the displacements of the drumhead
must be sufficiently small to satisfy the assumption that the slope of the displaced
drumhead is everywhere small.

Remember that Fig. 13-20 is a “snapshot” of the profile of the drumhead from
center to periphery at an instant when its displacement is maximum. It shows a
scaled plot of f(r, t) = g(r) cos(2πνt) at an instant when cos(2πνt) = 1. The center of
the drumhead is displaced upward at this time. Shortly thereafter the center will be
displaced downward, but with the same profile. Hence any point on the rim of the
drumhead is a node (as it must be since the rim is fixed), and the center is an
antinode.
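The trial-and-error search of this example can be automated as a shooting method. The sketch below is an assumption-laden illustration rather than the text's program: it uses a fourth-order Runge-Kutta step, takes the slope at u = 0.01 as a stand-in for the slope at the center, and refines α by bisection between two guesses (5.0 and 7.0, assumed to come from a coarse preliminary search) that give slopes of opposite sign. The names `slope_near_center` and `find_alpha` are hypothetical.

```python
def slope_near_center(alpha, slope_at_rim=-1.0, du=-0.01, u_stop=0.01):
    """Integrate Eq. (13-41) inward from u = 1; return dg/du at u_stop."""
    def accel(u, g, p):                  # Q of Eq. (13-42b)
        return -alpha * g - p / u

    n_steps = round((u_stop - 1.0) / du)
    g, p = 0.0, slope_at_rim             # g = 0 at the fixed rim
    for i in range(n_steps):
        u = 1.0 + i * du
        k1g, k1p = p, accel(u, g, p)
        k2g = p + 0.5 * du * k1p
        k2p = accel(u + 0.5 * du, g + 0.5 * du * k1g, k2g)
        k3g = p + 0.5 * du * k2p
        k3p = accel(u + 0.5 * du, g + 0.5 * du * k2g, k3g)
        k4g = p + du * k3p
        k4p = accel(u + du, g + du * k3g, k4g)
        g += du * (k1g + 2.0 * k2g + 2.0 * k3g + k4g) / 6.0
        p += du * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return p


def find_alpha(lo, hi, tol=1e-6):
    """Bisect on alpha until the slope near the center changes sign."""
    f_lo = slope_near_center(lo)
    if f_lo * slope_near_center(hi) > 0:
        raise ValueError("the two guesses do not bracket a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = slope_near_center(mid)
        if f_lo * f_mid <= 0:
            hi = mid                     # sign change lies in the lower half
        else:
            lo, f_lo = mid, f_mid        # sign change lies in the upper half
    return 0.5 * (lo + hi)


alpha_1 = find_alpha(5.0, 7.0)
print(f"alpha for the fundamental mode: {alpha_1:.3f}")
```

Bisection replaces the visual judgment of which curve meets the g(u) axis perpendicularly. Since the integrator and the stopping point differ from those of the text's program, the value it converges to may differ slightly from Eq. (13-43).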


Example  13-6  explores  the  properties  of  the  second  standing-wave 
mode  of  the  vibrating  drumhead. 


EXAMPLE 13-6

Run the vibrating drumhead program with the following set of initial conditions
and parameters:

[g(u)]₁ = 0; [dg(u)/du]₁ = 1; u = 1; Δu = −0.01; α = 30.85.

■ A continued search for values of α satisfying the central boundary condition
[dg(u)/du]₀ = 0, first with widely spaced values of α and then with more closely
spaced values, yields

α = 30.85 (13-44)

as the next value above α = 5.825 which does so. The resulting values of g(u) are
plotted versus u in Fig. 13-21. For this standing-wave mode, the drumhead has at a
given instant a displacement of one sign at its center and a ring-shaped region near
its edge with a displacement of the other sign. There is a node approximately, but
not exactly, halfway out from the center at u ≈ 0.44, in addition to the node at the
fixed rim. The antinodes lie at u = 0 and u ≈ 0.70. (Can you see why the value of
[dg(u)/du]₁ was chosen to be +1 and not −1?)




Fig. 13-20 Illustration for Example 13-5, showing the curve g(u) versus u for five closely
spaced trial values of the parameter α. Compare with Fig. 13-17a. Curve 3, representing
the trial value α = 5.825, comes close to approaching the g(u) axis perpendicularly. It thus
comes close to satisfying the boundary condition [dg(u)/du]₀ = 0. Beginning at the value
u = 1, the curves representing the calculations for the five specified values of α nearly
coincide at first, and have not been plotted separately. But as the calculation proceeds and
the value of u approaches zero, the g(u) curve becomes quite sensitive to the choice of α. The
calculated curves therefore diverge and are plotted separately in the region of small u.




Fig. 13-21 Illustration for Example 13-6. The wave function g(u) is plotted versus u for
α = 30.85, the second value of α for which the boundary condition [dg(u)/du]₀ = 0 is
satisfied. Like Fig. 13-20, this graph may be regarded as a radial slice through the drumhead
from center to periphery at an instant when every point on the drumhead achieves its
maximum displacement from the undisturbed position represented by the u axis. Compare
with Fig. 13-17b.


The standing wave is pictured in the perspective drawing of Fig. 13-17b. Note
that the node at u ≈ 0.44 is actually a nodal circle, that is, a circle having that radius,
along which the drumhead remains at rest at all points.




Example  13-7  explores  the  properties  of  the  third  standing-wave 
mode  of  the  vibrating  drumhead. 


EXAMPLE  13-7 

Run  the  vibrating  drumhead  program  with  the  following  set  of  initial  conditions 
and  parameters: 

[g(u)]₁ = 0; [dg(u)/du]₁ = −1; u = 1; Δu = −0.01; α = 75.9.

■ The next value of α which satisfies the central boundary condition
[dg(u)/du]₀ = 0 is

α = 75.9 (13-45)

The corresponding values of g(u) are plotted versus u in Fig. 13-22. Here there are
two concentric ring-shaped regions surrounding the center region, and at any
instant the sign of the displacement of the drumhead alternates in going from the
center to the first ring and then to the second. The appearance of the standing wave
is indicated by the drawing in Fig. 13-17c. There are now two nodal circles in
addition to the fixed rim, located at u ≈ 0.28 and u ≈ 0.64.

The maximum displacement is smallest for the outer ring-shaped region,
larger for the inner ring-shaped region, and largest for the central region. By
considering the way the drumhead is supported, you should be able to give a physical
explanation for this mathematical result.
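The searches in Examples 13-6 and 13-7 can be sketched the same way, with the nodal circles read off from the sign changes of g(u). As before, this is a hedged illustration and not the text's program: the Runge-Kutta step, the bisection brackets (25 to 35 and 70 to 80, assumed to come from a coarse search), and the names `profile`, `find_alpha`, and `nodal_circles` are all assumptions.

```python
def profile(alpha, slope_at_rim, du=-0.01, u_stop=0.01):
    """Integrate Eq. (13-41) inward from u = 1; return (us, gs, final slope)."""
    def rhs(u, g, p):                    # (dg/du, d2g/du2)
        return p, -alpha * g - p / u

    n_steps = round((u_stop - 1.0) / du)
    g, p = 0.0, slope_at_rim
    us, gs = [1.0], [0.0]
    for i in range(n_steps):
        u = 1.0 + i * du
        a1, b1 = rhs(u, g, p)
        a2, b2 = rhs(u + du / 2, g + du / 2 * a1, p + du / 2 * b1)
        a3, b3 = rhs(u + du / 2, g + du / 2 * a2, p + du / 2 * b2)
        a4, b4 = rhs(u + du, g + du * a3, p + du * b3)
        g += du * (a1 + 2 * a2 + 2 * a3 + a4) / 6
        p += du * (b1 + 2 * b2 + 2 * b3 + b4) / 6
        us.append(1.0 + (i + 1) * du)
        gs.append(g)
    return us, gs, p


def find_alpha(lo, hi, slope_at_rim, tol=1e-6):
    """Bisect on alpha until the slope nearest the center changes sign."""
    f_lo = profile(lo, slope_at_rim)[2]
    if f_lo * profile(hi, slope_at_rim)[2] > 0:
        raise ValueError("bracket does not straddle a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = profile(mid, slope_at_rim)[2]
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def nodal_circles(us, gs):
    """Radii u, excluding the rim, where g changes sign (linear interpolation)."""
    nodes = []
    for i in range(len(us) - 1):
        g0, g1 = gs[i], gs[i + 1]
        if g0 * g1 < 0:
            nodes.append(us[i] - g0 * (us[i + 1] - us[i]) / (g1 - g0))
    return sorted(nodes)


alpha_2 = find_alpha(25.0, 35.0, slope_at_rim=+1.0)
alpha_3 = find_alpha(70.0, 80.0, slope_at_rim=-1.0)
nodes_2 = nodal_circles(*profile(alpha_2, +1.0)[:2])
nodes_3 = nodal_circles(*profile(alpha_3, -1.0)[:2])
print(f"second mode: alpha = {alpha_2:.2f}, nodal circles at u = {nodes_2}")
print(f"third mode:  alpha = {alpha_3:.2f}, nodal circles at u = {nodes_3}")
```

The sign of the slope at the rim only flips the whole profile, so it affects neither the values of α nor the radii of the nodal circles. The values of α found this way may differ in the last digit or two from Eqs. (13-44) and (13-45) because the integrator is not the one used in the text.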


Fig. 13-22 Illustration for Example 13-7. The wave function g(u) is plotted versus u for
α = 75.9, the third value of α for which the boundary condition [dg(u)/du]₀ = 0 is satisfied.
Compare with Fig. 13-17c.




[Fig. 13-23 panels: (a) line of nodes, nodal circle, ν = 1.59ν₁; (b) lines of nodes, nodal circle]

Fig. 13-23 “Snapshots” of a circular
drumhead vibrating in two standing-wave
modes which do not possess circular
symmetry. The frequency ν of each
mode is expressed in terms of ν₁, the
frequency of the fundamental mode
shown in Fig. 13-17a. (a) Here the
drumhead vibrates symmetrically about
a motionless line of nodes which is a
diameter of the drumhead. The parts
of the drumhead to the left and the
right of the line of nodes have at any
instant opposite displacements from the
undisturbed position and oppositely
directed velocities. (b) Here there are
two lines of nodes which are
perpendicular to each other. They divide
the drumhead into a “four-leafed
shamrock.” At any instant, the “leaves” have
displacements of alternate sign and
move in alternate directions. In both
of the modes shown in this figure, the
center of the drumhead is a nodal point
because it lies on at least one line of
nodes.


In Examples 13-5 through 13-7 we have determined the frequencies
(or at least the ratios of the frequencies) of the first three circularly
symmetrical standing waves in a drumhead. According to Eq. (13-40), the
frequency ν is

$$\nu = \frac{1}{2\pi a}\sqrt{\frac{F}{\mu}}\,\sqrt{\alpha} \qquad \text{(13-46)}$$

The quantity (1/2πa)√(F/μ) is determined by the physical properties of
the drumhead: its radius a, tension F, and density μ. So there is a certain
fixed value of this quantity for a particular drumhead. Hence the value of ν
for each standing wave on a particular drumhead is proportional to its
value of √α. Taking the values of α from Eqs. (13-43) through (13-45), we
have

$$\nu \propto \sqrt{5.825},\ \sqrt{30.85},\ \sqrt{75.9},\ \ldots$$

or

$$\nu \propto 2.41,\ 5.55,\ 8.71,\ \ldots$$

The lowest is the fundamental frequency for the drumhead. It is the
frequency for the standing wave, illustrated in Fig. 13-17a, that has the
simplest shape. If we write the fundamental frequency as ν₁, our results can be
expressed as

$$\nu = \nu_1,\ \frac{5.55}{2.41}\,\nu_1,\ \frac{8.71}{2.41}\,\nu_1,\ \ldots$$

or

$$\nu = \nu_1,\ 2.30\,\nu_1,\ 3.61\,\nu_1,\ \ldots \qquad \text{(13-47)}$$

In  contrast  to  the  situation  for  standing  waves  on  a string,  the  frequencies 
of  successively  more  complicated  drumhead  standing  waves  of  circular 
symmetry  are  not  equal  to  the  fundamental  frequency  multiplied  by  the 
successive  integers.  That  is,  the  standing-wave  modes  are  not  harmonics. 
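The arithmetic leading to Eq. (13-47) can be checked in a few lines. The sketch below takes the three values of α from Eqs. (13-43) through (13-45); the variable names are illustrative only.

```python
import math

# Values of alpha from Eqs. (13-43) through (13-45).
alphas = [5.825, 30.85, 75.9]

# The frequency of each mode is proportional to the square root of alpha.
roots = [math.sqrt(alpha) for alpha in alphas]

# Frequencies in units of the fundamental frequency.
ratios = [root / roots[0] for root in roots]

for mode, (root, ratio) in enumerate(zip(roots, ratios), start=1):
    print(f"mode {mode}: sqrt(alpha) = {root:.2f}, nu/nu_1 = {ratio:.2f}")
```

For a string the corresponding ratios would be the integers 1, 2, 3; here they are not, which is the point of the paragraph above.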

This  is  also  true  for  the  standing  waves  we  have  been  ignoring — those 
in  which  the  displacement  of  the  drumhead  at  any  instant  does  not  have 
symmetry  about  its  center.  There  are  many  such  standing  waves,  each  with 
its  own  characteristic  frequency.  Their  mathematical  treatment  is  beyond 
the level of this book. We indicate in Fig. 13-23 only the general character
of two of them by showing the shape of each standing wave and its
frequency in units of ν₁. These two are the lowest-frequency standing waves
on a drumhead that do not have circular symmetry. In Sec. 13-8, you will
see  that  there  is  a direct  connection  between  the  nonintegral  relation 
among  the  frequencies  of  various  standing-wave  modes  of  a drumhead 
and  the  “nonharmonious”  sound  of  a drum. 

You  may  be  interested  to  know  that  Eq.  (13-35)  is  called  Bessel’s  equation  of 
order  zero.  It  is  a very  important  differential  equation  because  it  arises  in  many 
different  fields  of  mathematical  physics.  Solutions  to  it  can  be  obtained,  with 
some  difficulty,  by  analytical  methods.  The  solutions,  such  as  the  ones  plotted  in 
Fig. 13-17, are called Bessel functions of order zero. They are evidently
oscillatory, but not sinusoidal. Nor are they exponentially diminishing sinusoids of the
sort encountered in Chap. 6.
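As a cross-check on the numerical results, the following sketch evaluates the Bessel function of order zero, J₀, from its standard power series and locates its first three zeros by bisection. With the substitution x = √α u, Eq. (13-41) takes the standard form of Bessel's equation of order zero, so the rim condition g(1) = 0 requires J₀(√α) = 0, and the squares of the zeros can be compared with Eqs. (13-43) through (13-45). The function names and the brackets [2, 3], [5, 6], [8, 9] are assumptions made for this illustration.

```python
def bessel_j0(x, terms=40):
    """J0(x) from its power series: sum of (-1)^m (x/2)^(2m) / (m!)^2."""
    total, term = 0.0, 1.0
    for m in range(terms):
        total += term
        term *= -(x * x / 4.0) / ((m + 1) ** 2)   # ratio of successive terms
    return total


def zero_by_bisection(f, lo, hi, tol=1e-10):
    """Locate a zero of f between lo and hi, where f changes sign."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("f does not change sign on the interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


zeros = [zero_by_bisection(bessel_j0, lo, hi)
         for lo, hi in ((2.0, 3.0), (5.0, 6.0), (8.0, 9.0))]
text_alphas = [5.825, 30.85, 75.9]        # Eqs. (13-43) through (13-45)

for z, alpha in zip(zeros, text_alphas):
    print(f"zero of J0 = {z:.4f}, zero squared = {z * z:.3f}, "
          f"alpha from the text = {alpha}")
```

The squares of the zeros agree with the values of α found numerically in the examples to within a few percent, which is the level of agreement one might expect from a calculation with a finite step Δu.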

So  far  we  have  dealt  with  the  vibrations  of  drumheads  only  in  a form 
reduced  to  dimensionless  parameters.  In  Example  13-8  these  results  are 




applied  to  a specific  drumhead  having  realistic  size,  tension,  and  mass  per 
unit  area. 


EXAMPLE  13-8 


Find the fundamental frequency for a drumhead of radius a = 20.0 cm, areal density
μ = 500 g/m², and tension F = 1000 N/m.

■ Using Eq. (13-46), with α = 5.825, you have

$$\nu = \frac{1}{2\pi a}\sqrt{\frac{F}{\mu}}\,\sqrt{\alpha} = \frac{1}{2\pi \times 0.200\ \text{m}}\sqrt{\frac{1000\ \text{N/m} \times 5.825}{0.500\ \text{kg/m}^2}} = 85.9\ \text{Hz}$$

13-6 ACOUSTICS

Sound waves in air and other media are quite typical of the longitudinal
waves discussed in Sec. 12-6. They are usually small enough in amplitude,
and the resistance to deformation of the propagation medium is usually
linear enough, that is, obeys Hooke’s law closely enough, that the
superposition principle applies quite accurately. (There are notable exceptions,
such as the crack of a whip and the sonic boom of high-speed airplanes.)
However, the word “sound” used in the everyday sense embraces much
more than the propagation of longitudinal waves through the air.

First, there must be some sort of vibrating source, and these are of
many varieties. Second, there must be some way of coupling the vibrating
source to the propagation medium. Third, there must be propagation
through the medium. Except in unusual cases, the propagation process is
strongly affected by reflection, bending of path, and absorption at
intervening or nearby surfaces, including the ear canal itself. Fourth, the
oscillating air sets the eardrum into vibration. Fifth, the eardrum sets into
vibration the very complex solid and liquid mechanical system comprising the
middle and inner ear. This system is very far from obeying Hooke’s law for
sounds of commonly encountered loudness and thus introduces
complicated modifications of its own. Sixth, the mechanical vibration is translated
into electric nerve impulses by an array of an enormous number of
so-called hair cells whose detailed structure varies subtly from point to point.
Seventh, the nerve impulses are organized and interpreted by the auditory
nerve and the brain.

The  first  through  the  fourth  of  these  processes  lie  in  the  domain  of 
physical  acoustics,  to  which  we  devote  most  of  our  attention  in  this  section. 
The  fourth  through  the  seventh  are  the  domain  of  physiological  acoustics, 
and  the  seventh  also  falls  into  the  domain  of  psychological  acoustics.  The 
remainder  of  this  section  provides  an  introduction  to  the  physical  bases  of 
the  acoustical  impressions  which  we  call  loudness,  pitch,  and  quality  or  timbre. 

The ear, like most of the other sense organs, must respond in a useful
manner to a tremendous range of stimulus intensity. To this task it is
admirably adapted. The ratio of the energy flux in the loudest sound the ear
can handle to that of the faintest sound it can detect is something like 10¹³
at frequencies of approximately 1 kHz. The normal ear is most sensitive at
about 3.5 kHz, where the minimum audible energy flux is of order of
magnitude 10⁻¹² W/m². This flux involves a pressure fluctuation of the order of
10⁻¹⁰ times atmospheric pressure, and the oscillation amplitude of the air is




something like 10⁻¹² m, or about 1 percent of the diameter of a typical
atom. (Individual atoms in the air move about randomly very much more
than this, as you will see in Chap. 17. But this is the average displacement
of the air taken as a whole, as it oscillates against the eardrum.)

In order for the enormous range of stimulus intensity to which the ear
can respond to be compressed into a manageable range of perceived
loudness, the sense of hearing, like that of sight and of touch, is highly
compressed. That is, it takes much more than a doubling of the sound
energy flux S to make a listener render a subjective judgment that a sound has
become “twice as loud.” And a redoubling of the subjective loudness
requires a still greater increase in S. In fact the sensation of loudness is
related roughly logarithmically to the energy flux incident on the ear. We can
exploit this physiological-psychological fact to define a rough but
convenient scale of acoustic intensity.

We call the energy flux associated with the faintest audible sound, S₀,
the threshold of hearing. The acoustic intensity α of a sound whose
energy flux is S is defined to be

$$\alpha = 10\log\frac{S}{S_0} \qquad \text{(13-48a)}$$

This equation says that a 10-fold increase in energy flux above $S_0$ raises
$\alpha$ from 0 to 10. But a 100-fold increase in energy flux above $S_0$
raises $\alpha$ only to 20, and a 1000-fold increase raises it only to 30.
Each further multiplication of the flux by 10 thus adds just 10 to $\alpha$.
You can see how this leads to the desired “compression” of the scale of
$\alpha$, which corresponds to what is perceived by the listener. The
threshold of hearing varies substantially from individual to individual and
rises significantly with increasing age. Nevertheless, it has been agreed by
international convention to set

$$S_0 = 1 \times 10^{-12}\ \text{W/m}^2 \tag{13-48b}$$

The acoustic intensity $\alpha$ is evidently a dimensionless number;
nevertheless, $\alpha$ is expressed in a unit called the decibel (dB) after
Alexander Graham Bell (1847-1922), the Scottish-American teacher of the
deaf, phonologist, acoustician, inventor, and painter. The decibel is of
convenient size, since 1 dB represents, in the middle range of frequency and
loudness, approximately the minimum change in loudness of a sound which is
perceptible.
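
The definition in Eqs. (13-48a) and (13-48b) can be tried out numerically. The following is a minimal sketch, assuming that "log" means the base-10 logarithm; the function name acoustic_intensity_db and the constant name S0 are illustrative choices, not notation from the text.

```python
import math

# Conventional threshold of hearing, Eq. (13-48b), in W/m^2.
S0 = 1e-12


def acoustic_intensity_db(flux, reference=S0):
    """Acoustic intensity in dB for an energy flux in W/m^2, per Eq. (13-48a).

    Assumes the logarithm in Eq. (13-48a) is to base 10.
    """
    if flux <= 0:
        raise ValueError("energy flux must be positive")
    return 10.0 * math.log10(flux / reference)


# Each 10-fold increase in flux adds 10 dB rather than multiplying the level.
for factor in (1, 10, 100, 1000):
    print(factor, round(acoustic_intensity_db(factor * S0), 6))
```

Running the loop shows the compression described above: the flux grows by a factor of 1000 while the acoustic intensity only climbs from 0 dB to 30 dB.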

Table 13-1 lists the acoustic intensity in decibels of a number of familiar
acoustic environments.


Table 13-1  Typical Acoustic Intensities

Acoustic environment                         Acoustic intensity $\alpha$ (in dB)

Threshold of hearing                         0 (by definition)
Intelligible whisper in quiet surroundings   20
Quiet street                                 40
Interior of jet plane in flight              50
Conversation                                 60-65
Busy downtown street                         60-70
Heavy truck from side of road                90
Jackhammer at 5 m distance                   95-105
Rock band in closed room                     90 to >120
Threshold of pain                            120




EXAMPLE  13-9 


By  what  factor  must  the  acoustic  energy  flux  increase  if  the  change  is  to  be  barely 
perceptible?  Assume  that  the  initial  acoustic  intensity  is  40  dB  and  the  frequency 
range  lies  near  3 kHz,  where  the  typical  ear  has  normal  response  and  maximum 
sensitivity. 

■ Since the acoustic intensity and the frequency are both roughly in midrange for
the ear, a change of 1 dB in $\alpha$ will be perceptible. Using Eq. (13-48a), calling
the initial and final intensities $\alpha_i$ and $\alpha_f$, and calling the
corresponding energy fluxes $S_i$ and $S_f$, you can therefore write

$$1 = \alpha_f - \alpha_i = 10 \left( \log \frac{S_f}{S_0} - \log \frac{S_i}{S_0} \right) = 10 \log \frac{S_f}{S_i}$$

or

$$\log \frac{S_f}{S_i} = \frac{1}{10}$$

By the definition of the logarithm function, this means that

$$\frac{S_f}{S_i} = 10^{1/10}$$

or

$$\frac{S_f}{S_i} = 1.26$$

That is, the ear can just discern a 26 percent increase in acoustical energy flux.
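
The arithmetic of Example 13-9 can be checked with a few lines of code. This is only a sketch: the helper name flux_ratio_for_db_change is hypothetical, and it simply inverts Eq. (13-48a) for a difference of two acoustic intensities, assuming base-10 logarithms.

```python
def flux_ratio_for_db_change(delta_db):
    """Ratio S_f / S_i of energy fluxes for a change of delta_db decibels.

    Follows from alpha_f - alpha_i = 10 log10(S_f / S_i).
    """
    return 10.0 ** (delta_db / 10.0)


ratio = flux_ratio_for_db_change(1.0)
print(f"{ratio:.3f}")                              # prints 1.259
print(f"{100.0 * (ratio - 1.0):.0f} percent increase")  # prints 26 percent increase
```

Note that the initial level of 40 dB never enters the calculation; only the difference in acoustic intensity matters.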


Just  as  the  ear  responds  to  the  physical  stimulus  of  acoustic  energy  flux 
with  the  sensation  of  loudness,  it  responds  to  the  physical  stimulus  of  fre- 
quency with  the  sensation  of  pitch.  The  lowest  frequency  audible  as  a sound 
is  somewhere  between  16  Hz  and  25  Hz  for  most  normal  ears.  The  upper 
limit  depends  very  much  on  the  individual  and  particularly  on  age.  Many 
persons  younger  than  16  years  can  hear  well  above  18  kHz  (say,  to  20  kHz 
or  so)  while  few  persons  older  than  45  years  can  hear  much  above  12  kHz. 
(An  interesting  benchmark  is  the  annoying  whistle  produced  by  poorly  de- 
signed television  receivers.  Its  frequency  in  North  America  and  some  other 
parts  of  the  world  is  15,750  Hz.  Can  you  hear  it?)  The  high-frequency 
response  of  the  ear  is  particularly  susceptible  to  damage  from  sustained 
loud  sounds;  rock  musicians  and  persons  who  listen  frequently  to  highly 
amplified  music  are  often  found  to  be  substantially  deaf  at  frequencies 
above  10  kHz  or  so. 

The  sensation  of  pitch  is  also  logarithmic  with  respect  to  the  stimulus  of 
frequency.  That  is,  a repeated  additive  increase  in  the  sensation  of  pitch  by  a 
given  amount  requires  a repeated  increase  of  the  frequency  by  the  same 
multiplicative  factor.  To  give  the  simplest  and  most  familiar  example,  an 
increase  in  pitch  of  one  octave  requires  a doubling  of  the  frequency,  and  an 
increase  in  pitch  of  two  octaves  requires  a quadrupling  of  the  frequency, 
regardless  of  the  value  of  the  initial  frequency.  In  these  terms,  the  ear  has  a 
frequency range close to 10 octaves. This corresponds roughly to a
1000-fold range of frequency (about 20 Hz to about 20 kHz).
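
Since each octave is a doubling of frequency, the number of octaves between two frequencies is the base-2 logarithm of their ratio. The short sketch below uses the hypothetical helper octaves_between to check the "close to 10 octaves" figure.

```python
import math


def octaves_between(f_low, f_high):
    """Number of octaves (doublings of frequency) from f_low up to f_high."""
    return math.log2(f_high / f_low)


print(round(octaves_between(20.0, 20000.0), 2))  # prints 9.97
```

The same function returns 1 for any pair of frequencies in the ratio 1 : 2, whatever the starting frequency, which is the sense in which pitch is logarithmic in frequency.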

Besides its sensitivity to loudness and pitch, the ear possesses great
sensitivity to quality, or timbre, as it is called in connection with musical
instruments. It is this sensitivity which makes it possible for you to
distinguish among the voices of your friends, even when they have been
substantially distorted by the telephone. With a little practice you can
distinguish not only among the various instruments in an orchestra, but also
between two instruments of the same kind. That is one reason why musicians
will go to great trouble and expense to obtain a particular violin, or
guitar, or clarinet.

Fig. 13-24  Space-independent wave functions for the same musical note
played on (a) a flute, (b) a clarinet, (c) an oboe, and (d) a saxophone. (After D.
C. Miller, Sound Waves, Their Shape and Speed, Macmillan, New York, 1937.)

What  is  the  physical  stimulus  corresponding  to  the  subjective  sensation 
of  quality  or  timbre?  The  answer  begins  with  Fig.  13-24,  which  depicts  the 
space-independent  wave  function  of  a particular  note  (which  happens  to  be 
middle  C)  played  on  several  different  musical  instruments.  A little  study 
will  convince  you  that  the  periods  of  the  waves  are  all  the  same,  but  their 
shapes  vary  considerably.  This  consideration  leads  us  directly  to  Fourier 
synthesis,  which  is  the  topic  of  Sec.  13-7. 


13-7  FOURIER SYNTHESIS

It seems reasonable to ask: Can a wave of any shape whatsoever be
synthesized (that is, built up) by the proper superposition of simple
sinusoidal waves? It was proved in 1822 by the French mathematician Joseph B.
Fourier (1768-1830) that the answer is Yes. Fourier’s theorem can be
stated on several levels of generality, two of which concern us here:


1. Any perfectly periodic function $f(t)$ (that is, any function which
repeats itself identically and indefinitely) whose first derivative is well
defined everywhere (that is, the function has no sharp corners) can be
represented by a finite series of the form

$$f(t) = A_1 \sin[2\pi(\nu)t + \delta_1] + A_2 \sin[2\pi(2\nu)t + \delta_2] + A_3 \sin[2\pi(3\nu)t + \delta_3] + \cdots + A_n \sin[2\pi(n\nu)t + \delta_n] \tag{13-49a}$$


Expressed in more compact notation, this is identical to

$$f(t) = \sum_{j=1}^{n} A_j \sin[2\pi(j\nu)t + \delta_j] \tag{13-49b}$$

Written in either of these two ways, the series is called the Fourier
expansion for a space-independent wave function $f(t)$. If this function
represents the time dependence of the pressure oscillations in a sound wave
arriving at the fixed position of the ear of a stationary listener, then it
describes the situation most often analyzed in acoustics. However, the
theorem is equally valid no matter what physical phenomenon is described by
the mathematical function $f(t)$.

Each term in the summation of Eqs. (13-49a) and (13-49b) is called a
Fourier component of the function $f(t)$. We have written these terms in
such a way that they depend on the frequency $j\nu$ of the sinusoidal function
giving the time dependence of the Fourier component, rather than on its
angular frequency or its period. The reason is that it is customary to deal
with frequencies in acoustics. In Eqs. (13-49a) and (13-49b), the amplitudes
$A_j$ of each of the components must be chosen properly to synthesize the
desired function $f(t)$. That is, proper “amounts” of the various components
must be used. Furthermore, the components must have the proper phases
$\delta_j$. But what is perhaps surprising is that the only components
necessary have frequencies which are integral multiples of the fundamental
frequency $\nu$.
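
To make the synthesis of Eq. (13-49b) concrete, here is a minimal sketch that evaluates the series for given amplitudes and phases. The function name fourier_synthesis and the sample amplitudes and phases are made up for illustration; only the formula itself comes from the text.

```python
import math


def fourier_synthesis(t, nu, amplitudes, phases):
    """Evaluate f(t) = sum over j of A_j sin[2 pi (j nu) t + delta_j].

    amplitudes[0] and phases[0] belong to the fundamental (j = 1),
    amplitudes[1] and phases[1] to the second harmonic, and so on.
    """
    return sum(
        a * math.sin(2.0 * math.pi * j * nu * t + d)
        for j, (a, d) in enumerate(zip(amplitudes, phases), start=1)
    )


# Illustrative values only: a fundamental at middle C plus two harmonics.
nu = 261.6                 # Hz
amps = [1.0, 0.4, 0.2]     # arbitrary amounts of each component
phases = [0.0, 0.5, 1.0]   # arbitrary phase constants, in radians
period = 1.0 / nu

t = 0.0012
print(fourier_synthesis(t, nu, amps, phases))
print(fourier_synthesis(t + period, nu, amps, phases))  # same value one period later
```

Because every component has a frequency that is an integral multiple of the fundamental, the sum repeats exactly after one fundamental period, whatever amplitudes and phases are chosen.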

2. Even if a perfectly periodic function $f(t)$ does not have a well-defined
first derivative everywhere (that is, if it has sharp corners like the
“sawtooth” function illustrated in Fig. 13-25a), it can still be represented
as a sum of the form given in Eq. (13-49a), but with an infinite number of
terms of frequency $\nu$, $2\nu$, $3\nu$, . . . . For comparison, Fig. 13-25b
represents the sum of 20 terms of the form

$$f(t) = 1 \sin[2\pi(\nu)t] + \tfrac{1}{2} \sin[2\pi(2\nu)t] + \tfrac{1}{3} \sin[2\pi(3\nu)t] + \cdots = \sum_{j=1}^{20} \frac{1}{j} \sin[2\pi(j\nu)t]$$

It does not take a great stretch of the imagination to believe (and it can be
rigorously proved) that Fig. 13-25a represents the sum of the infinite
series

$$f(t) = \sum_{j=1}^{\infty} \frac{1}{j} \sin[2\pi(j\nu)t] \tag{13-50}$$

Fig. 13-25  (a) An idealized “sawtooth” function.

In this particular case the amplitude coefficients $A_j$ are given by
$A_j = 1/j$. Consequently, the value of $A_j$ decreases fairly rapidly as $j$
increases. This is generally the case when series of the form of
Eq. (13-49b) represent physical situations. Therefore, it is not usually
necessary for practical purposes to use a very large number of terms in the
summation.
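
The convergence of the sawtooth series of Eq. (13-50) can be explored numerically. The sketch below compares partial sums with the closed form $(\pi/2)(1 - 2\nu t)$ for $0 < \nu t < 1$, which is a standard mathematical result assumed here rather than taken from the text; the function names are likewise illustrative.

```python
import math


def sawtooth_partial_sum(t, nu, n_terms):
    """Sum of the first n_terms of Eq. (13-50), with A_j = 1/j and zero phases."""
    return sum(
        math.sin(2.0 * math.pi * j * nu * t) / j
        for j in range(1, n_terms + 1)
    )


def sawtooth_limit(t, nu):
    """Assumed closed form of the infinite series away from the jumps.

    For 0 < nu*t < 1 the series sums to (pi/2) * (1 - 2*nu*t); the modulo
    operation extends this periodically to other values of t.
    """
    phase = (nu * t) % 1.0
    return 0.5 * math.pi * (1.0 - 2.0 * phase)


nu = 1.0   # Hz, arbitrary fundamental chosen for illustration
t = 0.2    # s, a point away from the sharp corners
for n_terms in (5, 20, 2000):
    error = abs(sawtooth_partial_sum(t, nu, n_terms) - sawtooth_limit(t, nu))
    print(n_terms, error)
```

Away from the corners the 20-term sum is already close to the sawtooth, and the discrepancy keeps shrinking as more terms are added. Near the corners the convergence is noticeably slower.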


The practical question now arises: Given a particular form $f(t)$ of a
space-independent wave function, is it possible to determine the amplitudes
$A_j$ and the phases $\delta_j$ and thus to analyze the wave into its simple
sinusoidal components? It is indeed possible. The process is the inverse of
Fourier synthesis and is called Fourier analysis. Fourier analysis is an
important tool both mathematically and physically. In acoustics, it is usually
accomplished  by  the  use  of  special  instruments  called  Fourier  analyzers  or 
wave  analyzers.  A microphone  picks  up  the  sound  and  converts  it  into  an 
electric  waveform  nearly  identical  to  the  waveform  of  the  sound.  (By  wave- 
form is  meant  the  shape  of  an  actual  space-independent  or  time- 
independent  wave.  A distinction  is  thus  made  between  a waveform  and  a 
wave  function,  which  is  a mathematical  representation,  either  algebraic  or 
graphical,  of  a wave.)  The  electric  waveform  is  then  fed  into  an  array  of  de- 
vices called  filters,  or  their  equivalents,  which  separate  the  various  Fourier 
components  according  to  their  frequencies.  The  individual  amplitudes  of 
the  Fourier  components  are  then  measured  as  voltages  and  displayed  in 
one  of  several  convenient  fashions. 
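
Fourier analysis can also be carried out numerically on sampled data. The following sketch is not the filter-bank method described above; it assumes instead that the waveform has been sampled at N equally spaced instants spanning exactly one fundamental period, and it recovers $A_j$ and $\delta_j$ by averaging the samples against sine and cosine functions. The function name fourier_component and the test signal are hypothetical.

```python
import math


def fourier_component(samples, j):
    """Return (A_j, delta_j) for the j-th harmonic of a sampled waveform.

    samples: N equally spaced values covering exactly one fundamental period.
    Valid for 0 < j < N/2. Uses the decomposition
    A sin(x + delta) = (A cos delta) sin x + (A sin delta) cos x.
    """
    n = len(samples)
    cos_part = 0.0   # accumulates A_j sin(delta_j)
    sin_part = 0.0   # accumulates A_j cos(delta_j)
    for k, value in enumerate(samples):
        theta = 2.0 * math.pi * j * k / n
        cos_part += value * math.cos(theta)
        sin_part += value * math.sin(theta)
    cos_part *= 2.0 / n
    sin_part *= 2.0 / n
    return math.hypot(cos_part, sin_part), math.atan2(cos_part, sin_part)


# Hypothetical test signal: three harmonics with known amplitudes and phases.
true_amps = [1.0, 0.5, 0.25]
true_phases = [0.3, -1.0, 2.0]
n_samples = 64
samples = [
    sum(
        a * math.sin(2.0 * math.pi * j * k / n_samples + d)
        for j, (a, d) in enumerate(zip(true_amps, true_phases), start=1)
    )
    for k in range(n_samples)
]

for j in (1, 2, 3, 4):
    amplitude, phase = fourier_component(samples, j)
    print(j, round(amplitude, 6), round(phase, 6))
```

Plotting the recovered amplitudes against the harmonic frequencies would give a discrete spectrum for this synthetic signal; the fourth harmonic, which was not put in, comes out with essentially zero amplitude.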

The  most  useful  form  of  display  is  usually  the  Fourier  spectrum, 
which  is  simply  a graphical  plot  of  amplitude  as  a function  of  frequency. 
Figure  13-26  depicts  the  Fourier  spectra  of  typical  notes  played  on  a variety 
of  musical  instruments.  Such  a Fourier  spectrum,  in  which  only  specific, 
sharply  defined  frequencies  contribute  to  the  complete  wave,  is  called  a dis- 
crete spectrum,  or  line  spectrum. 

Of  course,  nothing  in  the  real  world  is  infinitely  sharp,  and  the  spectra 
of  Fig.  13-26  are  idealized.  Figure  13-27  depicts  Fourier  spectra  actually 
obtained  from  a flute  and  a bassoon,  playing  in  various  parts  of  their 
ranges,  by  a specialized  wave  analyzer  called  a sound  spectrograph.  In  each 
case,  the  corresponding  space-independent  waveform  also  is  shown. 




Fig. 13-26  Typical Fourier spectra of some common musical instruments.
The Fourier spectrum of an instrument varies considerably over its range of
pitch and depends to some degree on the loudness with which it is played and
other factors. Nevertheless, the characteristic quality of the instrument is
still clearly discernible. The reason for this is especially evident for the
clarinet, where the odd harmonics (j = 1, 3, 5, . . .) are considerably
stronger than the even harmonics (j = 2, 4, 6, . . .).
[Recoverable panel titles: French horn (second harmonic = 1); Piano, played
medium loud; Violin, open g-string (second harmonic = 1). Horizontal axis:
mode number.]


Fourier analysis can be applied to time-independent wave functions as
well. Consider, for example, the vibrating string shown in the photo of Fig.
13-28. The shape of the string at any instant is quite complex. But it can be
represented by a Fourier series of the form

$$f(x) = \sum_{j=1} A_j \sin\left[ 2\pi \left( \frac{j}{\lambda} \right) x + \delta_j \right]$$

This form is very similar to Eq. (13-49b). However, the independent
variable is $x$ rather than $t$. And the coefficient of $x$, that is
$2\pi(j/\lambda)$, is the wave number, which depends on the fundamental
wavelength $\lambda$ corresponding to the fundamental frequency $\nu$ in
Eq. (13-49a). In contrast, the quantity $2\pi(j\nu)$ which multiplies $t$ in
Eq. (13-49b) depends on the frequency $\nu$ and is the angular frequency.


For the particular case of the string in Fig. 13-28, the phase constants
$\delta_j$ must all be equal to zero if the origin is chosen at the left end
of the string. This restriction is imposed by the boundary conditions. Can
you explain how?

A physical  interpretation  of  the  Fourier  series  describing  the  shape  of 
the  vibrating  string  is  as  follows.  The  string  may  be  regarded  as  vibrating  in 
several  of  its  harmonic  modes  at  the  same  time.  Such  vibration  is  the  rule 
rather  than  the  exception  for  stringed  musical  instruments.  The  modes  are 
superposed.  You  may  imagine  the  higher-j  modes  as  “riding”  on  the 
lower-j  modes,  each  with  its  own  particular  amplitude  Aj.  It  is  the  superpo- 
sition of  varying  proportions  of  harmonic  modes,  or  harmonics,  which  de- 
termines in  large  measure  the  peculiar  tone  quality,  or  timbre,  of  each  mu- 
sical instrument.  This  point  is  considered  in  further  detail  in  Sec.  13-8. 


[Fig. 13-27 graphics: each panel shows a waveform and its sound spectrum,
with amplitude (0 to 1.0) plotted against frequency (in Hz, 40 Hz to 20 kHz).]


Fig. 13-27  (a) Waveform and sound spectrum of a pure (sinusoidal) tone
generated electronically (500 Hz). (b) Waveform and sound spectrum for a
flute playing near the lower end of its range (A = 440 Hz). The fundamental
is strongest, but higher frequencies are well represented, too, especially
the third and fourth harmonics. (c) Flute playing in the middle of its
range (D = 1175 Hz). The tone has “cleaned up” to the typically pure quality
we associate with this instrument. The second harmonic has only one-fifth
the amplitude of the fundamental, and no other harmonics can be seen. The
eye can barely distinguish the waveform from a pure sinusoidal. (d) Bassoon
playing at the very bottom of its range (B-flat = 58 Hz). Note the complete
absence of the fundamental and the great weakness of the lower harmonics.
The maximum amplitude is in the twelfth harmonic. (e) Bassoon playing in the
middle of its range (G = 196 Hz). The fundamental is now visible, but is
weaker than five of the harmonics, especially the second. (f) Bassoon near
the top of its range (A = 440 Hz). This is the same note being played by the
flute in part b. The fundamental is now stronger, but still does not
dominate the waveform. This sound is much smoother than the lower notes, but
still quite characteristic of the bassoon.


Fig.  13-28  Time-exposure  photograph  of  a string  vibrating  in  a complex  fashion.  The 
complicated  shape  is  due  to  the  simultaneous  presence  of  several  sinusoidal  standing-wave 
modes,  each  with  its  own  amplitude.  The  time-independent  wave  function  representing  the 
shape  of  the  string  at  any  moment  can  be  represented  as  a Fourier  series,  as  discussed  in  the 
text.  (From  D.  C.  Miller,  The  Science  of  Musical  Sounds,  New  York,  Macmillan,  1916.) 




13-8  THE PHYSICS OF MUSIC

We can now return to the acoustical question, asked in Sec. 13-6, which
launched the discussion of Fourier analysis and synthesis in Sec. 13-7:
What is the physical stimulus corresponding to the subjective sensation
called quality or timbre? It is very largely the Fourier spectrum of the
emitted sound which determines the timbre of a musical instrument, or the
quality of a sound in general. The ear (or rather the auditory sensory
apparatus taken as a whole) is a quite sensitive, although nonquantitative,
Fourier analyzer as far as amplitudes are concerned. (However, the ear is
not at all sensitive to phase. Changing the values of the phase constants
$\delta_j$ in the Fourier components making up a sound wave causes no change
at all in the perceived quality of the sound.)

As  far  as  the  timbre  of  musical  instruments  is  concerned,  the  Fourier 
spectrum  of  a more  or  less  sustained  note  is  not  the  whole  story,  because 
musical  instruments  do  not  usually  play  sustained  notes.  The  attack  and 
decay  are  also  of  great  importance.  The  first  term  refers  to  the  brief  time 
during  which  the  musical  note  “starts  up,”  while  the  second  refers  to  the 
time  during  which  it  “dies  away.”  Not  only  does  the  amplitude  of  the  total 
wave  change  during  these  intervals,  but  also  the  ratios  of  the  Fourier  com- 
ponents change  in  a way  which  contributes  to  the  characteristic  timbre  of 
the  instrument.  In  a piano,  for  instance,  the  note  begins  with  a sudden 
hammer  blow  to  the  string  (the  attack)  and  dies  away  gradually  (the  decay). 
An  organ  pipe,  on  the  contrary,  takes  a short  but  noticeable  time  to  build 
up  to  a steady  state.  (Indeed,  a tape  recording  of  a piano  played  backward 
sounds  vaguely  like  an  organ.)  The  relatively  highly  damped  violin  string 
responds  very  sensitively  to  subtle  variations  in  bowing  technique.  The 
singer  imposes  a vibrato  (a  slight  oscillation  of  pitch  at  a frequency  of 
approximately  5 Hz)  on  the  sung  note,  either  consciously  or  unconsciously. 
The  violinist  does  the  same  by  rocking  the  fingers  on  the  fingerboard. 

Most  sources  of  sound  do  not  produce  a single  frequency  or  a group  of 
discrete  frequencies  in  the  way  that  most  musical  instruments  do.  Rather, 
there  is  sound  energy  at  all  frequencies  within  some  range,  although  the 
amplitude  may  vary  in  a very  complex  way  as  a function  of  frequency.  The 
human speaking voice is an example of such a source. There may well be a
sense of generally “high” or “low” pitch (everyone with normal hearing
can distinguish easily between normal female and male voices), but the
sense of pitch is not specific. In such cases the Fourier spectrum is not
discrete but continuous.

Discrete spectra are relatively easy to synthesize, and artificial musical
instruments that do this have existed for some time. The electronic organ is
the best known of these. It operates by reproducing the Fourier spectra
characteristic of real organ pipes, using other sources which are cheaper
and more compact. As a practical matter, all but the most expensive
electronic organs produce a recognizable but very poor imitation of real pipe
organs. Even the best electronic organs can be distinguished from the
genuine article by a not very practiced ear. In part this has to do with the
difficulty of reproducing the very large number of harmonics which contribute
to the quality of an organ pipe. Sometimes there are more than 30 of these,
while few electronic organs use more than 7 or 8 Fourier components in
synthesizing their notes.

Usually, however, it is the attack and decay pattern characteristic of the
original instrument that is most difficult to imitate with an electronic
instrument. The problem can be solved by the expedient of using a mechanical
system more or less similar to that in the parent instrument to originate
the tone. This approach is obvious, and very successful, in the electric
guitar.

The most successful electronic instruments are not intended to imitate
the parent instrument as closely as possible. The best musical results are
attained when an instrument is accepted on its own terms and a literature is
devised for it. Such a trend is clearly discernible in the case of the electric
guitar. When the instrument was first designed, about fifty years ago, the
idea was to build something that would sound as much like a guitar as
possible, but louder. Nowadays a vast variety of sound qualities (that is, a
vast variety of Fourier spectra) are available. Except for the attack-decay
characteristics and the technique of playing, many of these sounds bear little
resemblance to those of the acoustic guitar.

A still  more  striking  example  of  the  trend  away  from  imitation  of  tradi- 
tional instruments  is  the  synthesizer,  which  has  no  parent  instrument  at  all. 
In  the  synthesizer,  a wide  variety  of  electronic  tone  generators  are  con- 
trolled by  a standard  keyboard.  The  musician  also  has  very  flexible  control 
over  the  way  in  which  the  individual  waveforms  are  superposed  to  produce 
the  output  waveform,  as  well  as  on  the  attack-decay  pattern.  In  effect,  the 
musician  makes  whatever  Fourier  synthesis  best  suits  the  music  to  be 
played.  The  nearest  parallel  among  traditional  instruments  is  the  large 
pipe  organ,  in  which  various  banks  of  pipes,  called  stops,  can  be  turned  on 
and  off.  However,  the  synthesizer  has  considerably  greater  flexibility.  Its 
waveforms  are  not  restricted  to  those  typical  of  pipes,  and  the  amplitudes 
of  the  individual  component  waveforms  can  be  adjusted  separately.  Unfor- 
tunately, there  is  as  yet  little  or  no  memorable  literature  written  for  the  syn- 
thesizer. 

What is it that underlies the “musicality” of musical instruments?
Stringed instruments like the violin, for example, not only can play
complex, sustained melodies, but also can be combined harmoniously to
produce rich, pleasing musical textures. Percussion instruments such as the
cymbals and most drums, on the other hand, are restricted to a
“tzing-boom” which serves mainly to accentuate the music played by other
instruments. And why is it that certain combinations of musical notes sound
harmonious while others sound harsh?

The  answer  to  these  questions  lies  within  the  scope  of  the  partially 
physical,  partially  physiological  and  psychological  theory  of  harmony.  The 
groundwork  for  this  theory,  of  which  we  will  consider  mainly  the  physical 
aspects,  was  laid  by  the  German  physicist  and  physiologist  Hermann  von 
Helmholtz  (1821-1894). 

All  harmony  is  founded  on  the  fact  that  two  or  more  musical  notes 
sounded  together  are  perceived  as  consonant  (that  is,  pleasant-sounding  or 
harmonious)  or  dissonant  (that  is,  harsh  or  disharmonious).  To  some  ex- 
tent, this  judgment  is  subjective  and  learned.  But  there  can  be  no  doubt 
that  it  has  an  objective  basis.  All  listeners  agree,  for  example,  that  two  notes 
whose  frequency  ratio  is  1 : 2 sound  consonant  when  played  together.  Be- 
cause of  the  logarithmic  response  of  the  ear  to  frequency,  combinations  of 
two  notes  sound  similar,  though  not  identical,  to  other  combinations  of  two 
notes  having  the  same  frequency  ratio.  Musicians  call  the  frequency  ratio  of 
two  notes  an  interval.  The  interval  with  the  1 : 2 ratio,  called  the  octave,  is 
the  simplest  and  most  important. 




An  example  of  an  octave  is  the  combination  of  the  note  called  middle  C, 
whose  frequency  is  normally  261.6  Hz,  with  the  next-higher  note  given  the  name 
C,  whose  frequency  is  2 x 261.6  Hz  = 523.2  Hz. 

On  the  other  hand,  all  listeners  agree  that  two  notes  whose  frequency 
ratio  is  1 : 1.06 — an  interval  which  musicians  call  a semitone  — sound  highly 
dissonant  when  played  together. 

An  example  of  a semitone  is  the  combination  of  middle  C with  the  adjacent 
note,  called  C-sharp,  whose  frequency  is  277.4  Hz. 

The judgment of consonance and dissonance is ultimately based on
the way the human ear hears beats, which will now be described. The first
curve in Fig. 13-29 depicts a space-independent wave function
$f_1(t) = \sin(2\pi\nu t)$, where $\nu$ is an arbitrary frequency. The second
curve is the function $f_2(t) = \sin[2\pi(1.1\nu)t]$, which differs from the
first only in that its frequency is 10 percent higher.

The  third  curve  is  the  sum  of  the  other  two.  It  thus  represents  the 
sound  wave  which  excites  the  ear  when  the  two  pure  sinusoidal  notes  are 
played  together.  The  salient  feature  of  the  superposition  is  the  slow  waning 
and  waxing  of  the  amplitude,  as  the  two  component  waves  repeatedly  drift 
into  and  out  of  phase  on  account  of  their  slightly  different  frequencies. 
The  variations  in  perceived  loudness  are  called  beats. 

Fig. 13-29  The production of beats by the superposition of two sinusoidals
of equal amplitude having slightly different frequencies. The resultant
waveform is characterized by two frequencies. The higher of the two,
$\nu_+$, is the average of the frequencies of the two sinusoidals and is
perceived as the pitch. The lower of the two frequencies, $\nu_-$, is
one-half the difference of the frequencies of the two sinusoidals. It is the
frequency of the envelope of the resultant wave function. The quantity
$2\nu_-$ is called the beat frequency, as explained in the text.

Let us now consider the phenomenon of beats quantitatively. If two
sinusoidal sound waves of frequency $\nu$ and $\nu + \epsilon$ are heard
together, the superposition principle gives the total wave function as

$$f(t) = \sin(2\pi\nu t) + \sin[2\pi(\nu + \epsilon)t] \tag{13-51}$$

We use the standard trigonometric identity of Eq. (13-13) to rewrite this
expression in the form

$$f(t) = 2 \sin\{\tfrac{1}{2}[2\pi\nu t + 2\pi(\nu + \epsilon)t]\} \cos\{\tfrac{1}{2}[2\pi\nu t - 2\pi(\nu + \epsilon)t]\}$$

or

$$f(t) = 2 \sin\left[ 2\pi \left( \nu + \frac{\epsilon}{2} \right) t \right] \cos\left[ 2\pi \left( \frac{\epsilon}{2} \right) t \right] \tag{13-52}$$


While this equation is correct for any values of $\nu$ and $\epsilon$, we are
interested at present in the case where $\epsilon$ is very much smaller than
$\nu$, so that the two frequencies are very close together. When this is so,
the two sinusoidal components are not heard as separate tones. Rather, the
frequency heard is that of the sine term, $\nu + \epsilon/2$, which is the
average of the frequencies of the two components. If $\epsilon$ is small,
this frequency leads to a perceived pitch indistinguishable from $\nu$ or
$\nu + \epsilon$.

This tone is modulated (that is, its amplitude is varied) by the cosine
term, whose frequency is $\epsilon/2$. But the amplitude depends on the
magnitude, not the sign, of the cosine function. Thus, each time the cosine
passes through a complete cycle, the amplitude goes through two maxima and
two zeros. The loudness of the sound thus varies with a frequency
$2 \times \epsilon/2 = \epsilon$ times per second. If this frequency is low
enough, the modulation produces perceptible pulsations in loudness.
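
The equivalence of the sum form, Eq. (13-51), and the product form, Eq. (13-52), and the fact that the loudness pulsates $\epsilon$ times per second, can both be checked numerically. The sketch below uses illustrative values (a 440-Hz tone and a 2-Hz difference) and hypothetical function names.

```python
import math


def two_tone(t, nu, eps):
    """Sum of two equal-amplitude sinusoids, Eq. (13-51)."""
    return math.sin(2.0 * math.pi * nu * t) + math.sin(2.0 * math.pi * (nu + eps) * t)


def product_form(t, nu, eps):
    """Same wave written as a tone at nu + eps/2 modulated by a cosine, Eq. (13-52)."""
    return (2.0 * math.sin(2.0 * math.pi * (nu + eps / 2.0) * t)
            * math.cos(2.0 * math.pi * (eps / 2.0) * t))


def envelope(t, eps):
    """Magnitude of the slowly varying amplitude factor in Eq. (13-52)."""
    return abs(2.0 * math.cos(2.0 * math.pi * (eps / 2.0) * t))


nu, eps = 440.0, 2.0   # Hz; illustrative values, not taken from the text

# The two forms agree at arbitrary instants.
for t in (0.0, 0.0137, 0.21, 0.5, 1.234):
    assert abs(two_tone(t, nu, eps) - product_form(t, nu, eps)) < 1e-9

# The envelope peaks at t = 0, 0.5 s, 1.0 s, ...: twice per cosine cycle,
# that is, eps = 2 times per second.
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(t, round(envelope(t, eps), 6))
```

The cosine itself has a period of 1 s in this example, but its magnitude reaches a maximum every 0.5 s, which is why the beat frequency is $\epsilon$ rather than $\epsilon/2$.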

This phenomenon is very useful in tuning two strings of a musical instrument
(such as guitar strings or piano strings) to the same frequency. As the two strings
are sounded together with frequencies $\nu$ and $\nu + \epsilon$, the tension in one
of them is varied to make the beat frequency $\epsilon$ smaller and smaller, until
the beats disappear. The strings are then tuned accurately to the same frequency.
They are said to be in unison, from the Latin words meaning “one sound.”


Beat  frequencies  greater  than  about  7 Hz  can  no  longer  be  heard  as 
countable  pulsations  in  loudness,  but  are  detected  by  the  ear  as  a 
“roughness”  or  “harshness”  which  the  listener  interprets  as  a dissonance. 
The  beat  frequency  corresponding  to  the  most  disagreeable  dissonance  de- 
pends on  the  pitch  of  the  note  heard.  At  low  pitch  it  is  about  15  Hz,  in  the 
middle  range  it  is  about  40  Hz,  and  at  high  pitch  it  increases  to  about  250 
Hz.  As  a rough  rule  of  thumb,  any  frequency  ratio  less  than  about 
1:1.2 — that  is,  any  musical  interval  smaller  than  what  musicians  call  a 
minor  third  (for  example,  the  interval  between  the  notes  called  C and  E- 
flat) — is  perceived  as  more  or  less  dissonant. 
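
The rough rules of thumb just quoted can be turned into a crude classifier for a pair of pure tones. This is only a sketch under stated assumptions: the thresholds (about 7 Hz for countable beats, a ratio of about 1 : 1.2 for roughness) are the approximate figures from the text, the function names and category labels are invented for illustration, and only the two fundamentals are considered.

```python
def beat_frequency(f1, f2):
    """Beat frequency, in Hz, of two tones sounded together."""
    return abs(f2 - f1)


def interval_ratio(f1, f2):
    """Frequency ratio of the interval, always expressed as >= 1."""
    return max(f1, f2) / min(f1, f2)


def classify_pair(f1, f2, countable_limit_hz=7.0, rough_ratio=1.2):
    """Label a pair of pure tones using the approximate thresholds in the text."""
    beat = beat_frequency(f1, f2)
    if beat == 0.0:
        return "unison"
    if beat <= countable_limit_hz:
        return "countable beats"
    if interval_ratio(f1, f2) < rough_ratio:
        return "rough"
    return "wider than the rough range"


# Middle C with C-sharp, the semitone example given earlier.
print(round(beat_frequency(261.6, 277.4), 1))   # prints 15.8
print(classify_pair(261.6, 277.4))              # prints rough
# Two strings being tuned toward unison (illustrative frequencies).
print(classify_pair(440.0, 442.0))              # prints countable beats
```

Real musical tones carry harmonics, so a pair of notes that this sketch labels as wider than the rough range may still sound dissonant.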


Not  all  dissonances  are  produced  by  two  notes  very  close  together. 
There  are  two  reasons  for  this.  First,  practically  all  musical  tones  carry  a 
rich  cargo  of  harmonics.  Thus  there  is  the  possibility  of  dissonant  beating 
between  the  fundamental  of  one  of  the  notes  and  one  of  the  harmonics  of 
the  other,  or  between  one  or  more  harmonics  of  one  of  the  notes  and  one 
or  more  of  the  other. 

Second,  there  is  the  intriguing  role  of  the  ear  itself.  The  ear  is  very  far 
from  obeying  Hooke’s  law.  This  is  true  even  for  the  displacements  of  the 
eardrum  and  other  parts  of  the  ear  produced  by  relatively  faint  sounds. 




Fig. 13-30 The production of aural harmonics in the human ear. If a pure tone
(a sinusoid) is to produce perceptible harmonics in the ear, it must have a
minimum acoustic intensity whose value depends on its frequency. In general, the
higher the frequency of the incident tone, the louder it must be to produce aural
harmonics. For instance, the graph shows that a 1000-Hz incident tone must have
an acoustic intensity of 40 dB if the listener is to perceive the second harmonic
at 2000 Hz. But an incident 125-Hz tone at 18 dB will induce the listener to
perceive the second harmonic at 250 Hz. Greater intensities are required to
produce higher aural harmonics. [Graph: threshold acoustic intensity of incident
fundamental (dB) versus frequency of incident fundamental (Hz), with separate
curves for the second, third, fourth, and fifth harmonics.]

The effect becomes rapidly more important as the amplitude of the incident
sound increases. Figure 13-30 shows the intensity level of an incident
pure tone at which the ear begins to perceive aural harmonics, that is,
harmonics which are not present in the incident sound but are generated in the
ear. If you compare this figure with Table 13-1, you will see that aural
harmonics make a substantial contribution to the sound you hear at almost all
commonly encountered acoustic intensity levels.

In order to see how the nonlinearity of the ear leads to the production of aural
harmonics, let us make the simplest possible (and crude) assumption as to the way
the ear deviates from Hooke’s law when its parts vibrate in response to the
pressure oscillations of a sound wave. If a certain part of the ear obeyed Hooke’s law,
there would be a linear relation between its displacement y and the force F exerted
on it by the sound wave. Such a relation can be written y = aF, where a is the
reciprocal of the force constant defined in Chap. 6. But suppose that the ear part
actually obeys the simplest possible nonlinear rule

$$y = aF + bF^2 \tag{13-53}$$


(Note  that  the  displacement  y would  conform  approximately  to  Hooke’s  law  if  the 
displacing  force  F were  small  enough  or  if  the  constant  b were  small  enough, 
which  is  not  the  case  here.) 

Now  suppose  that  a force  is  exerted  on  the  nonlinear  part  as  a result  of  the 
arrival  at  the  ear  of  a purely  sinusoidal  sound  wave,  so  that 


$$F = \sin(2\pi\nu t)$$

(We take the amplitude A = 1 for simplicity.) The displacement of the nonlinear
ear part is then found by substituting this expression for F into Eq. (13-53), which
leads to the displacement

$$y(t) = a\sin(2\pi\nu t) + b\sin^2(2\pi\nu t) \tag{13-54}$$


As far as the first term on the right side of this equation is concerned, the ear part
will oscillate so as to reproduce faithfully the incident frequency ν. But consider
the second term. By using the trigonometric identity sin²θ = ½[1 − cos(2θ)], it can
be written

$$b\sin^2(2\pi\nu t) = \frac{b}{2} - \frac{b}{2}\cos[2\pi(2\nu)t] \tag{13-55}$$

So the displacement y(t) is given by

$$y(t) = \frac{b}{2} + a\sin(2\pi\nu t) - \frac{b}{2}\cos[2\pi(2\nu)t]$$

That is, the function describing the displacement of the nonlinear part of the ear
contains two terms that are Fourier components with frequencies ν and 2ν,
respectively. The frequency 2ν is, so to speak, “manufactured” by the ear. But it is heard
just  as  though  such  a frequency  component  were  present  in  the  incident  sound 
wave. 
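The appearance of the frequency 2ν can be confirmed numerically. The sketch below is an illustration only, not taken from the text: it assumes the quadratic rule of Eq. (13-53) with arbitrarily chosen constants a = 1.0 and b = 0.2, samples one period of the response to a unit sinusoidal force, and projects the result onto the first few harmonics. The helper names are hypothetical.

```python
import math


def nonlinear_response(force, a, b):
    """Displacement for the assumed quadratic rule y = a*F + b*F**2."""
    return a * force + b * force ** 2


def fourier_amplitude(samples, harmonic):
    """Amplitude of one harmonic in samples covering exactly one period.

    harmonic = 0 returns the constant (mean) term.
    """
    n = len(samples)
    c = sum(y * math.cos(2 * math.pi * harmonic * k / n) for k, y in enumerate(samples))
    s = sum(y * math.sin(2 * math.pi * harmonic * k / n) for k, y in enumerate(samples))
    if harmonic == 0:
        return c / n
    return 2 * math.hypot(c, s) / n


a, b = 1.0, 0.2   # illustrative constants, not values from the text
n = 64            # samples per period of the incident tone
samples = [nonlinear_response(math.sin(2 * math.pi * k / n), a, b) for k in range(n)]
amplitudes = [fourier_amplitude(samples, h) for h in range(4)]

if __name__ == "__main__":
    for h, amp in enumerate(amplitudes):
        print(f"harmonic {h}: amplitude {amp:.6f}")
```

Under these assumptions the projection gives a constant term b/2, the incident frequency with amplitude a, a second harmonic with amplitude b/2, and nothing at the third harmonic, in agreement with the expression for y(t) derived above.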

In  very  similar  manner,  the  nonlinearity  of  the  ear  (as  its  departure  from 
Hooke’s  law  is  called)  results  in  the  creation  of  additional  tones  when  the  incident 
sound  wave  has  two  or  more  frequency  components.  Suppose,  for  example,  that 
the  displacing  force  is  described  by 

$$F = \sin(2\pi\nu_1 t) + \sin(2\pi\nu_2 t) \tag{13-56}$$

(Here we have again set the amplitudes at the values A₁ = A₂ = 1 for simplicity.)
If the ear obeys Eq. (13-53), as above, we have

$$y(t) = a[\sin(2\pi\nu_1 t) + \sin(2\pi\nu_2 t)] + b[\sin^2(2\pi\nu_1 t) + \sin^2(2\pi\nu_2 t) + 2\sin(2\pi\nu_1 t)\sin(2\pi\nu_2 t)] \tag{13-57}$$

As before, the term b[sin²(2πν₁t) + sin²(2πν₂t)] leads to the “manufacture” of
Fourier components of frequency 2ν₁ and 2ν₂. However, there is now a new effect
produced by the mixed term (the last term on the right side of the equation). Using
a trigonometric identity like Eq. (13-13), we can write

$$2b\sin(2\pi\nu_1 t)\sin(2\pi\nu_2 t) = b\cos[2\pi(\nu_1 - \nu_2)t] - b\cos[2\pi(\nu_1 + \nu_2)t]$$

Thus  the  ear  adds  to  the  perceived  sound  new  components  with  frequencies  equal 
to  the  difference  and  the  sum  of  the  incident  frequencies.  These  are  the  difference 
and  sum  tones,  or  collectively  simply  the  combination  tones.  The  difference  tones 
are  far  more  important,  since  they  tend  to  have  much  greater  intensity. 
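As a check on the bookkeeping, the following Python sketch (an illustration with hypothetical function names and arbitrarily chosen constants) lists every Fourier component predicted for the quadratic rule of Eq. (13-53) driven by two unit-amplitude tones, and compares their sum with the response evaluated directly. It models only the quadratic term; it says nothing about why the ear favors difference tones.

```python
import math


def direct_response(t, nu1, nu2, a, b):
    """y = a*F + b*F**2 for a force made of two unit-amplitude sinusoids."""
    force = math.sin(2 * math.pi * nu1 * t) + math.sin(2 * math.pi * nu2 * t)
    return a * force + b * force ** 2


def predicted_components(nu1, nu2, a, b):
    """(label, frequency, amplitude, waveform) for each term of the expansion."""
    return [
        ("constant", 0.0, b, math.cos),
        ("incident tone 1", nu1, a, math.sin),
        ("incident tone 2", nu2, a, math.sin),
        ("second harmonic of 1", 2 * nu1, -b / 2, math.cos),
        ("second harmonic of 2", 2 * nu2, -b / 2, math.cos),
        ("difference tone", abs(nu1 - nu2), b, math.cos),
        ("sum tone", nu1 + nu2, -b, math.cos),
    ]


def response_from_components(t, components):
    """Add up the listed components at time t."""
    return sum(amp * wave(2 * math.pi * freq * t) for _, freq, amp, wave in components)


if __name__ == "__main__":
    nu1, nu2, a, b = 440.0, 660.0, 1.0, 0.2  # illustrative values only
    for label, freq, amp, _ in predicted_components(nu1, nu2, a, b):
        print(f"{label:22s} {freq:7.1f} Hz  amplitude {amp:+.2f}")
```

For the illustrative pair 440 Hz and 660 Hz, the difference tone falls at 220 Hz and the sum tone at 1100 Hz.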


The  nonlinearity  of  the  ear  does  not  cease  at  the  quadratic  term  in  Eq.  (13-53). 
Higher-order  terms  are  present,  though  of  smaller  magnitude,  and  they  add  still 
further  complications  to  the  perceived  sound  in  a manner  similar  to  that  described 
above.  Not  only  do  these  tones  exist  in  the  combined  sound  perceived,  but  also 
they  can  be  discerned  separately  by  a listener  who  has  a little  practice. 

If  the  amplitude  is  sufficiently  great,  all  systems  must  ultimately  fail  to  obey 
Hooke’s  law.  In  audio  high-fidelity  systems,  the  weak  link  in  this  respect  is 
usually  the  loudspeakers.  If  the  volume  is  set  too  high,  they  will  manufacture  the 
combination  tones  of  Eq.  (13-57).  The  result  is  intermodulation  distortion,  or  IM 
distortion  for  short.  However  great  the  care  taken  in  making  a faithful  recording  of 
a musical  performance,  the  effort  will  be  wasted  if  the  volume  is  turned  so  high  as 
to  produce  significant  IM  distortion  in  the  speaker.  If  the  volume  is  set  either  too 
high  or  too  low,  it  results  in  a level  of  IM  distortion  in  the  ear  which  is  different 
from  that  experienced  by  the  “live”  or  direct  listener.  Ideally,  music  should  be 
heard  at  precisely  the  volume  level  at  which  it  was  recorded. 

It  is  the  presence  of  combination  tones  in  the  perceived  sound  which  unravels 
an  important  musical  paradox.  You  may  have  noted  in  Figs.  13-26  and  13-27  that 
the fundamental is by no means always the strongest component in the tone
produced by a musical instrument. Nevertheless, we always perceive the pitch of the
instrument  as  that  of  the  fundamental.  Indeed,  it  is  only  by  concentrating  that  we 
can  hear  the  harmonics,  which  may  be  much  stronger,  as  separate  entities.  But 
every  pair  of  adjacent  harmonics  in  a musical  tone  produces  the  fundamental  as  a 
difference  tone.  As  a result,  the  ear  supplies  a strong  fundamental  even  when  it  is 
entirely  or  nearly  missing  in  the  incident  waveform. 
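The arithmetic behind this statement is short enough to sketch in Python. The function name and the 110-Hz fundamental below are illustrative assumptions; the point is only that neighboring harmonics n and n + 1 of the same tone always differ by exactly one fundamental.

```python
def adjacent_difference_tones(fundamental, harmonic_numbers):
    """Difference tones between neighboring entries of a list of harmonic numbers."""
    ordered = sorted(harmonic_numbers)
    return [(hi - lo) * fundamental for lo, hi in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    # A tone whose fundamental (harmonic 1) is absent from the incident sound.
    present = [2, 3, 4, 5]
    print(adjacent_difference_tones(110.0, present))
```

Every entry in the result equals the 110-Hz fundamental, even though no 110-Hz component was supplied.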

The satisfactory performance of compact and economical sound systems
depends on the same effect in an essential way. It is not possible for a small
loudspeaker to reproduce low frequencies at all. Such systems simply do not pass on
to the ear the low-frequency part of the Fourier spectrum commonly present in
music (and even in speech). Nevertheless, the ear reconstructs the missing
frequencies from difference tones, at least well enough to make the music or speech
recognizable. (In the light of the above discussion, how is it that small earphones
of good quality can give truly excellent reproduction of low frequencies?)


We are now ready to consider the conditions for consonance and
dissonance in a general way. When a musical tone is played on some instrument,
what strikes the ear is a space-independent waveform which has as its
Fourier components a fundamental of frequency ν₁ and higher harmonics
of frequencies 2ν₁, 3ν₁, and so on. The relative amplitudes of these Fourier
components depend on the particular instrument, the particular tone
being played on it, and the particular way in which that tone is being
played. If a second musical tone of fundamental frequency ν₂ is
simultaneously played on some other instrument, what strikes the ear is a
superposition of the waveforms of the two tones. Like the first tone, the second also
comprises a set of Fourier components with characteristic amplitudes. To
this complex superposed waveform, the ear adds further complexities. For
each pair of Fourier components whose amplitudes are great enough, the
ear manufactures aural harmonics.

There are consequently all sorts of possibilities for combination tones.
If the fundamental frequencies of the two notes have a simple ratio, say
ν₂/ν₁ = 2, the interval musicians call the octave, then the combination tones
will all have frequencies which coincide with harmonics already present.
But suppose the ratio is more complicated, for example, ν₂/ν₁ = 9/8, the
interval musicians call a second. Then some of the combination tones will fall
into the range of beat frequencies (somewhere between 15 Hz and 250
Hz, depending on pitch) which the ear interprets as harsh. The more
such combination tones that are present and the greater their amplitudes,
the more dissonance will be perceived in the combination of the two
musical notes.
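The contrast between the octave and the second can be sketched with exact fractions. The Python below is an illustration under stated assumptions: it considers only sum and difference tones between the first few harmonics of each note, measures every frequency in units of ν₁, and takes the ratio 9/8 as the value for the second. All function names are hypothetical.

```python
from fractions import Fraction


def combination_tones(ratio, n_harmonics=3):
    """Sum and difference tones between harmonics of two notes, in units of nu1."""
    ratio = Fraction(ratio)
    tones = set()
    for m in range(1, n_harmonics + 1):
        for n in range(1, n_harmonics + 1):
            lower, upper = Fraction(m), n * ratio
            tones.add(lower + upper)
            if lower != upper:
                tones.add(abs(lower - upper))
    return tones


def is_existing_harmonic(tone, ratio):
    """True if the tone is an integer multiple of nu1 or of nu2."""
    return tone.denominator == 1 or (tone / Fraction(ratio)).denominator == 1


def stray_tones(ratio, n_harmonics=3):
    """Combination tones that do not land on any harmonic of either note."""
    return sorted(t for t in combination_tones(ratio, n_harmonics)
                  if not is_existing_harmonic(t, ratio))


if __name__ == "__main__":
    print("octave:", stray_tones(Fraction(2)))
    print("second:", stray_tones(Fraction(9, 8)))
```

For the octave the list of stray tones is empty. For the assumed ratio 9/8 it includes a tone at ν₁/8; with an illustrative ν₁ of 264 Hz this is 33 Hz, inside the range the text describes as harsh.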

In order to explore the possibilities for dissonance, the frequencies of
the harmonics of two musical tones can be plotted along a horizontal axis
whose scale is proportional to frequency. This is done in Fig. 13-31 for the
fundamental frequency ratio ν₂/ν₁ = 3/2, the interval which musicians call a
fifth. In this case, many of the harmonics of the higher-pitch tone coincide
with those of the lower-pitch tone. Of the others, only the harmonic of
frequency 3ν₂ lies close enough to any other harmonic that their ratio is less
than 1:1.2, which is the crude threshold criterion for dissonance. Taking
the ratio of 3ν₂ to the frequency 4ν₁ of one of the neighboring harmonics
of the lower-pitch tone, we have


$$\frac{3\nu_2}{4\nu_1} = \frac{\tfrac{9}{2}\nu_1}{4\nu_1} = 1.13$$

Fig. 13-31 Plot of the relative frequencies of a musical tone of fundamental
frequency ν₁ and its harmonics, together with the relative frequencies of a
musical tone of fundamental frequency ν₂ = 3ν₁/2 and its harmonics. This is the
musical interval called the fifth. [Horizontal axis: relative frequency ν.
Marked: ν₁ through 6ν₁ for the lower tone; ν₂ = (3/2)ν₁, 2ν₂ = 3ν₁, and
3ν₂ = (9/2)ν₁ for the higher tone.]


Fig. 13-32 Relative frequency plot for Example 13-10, with ν₂ = 5ν₁/4. This is the
musical interval called the major third. [Horizontal axis: relative frequency ν.
Marked: the harmonics of the lower tone up to 6ν₁ and the harmonics ν₂ through
5ν₂ of the higher tone.]


The ratio of the frequency 5ν₁ of the other harmonic of the lower-pitch
tone to 3ν₂ is

$$\frac{5\nu_1}{3\nu_2} = \frac{5\nu_1}{\tfrac{9}{2}\nu_1} = 1.11$$


The  interval  called  the  fifth  is  perceived  as  relatively  consonant,  although 
less  so  than  the  octave.  This  is  suggested  by  the  fact  that  the  frequency  ratio 
for  the  fifth,  3:2,  is  “more  complicated”  than  the  frequency  ratio  for  the 
octave,  2:1. 

Examples 13-10 and 13-11 consider progressively more dissonant
musical intervals.


EXAMPLE  13-10 

The musical interval having the frequency ratio ν₂/ν₁ = 5/4 is called by musicians a
major third. Explore the possibilities for dissonance in the major third. Is it more or
less dissonant than the fifth, discussed immediately above?

■ Using a relative frequency plot like that of Fig. 13-31, you plot the frequencies
ν₁ and ν₂ = (5/4)ν₁ together with their associated higher harmonics, as in Fig. 13-32.
Inspection suggests that the frequency pairs denoted by arrows may lead to beat
frequencies small enough to produce dissonance. You find the frequency ratios of
the pairs to be

$$\frac{4\nu_1}{3\nu_2} = \frac{4\nu_1}{\tfrac{15}{4}\nu_1} = 1.07$$

and

$$\frac{5\nu_2}{6\nu_1} = \frac{\tfrac{25}{4}\nu_1}{6\nu_1} = 1.04$$

You compare this with the rough criterion that any frequency ratio less than about
1.2 is likely to sound dissonant, and you conclude that the major third is somewhat
dissonant. Comparison suggests that it is more dissonant than the fifth, both
because there are more possibilities for beats and because the frequency ratios are
smaller.


How dissonant the major third sounds depends on the intensity of the
dissonant harmonics in the particular instruments used and the pitch of the
fundamentals. If the notes are played loudly enough, there is also the
possibility of combination tones which are dissonant with some of the
harmonics.


Fig. 13-33 Relative frequency plot for Example 13-11, with ν₂ = 8ν₁/5. This is the
musical interval called a minor sixth. [Horizontal axis: relative frequency ν.
Marked: ν₁ through 6ν₁ for the lower tone and the harmonics of the higher tone
up to 4ν₂, including 3ν₂ = (24/5)ν₁.]


EXAMPLE  13-11 

The musical interval having the frequency ratio ν₂/ν₁ = 8/5 is called by musicians a
minor sixth. Explore the possibilities for dissonance in the minor sixth. Is it more or
less dissonant than the major third, discussed in Example 13-10?

■ Using a relative frequency plot like that of Fig. 13-31, you plot the frequencies
ν₁ and ν₂ = (8/5)ν₁ together with their associated higher harmonics, as in Fig. 13-33.
Inspection suggests that the frequency pairs denoted by arrows may lead to beat
frequencies small enough to produce dissonance. You find the frequency ratios of
the pairs to be

$$\frac{2\nu_2}{3\nu_1} = \frac{\tfrac{16}{5}\nu_1}{3\nu_1} = 1.07$$

$$\frac{5\nu_1}{3\nu_2} = \frac{5\nu_1}{\tfrac{24}{5}\nu_1} = 1.04$$

$$\frac{4\nu_2}{6\nu_1} = \frac{\tfrac{32}{5}\nu_1}{6\nu_1} = 1.07$$


Here there are three dissonant intervals. There are, moreover, relatively few
consonant intervals, and there are abundant possibilities for combination tones. So the
minor sixth is relatively dissonant, compared to the major third and the fifth.
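The search for close harmonic pairs carried out by inspection in these examples can be sketched in Python. This is an illustration, not the authors' procedure: it assumes the crude criterion that a pair is potentially dissonant when its frequency ratio lies strictly between 1 and 1.2, considers the first six harmonics of each tone, and uses exact fractions so that borderline ratios of exactly 1.2 are excluded. The function name is hypothetical.

```python
from fractions import Fraction


def close_harmonic_pairs(ratio, n_harmonics=6, threshold=Fraction(6, 5)):
    """Pairs (m, n, r): harmonic m*nu1 and harmonic n*nu2 whose ratio r is in (1, threshold)."""
    ratio = Fraction(ratio)
    pairs = []
    for m in range(1, n_harmonics + 1):
        for n in range(1, n_harmonics + 1):
            lower, upper = Fraction(m), n * ratio   # frequencies in units of nu1
            r = max(lower, upper) / min(lower, upper)
            if 1 < r < threshold:
                pairs.append((m, n, r))
    return pairs


if __name__ == "__main__":
    intervals = {
        "octave": Fraction(2),
        "fifth": Fraction(3, 2),
        "major third": Fraction(5, 4),
        "minor sixth": Fraction(8, 5),
    }
    for name, ratio in intervals.items():
        found = [(m, n, round(float(r), 2)) for m, n, r in close_harmonic_pairs(ratio)]
        print(f"{name:12s} {found}")
```

Under these assumptions the octave yields no close pairs, the fifth and the major third yield two each, and the minor sixth yields three, with the ratios matching the values worked out in the text.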



In  Examples  13-10  and  13-11,  why  is  it  unnecessary  to  consider  the 
possibility  of  dissonances  between  harmonics  higher  than  the  sixth? 

In  actual  music,  it  is  quite  common  to  have  harmonies  involving  three  or 
more  musical  notes.  The  analysis  of  such  situations  from  a physical  point  of  view 
becomes  rapidly  more  complicated,  but  the  principles  involved  are  the  same.  It 
should  be  remembered  that  dissonance  is  not  universally  undesirable  in  music.  A 
piece  consisting  entirely  of  consonances  would  be  unbearably  tedious,  and  the 
resolution  of  dissonances  into  consonances  is  a very  important  aspect  of  Western 
music. 

We conclude this section with a brief discussion of actual musical
instruments. Every instrument must perform two functions. It must generate
any of a number of fundamental frequencies (perhaps along with a series
of harmonics) at the will of the musician. Then it must transmit the
vibrational energy to the air in a reasonably efficient way.

In  the  stringed  instruments,  the  frequency  is  set  by  the  linear  mass 
density,  tension,  and  length  of  the  string.  The  string  is  set  into  oscillation  by 
striking  it  with  a hammer  (as  in  the  piano),  by  plucking  it  (as  in  the  guitar, 
the  lute,  the  harpsichord,  and  the  violin  played  pizzicato),  or  by  bowing  it 
(as  in  the  viols  and  violins).  The  bowing  mechanism  is  the  only  one  whose 
function  is  not  obvious.  The  string  adheres  to  the  bow  by  friction  until  the 
maximum  static  frictional  force  is  exceeded.  The  string  then  slips  back 
until it is again caught by the bow at the other end of its oscillation. (Such
action is encountered in a wide variety of mechanical systems. It is called
“stick-slip” action.)
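How the three string properties set the frequency can be sketched with the standard relation for an ideal stretched string, ν₁ = (1/2L)√(T/μ), where T is the tension and μ the linear mass density. The Python below is a minimal illustration assuming that idealization (no stiffness, rigidly fixed ends); the function name and the numerical values are hypothetical.

```python
import math


def string_fundamental(length_m, tension_n, linear_density_kg_per_m):
    """Fundamental frequency (Hz) of an ideal string fixed at both ends."""
    wave_speed = math.sqrt(tension_n / linear_density_kg_per_m)  # m/s
    return wave_speed / (2 * length_m)


if __name__ == "__main__":
    # Illustrative values: 0.50 m string, 100 N tension, 0.010 kg/m.
    print(string_fundamental(0.50, 100.0, 0.010))
    # Quadrupling the tension doubles the frequency.
    print(string_fundamental(0.50, 400.0, 0.010))
```

Shortening the string or raising the tension raises the pitch, while a heavier string lowers it.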




The details of the excitation mechanism are very important in
determining the quality of the instrumental sound, since they influence the
amplitudes of the various harmonics. In the piano, for instance, the hammer
strikes the string in the region between about one-eighth and one-sixth of
the distance along its length. This strongly suppresses all the harmonics
which possess nodes in that region. (Why?)
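One way to explore the question numerically is the following sketch, which rests on an idealization that is an assumption here and not a statement from the text: if the hammer acts at a single point a fraction x₀/L along an ideal string, the nth mode is excited in proportion to |sin(nπx₀/L)|, the value of that mode's shape at the striking point. The function name is hypothetical.

```python
import math


def relative_mode_excitation(n, strike_fraction):
    """Relative excitation of mode n for a point strike at x0/L = strike_fraction."""
    return abs(math.sin(n * math.pi * strike_fraction))


if __name__ == "__main__":
    strike = 1 / 7  # a point between one-eighth and one-sixth of the length
    for n in range(1, 10):
        print(n, round(relative_mode_excitation(n, strike), 3))
```

In this idealized picture a strike at one-seventh of the length leaves the seventh harmonic essentially unexcited, because that mode has a node at the striking point, while the lower harmonics are all driven.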

A string  by  itself  is  a very  inefficient  transmitter  of  energy  to  the  air, 
because  it  is  so  thin.  In  stringed  instruments,  therefore,  one  or  both  ends  of 
the  string  are  stretched  across  a (usually)  relatively  light  wooden  bridge, 
which  rests  on  a wooden  plate  called  the  sounding  board.  In  most  stringed 
instruments,  the  sounding  board  is  part  of  a box,  usually  of  complex 
shape. The sounding board (and the air it contains, if any) is set into
vibration by the oscillations transmitted from the string via the bridge. Since
the  sounding  board  is  large,  it  sets  the  surrounding  air  into  motion  rather 
efficiently,  as  required. 

If  the  instrument  is  to  have  relatively  uniform  tone  quality  over  its 
range  of  pitch,  the  entire  system  must  be  fairly  highly  damped  (see  Sec. 
6-6)  so  that  it  will  not  “ring”  at  certain  frequencies  and  sound  “dead”  at 
others.  The  complex  shape  also  aids  in  “smoothing  out”  the  response  of  the 
sounding  board  to  string  vibrations  of  different  frequencies,  for  reasons 
whose  discussion  would  require  too  much  digression  here. 

In brass and woodwind instruments, and in the organ, it is the
vibrating source which has no well-defined oscillation frequency. Contrary to
the situation in the stringed instruments, it is the equivalent of the
sounding board (in this case the air in the pipe which is the main body of the
instrument) which vibrates in standing-wave modes of sharply defined
frequency and thus controls the frequency of oscillation of the source.

In  most  of  the  woodwinds  (and  some  organ  pipes)  the  source  is  a reed. 
In  the  clarinet  and  the  saxophone,  for  example,  the  reed  is  a thin,  flexible 
plate  of  cane  which  fits  over  a hole  in  the  mouthpiece.  As  it  vibrates,  at  a 
frequency  controlled  by  the  body  of  the  instrument,  the  reed  opens  and 
closes  the  hole,  allowing  puffs  of  air  to  enter  the  instrument.  In  the  brasses, 
the  same  effect  is  achieved  by  the  musician’s  lips,  which  are  coupled  to  the 
instrument  by  a cup-  or  funnel-shaped  mouthpiece.  The  flute  and  most 
organ  pipes  are  driven  by  the  oscillating  turbulences  produced  (in  a very 
complicated  way)  when  air  from  a narrow  slit  (in  the  case  of  the  flute,  the 
player’s  lips)  is  directed  against  a sharp  edge.  In  the  recorder,  the  ocarina, 
and  some  organ  pipes  the  same  effect  is  achieved  by  a built-in  whistle. 

The resonant frequency of the air in the body of the instrument is
determined by the length of the column and by whether the pipe is open at
both ends or closed at one end (and to a lesser extent by its shape). As
usual, the physical situation sets the boundary conditions. A standing wave
in a pipe open at both ends, in which the air is excited near one end, must
have pressure nodes at both ends (or very near them) since nothing that
happens inside the pipe can possibly change the pressure of the outside air.
Figure 13-34a is a schematic picture of a standing pressure wave set up in
such an open pipe. The fundamental frequency corresponds to the longest
wavelength satisfying this condition, which has just one antinode between
the nodes at the ends. The length L of the pipe is thus one-half the
wavelength λ₁ of the fundamental mode; that is, λ₁ = 2L. The fundamental
frequency ν₁ is related to λ₁ according to the rule ν₁ = |v|/λ₁, where |v| is the
speed of sound in air. Hence the frequency is


$$\nu_1 = \frac{|v|}{\lambda_1} = \frac{|v|}{2L} \tag{13-58}$$

Fig. 13-34 Standing acoustic pressure waves produced in open pipes and in pipes
closed at one end (called “closed pipes”). An open end must be a pressure node
because the activity in the pipe cannot significantly influence the pressure of the
open atmosphere. A closed end must be a displacement node because the air cannot
move longitudinally past it. As discussed in the text, a displacement node must be a
pressure antinode. (a) The fundamental standing pressure wave in an open pipe
has nodes only at the two ends, and its wavelength is twice the length of the pipe.
Compare with Fig. 13-8. (b) The second harmonic has an additional pressure node
in the center of the pipe, and its wavelength is equal to the length of the pipe.
Compare with Fig. 13-9. (c) The fundamental standing pressure wave in a closed
pipe has a node at the open end and an antinode at the closed end. Its wavelength
is four times the length of the pipe. (d) The next possible standing wave is the
third harmonic, with an additional pressure node one-third of the distance from
the closed end to the open end of the pipe. Only odd harmonics are possible in a
closed pipe. [Panel labels: the pressure oscillates between p_min and p_max about
atmospheric pressure p_atm, and p = p_atm at each open end; the wavelength is 2L
in (a) and 4L/3 in (d).]


The next possible mode is the one in which there is a node between the
two nodes at the ends. See Fig. 13-34b. The wavelength λ₂ is thus one-half
that of the fundamental, λ₁, and the frequency ν₂ is twice that of the
fundamental, ν₁. In like manner, harmonics can exist whose frequencies are 3, 4,
5, . . . times that of the fundamental, just as in a string.

If one end of the pipe is closed, there must be a displacement node at
that point. Sound waves involve longitudinal displacement of the air, and
this is prevented at the closed end. But we saw in Sec. 12-6 that a zero
displacement corresponds to a pressure extreme (see Fig. 12-15). Thus there
must be a pressure antinode at the closed end. The situation is that shown in
Fig. 13-34c. Here the fundamental has a wavelength 4 times the length of
the pipe, and its frequency is

$$\nu_1 = \frac{|v|}{\lambda_1} = \frac{|v|}{4L} \tag{13-59}$$

Hence a closed pipe has a fundamental frequency one-half that of an open
pipe of the same length. But from the musical point of view there is a
matter of greater interest. Note from Fig. 13-34d that the geometry of the
situation requires that the next possible mode have a wavelength not
one-half, but one-third that of the fundamental. Similarly, the next
wavelength will be one-fifth that of the fundamental. In general, only the
odd-numbered modes are possible, with frequencies 3, 5, 7, . . . times that
of the fundamental. As you can see from the sound spectrum of Fig. 13-26,
the clarinet approximates this situation, to which it owes its distinctive
quality.
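The two harmonic series can be sketched side by side in Python. This is a minimal illustration assuming ideal pipes with no end correction; the function name, the 1.00-m length, and the choice of four modes are hypothetical, and the speed of sound is taken as 344.0 m/s for definiteness.

```python
def pipe_mode_frequencies(length_m, speed_of_sound, closed_at_one_end, n_modes=4):
    """Frequencies (Hz) of the lowest standing-wave modes of an ideal pipe."""
    if closed_at_one_end:
        fundamental = speed_of_sound / (4 * length_m)   # wavelength 4L
        multiples = [2 * k - 1 for k in range(1, n_modes + 1)]  # 1, 3, 5, ...
    else:
        fundamental = speed_of_sound / (2 * length_m)   # wavelength 2L
        multiples = list(range(1, n_modes + 1))         # 1, 2, 3, ...
    return [m * fundamental for m in multiples]


if __name__ == "__main__":
    print("open:  ", pipe_mode_frequencies(1.00, 344.0, closed_at_one_end=False))
    print("closed:", pipe_mode_frequencies(1.00, 344.0, closed_at_one_end=True))
```

For the same length, the closed pipe starts an octave lower and skips every even-numbered harmonic.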

Example  13-12  applies  the  principles  just  developed  to  an  organ  pipe. 




EXAMPLE  13-12 


How  long  must  an  organ  pipe  be  to  produce  the  lowest  audible  musical  note  called 
C,  whose  frequency  is  16.35  Hz,  if  it  is 

a.  Open  at  both  ends? 

b.  Closed  at  one  end? 

Take  the  speed  of  sound  to  be  344.0  m/s. 

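■ a. For the open pipe you have from Eq. (13-58)

$$L = \frac{|v|}{2\nu_1} = \frac{344.0\ \text{m/s}}{2 \times 16.35\ \text{s}^{-1}} = 10.52\ \text{m}$$

b. For the closed pipe, Eq. (13-59) gives

$$L = \frac{|v|}{4\nu_1} = \frac{344.0\ \text{m/s}}{4 \times 16.35\ \text{s}^{-1}} = 5.260\ \text{m}$$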


Ranks  of  very  large  “organ  pipes”  are  sometimes  used  as  decorations  in  large 
churches.  You  can  be  sure  that  they  are  either  nonfunctional  or  nonmusical  if  they 
are  much  more  than  10  m long! 
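The arithmetic of this example can be reproduced with a short Python sketch. The function name is hypothetical, and the pipes are assumed ideal, as in the example itself.

```python
def pipe_length_for_fundamental(frequency_hz, speed_of_sound, closed_at_one_end):
    """Length (m) of an ideal pipe whose fundamental has the given frequency."""
    wavelengths_per_length = 4 if closed_at_one_end else 2  # lambda = 4L or 2L
    return speed_of_sound / (wavelengths_per_length * frequency_hz)


if __name__ == "__main__":
    v, nu = 344.0, 16.35  # values given in the example
    print(f"open:   {pipe_length_for_fundamental(nu, v, False):.2f} m")
    print(f"closed: {pipe_length_for_fundamental(nu, v, True):.3f} m")
```

Since 16.35 Hz is described as the lowest audible note, a functional open pipe much longer than the computed 10.52 m would sound below the range of hearing.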


In  the  brass  instruments,  the  pitch  is  altered  by  using  a set  of  valves  or 
a slide  to  vary  the  length  of  the  tube  and  thus  the  resonant  frequency  of  the 
air  within  it.  In  the  woodwinds  the  same  effect  is  achieved  in  a much  more 
complicated  fashion  by  opening  and  closing  holes  distributed  along  the 
tube. 

The bell at the end of the instrument serves two related purposes.
First, it serves as an efficient way to couple the vibrating air inside the
instrument to the outside air and thus to emit the sound. Second, it is not equally
efficient in doing this at all frequencies. Thus the bell tends to suppress
some harmonics relative to others and contributes to the timbre of the
instrument as a whole.

Percussion instruments are a special class. In general, they are
two-dimensional oscillators (like the drum and the cymbals) or even
three-dimensional oscillators (like the bell and the wood block). The frequencies
of their standing-wave modes do not constitute a simple arithmetic series, as
we have seen for the drumhead in Sec. 13-5. The modes are thus not
harmonic. Taken together, the nonharmonic modes and the combination tones
constitute an irregular series in frequency, and it is not possible for the ear
to identify or construct a well-defined fundamental frequency to which a
pitch can be assigned. In some cases, such as the snare drum, the triangle,
and the cymbals, this is what is desired since the instrument is essentially a
“noisemaker.” In other cases, the musical tone is produced by using one or
more ingenious devices. In the xylophone, a musical tone is achieved by
hollowing out the centers of the bars in such a way as to “tune” the second
mode to an approximate integral multiple of the fundamental. A similar
device is used in the bell. In the chimes, which consist of a set of tubular
bars, the bars are suspended at the nodes of the desired modes, thus
suppressing other, dissonant ones. In the marimba, tuned hollow containers
called resonators, in which standing-wave oscillations can be set up very
much as in the air column of a trumpet, are used to enhance the
fundamental. Thus the dissonant higher modes are relatively weak.

The timpani (or kettledrums) is an interesting special case. The
fundamental (see Fig. 13-1 la), in which the entire drumhead moves together,
transfers energy to the air so efficiently that it damps out very quickly and
is not heard except as part of the initial “boom.” The pitch heard is that of
the mode having the next higher frequency, shown in Fig. 13-23a, while
the higher modes contribute a somewhat unmusical quality to the
instrument. Thus the pitch, while it can be discerned, is somewhat “fuzzy.”

All  percussion  instruments  have  the  property  that  elaborate  chords 
cannot  be  played  on  them,  even  if  the  individual  notes  do  have  discernible 
pitches.  It  is  not  possible  to  obtain  consonances  with  bells,  for  example, 
when more than two are played together. There are simply too many
possibilities for dissonances among the various nonharmonic modes. As a
consequence, there is a special musical literature written for the carillon, as a
choir  of  bells  is  called.  Rather  than  calling  for  the  playing  of  many  bells 
together,  advantage  is  taken  of  the  fact  that  while  the  sound  of  bells  persists 
for  some  time,  there  is  considerable  variation  among  the  decay  rates  of  the 
modes.  So  ingenious  and  unexpected  harmonies  can  be  created  by 
sounding  one  bell  when  only  the  desired  modes  of  a previously  sounded 
bell  or  bells  still  persist.  Carillons  are  particularly  common  in  Belgium  and 
Holland.  If  you  ever  have  the  chance  to  listen  to  one,  listen  with  the  ear  of  a 
physicist  as  well  as  the  ear  of  a music  lover! 


EXERCISES 

Group  A 

13-1. Superposition. Two wave trains are traveling
along a very long stretched string at the same speed |v| but
in opposite directions. The wave train traveling in the
positive x direction is described by the wave function f₁(x, t) =
A₁ cos[k₁(x − |v|t)] and the wave train traveling in the
negative x direction by f₂(x, t) = A₂ cos[k₂(x + |v|t) + δ₂].

a. Plot f₁(x, t) over the range −(4π/k₁) < x < (4π/k₁)
for each of the following instants: (1) t = 0; (2) t =
π/(2k₁|v|).

b. On the same graphs, plot f₂(x, t) for the positions
and instants listed in part a. Assume that A₂ = A₁, k₂ =
2k₁, and δ₂ = π/6.

c. On the same graphs, plot the wave function f(x, t)
that results from the superposition of the two traveling
waves described in a and b.

13-2.  An  experimental  determination  of  the  speed  of  sound 
in air. The brass rod AB in Fig. 13E-2 is 0.50 m long and is
clamped at its center E. When the rod is rubbed lengthwise
with a rosined cloth, it is set into longitudinal vibration
with E necessarily a node and A and B antinodes. At
end B, a cardboard disk is fastened. The disk fits loosely
into an open glass tube CD on the bottom of which a fine
powder (such as talc) has been sprinkled. When AB is set
into vibration, the powder gathers into little heaps at very
regular intervals. Adjacent heaps of powder are separated
by 4.8 cm. The speed of sound in brass has been measured;
it is $3.5 \times 10^3$ m/s.


a. What is the fundamental frequency of longitudinal
vibrations of the clamped rod AB?

b. Assuming that the vibrations described are vibrations
at the fundamental frequency, what is the speed of
sound in air?


13-3. Piano wire. The wire that is used to produce the
tone "concert A" ($\nu = 440$ Hz) on a particular piano is
made of a material with density 7.8 g/cm$^3$. The wire
is 0.80 mm in diameter and 60.0 cm long. What tension
exists in the wire when its fundamental frequency is properly
tuned to concert A?


13-4. Standing wave, I. The equation for a standing
wave in a string is given by $y = A \sin(kx) \cos(\omega t)$, where
$A = 0.04$ m, $k = 4\pi$ m$^{-1}$, and $\omega = 800\pi$ s$^{-1}$.

a.  What  is  the  distance  between  nodes? 

b.  What  is  the  wavelength  of  the  traveling  waves  that 
superpose  to  produce  the  standing  wave? 

c.  What  is  the  frequency  of  the  vibration? 

d. With what speed do the traveling waves propagate
on the string?

e. What is the amplitude A' of each of the two traveling
waves (of equal amplitude) that produce the standing
wave?


13-5. Standing wave, II. Two waves each of wavelength
20 cm and of equal amplitude are traveling in opposite
directions on a taut string fixed at both ends. If the
string is 1.0 m long, how many nodes are there, counting
the fixed ends, in the standing wave produced?

13-6.  Wave  on  a string.  A string  1.0  m long  has  a mass 
of $5.0 \times 10^{-3}$ kg and is under a tension of 50 N.


612  Superposition  of  Mechanical  Waves 


a.  What  is  the  speed  of  a wave  traveling  on  the  string? 

b.  If  the  string  is  vibrating  in  four  segments  (with  five 
nodes),  what  is  the  frequency  of  the  sound  it  is  producing? 

c.  Suppose  that  the  standing  wave  has  amplitude  A. 
Write  the  wave  function  for  the  standing  wave,  taking  the 
origin  at  one  end  of  the  string. 

13-7. The right location. The first string on the violin is
called  the  E string  because  it  produces  the  tone  E when  its 
full  length  vibrates.  The  second  string  is  called  the  A 
string  for  similar  reasons.  The  ratio  of  the  frequency  of  E 
to  that  of  A is  3 : 2 (the  major  fifth).  Where  must  a violinist 
place  a finger  on  the  A string  so  that  it  will  produce  the 
same  tone  as  the  E string? 

13-8. The fundamentals. A number of strings of the
same length but different diameters are made of the same
material and are at the same tension. Prove that their fundamental
frequencies are inversely proportional to their
diameters.

13-9. How much motion? Points A, B, and C lie along a
uniform stretched string of length L. The distances of A,
B, and C from the left end of the string are exactly 0.1L,
0.4L, and 0.7L, respectively. Find the amplitudes of the
transverse motion of the string at points A, B, and C, if the
string is vibrating with maximum amplitude h in its nth
mode, for the following values of n:

a.  n = 1 (fundamental  mode);  b.  n = 3;  c.  n = 10 

13-10.  Drums. 

a. Find the frequencies of the first three modes of oscillation
of a drumhead of mass 50 g, radius 20 cm, and
tension 600 N/m.

b.  Three  identical  drumheads  with  the  characteristics 
described  in  part  a are  mounted  on  three  open  cylinders. 
Find  the  lengths  for  the  cylinders  if  each  has  one  of  the 
frequencies  found  in  part  a as  its  fundamental.  (Notice 
that  each  assembled  drum  will  constitute  a pipe  open  only 
at  the  bottom  end.  Take  the  speed  of  sound  to  be  340 
m/s.) 

13-11. Find the frequencies. A glass tube 30 cm long is
closed at one end. Take the speed of sound to be 340 m/s.

a.  What  is  the  frequency  of  the  fundamental 
standing  wave  mode? 

b.  What  is  the  frequency  of  the  second  standing  wave 
mode? 

13-12.  Low  notes.  A pipe  that  is  closed  at  one  end  has  a 
fundamental  frequency  of  30  Hz.  Assume  a sound  speed 
of  340  m/s. 

a.  What  is  the  length  of  the  pipe? 

b.  Suppose  the  cover  is  removed  from  the  closed  end. 
What  is  the  fundamental  frequency  now? 

13-13. Listen to the beat. A 256-Hz tuning fork produces
four beats per second when sounded with another
fork of unknown frequency. What are two possible values
for the unknown frequency?


13-14. Different pipes, same overtone. For a particular
pair of organ pipes, the first overtone (the second harmonic
mode) of the closed pipe has the same frequency as
the first overtone (the second harmonic mode) of the open
pipe. What is the ratio of the pipe lengths?


Group  B 

13-15. Constructive interference. In Fig. 13E-15, source
S produces a pure sound of a single frequency that can be
varied. Observer O is situated so that OS is perpendicular
to APB, a reflective wall. PS is equal to 2.0 m. What are the
two lowest frequencies for which the sound heard at O has
maximum loudness? The speed of sound in air is 340 m/s.

Fig. 13E-15


13-16. Tune-up. Some of the low keys of the piano
have two strings. On a particular key one of the strings is
tuned correctly to 100 Hz. When the two strings are
sounded together, one beat per second is heard. By what
percent must a piano tuner change the tension of the untuned
string to make it match perfectly? The beating is
due to superposition of the fundamental tones, which are
by far the strongest.

13-17. Energy in standing waves on a uniform string. A
uniform string (length L, linear density $\mu$, and tension F)
is vibrating with amplitude $A_n$ in its nth mode. Show that
its total energy of oscillation is given by $E = \pi^2 \nu_n^2 A_n^2 \mu L$.

13-18. Acceleration in standing wave modes. A 10-m
length of nylon rope has a linear density of 0.10 kg/m
and is under a tension of 1000 N.

a.  Find  the  frequencies  of  the  standing  wave  modes 
of  this  rope. 

b. If the rope is vibrating in the nth mode, what is the
required amplitude of vibration (at the antinodes) in
order for the maximum acceleration to exceed the acceleration
of gravity g?

13-19. Can you explain it? Inspection of Figs. 13-20
and 13-21 reveals that the node in the second mode of the
circular membrane occurs at u = 0.44, while the nodes in
the third mode occur at u = 0.28 and at u = 0.64. Equation
13-47 can be used to show that $\nu_1/\nu_2 = 0.44$, $\nu_1/\nu_3 = 0.28$,
and $\nu_2/\nu_3 = 0.64$. Explain this agreement of node locations
with frequency ratios.


Exercises  613 


13-20. Working together. $S_1$ and $S_2$ are identical sources
of sound of a single frequency. They are equidistant from
O. When $S_1$ alone is sounding, the amplitude reaching O
is A.

a. If $S_1$ and $S_2$ are both turned on and are operated
"in phase," what is now the amplitude of the sound at O?

b. How does the energy flow at O with both sources
sounding compare with the flow when only $S_1$ is
sounding?

c.  What  is  the  increase  in  acoustic  intensity  in  decibels 
when  S2  is  turned  on? 

13-21. Fourier analysis. Fig. 13E-21 represents a taut
string fastened at A and C. It is initially raised a short distance
in the center from B to B' and released. Show that
the even-numbered harmonic modes are missing from the
sound produced.

Fig. 13E-21


13-22. Whistles and beats. A boy is walking away from a
wall at a speed of 1.0 m/s in a direction at right angles to the
wall. As he walks, he blows a whistle steadily. An observer
toward whom the boy is walking hears 4.0 beats per second.
If the speed of sound is 340 m/s, what is the frequency
of the whistle?

13-23. Using a resonance to determine the speed of sound.
A sounding tuning fork whose frequency is 256 Hz is held
over an empty measuring cylinder. See Fig. 13E-23. The
sound is faint, but if just the right amount of water is
poured into the cylinder, it becomes loud. When this
occurs, the sound consists of the vibrations from the fork
plus identical vibrations from the air column. If the length
of the air column that produces the loudest sound is
0.31 m, what is the speed of sound in air to a first approximation?
(For higher precision a correction must be made
to allow for the fact that the pressure node of the air column
occurs slightly outside the end of the column.)

Fig. 13E-23


13-24. What do you hear? Suppose that the displacement
y(t) of a part of the human ear obeys the nonlinear
equation, Eq. (13-53), and that the wave function describing
the incident sound is given by Eq. (13-56), with $\nu_1 = 200$ Hz
and $\nu_2 = 260$ Hz. If y(t) is written as a sum of sinusoidal
functions, what frequencies are involved?


13-25. Superposing three waves. Three wave sources,
$S_1$, $S_2$, and $S_3$, are located in a large body of water and are
equidistant from an observation point O. The wave
sources have identical frequencies $\nu$ and power outputs.
When only the jth source is operating, the surface displacement
at O is $A \cos(2\pi\nu t + \delta_j)$, where $A > 0$ and
$0 \le \delta_j < 2\pi$.

a. Find the surface displacement z(t) at O when all
three sources are operating.

b. Combine the contributions from $S_1$ and $S_2$ to show
that

$$z(t) = 2A \cos[\tfrac{1}{2}(\delta_1 - \delta_2)] \cos[2\pi\nu t + \tfrac{1}{2}(\delta_1 + \delta_2)] + A \cos(2\pi\nu t + \delta_3)$$

c. Suppose that A, $\delta_1$, and $\delta_2$ are given, and that
$|\delta_1 - \delta_2| \le \pi$, but that the value of $\delta_3$ can be chosen at will.
Find the value of $\delta_3$ for which the overall surface displacement
z(t) has a maximum amplitude. What is the value of
that maximum amplitude?

d. Evaluate the optimum value of $\delta_3$ and the corresponding
maximum amplitude for the following cases:

(1) $\delta_1 = \delta_2 = 0°$; (2) $\delta_1 = 60°$, $\delta_2 = 0°$; (3) $\delta_1 = 90°$,
$\delta_2 = 0°$; (4) $\delta_1 = 120°$, $\delta_2 = 0°$; (5) $\delta_1 = 180°$, $\delta_2 = 0°$.

e. For case (5) in part d, show that the amplitude does
not depend on the particular choice made for $\delta_3$. Explain
why not.

f. Extend the results of part c to cover all possible values
of $\delta_1 - \delta_2$.
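The identity quoted in part b of Exercise 13-25 can be spot-checked numerically before it is proved. The following is only a sketch in Python; the function names and the sample values of frequency, amplitude, and phases are illustrative assumptions, not part of the exercise.

```python
import cmath
import math

def direct_sum(t, nu, amp, phases):
    """Add the three displacements A cos(2*pi*nu*t + delta_j) term by term."""
    return sum(amp * math.cos(2.0 * math.pi * nu * t + d) for d in phases)

def combined_form(t, nu, amp, d1, d2, d3):
    """Right-hand side of part b: sources 1 and 2 merged into one cosine."""
    pair = (2.0 * amp * math.cos(0.5 * (d1 - d2))
            * math.cos(2.0 * math.pi * nu * t + 0.5 * (d1 + d2)))
    return pair + amp * math.cos(2.0 * math.pi * nu * t + d3)

def resultant_amplitude(amp, phases):
    """Amplitude of the sum, found by adding the waves as complex phasors."""
    return abs(sum(amp * cmath.exp(1j * d) for d in phases))

if __name__ == "__main__":
    # Hypothetical sample values chosen only to exercise the functions.
    nu, amp, phases = 2.0, 1.5, (0.3, 1.1, 2.0)
    for t in (0.0, 0.13, 0.77):
        print(t, direct_sum(t, nu, amp, phases),
              combined_form(t, nu, amp, *phases))
```

Agreement of the two columns at a few instants does not prove the identity, but a disagreement would signal an algebra slip. The phasor function is one possible way to check the amplitudes asked for in parts c and d.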


13-26. Steel guitar. A steel guitar can readily be used
to produce a glissando (Italian for “sliding”) in which the
pitch of a vibrating string is continuously varied while the
string is sounding. This is accomplished by sliding a “stop”
along the string to increase or decrease the effective
length of the string. In a certain steel guitar, one particular
string vibrates at a frequency of 440 Hz at its full
length of 50 cm.

a.  How  far  must  the  slide  be  moved  to  raise  the  pitch 
by  one  octave? 

b.  If  the  slide  is  moved  at  a constant  speed  of  20  cm/s 
to  raise  the  pitch  by  an  octave,  find  the  duration  of  the 
glissando. 

c. How many times does the string vibrate during the
glissando in part b?



Group C

13-27. Rocked in the cradle of the deep. Two sources of
water waves, $S_1$ and $S_2$, are separated by a distance d, as
shown in Fig. 13E-27. The sources produce outgoing circular
waves at the same angular frequency $\omega$; the corresponding
wave number k is given by $k = \omega/v$, where v is
the wave speed. When only $S_1$ is operating, the instantaneous
vertical displacement of the water at point O is
given by

$$f_1(t) = \frac{C_1}{\sqrt{r_1}} \cos(kr_1 - kvt + \delta_1),$$

where $r_1$ is the distance from $S_1$ to O. When only $S_2$ is
operating, the instantaneous displacement at O is given by

$$f_2(t) = \frac{C_2}{\sqrt{r_2}} \cos(kr_2 - kvt + \delta_2),$$

where $r_2$ is the distance from $S_2$ to O.

Fig. 13E-27

a. Find the instantaneous vertical displacement f(t) at
O when both sources are operating.

b. Using trigonometric identities, show that the result
of part a can be rewritten as

$$f(t) = A_+ \cos\theta_+ \cos\theta_- - A_- \sin\theta_+ \sin\theta_-$$

where

$$A_\pm = \frac{C_1}{\sqrt{r_1}} \pm \frac{C_2}{\sqrt{r_2}}$$

$$\theta_+ = \frac{k(r_1 + r_2)}{2} - kvt + \frac{\delta_1 + \delta_2}{2}$$

and

$$\theta_- = \frac{k(r_1 - r_2)}{2} + \frac{\delta_1 - \delta_2}{2}$$

c. Find two restrictions relating $r_1$ and $r_2$ which, if
both are satisfied, lead to the result f(t) = 0; that is, the
point O is a node.

d. Suppose that $C_2 = C_1$ and $\delta_2 = \delta_1 - \pi$. That is,
suppose that $S_1$ and $S_2$ are equal in strength but are oscillating
180° out of phase. Show that the water surface will
remain motionless along the perpendicular bisector of the
line joining $S_1$ and $S_2$.

13-28. Reflection at a junction. Two wires, made of different
materials and/or having different thicknesses, have
linear densities $\mu_1$ and $\mu_2$ respectively. The two wires are
joined end to end. The combined wire is put under tension,
which must be the same in both parts of the wire. A
sinusoidal wave in the left part is traveling in the positive x
direction and can therefore be represented by the wave
function $y_i = A_i \cos(k_1 x - \omega_1 t)$. When this incident wave
reaches the junction at x = 0, it is partially reflected and partially
transmitted. The reflected wave is still in the left part
of the wire, and can be represented by the leftward-traveling
wave function $y_r = A_r \cos(k_1 x + \omega_1 t)$, where the
amplitude $A_r$ is yet to be determined. (Note: The omission
of a phase constant from this wave function is a trick
which leads to a simple method of determining the phase
relationship between the incident wave and the reflected
wave. As you will soon see, the omission leads to a value
for the amplitude $A_r$ of the reflected wave which may be
either positive or negative, whereas, strictly speaking, a
wave amplitude must be positive.) The transmitted wave is
represented by the wave function $y_t = A_t \cos(k_2 x - \omega_2 t)$,
with the value of $A_t$ likewise to be determined. One
boundary condition is that at the junction between the two
wires, their instantaneous displacements $y_1 = y_i + y_r$ and
$y_2 = y_t$ must be equal; that is, $y_1 = y_2$ at x = 0.

a. Show that this boundary condition leads to the results
$\omega_1 = \omega_2$ and $A_i + A_r = A_t$. What is the physical
meaning of these results?

b. The second boundary condition at the junction,
x = 0, is that $\partial y_1/\partial x = \partial y_2/\partial x$; that is, there is no kink at
the junction. (This is necessary since the force exerted by
the wires on each other at the junction must be equal and
opposite.) As in Chap. 12, assume that the displacements y
are small. Use the boundary condition to show that
$k_1 A_i - k_1 A_r = k_2 A_t$.

c. Using the results of parts a and b, show that

$$\frac{A_r}{A_i} = \frac{k_1 - k_2}{k_1 + k_2} = \frac{|v_2| - |v_1|}{|v_2| + |v_1|}$$

where $|v_1|$ and $|v_2|$ are the wave speeds on the two parts of
the wire.

d. If $\mu_1 < \mu_2$, what does the result of part c imply for
the sign of $A_r$? (This result includes the case where $\mu_2$ is
infinite, which means physically that the left part of the
wire is fixed rigidly at x = 0.) How can you rewrite the
wave function $y_r$ so that $A_r$ will have a positive value? What
is the resulting phase relationship between $y_r$ and $y_i$?

e. If $\mu_1 > \mu_2$, what is the sign of $A_r$? (This includes
the case where $\mu_2 = 0$, which means physically that the
left part of the wire is completely free at x = 0.) What is
the resulting phase relationship between $y_r$ and $y_i$? Express
your result in terms of a phase constant $\delta_r$ in the argument
of the function $y_r$.

f. Show that the phase difference between $y_i$ and $y_t$ is
always zero; that is, $\delta_t = 0$. Do this by showing that at
x = 0, $y_i + y_r = y_t$ regardless of the value of t.

g. Show that

$$\frac{A_t}{A_i} = \frac{2k_1}{k_1 + k_2} = \frac{2|v_2|}{|v_2| + |v_1|}$$

13-29. Energy flow at a junction. Equation (12-59) can
be applied to an element dx of a sinusoidal traveling wave
by replacing m by $\mu\,dx$ and E by dE, giving
$dE = \tfrac{1}{2}\omega^2 A^2 \mu\,dx$, where $\omega = 2\pi\nu$.

a. What is the expression for dE/dx, the energy per unit
length?

b. Equations (12-55) and (12-56) relate energy flux
and energy density. Use these equations and the result of
part a to show that the energy arriving at a junction of two
wires of different linear density joined together is equal to
the energy carried away from the junction. It will be necessary
to use the results of Exercise 13-28.
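The energy balance asked for in Exercise 13-29 can be spot-checked numerically for particular wires. The sketch below, in Python, assumes that the power carried by a sinusoidal wave equals the energy per unit length, $\tfrac{1}{2}\mu\omega^2A^2$, multiplied by the wave speed $\sqrt{F/\mu}$. It takes the amplitude ratios stated in Exercise 13-28 as given. The function names and sample numbers are illustrative assumptions.

```python
import math

def junction_ratios(mu1, mu2, tension):
    """Return (A_r/A_i, A_t/A_i) from the wave speeds on the two wires."""
    v1 = math.sqrt(tension / mu1)
    v2 = math.sqrt(tension / mu2)
    reflected = (v2 - v1) / (v2 + v1)
    transmitted = 2.0 * v2 / (v2 + v1)
    return reflected, transmitted

def power(mu, tension, omega, amplitude):
    """Energy per unit length times wave speed (assumed form of the flux)."""
    speed = math.sqrt(tension / mu)
    return 0.5 * mu * omega ** 2 * amplitude ** 2 * speed

if __name__ == "__main__":
    # Hypothetical wires: 1 g/m joined to 4 g/m, under 50 N, driven at 100 Hz.
    mu1, mu2, tension, omega, a_i = 1.0e-3, 4.0e-3, 50.0, 2.0 * math.pi * 100.0, 0.002
    r, t = junction_ratios(mu1, mu2, tension)
    incoming = power(mu1, tension, omega, a_i)
    outgoing = (power(mu1, tension, omega, r * a_i)
                + power(mu2, tension, omega, t * a_i))
    print(r, t, incoming, outgoing)
```

A numerical agreement between the incoming and outgoing power for one pair of wires is a check on the algebra, not a substitute for the general proof the exercise asks for.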

13-30. The information is all there! Consider carefully
the relationship among the mode frequencies and node
locations of successive symmetrical modes of a circular
membrane. One aspect of this relationship is implicit in
Exercise 13-19. Provide an explanation for the following
assertion (which is correct): The frequencies and wave
functions for modes 1 through n - 1 of the circular
membrane can be determined directly by inspection of the
results for the nth mode by itself, without any further numerical
calculation.

Numerical 

13-31. Checking the boundary conditions, I. Show that
for the second standing-wave mode of the symmetrical vibrating
drumhead a = 30.85 gives a better fit to the
boundary condition at u = 0 than that given by a = 30.80
or a = 30.90. Do this by running the vibrating drumhead
program with the same initial conditions and parameters
as in Example 13-7, except with a set equal to the lower
value in one run and equal to the higher value in another
run. Plot your data only in the region where u < 0.10.
Add to your plot points taken from Fig. 13-22 in this
region.

13-32. Checking the boundary conditions, II. Show that
for the third standing-wave mode of the symmetrical vibrating
drumhead a = 75.9 gives a better fit to the boundary
condition at u = 0 than that given by a = 75.8 or a =
76.0. Do this by running the vibrating drumhead program
with the same initial conditions and parameters as in Example
13-8, except with a set equal to the lower value in
one run and equal to the higher value in another run. Plot
your data only in the region where u < 0.10. Add to your
plot points taken from Fig. 13-23 in this region.

13-33. The fourth standing wave mode. Run the vibrating
drumhead program to find the value of the
parameter a for the fourth standing-wave mode of the
symmetrical vibrating drumhead. Determine appropriate
values for the initial conditions and the independent variable
increment by considering the values employed in Examples
13-6, 13-7, and 13-8. Make a guess at a trial value
to use for a by inspecting the values found for the first,
second, and third modes in the three examples. Next look
at Figs. 13-21, 13-22, and 13-23 in order to determine
how many times g(u) should change sign as u decreases
from 1 to 0. Then carry out a run with your trial value of
a. Do not plot, but just note the number of sign changes in
g(u), and its slope at u = 0. Search for a value of a which
produces a g(u) having the proper number of sign
changes, and an acceptably small slope at u = 0. When
you have found it, make a final run in which you plot g(u)
versus u. Compare your plot with the three figures. Finally,
use your value of a to add one more frequency $\nu$ to the
set displayed in Eq. (13-47).


13-34. A loaded drumhead. Describe how the procedure
for determining the standing wave modes of the
symmetrical vibrating drumhead would need to be modified
if a point particle of mass m were attached to the
center of the drumhead.

13-35. Standing waves on a loaded drumhead. Following
the procedure you described in Exercise 13-34, perform
numerical calculations to determine the frequencies and
wave functions of the first three modes of a loaded drumhead
for $m = \mu\pi a^2$.

13-36. Fourier synthesis with $A_j = 1/j^2$. Consider the
Fourier series

$$f(t) = \sum_{j=1}^{\infty} A_j \sin(2\pi j\nu t + \delta_j)$$

where $A_j = 1/j^2$ and $\delta_j = 0$.

a. Let $f_n(t)$ represent the sum of the first n terms in
the series; that is,

$$f_n(t) = \sum_{j=1}^{n} \frac{1}{j^2} \sin(2\pi j\nu t)$$

With the help of a programmable calculating device,
plot $f_n(t)$ versus $\nu t$ for $0 \le \nu t \le 1$, for each of the following
values of n: (1) n = 4; (2) n = 7; (3) n = 10.

b. Sketch the form of the function f(t) that would be
obtained by summing the entire infinite series. Contrast
the function f(t) obtained here with the sawtooth function
obtained in Sect. 13-7 by synthesis with $A_j = 1/j$.

13-37. Fourier synthesis of a square wave. Consider a
space-independent wave function f(t), given by $f(t) = A$
for $iT < t < (i + \tfrac{1}{2})T$ and $f(t) = -A$ for $(i - \tfrac{1}{2})T < t < iT$,
for all integers i = 0, ±1, ±2, ±3, .... The wave
described by f(t) is called a square wave of amplitude A and
period T.

a. Construct a graph showing f(t) for $-T \le t \le T$.

b. The square wave f(t) can be written as a Fourier
series in the form of Eq. (13-49b):

$$f(t) = \sum_{j=1}^{\infty} A_j \sin(2\pi j\nu t + \delta_j)$$

where $\nu = 1/T$, provided that the amplitudes $A_j$ and phases
$\delta_j$ are properly chosen. The correct choices are $A_j = 4A/\pi j$
for odd j, $A_j = 0$ for even j, $\delta_j = 0$ for all j. With the help
of a programmable calculating device, plot the partial sums

$$f_n(t) = \sum_{j=1}^{n} A_j \sin(2\pi j\nu t)$$

for each of the following values of n: (1) n = 3; (2) n = 5;
(3) n = 7.
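On a present-day computer the partial sums of Exercise 13-37 might be tabulated as in the following Python sketch. It uses the amplitudes quoted in the exercise, $A_j = 4A/\pi j$ for odd j and zero for even j. The function names, the default values A = 1 and T = 1, and the number of sample points are illustrative assumptions.

```python
import math

def square_wave_partial_sum(t, n, amplitude=1.0, period=1.0):
    """Sum the Fourier terms j = 1..n of the square wave at time t."""
    nu = 1.0 / period
    total = 0.0
    for j in range(1, n + 1, 2):  # only odd j contribute; even amplitudes vanish
        a_j = 4.0 * amplitude / (math.pi * j)
        total += a_j * math.sin(2.0 * math.pi * j * nu * t)
    return total

def sample_partial_sum(n, points=200, amplitude=1.0, period=1.0):
    """Return (t, f_n(t)) pairs covering -T <= t <= T, ready for plotting."""
    times = [-period + 2.0 * period * i / points for i in range(points + 1)]
    return [(t, square_wave_partial_sum(t, n, amplitude, period)) for t in times]

if __name__ == "__main__":
    for n in (3, 5, 7):
        # Value at the middle of the positive half-cycle, t = T/4.
        print(n, square_wave_partial_sum(0.25, n))
```

The list returned by `sample_partial_sum` can be handed to any plotting routine. Comparing the three curves shows how each added odd harmonic sharpens the corners of the wave.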

13-38. Numerical Fourier analysis of a plucked string.
A stretched string of length L is plucked so that its
midpoint is given a transverse displacement L/10. Define
an x axis extending along the undisturbed string, and with
an origin at one end. Then the shape of the string just before
being released can be written as $f_{\text{exact}}(x) = x/5$, for
$x \le L/2$; $f_{\text{exact}}(x) = (L - x)/5$, for $x > L/2$. Prepare a program
for your calculating device which allows you to find
approximate values of the first three nonzero amplitudes,
$A_1$, $A_3$, and $A_5$, in the Fourier series for $f_{\text{exact}}(x)$. This is:

$$f_{\text{Fourier}}(x) = A_1 \sin\frac{2\pi x}{\lambda} + A_3 \sin\frac{6\pi x}{\lambda} + A_5 \sin\frac{10\pi x}{\lambda}$$

where $\lambda = 2L$. The program should cause x to increase in
20 uniform steps from 0 to L. At each step the quantity
$f_{\text{exact}}(x) - f_{\text{Fourier}}(x)$ should be evaluated, and then entered
in a routine (accessible in a programmable calculator
through a single key) which calculates the standard deviation,
or variance, of a set of values. In running the program,
first set $A_3 = 0$ and $A_5 = 0$ and then find by trial
and error a value of $A_1$ which minimizes the standard deviation,
or variance. Next, using the value previously
found for $A_1$ and keeping $A_5 = 0$, find the value of $A_3$
which minimizes the standard deviation, or variance.
Finally, using the values previously found for $A_1$ and $A_3$,
find the value of $A_5$ which minimizes the standard deviation,
or variance. Now run the program in such a way that
you can make plots of $f_{\text{exact}}(x)$ and $f_{\text{Fourier}}(x)$. How is it
known from the beginning that $A_2 = 0$ and $A_4 = 0$? See
Exercise 13-21.
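The trial-and-error search described in Exercise 13-38 might be organized as in the Python sketch below. Several things here are assumptions rather than part of the exercise: the three-term sine form used for the Fourier approximation, the choice L = 1, the grid of trial values, and all function names. The standard deviation is taken about the mean of the differences, as a calculator's statistics key would do.

```python
import math

L = 1.0      # string length; all lengths are measured in units of L
STEPS = 20   # x increases in 20 uniform steps from 0 to L

def f_exact(x):
    """Shape of the string plucked at its midpoint to a height L/10."""
    return x / 5.0 if x <= L / 2.0 else (L - x) / 5.0

def f_fourier(x, a1, a3=0.0, a5=0.0):
    """Assumed three-term approximation with wavelength lambda = 2L."""
    lam = 2.0 * L
    return sum(a * math.sin(2.0 * math.pi * j * x / lam)
               for j, a in ((1, a1), (3, a3), (5, a5)))

def std_dev(a1, a3=0.0, a5=0.0):
    """Standard deviation of f_exact - f_fourier over the 21 sample points."""
    xs = [L * i / STEPS for i in range(STEPS + 1)]
    diffs = [f_exact(x) - f_fourier(x, a1, a3, a5) for x in xs]
    mean = sum(diffs) / len(diffs)
    return math.sqrt(sum((d - mean) ** 2 for d in diffs) / len(diffs))

def best_trial(trial_values, score):
    """Return the trial value giving the smallest score."""
    return min(trial_values, key=score)

# Hypothetical grid of trial amplitudes from 0 to 0.2 in steps of 0.0005.
trials = [i * 0.0005 for i in range(401)]
best_a1 = best_trial(trials, lambda a: std_dev(a))

if __name__ == "__main__":
    print(best_a1, std_dev(best_a1))
```

The same helper can be reused for $A_3$ and $A_5$ by holding the amplitudes already found fixed and scanning a grid that includes negative trial values. Because the standard deviation measures the spread about the mean rather than the size of the differences themselves, the values found this way need not coincide exactly with the coefficients given by an exact Fourier analysis.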

13-39. Numerical Fourier analysis revisited. Modify the
procedure described in Exercise 13-38 so that you can use
it to find approximate values of the first three nonzero
amplitudes $A_1$, $A_2$, and $A_3$ in the Fourier expansion for a
string plucked a third of the way from its end, at the point
x = L/3. Why is $A_2 \ne 0$ in this case? Can you explain why
$A_3 = 0$?




Relativistic 

Kinematics 


14-1 THE RELATIVISTIC DOMAIN

In the preceding chapters we have developed and used the mechanics of
Isaac Newton to study the behavior of objects moving at speeds small
compared to the speed of light. Now we turn our attention to objects moving at
speeds comparable to the speed of light. In so doing we enter what is called
the relativistic domain, temporarily leaving behind the newtonian domain.
In the relativistic domain, motion must be treated in terms of Albert
Einstein’s theory of relativity.

The basic concepts of relativity occur in two forms: the special theory
of relativity and the general theory of relativity. The special theory is used
to treat the rapid motion of objects, of any size, as seen from inertial
reference frames. It is the original form of the theory, developed by Einstein in
1905, and by far the simplest. The general theory has to do with noninertial
(that is, accelerating) reference frames and their relation to gravity.
This form of the theory, first set forth by Einstein in 1916, is used for the
most part to treat the behavior of very massive systems. These are generally
ones at the large end of the natural scale of sizes, that is, systems of
astronomical size. Because of its mathematical complexity, we only mention the
existence of the general theory. In this chapter and the next we concern
ourselves with developing the special theory of relativity. And in
subsequent chapters we make use of the special theory on a number of occasions.

The mechanics we will develop to analyze motion on the basis of the
special theory of relativity is known as relativistic mechanics. It is used to
study the high-speed motion of objects over the entire range of sizes.
Relativistic mechanics applies over the entire range of speeds, its results
merging smoothly into those of newtonian mechanics when the speed is
low compared to the speed of light. It is the more broadly applicable
mechanics, valid from zero speed to the speed of light. (We will find that the
range of possible speeds for any object ends at the speed of light; higher
speeds cannot be attained.) To put the matter another way, the rules of
newtonian mechanics emerge as the low-speed special case of the more
broadly applicable relativistic mechanics.

This close relation between newtonian and relativistic mechanics will
be very helpful in our development of relativistic mechanics. It allows us to
employ procedures analogous to those that worked so well in our
development of newtonian mechanics. First, we concentrate our attention on
accurately describing high-speed motion. This is the task of this chapter,
which treats relativistic kinematics. In Chap. 15 we consider the question
of what causes the motion, using arguments that involve the conservation
of momentum to obtain relativistic mechanics.

We will take advantage of the close relation between newtonian and
relativistic mechanics in another way to help us in our development of the
latter. Since relativistic mechanics is supposed to be valid at all possible
speeds, any relativistic result must agree with the familiar newtonian result
if the speed is low. Thus we can check each new result of relativistic
kinematics, or mechanics, against the newtonian result by considering the
low-speed case of the relativistic result. If the check is satisfactory, we will
gain confidence in the correctness of our arguments.
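The low-speed check can be illustrated numerically. The Python sketch below uses the factor $\gamma = 1/\sqrt{1 - v^2/c^2}$, which is standard in the special theory but has not yet been derived at this point in the text, so it is taken here as an assumption. The function name and the sample speeds are likewise illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact SI value)

def lorentz_factor(speed):
    """Return gamma = 1 / sqrt(1 - v^2/c^2) for a speed in m/s below c."""
    if not 0.0 <= speed < C:
        raise ValueError("speed must satisfy 0 <= speed < c")
    beta = speed / C
    return 1.0 / math.sqrt(1.0 - beta * beta)

if __name__ == "__main__":
    # Hypothetical sample speeds: a car, an orbiting satellite, and 0.6c.
    for v in (30.0, 7.8e3, 0.6 * C):
        print(v, lorentz_factor(v) - 1.0)
```

The printed quantity, $\gamma - 1$, measures how far a relativistic result can depart from its newtonian counterpart. It is negligibly small at everyday speeds and becomes appreciable only when the speed is a sizable fraction of c.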

You may find the special theory of relativity to be one of the most
interesting, challenging, and rewarding topics in this book. The arguments
involve simple mathematics and simple physical systems. But you must pay
very careful attention to the arguments. They often lead to conclusions that
seem to be in serious conflict with your intuition. If you object to a
conclusion on intuitive grounds, you should then feel obliged to find a fault in the
argument or a fault in your intuition. After all, you should not expect
“common sense” to take you very far in guessing about how systems should
behave at speeds with which you have had no everyday experience at all. Be
warned by Einstein’s saying: “Common sense is that layer of prejudices
acquired before the age of eighteen.”


14-2 THE SPEED OF LIGHT

The speed of light plays a fundamental role in the theory of relativity. This
speed enters into relativistic kinematics, and hence into relativistic
mechanics, as the maximum speed at which information can be transmitted
from one location to another. Before we can explain the connection
between relativistic kinematics and the transmission of information, we
must describe briefly how light travels. Then we must consider experiments
which measure how fast it travels in various circumstances.

Light travels according to the laws of wave motion. Except for a very
important difference, it travels in just the same way that mechanical waves
do. Thus there is a strong analogy between light waves and transverse
waves propagating along the connected particles of a stretched string, or
pressure waves propagating through the interacting particles of the air.
The very important difference is that there is no mechanical medium involved
in the propagation of a light wave. This is made apparent by the fact that light
travels from the sun to the earth through the essentially perfect vacuum of
space.

In Chap. 27 we investigate what it is that is “waving” in a light wave,
and we find that the answer is electric and magnetic fields. We also find that
traveling light waves transport energy, just as traveling mechanical waves
do. This is the property of interest here. But you already know it and are
reminded of it every time you stand under the hot sun on a clear day.
Because light carries energy, it can be used to transmit signals that can actuate
detectors at distant locations. In fact, light waves are commonly used for
this purpose. Of course, this is just what happens in the phenomenon of
vision.

Fig. 14-1 The reinforcement in the reflection of light by a thin oil film floating
on water.


In Chap. 28 we study a variety of experimental demonstrations
proving that light travels as a wave. Here we describe two such
demonstrations. Our first experimental proof of the wave nature of light is
found in the multiple colors produced when white light is reflected from a
very thin film of oil that has spread over a surface of water. You have surely
seen this phenomenon at some time or other.

A very thin film of oil is itself quite transparent and colorless. It
certainly does not have a multiplicity of intrinsic colors. The observed colors
are formed because part of the illuminating light is reflected at the
interface between the oil and the air above it, and the rest passes into the
film where some of it is reflected at the interface between the oil and the
water beneath it. An observer sees light reflected from both surfaces, the two
beams of reflected light having traveled paths of different length to reach
the eye. See Fig. 14-1. If the illuminating light is of a single wavelength, the
two beams of reflected light will either reinforce or cancel each other when
superposed in the observer’s eye, depending on the way in which the
wavelength of the light fits into the difference in the two path lengths. If the
additional length of path traveled by the light reflected from the lower
surface is equal to an integral number of wavelengths (that is, $\lambda$, or
$2\lambda$, or $3\lambda$, and so forth), then the two reflected light beams will
be in phase when they combine, and they will reinforce each other. (For the
case illustrated in the figure, the combining light beams reinforce each other
because the length of the additional path is $\lambda$.) If the additional path
length is equal to a half-integral number of wavelengths (that is, $\lambda/2$,
or $3\lambda/2$, or $5\lambda/2$, and so forth), the beams will be out of phase
when they combine and so will cancel each other.
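
The reinforcement and cancellation rule just stated can be sketched in a few
lines of Python. This is only an illustration of the simplified rule used in
the text, which considers the extra path length alone. The function name
`classify_path_difference`, the tolerance, and the 550-nm wavelength are
assumptions introduced for the example.

```python
def classify_path_difference(path_difference, wavelength, tolerance=1e-9):
    """Apply the text's rule to two beams that differ only in path length.

    Returns "reinforce" for an integral number of wavelengths,
    "cancel" for a half-integral number, and "partial" otherwise.
    """
    # How many wavelengths fit into the additional path.
    cycles = path_difference / wavelength
    fraction = cycles % 1.0
    # Integral number of wavelengths: beams recombine in phase.
    if min(fraction, 1.0 - fraction) < tolerance:
        return "reinforce"
    # Half-integral number of wavelengths: beams recombine out of phase.
    if abs(fraction - 0.5) < tolerance:
        return "cancel"
    return "partial"


wavelength = 550e-9  # assumed illustrative wavelength, in meters
print(classify_path_difference(2 * wavelength, wavelength))    # reinforce
print(classify_path_difference(1.5 * wavelength, wavelength))  # cancel
```

With white light, the same test applied wavelength by wavelength picks out
which colors are strongly reflected by a region of given film thickness.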

White light is actually a mixture of wavelengths, the various
wavelengths corresponding to what our eyes and brain perceive as the various
colors. Also, an oil film is usually not of uniform thickness. So when white
light is incident on such a film, regions of a particular thickness produce
strong reflection of the particular wavelength which has the proper
relation to the thickness, and we perceive the corresponding color. Regions of
different thicknesses produce strong reflections of different wavelengths,
perceived as different colors. The most satisfactory way to explain this
familiar phenomenon (and the only way to explain many others which we
discuss in the next paragraphs and in Chap. 28) is by saying that light
travels as waves that superpose (that is, obey the superposition principle)
just like mechanical waves involving small displacements.


A particularly striking demonstration and application of the
superposition of light waves are provided by an instrument invented and
refined in the 1880s by the U.S. physicist Albert A. Michelson (1852-1931).
The instrument shown schematically in Fig. 14-2 is called a Michelson
interferometer. Light of a single wavelength expanding from a point source S is
rendered into a nondivergent beam by lens L. The beam falls on a glass plate P
inclined at an angle of 45° to the beam. The plate is lightly silvered so that
it forms a partially reflecting mirror. Approximately half of the incident
beam is reflected along path 1 to the fully silvered mirror $M_1$, and the
remainder is transmitted through P to follow path 2 to the fully silvered
mirror $M_2$. Mirrors $M_1$ and $M_2$ reflect the beams back on themselves
toward P. On striking P, about half of the beam returning along path 2 is
reflected into the observer’s eye E, and about half of the beam returning along
path 1 is transmitted through P into E. Thus the interferometer splits the beam
of light from the source into two beams, which travel different paths until
they are brought together to recombine in the eye of the observer.

Fig. 14-2 Schematic diagram of a Michelson interferometer. Consideration of
the paths followed by light at the two edges of the beam incident on plate
P will show that the path difference is the same across the entire width of
the recombining beams, provided that the surface of plate P is at precisely
45° to the direction of the beam incident on it and that the surfaces of
mirrors $M_1$ and $M_2$ are at precisely 90° to the beams incident on them.
The inset shows a representation of the fringe pattern.

The recombining beams will be in phase, and so will reinforce each
other, if the round-trip distance along path 1 differs from the round-trip
distance along path 2 by an integral number of wavelengths $\lambda$ of the
light. If these distances differ by a half-integral number of wavelengths, the
recombining beams will be out of phase and will cancel each other. (Note the
close analogy with the acoustical interferometer of Sec. 13-1.) In practice,
$M_1$ and $M_2$ are not precisely perpendicular, so the path difference varies
by a few wavelengths across the width of the recombining beams. The result is
that the observer sees a pattern of a few alternating bright and dark bands,
called fringes, as indicated in the insert below the eye in Fig. 14-2.

The mirror $M_1$ can be moved along a carriage by turning the handle H
connected to a screw of fine pitch. When $M_1$ is moved through a distance
$\lambda/4$, one-quarter of the wavelength of the light emitted by S, the
round-trip distance traveled by the light beam in that arm of the
interferometer changes by $\lambda/2$, one-half a wavelength. Thus all the path
differences for the two beams change by half a wavelength. As a consequence,
every region where the recombining beams had been in phase now becomes a region
where they are out of phase, and every region where they had been out of
phase now becomes a region where they are in phase. This makes all the
regions that had contained bright fringes now shift to regions that contain
dark fringes, and vice versa. The effect is striking. It can be understood
only in terms of the superposition of waves.
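
The relation between mirror travel and fringe motion described above can be
written out as a short Python sketch. It assumes only what the paragraph
states: moving the mirror a distance d changes the round-trip path by 2d. The
function name `path_change_in_wavelengths` and the 600-nm wavelength are
hypothetical choices made for the illustration.

```python
def path_change_in_wavelengths(mirror_displacement, wavelength):
    """Change in round-trip path, in units of the wavelength.

    The light traverses the arm twice, so the path changes by twice
    the mirror displacement.
    """
    return 2.0 * mirror_displacement / wavelength


wavelength = 600e-9  # assumed illustrative wavelength, in meters

# A quarter-wavelength move changes the path by half a wavelength,
# which turns every bright fringe dark and every dark fringe bright.
print(path_change_in_wavelengths(wavelength / 4, wavelength))  # 0.5

# A half-wavelength move restores the original pattern (one full shift).
print(path_change_in_wavelengths(wavelength / 2, wavelength))  # 1.0
```

Read in reverse, the same relation lets a counted number of fringe shifts be
converted into a mirror displacement, which is the basis of the length
measurements discussed next.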




A Michelson interferometer is the instrument used to establish the relation
between the meter and the wavelength of light emitted by atoms of krypton-86. As
discussed in Sec. 2-2, the standard meter is now defined to be 1,650,763.73
wavelengths of this light. The number was obtained by using a krypton-86 light
source, stepping off with the movable mirror a total distance defined by the
separation between the scratches at the ends of the meter bar that was the
previous standard, and counting the total number of fringe shifts observed.
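
As a quick numerical check on the definition just quoted, the wavelength of
the krypton-86 light follows directly from the number of wavelengths per
meter. The short Python sketch below uses only the figure given in the text;
the variable names are assumptions made for the illustration.

```python
# Number of krypton-86 wavelengths in one meter, as quoted in the text.
WAVELENGTHS_PER_METER = 1_650_763.73

# The wavelength is the reciprocal of that count.
wavelength_m = 1.0 / WAVELENGTHS_PER_METER
print(f"wavelength = {wavelength_m:.5e} m")  # about 6.0578e-07 m

# Moving the mirror through 1 m changes the round-trip path by 2 m, so the
# number of complete fringe shifts counted is twice the number above.
fringe_shifts_per_meter = 2.0 * WAVELENGTHS_PER_METER
print(f"fringe shifts per meter of mirror travel = {fringe_shifts_per_meter:.2f}")
```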

Interferometers using a light source of known wavelength are employed in
science, engineering, and technology to measure lengths. The limit of accuracy is
a fraction of the wavelength of the light used, that is, smaller than about
$5 \times 10^{-7}$ m. It is desirable for the light source to be a laser. The
reason is that lasers emit an almost continuous train of light waves that will
produce fringes in an interferometer even when the difference in the lengths of
the paths followed by the two beams is very large. All other light sources emit
light in a series of short-duration bursts, with the wave in each burst
unrelated in phase to the wave in the others. Because the bursts of light from
these sources are of short duration, they extend along the beam only a short
distance, typically something like a few centimeters. When the difference in
the total length of the two paths followed by the split beam through the
interferometer exceeds the length of an individual burst of light (the
coherence length), fringes are no longer observed. This is because a burst
combines in the observer's eye no longer with itself, but instead with a
preceding or following burst that is unrelated in phase. The accuracy of the
krypton-86 interferometer definition of the meter suffers as a result of the
necessity of measuring the length of a meter bar in a sequence of short steps,
instead of in one single measurement. So it is likely that the standard meter
will be redefined once more, using a laser interferometer. (Krypton does not
appear to be usable in a laser, so a different material, producing light of a
different wavelength, probably will have to be employed in such a redefinition
of the meter.)

We turn now to a question of most basic importance to the theory of
relativity: What is the speed of propagation of a light wave? The first
reasonably accurate measurement of the speed of light was performed in 1849
by the French physicist A. H. L. Fizeau (1819-1896), using the apparatus
sketched schematically in Fig. 14-3. Light from an intense source S was
reflected by a mirror $M_1$ to a lens $L_1$ that focused it on a gap in a wheel
W having 720 evenly spaced teeth. The light passed through the gap to a lens
$L_2$, which converted it into a parallel beam that traveled a distance of 8.63
km to a mirror $M_2$. At the mirror the beam was reflected back on itself,
finally passing through the gap in the wheel. Just missing $M_1$, the beam
entered the observer’s eye E. The wheel was then set into rotation, at
gradually increasing speed. The brightness of the light seen by the observer
decreased, reaching zero when the speed was 12.6 rotations per second, and
then increased to a maximum when the rotation speed was 25.2 rotations
per second. This is because the rotating toothed wheel chopped the light
beam traveling toward the mirror into a series of pulses. At 12.6 rotations
per second, the time required for a pulse to travel to the mirror and back
equaled the time required for the wheel to rotate through such an angle
that the tooth adjacent to the gap moved into the path of the beam,
completely blocking the returning pulse. But when the rotation speed was 25.2
rotations per second, the next gap of the wheel had rotated into the light
path, allowing all the returning pulse to pass into the observer’s eye. These
data are used in Example 14-1 to evaluate the speed of light.

Fig. 14-3 Schematic diagram of Fizeau’s apparatus for measuring the speed
of light.


EXAMPLE  14-1 

Use  Fizeau's  data  to  determine  the  speed  of  light. 

■ Since there were 720 teeth, and as many gaps, in the wheel, the time t
required for one gap to rotate into the position of an adjacent gap is 1/720 of
the time required for the wheel to make a complete rotation. At 25.2 rotations
per second, the time for one complete rotation is 1 s/25.2. So

$$t = \frac{1}{720} \times \frac{1\ \text{s}}{25.2} = 5.51 \times 10^{-5}\ \text{s}$$

In the time t the light pulse travels a distance d from the wheel to the mirror
and back. This distance is

$$d = 2 \times 8.63 \times 10^{3}\ \text{m} = 1.73 \times 10^{4}\ \text{m}$$

Evaluating the speed v = d/t of the light pulse, you have

$$v = \frac{d}{t} = \frac{1.73 \times 10^{4}\ \text{m}}{5.51 \times 10^{-5}\ \text{s}} = 3.14 \times 10^{8}\ \text{m/s}$$

This result is about 5 percent too large, compared to more precise modern
experiments, owing mainly to the uncertainty inherent in Fizeau’s method of
measuring the rotation rate of his toothed wheel.
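
The arithmetic of Example 14-1 can be reproduced with a short Python sketch.
It uses only the data quoted in the text (720 teeth, 25.2 rotations per
second, 8.63 km); the function name `fizeau_speed` and its parameter names are
assumptions made for the illustration.

```python
def fizeau_speed(teeth, rotation_rate_hz, one_way_distance_m):
    """Speed of light from the rotation rate at which brightness returns."""
    # Time for one gap to move into the position of the adjacent gap.
    gap_to_gap_time_s = 1.0 / (teeth * rotation_rate_hz)
    # The pulse travels from the wheel to the mirror and back in that time.
    round_trip_m = 2.0 * one_way_distance_m
    return round_trip_m / gap_to_gap_time_s


speed = fizeau_speed(teeth=720, rotation_rate_hz=25.2, one_way_distance_m=8.63e3)
print(f"v = {speed:.3e} m/s")  # about 3.13e8 m/s
```

Carrying full precision through the calculation gives about 3.13 × 10^8 m/s;
the value 3.14 × 10^8 m/s in the example comes from rounding d and t to three
significant figures before dividing.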


In the intervening years many investigators have carried out
progressively more accurate measurements of the speed of light. A central
motivation for this difficult task is the very important role played by the
speed of light in physics. These measurements have been made using not only
visible light, but also other members of the family of electromagnetic
radiation to which light belongs. All electromagnetic radiation travels at the
same speed in vacuum. For technical reasons, it is possible to make especially
precise measurements of the speed of the type of electromagnetic radiation
called microwaves (very short-wavelength radio waves, with $\lambda \approx 1$
cm). The speed of electromagnetic radiation, loosely called the speed of light,
is so fundamental that it is given its own symbol c. The currently accepted
value, obtained principally from microwave measurements in vacuum, is

$$c = 2.99793 \times 10^{8}\ \text{m/s} \tag{14-1}$$

with an accuracy of 1 in the last quoted decimal place. For most practical
calculations the value used is $c = 3.00 \times 10^{8}$ m/s. Although
measurements of the speed of light are sometimes made with the light traveling
through air, the value given in Eq. (14-1) has been corrected so that it
pertains to the speed of light in vacuum. The correction is very small; the
speed in vacuum is higher than the speed in air by $0.00067 \times 10^{8}$ m/s.




Consider  a mechanical  wave  propagating  through  a medium  that  is  at 
rest  in  an  inertial  reference  frame,  such  as  a sound  wave  propagating 
through still air. The wave travels at a speed that can be calculated from the
wave equation (or looked up in a table reporting experimental measurements)
in terms of the mechanical properties of the propagation medium.
When  we  speak  of  the  speed  of  this  wave,  it  is  perfectly  unambiguous  what 
we  mean.  We  do  not  mean  the  speed  measured  with  respect  to  the  source 
of  the  wave.  This  would  be  inappropriate  since  the  speed  of  the  wave  (in 
contrast  to  its  frequency  or  wavelength)  is  not  affected  by  any  motion  the 
source  may  have.  Nor  do  we  mean  the  speed  of  the  wave  measured  with 
respect  to  the  observer  of  the  wave.  It  would  not  be  appropriate  to  mean 
this  because  the  speed  measured  with  respect  to  the  observer  (in  other 
words,  the  value  that  would  be  obtained  by  the  observer  in  a measurement 
of  the  speed  of  the  wave  as  seen  by  the  observer)  is  affected  by  the  motion 
of  the  observer.  So  there  would  be  no  unique  value  for  the  speed  of  a wave 
propagating  through  a particular  mechanical  medium  if  the  speed  were 
measured  with  respect  to  the  observer.  What  we  do  mean  when  we  speak  of 
the  speed  of  the  wave  is  the  speed  measured  with  respect  to  the  mechanical 
medium  through  which  it  propagates.  For  a particular  medium  this  speed  has  a 
particular  value,  and  it  is  the  fundamental  value  characterizing  the  motion 
of  the  wave. 

But a light wave propagates through vacuum. So there is no medium
involved in its propagation. Yet speed must always be measured with respect to
something. What then can we mean when we talk about the speed of light in
vacuum? With no propagation medium with respect to which to measure the
speed, it would seem that we must choose between measuring the speed
with respect to the source of a light wave or measuring the speed with
respect to the observer of the light wave. (What else is there?) We have no
reason to expect that the motion of the source of a light wave is connected
in any way to its speed; it certainly is not in the case of a mechanical wave.
If we rule out the source, what remains is the observer. Then thinking
carefully about what actually happens in an experimental procedure used to
determine the speed of light, we soon come to the conclusion that if we say
the speed of a light wave is to be measured with respect to the observer, we
are completely consistent with the experimental procedure. In an
experiment used to determine the speed of light, the light wave moves through
some apparatus fixed in a reference frame that can be considered to be an
inertial frame. Take as an example an experiment with the apparatus fixed
to the surface of the earth. An observer stationed at the apparatus
manipulates it and records the data. Thus the value obtained for the speed of
light is a value measured with respect to the apparatus or, what amounts to the
same thing, a value measured with respect to the observer. These
considerations tell us what we mean by the speed of light by specifying the
operational definition of the quantity, that is, the definition in terms of the
experimental operations carried out in order to measure its value. What we mean
is that we measure the speed of light with respect to the observer.


Say we are measuring the speed of a beam of light passing by us and
that we are in the essentially inertial reference frame of the earth’s surface.
Another observer is in an essentially inertial reference frame of a rocket
ship, moving uniformly with respect to us in the direction of the light beam
at a very high speed. This observer uses an identical apparatus to measure
the speed of the same light beam. Will we obtain the value quoted in Eq.
(14-1), $c = 3.00 \times 10^{8}$ m/s, while the observer obtains some other
value? If we answer Yes, we put ourselves in the following logical dilemma. We
are on the earth, which moves at an appreciable speed with respect to the sun,
while the sun moves with respect to the center of mass of our galaxy and
our galaxy moves with respect to the center of mass of the universe. So it is
not only the observer in the rocket ship who is moving with respect to the
universe; we are, too. Who, then, will be measuring “the” speed of the
light beam? If it is not the observer in the rocket ship, why should it be us?
In the absence of a medium through which light waves propagate, there is
no observer in the universe who has a privileged status with regard to a
measurement of the speed of light.

For mechanical waves, there is a privileged observer: the one who is
at rest with respect to the inertial frame of the propagation medium. This
observer will obtain the fundamental value when measuring the speed of
the mechanical waves. Other observers, in inertial frames moving with
respect to that of the propagation medium, will find numerically different
values when they measure the speed of the waves from their point of view.
If an observer is moving through the medium in the same direction as that
of the mechanical wave, the value he measures for the speed will be
decreased by the value of his speed through the medium. If another observer
is moving relative to the medium in opposition to the direction of the
propagating mechanical wave, she will measure a value for its speed which is
increased by the value of her speed. There is an abundance of accurate
experimental proof that this is so. An example is found in any
measurement of the Doppler effect for sound waves with a moving observer,
discussed in Sec. 12-7. But it cannot work this way for light waves since light
waves have no propagation medium.
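
The rule for mechanical waves described in this paragraph amounts to a simple
subtraction of velocities along the line of propagation. The Python sketch
below illustrates it for sound; the function name `observed_wave_speed`, the
sign convention, and the numerical values (343 m/s for sound in air, 30 m/s for
the observers) are assumptions chosen only for the example.

```python
def observed_wave_speed(wave_speed_in_medium, observer_velocity):
    """Speed of a mechanical wave as measured by a moving observer.

    observer_velocity is the observer's velocity through the medium,
    positive when the observer moves in the same direction as the wave
    and negative when the observer moves against it.
    """
    return wave_speed_in_medium - observer_velocity


SOUND_SPEED = 343.0  # m/s, assumed illustrative value for still air

# Observer at rest in the medium: obtains the fundamental value.
print(observed_wave_speed(SOUND_SPEED, 0.0))    # 343.0

# Observer moving with the wave at 30 m/s: measured speed is decreased.
print(observed_wave_speed(SOUND_SPEED, 30.0))   # 313.0

# Observer moving against the wave at 30 m/s: measured speed is increased.
print(observed_wave_speed(SOUND_SPEED, -30.0))  # 373.0
```

The point of the surrounding argument is that no such subtraction can be set
up for light, because there is no medium relative to which the observer's
velocity could be defined.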

Based,  at  least  in  part,  on  ideas  such  as  these,  in  his  landmark  1905 
paper  on  relativity  theory  Einstein  made  this  assertion:  The  speed  of  light  is 
measured  by  observers  in  all  inertial  reference  frames  to  have  the  same  value,  despite 
the  fact  that  such  observers  may  be  moving  with  respect  to  one  another. 


In the nineteenth century, most physicists held strongly to quite a
different idea. They were so impressed with the successful theory of the
propagation of mechanical waves that they tried, as much as possible, to use it
as a model for the theory of the propagation of light waves. Thus the
generally held view at the time was that there is a privileged observer for
light waves: the observer in a special inertial reference frame. It was believed
that only that observer would obtain the fundamental value when
measuring the speed of light waves. Other observers, in inertial frames moving
with respect to the special one, would obtain different values for the speed.
The special frame was usually identified as the inertial frame
fixed with respect to the universe as a whole. Thus the idea was that an
observer in a reference frame stationary with respect to the center of mass
of the universe would measure the “real” value of the speed of any light
beam. And an observer in an inertial frame moving with respect to the center
of mass of the universe would measure an “apparent” value for the speed
of the light beam which differed from the “real” value in a way that depends
on the motion of the observer’s inertial frame.




In 1881 Michelson began a series of experiments whose results, when
finally they were understood, showed that this idea is incorrect. The
experiments used a Michelson interferometer. In particular, a very careful
series of measurements on an improved apparatus was performed by Michelson
in collaboration with the chemist Edward W. Morley (1838-1923) in 1887.

The Michelson-Morley experiment, as it is called, takes advantage of
the fact that any reference frame fixed to the surface of the earth is an
essentially inertial frame that is moving at a high speed with respect to the
inertial frame fixed to the center of mass of the universe. At the time no
information existed concerning the motion of either our galaxy with respect
to the universe or the sun with respect to our galaxy. But motion of the
earth with respect to the sun was, of course, well known. Almost entirely
because of its annual motion, the earth moves with respect to the sun at a
speed of about $3 \times 10^{4}$ m/s.

In  the  experiment  a large  interferometer  was  mounted  on  a granite 
block.  The  block  was  floated  in  a pool  of  mercury,  both  to  minimize 
bending  of  the  block  from  any  nonuniformity  of  support  and  to  facilitate 
smooth  rotation  of  the  entire  apparatus  in  the  horizontal  plane.  One  arm 
of  the  interferometer  was  aligned  parallel  to  the  direction  of  motion  of  the 
earth  with  respect  to  the  sun,  and  the  other  arm  was  aligned  perpendicular 
to  this  direction.  In  the  parallel  arm,  the  light  first  moved  away  from  the 
observer,  in  the  same  direction  as  that  of  the  observer’s  motion  with  respect 
to  the  sun.  After  reflection  from  the  mirror  at  the  end  of  the  arm,  the  light 
moved  in  a direction  opposite  to  that  of  the  observer’s  motion  with  respect 
to  the  sun.  In  the  perpendicular  arm,  the  direction  of  motion  of  the  light 
was  at  all  times  perpendicular  to  that  of  the  observer.  If  the  motion  with 
respect  to  the  sun  of  the  observer  stationed  at  the  interferometer  affected 
the  speed  of  the  light  that  he  observed — because  a motion  with  respect 
to  the  sun  means  a motion  with  respect  to  the  universe  as  a whole  — there 
would  be  very  small  but  measurable  differences  between  the  speeds  the 
observer  would  find  for  light  traveling  in  the  various  directions.  If  these 
differences  were  present,  they  would  show  up  as  a change  in  the  phase 
of  one  recombining  light  beam  with  respect  to  the  other.  This  is  because 
the  total  travel  time  for  one  beam  would  be  affected  differently  from  that 
of  the  other.  In  the  experiment,  the  fringe  pattern  was  observed  as  the 
interferometer  was  rotated  through  90°.  The  90°  rotation  interchanged  the 
two  arms  of  the  interferometer,  which  would  have  reversed  the  sign  of  any 
phase  change  present  and  thereby  led  to  a fringe  shift. 

Michelson  and  Morley  made  a calculation  in  which  they  assumed  that 
the  observed  speed  of  light  waves  is  affected  by  the  motion  of  the  observer’s 
inertial  frame  with  respect  to  one  fixed  to  the  center  of  mass  of  the  universe 
in  the  same  way  that  the  observed  speed  of  mechanical  waves  is  affected  by 
the  motion  of  the  observer’s  inertial  frame  with  respect  to  one  fixed  to  the 
mechanical  propagation  medium.  The  calculation  predicted  that  they 
should have found a shift of about 0.4 fringe when they rotated the apparatus
by 90°. (This does not take into account the motion of the sun with
respect  to  the  universe.  When  it  is  taken  into  account,  the  predicted  fringe 
shift  is  considerably  larger.)  They  knew  that  they  could  reliably  detect  a 
fringe  shift  as  small  as  0.01  fringe.  But  to  this  degree  of  accuracy  they 
found  no  evidence  for  a fringe  shift. 
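
The size of the predicted shift can be estimated with a short Python sketch.
It follows the assumption described above: the light is taken to behave like a
mechanical wave whose "medium" is at rest with respect to the sun, so that its
speed relative to the apparatus is c - v, c + v, or sqrt(c² - v²) according to
direction. The effective arm length of 11 m and the wavelength of
5.5 × 10^-7 m are not given in the text; they are assumed here as
representative values for the 1887 apparatus, and the function name
`predicted_fringe_shift` is hypothetical.

```python
import math


def predicted_fringe_shift(arm_length, wavelength, v, c):
    """Fringe shift expected on a 90-degree rotation, under the
    (incorrect) assumption that light speed adds like a mechanical wave."""
    # Round trip along the arm parallel to the motion: out at c - v, back at c + v.
    t_parallel = arm_length / (c - v) + arm_length / (c + v)
    # Round trip along the perpendicular arm: effective speed sqrt(c^2 - v^2).
    t_perpendicular = 2.0 * arm_length / math.sqrt(c * c - v * v)
    # Difference in travel time between the two recombining beams.
    delta_t = t_parallel - t_perpendicular
    # Rotating by 90 degrees interchanges the arms, reversing the sign of
    # the time difference, so the total change is twice delta_t. Dividing
    # the corresponding path change by the wavelength gives fringes.
    return 2.0 * c * delta_t / wavelength


shift = predicted_fringe_shift(
    arm_length=11.0,     # m, assumed effective optical arm length
    wavelength=5.5e-7,   # m, assumed wavelength of the light
    v=3.0e4,             # m/s, speed of the earth with respect to the sun
    c=3.00e8,            # m/s
)
print(f"predicted shift = {shift:.2f} fringe")  # about 0.40
```

With these assumed values the estimate reproduces the shift of about 0.4
fringe quoted in the text, forty times larger than the 0.01 fringe that the
apparatus could reliably detect.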

Thus the Michelson-Morley experiment is inconsistent with the idea
that there is something special about an inertial frame fixed to the center of
mass of the universe, as far as the speed of light is concerned. This is so
because the experiment shows that rapid motion of an observer’s inertial
frame with respect to the “special” inertial frame does not change the speed
of light measured by the observer. Michelson received the Nobel prize in
physics in 1907 for the invention and utilization of the interferometer. He
was the first U.S. citizen to win the prize.

In contrast, the Michelson-Morley experiment is consistent with
Einstein’s assertion that the speed of light is measured by observers in all
inertial frames to have the same value.


What Einstein said is certainly not what Newton would have said.
Einstein’s assertion is completely contradictory to newtonian intuition. For
instance, imagine the front of a light beam that is moving away from a
flashlight just after it has been turned on, as in Fig. 14-4. Let there be two
observers measuring the speed of the front of the beam. Observer O is
stationed in the inertial frame containing the flashlight, and observer O', who
is on a rocket ship, is stationed in an inertial frame moving with respect to
the first one in the same direction as the light beam and at half the speed of
light. If the speed of light is the same for both observers, then both O and O'
will obtain the same numerical value, $c = 3.00 \times 10^{8}$ m/s, for their
measurements of the speed at which the front of the beam moves away from
them.

Fig. 14-4 Observer O is at rest in one inertial frame, and observer O' is at
rest in another inertial frame. Observer O' is moving to the right relative to
O, so the frames are in relative motion. But because all inertial frames are
equivalent, the speed of light is the same in both frames. Thus O and O' will
find that the speed of the light beam is the same.

You cannot say that O is right while O' is wrong, just because O
happens to be in the same inertial frame as the flashlight. The speed of
light waves is not affected by the motion of the source any more than the
speed of sound waves is affected by the motion of the source of sound.
(Many experiments show this statement to be true. One of the earliest was a
Michelson-Morley type of experiment in which the source of light was a
star moving rapidly away from the observer making the measurements.) If
both O and O' are in inertial frames relative to which the flashlight is in
rapid motion, and the two observers are also moving rapidly with respect to
each other, Einstein would still say that both obtain the same value
$c = 3.00 \times 10^{8}$ m/s in their measurements of the speed of the front of
the beam.

Before you use the argument of Fig. 14-4 to reject Einstein’s assertion
about the speed of light as an impossibility, remember that you most likely
have no experience at all in the relativistic domain. Since you live in a world
where obvious phenomena have to do with speeds that are extremely small
compared to the speed of light, your intuition is based on evidence from
the newtonian domain. It cannot legitimately be used to deny, or to
confirm, a statement concerning the relativistic domain.


14-3 THE EQUIVALENCE OF INERTIAL FRAMES

The most important feature of Einstein’s first paper on relativity theory is
his postulate of the equivalence of inertial reference frames. Before stating
the postulate, we will summarize very briefly the point of view concerning
inertial reference frames that most physicists held around the turn of the
twentieth century.

Different  inertial  reference  frames  move  with  respect  to  one  another. 
But  they  do  so  with  constant  velocity  and  thus  do  not  accelerate  with 
respect  to  one  another.  For  this  reason,  all  inertial  frames  are  completely 
equivalent  with  regard  to  all  mechanical  phenomena.  A game  of  billiards 
played  on  a table  in  a railway  car  moving  smoothly  and  uniformly  along  a 


14-3  The  Equivalence  of  Inertial  Frames  627 


Fig.  14-5  Observer  O is  at  rest  in  one  inertial  frame,  and  observer 
O'  is  at  rest  in  another  inertial  frame.  Observer  O'  is  moving  to 
the  right  relative  to  O,  so  the  frames  are  in  relative  motion.  But 
because  all  inertial  frames  are  equivalent,  the  mechanical  laws 
governing  the  motion  of  the  billiard  balls  is  the  same  in  both 
frames.  Thus  O and  O'  will  find  that  the  mechanical  laws  governing 
the  behavior  of  the  billiard  balls  are  the  same. 


straight  track,  as  in  Fig.  14-5,  is  governed  by  exactly  the  same  equations  of 
mechanics  as  when  the  game  is  played  with  the  car  standing  in  a station.  We 
have  seen  other  examples  of  this  property  on  several  occasions  in  earlier 
chapters,  and  it  was  certainly  well  known  by  physicists  working  before 
1905.  But,  as  we  have  said,  physicists  of  that  era  thought  that  for  optical 
phenomena  (phenomena  involving  light)  all  inertial  frames  were  not  equiv- 
alent. They  felt  that  there  was  one  special  inertial  frame  that  was  singled 
out  by  the  properties  of  light  propagation.  In  that  frame  — and  in  that 
frame  alone — they  believed  the  speed  of  light  would  be  measured  to  have 
the  numerical  value  c = 3.00  x 108  m/s. 

Einstein asserted that nature is simpler than others had believed to be
the case because the same situation that applies for mechanical phenomena
also applies for optical phenomena. All inertial reference frames are equivalent
for optical phenomena, as well as for mechanical phenomena, since the
speed of light will have the same measured value c in all such frames. In his
1905 paper Einstein generalized to make the statement include all physical
phenomena. He did this in his postulate of the equivalence of inertial
frames: All inertial reference frames are completely equivalent for all physical
phenomena. This statement is often called the principle of relativity. A tremendous
body of experimental data has verified Einstein’s postulate. The
experimental work started with Michelson and Morley and continues to
this day. The experiments verifying the postulate of equivalence of inertial
frames cover a wide variety of phenomena. No valid experiments have contradicted
it. Relativistic kinematics is founded on this single postulate.

There has been controversy about the extent to which Einstein’s work was influenced
by the earlier experiments of Michelson. The logic of his 1905 paper is
based  on  the  explicitly  electromagnetic  properties  of  light,  and  he  does  not  even 
mention  the  experiment  in  the  paper.  In  fact,  Einstein  was  silent  on  this  point  in 
all  his  writings.  R.  S.  Shankland  questioned  Einstein  about  it  in  interviews  given 
on several occasions not long before his death in 1955. According to Prof. Shankland,*
Einstein said, “I am not sure when I first heard of the Michelson experiment.
I was  not  conscious  that  it  had  influenced  me  directly  during  the  seven  years  that 
relativity  had  been  my  life.  I guess  I just  took  it  for  granted  that  it  was  true.” 
Shankland  concludes  by  saying,  “It  should  be  noted  that  in  1905  it  was  not  the 
practice to give specific references in published papers as it is today. Many important
papers gave no references whatever, so the fact that Einstein’s 1905 paper
[which  has  no  references]  makes  no  explicit  mention  of  the  Michelson-Morley 
experiment  is  not  in  the  least  unusual.  To  what  degree  and  at  what  stage  of  his 
activities  he  was  directly  influenced  by  the  work  of  others,  it  is  now  impossible  to 
determine.  But  I became  convinced  that  his  interest  in  the  Michelson-Morley 
experiment  . . . had  existed  before  1905.” 

* American  Journal  of  Physics,  vol.  41,  1973,  p.  895;  quoted  by  permission 


Relativistic  Kinematics 


14-4 SIMULTANEITY

It is an experimental fact that the measured speed of a beam of light is not
affected by the measurer’s own motion. But on first exposure the fact may
seem  hard  to  believe.  It  must  have  seriously  bothered  Einstein,  too,  at  first. 
It  certainly  made  him  think  more  deeply  than  anyone  had  before  about 
what  speed  really  is,  for  it  to  have  such  perplexing  properties. 

Since  speed  involves  the  ratio  of  a space  interval  to  a time  interval,  his 
thoughts  led  him  in  due  course  to  consider  separately  the  question  of  what 
is  really  meant  by  space  and  what  is  really  meant  by  time.  He  told  R.  S. 
Shankland:  “At  last  it  came  to  me  that  time  was  suspect.”  So  Einstein  set  out 
to  think  through,  from  scratch,  how  time  should  be  defined  operationally  in 
physics — that  is,  what  procedures  should  be  used  to  measure  time.  To 
develop  relativistic  kinematics,  we  must  do  the  same.  In  this  we  will  follow 
closely  many  of  Einstein’s  arguments. 

Einstein argued that a measurement of time always involves a determination
of simultaneity. He wrote, “If I say ‘That train arrives here at 7
o’clock,’  I mean  something  like  this:  ‘The  pointing  of  the  small  hand  of  my 
watch  to  7 and  the  arrival  of  the  train  are  simultaneous  events.’  ” An  event 
is  anything  that  occurs  at  a particular  time  and  place.  Simultaneous  events 
are  events  which  occur  at  the  same  time. 

If  two  events  occur  at  the  same  place,  as  in  Einstein’s  example,  it  is  easy 
to  determine  whether  they  also  occur  at  the  same  time  and  so  determine 
whether  they  are  simultaneous.  Consider  Fig.  14-6,  which  depicts  the 
simultaneous  events  described  by  Einstein.  An  observer  located  where  two 
events  occur  at  the  same  place  receives  from  each  event,  with  essentially  no 
time  delay,  the  information  he  needs  to  judge  their  simultaneity.  He  sees 
both  events  at  the  instants  when  they  occur  because  he  is  at  the  location 
where  they  both  occur. 

Fig. 14-6 A measurement of the arrival time of a train. The arrival
of the train and the pointing of the hour hand of the watch to 7
are simultaneous events.

Fig. 14-7 A measurement of the simultaneity of two spatially
separated events. (The measurement requires the observer to
look in opposite directions at the same time. One way to accomplish
this is by using two mirrors, each mounted at 45° to the
light beam incident on it.)

But simultaneous events do not necessarily occur at the same place. It
is possible to determine whether two events are simultaneous in a case
where the events occur at places separated by an appreciable distance.
However, it is not as easy to do this as it is in a case where the events occur
at the same place. The reason is that there are time delays between the instants
at which the separated events occur and the instants at which the observer
who is judging their simultaneity sees them occur. The information
that they have happened is carried to the observer watching the events by
light traveling at a speed which is not infinite. Hence an accurate determination
of the simultaneity of separated events requires that the time for
light to travel to an observer not be neglected. (This crucial fact is ignored
in newtonian kinematics.) Einstein said that the simplest way to take this
time into consideration is to station the observer at the point midway
between the separated events, as in Fig. 14-7. Then the time delays will certainly
be equal because in all circumstances light moves from both events to
the observer at the same speed. Thus the simultaneity of separated events
can be determined by applying the following test. If separated events are
simultaneous, then light emitted from each event must reach an observer midway
between them at the same time.
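
The midpoint test can be sketched numerically. The short Python example below is only an illustration under stated assumptions: the function names, the 600-m separation, and the comparison tolerance are hypothetical choices that do not appear in the text, and the events are described by positions and times in the single reference frame of the observer.

```python
C = 3.00e8  # speed of light in m/s (rounded value used in the text)


def arrival_time(event_x, event_t, observer_x, c=C):
    """Time at which light from an event reaches an observer at rest in this frame."""
    return event_t + abs(observer_x - event_x) / c


def simultaneous_by_midpoint_test(x1, t1, x2, t2, c=C, tol=1e-12):
    """Apply the midpoint test to two events (position, time) in one frame.

    The observer is stationed midway between the events, so the two light
    delays are equal. The events are judged simultaneous if the two light
    signals arrive together (within the assumed tolerance tol, in seconds).
    """
    midpoint = 0.5 * (x1 + x2)
    t_arrive_1 = arrival_time(x1, t1, midpoint, c)
    t_arrive_2 = arrival_time(x2, t2, midpoint, c)
    return abs(t_arrive_1 - t_arrive_2) <= tol


# Two events 600 m apart that occur at the same time in this frame.
print(simultaneous_by_midpoint_test(-300.0, 0.0, 300.0, 0.0))

# The same two locations, but the second event is delayed by one microsecond.
print(simultaneous_by_midpoint_test(-300.0, 0.0, 300.0, 1.0e-6))
```

Because the observer stands at the midpoint, the light delays cancel in the comparison, and only the difference between the event times themselves decides the outcome.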

Information that the events have occurred could be transmitted to the observer
by means other than light signals. If the region between the locations of the
events  contained  air,  sound  signals  could  be  used.  But  it  would  be  essential  for  the 
observer  to  measure  precisely  the  relative  motion  of  the  air  at  all  points  along  the 
paths  of  the  signals,  since  any  air  motion  would  affect  the  apparent  speed  of  sound 
and  therefore  the  observed  time  delays.  Einstein  said  light  signals  should  be  used, 
instead  of  sound  signals,  because  light  has  the  great  simplifying  property  that  its 
speed  relative  to  the  observer  has  the  same  value,  regardless  of  the  circumstances. 

Given a way to determine the simultaneity of separated events, an observer
in any reference frame can measure time at any location in that
frame.  First  the  observer  obtains  a number  of  clocks  running  at  the  same 
rate  and  places  one  of  them  near  each  point  in  the  reference  frame  where 
time  must  be  measured.  The  observer  then  moves  to  the  point  midway 
between  a pair  of  clocks  and  directs  a helper  at  one  of  these  clocks  to  adjust 
the  hands  until  the  observer  sees  both  clocks  simultaneously  reading  the 
same time. Next another clock is paired with one of the first pair and is similarly
set. The process is continued until all the clocks fixed in the observer’s
reference  frame  are  set.  The  clocks  are  then  said  to  be  synchronized.  The 
observer  can  use  the  synchronized  clock  at  any  location  in  the  reference 
frame to measure time at that location and can make valid intercomparisons
of these time measurements.

However, these clocks would not be considered synchronized by an observer
who is moving relative to the reference frame containing the clocks.
The  reason  is  that  two  events  which  are  simultaneous  from  the  point  of 
view  of  one  reference  frame  are  not  simultaneous  from  the  point  of  view  of 
a second  reference  frame  moving  relative  to  the  first,  as  we  will  prove 
immediately  below.  Thus  the  events  used  in  synchronizing  the  clocks, 
although  simultaneous  as  judged  from  the  reference  frame  containing  the 
clocks,  are  not  simultaneous  as  judged  from  a reference  frame  moving  with 
respect  to  the  clocks.  And  so  the  observer  moving  with  respect  to  the  clocks 
does not judge them to be properly synchronized. Consequently, this observer
will not consider comparisons of time measurements made with
these clocks to be valid. (This crucial fact is also ignored in newtonian kinematics.
Indeed, the assumption that there is a universal time scale, which is
the  same  in  all  reference  frames  no  matter  what  their  state  of  motion,  and 
the  justification  of  this  assumption,  are  considered  in  newtonian  kinematics 
to  be  so  self-evident  that  they  are  unvoiced.  This  incorrect  assumption 
plays  a vital  role  in  deriving  the  Galilean  transformations.  See  Sec.  3-8.) 

We will prove that observers in relative motion disagree about the simultaneity
of separated events, by thinking through what would happen if
a certain  experiment  were  carried  out.  Such  a procedure  is  called  analyzing 
a thought  experiment.  The  experiment  is  illustrated  in  Fig.  14-8.  It  shows 
two  successive  views  of  a railroad  train  moving  at  a constant  high  velocity  V 
along  a straight  track,  from  the  viewpoint  of  an  observer  O fixed  on  the 
ground. Prior to the passage of the train, O has carefully measured off a
distance in one direction down the track to a location B1 and an equal distance
in the opposite direction to a location B2. At B1 and B2 she has placed
two blasting caps (small explosive charges that are detonated electrically)
near where the side of the train will pass. A modification of Einstein’s test
for determining the simultaneity of separated events implies that she can
make them explode simultaneously in her reference frame by sending light
signals from her midpoint location, which actuate detonators on their
arrival at B1 and B2. She does this so that the explosions occur when, from
her  point  of  view,  an  observer  O',  stationed  on  the  train  at  its  center,  is  just 
passing  her. 

When the blasting caps explode, they leave marks B'1 and B'2 on the side
of the train near its two ends. After the experiment is over, the observer O'
on the train can measure the distances O'B'1 and O'B'2. As you would expect,
he certainly finds the distances to be equal. (If this were not the case, space
would have properties in one direction which differed from its properties
in the opposite direction.)

The  explosions  also  produce  light  flashes.  The  observer  O on  the 
ground  finds  that  the  flashes  coming  from  the  front  and  rear  ends  of  the 
train reach her at the same time. This confirms that the blasting caps actually
were detonated simultaneously in her frame of reference. She also
finds that O' receives the flash coming from the front end before he receives
the flash coming from the rear end. As the figure shows, O sees this to be a
result of both the motion of O' toward the flash approaching him from the
front and his motion away from the flash approaching him from the rear.
So O explains the observations of O' by saying that the motion of O' causes
him  to  receive  the  flash  from  the  front  of  the  train  before  the  flash  from  the 
rear. 


Fig. 14-8 A thought experiment concerning the
simultaneity of separated events, which are measured
in reference frames moving relative to each other.
The two successive illustrations of the experiment
both show it from the point of view of the ground-based
observer O. The top one views the situation at
the time of detonation, and the lower one at a slightly
later time. The short arrows represent light flashes
coming from the detonations.


Observer  O'  agrees  that  the  flash  from  the  front  arrives  at  his  location 
before  the  flash  from  the  rear.  (The  observers  cannot  disagree  about  this 
since  light  detectors  fixed  at  the  center  of  the  train  can  be  used  to  make 
permanent  records  of  when  the  flashes  arrive,  and  anyone  can  inspect 
these  records  when  the  experiment  is  over.)  But  O'  disagrees  with  O about 
the explanation. He perceives himself to be stationary in the frame of reference
of the train and contends that the ground on which O stands is
moving past him, going in the direction toward the rear of the train. At
equal distances from his location are two points, B'1 at the rear of the train
and B'2 at the front. Explosions occur at each of these points and produce
light flashes. Since he does not receive these light signals from the equidistant
events at the same time, when O' applies the test for determining the
simultaneity  of  separated  events,  he  concludes  that  the  two  explosions  were 
not  simultaneous. 

Soon  we  will  obtain  a relation  that  allows  us  to  make  a quantitative 
statement  about  the  departure  from  simultaneity  of  the  two  events,  as 
judged  by  O'.  But  here  the  important  result  is  the  qualitative  conclusion 
that  according  to  O'  the  explosion  at  the  front  of  the  train  occurred  some 
time  before  the  explosion  at  the  rear,  even  though  the  explosions  were 
simultaneous  according  to  O.  The  reason  for  this  disagreement  is,  again, 
just  the  fact  that  the  observer  on  the  train  moves  relative  to  the  observer  on 
the  ground  in  the  finite  time  it  takes  for  the  light  signals  to  converge.  The 
observers disagree about the simultaneity of the events because of their relative
motion.
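
The explanation given by O can be checked with a few lines of arithmetic. The Python sketch below works entirely in the ground frame and makes several assumptions that are not in the text: the caps are placed at x = +d (front) and x = -d (rear), both explode at t = 0, O' is at x = 0 at that instant, and the train moves toward +x with speed V. The function name and the sample numbers are hypothetical.

```python
C = 3.00e8  # speed of light in m/s (rounded value used in the text)


def flash_arrival_times_at_train_center(d, V, c=C):
    """Ground-frame times at which the two flashes reach O' at the train's center.

    Assumed setup (ground frame): flashes are emitted at t = 0 from x = +d
    (front) and x = -d (rear). O' is at x = 0 at t = 0 and moves toward +x
    with speed V, where 0 <= V < c.
    """
    if not 0.0 <= V < c:
        raise ValueError("V must satisfy 0 <= V < c")
    # Front flash travels toward -x while O' moves toward it: d - c*t = V*t.
    t_front = d / (c + V)
    # Rear flash travels toward +x and must catch up with O': -d + c*t = V*t.
    t_rear = d / (c - V)
    return t_front, t_rear


t_front, t_rear = flash_arrival_times_at_train_center(d=300.0, V=0.5 * C)
print("front flash reaches O' at", t_front, "s")
print("rear flash reaches O' at", t_rear, "s")
```

For any V greater than zero the front flash arrives first, which is the record both observers agree on. The sketch reproduces only the ground observer's account of that record; it does not give the time difference between the explosions as judged by O'.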

Who  is  “really”  right,  the  observer  on  the  ground  or  the  observer  on  the  train? 
It  would  not  be  surprising  if  you  answer  by  saying  that  the  ground-based  observer 
is  right  and  the  train-based  observer  is  suffering  from  some  sort  of  optical  illusion. 
But  this  is  not  so.  Both  observers  have  made  valid  measurements  and  have  given 
them  valid  interpretations.  The  observer  on  the  train  is  completely  correct  in 
saying  that  since  he  measures  the  events  not  to  be  simultaneous,  they  are  not 
simultaneous.  And  the  observer  on  the  ground  is  just  as  correct  in  saying  that  they 
are  simultaneous  because  she  measures  them  to  be  so.  Measurement  gives  the  true 
reality  in  physics. 

If  you  nevertheless  feel  that  measurements  made  in  the  reference  frame  fixed 
to  the  ground  have  more  significance  than  those  made  in  the  frame  fixed  to  the 
train  moving  at  constant  velocity  relative  to  the  ground,  you  should  remember 
that both are equally good approximations to inertial reference frames. Then remember
that experiment shows that “All inertial reference frames are completely
equivalent. . . .” The fact that you live on the ground gives you a natural bias
toward the frame fixed to the ground. If you have ever taken a long train trip, you
will realize that someone living permanently on a train moving at constant velocity
over the ground would have the same bias toward the frame fixed to the train.

Another  reason  why  it  may  be  difficult  to  look  at  the  experiment  discussed 
above  with  an  unbiased  attitude  is  that  the  blasting  caps,  and  the  observer  who 
caused  them  to  be  detonated,  were  on  the  ground.  As  a result,  the  explosions  were 
simultaneous as seen from the frame of reference of the ground. You will find it instructive
to repeat the analysis for a variation of the experiment in which the
blasting  caps  were  carried  on  the  train  and  their  detonation  was  initiated  by  the 
observer  in  the  train,  so  that  the  explosions  occurred  simultaneously  in  the  frame 
of  reference  of  the  train.  It  is  also  worthwhile  contemplating  an  experiment  in 
which  two  identical  spaceships  pass  each  other  going  in  opposite  directions. 
Identical-twin observers, one on each ship, both initiate simultaneity experiments.
Each will conclude that the events initiated by the other were not simultaneous.
Here there can be no bias in your reaction to the results of the measurements
because the situation is completely symmetrical with regard to the two reference
frames.




14-5 TIME DILATION AND LENGTH CONTRACTION


Fig. 14-9 A thought experiment concerning time dilation and length
contraction, from the point of view of observer O.

Fig. 14-10 The thought experiment of Fig. 14-9, from the point of view of
observer O'.


We continue our study of time and space by analyzing additional thought
experiments. The experiments are designed to produce three quantitative
results. One gives the relation between time intervals measured by observers
moving relative to each other. It will be shown that there is a disagreement
between the observer stationed in the reference frame in which
clocks are distributed and the observer moving relative to that reference
frame concerning the rate at which the clocks run (as well as their synchronization).
The other two experiments give the relations between space
intervals measured by observers in relative motion. They will show that the
observers also disagree about the separation between the clocks in the
direction of the observers’ relative motion. These results, along with the results
of the simultaneity thought experiment considered in Sec. 14-4, are
used in Sec. 14-6 as a starting point in an argument that leads to the transformation
equations which in the relativistic domain replace the Galilean
transformation equations.

Figure  14-9  shows  an  observer  O in  an  inertial  frame  who  is  comparing 
a time  interval  measured  on  her  clock  C with  a measurement  made  by  an 
observer  O'  of  the  same  time  interval.  Observer  O'  is  in  another  inertial 
frame  moving  to  the  left  at  constant  velocity  relative  to  O,  and  so  are  his  two 
clocks C'1 and C'2. We take the positive direction to the right and thus write
the velocity of O' relative to O as V = -|V|, where |V| is the speed of O' with
respect to O. When the observers were not in motion, prior to the experiment,
they verified that all their clocks were synchronized and ran at the
same rate when at rest with respect to each other. When the observers are
in relative motion, there is no complication at all in comparing the reading
of the clock belonging to O with either of the clocks belonging to O' at the
instant that the O and O' clocks pass by each other. If two clocks are at essentially
the same location, any observer at that location can make a valid
comparison of their readings, despite the relative motion, since there will
be no appreciable time delay in receiving information from either clock. In
the experiment, O sends a flash of light from C along a path which is perpendicular
to a distant mirror. The light signal is reflected back on that
path  and  subsequently  returns  to  C.  The  beginning  of  the  time  interval  to 
be  compared  is  defined  by  the  emission  of  the  light  signal  at  C,  and  the  end 
of  the  time  interval  is  defined  by  its  reception  at  C.  Observer  O measures 
the  duration  of  the  time  interval  by  the  difference  between  the  two 
readings  of  C. 

The clock C'1 belonging to O' is adjacent to the clock C when the light
signal is emitted. This situation is illustrated from the point of view of O in
Fig. 14-9. Because O' is moving to the left with respect to O, the clocks C'1
and C'2 move to the left while the light signal is traveling to the mirror and
back. And at the instant when the light signal returns to C, the clock C'2 has
moved to the location adjacent to C. Thus observer O' can measure the
beginning of the time interval with clock C'1 and the end of the time interval
with clock C'2, since both clocks are in the right place at the right time.

Figure  14-10  illustrates  the  experiment,  from  the  point  of  view  of  O', 
when the light signal returns to the permanent location of his clock C'2 and
the instantaneous location of the clock C belonging to O. Since O' is moving
to the left relative to O with speed |V|, observer O must be moving to the
right relative to O' with the same speed. If this were not the case, there
would be an asymmetry between the reference frames of the two observers
that is not allowed by Einstein’s postulate of the equivalence of inertial
frames. Thus the velocity of O relative to O' is -V = |V|. Observer O' would
describe the experiment by saying that O emitted a flash of light in the general
direction of the mirror when her moving clock C was next to C'1 and
that the light returned to C when it was next to C'2. Note the fundamentally
important fact that according to O' the light does not follow the shortest
path to the mirror and back, since it is not moving perpendicular to the
mirror.


According to O, the duration of the time interval is T = 2Δt, where Δt
is the time required for the light signal to travel to the mirror, or to travel
back. Figure 14-9 shows that if l is the perpendicular distance to the mirror
as measured by O and c is the speed of light, then

$$\Delta t = \frac{l}{c} \tag{14-2}$$


According to O', the time interval has a duration T' = 2Δt', where Δt'
is the time for light to make the trip to the mirror, or the trip back. In Fig.
14-10, the quantity l' is the perpendicular distance to the mirror as measured
by O'. The quantity |V| Δt' is the distance traveled by clock C moving
at speed |V| in time Δt'. And the quantity c Δt' is the distance traveled by the
light signal moving at speed c in that time. From the figure and the pythagorean
theorem, it is evident that

$$(c\,\Delta t')^2 = (|V|\,\Delta t')^2 + {l'}^2$$


To solve this equation for Δt', we first write it as

$$c^2 (\Delta t')^2 = |V|^2 (\Delta t')^2 + {l'}^2$$

Now |V|² = V² because V² has a positive value no matter whether the value
of the velocity V is positive or negative. Thus we have

$$c^2 (\Delta t')^2 = V^2 (\Delta t')^2 + {l'}^2$$

Then we gather the coefficients of (Δt')² to obtain

$$(c^2 - V^2)(\Delta t')^2 = {l'}^2$$

or

$$(\Delta t')^2 = \frac{{l'}^2}{c^2 - V^2} = \frac{{l'}^2}{c^2}\,\frac{1}{1 - V^2/c^2}$$

Taking square roots produces the result

$$\Delta t' = \frac{l'}{c}\,\frac{1}{\sqrt{1 - V^2/c^2}} \tag{14-3}$$
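
As a quick numerical check of this algebra, the following Python sketch evaluates Eq. (14-3) for one set of sample values and confirms that the result satisfies the pythagorean relation from which it was derived. The function name and the sample values (a 150-m mirror distance and a relative speed of 0.6c) are assumptions chosen only for illustration.

```python
import math

C = 3.00e8  # speed of light in m/s (rounded value used in the text)


def one_way_light_time_for_moving_observer(l_prime, V, c=C):
    """Eq. (14-3): time for the light to reach the mirror, as measured by O'."""
    return (l_prime / c) / math.sqrt(1.0 - V**2 / c**2)


l_prime = 150.0  # perpendicular distance to the mirror in m (assumed value)
V = -0.6 * C     # negative because O' moves to the left relative to O

dt_prime = one_way_light_time_for_moving_observer(l_prime, V)

# The two sides of the pythagorean relation (c dt')^2 = (|V| dt')^2 + l'^2.
hypotenuse_squared = (C * dt_prime) ** 2
legs_squared = (abs(V) * dt_prime) ** 2 + l_prime**2

print("dt' =", dt_prime, "s")
print("relation satisfied:", math.isclose(hypotenuse_squared, legs_squared, rel_tol=1e-9))
```

Only V² enters Eq. (14-3), so the same value of Δt' is obtained whether V is taken as positive or negative.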


Two observers in relative motion cannot be in disagreement about distances
measured perpendicular to the direction of motion. It is easy to prove
this by the argument illustrated in Fig. 14-11 and explained in its caption.
Thus the perpendicular distances to the mirror must be measured by both
observers to have the same value l = l'. So Eq. (14-3) can be written

$$\Delta t' = \frac{l}{c}\,\frac{1}{\sqrt{1 - V^2/c^2}}$$
Substituting  from  Eq.  (14-2),  we  obtain 


$$\Delta t' = \Delta t\,\frac{1}{\sqrt{1 - V^2/c^2}}$$


Then multiplying through by the factor 2 produces the relation

$$T' = \frac{T}{\sqrt{1 - V^2/c^2}} \tag{14-4}$$

Fig. 14-11 A thought experiment proving that observers in different inertial
frames are in agreement about lengths measured perpendicular to their
direction of relative motion. Two rods AB and A'B' are first measured to be of
the same length when at rest with respect to each other. Then they are made
to move by each other, in the manner illustrated. When they pass, observers at
A and B mark on their rod the locations of the ends of rod A'B' and also send
light signals toward the center of the rods. Observers on the other rod perform
similar operations. Since both observers at O and O' receive the signals
at the same time, they both agree that the marks were made simultaneously,
and so both accept the length comparisons. Now the procedure carried out in
each reference frame is symmetrical to that carried out in the other. And the
postulate of equivalence of inertial frames says that the properties of the
reference frames themselves are symmetrical. Thus there is complete symmetry
in the situation. Consequently the length comparisons must show that
the rods have the same lengths when moving perpendicular to their orientation.
This is the only result that could be agreed on by both observers and also
be consistent with the complete symmetry.


Since the factor 1/√(1 − V²/c²) has a value greater than 1 if V²/c² is
greater than 0, this relation shows that O' measures the value T' for the
time interval to be longer than the value T measured by O. The phenomenon
is called time dilation (the word “dilation” means stretching). Recall
that the time interval is defined by two events (the emission and reception
of a light signal) which occur at the same place (C) from the point of view of
O. She measures the shortest possible value T for the time interval, which is
called the proper time. From the point of view of O', the events initiating
and terminating the time interval occur at different places (C'1 and C'2). He
necessarily measures a longer value T' for the time interval, called the dilated
time.

The  reason  for  the  time  dilation  found  in  the  thought  experiment  is 
simply  that  the  light  connecting  the  mirror  with  the  two  events  travels  a 
longer  path  when  the  events  are  viewed  from  a frame  of  reference  in  which 
they  occur  at  different  locations.  Since  light  travels  at  the  same  speed  with 
respect  to  observers  in  both  reference  frames,  it  takes  the  light  signal  more 
time to follow its longer path in the frame of observer O'. So observer O' obtains
a dilated value for his measurement of the time interval.
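
To get a feeling for the size of the effect, the time dilation relation, Eq. (14-4), can be evaluated for a few speeds. The Python sketch below is an illustration only; the function names and the list of sample speeds are assumptions and are not taken from the text.

```python
import math

C = 3.00e8  # speed of light in m/s (rounded value used in the text)


def dilation_factor(V, c=C):
    """The factor 1/sqrt(1 - V^2/c^2) appearing in Eq. (14-4)."""
    if abs(V) >= c:
        raise ValueError("the relation applies only for |V| < c")
    return 1.0 / math.sqrt(1.0 - (V / c) ** 2)


def dilated_time(T_proper, V, c=C):
    """Eq. (14-4): the interval T' measured by O' for a proper time T measured by O."""
    return T_proper * dilation_factor(V, c)


# Dilated value of a 1-second proper time interval for several relative speeds.
for fraction_of_c in (0.0, 0.001, 0.1, 0.6, 0.99):
    T_prime = dilated_time(1.0, fraction_of_c * C)
    print("V =", fraction_of_c, "c   T' =", T_prime, "s")
```

At speeds that are small compared to c the factor is indistinguishable from 1 for practical purposes, which is why the effect is not noticed in the newtonian domain.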


Now  we  develop  a relation  involving  length  measurements  made  by 
the  two  observers  parallel  to  their  direction  of  relative  motion.  We  do  this  by 
reconsidering  the  same  apparatus  pictured  in  Figs.  14-9  and  14-10,  except 
that a rigid rod extending from clock C'1 to clock C'2 has been fixed in the
reference  frame  of  O'.  We  use  the  symbol  L'  to  represent  the  length  of  the 
rod  as  measured  by  O'  and  the  symbol  L to  represent  its  length  as  measured 
by  O.  (Note  that  the  conclusion  obtained  from  the  thought  experiment  in 
Fig.  14-1 1 does  not  apply  here  because  O and  O'  are  not  comparing  lengths 
measured  perpendicular  to  their  direction  of  relative  motion.) 

According to O, the quantity T is the time interval from when she
observes the front end of the rod pass C to when she observes the rear end
pass that location. See Fig. 14-9. Thus in time T the rod moves past her
through its own length L. Since the rod is moving in her reference frame
with speed |V|, its length L must be

$$L = |V|\,T \tag{14-5}$$

According to O', the quantity T' is the time interval during which the
clock C moves the length L' of the rod. See Fig. 14-10. Since this clock is
moving through his reference frame with a speed that also has the value
|V|, the length L' is given by

$$L' = |V|\,T' \tag{14-6}$$

Dividing this into Eq. (14-5) gives

$$\frac{L}{L'} = \frac{T}{T'}$$

Then using the relation obtained from the time dilation argument, Eq.
(14-4), to evaluate T/T', we have

$$\frac{L}{L'} = \sqrt{1 - V^2/c^2}$$




or

$$L = \sqrt{1 - V^2/c^2}\;L' \tag{14-7}$$

This result shows that observer O will measure the length L of the rod
to be shorter than the length L' measured by observer O', since the factor
$\sqrt{1 - V^2/c^2}$ has a value less than 1 if $V^2/c^2$ is greater than 0. The name given
to this phenomenon is length contraction. Note that from the viewpoint
of O' the length is measured over a space interval between fixed locations
(along the fixed rod). He measures the largest possible value L' for the space
interval, called its proper length. From the viewpoint of O the space interval
is measured between moving locations (along the moving rod). She measures
a shorter length, or contracted length L.

It is apparent from our argument that length contraction and time
dilation are intimately related physical phenomena. The necessity for length
contraction follows immediately from the necessity for time dilation, given
the requirement that the speed of O' with respect to O equals the speed of
O with respect to O', since all inertial frames are equivalent.
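
The link between the two effects can be checked numerically. The following is a minimal Python sketch, assuming the rounded value c = 3.00 × 10^8 m/s used in the text; the function names and the sample speed 0.6c are illustrative choices, not part of the original argument. It evaluates Eq. (14-4) and Eq. (14-7) and confirms that L/L' equals T/T', as Eqs. (14-5) and (14-6) require.

```python
import math

C = 3.00e8  # speed of light in m/s (rounded value, an assumption of this sketch)


def contraction_factor(v, c=C):
    """Return sqrt(1 - v^2/c^2), the factor appearing in Eqs. (14-4) and (14-7)."""
    beta = v / c
    if abs(beta) >= 1.0:
        raise ValueError("the relative speed must be smaller than c")
    return math.sqrt(1.0 - beta * beta)


def dilated_time(proper_time, v, c=C):
    """Eq. (14-4): T' = T / sqrt(1 - v^2/c^2)."""
    return proper_time / contraction_factor(v, c)


def contracted_length(proper_length, v, c=C):
    """Eq. (14-7): L = sqrt(1 - v^2/c^2) * L'."""
    return contraction_factor(v, c) * proper_length


v = 0.6 * C        # illustrative relative speed
T = 1.0            # proper time measured by O, in seconds
L_prime = 1.0      # proper length measured by O', in meters

T_prime = dilated_time(T, v)
L = contracted_length(L_prime, v)

print("dilated time T'     :", T_prime, "s")
print("contracted length L :", L, "m")
print("L/L' and T/T'       :", L / L_prime, T / T_prime)
```

For this speed the square-root factor is 0.8, so the time interval is stretched by the same factor by which the length is shortened.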

The  phenomena  of  time  dilation  and  length  contraction  predicted 
from  the  thought  experiment  that  we  have  considered  are  not  specific 
properties  of  this  particular  thought  experiment.  Rather,  they  are  general 
properties  of  time  and  space.  This  is  indicated  by  the  fact  that  a variety  of  quite 
different  thought  experiments  make  the  same  predictions  concerning  time 
dilation  and  length  contraction.  We  consider  one  of  these  in  Sec.  14-6. 
Furthermore,  there  are  many  different  real  experiments  which  confirm 
these  predictions.  Several  of  these  are  discussed  later  in  this  section. 

It should be pointed out that the two events occurring in the thought
experiment considered here (the emission and reception of the light signal) are
separated by a proper time in the O frame and separated by a proper length in the O'
frame. So evaluating the dilated time defined by the events is a matter of
calculating T' from T according to Eq. (14-4), and evaluating the contracted length
involves calculating L from L' according to Eq. (14-7). One calculation goes from
unprimed to primed quantities, and the other is reversed in that it goes from primed
to unprimed quantities. The physical situation causes the equations to be used in
reverse ways. But mathematically Eqs. (14-4) and (14-7) are completely analogous
since the identical factor $1/\sqrt{1 - V^2/c^2}$ relates T' to T and L' to L.

The result given by Eq. (14-7) has the same mathematical form as an
assumption made by G. F. Fitzgerald and later by H. A. Lorentz, two physicists who were
attempting to explain the results of the Michelson-Morley experiment. Equation
(14-7) is therefore often said to describe the Fitzgerald-Lorentz contraction, or the
Lorentz contraction. The logical framework used by Fitzgerald and by Lorentz
turned out to be inconsistent, even though some of their results are
mathematically identical to those of Einstein. In contrast, the role of Eq. (14-7) in Einstein’s
theory is certainly not that of an assumption. It is an inevitable consequence of the
postulate of the equivalence of inertial frames.


EXAMPLE  14-2 

When  a railroad  train  is  stationary  on  the  ground,  it  is  measured  to  have  a proper 
length  of  precisely  1 km.  Predict  its  contracted  length,  measured  from  the  ground, 
when  its  speed  relative  to  the  ground  is  100  km/h. 

■ You can make the prediction by evaluating Eq. (14-7),

$$L = \sqrt{1 - V^2/c^2}\;L'$$

The speed of the train is

$$|V| = \frac{100 \times 10^3\ \text{m}}{3.6 \times 10^3\ \text{s}} = 27.8\ \text{m/s}$$

The speed of light is

$$c = 3.00 \times 10^8\ \text{m/s}$$

So

$$\frac{|V|}{c} = \frac{27.8\ \text{m/s}}{3.00 \times 10^8\ \text{m/s}} = 9.27 \times 10^{-8}$$

and

$$\frac{V^2}{c^2} = 8.59 \times 10^{-15}$$

Since $V^2/c^2 \ll 1$, the square root can be evaluated very accurately by using the
binomial expansion approximation, Eq. (3-36), to give

$$\sqrt{1 - V^2/c^2} \approx 1 - \tfrac{1}{2}\,V^2/c^2$$

Thus you have

$$\sqrt{1 - V^2/c^2} = 1 - 4.30 \times 10^{-15}$$

Since the proper length of the train is

$$L' = 10^3\ \text{m}$$

its contracted length is

$$L = (1 - 4.30 \times 10^{-15})\,10^3\ \text{m}$$

or

$$L = 10^3\ \text{m} - 4.30 \times 10^{-12}\ \text{m}$$

The relativistic contraction in the length of the train is predicted to be
$4.30 \times 10^{-12}$ m. This is about 1 percent of the diameter of an atom. By the standards of the
newtonian domain, the train is traveling at an appreciable speed and has an
appreciable length. But the predicted total contraction is so small as to be completely
unmeasurable.
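
The arithmetic of Example 14-2 can be reproduced with a short Python sketch. It assumes the rounded value c = 3.00 × 10^8 m/s, and the function name is illustrative. Evaluating 1 − sqrt(1 − V²/c²) directly in double-precision arithmetic can lose most of its significant digits, because V²/c² is close to the rounding limit near 1; the sketch therefore uses `math.log1p` and `math.expm1`, which are designed for arguments near zero, and compares the result with the binomial approximation used in the example.

```python
import math

C = 3.00e8  # speed of light in m/s (rounded value, an assumption of this sketch)


def contraction_shortfall(v, c=C):
    """Return 1 - sqrt(1 - v^2/c^2), computed so that tiny values stay accurate."""
    beta_sq = (v / c) ** 2
    # sqrt(1 - b) = exp(0.5 * log(1 - b)); log1p and expm1 keep precision for small b
    return -math.expm1(0.5 * math.log1p(-beta_sq))


v_train = 100e3 / 3600.0      # 100 km/h expressed in m/s
proper_length = 1.0e3         # L' = 1 km

shortfall = contraction_shortfall(v_train)
binomial = 0.5 * (v_train / C) ** 2          # leading term of the binomial expansion
contraction = shortfall * proper_length      # L' - L in meters

print(f"train speed         : {v_train:.1f} m/s")
print(f"1 - sqrt(1 - V2/c2) : {shortfall:.3e}")
print(f"binomial estimate   : {binomial:.3e}")
print(f"contraction L' - L  : {contraction:.2e} m")
```

Because the sketch keeps the unrounded speed 27.78 m/s, its result differs slightly in the third digit from the value obtained in the example with the rounded speed 27.8 m/s.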


Example 14-2 provides a typical demonstration of the fact that
relativistic kinematics makes no predictions in disagreement with experiments
carried out in the newtonian domain. In that domain objects move at
speeds which are very small compared to the speed of light. Hence $V^2/c^2$ is
extremely small compared to 1, so that $\sqrt{1 - V^2/c^2}$ and its reciprocal have
values extremely close to 1. As a consequence, the predictions of relativistic
kinematics merge smoothly into those of newtonian kinematics at speeds
characteristic of the newtonian domain. But for objects moving at speeds
comparable to the speed of light, relativistic kinematics predicts effects
which are significant numerically and so should be measurable.

One of the first experimental confirmations of time dilation and length
contraction was obtained in the 1940s from the study of cosmic radiation.
A stream of very high-energy protons from cosmic sources is known to
bombard the earth constantly. When these particles strike atomic nuclei in
air at the top of the atmosphere, typically at an altitude of about 10,000 m,
they produce particles called pions. The pions are unstable, decaying very
quickly into particles called muons. Muons are also unstable. They soon
decay into electrons, but this second decay is not as rapid as the first.
Experiments have shown that a muon will live for an average time of $2.2 \times 10^{-6}$ s,
as seen in a reference frame in which the muon is at rest.

From the earth reference frame, the muons are moving downward at a
speed about 99.9 percent of the speed of light. This is a result of the large
downward momentum of the initially incident protons and is known from
direct measurements of the muon momentum. If the “lifetime” of muons
from birth to decay were $2.2 \times 10^{-6}$ s, the distance the muons could travel
when going at a speed of $3.0 \times 10^8$ m/s would be
$3.0 \times 10^8\ \text{m/s} \times 2.2 \times 10^{-6}\ \text{s} = 660\ \text{m}$.
The distance is short compared to the 10,000 m the muons
must cover in order to reach ground level. It would seem that muons
should not make it to ground level, and so none should be detected there.
But muons are detected in abundance at ground level.

An explanation is found in the phenomenon of time dilation. The time
$2.2 \times 10^{-6}$ s is the average time a muon lives, measured from a frame of
reference moving with the particle. (In other words, this lifetime is
measured in a frame relative to which the muons are at rest.) The time is a
proper time since it is a time interval measured between two events (the
birth and death of the muon) which occur at the same place. To determine
whether a typical muon can travel a distance of 10,000 m from the top of
the atmosphere to the surface of the earth, the calculation should employ
the dilated time evaluated from the frame of reference of the earth, since
this is the frame from which the process is viewed. The calculation in
Example 14-3 does so.


EXAMPLE  14-3 

Evaluate the dilated time a muon lives before decaying when traveling at speed equal
to 0.999c if the proper time it lives is $2.2 \times 10^{-6}$ s. Then compare it to the time
required to travel 10,000 m moving at that speed.

■ If the proper time is T, then Eq. (14-4) says the dilated time will be

$$T' = \frac{1}{\sqrt{1 - V^2/c^2}}\,T$$

Here

$$|V| = 0.999c$$

or

$$\frac{|V|}{c} = 0.999$$

Squaring both sides, you obtain

$$\frac{V^2}{c^2} = 0.998$$

So

$$\sqrt{1 - V^2/c^2} = \sqrt{0.002} = 0.045$$

and

$$\frac{1}{\sqrt{1 - V^2/c^2}} = \frac{1}{0.045} = 22$$

Thus the dilated time is

$$T' = 22 \times 2.2 \times 10^{-6}\ \text{s} = 49 \times 10^{-6}\ \text{s}$$

The time t' required to travel 10,000 m going at $3.0 \times 10^8$ m/s is

$$t' = \frac{1.0 \times 10^4\ \text{m}}{3.0 \times 10^8\ \text{m/s}} = 33 \times 10^{-6}\ \text{s}$$

Since this is less than the available time $T' = 49 \times 10^{-6}$ s, there is no difficulty in
explaining the fact that cosmic ray muons typically reach ground level before
decaying.
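
The numbers in Example 14-3 can be checked with a few lines of Python. This is a minimal sketch, assuming c = 3.00 × 10^8 m/s and the values quoted in the example; the variable names are illustrative. It carries more digits than the example, which rounds the intermediate results.

```python
import math

C = 3.00e8                 # speed of light in m/s (rounded value, an assumption)
beta = 0.999               # |V|/c for the muons
proper_lifetime = 2.2e-6   # average muon lifetime in its rest frame, in seconds
altitude = 1.0e4           # height of the top of the atmosphere, in meters

# Eq. (14-4): the dilation factor 1/sqrt(1 - V^2/c^2)
dilation = 1.0 / math.sqrt(1.0 - beta**2)
dilated_lifetime = dilation * proper_lifetime

# Time needed to reach the ground, and the distances covered in each lifetime
travel_time = altitude / (beta * C)
range_without_dilation = beta * C * proper_lifetime
range_with_dilation = beta * C * dilated_lifetime

print(f"dilation factor        : {dilation:.1f}")
print(f"dilated lifetime       : {dilated_lifetime:.2e} s")
print(f"travel time to ground  : {travel_time:.2e} s")
print(f"range without dilation : {range_without_dilation:.0f} m")
print(f"range with dilation    : {range_with_dilation:.0f} m")
```

The dilated lifetime exceeds the travel time, so a muon with the average lifetime covers more than the 10,000 m to the ground, in agreement with the conclusion of the example.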


Muons can be thought of as clocks that give one tick in a proper time
equal to their average lifetime. Because they are microscopic, it is possible
to build machines, like high-energy particle accelerators, to make them
move at speeds high enough for relativistic effects to become large. And
nature has provided such an accelerator in the form of the cosmic
radiation. So muons are ideally suited to test time dilation. The test is completely
successful. Time dilation provides a qualitative explanation of the seemingly
impossible fact that any cosmic ray muons at all reach the ground.
Furthermore, careful measurements of the exact fraction reaching the ground are
in quantitative agreement with the expected dilation factor. As predicted by
relativistic kinematics, the proper time interval defined by the rapidly
moving muon clocks is observed in the reference frame of the earth to be a
dilated time interval. Another way of stating this is to say that the moving
clocks run slow.

In 1971 the prediction that moving clocks run slow was tested in an
experiment using macroscopic clocks. Four extremely accurate cesium-133
clocks (see Sec. 2-3) were taken by commercial airlines on trips around
the world. Two clocks traveled eastward in the same direction as the earth’s
rotation, and two clocks traveled westward in opposition to its rotation.
Before and after their circumnavigations, the readings of the clocks were
compared with the readings of reference clocks at the U.S. Naval
Observatory. Prior to analyzing the results for the effect of time dilation, corrections
had to be made for the fact that neither the flying clocks nor the
ground-based clocks were in completely acceleration-free inertial reference frames.
This was done by using Einstein’s general theory of relativity, which treats
relative accelerations. The general theory was also used to correct for the
effects on the clocks, closely related to those of acceleration, produced by
the earth’s gravity. The results of the test were quite satisfactory. Time
dilation was expected to cause a total time loss of about $3 \times 10^{-7}$ s for each
clock moving eastward. The corrected measured value agreed with
expectation to within the experimental accuracy of about $0.2 \times 10^{-7}$ s.

Length contraction provides a complementary explanation of the
observations concerning cosmic ray muons. In the reference frame where the
muons are at rest, the average time available before decay is the proper
time $2.2 \times 10^{-6}$ s. This frame is moving downward through the
atmosphere relative to the earth at speed |V| = 0.999c along with the muons.
Thus from the point of view of an observer in this frame, the earth and
atmosphere are moving upward at the same speed. Consequently, in the
muon reference frame the length measured from the point at the top of
the atmosphere, which passes them when they are born, to the point at the
bottom of the atmosphere, which passes when they strike the ground, will
be very much contracted. In Example 14-4 the numbers are worked out.


EXAMPLE 14-4

Evaluate the contracted length of the atmosphere, of proper length 10,000 m,
moving by the muons at speed |V| = 0.999c. Then show that it takes less than the
average muon lifetime, $2.2 \times 10^{-6}$ s, for this length to move once along itself.

■ The contracted length L is related to the proper length L' by Eq. (14-7):

$$L = \sqrt{1 - V^2/c^2}\;L'$$

You already have evaluated the contraction factor for |V| = 0.999c in Example 14-3.
It is

$$\sqrt{1 - V^2/c^2} = 0.045$$

So the contracted length has the value

$$L = 0.045 \times 10{,}000\ \text{m} = 450\ \text{m}$$

The time t required for the atmosphere to move through its contracted length
past the muon, going at essentially the speed of light, is

$$t = \frac{4.5 \times 10^2\ \text{m}}{3.0 \times 10^8\ \text{m/s}} = 1.5 \times 10^{-6}\ \text{s}$$

Since this is less than the available time $T = 2.2 \times 10^{-6}$ s, there is no difficulty in
understanding why the muon lives long enough for the atmosphere to move past it.


The two explanations of the muon measurements are summarized in
Fig. 14-12a and b.


14-6 THE LORENTZ POSITION-TIME TRANSFORMATION

By analyzing appropriately chosen thought experiments, we have obtained
four essential features of relativistic kinematics: (1) Observers in relative
motion disagree about measurement of simultaneity. (2) They disagree
about measurement of time intervals. (3) They disagree about
measurement of length intervals parallel to the direction of motion. (4) They agree
about measurement of length intervals perpendicular to that direction. We
have quantitative relations involving the last three features, but only a
qualitative description of the first one. In this section we analyze a thought
experiment that will provide a quantitative relation concerning the
simultaneity disagreement. It will also yield independent derivations of the
second and third feature. All four features are contained in a set of four
equations called the Lorentz position-time transformation.

As a preliminary to developing the Lorentz transformation of
relativistic kinematics, it is worthwhile reviewing the Galilean transformation of
newtonian kinematics. Figure 14-13 shows two observers O and O', with the
primed observer moving relative to the unprimed observer at a constant
velocity. The two observers have constructed mutually parallel sets of axes
x, y, z and x', y', z' with origins at their own locations. For simplicity, they
have aligned the x and x' axes parallel to the direction of their relative
velocity. We continue to use the symbol V as a signed scalar to represent the
velocity of O' with respect to O. Its numerical value is positive if O' moves
with respect to O in the positive direction of the x and x' axes, as in the case
illustrated in Fig. 14-13. If O' moves with respect to O in the negative
direction of the x and x' axes, the numerical value of V is negative.

Fig. 14-12 (a) Measurement of the decay of cosmic ray muons from the point of
view of an observer on the ground. (b) Measurement of the decay of cosmic ray
muons from the point of view of an observer on a muon.
Labels in (a): top of atmosphere, ground level, 10,000 m, 0.999c, $33 \times 10^{-6}$ s.
Labels in (b): top of atmosphere, ground level, 450 m, 0.999c, $1.5 \times 10^{-6}$ s.

Observer O specifies that an event occurs at a point P and at a certain
time in terms of the values she measures for its coordinates x, y, z and the
time t. Similarly, O' specifies the same event by stating the values x', y', z'
which he measures for the coordinates and the value t' which he measures
for the time. The relation between the set of numbers (x, y, z, t) and (x', y',
z', t') is given by the Galilean position-time transformation

$$\begin{aligned}
x' &= x - Vt \\
y' &= y \\
z' &= z \\
t' &= t
\end{aligned} \tag{14-8}$$

where O and O' measure t and t' from the instant they pass each other.

Fig. 14-13 Two frames of reference in uniform relative motion. The constant
velocity of the primed frame with respect to the unprimed frame is V.

The first three of Eqs. (14-8) are exactly what would be obtained by
taking components of both sides of Eq. (3-56), with the velocity of
observer O' with respect to observer O directed along the x and x' axes. The
fourth of Eqs. (14-8) is an implied, but unwritten, condition on Eq. (3-56).
According to newtonian kinematics, the equation t' = t is deemed to be
so self-evident as not to warrant explicit mention. But here we have added
it to the other three to emphasize that newtonian kinematics is based on
the assumption that there is a universal time scale which is the same at all
locations in all reference frames, independent of their state of motion.
Because we now believe this assumption to be incorrect, we must investigate
the correctness of the transformation equations to which it leads.
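
One way to see the difficulty is to apply Eqs. (14-8) to a light signal. The following minimal Python sketch assumes c = 3.00 × 10^8 m/s and an illustrative relative velocity V = 0.5c; the function name is hypothetical. A pulse that O finds at x = ct is assigned the speed c − V in the primed frame by the Galilean transformation, which conflicts with the experimental fact that the speed of light is independent of the motion of the observer.

```python
C = 3.00e8  # speed of light in m/s (rounded value, an assumption of this sketch)


def galilean(x, y, z, t, v):
    """Galilean position-time transformation, Eqs. (14-8)."""
    return x - v * t, y, z, t


v = 0.5 * C     # illustrative velocity of O' relative to O
t = 1.0e-6      # time since O and O' passed each other, in seconds
x = C * t       # position, according to O, of a light pulse sent along +x at t = 0

x_prime, y_prime, z_prime, t_prime = galilean(x, 0.0, 0.0, t, v)

# Speed of the pulse as the Galilean transformation would have O' measure it
speed_in_primed_frame = x_prime / t_prime

print("t' equals t           :", t_prime == t)
print("pulse speed seen by O :", x / t, "m/s")
print("pulse speed seen by O':", speed_in_primed_frame, "m/s")
```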

This will be done by arguing through another thought experiment.
The argument will make tentative use of some ideas arising from earlier
thought experiments. But you will see that its results provide complete
justification of these ideas. Actually, the argument assumes only the validity of
Einstein’s postulate of the equivalence of inertial frames. This is because it
is based directly and solely on one experimental fact which agrees with the
postulate. This is the fact that the speed of light is independent of the
motion of the observer and the motion of the source.

Two observers in inertial frames move past each other, with the
constant velocity V of O' relative to O directed along the positive x and x'
directions of the mutually parallel axes of their coordinate systems. Each
observer is located at the origin of the observer’s system. When they pass each
other, they celebrate the event by igniting a flashbulb at the temporarily
coincident origins of the coordinate systems. They also both set their clocks
to zero, so that the two observers will measure time from the instant of
ignition of the flashbulb.

The light produced by the flashbulb expands outward from its point of
emission in all directions at speed c. According to O', at time t' the light is
on a spherical shell of radius r' = ct', centered on the origin of his
coordinate system. Thus he finds the coordinates of a typical point on the shell to
be related by the equation of a sphere of radius r' = ct'. The equation is

$$x'^2 + y'^2 + z'^2 = r'^2$$

or

$$x'^2 + y'^2 + z'^2 = c^2 t'^2 \tag{14-9}$$

And according to O the light is on a spherical shell of radius r = ct at time t,
centered on the origin of her coordinate system. So she finds the
coordinates of a point on the shell at that time to satisfy the equation of the
sphere

$$x^2 + y^2 + z^2 = r^2$$

or

$$x^2 + y^2 + z^2 = c^2 t^2 \tag{14-10}$$

Thus both O and O' find themselves to be at the center of an expanding
spherical shell of light, as Fig. 14-14 illustrates. Is this possible? Yes,
providing the relation between the sets of numbers (x', y', z', t') and (x, y, z, t)
is such that both Eqs. (14-9) and (14-10) are satisfied. We will find a
relation which satisfies them both by finding a set of equations which transforms
Eq. (14-9) into Eq. (14-10). This set of equations will constitute the Lorentz
position-time transformation.

To be specific, what we will do is let the qualitative ideas we have come
across in studying the earlier thought experiments suggest how we might
obtain the Lorentz position-time transformation by modifying the Galilean
position-time transformation. Because of considerations explained below,
it seems reasonable to try the following modified form for the
transformation equations:

$$\begin{aligned}
x' &= \gamma\,(x - Vt) \\
y' &= y \\
z' &= z \\
t' &= \gamma\,(t + \delta)
\end{aligned} \tag{14-11}$$

Here $\gamma$ is a dimensionless quantity, and $\delta$ is a quantity that must have the
dimensions of time. If we are successful in making Eqs. (14-11) satisfy the
requirement that they transform Eq. (14-9) into Eq. (14-10), it will be done by
finding the necessary expressions for $\gamma$ and $\delta$. We will soon do just this. But
even now we can say that $\gamma$ and $\delta$ must involve the velocity V of relative
motion of the two reference frames, and also the speed c of light, in such a way
that $\gamma$ should approach 1 and $\delta$ should approach 0 when |V| becomes very
small compared to c. This must be so in order that Eqs. (14-11) reduce to
the Galilean transformation of Eqs. (14-8) in those circumstances. After all,
we know from experiment that the Galilean transformation leads to correct
predictions when the relative speed of the two reference frames is
negligible compared to the speed of light.

We also know that when the speed of relative motion is not small
compared to the speed of light, there is a disagreement between the two
reference frames concerning the synchronization of clocks, which arises from
the disagreement about simultaneity. We have tried to allow for it by
inserting the additive term $\delta$ in the fourth of Eqs. (14-11). This term is
supposed to take into account the contention of O' that the clock used by O at
the location of an event to time it is not synchronized to give the same
reading as the clock located at the origin of the coordinate system of O.
Having considered the synchronization disagreement, we have then
attempted to take into account the disagreement between the two observers
concerning the rates at which clocks located at their respective origins
measure time intervals. This is done by inserting the multiplicative factor $\gamma$ in
the fourth equation. As discussed immediately before Example 14-2, the
identical multiplicative factor should appear in the equation relating the
unprimed and primed values of the coordinate that is measured along the
direction of relative motion. So we have inserted the $\gamma$ in the first
transformation equation. Because the observers do not disagree about lengths
measured perpendicular to the direction of motion, we have made no
modification of the two equations relating the coordinates extending in the
perpendicular directions.

Fig. 14-14 (a) From the point of view of O, the light produced by a flashbulb
is on an expanding spherical shell centered at the O origin of coordinates, and
O' is moving to the right at speed |V|. (b) From the point of view of O', the
light is on an expanding spherical shell centered at the O' origin of
coordinates, and O is moving to the left at speed |V|.

Now let us confirm or deny the guesses we have made about the form
of the Lorentz position-time transformation. We do this by seeing whether
Eqs. (14-11) actually can transform Eq. (14-9) into Eq. (14-10), providing
that $\delta$ and $\gamma$ have the necessary forms. If we find the transformation works,
then we will also find what these forms are. First, we use Eqs. (14-11) to
write each of the primed variables of Eq. (14-9) in terms of the unprimed
variables. If we make the substitutions, Eq. (14-9), $x'^2 + y'^2 + z'^2 = c^2 t'^2$,
becomes

$$\gamma^2 (x^2 - 2Vxt + V^2 t^2) + y^2 + z^2 = c^2 \gamma^2 (t^2 + 2\delta t + \delta^2) \tag{14-12}$$

Our task is to obtain Eq. (14-10), $x^2 + y^2 + z^2 = c^2 t^2$, from Eq. (14-12). Since
the former has no terms in it that involve the product xt, there must be a
cancellation of the second term in parentheses on the left side of Eq. (14-12)
by some term, or terms, appearing on its right side. And since the term on
the left side is proportional to the first power of t, the canceling terms on
the right side must be also. Otherwise, the cancellation could not hold for
all values of t. The only term on the right side having this property is the
second term in parentheses. Therefore we must have

$$-\gamma^2\, 2Vxt = c^2 \gamma^2\, 2\delta t$$

or

$$\delta = -\frac{Vx}{c^2} \tag{14-13}$$

Next we substitute this result into Eq. (14-12) and work out the now
simplified expressions in parentheses, producing directly

$$\gamma^2 x^2 + \gamma^2 V^2 t^2 + y^2 + z^2 = c^2 \gamma^2 t^2 + \frac{\gamma^2 V^2}{c^2}\, x^2$$

If we group the factors of $x^2$ on the left side and of $t^2$ on the right side, this
becomes

$$\gamma^2 (1 - V^2/c^2)\, x^2 + y^2 + z^2 = c^2 \gamma^2 (1 - V^2/c^2)\, t^2 \tag{14-14}$$

Success! We see that Eq. (14-10), $x^2 + y^2 + z^2 = c^2 t^2$, will be obtained
immediately from Eq. (14-14), provided that

$$\gamma^2 (1 - V^2/c^2) = 1$$

or

$$\gamma = \frac{1}{\sqrt{1 - V^2/c^2}} \tag{14-15}$$


Using the expressions given in Eqs. (14-13) and (14-15) for $\delta$ and $\gamma$ in
the form given by Eq. (14-11), we obtain the Lorentz position-time
transformation:

$$\begin{aligned}
x' &= \frac{1}{\sqrt{1 - V^2/c^2}}\,(x - Vt) \\
y' &= y \\
z' &= z \\
t' &= \frac{1}{\sqrt{1 - V^2/c^2}}\,(t - Vx/c^2)
\end{aligned} \tag{14-16}$$

This transformation is given its name because H. A. Lorentz had obtained
equations of the same mathematical form in 1895, while making an
unsuccessful attempt to introduce consistency into the theory of moving electric
charges.
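
The defining property of Eqs. (14-16) can be tested numerically. The sketch below is a minimal Python illustration, assuming c = 3.00 × 10^8 m/s; the function names, the relative velocity 0.8c, and the chosen point on the light shell are illustrative. It takes an event on the shell x² + y² + z² = c²t² found by O, transforms it to the primed frame, and checks that the primed values satisfy Eq. (14-9).

```python
import math

C = 3.00e8  # speed of light in m/s (rounded value, an assumption of this sketch)


def lorentz(x, y, z, t, v, c=C):
    """Lorentz position-time transformation, Eqs. (14-16)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (x - v * t), y, z, gamma * (t - v * x / c**2)


def shell_residual(x, y, z, t, c=C):
    """x^2 + y^2 + z^2 - c^2 t^2, which is zero for an event on the light shell."""
    return x * x + y * y + z * z - (c * t) ** 2


v = 0.8 * C                 # illustrative velocity of O' relative to O
t = 1.0e-6                  # time after the flash, according to O
r = C * t                   # radius of the shell according to O
x, y, z = 0.6 * r, 0.8 * r, 0.0   # a point on the shell, oblique to the x axis

x_p, y_p, z_p, t_p = lorentz(x, y, z, t, v)

print("unprimed event :", (x, y, z, t))
print("primed event   :", (x_p, y_p, z_p, t_p))
print("residual in O  :", shell_residual(x, y, z, t))
print("residual in O' :", shell_residual(x_p, y_p, z_p, t_p))
```

Both residuals vanish to within rounding error, even though the two observers assign different coordinates and different times to the event.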


Note that $\delta = -Vx/c^2$ does have the dimensions of time and does
approach 0 when V/c approaches 0, as predicted initially. You should
explain to yourself, in terms of the simultaneity thought experiment of Sec.
14-4, the physical reason why the quantity $\delta$ is proportional to the first
powers of both V and x. The quantity V in the Lorentz transformation has
the same significance as the V in the Galilean transformation of Eqs. (14-8).
Its magnitude is the speed of one inertial reference frame with respect to the
other. Its sign is positive if, as assumed in Figs. 14-13 and 14-14, O' moves
in the positive x and x' directions relative to O. In such a case, the sign of the
quantity $\delta = -Vx/c^2$ is opposite to the sign of the coordinate x. If the
direction of relative motion is reversed, so that O' moves in the negative x and x'
directions relative to O, the sign of V is negative. In these circumstances
the sign of $\delta$ is the same as the sign of x. Use the thought experiment
involving the train to explain to yourself the physical significance of the sign of
$\delta$ and of its dependence on the signs of V and x.

Also note that the predictions initially made about $\gamma = 1/\sqrt{1 - V^2/c^2}$
have been borne out. It is dimensionless and approaches 1 as V/c
approaches 0. Use the time dilation-length contraction thought experiment
of Sec. 14-5 to explain physically the dependence of $\gamma$ on V. In doing this,
also give a physical reason why $\gamma$ does not depend on the sign of V, as is
expressed mathematically by the fact that V enters as a squared quantity.

Figure 14-15 is a plot versus |V|/c of the factor $\gamma = 1/\sqrt{1 - V^2/c^2}$
which appears in the Lorentz transformation. Its value departs very little
from 1 until |V|/c becomes larger than about 0.5. Then it begins to increase
very rapidly, going to infinity when |V|/c = 1. For |V|/c > 1, the factor
becomes an imaginary number. Thus the condition |V| = c acts as a limit to
the range of validity of the Lorentz transformation. We will repeatedly
come across c playing the role of a limiting speed as we continue our
development of the theory of relativity.
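
The behavior just described can be tabulated directly. The following minimal Python sketch uses illustrative sample values of |V|/c; the function name `gamma` is an assumption of the sketch. It rejects |V|/c ≥ 1, where the factor is infinite or imaginary.

```python
import math


def gamma(beta):
    """Return 1/sqrt(1 - beta^2), where beta = |V|/c must be smaller than 1."""
    if abs(beta) >= 1.0:
        raise ValueError("gamma is not a finite real number for |V|/c >= 1")
    return 1.0 / math.sqrt(1.0 - beta * beta)


betas = [0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 0.99, 0.999]
table = [(beta, gamma(beta)) for beta in betas]

print(" |V|/c    gamma")
for beta, g in table:
    print(f"{beta:6.3f} {g:8.3f}")
```

The tabulated values stay within about 15 percent of 1 up to |V|/c = 0.5 and then grow steeply as |V|/c approaches 1.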


All the features of relativistic kinematics discussed earlier are
contained in the Lorentz position-time transformation. Example 14-5 uses the
transformation to yield one of these features, length contraction.

Fig. 14-15 A plot of the factor $\gamma = 1/\sqrt{1 - V^2/c^2}$ versus |V|/c, for |V|/c
between 0 and 1.0.


EXAMPLE  14-5  

Derive  the  length  contraction  formula  from  the  Lorentz  position-time  transforma- 
tion. To  facilitate  comparison  with  Eq.  (14-7),  make  this  derivation  for  a rod  at  rest 
in  the  primed  reference  frame. 

■ Think of the rod lying along the x' axis of the primed frame, with one end
always at coordinate $x'_1$ and the other end always at coordinate $x'_2$, where $x'_2 > x'_1$. Its
proper length is

$$L' = x'_2 - x'_1$$

In the unprimed frame, relative to which the primed frame moves with velocity V,
the ends of the rod have the changing coordinates $x_1$ and $x_2$. According to the first
of Eqs. (14-16), the relations between the unprimed and primed coordinates are

$$x'_1 = \frac{x_1 - Vt_1}{\sqrt{1 - V^2/c^2}}$$

and

$$x'_2 = \frac{x_2 - Vt_2}{\sqrt{1 - V^2/c^2}}$$




So

$$x'_2 - x'_1 = \frac{x_2 - x_1 - V(t_2 - t_1)}{\sqrt{1 - V^2/c^2}}$$


Now  the  length  of  the  rod  moving  through  the  unprimed  frame  is  measured  by 
the  difference  between  the  coordinates  at  which  its  endpoints  lie  simultaneously,  as 
judged  from  that  frame.  In  other  words,  the  length  is  given  by 

$$L = x_2 - x_1$$

providing $t_2 = t_1$ so that the observer in the unprimed frame judges the
measurements of $x_2$ and $x_1$ to be made simultaneously. Using this condition in the equation
ending the last paragraph, you have


$$x'_2 - x'_1 = \frac{x_2 - x_1}{\sqrt{1 - V^2/c^2}}$$

or

$$L = \sqrt{1 - V^2/c^2}\; L'$$

The moving rod, of proper length L', is measured to have a contracted length L,
and the contraction factor $\sqrt{1 - V^2/c^2}$ is in agreement with Eq. (14-7).

How  would  you  use  the  Lorentz  transformation  to  derive  the  time  dilation  for- 
mula? 
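The length contraction argument of Example 14-5 can be checked numerically. The sketch below assumes a rod with its ends fixed at x' = 0 and x' = L' in the primed frame (positions chosen here only for illustration) and locates both ends at one common time t in the unprimed frame. The function names and the sample speed are hypothetical.

```python
import math

C = 3.00e8  # speed of light in m/s (rounded value used in this chapter)

def gamma(V):
    return 1.0 / math.sqrt(1.0 - (V / C) ** 2)

def to_primed_x(x, t, V):
    """First of Eqs. (14-16): x' = (x - V t) / sqrt(1 - V^2/c^2)."""
    return gamma(V) * (x - V * t)

def measured_length(proper_length, V, t=0.0):
    """Length of a rod at rest in the primed frame, measured in the
    unprimed frame by locating both ends at the same unprimed time t."""
    x1_primed, x2_primed = 0.0, proper_length  # assumed end positions
    # Solve x' = gamma (x - V t) for x at the common time t.
    x1 = x1_primed / gamma(V) + V * t
    x2 = x2_primed / gamma(V) + V * t
    return x1, x2

V = 0.60 * C          # illustrative relative speed
t = 2.0e-6            # any common time works; V t cancels in x2 - x1
x1, x2 = measured_length(1.00, V, t)
L = x2 - x1
print("measured length:", L)
print("sqrt(1 - V^2/c^2) * L':", math.sqrt(1.0 - (V / C) ** 2) * 1.00)
```

The two printed numbers agree to within rounding error, and transforming x1 and x2 back with `to_primed_x` recovers the proper length, as the derivation requires.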


Example  14-6  involves  using  the  Lorentz  position-time  transformation 
to  evaluate  quantitatively  the  amount  by  which  observers  in  relative  motion 
disagree  about  simultaneity. 


EXAMPLE 14-6

Starship Enterprise is overtaken by the new model, Starship Enterprise-prime, with
E' passing E at a relative speed |V| = c/2. The captain of E salutes the captain of E'
by blinking the bow and stern lights of E, simultaneously from the point of view of
E. As measured by E, the distance between the lights is 100 m. By how much do the
times of emission of the signals from the lights differ, as measured by E'?

■ In the reference frame of E the signals from the two lights are emitted at times
$t_1$ and $t_2$ from locations $x_1$ and $x_2$. But the emission times are judged in the reference
frame of E' to occur at times $t'_1$ and $t'_2$. The last of Eqs. (14-16) shows you that

$$t'_1 = \frac{t_1 - Vx_1/c^2}{\sqrt{1 - V^2/c^2}}$$

and

$$t'_2 = \frac{t_2 - Vx_2/c^2}{\sqrt{1 - V^2/c^2}}$$

According to E', the difference between the emission times is

$$t'_2 - t'_1 = \frac{t_2 - t_1 - (V/c^2)(x_2 - x_1)}{\sqrt{1 - V^2/c^2}}$$

There is no difference in the times according to E, so $t_2 = t_1$. Thus the result
simplifies to

$$t'_2 - t'_1 = -\frac{(V/c^2)(x_2 - x_1)}{\sqrt{1 - V^2/c^2}}$$




To aid interpretation, this is best written as

$$t'_2 - t'_1 = -\frac{1}{\sqrt{1 - V^2/c^2}}\,\frac{V(x_2 - x_1)}{c^2}$$

For numerical evaluation, you use |V|/c = 0.500 to calculate

$$\frac{1}{\sqrt{1 - V^2/c^2}} = 1.15$$

Then, choosing $x_2$ to be the coordinate of the bow light, with the forward direction
of the ship being positive, you have V = |V| = 0.500c and

$$\frac{V(x_2 - x_1)}{c^2} = \frac{0.500c\,(x_2 - x_1)}{c^2} = \frac{0.500 \times 100\ \text{m}}{3.00 \times 10^8\ \text{m/s}} = 1.67 \times 10^{-7}\ \text{s}$$

So you obtain

$$t'_2 - t'_1 = -1.15 \times 1.67 \times 10^{-7}\ \text{s} = -1.92 \times 10^{-7}\ \text{s}$$

The minus sign means that the captain of E' judges the bow light of E to blink
slightly earlier than the stern light. Explain from simple considerations why it is
measured to be early. Also identify the origins of the two factors that combine to
produce the measured time difference, $V(x_2 - x_1)/c^2$ and $1/\sqrt{1 - V^2/c^2}$.
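The arithmetic of Example 14-6 can be reproduced directly from the last of Eqs. (14-16). The following is a minimal sketch; the function name `primed_time_difference` is a hypothetical helper introduced here, and c is taken as the rounded value 3.00 × 10^8 m/s used in the example.

```python
import math

C = 3.00e8  # speed of light in m/s, as used in the example

def primed_time_difference(dt, dx, V):
    """Return t2' - t1' for two events separated by dt = t2 - t1 and
    dx = x2 - x1 in the unprimed frame (last of Eqs. (14-16))."""
    g = 1.0 / math.sqrt(1.0 - (V / C) ** 2)
    return g * (dt - V * dx / C ** 2)

# Lights blink simultaneously for E (dt = 0), 100 m apart, with V = 0.500c.
dt_prime = primed_time_difference(0.0, 100.0, 0.500 * C)
print(f"t2' - t1' = {dt_prime:.3e} s")
```

The result is negative, about -1.9 × 10^-7 s, so the bow light (event 2) is judged by E' to blink first. Setting dx = 0 returns zero for any V, which shows that the disagreement about simultaneity requires a spatial separation between the events.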


14-7 THE LORENTZ VELOCITY TRANSFORMATION


Fig.  14-16  A moving  point  P,  as  viewed 
from  two  inertial  reference  frames  in 
relative  motion.  The  constant  velocity 
of  the  primed  frame  with  respect  to  the 
unprimed  frame  is  V.  The  velocity  of 
P measured  from  the  unprimed  frame 
is  v,  and  its  velocity  measured  from  the 
primed  frame  is  v' . 


Now  we  will  use  the  Lorentz  position-time  transformation  to  obtain  the 
transformation  equations  which  show  how  to  find  the  velocity  of  a particle 
as  measured  in  a certain  inertial  frame,  when  given  both  its  velocity  as  mea- 
sured in  some  other  inertial  frame  and  the  relative  velocity  of  the  two 
frames.  This  Lorentz  velocity  transformation  will  then  be  applied  in  two 
examples,  which  show  how  the  speed  of  light  c acts  as  a limiting  speed. 

In  Fig.  14-16  the  motion  of  a point  P is  viewed  from  two  inertial  refer- 
ence frames  x,  y,  z and  x’,  y',  z',  which  are  in  relative  motion.  The  x and  x' 
axes  have  been  constructed  on  a common  line,  along  which  is  the  direction 
of  relative  motion.  The  constant  velocity  of  the  primed  frame  with  respect 
to  the  unprimed  frame  is  specified  by  the  scalar  V,  whose  sign  is  positive  if 
this  velocity  is  in  the  direction  of  the  positive  x and  x'  axes,  as  illustrated  in 
the  figure.  The  remaining  axes  of  each  frame  have  been  constructed  paral- 
lel to  the  corresponding  axes  of  the  other  frame.  In  both  frames  of  refer- 
ence, time  is  measured  from  the  instant  at  which  the  origins  of  the  frames 
overlap.  In  the  unprimed  reference  frame,  the  location  of  the  point  at  a 
certain  time  is  specified  by  the  set  of  numbers  (x,  y,  z,  t).  In  the  primed 
frame,  the  same  information  is  given  by  the  set  (x',  y',  z' , t'). 

Let  us  start  by  reviewing  the  velocity  transformation  of  newtonian  kin- 
ematics. A way  to  obtain  this  transformation  is  first  to  take  the  differentials 
of  both  sides  of  each  of  Eqs.  (14-8),  the  Galilean  transformation  equations 
relating  the  two  sets  of  numbers.  Remembering  that  V is  a constant  while 
doing  this,  we  obtain 

dx'  = dx  — V dt 
dy'  = dy 
dz'  = dz 


dt'  = dt 




Then the first three equations are each divided by the fourth, giving

$$\frac{dx'}{dt'} = \frac{dx}{dt} - V$$

$$\frac{dy'}{dt'} = \frac{dy}{dt} \qquad \text{(14-17)}$$

$$\frac{dz'}{dt'} = \frac{dz}{dt}$$

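These are the three components of a form of the velocity transformation
equation we have used on a number of occasions in newtonian kinematics.
This is so since the three components of the velocity of P measured in the
unprimed frame are

$$v_x = \frac{dx}{dt} \qquad v_y = \frac{dy}{dt} \qquad v_z = \frac{dz}{dt} \qquad \text{(14-18)}$$

while, as measured in the primed frame, they are

$$v'_x = \frac{dx'}{dt'} \qquad v'_y = \frac{dy'}{dt'} \qquad v'_z = \frac{dz'}{dt'} \qquad \text{(14-19)}$$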


[Of  course,  in  the  newtonian  domain  no  distinction  is  made  between  t'  and 
t,  so  Eqs.  (14-19)  would  conventionally  be  written  with  t replacing  t'  in  each.] 
Thus  Eqs.  (14-17),  (14-18),  and  (14-19)  together  give 

$$v'_x = v_x - V$$

$$v'_y = v_y \qquad \text{(14-20)}$$

$$v'_z = v_z$$


These  are  just  the  representation  in  components  of  the  velocity  transfor- 
mation of  Eq.  (3-57)  for  a case  in  which  the  velocity  of  the  primed  frame 
with  respect  to  the  unprimed  frame  lies  along  the  x and  x'  axes. 


In  relativistic  kinematics  the  velocity  transformation  is  obtained  in  a 
completely  analogous  manner,  except  that  the  Lorentz  position-time  trans- 
formation rather  than  the  Galilean  position-time  transformation  is  used. 
First  the  differentials  are  taken  of  both  sides  of  each  equation  of  the 
Lorentz  transformation,  Eqs.  (14-16),  remembering  that  both  V and  c are 
constants.  The  results  are 


$$dx' = \frac{1}{\sqrt{1 - V^2/c^2}}\,(dx - V\,dt)$$

$$dy' = dy$$

$$dz' = dz$$

$$dt' = \frac{1}{\sqrt{1 - V^2/c^2}}\,(dt - V\,dx/c^2)$$


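Then each of the first three of these equations is divided by the fourth,
producing

$$\frac{dx'}{dt'} = \frac{\dfrac{1}{\sqrt{1 - V^2/c^2}}\,(dx - V\,dt)}{\dfrac{1}{\sqrt{1 - V^2/c^2}}\,(dt - V\,dx/c^2)} = \frac{\dfrac{dx}{dt} - V}{1 - \dfrac{V}{c^2}\dfrac{dx}{dt}}$$

$$\frac{dy'}{dt'} = \frac{dy}{\dfrac{1}{\sqrt{1 - V^2/c^2}}\,(dt - V\,dx/c^2)} = \frac{\sqrt{1 - V^2/c^2}\;\dfrac{dy}{dt}}{1 - \dfrac{V}{c^2}\dfrac{dx}{dt}}$$

$$\frac{dz'}{dt'} = \frac{dz}{\dfrac{1}{\sqrt{1 - V^2/c^2}}\,(dt - V\,dx/c^2)} = \frac{\sqrt{1 - V^2/c^2}\;\dfrac{dz}{dt}}{1 - \dfrac{V}{c^2}\dfrac{dx}{dt}}$$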


Using Eqs. (14-18) and (14-19) to write these in terms of the velocity
components, we have the Lorentz velocity transformation:

$$v'_x = \frac{v_x - V}{1 - Vv_x/c^2}$$

$$v'_y = \frac{v_y\sqrt{1 - V^2/c^2}}{1 - Vv_x/c^2} \qquad \text{(14-21)}$$

$$v'_z = \frac{v_z\sqrt{1 - V^2/c^2}}{1 - Vv_x/c^2}$$


These  equations  tell  us  how  to  transform  a measured  velocity  from  one  in- 
ertial frame  to  another  moving  relative  to  the  first — that  is,  how  to  use  a 
velocity  measured  in  one  frame  to  predict  the  velocity  that  will  be  measured 
in  the  other. 

Inspection will immediately show you that as |V|/c approaches 0, the
Lorentz  velocity  transformation  equations  approach  Eqs.  (14-20),  the  Gali- 
lean velocity  transformation  equations.  Thus  you  see  once  more  that  rela- 
tivistic predictions  become  indistinguishable  from  nonrelativistic  predic- 
tions in  the  limit  of  speeds  small  compared  to  the  speed  of  light.  But  the 
nonlinear  Lorentz  velocity  transformation  equations  have  properties  very 
different  from  those  of  the  Galilean  velocity  transformation  for  speeds 
comparable  to  the  speed  of  light,  as  you  will  see  in  Examples  14-7  and  14-8. 
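The low-speed agreement and high-speed disagreement of the two transformations can be seen numerically. The sketch below compares only the x components, for two objects approaching each other with equal speeds, a test case chosen here for illustration. Speeds are expressed in units of c, and the function names are hypothetical.

```python
def galilean_vx(vx, V):
    """x component of Eqs. (14-20)."""
    return vx - V

def lorentz_vx(vx, V, c=1.0):
    """x component of Eqs. (14-21)."""
    return (vx - V) / (1.0 - V * vx / c ** 2)

# An object moves at +beta while the primed frame moves at -beta (units of c).
for beta in (1e-4, 1e-2, 0.10, 0.50, 0.90):
    vx, V = beta, -beta
    g = galilean_vx(vx, V)
    l = lorentz_vx(vx, V)
    print(f"beta = {beta:7.4f}  Galilean = {g:.6f}c  Lorentz = {l:.6f}c  "
          f"fractional difference = {(g - l) / l:.2e}")
```

For this particular test case the fractional difference works out to the square of beta, so it is utterly negligible at everyday speeds but large when the speeds are comparable to c.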


EXAMPLE 14-7

Two particles are moving in opposite directions along the x axis of the x, y, z
reference frame, as illustrated in Fig. 14-17a. The velocity of particle 1 is $v_1 = 0.90c$, and
that of particle 2 is $v_2 = -0.80c$, relative to this frame. What is the velocity of
particle 1 relative to particle 2?

■ Your newtonian intuition may tell you the answer is 1.70c. But that intuition is
wrong. The correct answer is obtained by using the first of Eqs. (14-21) to
transform the velocity of particle 1 to a reference frame moving along with particle
2, as in Fig. 14-17b. This is done by setting $V = v_2 = -0.80c$ and $v_x = v_1 = 0.90c$ in
the transformation equation. You obtain

$$v'_x = \frac{0.90c - (-0.80c)}{1 - (-0.80c)(0.90c)/c^2} = \frac{1.70c}{1.72} = 0.99c$$

or, since $v'_x = v'_1$,

$$v'_1 = 0.99c$$

Fig. 14-17 (a) Particles 1 and 2 moving at high speeds in opposite
directions. (b) The situation as viewed in a primed reference frame, moving with
velocity V = -0.80c, in which particle 2 is stationary.

Fig. 14-18 A velocity transformation carried out in Example 14-8.

EXAMPLE  14-8  

At  the  end  of  Sec.  14-2,  we  considered  a thought  experiment  in  which  the  speed 
of  the  same  beam  of  light  was  being  measured  by  two  observers  O and  O' . Observer 
O'  was  moving  relative  to  O in  the  same  direction  as  the  light  beam  and  at  half  the 
speed  of  light.  Yet  Einstein  asserts  that  both  observers  measure  the  same  value  c 
for  the  speed  of  the  beam.  Show  that  the  Lorentz  velocity  transformation  leads 
to  an  identical  statement. 

■ The  experiment  is  illustrated  in  Fig.  14-18  from  the  point  of  view  of  observer 
O.  The  velocity  to  be  transformed  is  that  of  the  light  beam,  as  measured  by  O,  which 
is  vx  = c.  The  velocity  V of  O'  with  respect  to  O is  V = c/2.  So  the  first  of  Eqs.  (14-21) 
gives  you 


$$v'_x = \frac{c - c/2}{1 - (c/2)c/c^2} = \frac{c/2}{1 - 1/2} = \frac{c/2}{1/2}$$

or 

v'x  = c 

Thus  both  observers  measure  the  same  speed  c for  the  beam  of  light. 


Examples  14-7  and  14-8  show  that  a feature  of  the  Lorentz  velocity 
transformation  equation  for  the  x component  of  velocity  is  that  relative  ve- 
locities cannot  be  compounded  in  such  a way  as  to  produce  a speed  exceed- 
ing the  speed  of  light.  The  y and  z component  equations  have  the  same  fea- 
ture. In  Chap.  15  Einstein’s  famous  relation  between  mass  and  energy  is 
obtained  from  arguments  based  on  the  Lorentz  velocity  transformation 
equation  for  the  y component. 
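The limiting role of c in all three components can be explored with a short script. This sketch applies Eqs. (14-21) to the velocities of Examples 14-7 and 14-8 and, as an extra case assumed here for illustration, to a light beam travelling along the y axis. Speeds are in units of c, and `transform_velocity` and `speed` are hypothetical helper names.

```python
import math

def transform_velocity(v, V, c=1.0):
    """Apply Eqs. (14-21) to v = (vx, vy, vz); the primed frame moves
    with velocity V along the common x and x' axes."""
    vx, vy, vz = v
    denom = 1.0 - V * vx / c ** 2
    root = math.sqrt(1.0 - (V / c) ** 2)
    return ((vx - V) / denom, vy * root / denom, vz * root / denom)

def speed(v):
    return math.sqrt(sum(comp * comp for comp in v))

# Example 14-7: particle 1 seen from the frame of particle 2.
v1_prime = transform_velocity((0.90, 0.0, 0.0), V=-0.80)

# Example 14-8: a light beam along x, seen from a frame moving at c/2.
light_x_prime = transform_velocity((1.0, 0.0, 0.0), V=0.50)

# Assumed extra case: a light beam along y, seen from the same frame.
light_y_prime = transform_velocity((0.0, 1.0, 0.0), V=0.50)

for label, v in (("particle 1", v1_prime),
                 ("light along x", light_x_prime),
                 ("light along y", light_y_prime)):
    print(f"{label:14s} v' = {v}   speed = {speed(v):.4f}c")
```

In the last case the beam acquires an x' component and its y' component shrinks, yet the magnitude of the velocity is still c, which illustrates how the y and z equations share the limiting-speed feature.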




EXERCISES 

Group  A 

14-1.  Ship  time.  How  fast  must  a space  ship  travel  rela- 
tive to  the  earth  so  that  exactly  10  years  of  earth  time  cor- 
respond to  exactly  1 year  of  space  ship  time? 

14-2.  Strongly  contracted.  A rectangle  fixed  in  the  O' 
system  has  one  side  2.00  m long  on  the  x'  axis  and  one  side 
1.00  m long  parallel  to  the  y'  axis.  Observer  O measures  it 
as  a square,  1.00  m on  each  side.  What  is  the  speed  of  O' 
relative  to  O along  the  common  x axis? 

14-3. Muon watch. The average lifetime of muons at
rest is $2.2 \times 10^{-6}$ s. A laboratory measurement made on
muons produced at a high energy particle accelerator
yields an average lifetime of $6.9 \times 10^{-6}$ s. Determine the
speed of the muons with respect to the laboratory.

14-4.  Blink-blink.  A spacecraft  of  length  10  m is 
moving  at  constant  speed  | V|  = 0.80c  relative  to  an  inertial 
frame  O'.  A light  signal  is  sent  from  the  front  of  the  craft 
toward  the  tail. 

a.  In  a frame  O fixed  to  the  spacecraft,  how  long  does 
the  light  signal  take  to  reach  the  tail? 

b.  How  long  does  it  take  to  reach  the  tail  as  viewed 
from  the  frame  O'? 

14-5.  A light  in  the  distance.  When  O'  passes  moving 
with respect to O at a speed |V| = 0.80c, they both set their
clocks  to  zero  time.  One  hour  later  according  to  his  clock 
O'  illuminates  his  dial,  thus  sending  a light  signal  to  O. 

a.  According  to  the  clock  of  O,  when  was  the  signal 
sent? 

b.  According  to  O,  how  long  does  it  take  the  signal  to 
reach  her?  What  does  her  clock  register  on  the  arrival  of 
the  signal? 

c. According to O', how long does it take for the
signal to reach O? What is the reading on his clock when
the signal reaches O?

14-6. Child of the galaxy. It takes light about $10^5$ years
to travel from the most distant star in our galaxy to the
earth. But in principle it is possible for a child born on a
space ship as it leaves the earth to arrive at the star before
dying of old age, if the space ship travels at a high
enough speed. Explain why. Then estimate the required
speed.

14-7.  A fast  ship. 

a. What is the value of $1/\sqrt{1 - V^2/c^2}$ for a space ship
that travels 1 light year as observed from the earth in one
year of space ship time?

b.  What  is  the  value  of  |V|/c? 

14-8. Add 'em up? A rocket ship approaches the earth
with speed 0.80c. Another ship approaches from the
opposite direction with the same speed. What is the speed of
the first ship relative to the second?


14-9. Galileo versus Lorentz. What percentage error is
made in using the Galilean transformation x' = x - Vt
instead of the corresponding Lorentz transformation
when V/c = 1/7?


14-10.  Lorentz  in  action.  Use  the  Lorentz  position-time 
transformation  to  derive  the  time  dilation  formula. 


Group B

14-11. Tilt!, I. A meter stick is at rest in the inertial
frame O' with one end at the origin, and makes an angle
of 60° with the x' axis. Observer O' sees observer O pass
him by, moving in the positive x' direction with velocity
0.80c.

a. If O fixes her x axis parallel to the x' axis, what
angle does the meter stick make with the x axis according
to her measurements?

b. What is the length of the meter stick as measured
by O?

14-12. Tilt!, II. Just as O passes O' in Exercise 14-11,
a strobe flash is set off, focused so that the light travels
along the meter stick.

a.  What  are  the  x'  and  y'  components  of  the  velocity 
of  the  light  flash? 

b.  Transform  these  components  into  the  frame  O. 

c.  What  angle  does  the  light  path  make  with  the  x 
axis?  Compare  this  result  with  the  effect  of  relative  motion 
on  the  tilted  meter  stick  in  Exercise  14-11. 

d.  Show  that  the  speed  of  the  light  in  the  frame  O is  c. 

14-13. The relativity of shape. Consider the two
reference frames O and O' that are described in Sec. 14-6. A
wooden  slab  is  at  rest  in  frame  O'  and  is  centered  on  the 
origin  of  O'.  Observer  O'  finds  that  it  is  square,  with  sides 
exactly  1 m long.  Find  the  coordinates  of  its  corners  and 
describe  the  shape  and  orientation  of  the  slab,  as  measured 
in  frame  O at  t = 0,  when  frames  O and  O'  coincide,  in  each 
of  the  following  cases. 

a.  The  edges  of  the  slab  in  O'  are  parallel  to  the  x' 
and  y'  axes. 

b.  The  edges  of  the  slab  in  O'  are  inclined  at  45°  to 
the  x'  and  y'  axes. 


14-14. The relativity of confinement. A rocket ship of
proper  length  100  m flies  at  a high  speed  into  a straight 
tunnel  bored  through  a mountain.  The  proper  length  of 
the  tunnel  is  50  m.  To  an  observer  stationed  at  the  tunnel, 
the  length  of  the  ship  is  contracted  so  that  he  judges  the 
ship  to  be  slightly  shorter  than  the  tunnel.  There  are 
sliding  doors  at  the  entrance  and  exit  of  the  tunnel, 
which  operate  independently.  If  a door  is  actuated,  it 
slams  shut  and  then  immediately  snaps  open.  When  the 
observer  finds  the  ship  to  be  in  the  tunnel,  he  actuates 
both  doors  at  the  same  time.  He  then  describes  what  has 
happened  by  saying  that  the  ship  was  momentarily  con- 
fined between  the  closed  doors.  There  is  also  an  observer 
on  the  ship.  For  her  it  is  the  length  of  the  tunnel  that  is 
contracted.  She  therefore  judges  the  ship  to  be  consider- 
ably longer  than  the  tunnel.  So  she  contends  that  the  ship 




could  not  have  been  momentarily  confined  between  the 
closed  doors.  Just  how  would  she  describe  what  has  hap- 
pened? Use  the  Lorentz  position-time  transformation  to 
reconcile  the  two  descriptions.  That  is,  show  that  the  rela- 
tion between  the  descriptions  is  in  quantitative  agreement 
with  the  transformation.  (Note:  Persons  who  want  to 
“prove  Einstein  wrong”  often  pose  this  problem  as  an 
“irrefutable  inconsistency”  in  the  theory  of  relativity.) 

14-15. Spatial separation versus length. As seen by
inertial observer O', a certain event 1 takes place at $x'_1 = -L'/2$
at time $t'_1 = L'/2c$. Another event 2 takes place at
$x'_2 = L'/2$ at time $t'_2 = L'/2c$, so that for O' the two events
are simultaneous.

a. Show that for another inertial observer O moving
along the x' axis at velocity V with respect to O', the events
are not simultaneous since $\Delta t = \gamma L'V/c^2$, where
$\gamma = 1/\sqrt{1 - V^2/c^2}$.

b. Determine the spatial separation of the two events,
$\Delta x$, for observer O.

c. Does $\Delta x$ represent the length of an object?
Explain.


14-16. Simultaneity: another view. Repeat the analysis
of the simultaneity thought experiment considered in
Sec. 14-4, but have the observer stationed on the train
cause  the  blasting  caps  to  be  detonated  simultaneously 
from  his  point  of  view.  Compare  your  analysis,  and  your 
results,  with  those  of  the  text. 

14-17. A close look at delta. Write a paragraph
explaining, in terms of the simultaneity thought experiment
considered in Sec. 14-4, the physical reason why the
quantity $\delta = -Vx/c^2$ of Eq. (14-13) is proportional to the
first powers of both V and x. Write a second paragraph
explaining the physical significance of the sign of $\delta$, and of
its dependence on the signs of V and x.

14-18. A close look at gamma. Write a paragraph
explaining, in terms of the time dilation-length contraction
thought experiment considered in Sec. 14-5, the physical
reason for the dependence of the quantity
$\gamma = 1/\sqrt{1 - V^2/c^2}$ of Eq. (14-15) on V. Give a physical reason
why $\gamma$ does not depend on the sign of V.

14-19. The inverse Lorentz transformations. Since all
inertial systems are equivalent, it follows that the laws of
physics  are  the  same  for  all  inertial  systems.  The  Lorentz 
transformation  of  Eqs.  (14-16)  is  such  a law. 

a.  Use  this  principle  of  equivalence  to  obtain  the  in- 
verse Lorentz  transformation,  from  the  primed  system  to 
the  unprimed  one. 

b.  Obtain  the  same  result  by  the  more  laborious 
method  of  algebraic  manipulation. 

14-20. An alternative derivation of the Lorentz
transformation. When inertial reference frames O and O' coincide,
let a flash of light be produced at the common origin.
Each observer is justified in considering himself at the
center of an expanding sphere of light. Experiment has
revealed that each obtains the same value c for the velocity
of light. The Galilean transformation, $x' = x - Vt$, does
not give this result. Therefore try a modification,
$x' = \gamma(x - Vt)$ [Eq. (14E-1)], where $\gamma$ is to be determined. The
principle of equivalence (see Exercise 14-19) requires that
Eq. (14E-1) hold for the inverse transformation,
$x = \gamma(x' - V't') = \gamma(x' + Vt')$ [Eq. (14E-2)]. In this equation,
we use the assertion that $V' = -V$. But for generality, the
possibility has been allowed that t' may be different from t.

If x and x' are the intersections of the sphere with the
axis at times t and t', respectively:

a. To what is x'/t' equal?

b. To what is x/t equal?

c. Use the results of parts a and b to eliminate x and x'
in Eqs. (14E-1) and (14E-2) and thus to determine $\gamma$.

d. With this relation for $\gamma$, express t' in terms of t
and x.

14-21. Relativistic pursuit. A spaceship passes the earth
at t = t' = 0 with relative velocity v. At time $t_1$ on earth
clocks, a super-spaceship leaves the earth with relative
velocity V > v to catch up with the first one. This will
happen at earth time $t_2$, where $vt_2 = V(t_2 - t_1)$ or
$t_2 = Vt_1/(V - v)$.

a. Is $(t_1 - 0)$ a proper or dilated time interval?

b. What does the clock on the "slow" spaceship read
when the earth clock reads $t_1$?

c.  How  far  from  the  spaceship  is  the  earth  then? 

d.  Is  (t2  — 0)  a proper  or  dilated  time  interval? 

e.  What  does  the  clock  on  the  slow  spaceship  register 
when  the  ship  is  overtaken? 

f.  In  the  frame  of  reference  of  the  slow  spaceship, 
how  much  time  has  elapsed  since  the  pursuit  started? 

g.  In  the  same  reference  frame,  how  large  a distance 
was  covered  in  the  pursuit? 

h. Divide the result of part g by the result of part f to
obtain the velocity of the fast ship relative to the slow one.
Compare  this  result  with  the  one  that  could  be  obtained  by 
using  the  formulas  for  the  Lorentz  velocity  transformation. 


14-22. The speed of light in moving water. The speed
of light in still water is c/n, where n = 1.33 and is called the
index of refraction. If the water is moving with speed $|V| \ll c$
in the same direction as the light is traveling, show that
the speed observed in the laboratory is equal to
$c/n + |V|(1 - 1/n^2)$. This result was obtained by the French
physicist Fizeau about the middle of the nineteenth century
and can be explained only relativistically.


14-23. Michelson-Morley fringe shift according to Galileo.
Let the distance to each mirror from the inclined lightly
silvered mirror P in Fig. 14-2 be L. Assume that there is a
stationary medium, the ether, through which the
apparatus moves with velocity V in the direction of mirror $M_2$.
Assume also that newtonian physics is valid, so that the
Galilean transformations apply.

a. If the speed of light is c, what is the speed of light
relative to $M_2$?

b. How long does it take the light to travel from P to
$M_2$?

c. What is the speed of light relative to P on the
return trip?

d.  How  long  does  it  take  the  light  to  travel  from  M2  to 
P?  What  is  the  total  time  for  the  round  trip  from  P to  M2 
and  back  to  P? 

e. Show that the speed of light relative to $M_1$ is
$\sqrt{c^2 - V^2}$ and that this is also the speed relative to P on
the return trip.

f. What is the total time for the round trip to the
mirror $M_1$ and back to P?

g. If $|V|/c \ll 1$, show that the difference in time
between the round trip to $M_2$ and the round trip to $M_1$ is
equal to $LV^2/c^3$.

h.  What  distance  does  light  travel  in  this  time  dif- 
ference? 

i. If the wavelength of the light is $\lambda$, to what fraction
of a wavelength does this distance correspond?

j. Calculate this fraction if L = 10 m, $\lambda = 6 \times 10^{-7}$ m
and $|V| = 3 \times 10^4$ m/s. Take c equal to $3 \times 10^8$ m/s.

k.  If  the  apparatus  is  rotated  through  90°,  how  much 
of  a fringe  shift  is  expected?  (Michelson  and  Morley  could 
have  detected  a shift  of  0.01  fringe.) 


14-24. Squashing a circle. A circular hoop of radius a'
is at rest in the x'y' plane of system O'. The inertial frame
O' is moving as described in Sec. 14-6 at constant velocity
V with respect to the inertial frame O.

a. Show that the measurements made in frame O will
indicate that the hoop is elliptical in shape, with its
semimajor axis parallel to the y axis and of length a = a', and
its semiminor axis of length $b = a'\sqrt{1 - V^2/c^2}$.

b. The eccentricity e of an ellipse is defined by
$e = \sqrt{1 - (b/a)^2}$. Derive a simple expression for the
eccentricity of the ellipse in part a. Evaluate e for each of the
following values of |V|, the relative speed: (1) 0.010c; (2) 0.10c;
(3) 0.50c; (4) 0.90c; (5) 0.999c.


Group  C 

14-25. Relativistic Doppler effect. A space ship traveling
toward the earth with speed $|v_s|$ has a long rod sticking
out at right angles to its direction of travel. When a light at
the space ship end of the rod flashes, the light pulse which
travels along the rod is reflected back to the space ship by
a mirror at the end of the rod. The returning light pulse
activates a very fast-acting mechanism which makes the
light flash again. The length of the rod is such that
observers on the space ship agree that the light flashes
exactly once per microsecond. That is, its frequency is exactly
$\nu = 1 \times 10^6$ Hz.


a.  If  the  time  dilation  were  the  only  effect  to  take  into 
account,  what  would  be  the  period  of  the  flash  on  earth 
clocks? 

Light from the flash also travels toward the earth.
This period of flashing will be less than the period found
in part a since the space ship moves toward the earth
between flashes and the second flash has less far to travel
than the first one.

b.  How  far  toward  the  earth  will  an  earth  observer 
say  the  light  has  traveled  in  the  time  between  flashes? 

c.  How  far  will  this  observer  say  the  ship  has  traveled 
between  flashes? 

d.  What  is  the  distance  between  successive  flashes  for 
an  earth  observer? 

e. Show that the time between flashes that reach the
earth is equal to $\sqrt{(1 - |v_s|/c)/(1 + |v_s|/c)}$ μs.

f. If $|v_s|$ = f c, what is the period T of the flashes as
observed on the earth? What is their frequency $\nu'$?

g.  If  the  space  ship  were  traveling  away  from  the  earth 
with  the  same  speed,  what  would  be  the  period  between 
flashes  observed  on  the  earth? 

h. Find an expression for the frequency $\nu'$ of the
flashes as they are observed on the earth.

This  change  in  the  period  and  frequency  of  a signal 
emitted  by  a source  moving  toward  or  away  from  an  ob- 
server is  called  the  relativistic  Doppler  effect.  Compare 
the  result  obtained  in  part  h with  Eq.  (12-71).  The  latter 
equation  gives  the  Doppler  frequency  shift  for  waves 
moving  through  a medium  when  the  source  is  moving  and 
the  observer  is  at  rest  with  respect  to  the  medium. 


14-26. Doppler effect in action. Light is emitted by a
certain species of atoms in a star located in a distant
galaxy. An identification of the species is made by studying
the distribution pattern of the numerous sharply defined
frequencies of the light arriving at the earth. (That is, the
source is identified by studying the "spectrum" of the light
it emits.) From this identification a certain component of
the light is determined to have a frequency $\nu = 8.0 \times 10^{14}$
Hz, as would be measured by an observer stationed in the
distant galaxy. The frequency $\nu'$ of this component is
measured by an observer stationed on the earth and
found to have the value $5.0 \times 10^{14}$ Hz. Use the relativistic
Doppler effect formula obtained in Exercise 14-25 to
determine the direction and speed of the motion of the
galaxy with respect to the earth.


14-27. Interstellar encounter. A space ship coasting in
interstellar space encounters an alien space probe which
has a radio transmitter. As the probe approaches, the
frequency initially received by the ship is 130 MHz =
$130 \times 10^6$ Hz. As the probe recedes into the distance, the
frequency eventually drops to 60 MHz. Consult Exercise
14-25 and then answer the following questions.

a. What is the relative speed of the ship and the
probe?

b. What is the intrinsic frequency of the probe's
transmitter?

c. Find the received frequency of the signals emitted
by the probe at the time of its closest approach to the ship.

14-28. Composition of two Lorentz transformations.
Consider three inertial reference frames O, O', and O''. Let O'
move with velocity V with respect to O, and let O'' move
with velocity V' with respect to O'. Both velocities are in
the same direction.

a. Write the transformation equations relating x, y, z,
t with x', y', z', t' and also those relating x', y', z', t' with x'',
y'', z'', t''. Combine these equations to obtain the relations
between x, y, z, t and x'', y'', z'', t''.

b. Show that these relations are equivalent to a direct
transformation from O to O'' in which the relative velocity
V'' of O'' with respect to O is given by

$$V'' = \frac{V + V'}{1 + VV'/c^2}$$

c.  Show  (hat  this  expression  for  V"  is  in  agreement 
with  the  hrst  of  the  Lorentz  velocity  transformations,  Eqs. 
(14-21). 

d. Explain how the analysis you have gone through
proves that two successive Lorentz position-time transformations
are equivalent to one direct position-time transformation.

e. M, M', and M" are meter sticks lying along the parallel
x, x', and x" axes. They are at rest in O, O', and O",
respectively. Construct a table that shows the lengths observers
in each frame would assign to each meter stick.
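
As a numerical sanity check on the result quoted in part b (not a substitute for the algebra the exercise asks for), the following minimal sketch multiplies two Lorentz transformations and compares the product with a single direct transformation. It assumes velocities expressed in units of c and a transformation acting on the pair (ct, x); the function names `boost`, `matmul`, and `compose` and the sample values are illustrative choices, not notation from the text.

```python
import math

def boost(beta):
    """Return the 2x2 Lorentz matrix acting on (ct, x) for relative velocity beta*c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [[gamma, -gamma * beta],
            [-gamma * beta, gamma]]

def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def compose(beta1, beta2):
    """Velocity composition from part b, with both velocities in units of c."""
    return (beta1 + beta2) / (1.0 + beta1 * beta2)

beta_V, beta_Vp = 0.5, 0.6                          # illustrative V/c and V'/c
two_steps = matmul(boost(beta_Vp), boost(beta_V))   # O -> O' followed by O' -> O"
one_step = boost(compose(beta_V, beta_Vp))          # O -> O" directly

# Every matrix element of the two-step product should match the direct boost.
matches = all(math.isclose(a, b, rel_tol=1e-12)
              for row_a, row_b in zip(two_steps, one_step)
              for a, b in zip(row_a, row_b))
print("V''/c =", compose(beta_V, beta_Vp), " matrices agree:", matches)
```

Note that the composed speed stays below c for any pair of input speeds below c, which is one way to see that the formula in part b differs from simple addition of velocities.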

14-29. A more general form of the Lorentz transformation.
The coordinate origin of inertial frame O' moves with
constant velocity V in the positive x direction, as measured
from inertial frame O. The x' and x axes are parallel but not
coincident. Axes y' and z' are parallel to axes y and z, respectively.
At time $t_0$ (according to O) and time $t'_0$ (according
to O'), the point $(x_0, y_0, z_0)$ in the frame O coincides with
the point $(x'_0, y'_0, z'_0)$ in the frame O'. Construct an argument,
based on the Lorentz transformation given in Eq.
(14-16), which shows that the correct transformation of
coordinates from O to O' must be:

$$
\begin{aligned}
x' - x'_0 &= \gamma[(x - x_0) - V(t - t_0)] \\
y' - y'_0 &= y - y_0 \\
z' - z'_0 &= z - z_0 \\
t' - t'_0 &= \gamma[(t - t_0) - V(x - x_0)/c^2]
\end{aligned}
$$

where $\gamma = 1/\sqrt{1 - V^2/c^2}$. (Hint: What is the transformation
for two observers who are at rest with respect to one
another, but who have simply chosen different origins for
position and time coordinates?)

14-30. Form-invariance of the wave equation. The wave
equation for a beam of light traveling in the x direction
can be written

$$\frac{\partial^2 f(x, t)}{\partial x^2} = \frac{1}{c^2}\,\frac{\partial^2 f(x, t)}{\partial t^2}$$

Here c represents the speed of light and f(x, t) represents
the value of the "electric field," or of the "magnetic field,"
in the light beam. Together these fields form the linked
pair of traveling transverse waves called the "electromagnetic
field" that constitutes the light beam. The wave
equation determines the properties of the light beam, as
measured by an observer O in an inertial reference frame
with position and time variables x and t. An observer O' is
in an inertial reference frame moving with respect to the
frame of observer O along the x axis at constant velocity
V. The position and time variables used by O' are x' and t'.
Their relations to the variables x and t used by O are given
by the first and last of Eqs. (14-16), the Lorentz equations
for transforming position and time. Use these relations
to show that when transformed into the variables x' and t'
the wave equation becomes

$$\frac{\partial^2 f(x', t')}{\partial x'^2} = \frac{1}{c^2}\,\frac{\partial^2 f(x', t')}{\partial t'^2}$$

That is, the wave equation is form-invariant under the
Lorentz transformation. The variables are changed, but
the form of the equation remains the same. (Hint: You
will need to make repeated use of the "chain rules"

$$\frac{\partial g}{\partial x} = \frac{\partial g}{\partial x'}\frac{\partial x'}{\partial x} + \frac{\partial g}{\partial t'}\frac{\partial t'}{\partial x}$$

and

$$\frac{\partial g}{\partial t} = \frac{\partial g}{\partial x'}\frac{\partial x'}{\partial t} + \frac{\partial g}{\partial t'}\frac{\partial t'}{\partial t}$$

where g is any function of x and t, or of x' and t'.) Compare
your results with the results of Exercise 12-33
which calls for a similar calculation, except that the Galilean
equations for transforming position and time are applied
to the wave equation for transverse waves in a
stretched string. In particular, discuss the physical significance
of the fact that the transformed wave equation for
light has exactly the same mathematical form after it is
transformed as it has before being transformed. What
does the transformed wave equation predict for the speed
of light, as measured by observer O'? Compare this prediction
to the prediction obtained in Exercise 12-34 concerning
the speed of the waves in the string, as measured
by observer O'.

14-31. Rocket astronomy. An astronomer observes that
a group of protons from the sun (part of the solar wind)
passed the earth at time 0. At the later time t = Δt,
he discovers that Jupiter has emitted a large burst of
radio noise. A second astronomer O', riding in a rocket
traveling from earth to Jupiter at speed |V|, observes the
same two events.

a. Assume that the earth is directly between the sun
and Jupiter, $6.3 \times 10^8$ km from Jupiter. Let |V| = 0.50c
and Δt = 900 s. Calculate the time interval Δt' measured by
observer O' in the rocket. Could the protons from the sun
have triggered the radio burst from Jupiter?

b. Is there a second rocket reference frame in which
the two events were simultaneous? If so, what was its
speed with respect to the earth? If not, why not?

c. Assume that a radio noise burst is triggered by a
burst of protons. What limit can be placed on Δt?




d. Suppose that the two events were separated by
Δt = 60 min. What was the speed of an observer who measured
a proper time interval between these events? Calculate
the time interval measured by the observer.

14-32. Three equal speeds. Frames O and O' are related
as described in Section 14-6. There is a particle moving in
the x'y' plane of the frame O' whose speed v' equals V,
the relative speed of O and O'. In frame O, the particle's
speed v is also equal to V.

a. Find the angle θ between v and the x axis. Find the
angle θ' between v' and the x' axis.

b. Evaluate the angles θ and θ' for the following values
of V: (1) 0.10c; (2) 0.30c; (3) 0.90c; (4) 0.99c.


14-33. Invariant spacetime interval. As seen from inertial
frame O, event 1 occurs at time $t_1$ at the location specified
by the coordinates $(x_1, y_1, z_1)$. Another event, event 2,
occurs at time $t_2$ at the location specified by the coordinates
$(x_2, y_2, z_2)$. Observer O' in a different inertial frame
observes the same two events, whose times of occurrence
and locations he specifies by means of the corresponding
primed quantities.

a. Use the Lorentz position-time transformation to express
the quantity $c^2(t_2 - t_1)^2 - (x_2 - x_1)^2 - (y_2 - y_1)^2 - (z_2 - z_1)^2$
in terms of the primed quantities.

b. Show that the quantity given above reduces to the
product of $c^2$ and the square of the time interval between
the two events if the events occur in the same place, and that
the quantity is thus the product of $c^2$ and the square of the
proper time between the two events.

Note that $x_2 - x_1$ is not invariant using the Lorentz
transformation, though it is invariant using the Galilean
transformation. The same holds for $t_2 - t_1$. The combination
of space and time quantities given in part a is Lorentz
invariant: hence the expression "spacetime."
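
The invariance stated in the note can be checked numerically before attempting the algebra of part a. The sketch below is only an illustration under stated assumptions: events are written as (ct, x, y, z) so that c does not appear explicitly, the relative motion is along x, and the function names and the two sample events are hypothetical.

```python
import math

def lorentz(event, beta):
    """Transform an event (ct, x, y, z) to a frame moving at beta*c along x."""
    ct, x, y, z = event
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return (gamma * (ct - beta * x), gamma * (x - beta * ct), y, z)

def interval_squared(e1, e2):
    """Return c^2 (dt)^2 - (dx)^2 - (dy)^2 - (dz)^2 for two events."""
    dct, dx, dy, dz = (b - a for a, b in zip(e1, e2))
    return dct**2 - dx**2 - dy**2 - dz**2

event1 = (0.0, 1.0, 2.0, 3.0)   # illustrative values in arbitrary length units
event2 = (5.0, 2.5, 1.0, 4.0)
beta = 0.6                      # illustrative V/c

s2_unprimed = interval_squared(event1, event2)
s2_primed = interval_squared(lorentz(event1, beta), lorentz(event2, beta))

# The separate differences change from frame to frame; the combination does not.
dx_unprimed = event2[1] - event1[1]
dx_primed = lorentz(event2, beta)[1] - lorentz(event1, beta)[1]
print(s2_unprimed, s2_primed, dx_unprimed, dx_primed)
```

The two interval values agree to rounding error, while the x separation alone takes different values in the two frames.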

14-34. The invariant speed. A flash of light has components
of velocity $v_x$, $v_y$, and $v_z$ in the unprimed coordinate
system. That is,

$$v_x^2 + v_y^2 + v_z^2 = c^2$$

Using the Lorentz velocity transformation, calculate the speed
of the light in the reference frame O' which is moving with
speed V relative to O in the direction of the x and x' axes.
The y and y' axes are parallel.

14-35. The aberration of starlight. Figure 14E-35 represents
a star on the y axis sending light toward the earth,
which is at the origin of a system of coordinates whose x'
axis is in the direction of the earth's motion relative to the
sun. Let φ be the angle between the y' axis and the direction
of the star as observed from the earth or, what is the same,
the observed path traveled by the light from the star to the
earth.

a. Show that sin φ = V/c, where V is the speed of the
earth relative to the sun.

b. Is the shift in the star's observed direction due to
the motion of the earth toward or opposite the direction
of the velocity of the earth relative to the sun?

c. Show the speed of the light in the coordinate
system attached to the earth is still c.

d. Evaluate φ in arc seconds if $V = 3 \times 10^4$ m/s.

This phenomenon of shift of star position is called
stellar aberration.




14-36. The relativity of acceleration. An accelerating object
is observed from two inertial frames, O and O'. The
x and x' axes are collinear with the path of the object and
O' has a velocity V relative to O which lies along the x axis.
Show that the accelerations as measured in the two frames
are related by

$$a' = \frac{a\,(1 - V^2/c^2)^{3/2}}{(1 - vV/c^2)^3}$$

where v is the instantaneous velocity of the object as observed
by O. [You will need to note that dv/dt' = (dv/dt)(dt/dt').]




15

Relativistic Mechanics


15-1 THE BASIS OF RELATIVISTIC MECHANICS

Now that we have developed the most important properties of relativistic
kinematics, we can use them as a foundation to build the structure of
relativistic mechanics. The procedure we follow is completely parallel to one
that served us well in our earlier work with newtonian mechanics. It starts
from the law which states that the total momentum of an isolated system is
conserved if the system is viewed from any inertial reference frame. There
is an overwhelming amount of experimental evidence supporting this law
in the newtonian domain; some is found in the strobe photos of puck
collisions considered in Chap. 4. When Einstein was developing relativistic
mechanics, there were no experiments in the relativistic domain to show that
the total momentum of a system is constant. He nevertheless boldly
assumed that this, the most fundamental law of newtonian mechanics, would
also apply to relativistic mechanics. Then he analyzed the implications of
the assumption by means of a momentum-conservation thought experiment.

We will analyze such a thought experiment. It will lead us to a
relativistic definition of mass, just as our analysis in Chap. 4 of an experiment
in momentum conservation of a puck collision led us to a newtonian
definition of mass. We will then use the understanding of relativistic mass and
momentum thus obtained to develop the relativistic concepts of force and
energy.

Although our development of relativistic mechanics will follow our
development of newtonian mechanics very closely in outline, it will be very
significantly changed in detail. This is because of the fundamental
differences between newtonian kinematics and the kinematics that must be
used in the relativistic domain. These differences lead to equally
fundamental differences between the predictions concerning rapidly moving
objects which will be obtained from our study of relativistic mechanics and the
predictions concerning slowly moving objects we obtained when studying
newtonian mechanics. Particularly striking will be predictions concerning
relativistic mass, relativistic energy, and the relativistic relation between
mass and energy.

While Einstein in 1905 had to carry the law of momentum conservation
over to relativity theory as an assumption, because experimental tests
did not then exist, the situation now is quite different because of the advent
in recent years of high-energy particle accelerators. Physicists working with
these machines have made countless measurements confirming that
momentum conservation holds for collisions between particles moving at
speeds very close to the speed of light. And long before it was technically
possible to obtain direct experimental confirmation of Einstein's
assumption, there were experiments whose results agreed with predictions
concerning relativistic mass and energy, which are logical consequences of
relativistic momentum conservation. Nuclear fission certainly provides the
most dramatic example. We discuss this and other experimental
confirmations of Einstein's theory at appropriate places in this chapter.


15-2 RELATIVISTIC MASS AND MOMENTUM

We obtain the relativistic definition of mass by analyzing a thought
experiment quite analogous to the experiment of the air-table puck collision
analyzed in Sec. 4-3. Just as in Sec. 4-3, by mass we mean here and throughout
this chapter the inertial aspect of mass, not the gravitational aspect of mass.
In the thought experiment, observers $O_1$ and $O_2$ obtain two identical
particles, say billiard balls $B_1$ and $B_2$. While the observers are at rest with
respect to each other in an inertial reference frame, they carry out a
preliminary experiment in which the balls collide while moving at speeds
relative to the two observers that are very small compared to the speed of light.
That is, they carry out an experiment exactly like one of those considered
in Sec. 4-3. Then they analyze the experiment and verify that both balls
have the same mass when the balls are observed to be moving very slowly or
to be at rest. We use the symbol $m_0$ for the mass of either ball, measured
when its speed relative to the observer making the measurement is zero or
very small compared to the speed of light.

Next the observers repeat the collision experiment, but modify it so
that each observer can compare the mass of one ball moving slowly with
respect to himself/herself with that of the other ball moving rapidly with
respect to himself/herself. To do this, each observer takes one ball, and the
two of them separate. Then they start moving rapidly along parallel lines
toward each other, so that they will pass each other at a constant relative
speed comparable to the speed of light. The situation is illustrated in Fig.
15-1 from the point of view of an inertial reference frame with axes x, y, z.
Observers $O_1$ and $O_2$ are moving in opposite directions parallel to the x axis
at a constant speed that is an appreciable fraction of the speed of light. Just
before they pass, $O_1$ throws ball $B_1$ in a direction that he judges to be
perpendicular to the x axis and toward $O_2$. From the point of view of $O_1$, the
speed with which he throws the ball has the value $v_\perp$. At the same instant, $O_2$
throws ball $B_2$ toward $O_1$ in a direction that she judges to be perpendicular
to the x axis. From her point of view, $O_2$ throws $B_2$ at a speed which also has
the value $v_\perp$. The two balls are thrown at such a time, and with such a
speed, that they collide as indicated in the figure. The common speed $v_\perp$ at
which the two observers throw their balls perpendicular to the x axis is very
small compared to the speed of light c. But the speed with which the
observers approach each other along the x axis is comparable to the speed of
light. Thus the balls approach each other along trajectories that are seen in
the x, y, z frame to be inclined at very small angles to the x axis. (In the
figure the angles have been exaggerated for the sake of clarity.)

Fig. 15-1 A symmetrical collision between two identical billiard balls, as seen
from an inertial reference frame. Since the speed of observers $O_1$ and $O_2$ with
respect to this reference frame is supposed to be comparable to c, while the
speed with respect to these observers of the balls $B_1$ and $B_2$ is supposed to be
very small compared to c, all the labeled angles are really much smaller than
shown.

From the point of view of the x, y, z reference frame, illustrated in Fig.
15-1, the balls $B_1$ and $B_2$ approach each other moving in opposite directions
at equal speeds along paths that form equal angles $\theta_{1i} = \theta_{2i}$ with the x axis.
This simplification is a consequence of the symmetrical way that we assume
the observers throw the balls. After the collision, $B_1$ and $B_2$ recoil along
paths inclined to the x axis at angles $\theta_{1f}$ and $\theta_{2f}$.

We require that the total momentum of an isolated system be conserved in any
inertial frame. This requirement will be satisfied by the collision we are
studying, since the two balls form an isolated system after they are thrown,
and since the x, y, z frame is inertial. The initial total momentum of the balls
is zero. So momentum conservation demands that their final total
momentum be zero also. This means that the balls must move apart after the
collision with equal speeds along paths inclined at the equal angles $\theta_{1f} = \theta_{2f}$.
The actual value of these angles depends, for given values of the speeds
involved, on the precise timing of the throws made by $O_1$ and $O_2$. We assume
for the sake of simplicity that this was done so as to make the balls collide as
shown in the figure, recoiling from the collision at final angles equal to the
initial angles at which they approach the collision. That is, we take $\theta_{1f} = \theta_{1i}$
and $\theta_{2f} = \theta_{2i}$. We further simplify the analysis by assuming that the collision
is elastic, so that the balls move apart afterward at a relative speed equal to
their relative speed of approach before the collision. This assumption will
not restrict the validity of the results obtained by imposing the
fundamental requirement that the total momentum of an isolated system be
conserved in any inertial frame. Just as in the newtonian domain, momentum
is conserved in a relativistic collision whether or not the collision is elastic.
The consequence of these two final simplifying assumptions (that the
final angles of recoil equal the initial angles of incidence and that the
collision is elastic) is to make the collision completely symmetrical, as seen in the
x, y, z reference frame.

The same collision is shown in Fig. 15-2 from the inertial frame of
reference $x_1, y_1, z_1$ in which observer $O_1$ is stationed. As viewed from the
$x_1, y_1, z_1$ frame, $O_1$ throws $B_1$ with an initial velocity of magnitude $v_\perp$ in the
negative $y_1$ direction. After the collision it recoils with a final velocity of the same
magnitude in the positive $y_1$ direction. So, from the viewpoint of $O_1$, ball $B_1$
never has an $x_1$ component of velocity. Also, $O_1$ observes $B_2$ move into the
collision with a velocity that has an $x_1$ component equal to V, the magnitude
of the velocity of $O_2$ relative to $O_1$. After the collision, $O_1$ sees $B_2$ move away
with the same velocity component along the $x_1$ axis. Neither ball
experiences a change in the $x_1$ component of its velocity, and there will be no
change in the product of this quantity and the mass of the ball. Thus it is
apparent that there can be no change in the component along the $x_1$
direction of the total momentum of the isolated system consisting of the two
colliding balls, as measured from the inertial frame of $O_1$. The $x_1$ component
of the total momentum of the system is conserved.

Fig. 15-2 The symmetrical collision between $B_1$ and $B_2$, from the point
of view of $O_1$. The angles between the $x_1$ axis and the initial and final
trajectories of $B_2$ again have been exaggerated for the sake of clarity.
Actually, the angles are much smaller because V is comparable to c and
$v_\perp$ is very small compared to c.

We investigate the conservation of the momentum component of the
system along the $y_1$ direction by evaluating the velocity components of each
ball along that direction, as measured by $O_1$. Now we know that $O_2$
measures the initial velocity of $B_2$ to have a component only along the $y_2$
direction of her $x_2, y_2, z_2$ reference frame, with a value equal to the quantity $v_\perp$.
But the Lorentz velocity transformation shows that $O_1$ does not measure the
$y_1$ component of the velocity of $B_2$ to have the same value. To find out what
value he does measure, we apply the velocity transformation from the $x_2, y_2, z_2$
reference frame to the $x_1, y_1, z_1$ reference frame and thereby determine
the initial $y_1$ component of the velocity of $B_2$ as seen by $O_1$. For this purpose,
we use the second of Eqs. (14-21), with the notational change that we
transform from the "sub-2" frame to the "sub-1" frame instead of from the
"unprimed" frame to the "primed" frame. We write the equation with $v_{y_2}$
replacing $v_y$, $v_{y_1}$ replacing $v'_y$, and $v_{x_2}$ replacing $v_x$. Also, we set the signed
scalar, used in Eqs. (14-21) to represent the velocity of the primed frame
with respect to the unprimed frame, equal to $-V$. The reason is that the
sub-1 frame is moving with respect to the sub-2 frame at speed V in the
negative x direction. Then we have

$$v_{y_1} = \frac{\sqrt{1 - V^2/c^2}\; v_{y_2}}{1 + V v_{x_2}/c^2}$$

Here $v_{x_2}$ is the initial $x_2$ velocity component of $B_2$, as seen by $O_2$. Its value is
$v_{x_2} = 0$. And $v_{y_2}$, the initial $y_2$ velocity component of $B_2$ according to $O_2$, has
the value $v_{y_2} = v_\perp$. Thus we find that when $O_1$ measures the initial $y_1$
component of the velocity of $B_2$, he obtains the value

$$v_{y_1} = \sqrt{1 - V^2/c^2}\; v_\perp$$

After the collision $O_1$ sees $B_2$ to have a velocity with a $y_1$ component of the
same magnitude but with a negative sign.
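
The transformation step can be illustrated numerically. The sketch below is a minimal example that assumes Eqs. (14-21) have the standard form of the Lorentz velocity transformation for relative motion along x; the function name `transform_velocity` and the sample values of V and $v_\perp$ are illustrative choices, not taken from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def transform_velocity(vx, vy, V, c=C):
    """Transform (vx, vy) to a frame moving at signed velocity V along x.

    Assumed standard form of the Lorentz velocity transformation:
    vx' = (vx - V) / (1 - V vx / c^2)
    vy' = vy sqrt(1 - V^2/c^2) / (1 - V vx / c^2)
    """
    denom = 1.0 - V * vx / c**2
    vx_new = (vx - V) / denom
    vy_new = vy * math.sqrt(1.0 - V**2 / c**2) / denom
    return vx_new, vy_new

V = 0.8 * C        # illustrative relative speed of the two observers
v_perp = 10.0      # illustrative throwing speed in m/s, tiny compared to c

# In the sub-2 frame, B2 has components (0, v_perp). The sub-1 frame moves
# at -V relative to the sub-2 frame, so the signed velocity passed in is -V.
vx1, vy1 = transform_velocity(0.0, v_perp, -V)
print(vx1 / C, vy1)
```

With V = 0.8c the perpendicular component seen by $O_1$ comes out as 0.6 times the throwing speed, while the $x_1$ component equals V, in line with the expressions above.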

Now that we know all the $y_1$ components of the velocities of both balls,
let us see if the $y_1$ component of the total momentum of the isolated
system they form is constant, as viewed from the inertial reference frame of
$O_1$. In the preliminary experiment, $O_1$ and $O_2$ were at rest with respect to
each other and observed both balls moving at speeds small compared to the
speed of light. They found that momentum was conserved if they used the
newtonian definition that the momentum of each ball equals its mass $m_0$
times its velocity. If we apply the same definition in this experiment, where
the balls are moving at speeds comparable to the speed of light, we find that
the initial momentum of the system has a $y_1$ component given by

$$p_{y_1 i} = m_0(-v_\perp) + m_0\sqrt{1 - V^2/c^2}\; v_\perp \qquad \text{(proves to be incorrect)}$$

The coefficient of $m_0$ in the first term is the initial velocity component in the
$y_1$ direction of $B_1$, and the coefficient of $m_0$ in the second term is the initial
velocity component of $B_2$ in the $y_1$ direction. (See Fig. 15-2.) We find also
that the final momentum of the system has a $y_1$ component given by

$$p_{y_1 f} = m_0 v_\perp + m_0\sqrt{1 - V^2/c^2}\,(-v_\perp) \qquad \text{(proves to be incorrect)}$$

Here the coefficient of $m_0$ in the first term is the final $y_1$ velocity component
of $B_1$, and the coefficient of $m_0$ in the second term is the final $y_1$ velocity
component of $B_2$.

These two expressions do not yield values of $p_{y_1 i}$ and $p_{y_1 f}$ which are
equal. Indeed, they yield values which are the negative of each other
because the right side of the first expression is just the negative of the right
side of the second. Applying the newtonian definition that momentum
equals the product of mass $m_0$ and velocity leads to the conclusion that the
initial total momentum of the isolated system has a $y_1$ component which is
different from that of its final total momentum. If so, the total momentum
of the system will not be conserved in the $x_1, y_1, z_1$ inertial reference frame,
even though it is conserved in the x, y, z inertial frame. This would
contradict Einstein's postulate (verified experimentally in many different
ways) that all inertial frames are equivalent. And it would invalidate in the
relativistic domain the law of conservation of momentum, which we know
to be the experimental foundation of mechanics in the newtonian domain.
Thus we do not accept as correct the values of $p_{y_1 i}$ and $p_{y_1 f}$ displayed in the
equations of the preceding paragraph.
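
A short numerical illustration of this failure, assuming illustrative values for the rest mass, the throwing speed, and V/c (none of which come from the text), evaluates the two newtonian expressions directly:

```python
import math

def newtonian_py(m0, v_perp, beta):
    """Return (initial, final) y1 momentum of the two-ball system as seen by O1,
    using the newtonian definition p = m0 * v for both balls.

    beta is V/c, the relative speed of the observers in units of c.
    """
    k = math.sqrt(1.0 - beta**2)                    # factor sqrt(1 - V^2/c^2)
    p_initial = m0 * (-v_perp) + m0 * k * v_perp    # B1 term + B2 term
    p_final = m0 * v_perp + m0 * k * (-v_perp)
    return p_initial, p_final

# Hypothetical numbers: a 0.17 kg ball thrown at 10 m/s, observers at V = 0.8c.
p_i, p_f = newtonian_py(m0=0.17, v_perp=10.0, beta=0.8)
print(p_i, p_f)

# In the slow limit (beta -> 0) the two values coincide, as in the
# preliminary experiment where both observers are at rest with each other.
print(newtonian_py(m0=0.17, v_perp=10.0, beta=0.0))
```

The first pair of values are equal in magnitude and opposite in sign, so the newtonian $y_1$ momentum is not conserved in the frame of $O_1$ unless V is negligible compared to c.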


How can we salvage Einstein's postulate and the fundamental law of
momentum conservation and also retain the definition of momentum as
the product of mass and velocity? By changing the properties of mass
and/or velocity! Since we are already using properties of velocity which are
different from those used in the newtonian domain and which are in
agreement with experiments in the relativistic domain, we try changing the
properties of mass. We do this by allowing the relativistic mass m of a particle
to depend on the particle's speed, instead of taking the mass to be a
constant, as in the newtonian domain. And we do it in such a way that no
conflict can arise with experiments of the newtonian domain which show the
mass of a particle to be independent of its speed when the speed is small
compared to the speed of light. This is accomplished by requiring that the
relativistic mass m of a particle approach the constant value $m_0$ when its
speed becomes small compared to the speed of light.

We find an expression for the relativistic mass of a particle when its
speed is comparable to that of light by equating the initial and final total
momentum components along the $y_1$ direction of the system of two
colliding balls. Abandoning the newtonian idea that the mass of each rapidly
moving ball has the fixed value $m_0$, we insert into the expression for the
initial total $y_1$ component of momentum the relativistic mass m. We have then

$$p_{y_1 i} = m(-v_\perp) + m\sqrt{1 - V^2/c^2}\; v_\perp$$

The first term on the right side is the $y_1$ component of momentum of $B_1$, as
measured by $O_1$. This observer sees $B_1$ to be moving at the speed $v_\perp$, which
is small compared to c. So for $O_1$ the relativistic mass of $B_1$ is
indistinguishable from its newtonian value $m_0$, and the equation becomes

$$p_{y_1 i} = m_0(-v_\perp) + m\sqrt{1 - V^2/c^2}\; v_\perp \tag{15-1a}$$




But $O_1$ sees $B_2$ to be moving at a speed that is comparable to c because it has
the large $x_1$ component of velocity derived from the motion of $O_2$.
Therefore in the second term the relativistic mass m cannot be replaced by the
newtonian value $m_0$. In exactly the same way, we find the following
equation for the final total $y_1$ component of momentum of the system

$$p_{y_1 f} = m_0 v_\perp + m\sqrt{1 - V^2/c^2}\,(-v_\perp) \tag{15-1b}$$

Now we insist that the $y_1$ component of momentum of the system be
conserved by equating its initial and final values:

$$p_{y_1 f} = p_{y_1 i} \tag{15-2}$$

By using Eqs. (15-1a) and (15-1b), this gives us

$$m_0 v_\perp + m\sqrt{1 - V^2/c^2}\,(-v_\perp) = m_0(-v_\perp) + m\sqrt{1 - V^2/c^2}\; v_\perp$$

Dividing through by $v_\perp$ and then transposing, we have

$$2m\sqrt{1 - V^2/c^2} = 2m_0$$

or

$$m = \frac{m_0}{\sqrt{1 - V^2/c^2}} \tag{15-3}$$

We have found the expression for relativistic mass which allows the $y_1$
component of momentum of the system to be conserved. We will interpret the
meaning of this expression shortly.
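
As a check on Eq. (15-3), the sketch below evaluates Eqs. (15-1a) and (15-1b) with the mass of $B_2$ set to $m_0/\sqrt{1 - V^2/c^2}$. The numerical values and the function name are illustrative assumptions; only the two momentum expressions and the mass formula come from the text.

```python
import math

def py_components(m0, m, v_perp, beta):
    """Evaluate Eqs. (15-1a) and (15-1b): the initial and final y1 momentum
    of the system as seen by O1, with rest mass m0 for B1 and mass m for B2.

    beta is V/c.
    """
    k = math.sqrt(1.0 - beta**2)
    p_initial = m0 * (-v_perp) + m * k * v_perp     # Eq. (15-1a)
    p_final = m0 * v_perp + m * k * (-v_perp)       # Eq. (15-1b)
    return p_initial, p_final

m0, v_perp, beta = 0.17, 10.0, 0.8                  # hypothetical values
m = m0 / math.sqrt(1.0 - beta**2)                   # Eq. (15-3)

p_i, p_f = py_components(m0, m, v_perp, beta)
print(m / m0, p_i, p_f)
```

Both momentum values vanish to within rounding error, so the initial and final $y_1$ components agree, whereas inserting $m = m_0$ in the same function reproduces the mismatch found with the newtonian definition.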


As was noted earlier, it is apparent from Fig. 15-2 that the initial and
final total momentum components of the system along the $x_1$ direction
cannot differ. This is a consequence of the system's symmetry. So we also
have conservation of its $x_1$ component of total momentum:

$$p_{x_1 f} = p_{x_1 i} \tag{15-4}$$

Equations (15-2) and (15-4), taken together, give us the vector conservation
equation

$$\mathbf{p}_f = \mathbf{p}_i \tag{15-5a}$$

A more complete expression of this important result reads

$$(\mathbf{p}_{\text{total}})_{\text{final}} = (\mathbf{p}_{\text{total}})_{\text{initial}} \tag{15-5b}$$

The total relativistic momentum of an isolated system is constant, as viewed from an
inertial reference frame. In this equation the relativistic momentum $\mathbf{p}$ of a
particle is defined as

$$\mathbf{p} = m\mathbf{v} \tag{15-6}$$

where m is its relativistic mass, given by Eq. (15-3), and $\mathbf{v}$ is its velocity.

To satisfy the law of momentum conservation, we have had to allow
the mass of $B_2$ as measured by $O_1$ to be a function of the speed of $B_2$ as
measured by $O_1$. From Fig. 15-2 and the pythagorean theorem, you can see that
this speed v of $B_2$ relative to $O_1$ is given by

$$v = \sqrt{V^2 + v_\perp^2(1 - V^2/c^2)}$$

But since V is comparable to c, while $v_\perp$ is very small compared to c, the
second term under the square root is completely negligible in comparison to
the first term. Hence the speed v of $B_2$ relative to $O_1$ is essentially V. Thus
we can rewrite Eq. (15-3), specifying the relativistic mass m of $B_2$ as
measured by $O_1$, in terms of the ball's speed v as measured by $O_1$. We do this by
setting V = v in Eq. (15-3), to obtain


$$m = \frac{m_0}{\sqrt{1 - v^2/c^2}} \tag{15-7}$$

In this equation m is the relativistic mass of a particle which is observed to
be moving at speed v. The relativistic mass of a particle is its mass measured when
it is observed to be moving at a speed comparable to the speed of light. The mass $m_0$
appearing in Eq. (15-7) is called the rest mass of the particle. The rest mass of
a particle is its mass measured when it is observed to be either moving at a speed very
small compared to the speed of light or at rest.


What we have done is to define mass in relativistic mechanics in such a
way that mass times velocity, that is, momentum, is conserved. We used
exactly the same approach in Sec. 4-3 to define mass in newtonian
mechanics. The difference is that velocity has relativistic properties which are
not the same as its newtonian properties, and this leads to a change in the
properties of mass.

The origin of this change in the properties of mass can be traced to
time dilation. Observer $O_2$ measures the mass of $B_2$ to be $m_0$. She finds its
initial velocity component in the perpendicular direction by dividing the
length $dy_2$ it moves by the proper time interval $dt_2$ required for the motion.
Observer $O_1$ accepts the measurement of the perpendicular length, but not
of the time interval. In fact, $O_1$ measures a dilated time interval that is larger
than $dt_2$ by the factor $1/\sqrt{1 - v^2/c^2}$, with $v$ being equal to the relative speed
of the observers. Momentum conservation requires that they both agree as to
the momentum component of $B_2$ in the perpendicular direction.
According to $O_2$ it is $m_0\,dy_2/dt_2$. Since $O_1$ finds the time interval in the
denominator to be increased by the factor $1/\sqrt{1 - v^2/c^2}$, there must be a
compensating increase in his measurement of one of the terms in the numerator.
Since there is no change in the length, the increase must be in the mass. So
$O_1$ finds the relativistic mass of $B_2$ to be larger than $m_0$ by the factor
$1/\sqrt{1 - v^2/c^2}$.


The  cloud  chamber  photograph  in  Fig.  15-3  provides  qualitative  confirmation 
of  the  relativistic  prediction  that  the  mass  of  a particle  is  larger  than  its  rest  mass  if 
its  speed  is  an  appreciable  fraction  of  the  speed  of  light.  A set  of  dots  shows  the 
path  of  a high-speed  electron,  emitted  from  a radioactive  source,  that  traveled 
from  the  left  of  the  photograph  to  collide  near  the  center  with  an  electron  which  is 
part  of  a molecule  of  the  gas  in  the  chamber.  The  molecular  electron  is  essentially 
free  and  stationary.  That  is,  its  speed  and  binding  energy  are  negligible  compared 
with  the  speed  and  energy  of  the  electron  from  the  source.  Emerging  from  the 
point  of  collision  are  sets  of  dots  showing  the  paths  followed  by  two  electrons,  one 
being  the  recoiling  incident  electron  and  the  other  the  struck  electron.  The  way  in 
which  the  cloud  chamber  makes  the  paths  visible  is  explained  in  the  caption  to 
the  figure. 

The collision between the two electrons does not have the appearance of an
elastic collision between two particles of equal mass, one of which is initially
stationary. You can see this by comparing Fig. 15-3 with Fig. 15-4, which is
reproduced from Figs. 4-14 and 8-9. It shows an elastic collision on an air table between
an incident puck and a stationary puck of the same mass. As discussed in Chaps. 4
and 8, such a collision is characterized by the fact that the two particles leave the
point of collision on paths that form a 90° angle. The angle in the electron-electron
collision of Fig. 15-3 is clearly less than 90°. This makes it look like either an
inelastic collision between two pucks of equal mass or an elastic collision in which
the mass of the incident puck is significantly larger than the mass of the struck
puck.

Fig. 15-3 Cloud chamber photograph of a collision between a high-speed electron and
another electron which is essentially free and stationary. The incident electron, coming
from the left, happens to approach the other electron located just above the center of
the photograph. The two electrons interact through an electric force they exert on each
other when they are close together. Momentum is thereby transferred from the incident to
the struck electron, and the two move off at high speeds to the right. Their paths from
the collision point form an angle of about 45°. The electron paths are recorded by droplets
of liquid condensing from supersaturated vapor filling the cloud chamber. The droplets
begin to condense on vapor molecules which the high-speed electrons have ionized (charged
electrically by ejecting a molecular electron). The droplets occur at essentially random
locations along the so-called tracks of the colliding electrons, and a single photograph is
taken, not a stroboscopic photograph. Thus the separation between adjacent droplets does
not provide direct information about the speeds of the electrons. Note that all three of the
tracks are in good focus throughout their entire lengths. This shows that the plane in which
they lie must be nearly perpendicular to the camera axis, so that the angle between the tracks
emerging from the point of collision is not distorted in the photograph.

Fig. 15-4 A strobe photo of a collision on an air table between two magnetic pucks of
equal mass. The incident puck comes from the left and makes a very close encounter with
the initially stationary puck near the center of the table. While they are close
together, the two pucks interact through a magnetic force. Note that the paths followed
by the pucks after the collision form a 90° angle.

In the typical collision shown in Fig. 15-3, the electrons collide elastically.
The physical reason is that it is difficult for electrons to get rid of mechanical
energy when they collide. (To do so, they must emit an X ray.) Confirmation of the
elastic nature of the collision is found in the photograph. Note that the electrons
follow paths which are arcs of circles. This is because there is a “magnetic field”
applied to the cloud chamber, perpendicular to the plane of the photograph. This
results in a centripetal force on the charged electrons whose strength depends on
their speeds. In Chap. 23 the details of this process are studied. Here it suffices to say
that measurements of the radii of the arcs, when combined with results to be
obtained in Sec. 15-3, can be used to determine the speeds of the electrons. In this
way it can be shown that the collision between the two electrons actually is
elastic.

The radius of curvature of the path followed by the incident electron shows
that its speed is $v = 0.97c$. For such a speed, Eq. (15-7) predicts that the
relativistic mass of the electron will be
$m_0/\sqrt{1 - v^2/c^2} = m_0/\sqrt{1 - (0.97)^2} = 4.1m_0$,
where $m_0$ is its rest mass. An elastic collision between two pucks, with the
initially moving puck having a mass 4 times the mass of the initially stationary one,
is shown in Fig. 15-5. The similarity is certainly evident between this photograph
and the one showing the elastic collision between the electron of relativistic mass
4.1 times the rest mass of the initially stationary electron. It provides good
qualitative confirmation of the relativistic prediction. (The confirmation is only
qualitative because a newtonian collision cannot precisely model a relativistic collision.
After a collision the ratio of the puck masses is still 4/1. But a collision reduces the
ratio of the electron masses to a value smaller than 4.1/1, since the incident
electron slows down and the struck electron speeds up.) Quantitative confirmation of
the values of relativistic mass $m$, predicted by $m = m_0/\sqrt{1 - v^2/c^2}$, is presented
in Sec. 15-3.

Examples 15-1 and 15-2 carry out numerical calculations using Eq.
(15-6), $\mathbf{p} = m\mathbf{v}$, and Eq. (15-7), $m = m_0/\sqrt{1 - v^2/c^2}$.


Fig. 15-5 A collision on an air table between magnetic pucks of unequal mass. The puck
incident from the left has a mass which is 4 times as large as that of the struck puck.
Here the angle between the paths followed by the pucks is about 45°. Compare this
collision to the electron collision shown in Fig. 15-3.




EXAMPLE 15-1

Calculate the relativistic momentum and mass of a cosmic ray muon moving at
speed $v = 0.999c$. The rest mass of a muon is $1.9 \times 10^{-28}$ kg.

■ The relativistic mass is

$$m = \frac{m_0}{\sqrt{1 - v^2/c^2}}$$

In Example 14-3 you saw that $1/\sqrt{1 - v^2/c^2} = 22$ for $v/c = 0.999$. So you have

$$m = 22m_0 = 22 \times 1.9 \times 10^{-28}\ \text{kg} = 4.2 \times 10^{-27}\ \text{kg}$$

The relativistic momentum can then be calculated from

$$\mathbf{p} = m\mathbf{v}$$

The magnitude is

$$p = mv = 4.2 \times 10^{-27}\ \text{kg} \times 3.0 \times 10^{8}\ \text{m/s} = 1.3 \times 10^{-18}\ \text{kg·m/s}$$

The direction of the momentum is the same as that of the muon velocity.
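The arithmetic of Example 15-1 can be checked with a short Python sketch. The function names and the rounded value of $c$ are assumptions made for illustration; they are not part of the text. Because the sketch evaluates the factor $1/\sqrt{1 - v^2/c^2}$ without rounding it to 22, its results differ slightly in the later digits from the values quoted above.

```python
import math

C = 3.0e8  # speed of light in m/s, rounded as in the example (assumed constant name)

def lorentz_factor(beta):
    """Return 1/sqrt(1 - beta^2), where beta = v/c and 0 <= beta < 1."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    return 1.0 / math.sqrt(1.0 - beta * beta)

def relativistic_mass(rest_mass, beta):
    """Eq. (15-7): m = m0 / sqrt(1 - v^2/c^2)."""
    return lorentz_factor(beta) * rest_mass

def relativistic_momentum(rest_mass, beta):
    """Magnitude of Eq. (15-6): p = m v, with v = beta * c."""
    return relativistic_mass(rest_mass, beta) * beta * C

if __name__ == "__main__":
    m0 = 1.9e-28   # muon rest mass in kg, as given in the example
    beta = 0.999   # v/c
    print("factor   =", lorentz_factor(beta))
    print("mass     =", relativistic_mass(m0, beta), "kg")
    print("momentum =", relativistic_momentum(m0, beta), "kg m/s")
```

The momentum function uses $v = 0.999c$ rather than rounding $v$ to $c$, which is why its result lies a little below the product formed in the example.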


EXAMPLE 15-2

A high-energy particle accelerator produces a beam of protons (hydrogen atom
nuclei) that have relativistic masses $m$ which are 100 times their rest masses $m_0$.
Determine the speed of the protons in the beam.

■ From the equation $m = m_0/\sqrt{1 - v^2/c^2}$, you have

$$\frac{m}{m_0} = \frac{1}{\sqrt{1 - v^2/c^2}}$$

Since the value of $m/m_0$ is given to be 100, you know that
$1/\sqrt{1 - v^2/c^2} = 1.00 \times 10^{2}$, or

$$\sqrt{1 - v^2/c^2} = 1.00 \times 10^{-2}$$

You evaluate $v^2/c^2$ by carrying out the following sequence of calculations:

$$1 - v^2/c^2 = 1.00 \times 10^{-4}$$

$$v^2/c^2 = 1 - 1.00 \times 10^{-4}$$

$$v/c = (1 - 1.00 \times 10^{-4})^{1/2}$$

The binomial theorem approximation gives you

$$v/c \approx 1 - \tfrac{1}{2} \times 1.00 \times 10^{-4} = 1 - 5.0 \times 10^{-5} = 0.999950$$

Thus you have

$$v = 0.999950c$$
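As a check on Example 15-2, the following Python sketch inverts Eq. (15-7) exactly and compares the result with the binomial approximation used above. The function names are hypothetical and chosen only for this illustration.

```python
import math

def beta_from_mass_ratio(ratio):
    """Exact v/c for a given m/m0, obtained by solving Eq. (15-7) for v/c."""
    if ratio < 1.0:
        raise ValueError("m/m0 cannot be smaller than 1")
    return math.sqrt(1.0 - 1.0 / ratio ** 2)

def beta_binomial(ratio):
    """Binomial approximation v/c = 1 - (1/2)(m0/m)^2, good when m/m0 is large."""
    return 1.0 - 0.5 / ratio ** 2

if __name__ == "__main__":
    exact = beta_from_mass_ratio(100.0)
    approx = beta_binomial(100.0)
    print("exact v/c    =", exact)
    print("binomial v/c =", approx)
    print("difference   =", approx - exact)
```

For $m/m_0 = 100$ the two values agree to about one part in $10^9$, so the approximation is more than adequate for the six figures quoted in the example.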


15-3  RELATIVISTIC FORCE AND ENERGY

Now that we have obtained an expression for momentum which satisfies
the requirements of relativistic mechanics, we can use it to find a
satisfactory expression for force. The procedure is exactly the same as in
newtonian mechanics: force is obtained from momentum by definition. The
relativistic force $\mathbf{F}$ acting on a particle is defined to be the rate of change
of its relativistic momentum $\mathbf{p}$. That is

$$\mathbf{F} = \frac{d\mathbf{p}}{dt}$$  (15-8)




This definition of force has precisely the mathematical form of the second
law of motion of newtonian mechanics. Thus the fundamental relation
between force and motion is the same in both newtonian and relativistic
mechanics, providing Newton's second law is written in terms of the
fundamental mechanical quantity momentum, and not acceleration.

In newtonian mechanics the mass $m_0$ of a body on which a force acts is
almost always a constant (for an exception, see Example 4-8). Thus it is
almost always true in newtonian mechanics that

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} = \frac{d(m_0\mathbf{v})}{dt} = m_0\frac{d\mathbf{v}}{dt}$$

or

$$\mathbf{F} = m_0\mathbf{a}$$


The equation $\mathbf{F} = m_0\mathbf{a}$ is (with few exceptions) completely equivalent to the
equation $\mathbf{F} = d\mathbf{p}/dt$ in newtonian mechanics. But it is not generally possible
to write an equation analogous to $\mathbf{F} = m_0\mathbf{a}$ in relativistic mechanics, not even
if the relativistic mass $m$ is substituted for the rest mass $m_0$. The error in
making this common mistake can be seen by repeating the calculation
above using the variable relativistic mass $m$. It yields

$$\mathbf{F} = \frac{d(m\mathbf{v})}{dt} = m\frac{d\mathbf{v}}{dt} + \mathbf{v}\frac{dm}{dt}$$  (15-9a)

or

$$\mathbf{F} = m\mathbf{a} + \mathbf{v}\frac{dm}{dt}$$  (15-9b)

In relativistic mechanics the mass $m$ of a body on which a force acts is
usually not constant, since the force will typically lead to a change in the speed
of the body and therefore to a change in its relativistic mass. This means
the term $\mathbf{v}\,dm/dt$ is generally not zero. Thus, in general, $\mathbf{F} \neq m\mathbf{a}$ for
relativistic mechanics.
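The size of the term $v\,dm/dt$ can be illustrated numerically. The Python sketch below assumes one-dimensional motion with the force along the velocity, works in units where $c = 1$ and $m_0 = 1$, and estimates $dp/dv$ by a central difference; the function names and the step size `dv` are choices made for this illustration only.

```python
import math

def momentum(rest_mass, v, c=1.0):
    """Relativistic momentum p = m0 v / sqrt(1 - v^2/c^2), from Eqs. (15-6) and (15-7)."""
    return rest_mass * v / math.sqrt(1.0 - (v / c) ** 2)

def force_from_momentum(rest_mass, v, a, c=1.0, dv=1e-6):
    """F = dp/dt = (dp/dv)(dv/dt), with dp/dv estimated by a central difference."""
    dp_dv = (momentum(rest_mass, v + dv, c) - momentum(rest_mass, v - dv, c)) / (2.0 * dv)
    return dp_dv * a

def mass_times_acceleration(rest_mass, v, a, c=1.0):
    """The product m a, where m is the relativistic mass of Eq. (15-7)."""
    return rest_mass / math.sqrt(1.0 - (v / c) ** 2) * a

if __name__ == "__main__":
    # Unit rest mass and unit acceleration; v is expressed as a fraction of c.
    for v in (0.001, 0.3, 0.6, 0.9):
        f = force_from_momentum(1.0, v, 1.0)
        ma = mass_times_acceleration(1.0, v, 1.0)
        print(f"v/c = {v:5.3f}   F = {f:8.4f}   ma = {ma:8.4f}   v dm/dt = {f - ma:8.4f}")
```

At small $v/c$ the two quantities are nearly equal, as newtonian mechanics requires, while at larger speeds the difference $F - ma$, which is the term $v\,dm/dt$ of Eq. (15-9b), becomes comparable to $ma$ itself.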


An exception is found in the special case of a body moving at a constant
relativistic speed in a circular path under the application of a centripetal force. The
electrons in Fig. 15-3, for example, are moving through circular arcs in a magnetic
field. In such circumstances $m$ = constant, $\mathbf{v}\,dm/dt = 0$, and Eq. (15-9b)
simplifies to $\mathbf{F} = m\mathbf{a}$. This equation was used by Alfred H. Bucherer in 1909 to
analyze an experiment that provided the first quantitative confirmation of the
relativistic prediction $m = m_0/\sqrt{1 - v^2/c^2}$.

Bucherer sent a beam of electrons from a radioactive source through a speed
selector consisting of a region having both “electric fields” and “magnetic fields”
in an arrangement that is explained in Chap. 23. For a particular adjustment of the
fields, only electrons of a certain speed emerged from the selector. Those that did
entered a region containing only a magnetic field, where they moved through a
circular arc at constant speed. A measurement of the radius of curvature of the
arc, combined with the known speed of the electrons, determined their
centripetal acceleration $a$. The centripetal force $F$ acting on them could be determined
from the measured strength of the magnetic field and the electron speed, in a way
that also is explained in Chap. 23. Since the simplified form, $F = ma$, of Eq. (15-9b)
applied to the motion of the electrons, the measured values of $F$ and $a$ could be
used to measure the value of their mass $m$. Bucherer obtained data for several
values of the electron speed $v$. His results are shown by the crosses in Fig. 15-6. The
dots in the figure show some results obtained by others more recently. The solid
curve represents the relativistic prediction for the mass, and the dashed curve
gives the newtonian prediction. Agreement between the relativistic theory and
experiment is certainly satisfactory.

Fig. 15-6 Predicted and measured values of the mass of high-speed electrons. The solid
curve is the prediction of relativistic mechanics for the ratio of the mass $m$ of an
electron to its rest mass $m_0$, versus the ratio of its speed $v$ to the speed of light $c$.
The dashed curve shows the prediction of newtonian mechanics. The crosses are values
measured by Bucherer, and the dots are values obtained in more recent measurements.

The complications that typically arise from the presence of the term
$\mathbf{v}\,dm/dt$ in Eq. (15-9b) make force a somewhat less useful concept in
relativistic mechanics than in newtonian mechanics. But this is not a serious
disadvantage because relativistic mechanics is most frequently used in
connection with systems of microscopic size. At very small distances, forces are
much more difficult to determine experimentally than energies. So even
more emphasis is put on the advantages of using energy considerations
with microscopic systems than with the macroscopic systems treated in
newtonian mechanics, a fortunate match with the features of relativistic
mechanics. Of the not-so-frequent uses to which force is put in relativistic
mechanics, by far the most important is found in the following derivation
of the relativistic expression for kinetic energy.


Fig. 15-7 A particle, initially at rest at $x_i$, which moves to $x_f$ as a result
of the application of a force $F$.


Imagine a particle of rest mass $m_0$ that is initially stationary at $x_i$. A
force of magnitude $F$ is applied to it in the direction of the positive $x$ axis, as
in Fig. 15-7. The force causes the particle to move with increasing speed
along the axis. To evaluate the kinetic energy which the particle acquires,
we calculate the work $W$ done by the force while the particle moves to some
final location $x_f$ and then equate it to the particle's kinetic energy at $x_f$. The
method is just the same as that used in Sec. 7-3 to evaluate kinetic energy in
newtonian mechanics, but the calculation is more complicated because in
the present case the mass of the particle increases as its speed increases.

First we carry over into relativistic mechanics the definition of the work
$W$ used in newtonian mechanics:

$$W = \int_{x_i}^{x_f} F\,dx$$  (15-10)


Then we begin a sequence of manipulations which will lead to evaluation of
the integral. As applied to this one-dimensional situation, the equations
$F = dp/dt$ and $p = mv$ produce Eq. (15-9a) in the form

$$F = m\frac{dv}{dt} + v\frac{dm}{dt}$$

Hence

$$F\,dx = \left(m\frac{dv}{dt} + v\frac{dm}{dt}\right)dx$$

The variables $m$ and $v$ in the term in parentheses are related by the
equation $m = m_0/\sqrt{1 - v^2/c^2}$. Multiplying through by the square root and then
squaring both sides of the equality, we get

$$m^2(1 - v^2/c^2) = m_0^2$$

Multiplying through by $c^2$ converts this equation to

$$m^2c^2 - m^2v^2 = m_0^2c^2$$

We take the time derivative of each term, remembering that $m_0$ is a constant
as well as $c$, and obtain

$$c^2\frac{d(m^2)}{dt} - \frac{d(m^2v^2)}{dt} = 0$$

Differentiation gives

$$2c^2m\frac{dm}{dt} - 2m^2v\frac{dv}{dt} - 2v^2m\frac{dm}{dt} = 0$$

Dividing through by $-2mv$ and transposing the first term produce

$$m\frac{dv}{dt} + v\frac{dm}{dt} = c^2\frac{dm}{dt}\frac{1}{v}$$

Using this relation in the expression for $F\,dx$, we have

$$F\,dx = c^2\frac{dm}{dt}\frac{1}{v}\,dx$$

Since $v = dx/dt$, it follows that $1/v = dt/dx$, so that

$$F\,dx = c^2\frac{dm}{dt}\frac{dt}{dx}\,dx$$

This simplifies to

$$F\,dx = c^2\,dm$$  (15-11)


Now we use Eq. (15-11) in Eq. (15-10) to write the work done as

$$W = \int_{x_i}^{x_f} F\,dx = \int_{m_i}^{m_f} c^2\,dm = c^2\int_{m_i}^{m_f} dm$$

Here $m_i$ and $m_f$ are the masses of the particle when it is at $x_i$ and $x_f$,
respectively. The fundamental theorem of calculus allows us to evaluate the
integral on the right side of the last equality immediately, to yield

$$W = c^2(m_f - m_i)$$

But $m_i$ is the rest mass $m_0$ because the particle was stationary at $x_i$. And at $x_f$
its mass $m_f$ is the relativistic mass $m$. So we have the result

$$W = mc^2 - m_0c^2$$  (15-12)

The work done by the force in building up the speed of the particle, and
thereby increasing its mass, equals its final relativistic mass times $c^2$ minus
its initial rest mass times $c^2$.
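The result of Eq. (15-12) can be tested by carrying out the work integral numerically. The Python sketch below is an illustration under stated assumptions: it uses units where $c = 1$ and $m_0 = 1$, and it rewrites $F\,dx = (dp/dt)\,dx = v\,dp$ (since $dx = v\,dt$) so that the integral can be accumulated over small speed steps by the trapezoidal rule. The function names and the number of steps are arbitrary choices.

```python
import math

def momentum(rest_mass, v, c=1.0):
    """Relativistic momentum p = m0 v / sqrt(1 - v^2/c^2)."""
    return rest_mass * v / math.sqrt(1.0 - (v / c) ** 2)

def work_by_integration(rest_mass, v_final, c=1.0, steps=20000):
    """Accumulate W = integral of F dx = integral of v dp, starting from rest."""
    total = 0.0
    v_prev, p_prev = 0.0, 0.0
    for k in range(1, steps + 1):
        v = v_final * k / steps
        p = momentum(rest_mass, v, c)
        total += 0.5 * (v_prev + v) * (p - p_prev)  # trapezoidal slice of v dp
        v_prev, p_prev = v, p
    return total

def work_from_masses(rest_mass, v_final, c=1.0):
    """Eq. (15-12): W = m c^2 - m0 c^2."""
    m = rest_mass / math.sqrt(1.0 - (v_final / c) ** 2)
    return m * c ** 2 - rest_mass * c ** 2

if __name__ == "__main__":
    for v in (0.1, 0.5, 0.8, 0.95):
        print(v, work_by_integration(1.0, v), work_from_masses(1.0, v))
```

The two columns printed agree to many decimal places, which is what the derivation above leads one to expect.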




We now relate the work $W$ done in setting the particle into motion to
the kinetic energy $K$ that the particle has when moving. In newtonian
mechanics this is done by defining the kinetic energy to be equal to the work:

$$K = W$$

The same definition is used in relativistic mechanics. So we have for the
relativistic kinetic energy $K$ of the moving particle

$$K = mc^2 - m_0c^2$$  (15-13)

Confidence in the validity of Eq. (15-13) can be gained by using it to
evaluate the kinetic energy of a particle moving at a speed $v$ that is very
small compared to $c$. We write $m = m_0/\sqrt{1 - v^2/c^2}$, and the equation
becomes

$$K = \frac{m_0c^2}{\sqrt{1 - v^2/c^2}} - m_0c^2$$  (15-14a)

or

$$K = m_0c^2\left[(1 - v^2/c^2)^{-1/2} - 1\right]$$

For $v/c$ very small compared to 1, we can apply with accuracy the binomial
expansion approximation

$$(1 - v^2/c^2)^{-1/2} \approx 1 + \tfrac{1}{2}v^2/c^2$$

From this we obtain

$$K \approx m_0c^2\left[1 + \tfrac{1}{2}v^2/c^2 - 1\right] = \tfrac{1}{2}m_0v^2 \qquad \text{for } v/c \ll 1$$  (15-14b)

So we do find that the relativistic expression for kinetic energy reduces, as
it must, to the newtonian expression when the speed of the particle is in the
newtonian domain.
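The way Eq. (15-14a) goes over into the newtonian expression can also be seen numerically. The following Python sketch, in units where $m_0 = 1$ and $c = 1$, tabulates both expressions for a few speeds; the function names are hypothetical.

```python
import math

def kinetic_relativistic(rest_mass, v, c=1.0):
    """Eq. (15-14a): K = m0 c^2 / sqrt(1 - v^2/c^2) - m0 c^2."""
    return rest_mass * c ** 2 * (1.0 / math.sqrt(1.0 - (v / c) ** 2) - 1.0)

def kinetic_newtonian(rest_mass, v):
    """Newtonian kinetic energy K = (1/2) m0 v^2, the limit in Eq. (15-14b)."""
    return 0.5 * rest_mass * v ** 2

if __name__ == "__main__":
    for v in (0.01, 0.1, 0.5, 0.9, 0.99):
        k_rel = kinetic_relativistic(1.0, v)
        k_new = kinetic_newtonian(1.0, v)
        print(f"v/c = {v:4.2f}   relativistic = {k_rel:.6f}   newtonian = {k_new:.6f}   ratio = {k_rel / k_new:.4f}")
```

The ratio is very close to 1 at $v/c = 0.01$ and grows rapidly as $v/c$ approaches 1, in line with the limiting argument above.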

A particularly straightforward experimental verification of Eq. (15-14a) for
$v/c$ comparable to 1 was published in 1964 by Bertozzi. A beam of electrons
having a controllable high speed was produced by a particle accelerator. For each
setting used, the speed $v$ was determined by measuring electronically the time
required for the electrons in the beam to travel a distance of 8.40 m between two
electron detectors. The kinetic energy of the electrons in the beam was found at
each setting by stopping the beam in a thick aluminum absorber, where the kinetic
energy was converted to heat energy. Measurements of the resulting increase in
the absorber's temperature in a given amount of time were used to determine the
total kinetic energy deposited in the absorber during that time, in a way that is
explained in Chap. 17. Separate measurements of the electric charge collected by the
absorber gave the number of electrons stopping in it, and so the kinetic energy $K$
per electron could be determined. Results are shown in Fig. 15-8, which is a plot
of $K/m_0c^2$ versus $v/c$. Experimental values are represented by the two dots. The
solid curve is the prediction of relativistic mechanics, and the dashed curve is the
prediction of newtonian mechanics. The experimental data agree quite well with
the relativistic prediction.

We continue the interpretation of Eq. (15-13),

$$K = mc^2 - m_0c^2$$

Since the quantity $K$ in this equation is an energy, the quantities $mc^2$ and
$m_0c^2$ must also be energies. Specifically, $mc^2$ is an energy associated with the
particle when it is moving and has mass $m$, and $m_0c^2$ is an energy associated
with it when it is at rest and has mass $m_0$. To see their significance, we write
the equation as

$$mc^2 = K + m_0c^2$$

and then consider what it means physically. We come to an unavoidable
conclusion: The energy $mc^2$ is the total energy of the moving particle, since
it is the sum of the energy $K$ required to set it into motion and an energy
$m_0c^2$ it has when at rest. The energy $mc^2$ is assigned the symbol $E$ and the
name total relativistic energy. The energy $m_0c^2$ is symbolized by $E_0$ and
called the rest-mass energy.

Fig. 15-8 Predicted and measured values of the kinetic energy of high-speed
electrons. The solid curve is the prediction of relativistic mechanics for the
ratio of the kinetic energy $K$ of an electron to its rest mass energy $m_0c^2$,
versus the ratio of its speed $v$ to the speed of light $c$. The dashed curve
shows the prediction of newtonian mechanics. The dots are values measured
by Bertozzi. (Horizontal axis: $v/c$, from 0 to 1.0.)

We have obtained Einstein's famous relations between energy and
mass: The total relativistic energy of an object equals its relativistic mass times $c^2$:

$$E = mc^2$$  (15-15)

The rest-mass energy of an object equals its rest mass times $c^2$:

$$E_0 = m_0c^2$$  (15-16)

In these equations, Einstein established the fundamental connection
between energy and mass. The two mechanical quantities are inseparably
related. The energy content of an object can be measured by its mass, and vice versa,
since energy is proportional to mass. And the square of the speed of light is the
proportionality constant.
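Equations (15-13), (15-15), and (15-16) can be collected into one small Python sketch that returns the three energies of a moving particle. The function name, the tuple it returns, and the rounded value of $c$ are assumptions made for illustration; the muon rest mass is the one quoted in Example 15-1.

```python
import math

C = 3.0e8  # speed of light in m/s, rounded (assumed constant name)

def energies(rest_mass, beta, c=C):
    """Return (E0, E, K) in joules for a particle of given rest mass (kg) and beta = v/c.

    E0 = m0 c^2 is Eq. (15-16), E = m c^2 is Eq. (15-15),
    and K = E - E0 is Eq. (15-13).
    """
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    rest_energy = rest_mass * c ** 2
    total_energy = rest_energy / math.sqrt(1.0 - beta * beta)
    return rest_energy, total_energy, total_energy - rest_energy

if __name__ == "__main__":
    e0, e, k = energies(1.9e-28, 0.999)  # muon of Example 15-1
    print(f"E0 = {e0:.3e} J   E = {e:.3e} J   K = {k:.3e} J")
```

For a particle at rest the function returns $E = E_0$ and $K = 0$, which is the statement that the total energy of a stationary object is its rest-mass energy.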


Let us use the relations we have obtained to describe what a force does when
it is applied to an initially stationary particle, as when one of the electrons is
coming up to speed in the particle accelerator used by Bertozzi. It is always true
that an increment of work done by the force leads to an increment in its total
relativistic energy. You can see this from Eq. (15-11), $F\,dx = c^2\,dm$. But while
the speed $v$ of the particle has not yet become an appreciable fraction of the speed
of light $c$, its kinetic energy $K$ is small compared to its rest-mass energy $m_0c^2$. So
while the particle is still in the newtonian domain, its total relativistic energy,
which according to Eq. (15-13) is $mc^2 = K + m_0c^2$, is not significantly larger than
$m_0c^2$. Thus the fractional increase in $m$ is negligible. This is why the particle
follows the newtonian prediction $K = m_0v^2/2$, as you can see from Fig. 15-8, and
why the principal manifestation of an increment of work done by the force is an
increment in the particle's speed.

But as the force continues to do work, $v$ starts to approach $c$, and $K$ starts to
become significant in comparison to $m_0c^2$. This means the increase in $m$ becomes
important. It is as if the inertia of the particle increased because its mass increased,
so that it becomes more difficult for the applied force to increase its speed. This is
made clear in Fig. 15-8 by the way the value $c$ acts as a limit to the values of $v$. In
the highly relativistic domain where $v \approx c$, the most apparent manifestation of an
increment of work being done by the applied force is an increment in the particle's
mass, not in its speed. The work done increases $K = mc^2 - m_0c^2$ by increasing $m$,
and $v$ hardly changes.
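The saturation of the speed can be made concrete by solving $K = mc^2 - m_0c^2$ for $v/c$, which gives $v/c = \sqrt{1 - 1/(1 + K/m_0c^2)^2}$. The Python sketch below evaluates this for increasing amounts of work and compares it with the newtonian value $v/c = \sqrt{2K/m_0c^2}$; the function names are hypothetical and the kinetic energy is expressed in units of the rest-mass energy.

```python
import math

def beta_relativistic(k_over_rest_energy):
    """v/c from K/(m0 c^2), obtained by solving Eq. (15-13) with Eq. (15-7) for v/c."""
    if k_over_rest_energy < 0.0:
        raise ValueError("kinetic energy cannot be negative")
    mass_ratio = 1.0 + k_over_rest_energy  # m/m0 = 1 + K/(m0 c^2)
    return math.sqrt(1.0 - 1.0 / mass_ratio ** 2)

def beta_newtonian(k_over_rest_energy):
    """v/c from the newtonian relation K = (1/2) m0 v^2; it can exceed 1."""
    return math.sqrt(2.0 * k_over_rest_energy)

if __name__ == "__main__":
    for k in (0.001, 0.1, 1.0, 10.0, 100.0):
        print(f"K/m0c^2 = {k:7.3f}   relativistic v/c = {beta_relativistic(k):.6f}   newtonian v/c = {beta_newtonian(k):.4f}")
```

Each tenfold increase in the work done at high energy changes the relativistic $v/c$ only in its later decimal places, while the newtonian value grows without limit.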


We now have a physical picture of why the speed of light is nature's
speed limit, that is, why no particle whose rest mass $m_0$ is greater than zero can
move with a speed $v$ exactly equal to $c$ or greater than $c$. For any such particle,
$mc^2 = m_0c^2/\sqrt{1 - v^2/c^2}$ approaches infinity as $v$ approaches $c$. Thus, the
force applied to a particle would have to do an infinite amount of work to
make it attain a speed equal to the speed of light. This would consume an
infinite amount of energy, and that much energy is never available. If $v$ is
greater than $c$, then $mc^2$ is an imaginary quantity.

Example 15-3 applies Einstein's energy expressions to a cosmic ray
muon.

EXAMPLE 15-3

Calculate the relativistic kinetic energy, total relativistic energy, and rest-mass
energy of a cosmic ray muon moving at speed $0.999c$. The rest mass of a muon is
$1.9 \times 10^{-28}$ kg.

■ First you calculate the rest-mass energy

$$E_0 = m_0c^2 = 1.9 \times 10^{-28}\ \text{kg} \times (3.0 \times 10^{8}\ \text{m/s})^2 = 1.7 \times 10^{-11}\ \text{J}$$

Then use the evaluation $1/\sqrt{1 - v^2/c^2} = 22$ for $v/c = 0.999$ from Example 14-3 to
calculate the total relativistic energy

$$E = mc^2 = \frac{m_0c^2}{\sqrt{1 - v^2/c^2}} = 22m_0c^2 = 22 \times 1.7 \times 10^{-11}\ \text{J} = 3.8 \times 10^{-10}\ \text{J}$$

You can now obtain the relativistic kinetic energy from Eq. (15-13):

$$K = mc^2 - m_0c^2 = E - E_0$$

You have

$$K = 3.8 \times 10^{-10}\ \text{J} - 0.17 \times 10^{-10}\ \text{J} = 3.6 \times 10^{-10}\ \text{J}$$

The total relativistic energy of the rapidly moving muon considered in
Example 15-3 is very small by the standards of the everyday world, and its
rest-mass energy is even smaller. But a muon is a microscopic particle. For
a macroscopic body the total relativistic energy is enormous, even when
the body is stationary, because it has so much rest-mass energy. For
instance, the rest-mass energy of 1 kg of any material is
$m_0c^2 = 1\ \text{kg} \times (3 \times 10^{8}\ \text{m/s})^2 \approx 10^{17}$ J! It is fortunate, for safety's sake, that in bulk matter
almost all this energy is permanently locked in. Even in uranium about 99.9
percent of the rest-mass energy is unavailable. But approximately 0.1
percent can be extracted by incorporating uranium-235 into a nuclear reactor.
This amounts to a potentially available energy of about $10^{14}$ J in 1 kg of
uranium-235.

The word “potentially” suggests that there is a relation between
rest-mass energy and potential energy. The relation will be developed in the
next section by considering yet another thought experiment.

15-4  RELATIVISTIC ENERGY RELATIONS

Figure 15-9 shows two stationary balls of rest mass $m_{01}$ and $m_{02}$. A strong
spring is compressed between the balls, but they are held together by a
latch. For simplicity, we assume the rest mass of the spring and latch
arrangement to be negligible. The rest-mass energy stored in the two balls
is $m_{01}c^2 + m_{02}c^2$ and the potential energy stored in the spring is $U$. So the
total energy content of the system is $m_{01}c^2 + m_{02}c^2 + U$.




Fig. 15-9 A thought experiment used to investigate the relation between
rest-mass energy and potential energy. (Panels: Initial; Final.)


The latch is opened with an insignificant expenditure of energy,
allowing the spring to revert to its normal length. This drives the balls apart
at high speed, and they move away with relativistic masses $m_1$ and $m_2$. Since
the spring has lost its potential energy $U$, the total energy content of the
system is now the total relativistic energy $m_1c^2 + m_2c^2$ of the two balls.
Equation (15-13) shows that the value of this quantity is $m_{01}c^2 + m_{02}c^2 + K_1 + K_2$,
where $K_1$ and $K_2$ are the relativistic kinetic energies of the two balls.

In newtonian mechanics potential energy is defined so that the loss $U$
in the potential energy of such a system is just equal to the gain $K_1 + K_2$ in
its kinetic energy. We adopt this definition, and the resulting relation,

$$K_1 + K_2 = U$$

into relativistic mechanics. Then adding $m_{01}c^2 + m_{02}c^2$ to both sides, we
have

$$m_{01}c^2 + m_{02}c^2 + K_1 + K_2 = m_{01}c^2 + m_{02}c^2 + U$$  (15-17)

This energy conservation equation describes what happens in the isolated
system. Its left side is the final value of the energy content of the system. It
consists partly of the rest-mass energies of the two moving balls and
partly of their relativistic kinetic energies. The right side of the equation is
the initial value of the energy content. Part of this energy is contained in
the rest-mass energies of the two balls, and part of it is in the potential
energy of the spring.


Now  let  us  reconsider  what  happens  in  a less  detailed  way.  Initially 
there  is  a stationary  object  with  some  sort  of  internal  structure  that  we  do 
not  describe.  Nevertheless,  we  can  still  say  what  the  value  of  its  total  energy 
content  is  because  we  can  measure  its  mass  and  Einstein  has  shown  us  that 
the  result,  multiplied  by  c2,  gives  this  energy.  Since  the  object  is  at  rest,  we 
will  also  say  that  this  energy  content  is  its  rest-mass  energy  M0c2,  having  the 
numerical  value 

$$M_0c^2 = m_{01}c^2 + m_{02}c^2 + U \tag{15-18}$$

What we have done is to incorporate into the rest-mass energy of the
stationary object a contribution $U$, which is an energy arising from its internal
properties.
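
As a rough numerical illustration of Eq. (15-18), the following Python sketch evaluates the contribution $U/c^2$ that the spring energy makes to the rest mass of the latched object. The ball masses and the spring energy are hypothetical values chosen only for illustration; they do not come from the text.

```python
# Sketch of Eq. (15-18): M0*c^2 = m01*c^2 + m02*c^2 + U.
# The masses and spring energy below are hypothetical illustration values.

C = 3.00e8  # speed of light in m/s


def mass_equivalent(energy_joules):
    """Rest mass (kg) equivalent to an internal energy (J): U / c^2."""
    return energy_joules / C**2


m01 = 1.0   # rest mass of ball 1 in kg (assumed)
m02 = 1.0   # rest mass of ball 2 in kg (assumed)
U = 100.0   # potential energy stored in the spring in J (assumed)

# Compute the excess on its own: adding about 1e-15 kg to 2 kg in
# floating point would lose most of the significant digits.
delta_m = mass_equivalent(U)
fraction = delta_m / (m01 + m02)

print(f"M0 exceeds m01 + m02 by {delta_m:.2e} kg")
print(f"fractional excess: {fraction:.1e}")
```

For a macroscopic spring the excess is far too small to weigh, which is consistent with the remark that it is often more convenient to keep $U$ as a separate term for macroscopic objects.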

If the stationary object had any other internal energy, we could do the
same. For instance, if heated to a high temperature, the object would gain
thermal energy because of the ensuing random motion of its internal
constituents. We would add that heat energy to the rest-mass energy. If the
constituents were given an organized internal motion, they would have
kinetic energy. But we would lump this into the rest-mass energy of the
object if it is considered as a whole and if as a whole it remains at rest. Provided
that a body is at rest, overall, its total energy content is its rest-mass energy.

Of course, it is sometimes not convenient to ignore the internal
structure of a composite object, if the structure is known. For instance, it might
be more useful to continue to treat the potential energy $U$ of the spring in
the macroscopic object in Fig. 15-9 as such, and not use Eq. (15-18) to
define $M_0c^2$. If so, then Eq. (15-17) would be used to describe what happens.
But relativistic mechanics is frequently applied to microscopic particles
which have an incompletely known internal structure and which are,
therefore, best considered as a whole.




To take this point of view, we use Eq. (15-18) in the energy
conservation equation, Eq. (15-17), converting it to the form

$$m_{01}c^2 + m_{02}c^2 + K_1 + K_2 = M_0c^2 \tag{15-19}$$

This says that since the initial rest-mass energy content of the isolated
system $M_0c^2$ is greater than the final rest-mass energy content $m_{01}c^2 + m_{02}c^2$,
the system gains kinetic energy $K_1 + K_2$ in the transition to compensate for
the loss of rest-mass energy. An extension of our argument to a system
containing any number of bodies leads to the general expression of the energy
conservation law of relativistic mechanics:

$$\left[(m_0c^2)_{\text{total}} + K_{\text{total}}\right]_{\text{final}} = \left[(m_0c^2)_{\text{total}} + K_{\text{total}}\right]_{\text{initial}} \tag{15-20a}$$

For an isolated system the sum of the total rest-mass energy and total relativistic
kinetic energy is conserved, as measured in any given inertial reference frame.

Equation  (15-13)  shows  that  a completely  equivalent  form  is 

$$\left[(mc^2)_{\text{total}}\right]_{\text{final}} = \left[(mc^2)_{\text{total}}\right]_{\text{initial}} \tag{15-20b}$$

For an isolated system the total relativistic energy is conserved, as measured in any
given inertial reference frame. In either form, the single law replaces the
separate laws of conservation of mechanical energy and conservation of mass
used in newtonian mechanics. Thus the two separate principles of
conservation of energy and conservation of mass (which are, in particular, the
foundation stones of the science of chemistry) are supplanted by a single
more general principle of the conservation of mass-energy. In this view,
mass is one manifestation of energy, in much the same sense that a
compressed spring is a manifestation of energy. We have tried to make this
principle seem plausible. But its real justification is found in an abundance
of direct experimental confirmation. We present such evidence in later
sections and give a precursor later in this section in Example 15-5.


First, however, we will obtain a set of very useful relations among the
quantities that characterize the mechanical properties of a relativistic
particle. For the purpose of relating its total relativistic energy $mc^2$, its rest-mass
energy $m_0c^2$, and the magnitude $p = mv$ of its relativistic momentum, we
evaluate

$$m^2c^4 - m_0^2c^4 = \frac{m_0^2c^4}{1 - v^2/c^2} - m_0^2c^4 = m_0^2c^4\,\frac{v^2/c^2}{1 - v^2/c^2} = c^2\,\frac{m_0^2}{1 - v^2/c^2}\,v^2 = c^2m^2v^2 = c^2p^2$$

Thus

$$m^2c^4 = c^2p^2 + m_0^2c^4$$

and we obtain the result

$$(mc^2)^2 = (cp)^2 + (m_0c^2)^2 \tag{15-21a}$$

Equation (15-13) gives the relation between $mc^2$ and $m_0c^2$ involving the
relativistic kinetic energy $K$, instead of $p$. Let us write it again in the form

$$mc^2 = K + m_0c^2 \tag{15-21b}$$




Fig. 15-10 A right triangle and circular
arc which form a figure useful in
remembering  Eqs.  (15-21).  You  should 
apply  simple  trigonometry  to  show  that 
the  relations  between  the  lengths 
marked  in  the  figure  are  in  agreement 
with  these  equations. 


Taking the square root of both sides of Eq. (15-21a) and substituting into
Eq. (15-21b), we find immediately that the relation between $K$ and $p$ is

$$K = \sqrt{(cp)^2 + (m_0c^2)^2} - m_0c^2 \tag{15-21c}$$

A convenient way of remembering all three of Eqs. (15-21) is provided
by the geometrical construction in Fig. 15-10. Justify to yourself that the
construction correctly describes the three equations. Example 15-4 makes use
of the third.
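
The three relations in Eqs. (15-21) can also be checked numerically. The short Python sketch below is only an illustration: the function names are our own, and the energies are in arbitrary units, chosen to form a 3-4-5 right triangle like the one suggested by Fig. 15-10.

```python
import math


def total_energy(cp, rest_energy):
    """Eq. (15-21a): m*c^2 = sqrt((cp)^2 + (m0*c^2)^2)."""
    return math.hypot(cp, rest_energy)


def kinetic_energy(cp, rest_energy):
    """Eq. (15-21c): K = sqrt((cp)^2 + (m0*c^2)^2) - m0*c^2."""
    return total_energy(cp, rest_energy) - rest_energy


# Arbitrary illustrative units: legs of the right triangle are cp and m0*c^2.
cp = 3.0
rest_energy = 4.0

E = total_energy(cp, rest_energy)    # hypotenuse, m*c^2
K = kinetic_energy(cp, rest_energy)  # hypotenuse minus the m0*c^2 leg

print(E, K)  # prints 5.0 1.0
```

The hypotenuse is the total relativistic energy, and Eq. (15-21b) says that it exceeds the leg $m_0c^2$ by exactly $K$.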


EXAMPLE  15-4 

a. An electron has momentum of magnitude $p = 5.00 \times 10^{-22}$ kg·m/s.
Evaluate its relativistic kinetic energy $K$. The rest mass of an electron is
$m_0 = 9.11 \times 10^{-31}$ kg.

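■ First you should calculate

$$cp = 3.00 \times 10^{8}\ \text{m/s} \times 5.00 \times 10^{-22}\ \text{kg}\cdot\text{m/s} = 1.50 \times 10^{-13}\ \text{kg}\cdot\text{m}^2/\text{s}^2 = 1.50 \times 10^{-13}\ \text{J}$$

and

$$m_0c^2 = 9.11 \times 10^{-31}\ \text{kg} \times (3.00 \times 10^{8}\ \text{m/s})^2 = 8.20 \times 10^{-14}\ \text{kg}\cdot\text{m}^2/\text{s}^2 = 8.20 \times 10^{-14}\ \text{J}$$

Then you substitute these quantities into Eq. (15-21c) and evaluate

$$K = \sqrt{(15.0 \times 10^{-14}\ \text{J})^2 + (8.20 \times 10^{-14}\ \text{J})^2} - 8.20 \times 10^{-14}\ \text{J}$$
$$= 17.1 \times 10^{-14}\ \text{J} - 8.20 \times 10^{-14}\ \text{J}$$
$$= 8.9 \times 10^{-14}\ \text{J}$$ ■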

b.  For  the  sake  of  comparison,  use  newtonian  mechanics  to  evaluate  the  kinetic 
energy  of  the  electron. 

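■ In newtonian mechanics at all speeds $m = m_0$, $p = m_0v$, and

$$K = \frac{m_0v^2}{2} = \frac{m_0^2v^2}{2m_0} = \frac{p^2}{2m_0}$$

Setting $p = 5.00 \times 10^{-22}$ kg·m/s and $m_0 = 9.11 \times 10^{-31}$ kg, you have

$$K = \frac{(5.00 \times 10^{-22}\ \text{kg}\cdot\text{m/s})^2}{2 \times 9.11 \times 10^{-31}\ \text{kg}} = 1.37 \times 10^{-13}\ \text{kg}\cdot\text{m}^2/\text{s}^2 = 13.7 \times 10^{-14}\ \text{J}$$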




This value, obtained by applying newtonian mechanics to the electron, is
considerably larger than the value found in part a by applying relativistic mechanics. It is not
correct because newtonian mechanics is not applicable to an electron whose kinetic
energy is comparable to its rest-mass energy, since such an electron moves at a
speed comparable to the speed of light. Can you give a qualitative explanation of
why newtonian mechanics overestimates the value of the electron’s kinetic energy?
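
The arithmetic of Example 15-4 can be reproduced with a few lines of Python. This is only a numerical check of the values quoted above; the last two quantities, the relativistic ratio $v/c = cp/mc^2$ (which follows from $p = mv$) and the speed $p/m_0$ that newtonian mechanics would assign, are additions that are not part of the example.

```python
import math

C = 3.00e8     # speed of light in m/s (value used in the example)
M0 = 9.11e-31  # electron rest mass in kg
P = 5.00e-22   # electron momentum in kg*m/s

cp = C * P                 # 1.50e-13 J
rest_energy = M0 * C**2    # 8.20e-14 J

# Part a: relativistic kinetic energy, Eq. (15-21c).
total = math.sqrt(cp**2 + rest_energy**2)
k_rel = total - rest_energy

# Part b: newtonian kinetic energy, K = p^2 / (2*m0).
k_newt = P**2 / (2 * M0)

# Extra checks (not in the example): the speed implied by each theory.
beta = cp / total   # v/c, since p = m*v and the total energy is m*c^2
v_newt = P / M0     # newtonian speed for the same momentum

print(f"K (relativistic) = {k_rel:.2e} J")
print(f"K (newtonian)    = {k_newt:.2e} J")
print(f"v/c = {beta:.3f}; newtonian v/c = {v_newt / C:.2f}")
```

Comparing the two speeds may help with the qualitative question posed at the end of the example.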


The relation between the relativistic kinetic energy and momentum of
a particle has a simple limiting behavior at both ends of the range of
possible values of the particle’s speed $v$. For $v/c$ approaching 0, Eq. (15-14b)
shows that $K$ reduces to the expression $K = m_0v^2/2$. Also, $p$ reduces to
$p = m_0v$ in this limit where $m$ becomes indistinguishable from $m_0$. We can
combine these two expressions, just as in Example 15-4b, to obtain $K = p^2/2m_0$.
Multiplying and dividing the right side of this expression by $c^2$, we have

$$K = \frac{(cp)^2}{2m_0c^2} \quad \text{for } cp \ll m_0c^2 \tag{15-22}$$

The restriction is expressed in terms of quantities appearing in the
equation. It is equivalent to $v/c \ll 1$. Why?

For $v/c$ approaching 1, the quantity $cp$ in Eq. (15-21c) increases very
rapidly because $p = mv$ and $m$ becomes ever larger. On the other hand, the
quantity $m_0c^2$ is a constant. So in this limit $cp$ becomes very large compared
to $m_0c^2$, and Eq. (15-21c) reduces to the expression

$$K = cp \quad \text{for } cp \gg m_0c^2 \tag{15-23}$$

In the region between these two limiting behaviors, Eq. (15-21c) must
be used to connect $K$ to $cp$. The overall relation, for different values of
particle rest mass $m_0$, is pictured qualitatively in Fig. 15-11. What would a plot
of $K$ versus $cp$ look like if $K$ is evaluated for all $cp$ by using the newtonian
relation $K = (cp)^2/2m_0c^2$? How do the features of the correct relativistic
relation between $K$ and $cp$ compare with those of the incorrect newtonian
relation?
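
The two limiting forms can be compared with the exact relation numerically. The Python sketch below is an illustration only; it measures all energies in units of $m_0c^2$, so that the single variable `x` stands for $cp/m_0c^2$, and the sample values of `x` are arbitrary.

```python
import math


def k_exact(x):
    """K / (m0*c^2) from Eq. (15-21c), with x = cp / (m0*c^2)."""
    return math.sqrt(x**2 + 1.0) - 1.0


def k_low(x):
    """Eq. (15-22) in the same units: K / (m0*c^2) = x^2 / 2."""
    return x**2 / 2.0


def k_high(x):
    """Eq. (15-23) in the same units: K / (m0*c^2) = x."""
    return x


print(f"{'cp/m0c2':>8} {'exact':>12} {'(15-22)':>12} {'(15-23)':>12}")
for x in (0.01, 0.1, 1.0, 10.0, 100.0):
    print(f"{x:>8g} {k_exact(x):>12.5g} {k_low(x):>12.5g} {k_high(x):>12.5g}")
```

The table shows Eq. (15-22) tracking the exact values at small $cp$ and Eq. (15-23) tracking them at large $cp$, while the newtonian form, if used for all $cp$, keeps growing as the square of $cp$.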

A particularly  simple,  although  almost  unique,  case  of  Eqs.  (15-21) 
is  found  in  a very  interesting  particle  called  the  neutrino.  Several  accurate, 
but  indirect,  experiments  show  that  the  rest  mass  of  a neutrino  is  zero! 
For  the  neutrino  m0  = 0,  and  Eqs.  (15-21)  reduce  to 

$$mc^2 = K = cp \quad \text{for } m_0 = 0 \tag{15-24}$$


Fig. 15-11 The relation between a particle’s kinetic
energy $K$ and the speed of light $c$ times its momentum $p$.
The form of the relation is indicated schematically for
three different values of the particle’s rest mass $m_0$.




Other experiments show that, even though they have no rest mass,
neutrinos do carry relativistic kinetic energy $K$ and momentum $p$ (in amounts
that vary depending on the circumstances). Therefore neutrinos do have
relativistic mass. With this in mind, the equation

$$m = \frac{m_0}{\sqrt{1 - v^2/c^2}}$$


requires  that  for  a neutrino  v = c in  all  circumstances.  The  denominator  must 
always  be  zero  since  the  numerator  is  always  zero.  Otherwise  m,  and  also  K 
and  p,  would  always  be  zero.  A neutrino’s  energy  is  entirely  kinetic  because 
it has no rest-mass energy. Furthermore, its momentum is directly
proportional to its energy. A neutrino is a completely relativistic particle that is
always  in  motion  and  moving  with  a speed  that  is  always  measured  to  have 
the  value  c.  This  is  true,  independent  of  the  state  of  motion  of  the  observer. 


For a particle of zero rest mass, the concept of rest mass becomes more
abstract. It is not possible to define zero rest mass operationally in terms of mass
measurements made when the particle is moving relative to the observer at a
speed small compared to the speed of light $c$. This cannot be done since a neutrino
always moves at speed $c$. Instead, an observer can measure the particle’s
momentum $p$ and kinetic energy $K$. If the results show that $K = cp$, then the particle
is said to have zero rest mass.

Since neutrinos, having zero rest mass, move at the speed $c$ under all
circumstances, they could be used, just as well as light, to synchronize Einstein’s
clocks.  This  would  not  be  a practical  thing  to  do  because  it  is  much  more  difficult 
to  detect  neutrinos  than  to  detect  light.  The  reason  is  that  neutrinos  interact  much 
more  weakly  with  matter  than  light  does.  (Although  neutrinos  move  at  the  same 
speed  as  light,  it  does  not  follow  that  their  other  properties  must  be  related  to 
those  of  light.)  Nevertheless,  the  point  being  made  is  a very  significant  one.  The 
speed  c is  not  just  the  universal  speed  of  light.  Rather,  c is  the  universal  limiting 
speed  in  nature.  This  speed  limit  is  reached  by  neutrinos  because  they  have  zero 
rest  mass. 

Another  particle  having  zero  rest  mass  is  called  the  graviton.  The  existence  of 
gravitons  is  predicted  reliably  by  the  theory  of  gravity.  But  their  interaction  with 
matter  is  so  extremely  weak  that  at  the  time  of  this  writing  they  have  not  been 
detected  experimentally.  Although  gravitons,  like  neutrinos,  have  nothing  to  do 
with  light,  they  travel  at  the  same  speed  c because  they  have  zero  rest  mass. 

There  is  one  more  type  of  zero-rest-mass  particle,  the  photon.  As  we  discuss  at 
length in Chap. 30, the existence of photons is very well established
experimentally. In contrast to neutrinos and gravitons, which have no relation to light,
photons  have  the  most  intimate  relation  to  light.  Photons  are  light,  when  light  is 
considered  from  the  viewpoint  of  the  quantum  domain.  But  they  travel  at  speed  c 
because  they  have  zero  rest  mass,  not  because  of  their  relation  to  light. 

Most  kinds  of  particles  have  nonzero  rest  mass.  So  the  typical  particle  cannot 
reach  the  limiting  speed  c.  However,  the  very  existence  of  this  limit  exerts  a 
dominant influence on the particle’s behavior when its speed becomes an
appreciable fraction of $c$. This is why $c$ plays a fundamental role in so many different
phenomena  that  have  no  relation  to  light  or  to  other  forms  of  electromagnetic 
radiation. 


The process involved in the formation of a cosmic ray muon is
analyzed in Example 15-5.


EXAMPLE  15-5 

A cosmic ray muon is formed from the decay of a particle called a pion. (See the
discussion preceding Example 14-3.) In the decay a neutrino is also formed. The




process can be expressed symbolically as

$$\pi \to \mu + \nu$$

where the symbols represent a pion, a muon, and a neutrino. Measured values of
the rest masses of these particles are $m_{0\pi} = 2.49 \times 10^{-28}$ kg,
$m_{0\mu} = 1.89 \times 10^{-28}$ kg, and $m_{0\nu} = 0$. (These are 274, 207, and 0 times
the rest mass of an electron.) Predict
the kinetic energy of the muon, as measured in a reference frame moving along
with the pion before it decays.

■ Since the pion is at rest in the stipulated reference frame, you can picture the
decay by means of the initial-final diagrams of Fig. 15-12. Initially, the isolated
system has zero total momentum. So the momentum conservation law, Eqs. (15-5a)
and (15-5b), requires it to have zero total momentum finally. Therefore, you can
conclude that the muon and neutrino are emitted “back to back,” as shown, with
momenta of equal magnitude:

$$p_\mu = p_\nu \tag{15-25}$$

Next you utilize the energy conservation law, Eqs. (15-20a) and (15-20b). It
demands that

$$m_{0\pi}c^2 = cp_\nu + \sqrt{(cp_\mu)^2 + (m_{0\mu}c^2)^2} \tag{15-26}$$

The term on the left side of this equation is the pion rest-mass energy, which is the
initial total relativistic energy of the system. The system’s final total relativistic
energy is on the right side. In agreement with Eq. (15-24), the first term is the total
relativistic energy of the zero-rest-mass neutrino. The second term is the total
relativistic energy of the muon, evaluated by taking the square root of both sides of
Eq. (15-21a).

Substituting Eq. (15-25) into Eq. (15-26) gives you

$$m_{0\pi}c^2 = cp_\mu + \sqrt{(cp_\mu)^2 + (m_{0\mu}c^2)^2} \tag{15-27}$$


Fig. 15-12 The decay of a pion at rest into a muon and a neutrino.
(Panels: Initial, Final.)


Since $m_{0\pi}c^2$ and $m_{0\mu}c^2$ are known, you now have one equation in one
unknown, $cp_\mu$. Transposing and squaring produce

$$m_{0\pi}c^2 - cp_\mu = \sqrt{(cp_\mu)^2 + (m_{0\mu}c^2)^2}$$

and

$$(m_{0\pi}c^2)^2 - 2m_{0\pi}c^2\,cp_\mu + (cp_\mu)^2 = (cp_\mu)^2 + (m_{0\mu}c^2)^2$$

or

$$cp_\mu = \frac{(m_{0\pi}c^2)^2 - (m_{0\mu}c^2)^2}{2m_{0\pi}c^2}$$

To determine the corresponding value of $K_\mu$ you can use Eq. (15-21c) to write
Eq. (15-27) as

$$m_{0\pi}c^2 = cp_\mu + K_\mu + m_{0\mu}c^2$$

or

$$K_\mu = m_{0\pi}c^2 - m_{0\mu}c^2 - cp_\mu$$

Using the expression just obtained for $cp_\mu$, you write

$$K_\mu = m_{0\pi}c^2 - m_{0\mu}c^2 - \frac{(m_{0\pi}c^2)^2 - (m_{0\mu}c^2)^2}{2m_{0\pi}c^2}$$



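Then you can group the terms as follows:

$$K_\mu = \frac{2(m_{0\pi}c^2)^2 - 2m_{0\pi}c^2\,m_{0\mu}c^2 - (m_{0\pi}c^2)^2 + (m_{0\mu}c^2)^2}{2m_{0\pi}c^2}$$

$$= \frac{(m_{0\pi}c^2)^2 - 2m_{0\pi}c^2\,m_{0\mu}c^2 + (m_{0\mu}c^2)^2}{2m_{0\pi}c^2}$$

$$= \frac{(m_{0\pi}c^2 - m_{0\mu}c^2)^2}{2m_{0\pi}c^2}$$

or

$$K_\mu = \frac{(m_{0\pi} - m_{0\mu})^2c^2}{2m_{0\pi}}$$

The numerical value of the kinetic energy of the muon is

$$K_\mu = \frac{(2.49 \times 10^{-28}\ \text{kg} - 1.89 \times 10^{-28}\ \text{kg})^2 \times (3.00 \times 10^{8}\ \text{m/s})^2}{2 \times 2.49 \times 10^{-28}\ \text{kg}} = 6.51 \times 10^{-13}\ \text{J}$$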

This  value  is  in  excellent  agreement  with  experiments  that  measure  the  kinetic 
energy  of  the  muon  in  a reference  frame  in  which  the  pion  is  at  rest  when  it  decays. 
In  these  experiments  the  pions  are  produced  in  collisions  between  protons,  emitted 
from a high-energy particle accelerator, and nuclei in a target. The pions are
immediately stopped by an absorber, so that they decay while at rest.

Can  you  predict  the  kinetic  energy  of  the  neutrino? 
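
The algebra of Example 15-5 is easy to check numerically. The Python sketch below uses only the rest masses and the value of $c$ quoted in the example; the variable names are our own. It evaluates $cp_\mu$ and $K_\mu$ from the expressions derived above and then confirms that the final total relativistic energy on the right side of Eq. (15-27) equals the pion rest-mass energy.

```python
import math

C = 3.00e8       # speed of light in m/s (value used in the example)
M_PI = 2.49e-28  # pion rest mass in kg
M_MU = 1.89e-28  # muon rest mass in kg

e_pi = M_PI * C**2  # pion rest-mass energy in J
e_mu = M_MU * C**2  # muon rest-mass energy in J

# cp for the muon, from transposing and squaring Eq. (15-27).
cp_mu = (e_pi**2 - e_mu**2) / (2 * e_pi)

# Muon kinetic energy, two equivalent ways.
k_mu = e_pi - e_mu - cp_mu
k_mu_closed = (M_PI - M_MU) ** 2 * C**2 / (2 * M_PI)

# Right side of Eq. (15-27): neutrino energy plus muon total energy.
final_energy = cp_mu + math.hypot(cp_mu, e_mu)

print(f"cp_mu = {cp_mu:.3e} J")
print(f"K_mu  = {k_mu:.3e} J (closed form {k_mu_closed:.3e} J)")
print(f"initial {e_pi:.4e} J, final {final_energy:.4e} J")
```

Because Eq. (15-25) gives the neutrino the same momentum magnitude as the muon, the quantity `cp_mu` is also the starting point for the question about the neutrino.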


Several  aspects  of  Example  15-5  are  worthy  of  comment.  First,  the  solution  of 
the  equations  arising  in  Example  15-5  might  seem  complicated,  but  it  is  actually 
simple  compared  to  what  is  often  involved  in  applying  relativistic  mechanics  to  a 
system  in  which  two  or  more  particles  interact.  The  calculation  was  made  easier 
by  the  zero  initial  momentum  of  the  system  and  by  the  zero  rest  mass  of  one  of  the 
particles  in  the  final  system.  In  more  typical  cases,  the  algebraically  complicated 
relations  between  momentum  and  energy  that  characterize  relativistic  mechanics 
can  lead  to  equations  which  are  quite  difficult  to  handle. 

Second,  note  should  be  taken  of  the  strong  analogy  between  the  decay  of  a 
pion  into  a muon  and  a neutrino,  pictured  in  Fig.  15-12,  and  the  process,  pictured 
in  Fig.  15-9,  in  which  a composite  object  decays  by  disintegrating  into  two  pieces. 
If  you  wish,  you  can  look  closely  enough  at  the  macroscopic  object  to  see  the 
spring  and  latch  arrangement  that  constitutes  its  internal  structure  before  decay. 
This  is  very  much  harder  to  do  for  a microscopic  particle.  In  fact,  the  internal 
structure of a pion is not completely understood, although it is known that what is
called the weak nuclear force (see Sec. 1-2) is involved in its “spring and latch”
arrangement. But in any case it is not necessary to know the internal structure of
whatever it is that decays, if what you are interested in doing is predicting the
energies of the decay products. All you need to know is the rest masses of the initial
and final constituents of the system.

The  third  point  to  note  is  that  in  pion  decay  there  is  an  appreciable  decrease 
in the total rest mass of the isolated system. About 24 percent of the rest mass
disappears. However, there is also a significant increase of the kinetic energy of the
system. In describing such a process, it is often said that mass has been converted
into energy. But it is more accurate to say that rest-mass energy has been
converted into kinetic energy. The agreement between the measured and predicted
values  of  the  kinetic  energy  gain  confirms  the  validity  of  the  relativistic  energy 
conservation  law  on  which  the  prediction  is  based. 




15-5 ENERGY AND REST MASS IN CHEMICAL AND NUCLEAR REACTIONS


Certainly the most important practical example of the conversion of
rest-mass energy into kinetic energy is found in nuclear fission. In this section
we discuss this nuclear reaction, ending up by evaluating the quite
appreciable change in rest mass that occurs in the fission of a uranium-235
nucleus. We also consider a simple chemical reaction so that we can compare it
to the nuclear reaction with regard to energy and rest-mass change. We
start with the chemical reaction because Sec. 8-4 has already presented an
essential feature of the explanation of what happens during that reaction.

The  chemical  reaction  of  interest  is 


$$\text{Na} + \text{Cl} \to \text{NaCl}$$


A sodium atom (Na) combines with a chlorine atom (Cl) to form a molecule
of sodium chloride (NaCl). The process can be described most conveniently
in terms of Fig. 15-13. This is a quantitative version of Fig. 8-18 that plots
the potential energy $U$ of the system as a function of the center-to-center
separation $r$ of the atoms Na and Cl, or of the ions Na⁺ and Cl⁻, that form
the molecule. Consider the two electrically neutral atoms Na and Cl widely
separated but approaching each other. When $r$ decreases to the
dissociation separation $r_d$, the atoms begin to overlap significantly. At this point an
atomic electron will jump from Na to Cl, so that the two constituents of the
system become the positive ion Na⁺ (Na with a negatively charged electron
missing) and the negative ion Cl⁻ (Cl with an electron added). As $r$
continues to decrease, the system remains in the form of the two ions that
together constitute the molecule.

To understand why the mutual ionization occurs, you should consider
the fact that Na has 11 electrons and Cl has 17. When Na loses an electron,
Na becomes the ion Na⁺, which has the same number, 10, of electrons as
the noble gas atom neon. At the same time Cl gains an electron to become
Cl⁻, having 18 electrons just as the noble gas atom argon does. The
particular stability of these two noble gases arises from the energetically
favorable arrangements of “closed shells” formed by 10 and 18 electrons. So the
ionization energy $E_i$ absorbed in the mutual ionization has the
comparatively small value of about $2 \times 10^{-19}$ J, because it leads to favorable electron
arrangements in both ions. Furthermore, when the process occurs, energy
becomes available as a result of the electric attraction between the
oppositely charged ions. As you probably know (and will study in detail in Chap.
20), bodies of opposite electric charge attract one another with a force that
obeys the same $-1/r^2$ law as does the gravitational force. Thus there is a
negative electric potential energy, proportional to $-1/r$, for the ions Na⁺
and Cl⁻, just as there is a gravitational potential energy with the same sign
and $r$ dependence for a system of two gravitating masses. The electric force
is extremely strong compared to the gravitational force, so the electric
potential energy has a magnitude extremely large compared to the
gravitational potential energy. Nevertheless, it is not large enough to supply the
required energy until $r$ decreases to $r_d$.
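
The condition that fixes $r_d$ can be estimated roughly. The Python sketch below treats the two ions as point charges $\pm e$, so that the electric potential energy is $-ke^2/r$, and asks at what separation this attraction just supplies the ionization energy of about $2 \times 10^{-19}$ J quoted above. The point-charge model and the values of the Coulomb constant $k$ and the electron charge $e$ are assumptions brought in for the illustration (the force law itself is the subject of Chap. 20), and the resulting number is an estimate, not a value taken from Fig. 15-13.

```python
K_COULOMB = 8.99e9   # Coulomb constant in N*m^2/C^2 (assumed, not from this section)
E_CHARGE = 1.60e-19  # magnitude of the electron charge in C (assumed)
E_ION = 2e-19        # energy absorbed in the mutual ionization in J (from the text)


def ion_pair_energy(r):
    """Energy (J) of Na+ and Cl- at separation r (m), relative to the
    widely separated neutral atoms, in a point-charge model that
    ignores the short-range repulsion."""
    return E_ION - K_COULOMB * E_CHARGE**2 / r


# Separation at which the attraction just pays for the ionization.
r_d_estimate = K_COULOMB * E_CHARGE**2 / E_ION

print(f"estimated r_d = {r_d_estimate:.2e} m")
print(f"point-charge energy at r = 2.4e-10 m: {ion_pair_energy(2.4e-10):.2e} J")
```

In this crude model the ion-pair energy at small separations comes out lower than any real molecule could reach, because the model omits the repulsion that becomes important at short range.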

As $r$ continues to decrease below that value, the system consists of
Na⁺ and Cl⁻, and its energy $U$ becomes more negative in proportion to
$-1/r$. But when $r$ approaches the equilibrium separation $r_e$, the curve of $U$
versus $r$ departs from this behavior. It reaches a minimum at $r_e$ and then
turns up. This also arises from the energy associated with an electric force.
But in this case it is the repulsive force exerted by the positively charged
sodium nucleus on the positively charged chlorine nucleus (assisted by a
quantum-mechanical effect called the “exclusion principle”). As you
probably also know (and will study in Chap. 20), bodies of the same electric
charge repel with a force law just like that for opposite charges except that
the sign is positive. So the potential energy is positive in such a case. As $r$
becomes smaller than $r_e$, there is less and less negative charge of the
intervening electrons to “shield” the positive charge of one nucleus from that of
the other. So as the highly charged nuclei come closer, the positive energy
associated with their repulsion rapidly causes the energy $U$ of the system to
increase.

Now let us use Fig. 15-13 to describe what happens in the reaction.
The two widely separated constituents of NaCl approach each other with
an initial relative speed that is low enough for the corresponding kinetic
energy $K$ to be negligible. So the system has an initial total energy $E$ equal
to the constant value of $U$ at large $r$, defined to be $U = 0$. As $r$ decreases to a
value less than $r_d$, $U$ becomes negative and $K$ becomes positive in such a way
as to maintain the value $E = K + U = 0$. When $r$ is near $r_e$, the system
emits electromagnetic radiation that carries away an energy of about
$5.8 \times 10^{-19}$ J. This reduces the remaining total energy of the system to a value
near the binding energy $E_b = -5.8 \times 10^{-19}$ J. With this value of $E$, the
separation distance between the molecular constituents must be near the
equilibrium value $r_e = 2.4 \times 10^{-10}$ m. Thus the molecule is formed with its
constituents essentially fixed at that separation. In the environment of a typical
chemical reaction (for instance, in a solution) the energy emitted in
electromagnetic radiation is rapidly converted to thermal energy. This is the
source of the heat produced by the reaction.

The  amount  of  energy  emitted  in  the  reaction  forming  the  molecule  is 
measured  by  the  binding  energy  of  the  molecule.  Example  15-6  calculates 
the  rest-mass  change  associated  with  this  binding  energy  and  the  fractional 
rest-mass  change. 


EXAMPLE  15-6 

a.  Determine  the  difference  between  the  combined  rest  mass  of  an  Na  atom 
and  a Cl  atom  and  the  rest  mass  of  an  NaCl  molecule. 

b.  Then  compare  this  total  rest-mass  difference  to  the  initial  total  rest  mass. 
The  rest  mass  of  Na  is  very  nearly  equal  to  23  u,  where  the  atomic  mass  unit  (u)  is 




defined so that the mass of the carbon-12 atom is precisely 12 u. The numerical
value of the unit is

$$\text{u} = 1.661 \times 10^{-27}\ \text{kg} \tag{15-28}$$

There  are  two  species  (isotopes)  of  Cl.  Assume  that  the  more  prevalent  one, 
chlorine-35,  is  involved  in  the  reaction.  Its  rest  mass  is  very  close  to  35  u. 

■ a. You consider a system consisting initially of the separated and almost
stationary atoms Na and Cl, with a total energy content defined to be $E = 0$. The final
state of the system can be considered to consist of the stationary molecule NaCl of
total energy content $E = E_b = -5.8 \times 10^{-19}$ J, with the electromagnetic radiation
having escaped the system. The energy has decreased since the system is not
isolated. But the energy decrease in the system is just equal to the energy carried out of
it by the radiation, so energy would be conserved in a larger system in which the
radiation was considered to be a part. Both the initial total energy and final total
energy of the system you are considering consist of rest-mass energies only, because in
each case there is no appreciable kinetic energy. So the decrease $\Delta E$ in the total
energy $E$ means a decrease $\Delta m_0c^2$ in the total rest-mass energy $m_0c^2$, with

$$\Delta m_0c^2 = \Delta E = -5.8 \times 10^{-19}\ \text{J}$$

Thus

$$\Delta m_0 = \frac{\Delta E}{c^2} = \frac{-5.8 \times 10^{-19}\ \text{J}}{(3.0 \times 10^{8}\ \text{m/s})^2} = -6.4 \times 10^{-36}\ \text{kg}$$

You  should  repeat  the  analysis,  using  the  larger  system  that  contains  in  its  final 
state  also  the  emitted  radiation,  and  show  that  the  same  result  is  obtained. 

b.  The  initial  total  rest  mass  m0  equals  the  rest  mass  of  Na  plus  the  rest  mass  of 
Cl.  Using  the  values  given,  you  have 

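$$m_0 = 23\ \text{u} + 35\ \text{u} = 58\ \text{u} = 58 \times 1.66 \times 10^{-27}\ \text{kg} = 9.63 \times 10^{-26}\ \text{kg}$$

To compare $m_0$ with $\Delta m_0$, you should evaluate the ratio

$$\frac{\Delta m_0}{m_0} = \frac{-6.4 \times 10^{-36}\ \text{kg}}{9.63 \times 10^{-26}\ \text{kg}} = -6.7 \times 10^{-11} \simeq -10^{-10}$$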


This  is  the  fractional  decrease  in  rest  mass  during  the  chemical  reaction.  It  is 
extremely  small. 
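
The numbers in Example 15-6 can be reproduced with the short Python sketch below. It uses only the binding energy, the atomic mass unit, and the approximate atomic masses given in the example; the variable names are our own.

```python
C = 3.0e8         # speed of light in m/s (value used in the example)
U_KG = 1.661e-27  # atomic mass unit in kg, Eq. (15-28)
E_B = -5.8e-19    # binding energy of NaCl in J

# Part a: rest-mass change that accompanies the energy change.
delta_m0 = E_B / C**2

# Part b: initial total rest mass (Na is about 23 u, chlorine-35 about 35 u).
m0 = (23 + 35) * U_KG
fraction = delta_m0 / m0

print(f"delta m0 = {delta_m0:.1e} kg")
print(f"m0       = {m0:.2e} kg")
print(f"delta m0 / m0 = {fraction:.1e}")
```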


The results we have obtained from considering the formation of what
is called an ionically bonded molecule are typical of all chemical reactions.
In a covalently bonded molecule, mutual ionization does not occur, but the
$U$ versus $r$ curve looks very much like Fig. 15-13. The reason for its
behavior when $r$ is greater than $r_e$ is more difficult to explain for such a molecule,
but it still involves electric attraction between charges of one sign (electrons
shared by the molecule) and charges of the other sign (the nuclei). When $r$
is less than $r_e$, the behavior of $U$ is governed by just the same electric
repulsion between the nuclei as in ionically bonded molecules. As a consequence
of this similarity, the binding energy $E_b$ for a typical covalently bonded
diatomic molecule has about the same value as in NaCl, and the fractional
decrease in rest mass also has a value $\Delta m_0/m_0 \simeq -10^{-10}$.

Because  this  fractional  change  is  so  minute  in  any  chemical  reaction, 
the  difference  between  the  final  and  initial  rest  masses  cannot  be  observed 
by  direct  measurements  of  these  masses.  (If  this  were  not  the  case,  an 




Fig.  15-14  Four  successive  views  of  a 
fissioning  nucleus. 


experimental  chemist  probably  would  have  discovered  the  relativistic 
mass-energy  relations  long  before  they  were  discovered  by  a theoretical 
physicist.)  The  unobservably  small  change  in  rest  mass  in  any  chemical 
reaction  provides  the  practical  justification  for  the  law  of  conservation  of 
mass  in  chemical  reactions,  which  you  have  undoubtedly  come  across  if  you 
have  studied  chemistry. 

But in nuclear reactions $|\Delta m_0/m_0|$ can be larger than in chemical
reactions by a factor of around $10^7$. That is, in nuclear reactions the magnitude
of the fractional change in rest mass is $|\Delta m_0/m_0| \simeq 10^{-3}$ in favorable cases.
And in reactions in which a single microscopic particle decays, this ratio can
be even larger. For the decay of the pion it is about 1/4, as you saw in
Example 15-5. Thus in nuclear and elementary-particle physics, rest-mass
changes in reactions are much too large to be overlooked experimentally.
The large values of $|\Delta m_0/m_0|$ encountered in nuclear and elementary-particle
physics are a reflection of the interactions that take place between
the constituents of nuclei and elementary particles. These interactions are
much stronger than those involved in chemical reactions, as we explain
below for the case of nuclear fission.


Fission is a nuclear reaction that is of great significance, in many
different ways. Symbolically, it can be expressed as

$$F \rightarrow f_1 + f_2$$

Here F stands for a nucleus of an atom, like uranium, that splits by fission
into two smaller nuclei $f_1$ and $f_2$, called fission fragments. The reaction
is depicted in Fig. 15-14. It is most conveniently described in terms of Fig.
15-15, which is a plot of the potential energy U of the system as a function
of the center-to-center separation r of its two constituent parts. The behavior
of U as r varies can be explained best by imagining that there are two
positively charged atomic nuclei $f_1$ and $f_2$, each containing about half the
rest mass and half the positive electric charge of the nucleus of a uranium
atom. They are widely separated but approaching each other. Since each
exerts a repulsive electric force on the other proportional to $+1/r^2$, there
is a positive potential energy U in the system which increases, as r decreases,
in proportion to $+1/r$.

Fig. 15-15 The potential energy U as a function of the center-to-center
separation r of the fission fragments in the nuclear reaction leading to the
fission of uranium.

When r decreases to about $1 \times 10^{-14}$ m, the nuclei $f_1$ and $f_2$
begin overlapping. At this point, but not before, they begin to attract each
other as a result of the strong nuclear force (see Sec. 1-2) that begins to act
between them. Like the weak nuclear force involved in pion decay, the strong
nuclear force does not act over large distances. Both varieties of nuclear force
cut off abruptly when objects between which the forces act cease being
essentially in contact. They are said to be of short range, in contrast to the
long-range electric and gravitational forces whose effects on objects
interacting through them diminish only gradually as the objects are separated.
The strong nuclear force is so named because it is somewhat stronger
than the electric force, if both are operating. For example, two protons
(hydrogen atom nuclei) separated by $2 \times 10^{-15}$ m exert both forces on
each other. The nuclear force is attractive, and the electric force is repulsive,
with the magnitude of the former being about 10 times greater than that of
the latter. But the nuclear force becomes repulsive, and even stronger, if
the two particles approach closer than about $0.5 \times 10^{-15}$ m. The same
strong nuclear force acts between two neutrons (particles of almost the
same rest mass as protons, but having no electric charge) or between a
proton and a neutron; it has nothing to do with electric charge.
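To get a feeling for the sizes involved, the following minimal sketch evaluates the electric repulsion between two protons at the separation quoted above and then scales it by the factor of about 10 stated in the text. The Coulomb constant and the proton charge are standard values assumed here; they are not given in this section, and the function name is only illustrative.

```python
# Rough size of the forces between two protons 2 x 10^-15 m apart.
K_COULOMB = 8.99e9    # N*m^2/C^2, Coulomb constant (assumed standard value)
E_CHARGE = 1.602e-19  # C, proton charge (assumed standard value)


def coulomb_force(q1, q2, r):
    """Magnitude of the electric force between point charges q1, q2 at distance r."""
    return K_COULOMB * abs(q1 * q2) / r**2


r = 2e-15                                   # m, separation quoted in the text
f_electric = coulomb_force(E_CHARGE, E_CHARGE, r)
f_nuclear = 10 * f_electric                 # text: about 10 times the electric force

print(f"electric repulsion : {f_electric:.0f} N")
print(f"nuclear attraction : about {f_nuclear:.0f} N")
```

The electric force comes out at a few tens of newtons, a macroscopic-sized force acting on a single proton, and the nuclear attraction is roughly ten times that.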

The strong nuclear force acting between the neutrons and protons in
one fission fragment and those in the other commences when r decreases
below $r_d = 1 \times 10^{-14}$ m, the sum of the radii of the two fragments.
This produces a decrease in U. But as r continues to decrease, repulsive effects
set in and U increases again. So U goes through a minimum at the equilibrium
value $r_e$.

There is a striking similarity between what happens in a nucleus
undergoing fission and what happens in the system, pictured in Fig. 15-9, of
two balls separated by the spring and latch. And the U versus r curves for
each of these systems are qualitatively alike. Before fissioning, a nucleus has
a shape specified by a point on Fig. 15-15 at the equilibrium separation $r_e$.
That is, its two parts have separation $r_e$. The corresponding potential
energy is $U_e$. Since there is no significant motion of the nucleus or its two
constituent parts, its total energy content E equals its potential energy $U_e$.
But if the nucleus is given enough extra energy, then it will be possible for it
to elongate so that r passes by the value $r_d$ where the potential energy curve
has the maximum value $U_d$. That is, the nucleus must be given extra energy
at least as great as $U_d - U_e$ to surmount what is called the fission barrier.
If the nucleus at $r_e$ receives this energy, it has a total energy greater than
its potential energy. The difference can be in the kinetic energy of the two
separating fission fragments. It is enough for the fragments to move apart,
past $r_d$. They then continue to separate with increasing rapidity as the
kinetic energy K of their motion increases while U decreases. This happens
in such a way as to maintain a constant value of the total energy E = K + U.
When they are widely separated, U has dropped essentially to zero and
the total kinetic energy of the motion of the fission fragments has the value
$U_d$. Since the fragments are about the same size, this energy is divided
between them about equally. The net effect is that adding the energy
$U_d - U_e$ to the nucleus leads to the liberation of the very much larger
energy $U_d$.




You  can  describe  in  very  similar  terms  what  happens  to  the  double-ball 
spring-latch  system,  when  the  bit  of  energy  required  to  open  the  latch  is 
supplied  to  the  system.  The  analogy  is  apparent:  The  fission  fragments 
play  the  role  of  the  balls;  their  electric  repulsion  plays  the  role  of  the 
spring;  their  nuclear  attraction  plays  the  role  of  the  latch. 

The value of $U_e$ is approximately $3.2 \times 10^{-11}$ J, and the value of
$U_d$ is approximately $3.3 \times 10^{-11}$ J. How does the fissioning nucleus
get the energy, about $0.1 \times 10^{-11}$ J, required to “open the latch”? For
the uranium-235 nucleus, this energy can be supplied if the nucleus is hit by a
slow-moving neutron. Since a neutron is uncharged, it experiences no electric
repulsion and so can move freely up to a highly charged nucleus. At the
nuclear surface, the strong attractive nuclear force starts to act on the
neutron, accelerating it into the nucleus. The kinetic energy it thereby gains is
distributed through the nucleus by collisions, and the value happens to be
just the required amount, that is, $0.1 \times 10^{-11}$ J. So the nucleus, which
is now actually uranium-236, can commence to fission. Uranium-238 will not
fission upon capture of a neutron of low initial speed, since the energy provided
is not quite enough to put uranium-239 over the top of its fission barrier. It
takes a neutron of appreciable initial kinetic energy. Since such neutrons
are not sufficiently abundant in a nuclear reactor, uranium-238 is not
directly useful as a reactor fuel.
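The "latch" picture can be made quantitative with the two energies quoted above. The sketch below is only a bookkeeping aid, with illustrative variable names: it computes the height of the fission barrier and the ratio of the energy liberated to the energy that must be supplied.

```python
# Fission barrier and energy gain, using the values of U_e and U_d from the text.
U_E = 3.2e-11  # J, potential energy at the equilibrium separation r_e
U_D = 3.3e-11  # J, potential energy at the top of the barrier, r_d

barrier = U_D - U_E          # energy needed to "open the latch"
released = U_D               # kinetic energy of the widely separated fragments
gain = released / barrier    # energy out per unit of energy put in

print(f"barrier height : {barrier:.1e} J")
print(f"energy released: {released:.1e} J")
print(f"gain factor    : about {gain:.0f}")
```

With these rounded values the nucleus returns a few tens of times the energy invested in getting it over the barrier.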

A nuclear reactor is a device in which fission takes place continuously,
for the purpose of producing heat energy to run an electric power plant.
What makes the process possible is the fact that in a typical fission reaction
each fission fragment emits on the average about one neutron. This neutron
emission takes away about $0.2 \times 10^{-11}$ J from the total kinetic energy
of the fragments. But it leaves approximately $3.1 \times 10^{-11}$ J, which is
degraded into heat energy as the fission fragments are brought to a stop in
the reactor by a series of collisions. Of the two neutrons emitted on the
average in each fission, one neutron typically is lost in some way or other.
The other neutron undergoes collisions in the reactor material, which slow
it down. But eventually the neutron hits a uranium-235 nucleus and
triggers another fission. Thus the process perpetuates itself. This is the
chain reaction.
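The neutron bookkeeping just described can be sketched as a generation-by-generation count. This is a deliberately crude model, not a description of a real reactor: it assumes exactly two neutrons per fission, exactly half of them lost, and every surviving neutron triggering one new fission. The function and constant names are illustrative.

```python
# Toy generation count for the self-sustaining chain reaction described in the text.
NEUTRONS_PER_FISSION = 2    # about one neutron from each of the two fragments
FRACTION_LOST = 0.5         # one of the two neutrons is typically lost
HEAT_PER_FISSION = 3.1e-11  # J, fragment kinetic energy left after neutron emission


def run_chain(initial_fissions, generations):
    """Return the number of fissions in each successive generation."""
    fissions = initial_fissions
    history = []
    for _ in range(generations):
        history.append(fissions)
        neutrons = fissions * NEUTRONS_PER_FISSION
        # Assumption: every neutron that is not lost triggers exactly one fission.
        fissions = neutrons * (1 - FRACTION_LOST)
    return history


history = run_chain(1000, 5)
heat = sum(history) * HEAT_PER_FISSION

print("fissions per generation:", history)
print(f"heat from all generations: {heat:.2e} J")
```

Because one neutron per fission survives to trigger the next fission, the number of fissions per generation stays constant, which is what "perpetuates itself" means here.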

The significant feature of the energetics of fission, and power production
by reactors, is that somewhat more than $3 \times 10^{-11}$ J of energy is
produced in each fission reaction. Note that this is larger than the
$6 \times 10^{-19}$ J produced in a typical chemical reaction by a factor of
$5 \times 10^7$. Comparison of Figs. 15-13 and 15-15 will show that the factor
$5 \times 10^7$ is accounted for, in part, by the difference in size of a nucleus
and of a molecule. In both cases the energy produced in the reaction depends on
the value of U at the inner limit of the region where U is determined by the
electric force. In this region U varies approximately like $1/r$. Since the value
of r at this limit is smaller for nuclei than for molecules by the factor
$0.5 \times 10^{-4}$, the value of U, and therefore the energy characterizing the
reaction, will be larger by the factor $2 \times 10^4$ for fission. The rest of
the factor $5 \times 10^7$ has its origin in the fact that the electric force
between the two constituents of the molecule is exerted between single electric
charges, while in fission it is exerted between the approximately 46 charges in
one fragment (half the number in uranium) and a comparable number of charges in
the other. Now, the strength of the electric force is proportional to the number
of charges in one of the interacting bodies times the number in the other (just
as the strength of the gravitational force is proportional to the product of the
masses of the interacting bodies). So the electric force, and therefore the
potential energy U, should be larger for the fissioning nucleus than for the
molecule by a factor of about $(46)^2 = 2 \times 10^3$. Since the product of the
size factor and the charge factor is
$2 \times 10^4 \times 2 \times 10^3 \simeq 5 \times 10^7$, we have obtained a
very satisfactory explanation of why a fission reaction releases so
much more energy than a chemical reaction. Because the point is frequently
misunderstood, we emphasize once more that in nuclear fission the energy
is supplied by the electric force, not by the nuclear force.
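The scaling argument is an order-of-magnitude estimate, and it can be checked in a few lines. The sketch below uses only the numbers quoted in the paragraph above; the variable names are illustrative.

```python
# Order-of-magnitude check of the fission-versus-chemical energy ratio.
size_ratio = 0.5e-4              # r(nucleus) / r(molecule) at the inner limit
size_factor = 1 / size_ratio     # U varies like 1/r, so U scales by this factor
charge_factor = 46**2            # product of the charges on the two fragments

estimate = size_factor * charge_factor  # predicted energy ratio
observed = 3e-11 / 6e-19                # fission energy / chemical energy

print(f"size factor   : {size_factor:.1e}")
print(f"charge factor : {charge_factor:.1e}")
print(f"estimate      : {estimate:.1e}")
print(f"observed      : {observed:.1e}")
```

The product of the two factors is a few times $10^7$, which matches the observed ratio as closely as one-figure estimates can be expected to.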

Example  15-7  evaluates  the  rest-mass  change  and  the  fractional  rest- 
mass  change  in  the  fission  of  a uranium-235  nucleus. 


EXAMPLE 15-7

a. Evaluate the decrease $\Delta m_0$ in the total rest mass of a system
consisting initially of a uranium-235 nucleus plus a neutron of negligible
kinetic energy that is about to hit it, and consisting finally of two widely
separated fission fragments that have not yet emitted neutrons.

b. Then compare the decrease in total rest mass to the initial total rest mass.
■ a. Since there is no kinetic energy in the initial system, its total energy
consists entirely of rest-mass energy. If you measure energies so that E = 0
when U = 0 in Fig. 15-15, then $E = U_d = 3.3 \times 10^{-11}$ J. In the final
system the fission fragments share a total kinetic energy $K = U_d$. You know
that this kinetic energy comes from a decrease in rest-mass energy of the same
magnitude because the system is isolated. So you have

$$\Delta m_0 c^2 = -U_d = -3.3 \times 10^{-11}\ \text{J}$$

and

$$\Delta m_0 = -\frac{3.3 \times 10^{-11}\ \text{J}}{(3.0 \times 10^{8}\ \text{m/s})^2} = -3.7 \times 10^{-28}\ \text{kg}$$


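b. The initial total rest mass is approximately

$$m_0 = 236\ \text{u} = 236 \times 1.66 \times 10^{-27}\ \text{kg} = 3.9 \times 10^{-25}\ \text{kg}$$

Thus the comparison called for yields

$$\frac{\Delta m_0}{m_0} = -\frac{3.7 \times 10^{-28}\ \text{kg}}{3.9 \times 10^{-25}\ \text{kg}} = -9.5 \times 10^{-4} \simeq -10^{-3}$$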


The ratio $\Delta m_0/m_0$ is a figure of merit for an energy-producing
reaction. The value you obtained in Example 15-6 for a chemical reaction was
$\Delta m_0/m_0 \simeq -10^{-10}$. Much the same value would be obtained for any
other chemical reaction, for instance the one involved in the production of
energy by burning coal. On the basis of joules of energy produced per kilogram
of fuel consumed, a nuclear power plant is $10^7$ times more efficient than a
power plant using coal or oil.
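The arithmetic of Example 15-7 can be repeated without rounding the intermediate results. The sketch below uses the constants quoted in the example; the variable names are illustrative, and the energy per kilogram is an extra quantity computed here, not a figure from the text.

```python
# Rest-mass change in the fission of uranium-235 (Example 15-7), unrounded.
C = 3.0e8          # m/s, speed of light as used in the example
U_MASS = 1.66e-27  # kg, one atomic mass unit as used in the example
U_D = 3.3e-11      # J, energy released per fission

delta_m0 = -U_D / C**2       # change in total rest mass, kg
m0 = 236 * U_MASS            # initial total rest mass, kg
fraction = delta_m0 / m0     # fractional rest-mass change
energy_per_kg = U_D / m0     # J released per kg of reacting material

print(f"delta m0      : {delta_m0:.2e} kg")
print(f"m0            : {m0:.2e} kg")
print(f"delta m0 / m0 : {fraction:.2e}")
print(f"energy per kg : {energy_per_kg:.1e} J/kg")
```

Carrying the extra digits changes the second significant figure of the ratio slightly, but the conclusion $\Delta m_0/m_0 \simeq -10^{-3}$ is unaffected.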


15-6 NUCLEAR REACTION Q VALUES


Most nuclear reactions involve nuclei and particles with electric charges
appreciably smaller than the charges of a fissioning nucleus and its fission
fragments. Such reactions are not dominated by the electric force, and
therefore the nuclear force plays a crucial role in determining the energies
involved in the reactions. The nuclear force is much more complicated
than the electric force. As a consequence, it is difficult to predict the
energies involved in a typical nuclear reaction from a knowledge of the forces
that act during the reaction. But it is not at all necessary to do so if what is
required is the relation between the initial and final values of the kinetic
energies of the constituents of the system. The laws of conservation of
momentum and energy show that this relation depends on only the values of
the rest masses of the constituents before and after the reaction. And these
rest masses can be obtained directly from appropriate experiments. The
change in rest mass is a measure of the amount of binding produced by the
nuclear force and hence provides a measure of the strength of the force.

Fig. 15-16 Schematic illustration of a nuclear reaction.

A typical nuclear reaction is illustrated schematically in Fig. 15-16 and
written symbolically as

$$a + A \rightarrow b + B$$

Before the reaction, a bombarding particle a is incident on a target
nucleus A. After the reaction, a product particle b is emitted at an angle
$\phi$ with respect to the direction of the bombarding particle, and the residual
nucleus B moves off at some other angle. A specific example is provided by
the reaction in which a is an alpha particle, as the helium-4 nucleus is
called; A is a nitrogen-14 nucleus; b is a proton, or hydrogen-1 nucleus; and
B is an oxygen-17 nucleus. This “transmutation” of nitrogen into oxygen
by alpha-particle bombardment was the first artificial nuclear reaction. It
was first produced in 1919 by Ernest Rutherford and collaborators, who
obtained alpha particles from a radioactive source and used air to provide
the nitrogen target nuclei.

In Rutherford’s reaction, as in most other reactions studied in nuclear
physics, each of the bodies involved has a kinetic energy quite small
compared to its rest-mass energy. Thus their total relativistic energies are all
only slightly larger than their rest-mass energies, and so they are moving at
speeds small compared to the speed of light. Therefore it is a good
approximation to use the newtonian relations

$$K_a = \tfrac{1}{2} m_{0a} v_a^2 \qquad (15\text{-}29a)$$

and

$$p_a = m_{0a} v_a \qquad (15\text{-}29b)$$

to evaluate the kinetic energy and momentum of a in terms of its rest mass
$m_{0a}$ and speed $v_a$, and similarly for b and B.

But relativistic mechanics still enters the reaction in a vital way because
the total final kinetic energy $K_b + K_B$ of the isolated system is generally
not equal to the total initial kinetic energy $K_a$. The kinetic energy
difference $\Delta K$ is called the Q value of the reaction. That is,

$$Q \equiv \Delta K = K_b + K_B - K_a \qquad (15\text{-}30a)$$




The relativistic law of energy conservation shows that the Q value can also
be written in terms of the rest-mass difference $\Delta m_0$ as

$$Q = -\Delta m_0 c^2 = -(m_{0b} + m_{0B} - m_{0a} - m_{0A})c^2 \qquad (15\text{-}30b)$$

Some reactions have positive Q values. A positive Q value means that the
total final kinetic energy is larger than the total initial kinetic energy, and
thus the total final rest mass is smaller than the total initial rest mass. For
other reactions where the Q value is negative, the total kinetic energy
decreases in the reaction, and the total rest mass increases.

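Using Eqs. (15-29a) and (15-29b), and the similar ones for b and B, in
Eqs. (15-30a) and (15-30b), and employing momentum conservation to
eliminate the difficult-to-measure quantity $K_B$, we obtain the Q-value
equation

$$Q = K_b\left(1 + \frac{m_{0b}}{m_{0B}}\right) - K_a\left(1 - \frac{m_{0a}}{m_{0B}}\right) - \frac{2\sqrt{m_{0a} m_{0b}}}{m_{0B}}\sqrt{K_a K_b}\,\cos\phi \qquad (15\text{-}31)$$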


The Q-value equation involves the rest masses of the bombarding particle
$m_{0a}$, the product particle $m_{0b}$, and the residual nucleus $m_{0B}$. If
approximate values are known for these quantities (and in practice they always
are), then the equation allows a determination of the Q value from the measured
kinetic energy $K_a$ of the bombarding particle and the measured kinetic
energy $K_b$ and emission angle $\phi$ of the product particle. The measured Q
value can then be used in Eq. (15-30b) to obtain an accurate evaluation of
the difference between the initial and final total rest masses of the reacting
bodies.

It is not worthwhile taking the space to derive Eq. (15-31) because the
derivation is almost identical to the one given for Eq. (8-18b). You can see
that this is so by comparing the two equations. The earlier one is just a
special case of the present equation, pertaining to a type of collision called
scattering. In scattering, a and b are the same body (called particle 1 in the
earlier equation), and A and B also are the same body (called particle 2).
That is, in scattering each of the two particles retains its separate identity.
There are two kinds of scattering. In inelastic scattering, the Q value is
negative; there is a decrease in the total kinetic energy of the system and a
corresponding increase in the total rest mass. For isolated macroscopic
bodies, the loss of kinetic energy appears as a gain in the thermal energy of
the bodies; but the fractional rest-mass increase that results is so small as to
be unmeasurable. In elastic scattering, the Q value is zero. An application
of Eq. (15-31), reduced to the form for elastic scattering, was given in
Example 8-10. An application of the equation in its general form to a nuclear
reaction is given in Example 15-8.


EXAMPLE 15-8

In Rutherford’s reaction, alpha particles of measured kinetic energy
$K_a = 1.23 \times 10^{-12}$ J are used to bombard nitrogen-14 nuclei, producing
protons and oxygen-17 nuclei. The protons emitted in the forward direction,
where $\phi = 0$, are measured to have a kinetic energy
$K_b = 9.53 \times 10^{-13}$ J. Determine the Q value of the reaction.
Then evaluate $\Delta m_0$, the difference between the final total rest mass and
the initial total rest mass. Express $\Delta m_0$ in atomic mass units.




■ The first thing you must do is to verify that Eq. (15-31) is applicable by
showing that $K_a$ and $K_b$ are indeed small compared to $m_{0a}c^2$ and
$m_{0b}c^2$, so that Eqs. (15-29a) and (15-29b) are usable. For this purpose it
is certainly accurate enough to equate the rest masses of an alpha particle and
a proton to the rest masses of the corresponding atoms helium-4 and hydrogen-1.
(More than 99.9 percent of an atom’s rest mass is in its nucleus.) The rest
masses of the helium-4 atom and the hydrogen-1 atom are very close to 4 u and
1 u, where 1 u = $1.66 \times 10^{-27}$ kg. So you have for the nuclear rest-mass
energies

$$m_{0a}c^2 = 4 \times 1.66 \times 10^{-27}\ \text{kg} \times (3.00 \times 10^{8}\ \text{m/s})^2 = 5.98 \times 10^{-10}\ \text{J}$$

and

$$m_{0b}c^2 = 1 \times 1.66 \times 10^{-27}\ \text{kg} \times (3.00 \times 10^{8}\ \text{m/s})^2 = 1.49 \times 10^{-10}\ \text{J}$$

Comparison with the kinetic energies given above justifies the use of Eqs.
(15-29a) and (15-29b) from newtonian mechanics in deriving Eq. (15-31) for the
Q value, as far as a and b are concerned. Can you explain why the same will be
true for B? Writing Eq. (15-31) with $\cos\phi = 1$ for $\phi = 0$, you have

$$Q = K_b\left(1 + \frac{m_{0b}}{m_{0B}}\right) - K_a\left(1 - \frac{m_{0a}}{m_{0B}}\right) - \frac{2\sqrt{m_{0a} m_{0b}}}{m_{0B}}\sqrt{K_a K_b}$$

It is sufficiently accurate to use the values $m_{0a}$ = 4 u and $m_{0b}$ = 1 u
in this equation, as justified above. The same justification shows that you can
use $m_{0A}$ = 14 u and $m_{0B}$ = 17 u for the rest masses of the nuclei of
nitrogen-14 and oxygen-17. With these numerical values you obtain

$$Q = 9.53 \times 10^{-13}\ \text{J}\left(1 + \frac{1\ \text{u}}{17\ \text{u}}\right) - 1.23 \times 10^{-12}\ \text{J}\left(1 - \frac{4\ \text{u}}{17\ \text{u}}\right) - \frac{2\sqrt{4\ \text{u} \times 1\ \text{u}}}{17\ \text{u}}\sqrt{1.23 \times 10^{-12}\ \text{J} \times 9.53 \times 10^{-13}\ \text{J}}$$

$$= -1.89 \times 10^{-13}\ \text{J}$$


Since $Q = \Delta K$, you have shown that $1.89 \times 10^{-13}$ J of kinetic
energy is lost in the reaction. The Q-value equation allows you to reach this
conclusion even though $K_B$, the kinetic energy of the residual nucleus, is not
measured.

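The change in rest-mass energy is

$$\Delta m_0 c^2 = -Q = 1.89 \times 10^{-13}\ \text{J}$$

There is as much rest-mass energy gained as there is kinetic energy lost. The
amount of rest mass gained is

$$\Delta m_0 = -\frac{Q}{c^2} = \frac{1.89 \times 10^{-13}\ \text{J}}{(3.00 \times 10^{8}\ \text{m/s})^2} = 2.10 \times 10^{-30}\ \text{kg}$$

To express this result in atomic mass units, you evaluate

$$\Delta m_0 = 2.10 \times 10^{-30}\ \text{kg} \times \frac{1\ \text{u}}{1.66 \times 10^{-27}\ \text{kg}} = 0.00127\ \text{u}$$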
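The evaluation in Example 15-8 can be repeated numerically. The sketch below codes the Q-value equation in its standard newtonian form, $Q = K_b(1 + m_{0b}/m_{0B}) - K_a(1 - m_{0a}/m_{0B}) - (2/m_{0B})\sqrt{m_{0a} m_{0b} K_a K_b}\cos\phi$, and uses the rounded inputs of the example. The function name `q_value` is illustrative. With these rounded inputs the result comes out close to, though not digit-for-digit identical with, the value quoted in the example; the two agree to within a few percent.

```python
import math

C = 3.00e8      # m/s, speed of light as used in the example
U_KG = 1.66e-27  # kg, one atomic mass unit as used in the example


def q_value(k_a, k_b, phi, m_a, m_b, m_B):
    """Q value from measured K_a, K_b (J) and emission angle phi (rad).

    Only mass ratios enter, so the masses may be given in any common unit.
    """
    return (k_b * (1 + m_b / m_B)
            - k_a * (1 - m_a / m_B)
            - (2 / m_B) * math.sqrt(m_a * m_b * k_a * k_b) * math.cos(phi))


# Rutherford's reaction: alpha + nitrogen-14 -> proton + oxygen-17, phi = 0.
q = q_value(k_a=1.23e-12, k_b=9.53e-13, phi=0.0, m_a=4, m_b=1, m_B=17)
delta_m0_kg = -q / C**2            # rest mass gained, kg
delta_m0_u = delta_m0_kg / U_KG    # rest mass gained, atomic mass units

print(f"Q        = {q:.3e} J")
print(f"delta m0 = {delta_m0_kg:.3e} kg = {delta_m0_u:.5f} u")
```

The negative Q confirms that kinetic energy is lost and rest mass is gained, and the sketch makes it easy to see how sensitive the result is to the measured $K_a$ and $K_b$.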


The result $\Delta m_0$ = 0.00127 u, obtained in Example 15-8, establishes a
relation among the rest masses of the nuclei of the atoms helium-4,
nitrogen-14, hydrogen-1, and oxygen-17. It also relates the rest masses of
the atoms themselves. The helium atom has two electrons, and the nitrogen
atom has seven. So the initial total nuclear rest mass in the reaction is
approximately the rest masses of these atoms, less nine electron rest
masses. The hydrogen atom has one electron, and the oxygen atom has
eight. So the final total nuclear rest mass is also approximately the total
atomic rest mass less nine electron rest masses. The word “approximately”
is used because we are ignoring the mass equivalents of the binding
energies of the atomic electrons. The error in so doing is appreciably smaller
than the possible error in the measured value $\Delta m_0$ = 0.00127 u, which is
about 1 in the fifth decimal place. Thus this value also equals the difference
between the sum of the rest masses of the hydrogen-1 and oxygen-17
atoms and the sum of the rest masses of the helium-4 and nitrogen-14
atoms.

As indicated earlier, if very accurate measurements already have been
made for three of the four rest masses, the value of $\Delta m_0$ obtained from
the Q-value analysis of the nuclear reaction measurement can be used to
determine accurately the fourth rest mass. This is a widely employed
experimental procedure. In the particular case of the participants in
Rutherford’s reaction, completely independent measurements have been made with
great accuracy of all four of the atomic rest masses. The mass spectroscopy
technique used in these measurements is very similar to that used by
Bucherer and described in Sec. 15-3. A beam of atoms of the species of
interest is sent through an electric discharge, which ionizes each atom by
removing one electron. The charged ions then enter a region of electric
and magnetic fields, where they follow a path that specifies their speed
and their momentum. From these two quantities the mass of each ion can
be evaluated. Because their speeds are quite low, the mass is the rest mass.
Adding the rest mass of an electron to the rest mass of the ion gives the rest
mass of the neutral atom. The measured atomic rest masses and their sums
are

    Helium-4:       4.002603 u        Hydrogen-1:      1.007825 u
    Nitrogen-14:   14.003074 u        Oxygen-17:      16.999133 u
    Sum:           18.005677 u        Sum:            18.006958 u


The rest-mass difference obtained from mass spectroscopy is therefore
given by

    Final m0:      18.006958 u
    Initial m0:    18.005677 u
    Difference:     0.001281 u

Thus the value obtained from this experimental technique is
$\Delta m_0$ = 0.001281 u. The value found from the analysis of the completely
independent nuclear reaction measurement is, according to Example 15-8,
$\Delta m_0$ = 0.00127 u.
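The comparison of the two determinations is simple bookkeeping, and the sketch below repeats it with exact decimal arithmetic so that no rounding error enters. It uses only the atomic rest masses listed above and the reaction value from Example 15-8; the variable names are illustrative.

```python
from decimal import Decimal

# Measured atomic rest masses in atomic mass units, as listed in the text.
HELIUM_4 = Decimal("4.002603")
NITROGEN_14 = Decimal("14.003074")
HYDROGEN_1 = Decimal("1.007825")
OXYGEN_17 = Decimal("16.999133")

initial_m0 = HELIUM_4 + NITROGEN_14    # alpha particle + target
final_m0 = HYDROGEN_1 + OXYGEN_17      # proton + residual nucleus
spectroscopy = final_m0 - initial_m0   # rest-mass increase from mass spectroscopy

reaction = Decimal("0.00127")          # from the Q-value analysis, Example 15-8
discrepancy = abs(spectroscopy - reaction)

print(f"initial m0   : {initial_m0} u")
print(f"final m0     : {final_m0} u")
print(f"spectroscopy : {spectroscopy} u")
print(f"reaction     : {reaction} u")
print(f"discrepancy  : {discrepancy} u")
```

The discrepancy is about 1 in the fifth decimal place, which is the size of the possible error quoted for the nuclear reaction measurement.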

The two determinations agree to within the accuracy of the nuclear
reaction measurement. Agreement to even greater precision is found in
other cases. This provides the most accurate experimental verification of
the relativistic energy conservation law, on which the Q-value equation
used in the nuclear reaction analysis is based. Thus it verifies with the
highest degree of accuracy one of the most basic predictions of Einstein’s
special theory of relativity.


We have completed our development and testing of the special theory
of relativity. But we will make further use of relativistic kinematics and
mechanics, where appropriate, as we proceed with our study of physics. For
instance, relativistic kinematics is used to give a physical interpretation of
the origin of magnetic fields and of their relation to electric fields. Several
applications of relativistic mechanics are presented in our treatment of
electromagnetism. And particularly important use is made of relativistic
mechanics when we investigate the particlelike properties of electromagnetic
radiation, in connection with the topic of quantum physics. On the
other hand, we continue to employ newtonian mechanics whenever we can
because it is simpler, and more familiar, than relativistic mechanics.


EXERCISES 


Group  A 

15-1. Muon watch. Evaluate the relativistic mass, momentum, and kinetic
energy of the muons considered in Exercise 14-3. These muons are produced at a
high-energy particle accelerator with a speed such that a laboratory
measurement of their average lifetime gave the value $6.9 \times 10^{-6}$ s. The
average lifetime of muons at rest is $2.2 \times 10^{-6}$ s, and their rest mass
is $1.89 \times 10^{-28}$ kg.


15-2. Nonrelativistic and extreme relativistic limits.

a. What is the maximum value of v/c for which the relativistic kinetic energy
of a particle can be expressed as $\tfrac{1}{2} m_0 v^2$ with an error of not
more than 1 percent?

b. What is the minimum value of v/c for which the relativistic kinetic energy
of a particle can be taken equal to the total energy with an error of not more
than 1 percent?

15-3. Survival distance. The rest-mass energy of a $\pi^+$ particle
is $2.23 \times 10^{-11}$ J. As seen by an observer with respect to
whom the particle is essentially at rest, its half-life is
$1.77 \times 10^{-8}$ s. (This is the time required for half the particles to
decay into something else.) Suppose that another observer
measures the total relativistic energy of one of a group of
such particles, finding $10 \times 2.23 \times 10^{-11}$ J. From the
point of view of this observer, how far will the group of
particles travel before half of them decay?

(15-4.  fnergy  versus  speed. 

kjfdhe  total  relativistic  energy  of  a particle  is  E anc 
its  rest-mass  energy  is£0,  show  that  v/c  = [1  — ( E0/Ej s2j1/2. 

b.  Calculate  v/c  for  an  electron  whose  kinetic  energy 
equals  its  rest -mass  energy. 


electron  + electron  — » electron  + electron 
+ positron  + electron 

All  the  particles  have  rest-mass  energy  8.2  x 1 0— 14  J . 

15-8.  Annihilation.  An  electron  and  its  antiparticle,  a 
positron,  annihilate  one  another  to  create  photons,  par- 
ticles with  zero  rest  mass. 

a.  If  the  electron  and  positron  are  at  rest  with  respect 
to  each  other,  why  is  it  impossible  for  die  annihilation  to 
produce  just  one  photon? 

b.  If  two  photons  are  produced,  how  are  their  mo- 
tions related?  What  is  the  energy  of  each?  The  rest-mass 
energy  of  an  electron  or  a positron  is  8.2  x 10-14  J. 

15-9.  Mass-energy  equivalence.  Consider  two  identical 
objects,  each  of  rest  mass  m0,  including  the  mass  of  a re- 
laxed spring  attached  to  it.  See  Fig.  15E-9a.  An  experi- 
menter brings  the  two  together,  compressing  the  springs 
as  shown  in  Fig.  15E-96.  He  then  releases  them,  and 
finds  that  the  springs  cause  them  to  move  quickly  in 
opposite  directions,  each  with  speed  v. 

m Fis-  15E-9 

m0 

(a) 


(b) 


15-5.  Rutherford’s  reaction.  Use  the  Q value  obtained  in 
Example  15-8  to  determine  the  expected  kinetic  energy  Kh 
of  protons  emitted  at  the  scattering  angle  </>  = 40°. 

15-6.  Mass  loss  in  the  freezing  of  water.  When  1 .00  kg  of 
water  at  0°C  freezes  into  ice  at  the  same  temperature,  it 
liberates  3.34  x 105  J of  heat  energy.  What  is  the  fractional 
decrease  of  the  mass  of  water  when  it  freezes  to  ice? 

15-7.  A colli  ding- be  am  experiment.  Find  the  minimum 
energy  and  corresponding  speed  of  identical,  oppositely 
directed  electron  beams  that  will  allow  the  production  of 
electron-positron  pairs  by  the  reaction 


a. What is the relativistic expression for the total energy
of the system when the objects are in motion?

b. On the basis of the conservation of total energy,
what must have been the energy of the system when the
objects were stationary?

c. What was the rest-mass energy of the masses alone
when the objects were stationary? Account for the fact
that the total rest-mass energy of the stationary system
in Fig. 15E-9b is greater than the rest-mass energy of
the separated masses in Fig. 15E-9a.


Binding energy: the helium nucleus. A helium nucleus
consists of two protons each of mass 1.007825 u and
two neutrons each of mass 1.008665 u. The mass of a helium
nucleus is 4.002603 u. What is the binding energy of
the helium nucleus per nucleon (nuclear particle)?

15-11. Relativistic conservation law for mass-energy. In
the first nuclear disintegration experiment using an artificially
accelerated particle, lithium nuclei were bombarded
by swiftly moving protons (hydrogen nuclei), resulting in
the reaction


proton + lithium-7 → helium-4 + helium-4

The kinetic energy of the proton was $0.80 \times 10^{-13}$ J.
Each helium nucleus had a kinetic energy of $14.24 \times 10^{-13}$ J.
The rest mass of the proton is 1.008145 u, that of
the lithium-7 is 7.018034 u, and that of the helium-4 is
4.003874 u.

Show that the experimental data confirm the relativistic
equation for the conservation of mass-energy.
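The comparison asked for in Exercise 15-11 can be sketched numerically: the rest-mass energy that disappears in the reaction should match the gain in kinetic energy. The sketch below assumes the lithium target is initially at rest and uses standard values for the atomic mass unit and $c$; the function name is illustrative and not from the text.

```python
# Check of mass-energy conservation for proton + lithium-7 -> helium-4 + helium-4,
# using the masses and kinetic energies quoted in Exercise 15-11.
U_KG = 1.66053907e-27      # kilograms per atomic mass unit (assumed standard value)
C = 2.99792458e8           # speed of light, m/s (assumed standard value)
U_C2 = U_KG * C**2         # joules per atomic mass unit

def rest_energy_released(initial_masses_u, final_masses_u):
    """Rest-mass energy converted to kinetic energy, in joules."""
    return (sum(initial_masses_u) - sum(final_masses_u)) * U_C2

released = rest_energy_released([1.008145, 7.018034], [4.003874, 4.003874])

# Assumption: the lithium nucleus is at rest, so only the proton carries
# kinetic energy before the reaction.
kinetic_gain = 2 * 14.24e-13 - 0.80e-13   # final minus initial kinetic energy, J
relative_gap = abs(released - kinetic_gain) / kinetic_gain

print(f"rest energy released: {released:.3e} J")
print(f"kinetic energy gain:  {kinetic_gain:.3e} J")
print(f"relative difference:  {relative_gap:.2%}")
```

Agreement to within about one percent is what one would hope for from data quoted to three or four significant figures.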

15-12. Pick your angle, choose your energy. A nuclear
reaction sometimes used to produce neutrons of uniform
energy from a beam of protons of uniform energy is

proton + lithium-7 → neutron + beryllium-7

The Q value of the reaction is $-2.62 \times 10^{-13}$ J. A target
containing lithium-7 nuclei is bombarded by a beam of
protons of kinetic energy $8.00 \times 10^{-13}$ J. At what angle to
the proton beam will neutrons of kinetic energy $5.00 \times 10^{-13}$ J
be emitted?

15-13. Proton kinetic energies in Rutherford’s reaction.
Use the Q value equation, together with information presented
in Example 15-8, to calculate the kinetic energy of
protons emitted in Rutherford’s reaction at an angle of
20° to the beam of incident alpha particles.

15-14. Threshold energy for a reaction of alpha particles
with aluminum, I. Radioactive phosphorus can be produced
by bombarding aluminum with alpha particles (helium
nuclei) according to the reaction


helium-4 + aluminum-27 → neutron + phosphorus-30


The masses of the atoms are: aluminum-27: 26.981532 u;
helium-4: 4.002603 u; phosphorus-30: 29.978353 u;
neutron: 1.0086653 u.

What is the minimum kinetic energy of the alpha particles
required to bring about the reaction? Ignore the recoil
energy of the products by assuming that the target
nucleus is infinitely massive compared to the bombarding
particle.


15-15. Q value for a reaction producing carbon-14. In the
upper atmosphere, as a result of cosmic ray bombardment,
the following reaction takes place


neutron + nitrogen-14 → proton + carbon-14

The mass of the neutron is 1.0086653 u. A nitrogen-14
atom has a mass of 14.0030732 u; a carbon-14 atom has a
mass of 14.003239 u; and a proton has a mass of 1.007825
u. What is the Q value of the reaction? (The nearly constant
rate at which this reaction occurs over periods of
millennia makes possible the technique of radiocarbon dating,
which is of great importance in archaeology and related
fields. The radioactive carbon-14 atoms become incorporated
first into carbon dioxide molecules and thence
through the process of photosynthesis into living plant
matter. Since the radioactive decay rate of carbon-14 is
known, the time elapsed since the death of the plant can
be determined by measuring the amount of carbon-14 remaining
in it.)


Group  B 

15-16. Power play. In newtonian mechanics the relation
$dE/dt = \mathbf{F} \cdot \mathbf{v}$ is valid, where $E$ is the total energy of a
particle that is moving with velocity $\mathbf{v}$ and is acted on by a
net force $\mathbf{F}$. Show that this relation is also valid in relativistic
mechanics. (Note: You will need to express the
squared magnitude of the momentum in vector form:

$p^2 = \mathbf{p} \cdot \mathbf{p}$.)

15-17. Velocity in terms of energy and momentum. Show
that the components of the velocity of a particle of energy
$E$ and momentum $\mathbf{p}$ are given by

$$v_x = \frac{\partial E}{\partial p_x}, \qquad v_y = \frac{\partial E}{\partial p_y}, \qquad v_z = \frac{\partial E}{\partial p_z}$$

These relations apply in both the relativistic and newtonian
domains.

15-18. Hit and stick. A particle of rest mass $m_0$ traveling
at a speed of xy makes a completely inelastic collision
with an identical particle initially at rest, and they stick
together.

a. What is the rest mass of the composite particle?

b. What percentage of the kinetic energy was converted
into rest-mass energy?

c. What happened to the remainder of the kinetic energy?

d. What is the speed of the composite particle?

15-19. Decay of a neutral pion. A moving $\pi^0$ meson
decays into two photons, which have zero rest mass. One
photon travels in the same direction as the $\pi^0$ was traveling,
the other in the opposite direction. What is the energy
of each photon if the total relativistic energy of the pion,
$E_\pi$, is twice its rest-mass energy of $2.16 \times 10^{-11}$ J?

15-20. Decay of a positive kaon. A $K^+$ particle decays
to a $\mu^+$ particle and a neutrino. If the $K^+$ is at rest, what
are the kinetic energies of the $\mu^+$ and the neutrino? The
rest-mass energy of the $K^+$ is $7.90 \times 10^{-11}$ J; that of the
$\mu^+$ is $1.71 \times 10^{-11}$ J. The rest mass of the neutrino is 0.

15-21. Decay of a neutral kaon. A moving $K^0$ particle
decays into two $\pi^0$ particles. If one of these is observed at
rest, what are the total relativistic energy, $E_K$, of the $K^0$
and the total relativistic energy, $E_\pi$, of the moving $\pi^0$?
The rest-mass energy of the $K^0$ is $m_{0K}c^2 = 7.96 \times 10^{-11}$ J;
that of the $\pi^0$ is $m_{0\pi}c^2 = 2.16 \times 10^{-11}$ J.


15-22. Decay of a pion into a muon and a neutrino. Continue
the calculation in Example 15-5 so as to obtain a prediction
of the kinetic energy of the neutrino, measured in
a reference frame moving along with the pion before it
decays.


The Q value equation. Derive the Q value equation,
Eq. (15-30), by modifying the derivation in the text
leading to Eq. (8-18b).

15-24. Threshold energy for a reaction of alpha particles
with aluminum, II. In Exercise 15-14, you were asked to
ignore the recoil energy of the products in calculating
the reaction threshold.

a. Recalculate the threshold energy for the reaction


helium-4 + aluminum-27 → neutron + phosphorus-30

taking the finite mass of aluminum-27 into account.

b. What percentage error in the threshold is caused
by assuming the target nucleus is infinitely massive as you
did in Exercise 15-14?
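One way to organize the comparison in Exercises 15-14 and 15-24 is sketched below. It uses the invariance of $E^2 - (cp)^2$ for the whole system: at threshold the products are at rest in the zero-momentum frame, which gives $K_{th} = [(\sum m_{final})^2 - (\sum m_{initial})^2]c^2 / 2m_{target}$ for a stationary target. This is not necessarily the route the text intends (the Q value equation would serve as well). The function names are illustrative, and the atomic masses quoted in Exercise 15-14 stand in for nuclear masses because the electron counts balance.

```python
# Threshold kinetic energy for helium-4 + aluminum-27 -> neutron + phosphorus-30.
U_C2 = 1.66053907e-27 * 2.99792458e8**2   # joules per atomic mass unit (assumed constants)

M_HE, M_AL = 4.002603, 26.981532          # projectile and target masses, u
PRODUCTS = [1.0086653, 29.978353]         # neutron and phosphorus-30 masses, u

def q_value(m_proj, m_target, products):
    """Q = (initial rest mass - final rest mass) c^2, in joules."""
    return (m_proj + m_target - sum(products)) * U_C2

def threshold_infinite_target(m_proj, m_target, products):
    """Threshold when product recoil is ignored: simply -Q."""
    return -q_value(m_proj, m_target, products)

def threshold_finite_target(m_proj, m_target, products):
    """Threshold for a stationary target of finite mass, from the invariant
    E^2 - (cp)^2 evaluated in the laboratory and zero-momentum frames."""
    m_in = m_proj + m_target
    m_out = sum(products)
    return (m_out**2 - m_in**2) / (2.0 * m_target) * U_C2

k_infinite = threshold_infinite_target(M_HE, M_AL, PRODUCTS)
k_finite = threshold_finite_target(M_HE, M_AL, PRODUCTS)
percent_error = 100.0 * (k_finite - k_infinite) / k_finite

print(f"threshold, infinitely massive target: {k_infinite:.3e} J")
print(f"threshold, finite target mass:        {k_finite:.3e} J")
print(f"error from ignoring recoil:           {percent_error:.1f} %")
```

The ratio of the two thresholds is close to $1 + m_{He}/m_{Al}$, the familiar nonrelativistic correction for recoil.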

Decay of free neutrons. Free neutrons decay by


neutron → proton + electron + antineutrino


The masses of the particles are as follows: neutron:
1.008665 u; proton: 1.007277 u; electron: $5.49 \times 10^{-4}$ u;
antineutrino: 0 u.

a. Find the Q value for this decay process.

b. In this decay, most of the released energy is carried
by the electron and the antineutrino, because they are
much less massive than the proton. This energy can be divided
between the electron and the antineutrino in essentially
any proportion. Suppose that in a particular decay
the energy and momentum carried by the antineutrino
are negligibly small. Find the energies and momenta of
the proton and the electron.

c. Suppose that in another decay, the electron is
emitted with negligible kinetic energy and momentum.
Find the energies and momenta of the proton and the
antineutrino.


15-26. Threshold energy for a reaction of protons with

The reaction


proton + beryllium-9 → neutron + boron-9

has $Q = -2.97 \times 10^{-13}$ J. If a beam of protons is directed
against a beryllium target, what is the minimum proton
kinetic energy required for the reaction to occur?

Group C

15-27. Relativistic dragster. A particle of rest mass $m_0$ is
acted on by a constant force of magnitude $F$ directed
along the x axis. The particle is at rest at the origin at time
$t = 0$.


a. Find the particle’s relativistic momentum as a function
of the time $t$.

b. Find the particle’s total relativistic energy as a
function of its displacement $x$.

c. Use the relation $E = \sqrt{(cp)^2 + (m_0c^2)^2}$ to find the
particle’s displacement as a function of time $t$.

d. Use the result of part c to find the particle’s velocity
and acceleration as functions of time. Do your results
agree with nonrelativistic mechanics for small values of $t$?

15-28. Pair production. The photon, a particle with
zero rest mass, is sometimes transformed into two particles,
an electron and a positron, each of which has rest-mass
energy of $8.2 \times 10^{-14}$ J. This process is called pair
production. Suppose the new particles are emitted in the
direction of travel of the original photon.

a. Show that the momentum of either particle is less
than its energy divided by $c$.

b. Show from this that if the conservation of energy is
satisfied by the transformation, the conservation of momentum
is not. (This result means that the decay can
occur only when some other particle is available to absorb
the excess momentum of the photon.)

c. What is the threshold value (smallest possible
value) of the photon energy for this transformation?

15-29. An elastic collision. A particle whose rest mass is
$m_0$ and whose relativistic kinetic energy is $K$ strikes an
identical particle initially at rest. The collision is elastic,
the particles remaining unchanged. The collision is also
symmetrical, each particle moving off at an angle of $\theta/2$
with the original velocity.

a. Show that $\cos^2(\theta/2) = (K + 2m_0c^2)/(K + 4m_0c^2)$.

b. From this show that $\cos\theta = K/(K + 4m_0c^2)$.

c. Calculate $\theta$ if $K = m_0c^2$.


15-30. The Lorentz invariance of $E^2 - (cp)^2$. A particle
with rest mass $m_0$ has speed $v$ in the positive x direction
in the “laboratory” frame of reference O.

a. Calculate its energy, kinetic energy, and momentum
in the laboratory frame. Verify explicitly the result,
taken from Eq. (15-21), that $E^2 - (cp)^2 = (m_0c^2)^2$.

b. Calculate the energy, kinetic energy, and momentum
of the particle in a “rocket” frame of reference O'
moving at speed $V$ in the positive x direction with
respect to the laboratory frame. Show that $(E')^2 - (cp')^2 = (m_0c^2)^2$.
This is an example of an invariant quantity;
$E^2 - (cp)^2 = E^2 - c^2(p_x^2 + p_y^2 + p_z^2)$ is the same in every inertial
coordinate system. For another example, see Exercise
14-33.


15-31. Lorentz momentum-energy transformation. A particle
of rest mass $m_0$ is moving at speed $v$ in the positive
direction of the x axis of inertial frame O. In that frame
the components of its relativistic momentum are
$p_x = m_0v/\sqrt{1 - v^2/c^2}$, $p_y = 0$, $p_z = 0$. Its total relativistic energy
is $E = m_0c^2/\sqrt{1 - v^2/c^2}$. An observer O' is moving in the
positive direction along the x axis at speed $V$, with $V < v$.
In the inertial frame moving with O' the components of
the relativistic momentum of the particle are
$p'_x = m_0v'/\sqrt{1 - v'^2/c^2}$, $p'_y = 0$, $p'_z = 0$, and its total relativistic
energy is $E' = m_0c^2/\sqrt{1 - v'^2/c^2}$. Here $v'$ is the speed of
the particle with respect to O'. Use the Lorentz velocity
transformation, given by the first of Eqs. (14-21), to evaluate
$v'$. Then use this value of $v'$ in the right sides of the expressions
for $p'_x$, $p'_y$, $p'_z$, and $E'$ to derive the following set of
equations:

$$p'_x = \frac{1}{\sqrt{1 - V^2/c^2}}\,(p_x - VE/c^2)$$

$$p'_y = p_y$$

$$p'_z = p_z$$

$$E' = \frac{1}{\sqrt{1 - V^2/c^2}}\,(E - Vp_x)$$

These equations constitute the Lorentz momentum-energy
transformation. Compare them to Eqs. (14-16),
the Lorentz position-time transformation. In the comparison
show that the quantities $p_x$, $p_y$, $p_z$, $E/c^2$ transform
in exactly the same ways as the quantities $x$, $y$, $z$, $t$, respectively.
This circumstance is the starting point of a more
advanced treatment of special relativity, using four-dimensional
vectors with components $(x, y, z, t)$ and $(p_x, p_y, p_z, E/c^2)$.
These vectors are usually called “four-vectors.”
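The transformation of Exercise 15-31 can be checked numerically before it is derived. The sketch below boosts the momentum and energy of a particle and compares the result with what is obtained by first transforming the velocity and then evaluating the momentum and energy directly. It works in units with $c = 1$; the function names and the sample values of $m_0$, $v$, and $V$ are arbitrary illustrative choices.

```python
import math

def particle_momentum_energy(m0, v, c=1.0):
    """p_x and E of a particle of rest mass m0 moving at speed v along the x axis."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return m0 * gamma * v, m0 * gamma * c**2

def lorentz_boost(px, py, pz, energy, big_v, c=1.0):
    """Momentum-energy transformation to a frame moving at speed big_v along +x."""
    gamma = 1.0 / math.sqrt(1.0 - (big_v / c) ** 2)
    px_prime = gamma * (px - big_v * energy / c**2)
    e_prime = gamma * (energy - big_v * px)
    return px_prime, py, pz, e_prime

# Illustrative sample values (not from the text), in units with c = 1.
m0, v, big_v = 1.0, 0.8, 0.5

px, energy = particle_momentum_energy(m0, v)
px_boosted, _, _, e_boosted = lorentz_boost(px, 0.0, 0.0, energy, big_v)

# Same quantities obtained from the Lorentz velocity transformation.
v_prime = (v - big_v) / (1.0 - v * big_v)
px_direct, e_direct = particle_momentum_energy(m0, v_prime)

# E^2 - (cp)^2 should equal (m0 c^2)^2 in both frames.
invariant_lab = energy**2 - px**2
invariant_rocket = e_boosted**2 - px_boosted**2

print(px_boosted, px_direct)
print(e_boosted, e_direct)
print(invariant_lab, invariant_rocket)
```

The two routes agree to rounding error, and the invariant of Exercise 15-30 comes out the same in both frames.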


15-32. Momentum-energy conservation. Two particles
with rest masses $m_{01}$ and $m_{02}$ are moving along the x axis in
the inertial frame O with velocities $v_1$ and $v_2$. They collide
head on; out of the collision emerge two different particles
with rest masses $m_{03}$ and $m_{04}$ moving along the x axis
with velocities $v_3$ and $v_4$. Conservation of momentum requires
that the relativistic momenta of the four particles
obey the relation $p_1 + p_2 - p_3 - p_4 = 0$.

a. Use the Lorentz momentum-energy transformation
to obtain the relation that holds in a second inertial
frame O' moving with velocity $V$ relative to O along its
x axis.

b. If the conservation of momentum holds for O',
what other conservation law must hold simultaneously?

15-33. Energy of a cosmic-ray muon. In the formation
process  for  cosmic-ray  muons  considered  in  Example 
15-5,  the  pion  is  moving  in  the  direction  toward  the 
earth’s  surface  before  it  decays,  and  has  a relativistic 
kinetic  energy  30  times  its  rest-mass  energy.  The  muon 
emitted  when  the  pion  decays  happens  to  be  emitted  in 
the  direction  towards  the  earth’s  surface.  Use  the  Lorentz 
momentum-energy  transformation  derived  in  Exercise 
15-31  to  evaluate  the  relativistic  kinetic  energy  of  the 
muon,  as  measured  in  a reference  frame  fixed  to  the 
earth’s  surface. 


15-34. Speed of the center-of-momentum frame. In the inertial
frame of a laboratory, particle 1 is moving to the
right with total relativistic energy $E_1$ and relativistic momentum
$p_1$, and particle 2 is stationary with total relativistic
energy $E_2$ equal to its rest-mass energy. Use the
Lorentz momentum-energy transformation equations
derived in Exercise 15-31 to show that in an inertial frame
moving to the right with speed

$$V = \frac{cp_1}{E_1 + E_2}\,c$$

the system of two particles has zero total relativistic momentum.
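A numerical spot check of the result in Exercise 15-34 is sketched below: the total momentum of a moving particle and a stationary one is transformed to a frame moving at $V = c^2p_1/(E_1 + E_2)$ and should vanish. Units with $c = 1$ are used, and the rest masses and speed are arbitrary illustrative values, not data from the text.

```python
import math

def boost_px(px, energy, big_v, c=1.0):
    """x component of momentum in a frame moving at speed big_v along +x."""
    gamma = 1.0 / math.sqrt(1.0 - (big_v / c) ** 2)
    return gamma * (px - big_v * energy / c**2)

c = 1.0
m1, m2 = 1.0, 3.0      # illustrative rest masses (assumed)
v1 = 0.6               # illustrative speed of particle 1 (assumed)

gamma1 = 1.0 / math.sqrt(1.0 - (v1 / c) ** 2)
p1, e1 = m1 * gamma1 * v1, m1 * gamma1 * c**2   # moving particle
p2, e2 = 0.0, m2 * c**2                          # stationary particle

frame_speed = c * p1 / (e1 + e2) * c             # V from Exercise 15-34

total_px_prime = boost_px(p1, e1, frame_speed, c) + boost_px(p2, e2, frame_speed, c)
print(f"V = {frame_speed:.5f} c, total momentum in that frame = {total_px_prime:.2e}")
```

Because the transformation is linear, the same check works for any number of particles if $p_1$ and $E_1 + E_2$ are replaced by the total momentum and total energy.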

15-35. An elastic electron-electron collision. An electron
is moving through the inertial reference frame of the laboratory
at a speed comparable to the speed of light. It
experiences an elastic collision with another electron,
which is free and initially stationary. Use the conservation
of relativistic momentum and the conservation of total
relativistic energy to derive an expression showing that
the angle between the trajectories of the two electrons
after the collision will be less than 90°. Then use this expression
to show that the angle you measure between the
electron trajectories in the cloud chamber photograph of
Fig. 15-3 is consistent with a relativistic mass of the incident
electron equal to 4.1 times its rest mass. (Hint: You may
find it easier to use the equations displayed in Exercises
15-31 and 15-34 to perform the calculation as follows:
transform to the inertial frame in which the total relativistic
momentum of the electrons before the collision is
zero; treat the collision; then transform back to the
laboratory reference frame.)

15-36. Threshold for production of a proton-antiproton
pair. An observer in a laboratory sees a proton of sufficient
energy strike a stationary proton to produce a
proton-antiproton pair in addition to the two original protons.
(The antiproton is the antiparticle of the proton. It
has the same mass as a proton but a negative charge.
See Exercise 15-7 for an analogous process involving
electrons and a positron.)

To determine the reaction threshold (the minimum
kinetic energy of the proton moving in the laboratory reference
frame to bring about the reaction), shift to a frame
of reference in which initially the two protons have zero
total relativistic momentum. In this frame, the two protons
have equal and opposite speeds, equal masses, and
therefore equal energies. Suppose the energies are such
that after the collision the two original protons and the
proton-antiproton pair are at rest in this frame of reference.
The energy required to bring this about will be the
reaction threshold energy.

a. In this coordinate frame, what is the minimum
total relativistic energy needed to create a proton-antiproton
pair? What is the minimum total relativistic energy
of each original proton?

b. What is the speed of either original proton in this
coordinate system?

c. Suppose the speed of one of these protons is zero
in the laboratory frame of reference. What is the speed
of the zero-momentum frame of reference relative to the
laboratory frame of reference?

d. What is the speed of the other proton in the laboratory
frame of reference?

e. What is the total relativistic energy of this proton?
What is its relativistic kinetic energy, the minimum required
in the laboratory frame of reference?

f. Only one-third of the relativistic kinetic energy is
converted to the rest-mass energy of the proton-antiproton
pair. What happens to the remainder of the relativistic
kinetic energy?

15-37. An inefficient process. This problem illustrates
the inefficiency of using a high-energy particle to collide
with an identical initially stationary one to produce new
particles. Let the initially moving particle have speed $v_i$
and total relativistic energy $E = mc^2 = m_0c^2/\sqrt{1 - v_i^2/c^2}$
and let $E_0 = m_0c^2$ be the rest-mass energy of either particle.
Let $M_0$ be the rest mass of the particle formed in the
completely inelastic collision and $v_f$ be its speed so that its
total relativistic energy is $E = Mc^2 = M_0c^2/\sqrt{1 - v_f^2/c^2}$.

a. From the conservation of relativistic momentum
and total relativistic energy, show that $v_i/v_f = (m + m_0)/m$.

b. Use this result to show that $1/\sqrt{1 - v_f^2/c^2} = \sqrt{(m + m_0)/2m_0}$.

c. From this, show that $M_0c^2 = \sqrt{2m_0(m + m_0)}\,c^2$.

d. Show that the energy which appears as new rest-mass
energy, the useful energy $E_u$, is given by
$E_u = 2m_0c^2\left[\sqrt{(m + m_0)/2m_0} - 1\right]$. Then show that the efficiency
$\eta$ of conversion to new rest-mass energy, defined
as the ratio of $E_u$ to the energy the accelerator must give
to the incoming particle, is

$$\eta = \frac{2m_0}{m - m_0}\left(\sqrt{\frac{m + m_0}{2m_0}} - 1\right)$$

e. Accelerators have been built which accelerate protons
so that their relativistic kinetic energy is 30 times their
rest-mass energy. What is the efficiency of conversion of
such accelerators in a proton-proton collision? Where is
the remainder of the energy?

f. Apply the expression for $E_u$ to the case of the formation
of a proton-antiproton pair discussed in Exercise
15-36, and compare the results.

g. In some modern high-energy particle accelerators,
this “inefficiency” problem is overcome by making the desired
collisions occur between two beams of particles
moving in opposite directions, instead of aiming a single
beam of particles at a stationary target. Refer to Exercise
15-36, and explain why this approach is desirable.
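The efficiency expression of part d of Exercise 15-37 is easy to tabulate. The sketch below writes it in terms of the ratio $K/m_0c^2$, using $m/m_0 = 1 + K/m_0c^2$; the function names and the sample kinetic energies are illustrative choices.

```python
import math

def useful_energy(m_ratio):
    """E_u / (m0 c^2) for a relativistic mass ratio m/m0, from part d."""
    return 2.0 * (math.sqrt((m_ratio + 1.0) / 2.0) - 1.0)

def efficiency(kinetic_over_rest):
    """eta = E_u / K, with K given in units of m0 c^2."""
    m_ratio = 1.0 + kinetic_over_rest       # m/m0 = 1 + K/(m0 c^2)
    return useful_energy(m_ratio) / kinetic_over_rest

# Illustrative kinetic energies, in units of the rest-mass energy.
for k in (1.0, 6.0, 30.0, 1000.0):
    print(f"K = {k:7.1f} m0c^2  ->  efficiency = {efficiency(k):.3f}")
```

The efficiency falls steadily as the beam energy rises, which is the motivation for the colliding-beam arrangement mentioned in part g.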

Numerical 

15-38. Kinetic energy versus momentum. Write a program
that will make the calculating device you use evaluate
Eq. (15-21c), $K = \sqrt{(cp)^2 + (m_0c^2)^2} - m_0c^2$. Check it
against the result of Example 15-4a by taking $m_0c^2 = 8.20 \times 10^{-14}$ J,
the electron rest-mass energy, and evaluating
$K$ for the value of $cp$ employed in the example.
Then use a number of other values of $cp$ so as to obtain
values of $K$ from which you can produce a quantitative
version of Fig. 15-11 for the case in which $m_0$ is the electron
rest mass. Plot these values over a range of $K$ extending
from 0 to $5m_0c^2$.
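A minimal version of the program requested in Exercise 15-38 might look like the sketch below. It tabulates $K$ against $cp$, both in units of $m_0c^2$, over a range that carries $K$ slightly past $5m_0c^2$. The value of $cp$ from Example 15-4a is not reproduced here, so the check against that example is left to the reader; the function name and the step size are illustrative choices.

```python
import math

REST_ENERGY_J = 8.20e-14   # electron rest-mass energy m0 c^2, as given in the exercise

def kinetic_energy(cp, rest_energy=REST_ENERGY_J):
    """K = sqrt((cp)^2 + (m0 c^2)^2) - m0 c^2, with cp and the result in joules."""
    return math.hypot(cp, rest_energy) - rest_energy

# K reaches 5 m0c^2 when cp = sqrt(6^2 - 1) m0c^2, a little under 6 m0c^2,
# so the table runs cp from 0 to 6 m0c^2 in steps of 0.5 m0c^2.
table = []
for step in range(13):
    cp = 0.5 * step * REST_ENERGY_J
    table.append((cp / REST_ENERGY_J, kinetic_energy(cp) / REST_ENERGY_J))

print(" cp/m0c^2    K/m0c^2")
for cp_ratio, k_ratio in table:
    print(f"{cp_ratio:9.2f}  {k_ratio:9.4f}")
```

For small $cp$ the tabulated values follow the newtonian parabola $K = p^2/2m_0$, and for large $cp$ they approach the straight line $K = cp - m_0c^2$.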


Answers 


1.  Most  answers  are  given  to  three  significant  figures.  The  departures  from  this  convention  occur  where 
appropriate,  based  either  on  the  problem  statement  or  on  the  numerical  details. 

2. Unless otherwise indicated, g has been assigned the value 9.80 m/s$^2$.

3.  For  problems  which  require  estimates,  the  numerical  answers  are  preceded  by  “Est.” 


CHAPTER 

1. Est: (a) 0.1 m/s = $3.3 \times 10^{-10}\,c$; (b) 1 m/s =
$3.3 \times 10^{-9}\,c$; (c) 6.7 m/s = $2.2 \times 10^{-8}\,c$; (d) 27
m/s = $9 \times 10^{-8}\,c$; (e) 291 m/s = $9.7 \times 10^{-7}\,c$;
(f) $8.1 \times 10^{3}$ m/s = $2.7 \times 10^{-5}\,c$; (g) $1.01 \times 10^{3}$ m/s
= $3.4 \times 10^{-6}\,c$; (h) $3 \times 10^{4}$ m/s = $10^{-4}\,c$

3.  500  s = 8.33  min 

5. (a) $9.47 \times 10^{12}$; (b) 4.33 light-years; (c) 8.66 yr
7. (a) $8.8 \times 10^{-2}$ s; (b) 0.22
9. (a) 0.441 s
13. 8 m/s$^2$

15. (a) 4 m/s$^2$; (b) 4.55 m/s$^2$

17. Est: (a) 50 mi/h; (b) $2 \times 10^{3}$; (c) 0.82 h/day, 5%

19. $v = v_i + a(t - t_i)$, $x = x_i + v_i(t - t_i) + \tfrac{1}{2}a(t - t_i)^2$,
$x = x_i + \dfrac{v^2 - v_i^2}{2a}$ (no change),
$x = x_i + \tfrac{1}{2}(v + v_i)(t - t_i)$

21. (a) 0.5 m/s; (b) $-0.5$ m/s; (c) $-0.167$ m/s; (d) 0
23. 40.7 m

25. (a) $\pi/4$ s = 0.785 s; (b) 0; (c) $3\pi/4$ s = 2.36 s,
0.12 m/s$^2$

29. (a) 17.1 mi/(h · s) = 7.66 m/s$^2$ = 0.78 g;

(b) 46.9 m; (c) 176 mi/h = 78.5 m/s, 88 mi/h =
39.3 m/s, 10.2 s

31.  11.2  m 

33.  (a)  29.4  m/s;  (b)  44.1  m 
35.  (c)  1.78  s,  11.2  m 
37.  (a)  1.48  s;  (b)  10.7  m 




39. (a) 0.981 s, 5.39 m/s. (b) 0.563 s, 20.5 m/s.
(c) Actual round-trip time is 1.54 s, while
constant-speed time would be 1.33 s. (d) Balloon
intended for Lou must be dropped 1.70 s
before Hugh throws the ball; balloon intended
for Hugh must be dropped 1.19 s before Hugh
throws the ball. Yes. 0.05 s. 0.51 s. (e) 0.55 s. (f)
Lou’s tomato strikes first, 5.13 s after it was
thrown (and 4.13 s after missing on the way up);
it hits her at 20.3 m/s. Hugh’s tomato hits
0.44 s later (5.57 s after it was thrown); it hits at
24.6 m/s.


CHAPTER  3 

1. (a) 0.5 s; (b) 8.0 m/s$^2$

3. (a) 0.553 s, 0.553 s, 1.11 s; (b) $v_x = 1.80$ m/s,
$v_y = 5.42$ m/s; (c) 71.6°, 5.71 m/s; (d) 5.71 m/s

5.  46.2  m 

7.  6.14  m/s 

9. (b) yes; (c) no

11. (b) A + B = 5.0 cm at 0°, A - B = 8.66 cm at
90°, B - A = 8.66 cm at 270° = -90°

13.  247  km,  26.6°  south  of  east 

15.  20  cm  at  C,  45  cm  at  D 

17.  D 

19. (a) 5.00 m/s; (b) 0; (c) 2.62 m/s$^2$ toward center
of circle

21. (a) 189 km/h; (b) 5.6° east of north

23. 6.43 m/s = 14.3 mi/h downward, 7.84 m/s =
17.4 mi/h


27. (a) $t_1 = 1.41\,v_0/g$; (b) $t_2 = (1.04\,v_0/g) + \Delta t$;
(c) $t_1 = 5.04$ s, $t_2 = 4.20$ s, relay method is
quicker

29. (a) 0.344%; (b) $5.07 \times 10^{3}$ s = 84.5 min; (c) the
two periods would be equal

31. (c) 53.1°, 5 m/s; (d) 133 m

33. (a) $R_x = (2v_0^2/g)(\sin\theta\cos\theta - \tan\alpha\cos^2\theta)$;
(b) $\theta_{\max} = 45° + \alpha/2$; (c) 60°; (d) 41.2 m/s

35. (a) $\mathbf{w} = v_0\cos\theta\,\hat{\mathbf{x}} + (u_0 - v_0\sin\theta)\,\hat{\mathbf{y}}$, or
$w = \sqrt{u_0^2 + v_0^2 - 2u_0v_0\sin\theta}$ at an angle of
$\sin^{-1}(v_0\cos\theta/w)$ with y axis;

(b) $w = (25.2 - 4.1\sin\theta)^{1/2}$ at angle of
$\sin^{-1}(0.41\cos\theta/w)$, $w_{\max} = 5.4$ m/s,
the maximum angle between $\mathbf{w}$ and y is
$\sin^{-1}(v_0/u_0) = 4.7°$, which occurs when $\theta = 4.7°$ and
also when $\theta = 175.3°$


CHAPTER 4

1. (a) -52 slug; (b) 4.1 slug; (c) divide weight in
pounds by g; (d) 32 lb; (e) 32 slug

3.  0.5  m 

5. (a) $6.25 \times 10^{3}$ N; (b) 213 times the weight

7. $1.86 \times 10^{2}$ N on moon, $1.13 \times 10^{3}$ N on earth

9. (a) $8.0 \times 10^{5}$ N; (c) net force is 4.1% of the
weight

13.  6.67  m/s 

15.  (a)  1.1  m/s;  (b)  to  minimize  the  recoil  speed 

17.  The  ground  exerts  forces  on  the  ball  during 
impact. 

19. (a) 130.7 kg/s. (b) The thrust remains the same
but the rocket’s mass is smaller.

21. (a) They meet halfway between their initial positions.
(b) a length L.

23. Block slides forward when stopping if $a > \mu_s g$.
Block will not slide backward when starting if
$a < \mu_s g$.

25. (a) $\mathbf{v}_{1i} = \mathbf{v}_{1f} + \mathbf{v}_{2f}$. (b) The three vectors form a
closed triangle and therefore lie in a plane.
(d) The vector $\mathbf{v}_{1i}$ is a diagonal of the parallelogram
with sides $\mathbf{v}_{1f}$, $\mathbf{v}_{2f}$. (e) The vector difference
$\mathbf{v}_{1f} - \mathbf{v}_{2f}$ is the other diagonal. (f) The
parallelogram must then be a rectangle, and
$\mathbf{v}_{1f}$ therefore must be perpendicular to $\mathbf{v}_{2f}$.

27. (a) 23 N; (b) $2.9 \times 10^{-2}$ s; 0.58 m

29.  (a)  Vtf  = — u0x/2,  = v0k/2;  (0)  = -|u0x, 

v3/  = ~iu0x 


33. (b) $m_2/m_1 = 2.9$. (c) The data are consistent with
an elastic collision.

35. (a) 4.43 m/s; (b) 3.13 m/s; (c) $3.78 \times 10^{3}$ m/s$^2$;
(d) $3.79 \times 10^{2}$ N; (e) $3.87 \times 10^{2}$

37. $k'' = k_1 + k_2$

39. (a) $x_M = \mu_s L$, $y_M = \mu_s^2 L/2$; (b) $x_M = 8$ m, $y_M = 3.2$ m

41.  4.2  m/s 

43.  (a)  -f  m/s;  (b)  § m/s;  (c)  2 s;  (d)  f m;  (e)  -f  m; 

(/)  o 

45. (a) $|\mathbf{v}_{Bf} - \mathbf{v}_{Af}| = 4.29$ m/s, $|\mathbf{v}_{Bi} - \mathbf{v}_{Ai}| = 17.3$ m/s,
no; (b) 2.48; (c) 15.9 m/s at 59.0°

47. (a) Taking the positive direction to be downward,
the equation reads $Ma = Mg + T_t - T_u$.

49. Letting $F_A$ and $F_B$ represent the frictional forces
at interfaces A and B, (a) $F_A = m_Aa$, $F_B = (m_1 + m_2)[a\cos\theta_B + g\sin\theta_B]$,
$F_A/F_{A\max} = a/\mu_Ag$,
$F_B/F_{B\max} = \dfrac{a/g + \tan\theta_B}{\mu_B[1 - (a/g)\tan\theta_B]}$. (b) There is
slippage at A if $a \ge \mu_Ag$; there is slippage at B if
$a \ge g\left(\dfrac{\mu_B - \tan\theta_B}{1 + \mu_B\tan\theta_B}\right)$. Interface A slips first if
$\mu_A < (\mu_B - \tan\theta_B)/(1 + \mu_B\tan\theta_B)$. (c) The
equations of part b apply with the substitution
$\theta_B \to -\theta_B$. Interface A slips first if
$\mu_A < (\mu_B + \tan\theta_B)/(1 - \mu_B\tan\theta_B)$. (d) $12.1° \le \theta_B \le 38.7°$.


51. (a) Trucker can stop without slippage if
$v_0^2/2\mu_sg < S_0$. (b) $v_0^2/2\mu_sg = 91.8$ m $< S_0$, and so
trucker can stop without slippage. (c) Trucker
can stop without slippage if $v_0\Delta t + v_0^2/2\mu_sg < S_0$.
(d) $v_0\Delta t + v_0^2/2\mu_sg = 107$ m $> S_0$, and so
trucker cannot stop without slippage.


CHAPTER 5


1. 


3. 

5. 

7. 

11. 

13. 


15. 

17. 

fl9. 

i>l£. 

23. 


a = g 

1.2  m/s2 

(a)  3.27  m/s2  downward;  (b)  4.9  m/s2  down- 
ward; (c)  13.1  N;  (d)  9.8  N 

(a)  2.5  N directed  opposite  to  motion;  ( b ) 0.255 

0.857  m/s2  forward 

Est:  fictitious  force  of  0.5  mg  downward,  50%  in- 
crease in  perceived  weight 

(a)  2.25  x 105  N;  (b)  7.5  m/s2 

(a)  128  N;  (b)  64  N 


(a)  aA  = 4.9  m/s2  upward,  ci2  — 0;  (b)  ax  = 

14.7  m/s2 7  upward,  a2  = 2.45  m/s2 *  upward 


(a)  /jls  < tan  0 for  block  to  slide  back;  (b)  Vf/vi  = 
fian  0 — /Akx  1/2 
dan  0 + /xk, 


/tan  0 - /J.k\ 12  , , in  , 

„ , ; (0  1 o.b  m/s 

\tan  0 + ixj  w 


27. $5.07 \times 10^{3}$ s = 84.5 min
29. (a) $\cos\theta = m/M$; (b) $L = \dfrac{T^2gM}{4\pi^2m}$; (c) $T = 2\pi\sqrt{h/g}$

31.  V2 

33. (a) $d_2 = 2d_1$; (b) $a_1 = 2.18$ m/s$^2$, $a_2 = 4.36$ m/s$^2$;
(c) 0.545 N


35.  The  bananas  rise  with  the  same  acceleration  and 
speed  as  the  monkey  until  they  get  stuck  against 
the  pulley. 

37. (a) 9.8 N; (b) $a_2 = 4.9$ m/s$^2$ down, $a_1 = 2.45$
m/s$^2$ down, $a_3 = 2.45$ m/s$^2$ up; (c) 19.6 N;
(d) 1.6 kg

39.  (a)  no 


CHAPTER 6

1. 0.25 kg
3. 1.58 Hz
5. 6.21 cm

7. (a) $x = (10.0\cos 10\pi t - 10.0\sin 10\pi t)$ cm;
(b) $x = 14.1\ \text{cm}\,\cos(10\pi t + \pi/4)$; (c) $x_{\max} = 14.1$ cm,
$v_{\max} = 443$ cm/s, $a_{\max} = 1.39 \times 10^{4}$ cm/s$^2$

11. $2\pi\sqrt{d/g}$

13. 1.72 s

15. (b) $N = N_0e^{Rt}$




19. (a) $\mu_sg/4\pi^2\nu^2$; (b) 1.65 cm

21. (a) The period $T = 2\pi\sqrt{\dfrac{m_1m_2}{(m_1 + m_2)k}}$, the velocities
of 1 and 2 are related by $v_2/v_1 = -m_1/m_2$,
and the amplitudes $A_1$ and $A_2$ are related by
$A_2/A_1 = m_1/m_2$; (c) $T = 2\pi\sqrt{m/2k}$, $v_2 = -v_1$,
$A_2 = A_1$

23. (a) 112 s; (b) 30 s

25. (a) 2r; (b) $\rho Al$; (c) $F = -2\rho gAy$; (d) $2\pi\sqrt{l/2g}$

CHAPTER  7 

1. (a) (1) 120 J, (2) 250 J, (3) 315 J, (4) $6.25 \times 10^{5}$ J,
(5) $1.5 \times 10^{7}$ J; (b) (1) 0.171 m, (2)
0.357 m, (3) 0.45 m, (4) 893 m, (5) 21.4 km

3.  m/M 

5.  42.9° 

9.  (a)  \m(v2  — vf );  ( b ) 2 m(vf  — v2)/s\  ( c ) W = 225  J, 
F = 11.3  N 

11. (a) $9.8 \times 10^{7}$ J; (b) $4.9 \times 10^{7}$ J

13. (a) 866 J; (b) 86.6 N; (c) $-866$ J; (d) 0


27.  (a)  44  J;  (b)  44  J;  (c)  44  J 
29.  (a)  4.90  J;  ( b ) 2.55  J;  (c)  2.35  J 
31.  (a)  4.43  m/s;  ( b ) 0.333 
33.  20  m 

35.  (a)  \/gR;  ( b ) R/ 2;  (c)  ~ = 0.769 

idea] 

37. (a) $x_{\max} = \sqrt{(\mu_kmg/k)^2 + mv_0^2/k} - \mu_kmg/k$;

(b) block returns if $kx_{\max} > \mu_smg$; (c) $x_{\max} =$
0.202 m, block does not return


15. (a) 6.26 m/s; (b) 49 N
19. $W = mgx$


21.  (a) 


2 mBgP 
mA  + mB’ 


7.67  m/s 


23. (a) 783 J; (b) 695 J; (c) $2.82 \times 10^{3}$ J; (d) 371 J;
(e) $2.62 \times 10^{3}$ J; (f) 92.4 J

25. (a) $v_f = v_i - |v_g|\ln(m_f/m_i)$; (b) $m_f = 0.223$

39.  2 mg 

41.  (a)  0.500  J;  (b)  0.500  J;  (c)  0.667  J;  (d)  no 

43. (a) $\tfrac{1}{2}k(h + r - l_0)^2$; (b) $\tfrac{1}{2}k(|h - r| - l_0)^2$;

(c) $2kr(h - l_0)$ if $h > r$, $2kh(r - l_0)$ if $r > h$;

(d) $\tfrac{1}{2}k(\sqrt{h^2 + r^2 + 2hr\cos\theta} - l_0)^2$; (f) the answers
for parts a, b, and d

45.  (a)  0.791  m/s;  ( b ) 0.601  m/s 


CHAPTER 8

1.  206  W,  0.276  hp 

3. (a) 16.3 W/m$^2$; (b) $5.45 \times 10^{-3}$

5.  (a)  2 - T2)\  (6)  27rrn(T1  - T2);  (c)  4.24  x 
104  W = 56.9  hp 

7.  ir2/r1 

9. (a) $(R/r)^3$; (b) 125, 7.84 N
11. (a) 2; (b) 8; (c) 4
13. (a) $NR/r$; (b) 24

15. (a) $MgD/d$; (b) 490 N; (c) $1.63 \times 10^{6}$ N/m$^2$;
(d) 16.1 atm

17. $M/(M + m)$

19. (a) 96.1%; (b) 4.71 N; (c) $6.00 \times 10^{4}$ N/m$^2$ =
0.592 atm


21. (a) 1.98 m/s; (b) 1.01 × 10⁻³ s, 10⁻³ m, yes; (c) 2 mg; (d) 4 × 10⁻⁶ J, 4 × 10⁻³ W; (e) flea: 2 × 10³ W/kg, climber: 4.93 W/kg

23. (a) m1gh; (b) ½(m1 + m2)v² + m2gh; (c) √(2gh(m1 − m2)/(m1 + m2)), (m1 − m2)h/(m1 + m2)


25. (a) v1f = (m1 − m2)v1i/(m1 + m2), v2f = 2m1v1i/(m1 + m2); (b) 4m1m2/(m1 + m2)²; (c) for m2 = m1; (d) 0.284


27. (a) K1i = 3.13 × 10⁻² J, K2i = 0, K1f = 1.25 × 10⁻² J, K2f = 6.29 × 10⁻³ J, |ΔK| = 1.25 × 10⁻² J; (b) |ΔK| = 1.17 × 10⁻² J

31. Est: (a) 2.85 kg·m/s; (b) 4 × 10⁻³ s; (c) 710 N; 18 kW




33.  (a)  and  ( b ):  mxtrA2/ 4 

35. (a) elastic if e = 1, inelastic if 0 ≤ e < 1

37. (a) 14.5°, 37.8°; (b) K_pf = 0.40K_αi = 1.2 × 10⁶ eV


39. (b) ½(m1 + m2)v_i² + ½kd²;
(c) u_f = ωd at angle α with x axis;
(d) v1f = [v_i² + (m2ωd/(m1 + m2))² − 2v_i m2ωd cos α/(m1 + m2)]^(1/2),
β1 = tan⁻¹[m2ωd sin α/(v_i(m1 + m2) − m2ωd cos α)],
v2f = [v_i² + (m1ωd/(m1 + m2))² + 2v_i m1ωd cos α/(m1 + m2)]^(1/2),
β2 = tan⁻¹[m1ωd sin α/(v_i(m1 + m2) + m1ωd cos α)];
(e) t_r = 7.85 × 10⁻² s, K_f = 19.6 J, u_f = 4.00 m/s at α = 110°, v1f = 2.96 m/s at β1 = 30.6°, v2f = 2.54 m/s at β2 = 62.4°




CHAPTER  9

1.  6.89  rad 

5.  (a)  1.57  rad/s2;  ( b ) 1800  revolutions 

7.  (a)  v0/r;  (b)  v0/(R  — r);  ( c ) 20  rad/s,  5 rad/s 

9.  The  day  would  become  longer. 

13.  1700  km  below  the  surface 
15. x_c = 0.5l, y_c = 0.3l

17.  T0  = mg(  1 - r1/r2)  + Mg/ 2 


45. 4MgR(1 − cos θ)/3π + U(0)

47. (a) 26.6°; (b) 45°

49. (a) 225 N; (b) 298 N upward at an angle of 107.6° with top surface of obstacle; (c) F_t = 224 N upward, F_n = 196 N inward, F_t/F_n = 1.14


X51. 


(a)  d0  — tan  1 
(d)  90°;  (e)  fxG 


2/C 


w 


; ( b ) 46.4°;  (c)  51.3°; 


21.  a = —arx 


w r r 


25.  1.5  m 

31.  (a)  Hkh/ 2;  ( b ) w/2\xk;  (c)  0.165  m,  0.758  m 
35. (a) 5.88 × 10³ N; (b) horizontal: 5.09 × 10³ N;


53. 


lV2 


(a)  on  the  angle  bisector,  a distance — j-1  from  71 ; 

(b)  18.4° 


57. 


vertical: 1.96 × 10³ N; (c) 2.29 × 10⁴ N·m, balancing torque is due to forces acting underground


(a)  1/2;  ( b ) 1/4,  SI/ 4;  (c)  //27V, 
/ 


/ £ 1 


39.  (a)  m1y1/(m1  + to2); 

•WlVl 


(b)  m2\ i/ (to j + to2); 


59. 


2 (1  + i + i + • • • + VA0  -9  X; 
5r/3 


n=l 


(c) 


m 1 


■;  (d)  0 

TO2  ,vw. 


y 

CHAPTER  10 


1.  (a)  hoopi  MR2;  square:  iMR2 
3.  19.7  km 
7.  hMA 2 

11.  (a)  3 g sin  0/21;  (b)  3 g Vf  — 2 cos  0 + f cos2  6J  ^ 

13. (a) 2π√(2R/g); (b) 2π√(3R/2g); (c) part a, 15.5% longer

o 

15.  2 rotations  per  second 
17.  (a)  g/ 51;  (b)  50Mg/5l 

If 

(d) 


19.  (a)  ^ sin  0;  (b)  ^ sin  0;  (c)  3^c/4 


25.  27? /5  above  center 
29.  a = 0.134g,  M/m  = 3.05 

31. 1.1 × 10³ J


2 dg 


15.6  rad/s, 


( 33 . )(a)  im0r%(ii$,  0.0790  J;  (b)  up  = 

Uf  /wko  , r . 

Mp/^s  — 0.124^  -v\54Vtc k.  ^ 

w b tool  dm- f live  , 


35.  (a)  (7 R - 27r)/ 10 

37.  (a)  3tf0/4;  (b)  Ri/R0  = 0.823,  Vc/V0  = 0.557 


39. (a) I_e/I_sph = 0.828; (b) 5.86 × 10³³ kg·m²/s; (c) 2.68 × 10⁴⁰ kg·m²/s; (d) 2.19 × 10⁻⁷; (e) 2.14 × 10²⁹ J; (f) 2.66 × 10³³ J; (g) 8.06 × 10⁻⁵


(a)  I = iMR 2 + \mc2. 

( b ) \(H  + 5R/8)M  4-  (H  + R 4-  l)m\/(M  + m). 

, [(H/R  + f)  + (H/R  + 1 + //i?)(m/4|  g 

[2/5  + mc2/2MR2]Rw^  ^ ’ 

(d)  For  maximum  precession  rate,  use  / = fi- 
d/2. For  minimum  precession  rate,  use  I — d/2. 

(e)  1.39. 


3M 

41.  (b)  dM  = — r r2dr ; (c)  dl 


R- 


2 M 

>-  dr-. 


(d)  0 


43.  no 

45. (a) 2l/3 from A; (b) 2π√(2l/3g); (c) 2l/3

47. (a) λ³M1; (b) λ⁵I1; (c) λ = 1.2: M2 = 1.73M1, I2 = 2.49I1; λ = 2.0: M2 = 8M1, I2 = 32I1; λ = 5.0: M2 = 125M1, I2 = 3125I1


51.  Subscripts  h,  p,  and  c refer  to  hanging  weight, 
pulley,  and  cylinder,  respectively. 

4g 

(a)  a/i  = (2  — sin  0), 

°ip  = (2  - sin  6),  ac  = -y|-  (2  - sin  0), 


2£ 


= _ sin  S);  (4)  r» = i§ 


7 + 4 sin  0), 


Mg  n . 1+7  sin  6 

Tc  = -r-  (1+2  sin  0);  (c)  — — — 

5 la  cos  0 


CHAPTER  11


(Note: Small discrepancies between your results and the answers given here may be due to use of differing values for G, the earth’s mass and/or radius, and so on.)

S\.  8.0)5  x 106  m,  rmx  106  ill 
3. 275 m/s²
5. 6.04
7. 3.26 s

9. 9/10 of the way from the earth’s center to the moon’s center, also 9/8 of the way from the earth’s center to the moon’s center, or 3.46 × 10⁵ km and 4.32 × 10⁵ km from earth’s center; 8.6 × 10⁴ km

13. Est: 30 km

15. (a) 2; (b) concave toward the sun
19. (b) 3.30 × 10⁵

23  V(a)  \ rn2orr\ , \(mxr\  + m2r  2)oj2 


•^5.  (a)  re/2h;  ( b ) i 


27.  (a) 


2ttR1R2  jM1R1  + M ,R  2 


\/G  V M XR^  + M2Rj 


G MeM1M2 

j^2 


Mj  + 


r / 

A LY 

m2  y 

x 


Ri 


3 'i)2  + 


Ax3 


1 + 


(c)  -93TT min, 

2.66  x 10"3  N = 3.88  x 10“6  M2g\  (d)  95r0  N 

^9‘  (C)  1$  i Wb 

31.  (a)'*! .27  x 103  km;  ( b ) y = FtO+4  x 104  km,  8 = 

4+54  x 1 03  km,  c = 4+5 rl  x 103  km 
4 OS'  3Md 

33.  (a)  8.19  km/s;  ( b ) 0.64  km/s;  (c)  5.73  km/s; 
(d)  increase  speed  to  6.32  km/s  (an  increase  of 
0.59  km/s);  (e)  141° 


CHAPTER  12

(Note: In the answers to Chapter 12 problems, the units of x and t are assumed to be m and s, respectively, unless otherwise specified.)

3. 0.75 s

5. 24.5 min

7. 3.0 m

9. (a) 10 Hz; (b) 0.4 m, 4 m/s, y = 0.10 m cos(20πt − 5πx)




11. (a) 0.2π sin(10πx − 40πt); (b) 0.628 m/s, transverse displacement is zero.

13. (a) 4.0 m; (b) 80 N; (c) 0.987 J/m; (d) 19.7 W
15. (a) 1.13 × 10⁻⁴ W; (b) 0.113 J
17. 305.6 Hz, 294.6 Hz
23. 0.020 m sin(2πx − 10πt)
25. 42.6 m/s


1. (a) 10 Hz; (b) positive direction: λ = 0.4 m/(8n + 1), v = 4 m/s/(8n + 1), for n = 0, 1, 2, 3, and so on; negative direction: λ = 0.4 m/(8n − 1), v = 4 m/s/(8n − 1), for n = 1, 2, 3, and so on.

37. ρ_E = 5 J/m, S = 760 W for |x − vt| < 0.1 m and zero otherwise; E = 1.0 J

<^9.  (b)  54.4  km 


27. (a) 766 Hz; (b) 837 Hz

29. (a) 71.4 s; (b) 101.1 Hz, 98.9 Hz


1.09  x 103  N 

111  i/ 

A.  3 of  the  distance  from  the  end  of  the  string  (so 
that  the  vibrating  portion  has  f of  its  original 
length) 

-9.  (a)  0.309A,  0.951A,  0.809 h;  (b)  0.809 h,  0.588 h, 

0.309 h;  ( c ) 0,  0,  0 

dl.  (a)  283  Hz;  (b)  850  Hz 

A3.  252  Hz,  260  Hz 

ill  42.5  Hz,  127.5  Hz 

i 3.  317  m/s 

^25.  (a)  A cos(27 rvt  + Sx)  + A cos(27 rvt  + S2)  + 


& -f  8 

A cos(27 Tvt  + 63);  (c)  1 2 — “> 

A[  1 + 2cosi(81  - S2)];(d)  (1)  0°,  3d,  (2)  30°,  2.73d, 

(3)  45°,  2.414,  (4)  60°,  2 A,  (5)  90°,  A;  (f)  If 

/ 81  - 8,\  _ . «.  (8,+  S2\ 

cos  I J < 0 then  use  63  = y 7; j ± 7 r. 

^7.  (a)  ~^=  cos(&7i  - kvt  + 8d  + 

Vn 

-~=cos (kr2  - kvt  + 82);  (c)  — = (7^)  , 

Vr2  r2  V ^2  / 

k(rl  - r2)  = (2n  + 1)77  - (Sj  - 82), 

where  n — 0,  ± 1,  ±2,  . . 

^9.  (a)  i/ra>2  A2 


1/ 

CHAPTER  14 


D d"1,  ~ im );  ™ A':  (“7|="™  °)-  ^ 

B’:  (°’  7J”1)'  c’:  (“  7f  m’0)’D': 

0 ■ vf 

C:  (' 77! 

!'15.  (8)  yL';  (c)  no 

Uzl.  (a)  proper;  (8)  yh,  where  y = 1 /V 1 — u2/c2; 
(c)  yuh;  (d)  improper;  (c)  t2/y;  (/)  (t2/y)  - ytx; 
(g)  yvti 




1.  0.995c 
3.  0.948c 

5. (a) 1.67 h; (b) 1.33 h, 3 h; (c) 4 h, 5 h
7. (a) √2; (b) 1/√2
9.  1.0  percent 
11.  (a)  70.9°;  ( b ) 0.917  m 
13.  (a)  A':  (im,  im),  B (—im,  im),  C': 


(-im,  -im),D'\  (im,  ~im),A:  (77^  m’ 
B:  ( — ~ m,  irnj,  C:  7—  m,  ~imj, 




23.  Xa)  c - V;  {v(l/(c  - V);  #)  c + V; 

id)  L/(c  + V),2 Lc/(c2  - V2)-,\Sf)2L/(c2  - V2)1'2; 
0(h)  LV2/c2\\^f)  LV2/kc2lFj)  h (A)  Tofxa  fringe 

x25.  (a)  ljus/v'l  - 1 4/c2;  ^ ^ 

300  m 


(6)  (c/v)/V  i - u!/c2  = 


Vi  - yf/c2 


, w / v /.  /-j rr?  (300  m)(vs/c) 

(c)  (vs/v)/\/\  - vi/C  - — ; 

X 1 - Vs/c- 


(d) 


(c  - Vs)  _ 


'V  1 - v2/c2 


300  m 


/c 


1 + \vs\/c 


; (/)  0.5 


/ 1 ± 

vs\/c 

/ 1 + 

Vs\/C 

/u,s,  2 MHz;  (g)  2 /its;  (ft)U^  ,=  i/ 
with  upper  signs  for  motion  towards  the  earth 
27. (a) 0.368c; (b) 88.3 MHz; (c) 82.1 MHz


^31.  (a)  - 173  s,  no;  ( b ) yes,  3 c/7;  (c)  At  > 2100  s = 

4^rSS0.SX3CO 

(?2  - 


35  min;  (d) a2-92  x 1 ; 


t^2 


* 

33.  (a)  c2(t20t[)2 

35. (b) in the direction of the earth’s motion; (d) 20.6 arc seconds


02 


A) 


9.  (a)  2moC2/\/\  - uVc2;  (ft)  2woC2/Vl  - u2/c2; 
(c)  2w„c2 

13. 9.34 × 10⁻¹³ J

15. 6.75 × 10⁻⁴ u·c² = 1.01 × 10⁻¹³ J

9 

V3 


m0c\ 

F ’ 

a = ( F/m0 )/ 


(d)  v = ( Ft/m0 )/ \l  1 + 

3/2 


1 


Ft  \2 

moc) 


yes. 


19.  ( 1 +~  \ m0n-  c2  = 4.03  x 1 0“n  J, 
1-^1  m0n  ■ c2  = 0.29  x 10~n  J 
21.  Ek  = 1.47  x lO”10  J,  = 1.25  x lO”10  J 


29.  (c)  78.5° 

33. 6.95 × 10⁻¹⁰ J
37.  (e)  20  percent 




Index 


Aberration,  stellar,  656 
Acceleration,  35,  77 
angular,  340,  350 
centripetal,  79,  84,  89 
centripetal  component,  343 
constant,  40 
of  gravity,  46,  181,  451
tangential,  79 
tangential  component,  342 
Acceleration  transformation,  Galilean,  100 
Acceleration  vector,  74 
Acoustic  intensity,  592 
Acoustical  interferometer,  551
Acoustics: 
physical,  591 
physiological,  591 
psychological,  591 
Action-reaction  pair  of  forces,  124 
Air  table,  10 
Air  track,  213 
Algebraic  addition,  61 
Amplitude,  215,  231 
Amplitude  coefficient,  596 
Analytical  solution  of  differential  equation,  228 
Angular  acceleration,  340,  350 
Angular  frequency,  231 
Angular  momentum,  351, 352 
conservation  of,  366 
total,  364 

Angular  speed,  348 
Angular  velocity,  339,  348 
precessional,  411
spin,  411


Anharmonic  oscillator,  236 
Antinode,  562 
Aphelion,  473 
Areal  density,  579 
Astronomical  unit,  472 
Atomic  mass  unit,  681 
Attractive  force,  359 
Attwood’s  machine,  164 
Aural  harmonics,  604 
Average  velocity,  28 
Axial  vector,  345 
Axis,  15 


Balance  point,  376 
Ballistic  pendulum,  319 
Banking  of  curve,  94 
Barycenter,  386 
Beats,  602 
Binding  energy,  681 
Block  and  tackle,  306 
Bob,  pendulum,  219 
Body,  74 

at  end  of  spring,  212 
Bombarding  particle,  687 
Bond: 

covalent,  682

ionic,  682 

Bound  system,  324,  484 
Boundary  condition,  565 
Brake  horsepower,  331 
Bridge  of  stringed  instrument,  609 




Center  of  gravity,  376 

Center  of  mass,  375 

Center  of  percussion,  440 

Centimeter,  18 

Central  force,  354 

Centrifugal  force,  124,  174 

Centripetal  acceleration,  79,  84,  89 

Centripetal  acceleration  component,  343 

Centripetal  force,  124 

Chain  reaction,  685 

Circular  membrane,  standing  waves  on,  579 

Circular  motion,  uniform,  84 

Circular  orbits,  459 

Circular  waves,  535 

Coefficient: 

of  amplitude,  596 
of  damping,  240 
of  drag,  153 
of  kinetic  friction,  149 
of  restitution,  334 
of  static  friction,  148 
of  viscosity,  152 
Coherence  length,  622 
Collision,  310,  311
elastic,  129,  314,  315 
inelastic,  129,  315 
Combination  tone,  605 
Commutativity,  69 
Component  of  a vector,  61 
Concave  downward,  39 
Concave  upward,  39 
Conic  section,  460 
Conical  pendulum,  92 
Conservation: 

of  angular  momentum,  366 
of  mass,  126 
of  mass-energy,  674 
of  momentum,  126,  142,  658
of  potential  energy,  287 
of  total  mechanical  energy,  261, 289 
Conservative  force,  279,  281 
Consonant  notes,  601 
Constant  acceleration,  40 
Constant  velocity,  22 
Constituent  vector,  64 
Constraint,  166 
workless,  280 

Constructive  interference,  551 
Contact  force,  152 
Contact  friction,  147,  173 
Contracted  length,  636 
Contraction: 

Fitzgerald-Lorentz,  636 
length,  633,  636 
Convergence,  limits  of,  26 
Convergence  of  sequence,  26 
Coordinate,  15 
Coriolis  force,  183 
Cosmic  radiation,  637 
Couple,  384 

Coupled  differential  equations,  470 
Covalent  bond,  682 
Critical  angle,  frictional,  150 
Critically  damped  oscillator,  245 


Cross  product,  346 

right-hand  rule  for,  346 
Cycle,  214 


d’Alembert  force,  177 
Damped  oscillator,  236,  325 
Damped  oscillator  equation,  237 
Damping,  236 
Damping  coefficient,  240 
Damping  force,  236 
Decibel,  592 
Definite  integral,  266 
Density,  137 
areal,  579 
linear,  516 

Dependent  variable,  30 
Derivative,  26,  30 
partial,  293 
second,  35 
Derived  unit,  116
Destructive  interference,  551 
Difference  tone,  605 
Differential,  74 
Differential  equation: 
analytical  solution  of,  228 
linear,  225 
nonlinear,  225 
numerical  solution  of,  225 
ordinary,  225 
partial,  519 
second-order,  225 
Differential  pulley,  307 
Differentiation,  30 
Dilation,  time,  633,  635 
Dimensional  analysis,  45 
Dimensions,  45 
Directrix,  477 
Discrete  spectrum,  596 
Displacement,  76 
Dissociation  energy,  324 
Dissociation  separation,  324,  680 
Dissonant  notes,  601 
Distortion  of  spring,  145 
Doppler  effect,  538,  539,  541
relativistic,  654 
Doppler  shift,  542 
Dot  product,  273 
Drag,  152 
coefficient  of,  153 


Earth  satellite,  89 
Elastic  collision,  129,  314,  315
Elastic  scattering,  688 
Electromagnetic  force,  5 
Electromagnetic  radiation,  143 
Ellipse,  459 

Empirical  procedure,  143 
Energy,  257 
binding,  681 

in  chemical  reactions,  680 
dissociation,  324 
in  gravitational  orbits,  478 



gravitational  potential,  258,  480 
ionization,  680 
kinetic,  257,  270,  276 
potential,  258,  287 
relativistic  kinetic,  670 
rest-mass,  671
in  rotational  motion,  427 
thermal,  262 

total  mechanical,  260,  288 
total  relativistic,  671
in  waves,  524 
Energy  conservation,  287 
in  relativistic  mechanics,  674 
Energy  density: 
average,  534,  536,  537 
kinetic,  524 
potential,  526 
total,  527 
Energy  flux,  529 
average,  534,  536,  537 
Engine,  reaction,  186 
Equation  of  motion,  166 
Equilibrium: 
neutral,  381 
position  of,  208 
stable,  207,  379 
static,  371 
unstable,  211,  380
Equilibrium  separation,  324,  681 
Equivalence  of  inertial  frames,  627,  628 
Escape  speed,  482 
Event(s),  629 
simultaneous,  629 
Exponential  function,  240 


Fictitious  force,  174,  177,  180 

Fission,  683 

Fission  barrier,  684 

Fission  fragment,  683 

Fitzgerald-Lorentz  contraction,  636 

Fluid  friction,  147

Fluid  friction  force,  152 

Focus,  459 

Force(s),  138 

action-reaction  pair  of,  124 
attractive,  359 
central,  354 
centrifugal,  124,  174 
centripetal,  124 
conservative,  279,  281 
contact,  152 
contact  friction,  148 
Coriolis,  183 
d'Alembert,  177 
electromagnetic,  5 
fictitious,  174,  177,  180 
fluid  friction,  152 
gravitational,  5,  144 
net,  139 
normal,  148 
reaction,  124 
relativistic,  666 


repulsive,  359 
restoring,  146 
short  range,  684 
spring,  145 
strong  nuclear,  5,  684 
tension,  514 
velocity-dependent,  147 
weak  nuclear,  5,  679 
work  done  by,  264 
workless  constraint,  280 
Force  center,  354 
Force  constant,  146,  213
Form  invariance,  655 
Foucault  pendulum,  186 
Fourier  analysis,  596 
Fourier  component,  595 
Fourier  expansion,  595 
Fourier  spectrum,  596 
Fourier  synthesis,  594 
Frame  of  reference,  56 
Free-body  diagram,  161,  162 
Free  fall,  47 
Frequency,  216,  231 
angular,  231 
fundamental,  562 
Friction: 

coefficient  of  kinetic,  149 
coefficient  of  static,  148 
contact,  147,  173 
fluid,  147 
Fringes,  621

Fuel  consumption  rate,  189 
Fulcrum,  305 

Function,  mathematical,  16
Fundamental  frequency,  562 
Fundamental  theorem  of  calculus,  267 
Fundamental  unit,  116


Galilean  acceleration  transformation,  100 
Galilean  position  transformation,  99 
Galilean  velocity  transformation,  99 
Gaussian  integral,  301 
Gram,  115
Gravitational  acceleration,  46,  181,  451
Gravitational  constant,  universal,  447,  455,  458 
Gravitational  force,  5,  144 
Gravitational  potential  energy,  258,  480 
Graviton,  677 
Gyration  radius,  404,  405 


Harmonic  frequency,  570 
Harmonic  function,  230 
Harmonic  mode,  568 
Harmonic  oscillator,  230 
Harmonic  oscillator  equation,  230 
Harmonics,  568 
aural,  604 

Heavily  damped  oscillator,  239 
Hertz,  216 




Hooke’s  law,  146,  213 
rotational,  409 
Horsepower,  303 

Impulse,  310,  311
Impulse-momentum  relation,  311
Independent  variable,  30 
Inelastic  collision,  129,  315
Inelastic  scattering,  688 
Inertial  reference  frame(s),  108,  112
equivalence  of,  628 
Initial  conditions,  231
Initial  values,  22 
Integral,  266 
Integration,  39,  266 
limits  of,  266 
Interference: 
constructive,  551
destructive,  551
Interferometer: 
acoustical,  551 
Michelson,  620 

Intermodulation  distortion,  605 
Interval,  musical,  601 
Inverse-square  law,  447 
Ionic  bond,  682 
Ionization  energy,  680 
Isolated  system,  126 


Joule,  255 


Kepler’s  laws,  459 
Kilogram,  115
Kilometer,  18 
Kinematics,  14 
relativistic,  619 
rotational,  338 
translational,  337 
traveling-wave,  501 
Kinetic  contact  friction  force,  149 
Kinetic  energy,  257,  270,  276 
relativistic,  670 
Kinetic  energy  density,  524

Length: 

measurement  of,  16 
proper,  636 

Length  contraction,  633,  636 
Lever,  305 

Light,  speed  of,  619,  623 

Lightly  damped  oscillator,  239,  325 

Limit  of  convergence,  26 

Limits  of  integration,  266 

Line  of  nodes,  590 

Line  spectrum,  596 

Linear  density,  516 

Linear  differential  equation,  225 

Longitudinal  wave,  531 

Lorentz  contraction,  636 

Lorentz  momentum-energy  transformation,  694 


Lorentz  position-time  transformation,  640,  645 
Lorentz  velocity  transformation,  648,  650 
Loudness,  592 


Mach  angle,  548 
Mach  cone,  548 
Mach  number,  548 
Machine,  305 
Major  axis,  459,  460 
Mass,  114,  133
conservation  of,  126

inertial  and  gravitational  aspects  of,  137 
reduced,  465 
relativistic,  663 
rest,  663 

Mass-energy,  conservation  of,  674 
Mass  ratio,  133 
Mass  spectroscopy,  690 
Mass  unit,  atomic,  681 
Mechanical  advantage,  306 
Mechanical  energy,  total,  260,  288 
conservation  of,  261,  289 
Mechanics,  4 
newtonian,  4 
quantum,  4 

relativistic,  4,  618,  657,  674
rotational,  338 
translational,  337 
Meter,  17 

Michelson  interferometer,  620 
Michelson-Morley  experiment,  626 
Millimeter,  18 
Minor  axis,  460 
Mode,  567 
Modulated  tone,  603 
Moment  of  inertia,  392,  394,  395,  398 
Momentum,  133 
angular,  351,  352 
relativistic,  662 
total  angular,  364 

Momentum  conservation,  126,  142,  658 
Momentum-energy  transformation,  Lorentz,  694 
Multidimensional  wave,  531 
Muon,  638 


Net  force,  139 
Neutral  equilibrium,  381 
Neutrino,  676 
Newton,  116
Newtonian  domain,  4 
Newtonian  mechanics,  4 
Newton’s  first  law  of  motion,  108,  112 
Newton’s  second  law  of  motion,  113,  114,  138,  139,  666

rotational  form,  360,  366 
Newton’s  third  law  of  motion,  113,  123,  142,  143 
rotational  form,  370 
Newton’s  universal  gravitation  law,  448 
Nodal  circle,  588 
Nodes,  561
line  of,  590 

Noninertial  reference  frame,  174 




Nonlinear  differential  equation,  225 
Nonquantum  domain,  4 
Nonrelativistic  domain,  4 
Normal  direction,  120 
Normal  force,  148 
Normal  probability  integral,  301 
Nuclear  force: 
strong,  5,  684 
weak,  5,  679 

Nuclear  magnetic  resonance,  417
Nuclear  reaction,  687 
Q value  for,  686 
Nuclear  reactor,  685 
Null  vector,  69 

Numerical  solution  of  differential  equation,  225 
Nutation,  417 


Octave,  601 

Operational  definition,  137
Orbit,  minimum  earth,  90 
Orbit  stability,  487 
Ordered  triplet  of  scalars,  72 
Ordinary  differential  equation,  225 
Origin,  15 
Oscillator: 

anharmonic,  236
critically  damped,  245 
damped,  236,  325 
harmonic,  230 

heavily  damped  or  overdamped,  239 
lightly  damped  or  underdamped,  239,  325 
Oscillator  equation: 
damped,  237 
harmonic,  230 
Oscillatory  motion,  207 
Overdamped  oscillator,  239 


Pair  production,  693 
Parabolic  trajectory,  80 
Parallax  angle,  445 
Parallel-axis  theorem,  402 
Parameter,  57,  214 
Parametric  equation,  57 
Partial  derivative,  293 
Partial  differential  equation,  519 
Particle,  352 
Pendulum,  29 
ballistic,  319 
conical,  92 
Foucault,  186 
physical,  406 
simple,  219 
torsion,  409 

Pendulum  equation,  221 
Perigee,  482 
Perihelion,  473 
Period,  19,  215,  231 
Perturbation,  485 
Phase,  554 
Phase  constant,  231
Phase  lag,  235 
Photon,  677 


Physical  acoustics,  591
Physical  pendulum,  406 
Physics,  1 

Physiological  acoustics,  591
Pion,  638 
Pitch,  332,  593 
Planck  function,  301 

Position-time  transformation,  Lorentz,  640,  645 
Position  transformation,  Galilean,  99 
Position  vector,  74 
Potential  energy,  258,  287 
gravitational,  258,  480 
Potential  energy  density,  526 
Power,  302 
Precession,  411, 493 
Precessional  angular  velocity,  411
Principal  axis,  402 
Product  particle,  687 
Projectile,  55 
Projectile  motion,  54 
Propagation  medium,  513 
Propagation  of  waves,  513
Proper  length,  636 
Proper  time,  635 
Psychological  acoustics,  591 


Q factor,  328 
Q value,  687 
Q-value  equation,  688 
Quality  factor,  328 
Quantum  domain,  4 
Quantum  mechanics,  4 

Radian,  34 
Radiation: 
cosmic,  637 
electromagnetic,  143 
Range  of  projectile,  82 
Reaction  engine,  186 
Reaction  force,  124 
Reaction  torque,  370 
Reactor,  nuclear,  685 
Reduced  mass,  465 
Reed, 609 

Reference  frame,  56 
inertial,  108,  112,  628
noninertial,  174 
Reference  position,  258,  287 
Reflection  of  waves,  557 
Relativistic  domain,  4,  618 
Relativistic  Doppler  effect,  654 
Relativistic  energy,  total,  671
Relativistic  force,  666 
Relativistic  kinematics,  619 
Relativistic  kinetic  energy,  670 
Relativistic  mass,  663 
Relativistic  mechanics,  4,  618,  657 
energy  conservation  law  of,  674
Relativistic  momentum,  662 
Relativistic  particle,  677 
Relativity: 

general  theory  of,  618 


principle  of,  628 
special  theory  of,  618 
Repulsive  force,  359 
Residual  nucleus,  687 
Rest  mass,  663 
Rest-mass  energy,  671
Restitution,  coefficient  of,  334 
Restoring  force,  146 
Revolution,  444 
Right-hand  rule: 

for  cross  product,  346 
for  rotation  vector,  344 
Rigid  body,  339 
Ripple  tank,  535 
Roche’s  limit,  497 
Rocket,  186 

Rotation,  338,  363,  444 
Rotation  angle,  339 
Rotation  axis,  338 
Rotation  vector,  344 
right-hand  rule  for,  344 
Rotational  kinematics,  338 
Rotational  mechanics,  338 


Satellite,  89 

minimum-orbit  earth,  89 
Scalar(s),  61 
ordered  triplet  of,  72 
signed,  80 
Scalar  product,  273 
Scaling  property,  215 
Scattered  particle,  317 
Scattering,  688 
elastic,  688 
inelastic,  688 
Scattering  angle,  317
Second,  20,  21 
Second  derivative,  35 
Second-order  differential  equation,  225 
Semimajor  axis,  459 
Sense,  170,  339 
Separation: 
dissociation,  324,  680 
equilibrium,  324,  681 
Separation  constant,  573 
Separation  of  variables,  573 
Short  range  force,  684 
Sidereal  period,  445 
Signed  scalar,  80 
Significant  figure,  18 
Simultaneity,  629 
Simultaneous  events,  629 
separated,  630 
Sinusoid,  228 
Skydiver,  190 
Sleeping  top,  417 
Slope  of  curve,  23 
Slug,  115

Solar  day,  mean,  20 
Sound,  speed  of,  611
Sound  wave,  533 
Sounding  board,  609 


Source  power,  536 

Space-independent  wave  function,  511
Specific  impulse,  189
Specific  thrust,  189
Spectroscopy,  mass,  690 
Spectrum: 
discrete,  596 
Fourier,  596 
line,  596 
Speed,  27 
angular,  348 
of  light,  619,  623 
of  sound,  611
terminal,  153 
Spherical  wave,  537 
Spin  angular  velocity,  411
Spring  constant,  146 
Spring  force,  145 
Stability  of  orbit,  487 
Stable  equilibrium,  207,  379
position  of,  208 
Standing  wave,  561
Standing-wave  mode,  567 
Standing-wave  solution  to  wave  equation,  570 
Static  contact  friction  force,  148 
Static  equilibrium,  371 
Stellar  aberration,  656 
Stokes'  law,  152 
Strong  nuclear  force,  5,  684 
Sum  tone,  605 
Superposition,  551
principle  of,  550 
of  waves,  549 
Synchronization,  630 
System,  1 


Tangential  acceleration,  79 

Tangential  acceleration  component,  342 

Tangential  velocity  component,  342 

Target  nucleus,  687 

Tension,  579 

Tension  force,  514 

Terminal  speed,  153 

Terminal  velocity,  198 

Thermal  energy,  262 

Thought  experiment,  631 

Threshold  of  hearing,  592 

Time: 

measurement  of,  19 
proper,  635 
Time  dilation,  633,  635 
Time-independent  wave  function,  511
Top,  411
Torque,  358,  360 
Torsion  balance,  456 
Torsion  constant,  409 
Torsion  fiber,  409 
Torsion  pendulum,  409 
Trajectory,  56 
parabolic,  80 
Translation,  337 
Translational  kinematics,  337 
Translational  mechanics,  337 




Transverse  wave,  502 

Traveling  wave,  500 

Traveling-wave  kinematics,  501 

Traveling-wave  solution  to  wave  equation,  518 

Turbulent  flow,  153 

Turning  point,  323 

Two-body  problem,  465 


Unbound  system,  325,  484 

Underdamped  oscillator,  239 

Underdetermined  structure,  379 

Uniform  motion,  15 

Uniform  precessional  motion,  411

Unison,  603 

Unit: 

derived,  116
fundamental,  116
Unit  vector,  64 

Universal  gravitational  constant,  447,  455,  458 
Unstable  equilibrium,  211,  380 
position  of,  211


Variable: 

dependent,  30 
independent,  30 
Vector,  61 

acceleration,  74 

axial,  345 

component  of,  61 

constituent,  64 

direction  of,  62 

division  of,  by  scalar,  71

magnitude  of,  62 

multiplication  of,  by  scalar,  71 

negative  of,  69 

null,  69 

position,  74 

projection  of,  73 

rotation,  344 

unit,  64 

velocity,  74 

zero,  69 

Vector  addition,  65 
Vector  difference,  70 
Vector  product,  346 
Vector  sum,  64 
Velocity,  26,  75 
angular,  339,  348,  411
average,  28 
constant,  22 


direction  of,  23 
magnitude  of,  23 
tangential  component  of,  342 
terminal,  198 
wave,  504 

Velocity-dependent  force,  147 
Velocity  transformation: 
Galilean,  99 
Lorentz,  648,  650 
Velocity  vector,  74 
Viscosity,  coefficient  of,  152 


Watt,  302 
Wave(s): 
circular,  535 
energy  in,  524 
longitudinal,  531 
multidimensional,  531 
propagation  of,  513
reflection  of,  557 
sound, 533 
spherical,  537 
standing,  561 
superposition  of,  549 
transverse,  502 
traveling,  500 
Wave  equation,  516 

for  circularly  symmetrical  wave,  582 
standing-wave  solution  to,  570 
traveling-wave  solution  to,  518 
Wave  front,  535 
Wave  function,  503 
space-independent,  511
time-independent,  511
Wave  number,  512 
Wave  pulse,  501 
Wave  train,  507 
Wave  velocity,  504 
Waveform,  596 
Wavelength,  508
Weak  nuclear  force,  5,  679 
Weight,  118,  144 
Work,  255,  266,  270,  275,  668
Work-kinetic  energy  relation,  277 
Work-potential  energy  relation,  288 
Workless  constraint,  280 


Zero  vector,  69 




Important Physical Constants*

Quantity                                            Symbol           Value
Universal gravitational constant                    G                6.67 × 10⁻¹¹ N·m²/kg²
Speed of light                                      c                3.00 × 10⁸ m/s
Permeability of free space (by definition)          μ0               4π × 10⁻⁷ T·m/A
                                                    μ0/4π            1 × 10⁻⁷ T·m/A
Permittivity of free space                          ε0 (= 1/μ0c²)    8.85 × 10⁻¹² C²/N·m²
                                                    1/4πε0           8.99 × 10⁹ N·m²/C²
Elementary charge (magnitude of electron charge)    e                1.60 × 10⁻¹⁹ C
Boltzmann's constant                                k                1.38 × 10⁻²³ J/K
Avogadro's number                                   A                6.02 × 10²⁶ kmol⁻¹
Universal gas constant                              R (= Ak)         8.31 × 10³ J/kmol·K
Faraday's constant                                  ℱ (= Ae)         9.65 × 10⁷ C/kmol
Electron rest mass                                  m_e              9.11 × 10⁻³¹ kg
Electron rest mass energy                           m_e c²           5.11 × 10⁵ eV
Electron charge/mass ratio                          e/m_e            1.76 × 10¹¹ C/kg
Proton rest mass                                    m_p              1.67 × 10⁻²⁷ kg
Proton rest mass energy                             m_p c²           9.38 × 10⁸ eV
Planck’s constant                                   h                6.63 × 10⁻³⁴ J·s
Bohr magneton                                       μ_B              9.27 × 10⁻²⁴ A·m²

* More precise values are cited in text; see index.
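The table defines several constants in terms of others (ε0 = 1/μ0c², R = Ak, ℱ = Ae). The short Python sketch below is one way to check that the rounded three-figure entries are mutually consistent. The variable names, the dictionary layout, and the 1 percent tolerance are illustrative assumptions, not part of the text; small mismatches are expected because every input is rounded.

```python
import math

# Rounded three-figure inputs copied from the table above (SI units, kmol-based).
c = 3.00e8                  # speed of light, m/s
mu0 = 4 * math.pi * 1e-7    # permeability of free space, T·m/A
k = 1.38e-23                # Boltzmann's constant, J/K
A = 6.02e26                 # Avogadro's number, 1/kmol
e = 1.60e-19                # elementary charge, C
m_e = 9.11e-31              # electron rest mass, kg

# Quantities the table expresses in terms of the inputs above.
derived = {
    "eps0": 1 / (mu0 * c**2),                   # permittivity of free space
    "coulomb_const": mu0 * c**2 / (4 * math.pi),  # 1/(4*pi*eps0)
    "R": A * k,                                 # universal gas constant
    "faraday": A * e,                           # Faraday's constant
    "electron_rest_energy_eV": m_e * c**2 / e,  # m_e c^2 converted from J to eV
}

# The corresponding rounded values as printed in the table.
tabulated = {
    "eps0": 8.85e-12,
    "coulomb_const": 8.99e9,
    "R": 8.31e3,
    "faraday": 9.65e7,
    "electron_rest_energy_eV": 5.11e5,
}

def relative_errors():
    """Return the fractional difference between derived and tabulated values."""
    return {
        name: abs(derived[name] - tabulated[name]) / tabulated[name]
        for name in tabulated
    }

if __name__ == "__main__":
    for name, err in relative_errors().items():
        print(f"{name}: differs from table by {100 * err:.2f}%")
```

Each derived value should land within a fraction of a percent of the printed entry; a larger gap would point to a transcription error in one of the inputs.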

Mathematical  Symbols 

∝    is proportional to
=    is equal to
≠    is not equal to
≃    is approximately equal to
≡    is identical to
>    is greater than
<    is less than
≥    is greater than or equal to
≤    is less than or equal to
|z|  the absolute value of z
⟨z⟩  the average value of z


Powers-of-Ten  Notation 


Power of ten    Equivalent value       Prefix    Symbol
10⁻¹²           0.000 000 000 001      pico      p
10⁻⁹            0.000 000 001          nano      n
10⁻⁶            0.000 001              micro     μ
10⁻³            0.001                  milli     m
10⁻²            0.01                   centi     c
10⁻¹            0.1                    deci      d*
10⁰             1                      -         -
10¹             10                     deka      da*
10²             100                    hecto     h*
10³             1000                   kilo      k
10⁶             1 000 000              mega      M
10⁹             1 000 000 000          giga      G
10¹²            1 000 000 000 000      tera      T

* Prefix and symbol not currently in wide use in the physical sciences.


The Greek Alphabet

Alpha      Α  α        Nu         Ν  ν
Beta       Β  β        Xi         Ξ  ξ
Gamma      Γ  γ        Omicron    Ο  ο
Delta      Δ  δ        Pi         Π  π
Epsilon    Ε  ε        Rho        Ρ  ρ
Zeta       Ζ  ζ        Sigma      Σ  σ
Eta        Η  η        Tau        Τ  τ
Theta      Θ  θ        Upsilon    Υ  υ
Iota       Ι  ι        Phi        Φ  φ
Kappa      Κ  κ        Chi        Χ  χ
Lambda     Λ  λ        Psi        Ψ  ψ
Mu         Μ  μ        Omega      Ω  ω

PHYSICS  volume  II 

Foundations  and  Applications 

ROBERT  M.  EISBERG 
LAWRENCE  S.  LERNER 




Selected Physical Quantities

Quantity                             Typical       SI unit                    Dimensions
                                     symbol for
                                     magnitude

Mass                                 m             kilogram, kg               M
Length                               l             meter, m                   L
Time                                 t             second, s                  T
Velocity                             v             m/s                        LT⁻¹
Acceleration                         a             m/s²                       LT⁻²
Angle                                φ, θ          radian, rad                dimensionless
Angular frequency, angular velocity  ω             rad/s                      T⁻¹
Angular acceleration                 α             rad/s²                     T⁻²
Frequency                            ν             hertz, Hz (= s⁻¹)          T⁻¹
Momentum, impulse                    p, I          kg·m/s                     MLT⁻¹
Force                                F             newton, N (= kg·m/s²)      MLT⁻²
Work, energy                         W, E, K, U    joule, J (= N·m)           ML²T⁻²
Power                                P             watt, W (= J/s)            ML²T⁻³
Angular momentum                     L             kg·m²/s                    ML²T⁻¹
Torque                               T             N·m                        ML²T⁻²
Moment of inertia                    I             kg·m²                      ML²
Stress, pressure                     σ, p          pascal, Pa (= N/m²)        ML⁻¹T⁻²
Elastic moduli                       Y, B, G       Pa                         ML⁻¹T⁻²
Compressibility                      κ             Pa⁻¹                       M⁻¹LT²
Viscosity                            η             Pa·s                       ML⁻¹T⁻¹
Temperature                          T             kelvin, K                  dimensionless
Heat                                 H             J                          ML²T⁻²
Entropy                              S             J/K                        ML²T⁻²
Electric charge                      q             coulomb, C                 C
Electric flux                        Φₑ            N·m²/C                     ML³T⁻²C⁻¹
Electric field                       ℰ             N/C (= V/m)                MLT⁻²C⁻¹
Electric potential, emf              V             volt, V                    ML²T⁻²C⁻¹
Electric dipole moment               p             C·m                        LC
Capacitance                          C             farad, F (= C/V)           M⁻¹L⁻²T²C²
Electric current                     i             ampere, A (= C/s)          T⁻¹C
Current density                      j             A/m²                       L⁻²T⁻¹C
Conductance                          S             siemens, S (= A/V)         M⁻¹L⁻²TC²
Resistance                           R             ohm, Ω (= V/A)             ML²T⁻¹C⁻²
Conductivity                         σ             S/m                        M⁻¹L⁻³TC²
Resistivity                          ρ             Ω·m                        ML³T⁻¹C⁻²
Magnetic field                                     tesla, T                   MT⁻¹C⁻¹
Magnetic flux                                      weber, Wb (= T·m²)         ML²T⁻¹C⁻¹
Magnetic dipole moment               m             A·m²                       L²T⁻¹C
Magnetic pole strength                             A·m                        LT⁻¹C
Magnetic permeability                μ             T·m/A                      MLC⁻²
Magnetization                        M             A/m                        L⁻¹T⁻¹C
Inductance                           L             henry, H (= V·s/A)         ML²C⁻²
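The Dimensions column lends itself to a simple consistency check. The following is a minimal sketch, not part of the original table: it assumes each dimension is stored as a tuple of exponents of M, L, T, and C, and the helper names `Dim`, `mul`, and `div` are hypothetical. Multiplying quantities adds exponents; dividing subtracts them.

```python
from collections import namedtuple

# Exponents of mass, length, time, and charge, matching the table's M, L, T, C.
Dim = namedtuple("Dim", "M L T C")


def mul(a, b):
    """Dimensions of a product: exponents add."""
    return Dim(*(x + y for x, y in zip(a, b)))


def div(a, b):
    """Dimensions of a quotient: exponents subtract."""
    return Dim(*(x - y for x, y in zip(a, b)))


# Entries copied from the table.
LENGTH = Dim(0, 1, 0, 0)
FORCE = Dim(1, 1, -2, 0)
ENERGY = Dim(1, 2, -2, 0)
CHARGE = Dim(0, 0, 0, 1)
CURRENT = Dim(0, 0, -1, 1)
POTENTIAL = Dim(1, 2, -2, -1)
RESISTANCE = Dim(1, 2, -1, -2)
CAPACITANCE = Dim(-1, -2, 2, 2)

# Each check restates a unit definition given in the SI unit column.
checks = {
    "J = N·m": mul(FORCE, LENGTH) == ENERGY,
    "V = J/C": div(ENERGY, CHARGE) == POTENTIAL,
    "ohm = V/A": div(POTENTIAL, CURRENT) == RESISTANCE,
    "F = C/V": div(CHARGE, POTENTIAL) == CAPACITANCE,
}

for name, ok in checks.items():
    print(name, "consistent" if ok else "INCONSISTENT")
```

A check of this kind catches a mistyped exponent in the table, but it cannot detect a wrong numerical factor, since dimensions carry no information about magnitude.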


Selected  Non-SI  Units  and  Conversion  Factors 

1 degree of arc (°) = (2π/360) rad = (π/180) rad = 0.0175 rad
1 minute of arc (′) = 2.91 × 10⁻⁴ rad
1 second of arc (″) = 4.85 × 10⁻⁶ rad

1 day (d) = 86 400 s
1 year (yr) = 3.156 × 10⁷ s

1 Angstrom unit (Å) = 0.1 nm = 10⁻¹⁰ m
1 inch (in) = 2.54 cm
1 foot (ft) = 0.3048 m
1 mile (mi) = 1.61 km
1 light year (ly) = 9.46 × 10¹⁵ m

1 liter (l) = 10⁻³ m³
1 cm³ = 10⁻⁶ m³

1 atomic mass unit (u) = 1.661 × 10⁻²⁷ kg
1 slug = 14.59 kg

1 mole (mol) = 10⁻³ kmol

1 dyne (dyn) = 10⁻⁵ N
1 pound weight (lb) = 4.45 N

1 bar = 10⁵ Pa
1 atmosphere (atm) = 1.013 × 10⁵ Pa
1 mm of mercury (Torr) = 133.3 Pa
1 lb/in² = 6.90 × 10³ Pa

1 electron volt (eV) = 1.60 × 10⁻¹⁹ J
1 erg = 10⁻⁷ J
1 kcal = 4186 J
1 cal = 10⁻³ kcal = 4.186 J
1 kilowatt-hour (kWh) = 3.6 × 10⁶ J
1 foot-pound (ft·lb) = 1.356 J
1 British thermal unit (BTU) = 1055 J = 0.252 kcal
1 horsepower (hp) = 746 W

1 gauss (G) = 10⁻⁴ T
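As an illustration of how these factors are used, here is a minimal sketch that converts between a non-SI unit and its SI counterpart. The dictionary holds a few factors copied from the list above; the names `TO_SI`, `to_si`, and `from_si` are hypothetical and chosen only for this example.

```python
# A few conversion factors to SI, copied from the list above.
TO_SI = {
    "in": 0.0254,     # inch -> m (2.54 cm)
    "ft": 0.3048,     # foot -> m
    "mi": 1.61e3,     # mile -> m
    "atm": 1.013e5,   # atmosphere -> Pa
    "eV": 1.60e-19,   # electron volt -> J
    "kWh": 3.6e6,     # kilowatt-hour -> J
    "kcal": 4186.0,   # kilocalorie -> J
    "hp": 746.0,      # horsepower -> W
}


def to_si(value, unit):
    """Express a value given in `unit` in the corresponding SI unit."""
    return value * TO_SI[unit]


def from_si(value_si, unit):
    """Express an SI value in the non-SI `unit`."""
    return value_si / TO_SI[unit]


# Example: 2 kWh expressed in joules, then in kilocalories.
energy_j = to_si(2.0, "kWh")
energy_kcal = from_si(energy_j, "kcal")
print(energy_j, "J")
print(energy_kcal, "kcal")
```

Routing every conversion through the SI value means that only one factor per unit has to be stored, rather than one per pair of units.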


Useful  Physical  Data 

Quantity                                                          Value

Gravitational acceleration, ground level value in United States   9.80 m/s²
Mass of earth                                                     5.99 × 10²⁴ kg
Mass of moon                                                      7.36 × 10²² kg
Mass of sun                                                       1.99 × 10³⁰ kg
Average radius of earth                                           6.367 × 10⁶ m
Average earth-moon distance                                       3.84 × 10⁸ m
Average earth-sun distance                                        149.6 × 10⁹ m = 1 AU
Triple point temperature of water                                 273.16 K = 0.01°C
Absolute zero of temperature                                      −273.15°C



PHYSICS 

Foundations  and  Applications 

volume  II 


ROBERT  M.  EISBERG 

Professor  of  Physics 
University  of  California,  Santa  Barbara 


LAWRENCE  S.  LERNER 


Professor  of  Physics 

California  State  University,  Long  Beach 


McGraw-Hill  Book  Company 

New  York  St.  Louis  San  Francisco  Auckland  Bogota  Hamburg 
Johannesburg  London  Madrid  Mexico  Montreal  New  Delhi 
Panama  Paris  Sao  Paulo  Singapore  Sydney  Tokyo  Toronto 


PHYSICS:  Foundations  and  Applications,  volume  II 

Copyright  © 1981  by  McGraw-Hill,  Inc.  All  rights  reserved.  Printed  in 
the United States of America. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any
means,  electronic,  mechanical,  photocopying,  recording,  or  otherwise, 
without  the  prior  written  permission  of  the  publisher. 

1234567890  RMRM  8987654321 

This  book  was  set  in  Baskerville  by  Progressive  Typographers. 

The editor was John J. Corrigan;

the  designer  was  Merrill  Haber; 

the  production  supervisor  was  Dominick  Petrellese. 

The  photo  researcher  was  Mira  Schachne. 

The  drawings  were  done  by  J & R Services,  Inc. 

Rand McNally & Company was printer and binder.


Library  of  Congress  Cataloging  in  Publication  Data 

Eisberg,  Robert  Martin. 

Physics,  foundations  and  applications. 

Includes  index. 

1.  Physics.  I.  Lerner,  Lawrence  S.,  date 
joint  author.  II.  Title. 

QC21.2.E4  530  80-24417 

ISBN  0-07-019091-7  (v.  I) 

ISBN  0-07-019092-5  (v.  II) 


Cover:  “Vega-Nor”  by  Victor  de  Vasarely,  reproduced  by  courtesy  of 
the  Albright-Knox  Art  Gallery,  Buffalo,  New  York,  and  the  Vasarely 
Center,  New  York,  New  York. 


Contents 


PREFACE

Chapter 16  MECHANICS OF CONTINUOUS MEDIA
16-1  Continuous Media
16-2  Stress and Strain
16-3  Fluids and Pressure
16-4  Boyle’s Law
16-5  Bulk Modulus and Compressibility
16-6  Fluid Friction, Laminar Flow, and Turbulent Flow
16-7  Dynamics of Ideal Fluids
Exercises

Chapter 17  THE PHENOMENOLOGY OF HEAT
17-1  The Phenomenological Approach
17-2  Temperature
17-3  Charles’ Law
17-4  The Equation of State of an Ideal Gas
17-5  Thermal Expansion of Solids and Liquids
17-6  Heat
17-7  The Mechanical Equivalent of Heat
Exercises

Chapter 18  KINETIC THEORY AND STATISTICAL MECHANICS
18-1  The Ideal-Gas Model
18-2  Kinetic Theory of the Ideal Gas
18-3  Improvements to the Kinetic Theory
18-4  Heat Capacity and Equipartition
18-5  The Boltzmann Factor
18-6  The Maxwell-Boltzmann Speed Distribution
18-7  Disorder and Entropy
Exercises

Chapter 19  THERMODYNAMICS
19-1  Thermodynamic Interactions and the First Law of Thermodynamics
19-2  Isometric and Isobaric Processes
19-3  Isothermal and Adiabatic Processes
19-4  Entropy, Temperature, and Thermodynamic Efficiency
19-5  The Carnot Engine and the Second Law of Thermodynamics
19-6  Heat Pumps, Refrigerators, and Engines
19-7  The Third Law of Thermodynamics
Exercises

Chapter 20  THE ELECTRIC FORCE AND THE ELECTRIC FIELD
20-1  The Electromagnetic Force
20-2  Electric Charge and Coulomb’s Law
20-3  Alpha-Particle Scattering
20-4  The Electric Field and Electric Field Lines
20-5  Electric Flux and Gauss’ Law
20-6  Applications of Gauss’ Law
Exercises

Chapter 21  THE ELECTRIC POTENTIAL
21-1  Electric Potential Energy and Electric Potential
21-2  Evaluation of Electric Field from Electric Potential
21-3  Equipotential Surfaces and Electric Field Lines
21-4  Electric Dipoles
21-5  Laplace’s Equation
21-6  Capacitors and Capacitance
21-7  Energy in Capacitors and Electric Fields
21-8  Dielectrics
Exercises

Chapter 22  STEADY ELECTRIC CURRENTS
22-1  Electromotive Force and Its Sources
22-2  Flow of Electric Charge and Electric Current
22-3  Ohm’s Law
22-4  The Electron Gas
22-5  The Microscopic Basis of Electric Resistance
22-6  Joule’s Law
22-7  Direct-Current Circuits
Exercises

Chapter 23  MAGNETIC FIELDS, I
23-1  Magnetic Poles and Magnetic Field Lines
23-2  The Magnetic Force and the Magnetic Field
23-3  Cyclotron Resonance and Cyclotrons
23-4  The Lorentz Force
23-5  The Biot-Savart Law
23-6  Ampere’s Law
23-7  Applications of Ampere’s Law
Exercises

Chapter 24  MAGNETIC FIELDS, II
24-1  Ampere’s Experiment and the Ampere
24-2  Relativistic Origin of the Magnetic Force
24-3  Magnetic Dipoles and Their Applications
24-4  Ampere’s Conjecture and Diamagnetism
24-5  Paramagnetism and Ferromagnetism
Exercises

Chapter 25  ELECTROMAGNETIC INDUCTION
25-1  Faraday’s Law: Induced Currents
25-2  Faraday’s Law: The Crucial Role of Changing Magnetic Flux
25-3  Faraday’s Law: Induced Electric Fields
25-4  Electric Generators and Motors
25-5  Inductance and Inductors
25-6  Energy in Inductors and Magnetic Fields
Exercises

Chapter 26  CHANGING ELECTRIC CURRENTS
26-1  Inductance, Resistance, and Capacitance in Electric Circuits
26-2  The RL Circuit
26-3  The RC Circuit
26-4  The LC Circuit
26-5  The LRC Circuit
26-6  Alternating-Current Circuits: Numerical Description
26-7  Alternating-Current Circuits: Phasor Description
26-8  Alternating-Current Circuits: Analytical Description
26-9  Power in Alternating-Current Circuits
Exercises

Chapter 27  MAXWELL’S EQUATIONS AND ELECTROMAGNETIC WAVES
27-1  The Displacement Current
27-2  Maxwell’s Equations
27-3  The Electromagnetic Wave Equations
27-4  Electromagnetic Waves
27-5  Energy and Momentum in Electromagnetic Radiation
27-6  Emission of Radiation by Accelerated Charges
Exercises

Chapter 28  WAVE OPTICS
28-1  Huygens’ Construction
28-2  Reflection
28-3  The Speed of Light in Transparent Materials
28-4  Refraction and Total Internal Reflection
28-5  Dispersion
28-6  Two-Slit Diffraction
28-7  Multislit Diffraction
28-8  Single-Slit Diffraction
28-9  Polarization of Light
Exercises

Chapter 29  RAY OPTICS
29-1  Wave Optics and Ray Optics
29-2  Fermat’s Principle
29-3  Lenses
29-4  Image Formation
29-5  Optical Systems
29-6  The Matrix Method
29-7  Applications of the Matrix Method
Exercises

Chapter 30  PARTICLE-WAVE DUALITY
30-1  The Quantum Domain
30-2  The Emission and Absorption of Photons
30-3  The Scattering of Photons
30-4  Recent Evidence for the Existence of Photons
30-5  The Wavelike Motion of Photons
30-6  Matter Waves
30-7  The Uncertainty Principles
Exercises

Chapter 31  ENERGY QUANTIZATION IN MATTER
31-1  The Particle in a Box
31-2  The Hydrogen Atom
31-3  Schrodinger’s Equation
31-4  The Harmonic Oscillator
Exercises

ANSWERS

INDEX



Preface 


Science  is  constructed  of  facts,  as  a house  is  of  stones. 
But  a collection  of  facts  is  no  more  a science  than  a 
heap  of  stones  is  a house. 

Henri  Poincare 
Science  and  Hypothesis 


In this book we present the science of physics in a carefully structured manner
which emphasizes its foundations as well as its applications. The structure
is flexible enough, however, for there to be paths through it compatible
with the various presentations encountered in introductory physics
courses having calculus as a corequisite or prerequisite.

We have always kept in view the idea that a textbook should be a complete
study aid. Thus we have started each topic at the beginning and have
included everything that a student needs to know. This feature is central to
the senior author’s successful textbooks on modern physics and on quantum
physics.

The  book  is  written  in  an  expansive  style.  Attention  paid  to  motivating 
the  introduction  of  new  topics  is  one  aspect  of  this  style.  Another  is  the 
space  devoted  to  showing  that  physics  is  an  experimentally  based  science. 
In  Volume  I direct  experimental  evidence  is  repeatedly  brought  into  the 
developments  by  the  use  of  photographs.  And  although  the  experiments 
underlying the topics considered in Volume II generally do not lend themselves
to photographic presentation, at least the flavor of the laboratory
work  is  given  by  including  careful  descriptions  of  the  experiments.  Still 
another  aspect  of  the  expansive  style  is  found  in  the  frequent  discussions  of 
the  microscopic  basis  of  macroscopic  phenomena. 

Developments are often presented in “spiral” fashion. That is, a qualitative
discussion is followed by a more rigorous treatment. An example is
found in the development of Newton’s second law. Chapter 1 introduces its
most important features in a purely qualitative way. When the second law is
treated systematically in Chap. 4, Newton’s approach, using intuitive notions
of mass and force, is followed by Mach’s approach, where mass and
force  are  defined  logically  in  terms  of  momentum  in  a manner  suggested 
by  the  analysis  of  a set  of  collision  experiments. 

The book contains many features designed to help the student. For instance,
when a term is defined formally or by implication, or is redefined in
a broader  way,  it  is  emphasized  with  boldface  letters.  And  all  such  items  in 
boldface  are  listed  in  the  index  to  make  it  easy  to  locate  definitions  which  a 
student  may  have  forgotten. 

It is not intended that course lectures cover every point made in the
book. The book can be relied upon to do many of the straightforward
things  that  need  to  be  done,  thereby  freeing  the  instructor  to  concentrate 
on  the  things  that  cause  students  the  most  trouble.  Instructors  interested  in 
teaching  a self-paced  course  will  find  that  the  completeness  of  this  book 
makes  it  well  adapted  to  use  in  such  a course. 

A novel feature of this book is the use of numerical procedures employing
programmable calculating devices. At the risk of giving them more
emphasis  than  is  warranted  by  their  importance  to  the  book,  we  describe 
in  the  following  paragraphs  what  these  procedures  make  possible,  and  how 
they  can  be  implemented.  Numerical  procedures  are  used  for: 

1. Numerical differentiation and integration. For students concurrently
studying calculus this drives home the fundamental concepts of a
limit, a derivative, and an integral.

2. Assistance in curve plotting. This is put to good use in studying ballistic
trajectories, electric field lines and equipotentials, and wave groups.

3. Numerical solution of differential equations. This procedure permits
the use of Newton’s second law in a variety of cases involving varying
forces. It also is applied to the vibration of a circular drumhead, LRC circuits,
and Schrodinger’s equation.

4. Simulation of statistical experiments. The procedure allows fundamental
topics of statistical mechanics to be introduced in an elementary
way.

5. Multiplication of several 2 by 2 matrices. This makes practical the
introduction of a very simple yet very powerful method of doing ray optics.

The principal advantage of using numerical procedures in the introductory
course is that it frees the physics content of the course from the limitations
normally imposed by the students’ inability to manipulate differential
equations analytically or to handle certain other analytical techniques.
To give just one example of the many embodied in this book, we have
found that students are quite interested in the numerical work on celestial
mechanics and are well able to understand the physics involved. It is the
mathematical difficulty of the traditionally used analytical techniques that
normally mandates the deferral of this material to advanced courses.

The advantages of the numerical procedures go the other way as well:
they open up mathematical horizons not usually accessible to the introductory-level
student. The analytical solution of a differential equation
generally requires an educated guess at the form of the solution. It is precisely
such a guess that the student is not prepared to make, or to accept
from others. But the numerical solution suggests the correct guess strongly
and directly. Our experience is that students armed with such insight can
go through the analytical solution confidently. The book exploits this advantage
on several occasions.
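To make the point concrete, here is a minimal sketch of how a numerical solution can suggest the analytical form. It steps Newton’s second law for a mass on a Hooke’s-law spring, m d²x/dt² = −kx, and compares the result with cos(ωt), where ω = √(k/m). The stepping scheme (a standard half-step, or velocity-Verlet, method), the function name `simulate`, and all parameter values are assumptions made for this illustration; they are not taken from the Numerical Calculation Supplement.

```python
import math


def simulate(k, m, x0, v0, dt, steps):
    """Step m*x'' = -k*x forward in time and return the list of positions."""
    x, v = x0, v0
    xs = [x]
    for _ in range(steps):
        a = -(k / m) * x                  # acceleration at the current position
        v_half = v + 0.5 * a * dt         # velocity at the middle of the step
        x = x + v_half * dt               # advance the position
        a_new = -(k / m) * x              # acceleration at the new position
        v = v_half + 0.5 * a_new * dt     # complete the velocity update
        xs.append(x)
    return xs


# Hypothetical values: k = 4 N/m, m = 1 kg, released from rest at x = 1 m.
k, m = 4.0, 1.0
omega = math.sqrt(k / m)
dt = 0.001
xs = simulate(k, m, x0=1.0, v0=0.0, dt=dt, steps=5000)

# Largest departure of the numerical positions from the guess x = cos(omega*t).
worst = max(abs(x - math.cos(omega * i * dt)) for i, x in enumerate(xs))
print("largest deviation from cos(omega t):", worst)
```

Plotting `xs` against time shows an oscillation whose shape invites the guess x = cos(ωt); substituting that guess into the differential equation then confirms it analytically.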

The numerical work can be presented in the lecture part of a course in
various ways. One which has proven to be successful is to demonstrate to
the students the first numerical procedure that is emphasized by using a
closed-circuit TV system to provide an enlarged view of the display of a
programmable calculator or small computer running through the procedure.
(Programs for every numerical procedure used in the book, and step-by-step
operating instructions, are given in the accompanying pamphlet,
the Numerical Calculation Supplement.) After the demonstration, a graph of
the results obtained is shown to the students by projecting a transparency
made from the appropriate figure in the book, and the significance of the
results is explained. In subsequent lectures involving numerical procedures,
all that need be done is to graph their results and then discuss the
meaning of the results. An instructor who is more inclined to numerical
procedures may want to give more demonstrations; one who is less convinced
of their worth need not give any. The essential point is that explanations
of the physics emerging from the numerical work can be well understood
by students who do no more than look carefully at graphs of the
results obtained.

But it goes without saying that students will get more out of an active
involvement with the numerical procedures than a passive one. The most
active approach is to ask the students to do several of the homework exercises
labeled Numerical in each of the fourteen chapters where some use is
made of numerical procedures. But the instructor should not assign too
many numerical exercises, particularly at first, because some are rather
time-consuming. A good way to start is to make the numerical exercises optional
or to give extra credit for them. Instruction in operating a programmable
calculator or small computer can be given in a laboratory period or
in one or two discussion periods.

We  now  describe  paths  which  may  be  taken  through  this  book,  other 
than  the  one  going  continuously  from  the  beginning  to  the  end. 


1. Several entire topics can be deleted without difficulty. These are:
relativity, Chaps. 14 and 15; fluid dynamics, Secs. 16-6 and 16-7; thermal
physics, Chaps. 17 through 19; changing electric currents, Chap. 26; electromagnetic
waves, Chap. 27; optics, Chaps. 28 and 29; and quantum physics,
Chaps. 30 and 31.

2.  We  believe  the  book  contains  as  much  modern  physics  as  should  be 
in  the  introductory  course.  This  material  is  distributed  throughout  the 
book,  but  it  has  been  written  in  such  a way  that  there  will  be  no  problem  in 
presenting  it  all  in  the  final  term.  To  do  so,  the  following  material  should 
be  skipped  in  proceeding  through  the  book,  and  presented  at  the  end: 
Chaps.  14  and  15;  Secs.  20-1,  20-3,  22-4,  22-5,  23-3,  24-2,  24-4,  and  24-5. 
Then  close  with  Chaps.  30  and  31. 

3.  In  some  schools  the  study  of  thermal  physics  is  undertaken  before 
that  of  wave  motion.  For  such  a purpose  Chaps.  16  through  19  can  be 
treated  before  Chaps.  12  and  13. 

4.  If  it  is  desired  to  present  a shorter  course  in  which  no  major  topics 
are  to  be  deleted,  the  sections  in  the  following  list  can  be  dropped  without 
significantly  interrupting  the  flow  of  the  argument  and  without  passing 
over  material  essential  to  subsequent  subject  matter.  (In  some  cases  it  will 
be  necessary  to  substitute  a very  brief  qualitative  summary  of  the  ideas  not 
treated  formally  when  the  need  for  these  ideas  arises.  Sections  marked  with 
an  asterisk  are  those  to  be  deleted  if  it  is  desired  to  avoid  entirely  the  wave 
equation  in  its  various  forms.  If  this  is  done,  electromagnetic  radiation  may 
still  be  treated  on  a semiquantitative  basis.)  The  sections  which  can  be 
dropped  are:  2-5,  2-8,  3-7,  4-2  (if  some  of  the  examples  are  used  later),  5-4, 
5-5,  6-1,  6-6,  7-1,  7-3,  8-2,  8-5,  9-7,  10-2,  10-3,  11-2,  11-4,  11-7,  12-3*, 
12-4*,  12-5,  12-6,  13-4*,  13-5*,  13-6,  13-7,  13-8,  15-5,  15-6,  16-5,  16-6, 
17-5,  18-6,  19-6,  19-7,  20-1,  20-3,  21-5,  21-8,  22-4,  22-5,  23-3,  24-2,  24-4, 
24-5, 25-4, 26-6, 26-7, 26-8, 26-9, 27-3*, 27-4*, 27-6, 28-5, 28-7, 29-2, 29-6,
29-7, 30-3, 30-4, 31-3, and 31-4. In addition, any material in small print can
be  dropped. 

Many persons have assisted us in writing this book. In particular, advice
on presentation or on technical points and/or aid in producing many
of the photographs was given by R. Dean Ayers, Alfred Bork, John
Clauser, Roger H. Hildebrand, Daniel Hone, Anthony Korda, Jill H. Larken,
Isidor Lerner, Narcinda R. Lerner, Ralph K. Myers, Roger Osborne,
and Abel Rosales. The manuscript was reviewed at various stages, in part
or in whole, by Raymond L. Askew, R. Dean Ayers, Carol Bartnick,
George H. Bowen, Sumner P. Davis, Joann Eisberg, Lila Eisberg, Austin
Gleeson, Russell K. Hobbie, William H. Ingham, Isidor Lerner, Ralph K.
Myers, Herbert D. Peckham, Earl R. Pinkston, James Smith, Jacqueline D.
Spears, Edwin F. Taylor, Gordon G. Wiseman, Mason Yearian, Arthur M.
Yelon, and Dean Zollman. Isidor Lerner contributed many of the exercises;
others were written by Van Bluemel, Don Chodrow, Eugene Godfredsen,
John Hutcherson, William Ingham, Daniel Schechter, and Mark F.
Taylor. Dean Zollman assisted greatly in selecting and editing exercises.
Don Chodrow and William Ingham checked all solutions, compiled the
short answers that appear in the back of the book, and prepared the Solutions
Manual. Herbert D. Peckham wrote the original versions of the
computer programs for the Numerical Calculation Supplement. Lila Eisberg
played a major role in reading proof and prepared the index. Important
contributions to the development of the manuscript and its transformation
into a book were made by John J. Corrigan, Mel Haber, Annette Hall,
Alice Macnow, Peter Nalle, Janice Rogers, Jo Satloff, and Robert Zappa at
McGraw-Hill, and by our photo researcher, Mira Schachne. Many students
at the University of California, Santa Barbara, and at California State University,
Long Beach, had a real impact on the manuscript by asking just
the right questions in class. To all these persons we express our warmest
thanks.


Robert  M.  Eisberg 
Lawrence  S.  Lerner 



16

Mechanics of Continuous Media


16-1  CONTINUOUS MEDIA

We open the second half of our study of physics by returning to the newtonian
domain. In developing newtonian mechanics, it was logical to begin
with the mechanics of particles. Systems containing a single particle or a few
particles are the simplest ones to which the laws of mechanics can be applied.
Fortunately, many systems which do not actually consist of a small
number of particles can be treated for many purposes as if they did. We
treated many oscillating systems, for example, as if a point mass were acted
on by a Hooke’s-law force exerted by a massless spring. The mechanical details
of the part of the system which supplied that force were deliberately
ignored.

Next considered was the mechanics of rigid bodies, whose parts maintain
a fixed position with respect to one another. In such a body there must
be  internal  forces  exerted  on  the  parts  by  one  another.  However,  we  found 
ways  of  treating  the  motion  of  rigid  bodies  so  that  these  forces  could  be 
neglected. 

In discussing the mechanics of waves, it is not possible to ignore the interactions
among neighboring parts of the continuous medium through
which the wave propagates. While we still assumed that Hooke’s law was
valid, we no longer made an explicit physical separation between ideal,
massless springs, which supply the forces that act when the system is disturbed,
and ideal, inert particles whose mass furnishes an inertial resistance
to those forces. Rather, we pictured the mass of the system (such as a string
or a drumhead) as smoothly distributed throughout the system. We assumed
the “springiness” of the system to be likewise smoothly distributed.
In other words, we treated the system as a continuous medium whose inertial
and elastic properties are no longer spatially isolated from one another
(as would be the case for a linear array or network of ideal particles hooked
together by ideal massless springs). It was necessary in studying the passage
of waves through continuous media to ascribe certain properties to the
media. However, we took these properties more or less for granted. In this
chapter we study them in detail.

Since all matter is made up on the atomic scale of discrete particles, a
continuous picture can be valid only on the macroscopic scale. Nevertheless,
it can be very useful. In this and the next three chapters, we consider
first the gross behavior of continuous systems and then the microscopic mechanical
behavior which underlies it. In particular, in this chapter we consider
some of the explicitly mechanical properties of continuous media.
Chapter 17 is devoted to their thermal properties. In Chap. 18, the most
important aspects of the macroscopic thermal and mechanical behavior of
matter are interpreted in terms of the mechanics of its microscopic parts,
which are treated as particles. In Chap. 19, we proceed from the microscopic
back to the macroscopic world and interpret the thermal behavior
of macroscopic systems in terms of the branch of physics called thermodynamics.


16-2  STRESS AND STRAIN

Fig. 16-1  (a) A rod is fixed at one end to a rigid wall. The direction from the
fixed end to the free end is taken as the positive x direction. The length of the
rod is l. A force F applied in the positive direction results in a positive change
Δl in the length of the rod. The strain is the ratio Δl/l, which is positive. (b) A
force F having the same magnitude but opposite direction is applied to the rod.
This results in a negative change Δl in the length of the rod. The strain is the
ratio Δl/l, which is negative.


Provided they are not stretched too far, most solid bodies obey Hooke’s
law. This empirical law was discussed in Sec. 4-6 and applied extensively in
Chaps. 6, 7, 12, and 13. These discussions were carried out in terms of the
force exerted by the distended body on some other object to which it was attached.
In what follows, it is preferable to consider the force exerted on the
distended body by the other object. This is because we focus our interest on
what happens to the distended body itself.

If a one-dimensional force, expressed by the signed scalar F, is applied
to a solid body along a certain direction (which we take to be the x
direction), the body will distort in that direction by an amount given by the
signed scalar Δl. This is shown in Fig. 16-1, where a rod of undisturbed
length l is stretched and compressed by this amount. Within limits soon to
be discussed, experiment shows that the distortion and the force are
directly proportional. This relationship can be expressed mathematically in
the form

$$F = k\,\Delta l \tag{16-1}$$

The positive proportionality constant k is the force constant.
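The sign convention in Eq. (16-1) can be illustrated with a minimal sketch. The function names `distortion` and `strain` and the numerical values are hypothetical, chosen only for illustration; the strain is taken to be the ratio Δl/l, as in the caption of Fig. 16-1.

```python
def distortion(force, k):
    """Signed change in length Δl produced by a signed force F, from F = k Δl."""
    if k <= 0:
        raise ValueError("the force constant k must be positive")
    return force / k


def strain(force, k, length):
    """Strain Δl/l of a body of undisturbed length l."""
    return distortion(force, k) / length


# Hypothetical rod: k = 2.0e4 N/m, undisturbed length l = 0.50 m.
k = 2.0e4
l = 0.50

stretch = distortion(+10.0, k)   # force in the positive x direction, Δl > 0
squeeze = distortion(-10.0, k)   # same magnitude, opposite direction, Δl < 0

print("stretch:", stretch, "m, strain:", strain(+10.0, k, l))
print("squeeze:", squeeze, "m, strain:", strain(-10.0, k, l))
```

Because k is positive, the sign of Δl always follows the sign of F: a force in the positive x direction lengthens the rod, and a force of the same magnitude in the negative direction shortens it by the same amount.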

For  a given  body,  the  measured  value  of  the  force  constant  k depends 
on  four  general  circumstances: 

1.  The  size  and  shape  of  the  body 

2.  The  material  of  which  the  body  is  made 

3.  The  temperature  and  pressure  to  which  the  body  is  subjected  while 
the  measurements  are  carried  out 

4.  In  some  cases,  the  direction  in  which  the  body  is  distended 

The last of these considerations may seem unfamiliar to you at first,
but wood is a very common example of a material whose ability to resist distortion
depends on the direction in which the distorting force is applied.
The mechanical properties of wood are quite different with the grain and
across the grain. Such materials are called anisotropic. Anisotropy is not
at all rare. For example, most metals which have been processed in a directional
fashion, such as cold-rolled steel, exhibit anisotropic properties. The
mathematical technique which has been developed to deal with anisotropy
is beyond the scope of this book. Therefore, we do not consider item 4. We
limit our discussion to isotropic materials, that is, those whose mechanical
properties are the same in all directions.

Since  most  everyday  experience  with  materials  takes  place  close  to 
room  temperature  and  at  negligible  external  pressures,  for  the  most  part 
we confine our discussion to such environments. Therefore, we do not consider
item 3 either.

With  these  simplifications,  we  must  still  consider  the  size  and  shape  of 
the  object  subjected  to  a force,  as  well  as  the  material  of  which  it  is  made. 
We  begin  with  the  latter.  Everyone  knows  that  some  materials  are  stiffer 
than  others.  This  means,  stated  precisely,  that  if  two  identical  forms  are 
fabricated  from  different  materials,  the  object  made  of  the  stiffer  material 
will have the larger force constant k. That is, the distortion Δl of the stiffer
material  will  be  less  for  the  same  applied  force  F.  This  leads  us  naturally  to 
try  to  express  the  stiffness  of  a material  quantitatively,  in  such  a way  that  it  is 
independent  of  the  form  into  which  the  material  happens  to  be  fabricated. 
This  is  of  great  practical  importance  to  the  engineer,  for  example,  who 
knows  what  materials  are  available  and  needs  to  design  a structure  which 
will  bear  a given  load. 

The  key  to  the  desired  quantitative  expression  for  the  stiffness  of  a 
material  lies  in  taking  into  consideration  the  length  and  thickness  of  the 
particular  sample  which  is  tested.  In  Example  4-10,  you  found  that  two 
springs  linked  end  to  end  have  less  stiffness  than  either  spring  alone.  When 
the  two  linked  springs  are  subjected  to  a certain  external  force  F,  each 
spring  bears  the  entire  force  and  thus  distorts  just  as  much  as  it  would  if  it 
were  not  linked  to  the  other.  Consequently,  the  linked  pair  stretches  far- 
ther than  either  spring  alone,  and  the  force  constant  k'  of  the  pair  is  smaller 
than  the  force  constant  k of  either  spring. 

The  main  practical  advantage  in  using  coil  springs  is  that  they  can 
usually  be  stretched  a greater  proportion  of  their  undisturbed  length  than 
is  possible  with  a straight  wire  or  rod  without  violating  Hooke’s  law.  Never- 
theless, although  solid  straight  wires  or  rods  are  usually  stiffer  than  the  coil 
springs  which  were  considered  in  Example  4-10,  the  very  same  conclusions 
apply  as  long  as  Hooke’s  law  is  obeyed — that  is,  as  long  as  the  wire  or  rod  is 
not  stretched  too  far.  The  result  of  the  example  is  that  if  two  identical  rods 
each  having  force  constant  k are  linked  end  to  end,  the  force  constant  k'  of 
the combination will be

$$k' = \frac{k}{2}$$

What  would  be  the  result  for  N identical  rods? 

Two  identical  rods  linked  end  to  end  amount  to  the  same  thing  as  a 
single  bar  twice  as  long  as  either  original  one.  Other  things  being  equal,  the 
change  in  length  of  a bar  of  material  when  it  is  subjected  to  a given  external  force  is 
directly  proportional  to  its  length.  In  order  to  make  possible  comparisons  of 
samples  of  material  when  they  are  subjected  to  external  forces,  in  a way 
which is independent of their length, we define the strain $\epsilon$ (lowercase
Greek epsilon) to be

$$\epsilon = \frac{\Delta l}{l}$$

The strain $\epsilon$ is the change in length $\Delta l$ of a sample per unit of undistorted
length $l$ of the sample; the quantities $\Delta l$ and $l$ are shown in Fig. 16-1. Since
strain is the quotient of two lengths, it is a dimensionless number. The quantity
$\Delta l$ is a signed scalar which is defined so that its value is positive when the
sample is stretched and negative when it is compressed. Since the length $l$ of the
sample is always positive, the value of the strain $\epsilon$ is positive for stretch and
negative for compression.

To see that the strain is a quantity which is indeed independent of the length
of the sample on which it is measured, consider again the system of two identical
springs linked end to end. If each spring has length $l$, the two together have length
$2l$. If each spring stretches an amount $\Delta l$, the two together stretch by an amount
$2\Delta l$. Thus the strain of each spring individually is given by $\epsilon = \Delta l/l$. The
strain of the two taken together is given by $\epsilon' = 2\Delta l/2l = \epsilon$.
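
The end-to-end argument can be checked numerically. The following Python sketch is illustrative only: the function names, the 0.5-m segment length, and the 1000-N load are assumptions chosen for the example, not values from the text. It assumes every segment obeys Hooke's law, so that n identical segments linked end to end have force constant k/n.

```python
def series_force_constant(k, n):
    """Force constant of n identical Hooke's-law segments linked end to end.

    Each segment carries the full force F and stretches by F/k, so the
    chain stretches by n*F/k and its force constant is k/n.
    """
    return k / n


def strain(delta_l, l):
    """Strain: change in length per unit of undistorted length."""
    return delta_l / l


k = 2.0e7   # N/m, force constant of one segment (illustrative)
l = 0.5     # m, undistorted length of one segment (assumed)
F = 1.0e3   # N, applied force (assumed)

for n in (1, 2, 5):
    k_chain = series_force_constant(k, n)
    stretch = F / k_chain            # total stretch of the chain
    eps = strain(stretch, n * l)     # strain of the whole chain
    print(f"n = {n}: k' = {k_chain:.3g} N/m, strain = {eps:.3g}")
```

The force constant falls as 1/n while the strain stays the same for every n, which is the sense in which strain is independent of the length of the sample.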

Now  that  we  have  defined  a quantity,  strain,  which  is  independent  of 
the  length  of  the  sample,  we  develop  a definition  of  a quantity  which  is 
independent  of  its  thickness.  Example  16-1  investigates  the  effect  of 
thickness  by  considering  two  identical  rods  mounted  side  by  side. 


EXAMPLE  16-1 



Fig.  16-2  Illustration  for  Example  16-1. 


Two identical steel rods, each having a square cross section of area $\Delta a$ and a force
constant k, are arranged side by side in the apparatus shown in Fig. 16-2. (The
assumption of a square cross section is for simplicity only and does not affect the
conclusions.) Find the effective force constant $k''$ of the pair. Then let the force
constant k for each rod be $k = 2 \times 10^7$ N/m, and find the force required to stretch
the pair of rods 1 mm.

■ The  symmetry  of  the  apparatus  is  such  that  the  force  applied  to  the  crossbar  at 
a point  midway  between  the  rods  must  be  balanced  by  two  opposite  forces,  each  of 
magnitude $F/2$, exerted by the two rods. According to Hooke’s law, each rod will
therefore stretch by an amount $\Delta l''$ given by the equation

$$\Delta l'' = \frac{F}{2}\,\frac{1}{k} = \frac{F}{2k}$$

But the whole system stretches by the same amount as either rod, and the force
applied to the whole system is F. When you insert these whole-system values into
Hooke’s law, you obtain

$$k'' = \frac{F}{\Delta l''} = \frac{F}{F/2k}$$

or

$$k'' = 2k$$

Since  the  force  constant  is  a measure  of  stiffness,  it  is  not  surprising  that  two  iden- 
tical rods  tied  together  in  parallel,  as  shown,  and  sharing  the  external  load  are  twice 
as  stiff  as  either  rod  alone. 

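You can now find the force required to stretch the system an amount $\Delta l'' =
\Delta l = 1$ mm. Since $k = 2 \times 10^7$ N/m, you have for the pair of rods

$$k'' = 2 \times 2 \times 10^7\ \text{N/m} = 4 \times 10^7\ \text{N/m}$$

Hooke’s law thus gives you

$$F = k''\,\Delta l'' = 4 \times 10^7\ \text{N/m} \times 1 \times 10^{-3}\ \text{m} = 4 \times 10^4\ \text{N}$$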



Fig.  16-3  A rod  of  cross-sectional  area 
a may  be  regarded  as  a bundle  of  N small 
rods each of cross-sectional area $\Delta a$,
where $a = N\,\Delta a$. A force applied to
the  end  of  the  large  rod  is  distributed 
uniformly  among  the  imaginary  small 
rods. 


The  argument  of  Example  16-1  can  be  extended  directly  to  the  case  in 
which  an  arbitrary  number  N of  identical  rods  of  equal  cross-sectional  area 
$\Delta a$ are linked side by side. Such an arrangement is N times stiffer than any
one rod. That is, the force constant of the arrangement is N times the individual
force constant k. But such a “bundle” of rods amounts to a single rod
of cross-sectional area $a = N\,\Delta a$. See Fig. 16-3. Other things being equal,
the stiffness of a rod (its ability to resist stretching) is directly proportional to its
cross-sectional area. This is because the applied force may be thought of as divided
evenly among the individual members of the imaginary bundle of
rods of which the actual rod is made up.
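
The side-by-side result can be reproduced in a few lines. This Python sketch is an illustration, not part of the original example; the function name is an assumption, while the numbers are those of Example 16-1.

```python
def parallel_force_constant(k, n):
    """Force constant of n identical rods mounted side by side.

    The rods share the load, each carrying F/n, so each stretches by
    F/(n*k) and the bundle behaves like one rod of force constant n*k.
    """
    return n * k


k = 2.0e7                                # N/m, value from Example 16-1
k_pair = parallel_force_constant(k, 2)   # N/m, force constant of the pair
force_for_1_mm = k_pair * 1.0e-3         # N, Hooke's law with a 1-mm stretch
print(k_pair, force_for_1_mm)
```

Replacing 2 by any N gives the force constant of a bundle of N rods, which is the same as that of a single rod of N times the cross-sectional area.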

In order to make comparisons of samples of material when they are
subjected to external forces F, in a way which is independent of their
cross-sectional areas a, we define the stress $\sigma$ to be

$$\sigma = \frac{F}{a} \tag{16-3}$$

The quantity $\sigma$ (lowercase Greek sigma) is more precisely called the uniaxial
stress, that is, the stress along a single axis. It is defined to be the total
force applied to an object along that axis, divided by the cross-sectional area of the
object. Thus a thick rod experiences less stress than a thin rod carrying the
same load (as an externally imposed force is often called). For a given rod,
the stress is proportional to the load. If a load-bearing object has a nonuniform
cross-sectional area, the thinnest part experiences the greatest stress.
This is made evident in Fig. 16-4. The force F which appears in Eq. (16-3)
is a signed scalar. Its value is defined to be positive when the sample is
under tension (is being stretched) and negative when the sample is under
compression (is being squeezed). Thus the tensile stress experienced by a
sample under tension has a positive value, while the compressive stress
experienced by a sample under compression has a negative value.

The unit of stress is a unit of force divided by a unit of area, and is expressed
in newtons per square meter in SI. This unit is called the pascal (Pa):

$$1\ \text{Pa} = 1\ \text{N/m}^2 \tag{16-4}$$

How much stress is required to produce a given strain? It depends on
the material. The stiffness of a material can be characterized by the ratio of the stress
to the strain. This ratio is defined to be the quantity Y:

$$Y = \frac{\sigma}{\epsilon} \tag{16-5a}$$

The quantity Y, the stress divided by the strain, is called Young’s modulus.
Its value is always positive. According to the definitions of stress and strain,
both have positive values for a sample which is being stretched, and both
have negative values for a sample which is being compressed. Hence the
quotient on the right side of Eq. (16-5a) always has a positive value. Since
the stress is defined to be the applied force F per unit of cross-sectional area
a, and the strain is defined to be the change in length $\Delta l$ per unit of
undistended length $l$, Young’s modulus can be written in the form

$$Y = \frac{F/a}{\Delta l/l} = \frac{F\,l}{\Delta l\,a} \tag{16-5b}$$


Fig. 16-4  A force applied to the end of a rod of nonuniform
cross-sectional area. The stress at any point in the rod is inversely
proportional to the cross-sectional area at that point.


Table 16-1  Approximate Values of Young’s Modulus for Various Materials

Material              Y (in $10^{10}$ Pa)

Aluminum              7
Brass                 9
Copper                10 to 12
Glass                 5
Gold                  7.8
Iron, cast            8.5 to 10.0
  wrought             18 to 20
  carbon steel        19 to 20
Lead                  1.5
Tin                   4 to 5
Tungsten              36

Young’s  modulus  is  named  after  its  inventor,  the  brilliant  English  physician, 
amateur  physicist,  and  egyptologist  Thomas  Young  (1773-1829).  Young  is  also 
famous  for  his  experiments  on  the  diffraction  of  light  (discussed  in  Chap.  28)  and 
was  the  first  to  use  the  term  “energy”  to  denote  something  very  like  what  we  today 
call  kinetic  energy  (discussed  in  Chap.  7). 


The strain $\epsilon$ is a dimensionless number, so that the units of Young’s
modulus, $Y = \sigma/\epsilon$, are the same as those of stress $\sigma$, namely, pascals. Table
16-1 gives the Young’s modulus for a number of materials.

Equation (16-5a) can be written in the form

$$\sigma = Y\epsilon \tag{16-6}$$

Equation  (16-6)  is  Hooke’s  law  rewritten  more  generally,  so  that  it  applies 
to  the  material  of  which  an  object  is  made  rather  than  only  to  the  specific 
object.  This  statement  is  proved  in  Example  16-2. 


EXAMPLE  16-2

Show that when Eq. (16-6) is applied to a specific object, it reduces to Hooke’s law in
the familiar form $F = k\,\Delta l$ given by Eq. (16-1).

■ Consider a rod made of a material whose Young’s modulus is Y. Suppose the
unstressed length of the rod is $l$ and its cross-sectional area is a. If a force F is
applied to this rod, the rod will distend by an amount $\Delta l$. Equation (16-6) then
becomes

$$\frac{F}{a} = Y\,\frac{\Delta l}{l}$$

If you multiply both sides of this equation by a, you obtain

$$F = \frac{Ya}{l}\,\Delta l \tag{16-7a}$$


Since Y, a, and $l$ are all constants, you can identify the quantity $Ya/l$ with the force
constant k of the rod and rewrite the equation in the form

$$F = k\,\Delta l \qquad \text{where} \qquad k = Y\,\frac{a}{l} \tag{16-7b}$$


The definition of the force constant given by Eq. (16-7b) makes it clear
that the force constant of any object made of a given material is directly
proportional to its cross-sectional area and inversely proportional to its
length. The results of Example 4-10 for two identical springs linked end to
end, and of Example 16-1 for identical rods linked side by side, are special
cases of Eq. (16-7b).

You can use Eq. (16-7a) to predict the force constant for a uniform
rod made of a specified substance, as shown in Example 16-3.
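
The two special cases can be verified directly from the relation k = Ya/l. In the Python sketch below, the function name and the rod dimensions (1 m long, 1 cm² in cross section) are assumptions made for illustration; the Young’s modulus is the upper value listed for carbon steel in Table 16-1.

```python
def force_constant(Y, a, l):
    """Force constant k = Y*a/l of a uniform rod, as in Eq. (16-7b)."""
    return Y * a / l


Y_STEEL = 20e10   # Pa, carbon steel, from Table 16-1
a = 1.0e-4        # m^2, cross-sectional area (assumed)
l = 1.0           # m, unstressed length (assumed)

k = force_constant(Y_STEEL, a, l)
k_end_to_end = force_constant(Y_STEEL, a, 2 * l)     # two rods linked end to end
k_side_by_side = force_constant(Y_STEEL, 2 * a, l)   # two rods side by side
print(k, k_end_to_end / k, k_side_by_side / k)
```

With these assumed dimensions the rod has a force constant of about $2 \times 10^7$ N/m. Doubling the length halves it, and doubling the area doubles it.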


EXAMPLE  16-3

You  are  designing  the  cylindrical  suspender  rods  for  a suspension  bridge.  These 
rods  hang  from  the  main  suspension  cable  and  support  the  roadway.  If  the  rods  are 
to  be  of  steel  and  if  the  longest  rod  is  to  be  30  m long,  what  must  their  diameter  be  if 
a 20-ton ($2 \times 10^4$ kg) truck crossing the bridge is not to depress the roadway more
than  1 cm?  Assume  that  the  spacing  of  the  rods  is  such  that  most  of  the  weight  of 
the  truck  is  borne  by  a single  rod  at  a time,  and  the  distortion  of  the  main  cable  is 
relatively  negligible. 

■ The loading force is the weight of the truck, which is $F = 2 \times 10^4$ kg $\times$ 9.8
m/s$^2$ = $2 \times 10^5$ N. From Eq. (16-7a) and Table 16-1, you have for the cross-sectional
area

$$a = \frac{F\,l}{Y\,\Delta l} = \frac{2 \times 10^5\ \text{N} \times 30\ \text{m}}{20 \times 10^{10}\ \text{Pa} \times 1 \times 10^{-2}\ \text{m}} = 3 \times 10^{-3}\ \text{m}^2$$

For a cylindrical rod, $a = \pi d^2/4$, where d is the diameter. Thus $d = (4a/\pi)^{1/2} =
(4 \times 3 \times 10^{-3}\ \text{m}^2/\pi)^{1/2} = 0.06$ m, or 6 cm. The shorter suspender rods, hanging
from the lower part of the main cable, will stretch proportionately less.
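
The arithmetic of Example 16-3 can be repeated for other loads or lengths. The Python sketch below is a hedged illustration: the function name and argument names are assumptions, and it keeps the example's simplifications (a single rod carries the whole load and Hooke's law holds).

```python
import math


def required_diameter(load, length, youngs_modulus, max_stretch):
    """Diameter of a cylindrical rod that stretches by max_stretch under load.

    Solves Eq. (16-7a) for the area, a = F*l/(Y*delta_l), and then
    a = pi*d**2/4 for the diameter d. All quantities are in SI units.
    """
    area = load * length / (youngs_modulus * max_stretch)
    return math.sqrt(4.0 * area / math.pi)


F = 2.0e4 * 9.8                      # N, weight of the 20-ton truck
d = required_diameter(F, 30.0, 20e10, 1.0e-2)
print(f"d = {d:.3f} m")
```

Keeping the unrounded weight of $1.96 \times 10^5$ N gives a diameter of about 0.061 m, consistent with the rounded 6 cm found in the example.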


Since  Eq.  (16-6)  is  a form  of  Hooke’s  law,  we  expect  the  behavior  of  a 
rod  under  stress  to  be  symmetrical  about  its  undistended  length.  That  is,  a 
compressive  stress  will  shorten  the  rod  by  the  same  amount  as  an  equal  tensile 
(that  is,  stretching)  stress  lengthens  it.  This  is  true  as  long  as  the  strain  is  not 
so great that Hooke’s law fails to give a correct description of the behavior
of the rod. In dealing with Young’s modulus, it must always be remembered
that the constant Y cannot be defined if Hooke’s law does not apply.

As you almost certainly know from having played with rubber bands
as a child, an object which is stretched along one direction becomes thinner
as it becomes longer. (Similarly, objects become thicker as they become
shorter when they are compressed.) To put it another way, if a stress is
applied along the x axis of a set of coordinates fixed with respect to the object,
there is a strain along that axis which we call the primary strain. This primary
strain is accompanied by a strain of opposite sign along the y and z axes,
which we call the induced strain. The phenomenon is illustrated in Fig. 16-5.



Fig.  16-5  When  a rod  is  stretched,  it 
becomes  thinner.  Here  the  applied  stress 
results  in  a positive  primary  strain 
and  negative  induced  strains. 


We  can  account  qualitatively  for  this  phenomenon  on  the  microscopic  level. 
In an isotropic substance (one whose macroscopic properties are the same in all
directions),  every  atom  lies  equidistant,  on  the  average,  from  its  nearest 
neighbors.  This  is  the  position  in  which  all  the  attractive  and  repulsive  forces 
between  it  and  neighboring  atoms  balance.  In  Fig.  16-6  are  shown  two  neighboring 
atoms A and B, lying along the x direction. When a tensile stress is applied along
the  x axis,  the  average  interatomic  distance  along  that  direction  increases  as  the 
sample  stretches.  As  A and  B separate,  their  neighbors  C and  D tend  to  move 
toward  the  axis  joining  A and  B — that  is,  the  stress  axis.  This  displacement  of  C 
and  D reduces  the  average  interatomic  distance  toward  the  undisturbed  value.  The 
macroscopic result is a reduction in the dimensions of the sample in the plane
normal  to  the  direction  of  stress.  In  similar  manner,  a compressive  stress  along  the 
x axis  and  its  accompanying  negative  strain  will  result  in  an  expansion  of  the 
sample  in  the  yz  plane  normal  to  the  axis  of  stress. 


A uniaxial  stress  always  leads  to  a change  in  the  volume  of  the  sample. 
That  is,  the  strain  induced  in  the  plane  normal  to  the  applied  stress  (which 
contains C and D in Fig. 16-6) is never sufficient to make up for the primary
strain  along  the  stress  axis  (along  which  A and  B lie  in  the  same  figure). 

In an isotropic substance the induced strain is the same in all directions
in the plane normal to the primary strain. The ratio of the induced strain to
the primary strain is called Poisson’s ratio [after the French mathematician
and physicist Siméon Denis Poisson (1781-1840)]. Poisson’s ratio is expressed
mathematically as follows. Suppose that a stress is applied lengthwise to a
sample of length $l$, width $w$, and thickness $t$. As a result, the length changes
by an amount $\Delta l$, while the width and thickness change, respectively, by
amounts $\Delta w$ and $\Delta t$, whose values are of opposite sign to that of $\Delta l$. The
primary strain is then given by the ratio $\Delta l/l$, while the induced strain has a
value given by either the ratio $\Delta w/w$ or the ratio $\Delta t/t$, which have equal
values. Poisson’s ratio $\nu$ is then defined by the equation

$$\nu = -\frac{\Delta w/w}{\Delta l/l} = -\frac{\Delta t/t}{\Delta l/l} \tag{16-8}$$


Fig. 16-6  Idealized microscopic picture of the process
shown in Fig. 16-5. Atoms A, B, C, and D are
equidistant in the unstressed isotropic sample. Stress
applied along the x axis separates A and B. Atoms
C and D move to new equilibrium positions closer to
one another, in such a way as to reduce the change in
the average interatomic distance produced by the
applied stress.



(Because of the minus sign in the definition of Poisson’s ratio, $\nu$ always has a
positive value.)

If we call the primary strain $\epsilon$ and the induced strain $\epsilon_i$, we can write
Eq. (16-8) in the form

$$\nu = -\frac{\epsilon_i}{\epsilon} \tag{16-9}$$


According to Eq. (16-6), $\sigma = Y\epsilon$, the stress is proportional to the strain.
Therefore, Poisson’s ratio can also be expressed in terms of stress as
follows:

$$\nu = -\frac{\sigma_i}{\sigma} \tag{16-10}$$

where $\sigma_i$ is the stress induced in a direction perpendicular to the axis of the
applied stress $\sigma$. Poisson’s ratio can be measured directly by experiment.
For most materials, its value lies between 0.1 and 0.4.
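
The statement that a uniaxial stress changes the volume of the sample can be made quantitative with Poisson’s ratio. The Python sketch below is an illustration under stated assumptions: the function name and the sample values are not from the text, and the sample is taken to be an isotropic rectangular block whose two transverse dimensions both change by the induced strain $-\nu\epsilon$.

```python
def relative_volume_change(primary_strain, poisson_ratio):
    """Fractional volume change of an isotropic block under uniaxial stress.

    The length changes by the factor (1 + primary_strain); the width and the
    thickness each change by the factor (1 - poisson_ratio * primary_strain).
    """
    induced_strain = -poisson_ratio * primary_strain
    return (1.0 + primary_strain) * (1.0 + induced_strain) ** 2 - 1.0


for nu in (0.1, 0.2, 0.3, 0.4):
    dv = relative_volume_change(1.0e-3, nu)   # 0.1 percent stretch (assumed)
    print(f"nu = {nu}: dV/V = {dv:.3e}")
```

For small strains the result is close to $\epsilon(1 - 2\nu)$, so over the quoted range of 0.1 to 0.4 the volume increases under tension and decreases under compression. Only a material with $\nu = 0.5$ would keep its volume to first order in the strain.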


Fig.  16-7  A rectangular  parallelepi- 
pedal  solid  sample  subjected  to  shear 
stress.  The  two  oppositely  directed 
forces  of  magnitude  F are  distributed 
evenly  over  the  opposite  faces  of  area  a. 
The  sample  is  distorted  from  its  original 
shape,  shown  in  dashed  lines,  into  a non- 
rectangular  parallelepiped.  Relative  to 
one  face,  the  opposite  face  is  moved 
through a displacement $\Delta l$. The much
larger distance between faces is $l$, so the
angle of shear strain, expressed in
radians, is $\gamma = \Delta l/l$.


Besides  being  stretched  or  compressed,  a solid  object  can  be  sheared. 
Shear  is  illustrated  in  Fig.  16-7.  It  is  the  kind  of  stress  applied  by  a pair  of 
shears  when  it  is  cutting.  A pair  of  shearing  forces  of  magnitude  F is  always 
applied,  by  definition,  in  a pair  of  opposite  directions  parallel  to  the  surface 
of  the  object,  but  not  along  a single  axis  (that  is,  not  uniaxially).  To  see  this, 
imagine plates of area a, glued to the upper and lower surfaces of the object
shown in the figure, being pulled in opposite directions. We define the
shear stress $\sigma_s$ to be

$$\sigma_s = \frac{F}{a} \tag{16-11}$$


This definition is analogous to the one for the uniaxial (tensile or compressive)
stress. Under the action of a shear stress, the sample will distort as
shown. The shear strain $\gamma$ is defined to be

$$\gamma = \frac{\Delta l}{l} \tag{16-12}$$


This  is  analogous  to  the  definition  of  the  uniaxial  (tensile  or  compressive) 
strain. In shear, however, the displacement $\Delta l$ is measured in a direction
perpendicular to that along which the reference length $l$ is measured. Note
also that the force of magnitude F in Eq. (16-11) is directed parallel (not
perpendicular) to the surface of area a. For practical purposes, it is
usually sufficient to consider small strains only, where $\Delta l/l \ll 1$. In this
case, $\gamma$ is the angle shown in Fig. 16-7, and $\Delta l/l$ is its value expressed in
radians.

In  analogy  to  what  we  did  in  the  tensile  and  compressive  cases,  we 
characterize  the  ability  of  a material  to  resist  shear  deformation  by  defining 
the shear modulus G (also often called the modulus of rigidity) to be

$$G = \frac{\sigma_s}{\gamma} \tag{16-13}$$

The  unit  of  G,  like  that  of  Young’s  modulus  Y,  is  the  pascal.  For  most  mate- 
rials, the  value  of  the  shear  modulus  is  somewhat  less  than  half  that  of 
Young’s  modulus. 
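
The shear definitions can be combined into a short calculation. The Python sketch below is illustrative. The relation $G = Y/[2(1 + \nu)]$ used in the first function is a standard result for isotropic elastic materials that is not derived in this section; it is included only to show why G comes out somewhat less than half of Y. The function names, the sample dimensions, and the value $\nu = 0.3$ for steel are assumptions.

```python
def shear_modulus_isotropic(Y, nu):
    """Shear modulus from Young's modulus and Poisson's ratio.

    Uses G = Y / (2 (1 + nu)), a standard isotropic-elasticity relation
    that is assumed here rather than derived in the text.
    """
    return Y / (2.0 * (1.0 + nu))


def shear_displacement(F, a, l, G):
    """Displacement of one face relative to the opposite face under shear."""
    sigma_s = F / a        # shear stress, Eq. (16-11)
    gamma = sigma_s / G    # shear strain, from Eq. (16-13)
    return gamma * l       # displacement, from Eq. (16-12)


Y = 20e10   # Pa, carbon steel, from Table 16-1
for nu in (0.1, 0.3, 0.4):
    print(nu, shear_modulus_isotropic(Y, nu) / Y)

G_steel = shear_modulus_isotropic(Y, 0.3)                 # nu = 0.3 is assumed
print(shear_displacement(1.0e4, 1.0e-2, 0.1, G_steel))    # metres
```

For Poisson’s ratios between 0.1 and 0.4, the ratio G/Y given by this relation lies between about 0.36 and 0.45.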


16-3  FLUIDS AND PRESSURE



Fig.  16-8  A perfectly  rigid  tube  of 
square  cross  section  having  area  a is 
closed  at  both  ends  by  close-fitting  pis- 
tons on  which  an  external  force  of  mag- 
nitude F is  exerted.  The  region  within 
the  pistons  is  filled  with  a uniform  fluid 
in  hydrostatic  equilibrium.  A set  of  co- 
ordinate axes  is  defined  as  shown.  An 
infinitesimal  volume  element  in  the  form 
of  a right  triangular  prism  is  located 
with  one  corner  at  the  origin  of  these 
axes. 


While  solids  are  very  familiar,  they  comprise  only  a very  small  part  of  all 
the  matter  in  the  universe.  By  far  the  greater  part  of  that  matter  is  in  the 
form  of  one  or  another  kind  of  fluid.  The  most  familiar  fluids  are  liquids 
and gases. (We make a precise distinction between liquids and gases later in
this  section.)  A fluid  is  defined  to  be  any  substance  which  has  negligible  or 
zero  resistance  to  shear  stress  under  the  conditions,  and  for  the  purposes, 
at  hand.  That  is,  under  the  smallest  shear  stress  of  interest,  applied  for  the 
shortest  time  of  interest,  a fluid  yields  and  deforms  indefinitely.  This  is  the 
process  we  call  flow.  (Thus  a fluid  does  not  obey  Hooke’s  law  when  it  is  sub- 
jected to  a shear  stress.)  All  the  other  familiar  properties  common  to  fluids 
can  be  inferred  from  this  definition. 

The  magnitude  of  the  shear  stress  and  the  time  during  which  it  is  applied  are 
significant.  Many  substances  such  as  glasses,  plastics,  and  waxes  act  as  solids 
under  ordinary  circumstances.  They  maintain  their  shapes,  obey  Hooke’s  law,  and 
shatter  when  struck  sharply.  But  over  a long  period,  or  with  the  application  of 
large  shear  stresses  under  circumstances  which  prevent  breakage,  they  flow  like 
fluids.  For  example,  many  glass  objects  made  in  antiquity  exhibit  evidence  of 
having  flowed.  And  even  rocks,  which  are  mostly  “orthodox”  crystalline  solids, 
flow  like  fluids  when  the  stress  and  elapsed  time  have  the  magnitudes  of  interest 
to  geologists.  (Geologists  call  the  propensity  of  a particular  type  of  rock  to  flow 
under  specified  conditions  of  interest  its  “rheidity.”) 


For  most  fluids,  it  is  accurate  enough  to  say  that  any  shear  stress,  how- 
ever small,  is  sufficient  to  deform  the  fluid  indefinitely — that  is,  to  make  it 
flow.  Therefore,  if  a fluid  is  completely  at  rest,  there  can  be  no  shear  stresses  any- 
where within  it  or  at  its  boundaries.  When  this  is  the  case,  the  fluid  is  said  to  be 
in  hydrostatic  equilibrium.  In  hydrostatic  equilibrium,  the  force  transmitted 
across  any  imaginary  plane  within  the  fluid — that  is,  the  force  exerted  by  the 
fluid  on  one  side  of  the  plane  on  the  fluid  on  the  other  side — must  be  directed 
normal  to  the  plane.  The  reason  is  that  there  can  be  no  shear  stress,  and 
therefore  no  force  directed  along  the  plane.  The  same  thing  is  true  at  any 
boundary  of  the  fluid,  for  example,  where  it  makes  contact  with  the  con- 
tainer that  holds  it. 

We  now  apply  the  condition  for  hydrostatic  equilibrium  to  the  fluid 
contained  in  the  system  shown  in  Fig.  16-8.  The  fluid  is  confined  in  a per- 
fectly rigid,  long  tube  of  square  cross  section,  which  is  closed  at  both  ends 
by  close-fitting  pistons.  As  a result  of  the  force  of  magnitude  F exerted  on 
each  of  the  pistons,  the  fluid  is  subjected  to  stress.  In  particular,  the  infini- 
tesimal volume  element  of  the  fluid  shown  schematically  in  the  center  of 
the  tube  is  subjected  to  stress.  In  order  to  analyze  the  stress  on  this  element, 
which has the form of a right triangular prism with base angle $\theta$, we depict
it  very  much  “magnified”  in  Fig.  16-9.  According  to  the  general  conclusions 
reached  in  the  previous  paragraph,  the  surrounding  fluid  exerts  the  forces 
shown  in  the  figure  on  the  five  sides  of  the  volume  element.  (The  forces  are 
actually  distributed  uniformly  over  the  sides,  but  for  the  purpose  of  discus- 
sion we  show  them  as  if  they  were  concentrated  at  the  centers  of  the  respec- 
tive sides  on  which  they  are  exerted.)  Each  force  is  normal  to  its  respective 
surface. The infinitesimal force $dF_v$ is exerted on the vertical rectangular
surface in the xz plane, whose area has the infinitesimal value $da_v$. The force
$dF_h$ is exerted on the horizontal rectangular surface in the xy plane, whose
area is $da_h$. The force $dF_s$ is exerted on the slanted surface of the volume
element, whose area is $da_s$. Finally, it is evident by symmetry that the forces
on the vertical triangular surfaces have the equal and opposite values $dF_t$
and $-dF_t$.


Fig. 16-9  “Magnified” view of the infinitesimal volume
element of the fluid shown in Fig. 16-8. The rectangular
vertical surface has area $da_v$, the horizontal surface has
area $da_h$, and the slanted surface has area $da_s$. The normal
forces exerted on the five faces of the prism by the
adjoining fluid are shown.


Since  the  entire  fluid  in  Fig.  16-8  is  in  hydrostatic  equilibrium,  the 
infinitesimal  volume  element  of  Fig.  16-9  is  likewise  in  hydrostatic  equilib- 
rium and  so  must  be  at  rest.  This  can  be  the  case  only  if  the  net  force  acting 
on  the  volume  element  is  zero.  Using  the  methods  developed  in  Chap.  5, 
we  treat  the  infinitesimal  volume  element  of  fluid  as  a free  body.  Specifi- 
cally, we  add  the  x,  y,  and  z components  of  the  forces  acting  on  the  element 
and  set  the  three  sums  thus  obtained  to  zero.  For  the  x components,  we  ob- 
tain immediately  the  trivial  result 


$$dF_t - dF_t = 0$$

The y component of the force $dF_s$ has the value $-dF_s \sin\theta$. Adding this to
$dF_v$, which is the y component of the vector $d\mathbf{F}_v$, we obtain

$$dF_v - dF_s \sin\theta = 0$$

Thus the magnitudes $dF_v$ and $dF_s$ are related by the expression

$$dF_v = dF_s \sin\theta$$

Similarly, the z component of $dF_s$ has the value $-dF_s \cos\theta$. Adding this to
$dF_h$, which is the z component of the vector $d\mathbf{F}_h$, we obtain

$$dF_h - dF_s \cos\theta = 0$$

Thus the magnitudes $dF_h$ and $dF_s$ are related by the expression

$$dF_h = dF_s \cos\theta$$

We now wish to calculate the stresses corresponding to the forces
exerted on the vertical, horizontal, and slanted rectangular surfaces of the
volume element depicted in Fig. 16-9. In each case, the stress is compressive
and is found by dividing the force by the corresponding area. The
stress $\sigma_v$ on the vertical rectangular surface is therefore given by the
expression

$$\sigma_v = -\frac{dF_v}{da_v}$$

The quantities $dF_v$ and $da_v$ on the right side of this equation are both
magnitudes and thus always have positive values. The minus sign is required to
make the value of $\sigma_v$ conform to the convention that it be negative when $\sigma_v$
represents a compressive stress. Similarly, the stress $\sigma_h$ on the horizontal
rectangular surface is

$$\sigma_h = -\frac{dF_h}{da_h}$$

And the stress $\sigma_s$ on the slanted surface is

$$\sigma_s = -\frac{dF_s}{da_s}$$

The three stresses $\sigma_v$, $\sigma_h$, and $\sigma_s$ obtained immediately above can now
be related to one another. To do this, we note that the areas $da_v$ and $da_h$ can
be expressed in terms of the area $da_s$ and the base angle $\theta$ of the triangular
prism which is the fluid volume element. Referring to Fig. 16-9, which
shows that the width of the prism along the x axis is uniform, we have

$$da_v = da_s \sin\theta$$

Substituting this value and the value $dF_v = dF_s \sin\theta$ obtained earlier
into the expression for $\sigma_v$, we obtain

$$\sigma_v = -\frac{dF_v}{da_v} = -\frac{dF_s \sin\theta}{da_s \sin\theta} = -\frac{dF_s}{da_s}$$

But the term on the right of the last equality is just the quantity $\sigma_s$, the
stress on the slanted face of the volume element. So we have

$$\sigma_v = \sigma_s$$

In like manner, we find for the area $da_h$ of the horizontal rectangular
surface of the volume element the value

$$da_h = da_s \cos\theta$$

Substituting this value and the value $dF_h = dF_s \cos\theta$ obtained earlier
into the expression for $\sigma_h$, we obtain

$$\sigma_h = -\frac{dF_h}{da_h} = -\frac{dF_s \cos\theta}{da_s \cos\theta} = -\frac{dF_s}{da_s}$$

Here again the last term on the right is the quantity $\sigma_s$, and we have

$$\sigma_h = \sigma_s$$

We  have  thus  found  that  the  stresses  exerted  on  the  horizontal,  vertical, 
and  slanted  sides  of  the  infinitesimal  volume  element  of  fluid  are  the  same! 
Note in particular that this result is independent of the value of the prism
angle $\theta$. We can thus conclude that the stress exerted on an infinitesimal element
of  fluid  at  any  particular  location  within  a fluid  in  hydrostatic  equilibrium  is  inde- 
pendent of  the  orientation  of  its  surfaces.  That  is,  at  any  particular  location  the 
stress  in  a fluid  is  isotropic — the  same  in  all  directions. 
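
The independence of the result from the prism angle can be checked numerically. The Python sketch below simply retraces the steps of the derivation for several angles. The function name and the sample values of the force and area on the slanted face are assumptions chosen for illustration.

```python
import math


def face_stresses(dF_s, da_s, theta):
    """Stresses on the vertical, horizontal, and slanted faces of the prism.

    dF_s and da_s are the force magnitude and the area for the slanted face;
    theta is the base angle of the prism in radians. The force balance and
    the prism geometry give the other two forces and areas.
    """
    dF_v = dF_s * math.sin(theta)   # force balance along y
    dF_h = dF_s * math.cos(theta)   # force balance along z
    da_v = da_s * math.sin(theta)   # geometry of the prism
    da_h = da_s * math.cos(theta)
    # Compressive stresses carry a minus sign by convention.
    return (-dF_v / da_v, -dF_h / da_h, -dF_s / da_s)


for degrees in (10, 30, 45, 80):
    sigma_v, sigma_h, sigma_s = face_stresses(2.0e-3, 1.0e-8, math.radians(degrees))
    print(degrees, sigma_v, sigma_h, sigma_s)
```

For every angle the three stresses agree to within rounding error, because the factors of $\sin\theta$ and $\cos\theta$ cancel between force and area.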

This isotropic compressive stress, which is one of the properties which
typify fluids, is the familiar quantity pressure. Because it is an isotropic
quantity, we need not specify the direction of the force or the orientation of the
surface area element in defining it. We therefore drop subscripts and
define the pressure p of a fluid at any location within the fluid to be

$$p = \frac{dF}{da} \tag{16-14}$$

Like stress, pressure has the dimensions of force divided by area. While
compressive stress in solids is conventionally defined so that it is
represented by a negative value, pressure is defined so that a positive
value of p corresponds to the nearly universal situation in which all
parts of a fluid in hydrostatic equilibrium are under compression. The SI
unit of pressure, like that of stress, is the pascal, or newton per square
meter.

The  isotropy  of  pressure  is  sometimes  expressed  in  the  statement 
"Pressure  is  transmitted  uniformly  in  all  directions.”  This  statement  is 
called  Pascal’s  law,  or  Pascal’s  principle.  We  have  shown  how  it  follows 
from  the  definition  of  stress  and  the  zero  shear  resistance  of  fluids.  It  is  eas- 
ily verified  directly  by  experiment.  A pressure  gauge  submerged  in  a sta- 
tionary fluid  at  a certain  point  will  give  the  same  pressure  reading  regard- 
less of  its  orientation. 

Pascal’s law was probably known to Archimedes (287-212 B.C.) but was
enunciated  in  detail  by  Blaise  Pascal  (1623-1662).  Pascal  made  several  important 
direct  contributions  to  the  development  of  physics,  although  his  impact  was 
greater  on  mathematics  and  philosophy.  He  was  the  first  to  suggest  that  baromet- 
ric pressure  varies  with  altitude.  The  first  experimental  verification  was  carried 
out  successfully  by  his  brother-in-law  in  conjunction  with  a mountain-climbing 
picnic. 

Because  of  its  isotropy,  pressure  is  defined  unambiguously  only  within 
fluids,  except  for  special  cases.  However,  it  is  possible  to  apply  pressure  to 
an  isotropic  solid  by  surrounding  it  with  a fluid  in  an  apparatus  of  the  sort 
suggested  by  Fig.  16-8.  To  emphasize  the  necessary  isotropy  of  the  situa- 
tion, we  refer  to  the  pressure  in  such  cases  as  hydrostatic  pressure  (that  is, 
the  pressure  exerted  by  a fluid  in  hydrostatic  equilibrium),  a term  which  is 
occasionally  used  for  emphasis  in  other  contexts  as  well. 

There  are  several  other  units  of  pressure  in  common  use  besides  the  pascal: 

1.  The  bar  (from  the  Greek  word  meaning  heaviness  or  weight): 

1 bar = $10^5$ Pa

In U.S. meteorological practice, the most widely used unit is the millibar (mbar),
equal to $10^{-3}$ bar. In current Canadian practice, the kilopascal (1 kPa = $10^3$ Pa) is
used,  in  concord  with  preferred  SI  usage. 

2.  The  pressure  of  the  atmosphere  at  sea  level  varies  slightly  as  the  weather 
changes.  For  convenience,  the  atmosphere  (atm)  is  somewhat  arbitrarily  defined 
to  be 


1 atm = 1.013 bar = $1.013 \times 10^5$ Pa

Other  units  encountered  in  meteorological  practice  are  the  inch  of  mercury  and 
the  millimeter  of  mercury,  which  both  refer  to  barometer  readings.  This  means  of 
measuring  pressure  is  discussed  in  Example  16-5. 

3.  In  vacuum  technology,  the  popular  unit  is  the  torr  (not  usually  abbre- 
viated). This  is  the  pressure  corresponding  to  a mercury  barometer  reading  of  1 
mm.  The  value  is 


1 torr  = 133  Pa 

4. In U.S. engineering practice, the pound per square inch (lb/in$^2$) is still
used. The equivalence is

1 lb/in$^2$ = $6.89 \times 10^3$ Pa


Fig. 16-10 An external pressure $p_{\text{ext}}$ is applied to the fluid confined in the
cylinder by the application of a force of magnitude F to the piston of area a.
At a point Q at a depth h below the piston, there is an additional internal
pressure $p_{\text{int}}$. The value of $p_{\text{int}}$ is calculated in the text by considering the
weight of the column of fluid supported by the imaginary horizontal area
element da located at Q.


5. In European engineering practice, kilograms per centimeter squared is
used, most familiarly in expressing automobile tire pressures. The unit is
misnamed. What is really meant is that pressure which would be exerted if a kilogram
mass were to be supported uniformly by a surface of area 1 cm$^2$, under normal
gravitational conditions. That is,

$$1\ \text{“kg/cm}^2\text{”} = \frac{1\ \text{kg} \times 9.80\ \text{m/s}^2}{1 \times 10^{-4}\ \text{m}^2} = 9.80 \times 10^4\ \text{Pa}$$


None  of  the  above  units  is  consistent  with  any  standard  system  of  units;  thus  they 
cannot  be  used  in  equations  containing  pressure  and  other  quantities,  unless 
proper  conversion  factors  are  included.  Of  these  nonstandard  units,  we  use  only 
the  atmosphere  in  this  book. 
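
Because these nonstandard units differ from the pascal only by a constant factor, conversion among them is a matter of bookkeeping. The following is a minimal sketch in Python, using the conversion factors quoted in the list above. The dictionary `PA_PER_UNIT` and the function `convert_pressure` are names invented here for illustration, and the value taken for the bar assumes the relation 1 atm = 1.013 bar = 1.013 × 10^5 Pa.

```python
# Conversion factors to pascals, taken from the list of units above.
# (1 bar = 1e5 Pa is assumed, consistent with 1 atm = 1.013 bar = 1.013e5 Pa.)
PA_PER_UNIT = {
    "Pa": 1.0,
    "kPa": 1.0e3,
    "bar": 1.0e5,
    "mbar": 1.0e2,
    "atm": 1.013e5,
    "torr": 133.0,
    "lb/in2": 6.89e3,
    "kg/cm2": 9.80e4,  # the misnamed "kilogram per centimeter squared"
}


def convert_pressure(value, from_unit, to_unit):
    """Convert a pressure from one listed unit to another by way of pascals."""
    pascals = value * PA_PER_UNIT[from_unit]
    return pascals / PA_PER_UNIT[to_unit]


if __name__ == "__main__":
    # One atmosphere expressed in torr and in pounds per square inch.
    print(convert_pressure(1.0, "atm", "torr"))
    print(convert_pressure(1.0, "atm", "lb/in2"))
```

Routing every conversion through the pascal means that only one factor per unit need be stored, which is the same reason the text works in SI units throughout.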


In  Fig.  16-10,  a fluid  is  enclosed  in  a cylinder  having  a tightly  fitting 
piston.  What  is  the  pressure  at  an  arbitrary  point  Q,  located  a distance  h 
below the top of the fluid? This pressure arises from two sources. The first
is the external force F exerted by the piston of area a, and the second is the
downward force exerted by the weight of the fluid above Q. Both result in
isotropic pressures in the fluid. The pressure p at Q can be written as the
sum of two terms, one external and one internal:

$$p = p_{\text{ext}} + p_{\text{int}} \tag{16-15a}$$

Since we have $p_{\text{ext}} = F/a$, this equation can be written

$$p = \frac{F}{a} + p_{\text{int}} \tag{16-15b}$$

The  internal  pressure  pint  at  point  Q is  connected  with  the  weight  of  the 
fluid  lying  above  it  and  therefore  with  the  mass  of  that  fluid.  It  is  useful  to 
define density, a quantity which has to do with the mass m of the fluid but is
independent of the volume V of the particular sample being considered.
We make the definition

$$\rho = \frac{m}{V} \tag{16-16}$$

The  lowercase  Greek  letter  rho  is  conventionally  used  for  density.  The  SI 
unit  of  density  is  the  kilogram  per  meter  cubed,  which  has  no  special  name. 


In  general,  the  density  of  a substance  depends  on  the  pressure.  Con- 
sider a small  sample  of  volume  dV.  If  the  pressure  is  increased,  the  volume 
of  the  sample  will  decrease.  Thus  the  same  amount  of  matter  will  occupy 
less  volume,  and  the  density  will  therefore  increase.  This  effect  is  always 
present.  But  under  familiar  conditions  of  pressure  and  temperature,  we 
find  that  nearly  all  familiar  fluids  can  be  clearly  separated  into  two  classes. 
If we start at atmospheric pressure (p = 1 atm) and increase the pressure
by a modest amount, say by doubling it, the volume of the first class of fluids,
called liquids, is decreased only slightly (by a fraction of 1 percent). The
volume of the second class of fluids, called gases, is reduced to about
one-half  under  the  same  circumstances.  (In  Chap.  18,  we  discuss  the  micro- 
scopic reasons  for  this  behavior  of  gases.)  For  our  present  purposes,  it  suf- 
fices to  make  the  approximation  that  liquids  are  incompressible  or  nearly 
so,  while  gases  are  compressible.  Thus  liquids  have  a density  which  de- 
pends only  weakly  on  the  pressure.  For  most  purposes  the  density  of  a liq- 
uid can  be  considered  a constant  under  conditions  of  changing  pressure. 



There  is  a very  convenient  connection  between  pressure  and  density  in 
a uniform  liquid.  Consider  the  column  of  liquid  of  height  h and  cross- 
sectional  area  da,  which  lies  above  Q in  Fig.  16-10.  The  volume  of  this  col- 
umn is  dV  = h da.  According  to  Eq.  (16-16),  its  mass  is 

$$dm = \rho\,dV = \rho h\,da$$

Its weight is therefore

$$dW = g\,dm = \rho g h\,da$$

where g is the acceleration of gravity. The pressure exerted by the weight
of the column is thus

$$p_{\text{int}} = \frac{dW}{da} = \rho g h \tag{16-17}$$

To this must be added the pressure $p_{\text{ext}}$ produced by the force on the
piston. According to Eqs. (16-15a), (16-15b), and (16-17), the total pressure
at Q is thus

$$p = p_{\text{ext}} + p_{\text{int}} = \frac{F}{a} + \rho g h \tag{16-18}$$

In  most  cases  of  practical  interest,  one  term  or  the  other  of  this  equa- 
tion is  negligible.  When  pressure  in  a vessel  is  produced  by  a piston,  the  en- 
tire apparatus  is  usually  small  enough  that  the  pressure  difference  between 
top  and  bottom  produced  by  the  pressure  head  h is  negligible.  In  such  a 
case, we have $p \approx p_{\text{ext}}$. On the other hand, when a vessel is deep enough for
the pressure head to be significant, it is usually not artificially pressurized,
so we have $p \approx p_{\text{int}}$. Nevertheless, Eq. (16-18) in its complete form can be
useful  as  well  as  informative,  as  Example  16-4  shows. 


EXAMPLE  16-4 


A tank  contains  a pool  of  mercury  0.30  m deep,  covered  with  a quantity  of  water 
which  is  1.2  m deep.  The  density  of  water  is  1.0  x 103  kg/m3,  and  that  of  mercury  is 
13.6  x 103  kg/m3.  Find  the  pressure  exerted  by  the  double  layer  of  liquids  at  the 
bottom  of  the  tank. 

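■ First you must find the pressure at the top of the mercury pool. As far as a
point below the surface of the mercury is concerned, this may be regarded as the
external pressure $p_{\text{ext}}$ in Eq. (16-18). You have

$$p_{\text{ext}} = \rho_{\text{water}}\, g\, h_{\text{water}} = 1.0 \times 10^3\ \text{kg/m}^3 \times 9.8\ \text{m/s}^2 \times 1.2\ \text{m} = 1.2 \times 10^4\ \text{Pa}$$

You can find the pressure $p_{\text{int}}$ exerted by the mercury column itself in the same
manner:

$$p_{\text{int}} = \rho_{\text{merc}}\, g\, h_{\text{merc}} = 13.6 \times 10^3\ \text{kg/m}^3 \times 9.8\ \text{m/s}^2 \times 0.30\ \text{m} = 4.0 \times 10^4\ \text{Pa}$$

The total pressure at the bottom is thus

$$p = p_{\text{ext}} + p_{\text{int}} = (1.2 + 4.0) \times 10^4\ \text{Pa} = 5.2 \times 10^4\ \text{Pa}$$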
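
The procedure of Example 16-4 extends to any number of stacked liquid layers: the pressure at the bottom of each layer serves as the external pressure for the layer beneath it. The short Python sketch below illustrates this under the assumption that each layer has a uniform density. The function name `pressure_below_layers` and its argument layout are hypothetical, chosen only for this illustration.

```python
G = 9.8  # acceleration of gravity in m/s^2, the value used in Example 16-4


def pressure_below_layers(layers, p_ext=0.0, g=G):
    """Return the pressure (Pa) beneath a stack of uniform liquid layers.

    layers: list of (density in kg/m^3, thickness in m), ordered top to bottom.
    p_ext:  pressure applied at the top surface, in Pa.
    """
    p = p_ext
    for rho, h in layers:
        # Each layer adds rho*g*h, as in Eq. (16-17); the running total acts
        # as the external pressure on the next layer down.
        p += rho * g * h
    return p


# The tank of Example 16-4: 1.2 m of water lying over 0.30 m of mercury.
tank = [(1.0e3, 1.2), (13.6e3, 0.30)]
p_bottom = pressure_below_layers(tank)
print(f"{p_bottom:.2e} Pa")
```

The unrounded sum is 5.17 × 10^4 Pa; the example's figure of 5.2 × 10^4 Pa results from rounding each term to two significant figures before adding.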



Fig. 16-11 The mercury barometer. The pressure of the atmosphere $p_{\text{atm}}$
is equal to that exerted by a column of mercury of height h. Since the space
above the mercury column is evacuated, there is no downward force exerted on
the top of the column, and $p_{\text{ext}} = 0$. The condition that hydrostatic pressure
be isotropic everywhere is satisfied if the upward force exerted on the
mercury column is just great enough to support its weight. Under these
circumstances, the pressure will be the same at the two nearby points A and B,
which lie at a small depth d below the mercury surface in the dish.

Unlike the pressure in a liquid, the pressure in a gas at a given depth
cannot be expressed in the simple form of Eq. (16-18), since the density $\rho$ is
not a constant unless the height of the column of gas is quite small. There is
nonetheless  a well-defined  pressure  at  any  given  depth.  Even  without 
knowing  the  functional  relation  of  pressure  to  altitude  in  an  atmosphere,  it 
is  possible  to  measure  the  pressure  of  the  atmosphere  at  any  point  (or  the 
pressure  of  any  fluid,  for  that  matter)  by  means  of  the  barometer.  The  in- 
vention of  the  barometer  is  generally  attributed  to  Evangelista  Torricelli 
(1608-1647),  a student  of  Galileo. 

In  its  simplest  form,  the  barometer  is  an  instrument  which  compares 
the  air  pressure  at  a given  point  with  the  (necessarily  equal)  pressure  pro- 
duced by  a column  of  an  incompressible  fluid,  usually  mercury.  See  Fig. 
16-11.  In  order  to  make  a mercury  barometer,  you  fill  a closed-end  tube,  of 
the  sort  shown  in  the  figure,  with  mercury.  You  then  close  the  open  end 
with  your  thumb,  invert  the  tube,  and  submerge  the  end  in  a dish  of  mer- 
cury. When  you  remove  your  thumb,  some  of  the  mercury  will  pour  out  of 
the  tube,  leaving  a vacuum — a region  essentially  empty  of  matter — at  the 
closed  end.  If  there  were  no  air  outside  the  system,  the  mercury  would  con- 
tinue to  pour  out  until  the  tube  was  empty  to  the  level  of  the  liquid  surface 
in  the  dish.  However,  the  surrounding  air  presses  on  the  surface  of  the 
mercury in the dish with a pressure $p_{\text{atm}}$. The system comes to equilibrium
when the pressures on opposite sides of the air-mercury interface are equal
in  magnitude.  This  happens  when  the  air  pressure  equals  the  mercury 
pressure. 

To  see  this,  consider  the  two  nearby  points  A and  B in  Fig.  16-11, 
which  lie  at  a small  depth  d below  the  surface  of  the  mercury  in  the  dish. 
The  pressure  at  point  A is  produced  by  the  weight  of  a column  of  mercury 
of height h + d. The pressure at point B is produced by the weight of a
column of mercury of height d, together with the external pressure
$p_{\text{ext}} = p_{\text{atm}}$. If the pressures at A and B were not equal, the mercury in the dish
could  not  be  at  rest. 

What  makes  the  barometer  convenient  is  that  the  pressure  exerted  by 
the  mercury  column  can  be  expressed  directly  in  terms  of  its  readily  meas- 
urable height.  Thus  the  barometer  measures  pressure  directly,  without  the 
necessity  of  separate  measurements  of  force  and  area. 

You  can  determine  the  pressure  by  using  Eq.  (16-17),  which  reduces 
for the mercury column to $p = \rho g h$. It is a common custom, however (now
beginning  to  die  out),  to  give  the  “barometric  pressure”  in  units  of  centi- 
meters, millimeters,  or  inches  of  mercury,  without  bothering  to  use  Eq. 
(16-17).  These  units  are  not  really  units  of  pressure,  since  they  give  the 
height  of  the  mercury  column  necessary  to  balance  the  pressure  of  the 
atmosphere  rather  than  the  pressure  itself.  However,  you  can  easily  make 
the  conversion,  as  is  illustrated  in  Example  16-5. 


EXAMPLE 16-5

The barometric reading (that is, the height of the mercury column in a barometer
like that of Fig. 16-11) is 760 millimeters of mercury. Find the pressure in pascals.

■ Using Eq. (16-17), you have

$$p = \rho g h = 13.6 \times 10^3\ \text{kg/m}^3 \times 9.80\ \text{m/s}^2 \times 0.760\ \text{m} = 1.013 \times 10^5\ \text{Pa}$$
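
The conversion between a barometer reading and a pressure can be run in either direction. As a hedged illustration, the Python sketch below applies the relation p = ρgh both ways. The function names are invented for this sketch, and the water density of 1.0 × 10^3 kg/m^3 used in the second call is the value quoted in Example 16-4.

```python
RHO_MERCURY = 13.6e3  # density of mercury in kg/m^3, as in Example 16-4
G = 9.80              # acceleration of gravity in m/s^2


def column_height_to_pressure(h, rho=RHO_MERCURY, g=G):
    """Pressure (Pa) balanced by a liquid column of height h (m)."""
    return rho * g * h


def pressure_to_column_height(p, rho=RHO_MERCURY, g=G):
    """Height (m) of the liquid column that balances the pressure p (Pa)."""
    return p / (rho * g)


# A reading of 760 mm of mercury expressed in pascals.
p_atm = column_height_to_pressure(0.760)
print(f"{p_atm:.4e} Pa")

# The height of a water column balancing the same pressure.
h_water = pressure_to_column_height(p_atm, rho=1.0e3)
print(f"{h_water:.2f} m")
```

The second result, a little over 10 m, suggests why a dense liquid such as mercury makes for an instrument of convenient size.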


The  concepts  underlying  the  operation  of  the  barometer  are  applied 
in  Sec.  16-4  to  a study  of  the  relationship  between  the  pressure  of  a gas  and 
its  volume  and  density. 



16-4 BOYLE’S LAW

The density $\rho$ of a substance always depends in some manner on the
pressure p to which it is subject. Equation (16-16), rewritten slightly so as to
bring out the functional dependence explicitly, becomes

$$\rho(p) = \frac{m}{V(p)} \tag{16-19}$$

That is, the density $\rho$ is influenced by the pressure p because the volume V
into  which  a given  mass  m of  the  substance  is  packed  depends  on  the  pres- 
sure. Gases  are  substances  for  which  the  density  is  influenced  significantly 
by  the  pressure  because  the  volume  depends  strongly  on  the  pressure.  In 
this  section  we  are  concerned  with  the  relation  between  pressure  and  vol- 
ume for  a fixed  quantity  of  gas. 

We  can  determine  the  relationship  between  the  pressure  and  the  vol- 
ume for  an  arbitrary  quantity  of  gas  by  trapping  the  gas  in  a cylinder  with  a 
leakproof  piston  and  measuring  the  pressure  in  the  gas  as  its  volume  is 
varied.  In  doing  such  an  experiment,  we  must  be  sure  not  to  vary  any  other 
quantity  which  might  affect  the  pressure  reading  obtained  for  a given  vol- 
ume. It  turns  out  that  this  condition  can  be  satisfied  by  keeping  the  temper- 
ature constant  throughout  the  experiment.  (Temperature  effects  are  stud- 
ied in  Chap.  17.) 

The  experiment,  first  performed  by  the  Anglo-Irish  physicist  Robert 
Boyle  (1627-1691),  is  both  simple  and  ingenious.  A quantity  of  air  (other 
gases  would  do  as  well)  is  trapped  in  a closed-end  U-tube  by  means  of  a col- 
umn of  mercury.  By  allowing  air  to  bubble  past  the  mercury,  its  pressure 
can be made equal to atmospheric. See Fig. 16-12a. After the volume
occupied by the air has been measured, the pressure can be varied in a
controlled fashion by pouring more mercury into the open end of the U-tube.
See Fig. 16-12b. As the pressure is increased by adding mercury, the
volume of the trapped air is seen to decrease. As long as the temperature is
not  changed,  the  following  simple  rule  is  found  to  apply  with  considerable 
accuracy  over  a fairly  wide  range  of  pressures  and  volumes: 


$$p \propto \frac{1}{V} \quad \text{for trapped gas at constant temperature} \tag{16-20a}$$

or

$$pV = \text{constant} \quad \text{for trapped gas at constant temperature} \tag{16-20b}$$

This relation is known as Boyle’s law.

Fig. 16-12 Boyle’s apparatus for measuring the volume of a confined quantity
of gas as a function of its pressure. (a) By tilting the apparatus, air bubbles can
be made to pass back and forth between the open and closed arms until the
height of the mercury column is the same in both when the apparatus is level.
The pressure of the confined air is then equal to that of the outside atmosphere,
$p = p_{\text{atm}}$. By means of prior calibration, the volume V of the confined air can be
found by measuring the height l of the air column. (b) More mercury is poured into
the open arm of the apparatus. The confined air is compressed to a new volume V′,
which can be found by measuring the new air column height l′. The pressure of
the confined air is now equal to the sum of atmospheric pressure $p_{\text{atm}}$ and
the pressure $\Delta p$ exerted by the mercury column of height h, so that
$p = p_{\text{atm}} + \Delta p$.

The value of the constant in Eq. (16-20b) is different every time the
experiment  is  done.  It  is  plausible  to  assume  that  the  constant  will  depend 
on  the  amount  (that  is,  the  mass)  of  air  trapped  in  the  closed  tube.  If  we 
think  of  the  trapped  air  as  acting  something  like  a spring  under  compres- 
sion (as  Boyle  did),  we  may  argue  that  adding  more  air  in  the  same  trapped 
volume is something like thickening a spring and thus increasing its
stiffness.  Indeed,  Boyle  referred  to  the  resistance  of  air  to  compression  as 
the  “spring  of  air.”  In  Sec.  16-5  we  develop  a means  of  describing  the 
“spring  of  air”  in  a quantitative  fashion. 
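
To see how Boyle’s law plays out in the U-tube apparatus, it may help to work a case numerically. The Python sketch below is an illustration under stated assumptions, not a description of Boyle’s actual procedure: the closed arm is taken to have a uniform cross section (so that the volume of trapped air is proportional to the length of the air column), the temperature is constant, and the mercury stands a height h higher in the open arm than in the closed one. The function name `compressed_length` and the default values are chosen here for illustration.

```python
P_ATM = 1.013e5       # atmospheric pressure in Pa (1 atm)
RHO_MERCURY = 13.6e3  # density of mercury in kg/m^3
G = 9.80              # acceleration of gravity in m/s^2


def compressed_length(l0, h, p_atm=P_ATM, rho=RHO_MERCURY, g=G):
    """Length of the trapped air column after mercury is added.

    l0: air column length (m) when the trapped air is at atmospheric pressure.
    h:  height (m) by which the mercury in the open arm stands above that in
        the closed arm.
    Assumes a uniform tube, so that p*l is constant by Boyle's law.
    """
    p = p_atm + rho * g * h  # pressure of the trapped air
    return l0 * p_atm / p    # p_atm * l0 = p * l


# A 0.20-m air column with 760 mm of excess mercury: the pressure is very
# nearly doubled, so the column should shrink to very nearly half its length.
print(compressed_length(0.20, 0.760))
```

Setting h equal to the barometric height doubles the pressure on the trapped air, which is why the air column is halved in this case.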


16-5 BULK MODULUS AND COMPRESSIBILITY


When  an  isotropic  substance,  either  a solid  or  a fluid,  is  subjected  to  hydro- 
static pressure,  its  volume  is  reduced.  In  the  one-dimensional  case,  the  ratio 
of  stress  to  strain  is  expressed  as  the  Young’s  modulus  of  the  substance.  We 
can  express  a similar  relation  in  three  dimensions  between  pressure  change 
and  volume  change.  Suppose  that  an  object  has  a volume  V when  it  is  situ- 
ated in  an  isotropic  environment  where  the  pressure  has  a certain  value. 
The pressure is then changed by an amount $\Delta p$. As a consequence, the
volume of the object changes by an amount $\Delta V$, so that the fractional change
of volume is $\Delta V/V$. We define the isothermal bulk modulus B of the
substance of which the object is made in a manner analogous to the definition
of Young’s modulus, $Y = (F/a)/(\Delta l/l)$. This is done by means of the
equation

$$B = -\frac{\Delta p}{\Delta V/V} \quad \text{for constant temperature} \tag{16-21a}$$

In the limiting case where the pressure change has the infinitesimal value
dp, Eq. (16-21a) becomes

$$B = -\frac{dp}{dV/V} = -V\frac{dp}{dV} \quad \text{for constant temperature} \tag{16-21b}$$


Since  an  increase  in  the  pressure  always  results  in  a decrease  in  volume,  the 
minus  sign  is  made  a part  of  the  definition,  so  that  B will  be  a positive 
number. The pressure change $\Delta p$ appears in the definition rather than the
pressure $p = -F/a$ because bulk modulus measurements do not usually
begin at zero pressure. (Why does this have no effect on the value of B for a
substance obeying Hooke’s law?) The bulk modulus B has the same
dimensions and units as pressure p, stress $\sigma$, and Young’s modulus Y. In the SI
system,  the  unit  of  B is  the  pascal. 

The  bulk  modulus  is  a measure  of  the  resistance  of  a material  to  com- 
pression. Although  fluids  have  no  resistance  to  shear  and  little  or  none  to 
tensile  stress,  they  do  resist  compression.  You  might  guess  that  solids  are 
quite  resistant  to  compression  and  liquids  somewhat  less  so.  Table  16-2, 
which gives the value of B for typical materials at room temperature, bears
this  out.  It  shows  that  solids  are  generally  about  10  times  less  compressible 
than  liquids. 



Table 16-2  Bulk Modulus for Typical Solids and Liquids

Substance                   B (in Pa)
--------------------------  -------------
Aluminum                    7.46 × 10^10
Brass                       10.7 × 10^10
Copper                      13.1 × 10^10
Glass                       1.4 × 10^10
Steel                       18 × 10^10
Lead                        5.0 × 10^10
Diamond                     20 × 10^10
Ethanol (grain alcohol)     0.9 × 10^9
Glycerine                   4.6 × 10^9
Mercury                     27.0 × 10^9
Water (20°C)                2.06 × 10^9

For  gases,  the  bulk  modulus  can  be  deduced  directly  from  Boyle’s  law. 
From Eq. (16-20b) we have

$$p = \frac{c}{V} \tag{16-22}$$

where c is some constant. If the pressure is changed by an infinitesimal
amount dp, the corresponding change in the volume can be found by
differentiating both sides of this equation with respect to V, to obtain

$$\frac{dp}{dV} = -\frac{c}{V^2} \tag{16-23}$$

Substituting this value of dp/dV into the definition of B, Eq. (16-21b), we
obtain

$$B = -V\left(-\frac{c}{V^2}\right) = \frac{c}{V}$$

Comparing this with Eq. (16-22) gives the final result

$$B = p \quad \text{for constant temperature} \tag{16-24}$$

Thus for a gas, the bulk modulus is just equal to the pressure. The limitations on
the  applicability  of  this  equation  are  the  same  as  the  limitations  on  Boyle’s 
law.  Generally  speaking,  it  is  accurate  if  the  pressure  is  not  too  high  and  the 
temperature  is  not  too  low.  It  is  quite  accurate  for  familiar  gases  such  as 
oxygen  and  nitrogen  near  atmospheric  pressure  and  room  temperature. 
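
The result B = p can be checked numerically without carrying out the differentiation by hand. The Python sketch below estimates dp/dV for a Boyle’s-law gas by a small finite difference and compares the resulting bulk modulus with the pressure. The function names, the step size, and the sample of 1 liter of gas at 1 atm are all assumptions made for this illustration.

```python
def boyle_pressure(V, c):
    """Pressure of a trapped gas obeying Boyle's law, p = c/V."""
    return c / V


def bulk_modulus(pressure_fn, V, rel_step=1e-6):
    """Estimate B = -V dp/dV by a central finite difference in the volume."""
    dV = rel_step * V
    dp_dV = (pressure_fn(V + dV) - pressure_fn(V - dV)) / (2.0 * dV)
    return -V * dp_dV


# Illustrative sample: 1 liter (1.0e-3 m^3) of gas at 1 atm (1.013e5 Pa).
V = 1.0e-3
c = 1.013e5 * V  # the constant in pV = constant for this sample
p = boyle_pressure(V, c)
B = bulk_modulus(lambda volume: boyle_pressure(volume, c), V)

print(p, B)  # the two values should agree to many significant figures
```

The same `bulk_modulus` function could be applied to any other assumed relation between pressure and volume, which makes clear that B = p is a property of Boyle’s law in particular and not of the definition of B.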


Particularly  in  the  case  of  gases,  it  is  often  more  convenient  to  speak  in 
terms of the susceptibility to compression rather than the resistance to
compression. For this purpose, we define the compressibility $\kappa$ (lowercase
Greek kappa) to be the reciprocal of the bulk modulus. That is,

$$\kappa = \frac{1}{B} = -\frac{1}{V}\frac{dV}{dp} \tag{16-25}$$


The compressibility of solids is of the order of $10^{-11}$ Pa$^{-1}$. In other
words, an increase in pressure of 1 Pa results in a reduction in volume of
about 1 part in $10^{11}$. In terms of commonly encountered pressures, a
doubling of the pressure from 1 atm to 2 atm results in a reduction in the
volume of solids of only about 1 part in $10^6$. In contrast, a gas will halve its
volume  upon  a doubling  of  the  pressure. 
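
These orders of magnitude can be confirmed from the entries of Table 16-2. The Python sketch below computes the compressibility and the fractional volume change for a pressure increase of 1 atm, using the finite-difference form of the definition of B. The dictionary holds a few values copied from the table; the names `BULK_MODULUS_PA` and `fractional_volume_change` are invented for this illustration.

```python
ATM = 1.013e5  # 1 atm in Pa

# Bulk moduli in Pa for a few of the substances listed in Table 16-2.
BULK_MODULUS_PA = {
    "aluminum": 7.46e10,
    "steel": 18e10,
    "mercury": 27.0e9,
    "water (20 C)": 2.06e9,
}


def fractional_volume_change(B, dp):
    """Fractional volume change dV/V = -dp/B for a small pressure change dp."""
    return -dp / B


for name, B in BULK_MODULUS_PA.items():
    kappa = 1.0 / B  # compressibility in 1/Pa
    change = fractional_volume_change(B, ATM)
    print(f"{name:14s} kappa = {kappa:.2e} 1/Pa   dV/V = {change:.2e}")
```

For the two metals the fractional change comes out near 1 part in 10^6, in line with the estimate in the text, while for water it is several tens of parts in 10^6.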

We  have  now  defined  four  quantities  having  to  do  with  the  elastic 
properties of isotropic substances: Young’s modulus Y, Poisson’s ratio $\nu$,
the  shear  modulus  G,  and  the  bulk  modulus  B.  They  are  not  all  indepen- 
dent of  one  another,  as  Example  16-6  indicates. 


EXAMPLE  16-6 


Express B in terms of Y and $\nu$.

■ Young’s  modulus  and  Poisson’s  ratio  have  to  do  with  uniaxial  stress,  while  the 
bulk  modulus  has  to  do  with  hydrostatic  pressure.  To  find  the  desired  relation,  you 
imagine  a sample  of  material  to  be  subjected  first  to  uniaxial  stress  and  next  to 
hydrostatic  pressure;  then  you  compare  the  resulting  volume  changes. 

Begin  by  considering  a solid  bar  of  rectangular  cross  section  which  is  subjected 
to a uniaxial compressive stress $\sigma$ along its length, as shown in Fig. 16-13. As a result
of the primary strain produced by this stress, the bar will experience a reduction in
volume. This reduction is partially compensated for by expansion in the plane
normal to the direction of the primary stress. The length l is changed by $\Delta l$ (which
has a negative value), the width w is changed by $\Delta w$ (which has a positive value), and
the thickness t is changed by $\Delta t$ (which also has a positive value). Thus the change in
volume is

$$\Delta V = (l + \Delta l)(w + \Delta w)(t + \Delta t) - lwt$$

Since lwt = V, the initial volume, you can write

$$\frac{\Delta V}{V} = \left(1 + \frac{\Delta l}{l}\right)\left(1 + \frac{\Delta w}{w}\right)\left(1 + \frac{\Delta t}{t}\right) - 1$$

Now $\Delta l/l$ is just the primary strain $\epsilon$. The terms $\Delta w/w$ and $\Delta t/t$ are the induced
strains. As you saw in Sec. 16-2, they are equal for an isotropic substance and are
equal to the negative of the primary strain times Poisson’s ratio. Expressed
mathematically, we have

$$\frac{\Delta w}{w} = \frac{\Delta t}{t} = -\nu\epsilon$$

Thus the above equation for the fractional volume change $\Delta V/V$ can be written

$$\frac{\Delta V}{V} = (1 + \epsilon)(1 - \nu\epsilon)(1 - \nu\epsilon) - 1 = (1 + \epsilon)(1 - \nu\epsilon)^2 - 1 = (1 + \epsilon)(1 - 2\nu\epsilon + \nu^2\epsilon^2) - 1$$

Since the strain $\epsilon$ is small compared to 1, you can neglect the term in $\epsilon^2$.
Multiplying the remaining terms gives you

$$\frac{\Delta V}{V} = \epsilon - 2\nu\epsilon - 2\nu\epsilon^2$$

Again neglecting the term in $\epsilon^2$, you obtain

$$\frac{\Delta V}{V} = \epsilon(1 - 2\nu)$$

This is the fractional volume change produced by stress along the length l.

Now  suppose  that  the  sample  of  Fig.  16-13  is  immersed  in  a fluid,  and  the  fluid 
and the sample are subjected to a pressure increase of magnitude $\Delta p$. Then all six
faces  of  the  sample  are  subjected  to  equal  stress.  This  is  the  condition  under  which 
the  definition  of  the  bulk  modulus  is  applicable. 

You  have  just  calculated  the  effect  on  the  volume  of  the  sample  due  to  the  stress 
applied  in  the  direction  parallel  to  its  length  l.  The  stress  applied  in  the  direction 
parallel  to  the  thickness  t of  the  sample  will  produce  an  equal  fractional  change  in 
the volume. This is because the ratio $\Delta V/V$ in the equation $\Delta V/V = \epsilon(1 - 2\nu)$ is
independent  of  the  dimensions  of  the  sample.  The  same  statement  pertains  to  the 
stress  applied  in  the  direction  parallel  to  the  width  w of  the  sample.  The  volume 
changes  due  to  the  stresses  applied  in  all  three  directions  take  place  together.  Thus 
the  total  volume  change  is  the  sum  of  the  three.  (Strictly  speaking,  the  volume 
changes  are  additive  only  if — as  is  always  the  case  for  solids — the  changes  in  vol- 
ume are  small  compared  to  the  volume  itself.  Can  you  see  the  need  for  this  restric- 
tion?) Since  the  three  fractional  volume  changes  are  of  equal  magnitude,  the  total 
fractional volume change $\Delta V/V$ is 3 times any one of them. So you have

$$\frac{\Delta V}{V} = 3\epsilon(1 - 2\nu)$$

Since the bar has its unstressed volume V when the pressure is zero, the volume
change is produced by a change in the applied pressure $\Delta p = p - 0 = p$. But this
change in pressure is equivalent to a stress $-\sigma$ applied at the same time along the
length, the width, and the thickness of the bar. So you have $\Delta p = -\sigma$, and you can
write the definition of B, Eq. (16-21a), in the form

$$B = -V\frac{\Delta p}{\Delta V} = V\frac{\sigma}{\Delta V}$$

Using the value of $\Delta V/V$ just obtained, you get

$$B = \frac{\sigma}{\epsilon}\,\frac{1}{3(1 - 2\nu)}$$

Finally, note that $\sigma/\epsilon = Y$, so that you have

$$B = \frac{Y}{3(1 - 2\nu)} \tag{16-26}$$


Because  of  the  way  in  which  we  have  defined  the  bulk  modulus,  it  can 
never  be  negative.  Since  Young’s  modulus  Y is  also  always  positive  (you 
cannot  make  something  expand  by  pressing  on  it!),  it  follows  from  Eq. 
(16-26) that Poisson’s ratio $\nu$ can never have a value greater than 1/2. Indeed,
the value $\nu = 1/2$ implies a substance which is perfectly incompressible. Can
you  see  why? 
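
The small-strain approximations made in Example 16-6 can be tested by computing the volume change directly. The Python sketch below is a rough numerical check under stated assumptions: the strains produced by the three stresses are superposed linearly on each edge of the bar, and the values of Y, ν, and the strain are illustrative numbers chosen here, not data from the text. The function names are likewise hypothetical.

```python
def uniaxial_volume_change(eps, nu):
    """Exact dV/V for a primary strain eps and induced strains -nu*eps."""
    return (1.0 + eps) * (1.0 - nu * eps) ** 2 - 1.0


def hydrostatic_volume_change(eps, nu):
    """dV/V when equal stresses act along all three axes.

    Each edge receives the primary strain eps from the stress along its own
    axis plus an induced strain -nu*eps from each of the other two stresses
    (linear superposition, assumed valid for small strains).
    """
    edge_strain = eps - 2.0 * nu * eps
    return (1.0 + edge_strain) ** 3 - 1.0


def bulk_modulus_from_strain(Y, nu, eps):
    """B = -dp/(dV/V), with dp = -sigma and sigma = Y*eps."""
    sigma = Y * eps  # negative for compression
    dp = -sigma      # the corresponding pressure increase
    return -dp / hydrostatic_volume_change(eps, nu)


# Illustrative values (assumptions): Y = 7.0e10 Pa, nu = 0.33, and a
# compressive strain of one part in a million.
Y, nu, eps = 7.0e10, 0.33, -1.0e-6

B_numeric = bulk_modulus_from_strain(Y, nu, eps)
B_formula = Y / (3.0 * (1.0 - 2.0 * nu))  # Eq. (16-26)
print(B_numeric, B_formula)
```

Repeating the calculation with ν closer and closer to 1/2 makes the computed volume change shrink toward zero and B grow without limit, which bears on the question just posed.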


We  give  two  other  useful  relations  among  elastic  moduli,  without 
proof. The shear modulus G is related to Y and $\nu$ by the relation

$$G = \frac{Y}{2(1 + \nu)} \tag{16-27}$$

and the three elastic moduli are related by

$$\frac{1}{Y} = \frac{1}{3G} + \frac{1}{9B} \tag{16-28}$$


The  proofs  of  these  two  equations  are  in  the  same  spirit  as  the  proof  of  Eq. 
(16-26)  in  Example  16-6. 
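
Although the independent proof of Eq. (16-28) is not given here, it is easy to confirm that the relation is at least consistent with Eqs. (16-26) and (16-27). The short check below takes those two equations as given and eliminates Poisson's ratio between them; it is a consistency check, not the proof referred to above.

```latex
% From Eq. (16-27):  G = Y / [2(1 + \nu)]
\frac{1}{3G} = \frac{2(1 + \nu)}{3Y}

% From Eq. (16-26):  B = Y / [3(1 - 2\nu)]
\frac{1}{9B} = \frac{3(1 - 2\nu)}{9Y} = \frac{1 - 2\nu}{3Y}

% Adding the two, the terms in \nu cancel:
\frac{1}{3G} + \frac{1}{9B}
  = \frac{2 + 2\nu + 1 - 2\nu}{3Y}
  = \frac{3}{3Y}
  = \frac{1}{Y}
```

The cancellation of the terms in Poisson's ratio shows that any two of the three moduli Y, G, and B suffice to determine the third.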


16-6 FLUID FRICTION, LAMINAR FLOW, AND TURBULENT FLOW


When  a solid  body  moves  through  a fluid  (or,  what  amounts  to  the  same 
thing,  a fluid  flows  past  the  body),  there  is  always  a force  of  fluid 
friction — often  called  the  drag  force — opposing  the  motion.  Some  of  the 
empirical  consequences  of  this  general  observation  were  discussed  in  Sec. 
4-6.  As  noted  there,  the  magnitude  of  the  drag  force  depends  upon  the  size 
and  shape  of  the  solid  body,  its  speed  relative  to  the  fluid,  the  density  of  the 
fluid,  and  the  viscosity  of  the  fluid. 

When  a relatively  small  body  moves  through  a fluid  at  a relatively  low 
speed,  the  magnitude  of  the  drag  force  depends  on  the  viscosity  of  the 
fluid.  In  this  case,  the  drag  force  is  often  called  viscous  drag.  (The  terms 
“relatively  small  body”  and  “relatively  low  speed”  will  be  defined  more 
quantitatively  later  in  this  section.)  Loosely  speaking,  viscosity  is  a measure 
of  the  “thickness”  of  a fluid.  Molasses  is  quite  viscous,  water  substantially 
less  so,  and  air  very  much  less  so.  For  most  fluids,  viscosity  depends  rather 
strongly  on  temperature,  as  evident  in  the  familiar  phrase  “slow  as  molasses 
in  January.”  One  rough-and-ready  way  of  measuring  viscosity  quantita- 
tively is  to  measure  how  long  it  takes  a specified  quantity  of  a fluid  to  flow 
out  of  a standard  container  through  a hole  of  specified  dimensions.  Such  a 
time,  measured  in  seconds,  is  in  fact  the  SAE  (Society  of  Automotive  Engi- 
neers) viscosity  number  used  for  motor  oils. 

Here  we  will  take  a more  fundamental  view.  We  have  defined  a fluid  as 
a substance which cannot sustain a shear stress. For the purposes of this
discussion, the word “sustain” is important. Unlike a solid, a fluid cannot be
sheared  statically  and  remain  under  stress.  (That  is,  if  you  apply  a shear 
strain  to  a fluid  it  has  no  tendency  to  “spring  back,”  but  ceases  to  resist  you 
as  soon  as  you  stop  moving.)  If,  however,  a pair  of  shear  forces  is  applied 
steadily  to  a fluid,  the  result  is  a shear  strain  which  increases  uniformly  in 
time.  As  long  as  this  dynamic  — not  static  — situation  holds  true,  a shear  stress 
is  maintained  in  the  fluid. 


Consider  the  case  illustrated  in  Fig.  16-14.  A fluid  fills  the  space  between 
two  very  large  parallel  plates  separated  by  a distance  d.  Plate  B is  moving 
relative to plate A at a speed v0, which is rather small. There is an imaginary
thin  layer,  called  a lamina,  of  fluid  which  lies  in  contact  with  the  stationary 
plate  A.  It  is  plausible  to  assume  (and  experiment  bears  out  this  assump- 
tion) that  this  lamina  is  substantially  at  rest.  Similarly,  we  can  assume  that 
the  lamina  in  contact  with  plate  B is  moving  at  substantially  the  same  speed 


718  Mechanics  of  Continuous  Media 




Fig.  16-14  Viscous  drag  in  an  idealized  system.  Two  very 
large  parallel  plates  A and  B are  separated  by  a distance  d. 
The  space  between  them  is  filled  with  fluid.  A constant 
force  F must  be  applied  to  plate  B to  keep  it  moving  at  a 
constant  speed  v0  with  respect  to  plate  A.  If  a thin  lamina 
of  fluid  at  a uniform  distance  y from  plate  A moves  with  a 
speed v(y) = v0 y/d, the flow is laminar. The text discusses
the situation where the fluid is a gas. Adjacent laminae a
and b move at speeds va and vb, respectively. Momentum is
transferred  between  the  laminae  by  molecules  such  as  1 and 
2,  which  migrate  from  one  to  the  other. 


as  plate  B itself.  Intermediate  laminae  move  with  speed 

$$ v(y) = v_0 \frac{y}{d} $$


where  y is  the  distance  of  a lamina  from  the  stationary  plate.  That  is,  each 
lamina  slips  slowly  past  its  neighbor  on  the  side  nearer  to  plate  A,  and  the 
speed  of  any  lamina  is  directly  proportional  to  its  distance  from  plate  A. 
This  orderly  motion  is  called  laminar  flow. 

Why  must  a force  be  continually  applied  to  keep  the  system  in  motion? 
That  is,  what  is  the  source  of  the  friction?  There  is  no  single  answer  for  all 
fluids,  but  the  qualitative  account  which  follows  is  correct  for  gases. 

Two  adjacent  laminae  a and  b are  shown  schematically  in  Fig.  16-14. 
Lamina  b is  moving  faster  than  lamina  a.  However,  the  molecules  in  both 
laminae  are  in  continual  individual  random  motion,  aside  from  their  par- 
ticipation in  the  overall  motion  of  the  laminae  of  which  they  are  part.  As  a 
result, molecules continually migrate from one lamina to the next. If molecule 1
moves from a to b, it will (on the average) be going too slowly to keep
up  with  the  overall  motion.  Other  molecules  in  lamina  b will  collide  with  it 
in  such  a way  as  to  accelerate  it  to  the  speed  characteristic  of  lamina  b (again 
on  the  average).  The  forward-directed  forces  necessary  to  maintain  the  dif- 
ferences in  speeds  of  the  laminae  are  transmitted  from  lamina  to  lamina  in 
the  same  way.  The  original  source  of  these  forces  is  plate  B. 

The  same  thing  happens  in  reverse  to  molecule  2,  which  migrates 
from  lamina  b to  lamina  a.  It  (and  molecules  acting  similarly)  must  be 
slowed  down.  The  necessary  backward  forces  come  ultimately  from 
plate  A. 

The shear stress applied by the plates to the fluid is defined by Eq.
(16-11), σs = F/a, where F is the applied force and a is the area of either
plate. It is found by experiment that σs is directly proportional to the relative
speed of the plates and inversely proportional to the distance between
them. This relation is expressed mathematically by the equation

$$ \sigma_s = \eta \frac{v_0}{d} \tag{16-29} $$


The proportionality constant η is called the coefficient of viscosity (or viscosity,
for short), which we used in Sec. 4-6 without defining it. According
to Eq. (16-29), the dimensions of viscosity are those of stress multiplied by
length divided by velocity, or stress multiplied by time. The SI unit of viscosity
is thus the pascal-second (Pa·s). [An older unit of viscosity still in
frequent use is the poise (P); 1 P = 0.1 Pa·s.] Some typical values of the coefficient
of viscosity η are given in Table 4-3.
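As a rough numerical illustration of Eq. (16-29), the sketch below evaluates the linear velocity profile and the force needed to keep plate B moving. The function names and the plate area, separation, and speed are assumptions chosen only for illustration; the viscosity is the round value for water, 1 × 10⁻³ Pa·s.

```python
def lamina_speed(v0, y, d):
    """Speed of the lamina a distance y from the stationary plate: v(y) = v0*y/d."""
    return v0 * y / d


def shear_stress(eta, v0, d):
    """Shear stress between the plates, Eq. (16-29): sigma_s = eta*v0/d."""
    return eta * v0 / d


def plate_force(eta, v0, d, area):
    """Force F = sigma_s * a needed to keep the moving plate at constant speed."""
    return shear_stress(eta, v0, d) * area


# Assumed illustrative values (hypothetical apparatus, not taken from the text).
eta_water = 1.0e-3   # Pa*s, round value for water
v0 = 0.10            # m/s, speed of plate B relative to plate A
d = 1.0e-3           # m, separation of the plates
area = 1.0           # m^2, area of either plate

print("speed at mid-gap:", lamina_speed(v0, d / 2, d), "m/s")
print("shear stress:", shear_stress(eta_water, v0, d), "Pa")
print("force on plate B:", plate_force(eta_water, v0, d, area), "N")
```

With these assumed numbers the required force is about 0.1 N; halving the gap d or doubling the speed v0 doubles it.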


16-6  Fluid  Friction,  Laminar  Flow,  and  Turbulent  Flow  719 




Fig.  16-15  A rotating-cylinder  viscometer.  The 
outer  cylinder,  of  radius  r2,  is  rotated  at  any  de- 
sired speed  by  the  turntable.  The  inner  cylinder, 
of  radius  r1,  hangs  from  a torsion  balance.  Not 
shown  is  a mechanism  for  keeping  the  inner  cy- 
linder coaxial  with  the  outer  one.  The  fluid  whose 
viscosity  is  to  be  measured  fills  the  space  between 
the cylinders to a height h. The viscosity is proportional
to the torque, measured by the torsion balance, which is
exerted on the inner cylinder, and inversely proportional
to the angular speed of the outer cylinder.


A device  used  to  measure  the  coefficient  of  viscosity  of  fluids  is  called  a 
viscometer.  One  form  of  an  important  type,  called  the  rotating-cylinder 
viscometer,  is  shown  schematically  in  Fig.  16-15.  It  consists  of  two  metal  cyl- 
inders whose  axes  are  made  to  coincide  with  high  accuracy.  The  outer  cyl- 
inder is  driven  by  a variable-speed  motor,  while  the  inner  one  is  suspended 
from  a torsion  balance.  The  space  between  the  cylinders  is  filled  with  the 
fluid (usually a liquid) whose viscosity is to be measured. As the outer cylinder
rotates, the liquid transmits to the inner cylinder a torque whose magnitude
depends on the value of η. Once the torque, the angular speed of
the  outer  cylinder,  and  the  dimensions  of  the  apparatus  are  known,  the 
coefficient  of  viscosity  can  be  calculated  by  Eq.  (16-29).  A typical  case  is  il- 
lustrated in  Example  16-7. 


EXAMPLE 16-7

You use a rotating-cylinder viscometer to measure the coefficient of viscosity of
castor oil at a temperature of 20°C. The radius of the inner cylinder is r1 = 4.00 cm,
and the radius of the outer cylinder is r2 = 4.28 cm. The inner cylinder is submerged
in the oil to a depth h = 10.2 cm. When the outer cylinder is rotating at 20.0
revolutions per minute, the torsion balance reads a torque T = 3.24 × 10⁻² N·m.
Find the viscosity of the castor oil.

■ The cylindrical space between the outer and inner cylinders of the viscometer
is  not  a bad  approximation  to  the  ideal  flat-plate  system  of  Fig.  16-14.  This  is  be- 
cause the  space  between  the  cylinders  is  quite  narrow  compared  to  the  radii  of  the 
cylinders.  Consequently,  Eq.  (16-29)  is  applicable  to  the  viscometer.  Solving  for  the 
coefficient  of  viscosity,  you  have 




$$ \eta = \sigma_s \frac{d}{v_0} \tag{16-30} $$


The wetted area of the inner cylinder is a = 2πr1h. Thus you have σs =
F/(2πr1h), where F is the drag force applied to the inner cylinder by the fluid, which
is driven by the outer cylinder. This force results in a torque T = r1F, which can be
read on the scale of the torsion balance. In terms of this torque, the shear stress is

$$ \sigma_s = \frac{T}{2\pi r_1^2 h} $$


Since  the  inner  cylinder  is  at  rest,  the  relative  speed  v0  of  the  two  cylinders  is 
given  by 


$$ v_0 = \omega r_2 $$

where ω is the angular speed of the outer cylinder. And the distance between the
cylinders is

$$ d = r_2 - r_1 $$

Using the above values of σs, v0, and d in Eq. (16-30), you obtain

$$ \eta = \frac{T (r_2 - r_1)}{2\pi \omega r_1^2 r_2 h} \tag{16-31} $$


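Inserting the numerical values gives you

$$ \eta = \frac{3.24 \times 10^{-2}\ \text{N}\cdot\text{m} \times 0.28 \times 10^{-2}\ \text{m}}{2\pi \times \dfrac{20.0\ \text{rev/min} \times 2\pi\ \text{rad/rev}}{60\ \text{s/min}} \times (4.00 \times 10^{-2}\ \text{m})^2 \times 4.28 \times 10^{-2}\ \text{m} \times 10.2 \times 10^{-2}\ \text{m}} $$

or

$$ \eta = 0.99\ \text{Pa}\cdot\text{s} $$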
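The arithmetic of Example 16-7 can be checked with a few lines of code. The sketch below is only an illustration: the function name and argument order are assumptions, while the formula is Eq. (16-31) and the numbers are those given in the example.

```python
import math


def viscometer_viscosity(torque, r1, r2, h, rev_per_min):
    """Viscosity from a rotating-cylinder viscometer, Eq. (16-31).

    torque in N*m, r1, r2, h in meters, rotation rate in rev/min.
    Uses the narrow-gap (flat-plate) approximation described in the example.
    """
    omega = rev_per_min * 2.0 * math.pi / 60.0   # angular speed in rad/s
    return torque * (r2 - r1) / (2.0 * math.pi * omega * r1**2 * r2 * h)


# Values from Example 16-7, converted to SI units.
eta = viscometer_viscosity(
    torque=3.24e-2,      # N*m
    r1=4.00e-2,          # m
    r2=4.28e-2,          # m
    h=10.2e-2,           # m
    rev_per_min=20.0,
)
print(f"eta = {eta:.2f} Pa*s")   # about 0.99 Pa*s, as in the example
```

Because the gap r2 - r1 is a small difference of two measured radii, a small error in either radius produces a large relative error in the computed viscosity.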

When  a fluid  moves  relative  to  a solid  body,  shear  stresses  must  be 
present.  In  Fig.  16-16,  a sphere  is  shown  in  a stream  of  fluid.  If  the  relative 
velocity  v0  is  small  enough,  the  flow  will  be  laminar.  Far  away  from  the 
sphere,  at  points  A,  B,  and  C,  the  fluid  is  essentially  unaffected  by  the 
sphere,  and  its  velocity  is  the  free-stream  velocity  v0.  In  the  lamina  of  fluid 
immediately  adjacent  to  the  surface  of  the  sphere,  the  velocity  must  be  es- 
sentially zero,  with  intermediate  velocities  at  intermediate  points.  Thus 
viscous  drag  must  be  present,  just  as  in  Fig.  16-14. 




Fig.  16-16  In  this  tracing  from  a photograph  of  an  actual  experiment,  a sphere 
of  radius  a disturbs  the  flow  of  a fluid  which  moves  uniformly  at  velocity  v0 
at  all  points  distant  from  the  sphere.  If  the  free-stream  speed  v0  is  small  enough, 
the  flow  is  laminar.  The  fluid  in  immediate  contact  with  the  sphere  is  essentially 
at  rest,  and  the  speed  of  the  fluid  increases  with  increasing  distance  from  the 
sphere,  attaining  the  free-stream  speed  v0  at  distant  points  such  as  A,  B,  and  C. 
The  flow  pattern  is  complex.  However,  the  fluid  passes  along  regular  stream- 
lines, some  of  which  are  shown.  Far  from  the  sphere,  the  streamlines  are  nearly 
straight  lines.  The  streamline  AC  along  the  axis  of  the  sphere  splits  into  semi- 
circular paths  around  the  sphere  and  reunites  into  a single  straight  line  behind 
it.  Intermediate  streamlines  have  intermediate  shapes. 




The  situation  here  is  much  more  complicated  than  that  in  Fig.  16-14. 
The  fluid  must  part  at  the  front  of  the  sphere  and  come  back  together  at 
the  back.  The  description  of  the  fluid  velocity  as  a function  of  position  is 
therefore  a three-dimensional  one,  and  the  laminae  have  a complicated 
shape,  unlike  the  simple  planar  laminae  of  Fig.  16-14.  We  can  describe 
their  shape  in  terms  of  the  paths  taken  by  small  elements  of  fluid  around 
the  sphere.  These  paths  are  called  streamlines. 

Although the analysis of the motion of a sphere through a viscous fluid
(or vice versa) is quite complicated, the result is simple. For a sphere of
radius r, the magnitude of the viscous drag force is given by Eq. (4-26).
With a slight change of notation, this is

$$ F = 6\pi \eta r v_0 \tag{16-32} $$

That is, the drag force F is directly proportional to the viscosity η of the
fluid,  the  radius  r of  the  solid  body,  and  the  magnitude  of  the  free-stream 
velocity,  called  the  free-stream  speed  v0.  This  rule,  which  is  valid  for  small 
enough  values  of  v0,  is  called  Stokes’  law,  after  the  British  theoretical  physi- 
cist Sir  George  Stokes  (1819-1903). 
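To get a feeling for the size of the Stokes drag force, you can evaluate Eq. (16-32) numerically. The sketch below assumes a small sphere moving slowly through water; the radius and speed are hypothetical values chosen to keep the motion slow, and the function name is likewise an assumption.

```python
import math


def stokes_drag(eta, r, v0):
    """Viscous drag on a sphere, Eq. (16-32): F = 6*pi*eta*r*v0."""
    return 6.0 * math.pi * eta * r * v0


# Assumed illustrative values (not from the text).
eta_water = 1.0e-3   # Pa*s, round value for water
r = 1.0e-3           # m, sphere radius
v0 = 1.0e-3          # m/s, free-stream speed

F = stokes_drag(eta_water, r, v0)
print(f"F = {F:.3e} N")

# The force is linear in each factor: doubling the radius doubles the drag.
print("ratio for doubled radius:", stokes_drag(eta_water, 2 * r, v0) / F)
```

The force in this assumed case is only about 2 × 10⁻⁸ N, which is why viscous drag matters mainly for small, slowly moving bodies.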

Stokes’  law  is  valid  only  for  spherical  obstacles  to  fluid  flow.  For  more 
complex  shapes,  the  analysis  becomes  very  difficult  and  may  not  be  possible 
at  all.  In  such  cases,  it  is  possible  to  resort  to  approximations  to  solve  the 
problem  numerically,  or  to  rely  completely  on  empirical  measurement. 
Regardless  of  shape,  however,  the  magnitude  of  the  drag  force  is  propor- 
tional to  the  free-stream  speed,  provided  the  speed  is  small  enough. 

What  happens  as  the  free-stream  speed  increases?  A point  is  reached 
where  the  flow  becomes  unstable.  The  orderly  laminae  break  up  and  are 
replaced  by  turbulent  eddies.  The  resulting  turbulent  flow  is  disorderly. 
Two  examples  of  turbulent  flow  are  illustrated  in  Fig.  16-17. 


Fig.  16-17  Turbulent  flow  past  two  like  cylinders.  The  cylinders  are  located  one  behind  the 
other  in  the  direction  of  flow  in  a wind  tunnel,  and  smoke  is  used  to  render  the  flow  pattern 
visible. (Courtesy of Union Carbide Corporation's Nuclear Division/Oak Ridge National Laboratory.)


The frictional drag associated with turbulent flow is much greater than
that  for  laminar  flow.  That  is,  the  drag  force  is  much  greater  than  that  pre- 
dicted by  the  relation  F = (constant)  v0.  The  turbulent  eddies,  or  vortices, 
are  much  more  efficient  mechanisms  for  mixing  rapidly  and  slowly  moving 
parts  of  the  fluid  than  is  the  interlaminar  diffusion  process  of  laminar  flow. 
As  a consequence,  the  solid  body  must  do  much  more  work  on  the  fluid  in 
passing  through  it.  The  result  is  a much  more  efficient  transfer  of  energy 
from  the  solid  body  to  the  fluid,  and  therefore  a much  more  rapid  dissipa- 
tion of  the  kinetic  energy  of  the  body  as  seen  by  an  observer  moving  with 
the  fluid  at  the  free-stream  speed. 


It is not usually possible to carry out a complete analysis of turbulent
flow. However, a great deal is known about important special systems. (One
example is the stalling of airplane wings, where an increase in the angle
between the wing and the oncoming air leads to a catastrophic increase in
drag.) Over a fairly wide range of free-stream speeds v0 the empirical rule
of Eq. (4-29) for turbulent flow is a fairly good approximation. With a slight
change in notation, this is

$$ F = \frac{\delta a \rho v_0^2}{2} \tag{16-33} $$

Here ρ is the density of the fluid, a is the cross-sectional area which the obstacle
presents to the fluid, and δ, the coefficient of drag, is a dimensionless
empirical constant which depends on the shape of the obstacle. The value
of δ is reasonably independent of a, ρ, and v0 over a fairly large range of
these parameters. Some typical values of δ are quoted in Table 4-4.

Equation  (16-33)  can  be  understood  as  follows.  Suppose  that  a solid 
body  having  cross-sectional  area  a moves  through  the  fluid  a distance  dx 
with  speed  v.  This  speed  is  equal  to  the  free-stream  speed  v0  of  the  fluid 
relative  to  the  solid  body.  In  moving  the  distance  dx,  the  body  vacates  be- 
hind itself  a volume  a dx,  and  this  volume  must  be  filled  with  fluid,  which 
moves  into  it  as  the  body  moves  out  of  it. 

Now  suppose  that  the  interaction  between  the  body  and  the  fluid  is 
such  that  all  the  fluid  which  has  replaced  the  body  is  dragged  along  with 
it — that  is,  the  fluid  in  the  volume  a dx  acquires  a speed  v.  If  its  mass  is  m,  it 
must acquire kinetic energy mv²/2.

If we assume that the fluid is incompressible, its density ρ is constant.
The mass of the fluid being dragged along by the body can be written m =
ρa dx. Its kinetic energy is thus (aρv²/2) dx. The source of this kinetic energy
is the work dW done on the fluid by the solid body as the body passes
through. The work is given by dW = F dx, where F is the force required to
drag the body through the fluid at speed v. We therefore have F dx =
(aρv²/2) dx, or

$$ F = \frac{a \rho v^2}{2} $$


In  effect,  a body  moving  through  a fluid  does  drag  some  of  the  fluid 
along  with  it.  This  can  be  seen  in  Fig.  16-17,  which  shows  various  forms 
taken  by  the  irregular  trail,  or  wake,  of  fluid.  The  turbulent  eddies  persist, 
because  it  takes  some  time  before  friction  and  other  influences  can  bring 
the  fluid  in  the  wake  back  to  rest  with  respect  to  the  surrounding  fluid.  This 
is  quite  different  from  the  case  of  laminar  flow,  where  the  fluid  returns 




immediately  to  its  undisturbed  state  once  the  solid  body  has  passed.  The 
wake  is  not  in  actuality  dragged  along  at  a speed  v in  uniform  fashion.  But 
the  energy  imparted  to  the  wake  by  the  passing  solid  body  of  cross-sectional 
area  a is  equal  to  that  which  would  be  imparted  by  an  ideal  wake  having  a 
different cross-sectional area given by the product δa. This quantity δa is
the area of an imaginary body which does drag along uniformly a wake
having its own cross-sectional area. The value of the constant δ depends on
the shape of the body. A streamlined body, such as an airplane fuselage,
slips through the fluid with minimal disturbance to the fluid. The area of
the wake is quite small, and the value of δ is correspondingly small. As
noted in Table 4-4, the value of δ for a streamlined airplane body is only
about 0.06. Loosely speaking, this means that only about 6 percent of the
displaced fluid is dragged along as the airplane body passes. On the other
hand, a circular disk moving face forward through a fluid actually leaves a
wake wider than itself; the corresponding value of δ is 1.2. You can experiment
with bodies of similar size but different shape by dragging them
through  water  in  a bathtub  and  observing  the  differences  among  their 
wakes. 

The constant δ thus relates the cross-sectional area a of a body to the
cross-sectional area of an imaginary body which leaves an ideal wake. It is
called the coefficient of drag, as already noted. The force F required to
drag the actual body through a fluid at speed v is equal to the force required
to drag the imaginary body through the fluid at the same speed.
Since F = aρv²/2 for a body of area a which leaves an ideal wake, replacing a by δa gives

$$ F = \frac{\delta a \rho v^2}{2} $$

which is Eq. (16-33) because v = v0.

At  large  enough  values  of  v,  Eq.  (16-33)  fails  also.  The  way  in  which  it 
fails  depends  very  much  on  the  specifics  of  the  system  in  question,  and  we 
cannot  discuss  the  matter  further  here. 
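The effect of shape on turbulent drag can be made concrete by evaluating Eq. (16-33) for the two values of δ quoted from Table 4-4. In the sketch below the air density, cross-sectional area, and speed are assumed values chosen only for illustration, and the function name is hypothetical.

```python
def turbulent_drag(delta, area, rho, v0):
    """Empirical turbulent drag, Eq. (16-33): F = delta*a*rho*v0**2/2."""
    return delta * area * rho * v0**2 / 2.0


# Coefficients of drag quoted in the text from Table 4-4.
DELTA_DISK = 1.2           # circular disk moving face forward
DELTA_STREAMLINED = 0.06   # streamlined airplane body

# Assumed illustrative values (not from the text).
rho_air = 1.2   # kg/m^3, approximate density of air near sea level
area = 1.0      # m^2, cross-sectional area presented to the flow
v0 = 10.0       # m/s, free-stream speed

F_disk = turbulent_drag(DELTA_DISK, area, rho_air, v0)
F_streamlined = turbulent_drag(DELTA_STREAMLINED, area, rho_air, v0)

print(f"disk:        F = {F_disk:.1f} N")
print(f"streamlined: F = {F_streamlined:.1f} N")
print(f"ratio:       {F_disk / F_streamlined:.0f}")
```

For equal area and speed the disk experiences 20 times the drag of the streamlined body, and doubling the speed quadruples the drag on either one.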


So  far  we  have  spoken  loosely  about  free-stream  speeds  as  being 
“small  enough”  or  “not  too  large.”  A very  useful  quantity  for  determining 
what these terms mean is the Reynolds number R, defined to be

$$ R = \frac{\rho v_0 d}{\eta} \tag{16-34} $$

where d is some dimension typical of the system, ρ is the density of the
fluid, and η is its coefficient of viscosity. For a sphere in a stream of fluid d is
the  diameter  of  the  sphere;  for  water  in  a pipe,  it  is  the  pipe  diameter;  for 
an  airplane,  it  is  some  average  of  the  length,  width,  and  height.  Transitions 
from  one  kind  of  flow  to  another  in  a system  of  a particular  geometry  are 
typified  by  a certain  value  of  the  Reynolds  number,  called  the  critical 
Reynolds  number  for  that  particular  transition.  The  same  critical  Reynolds 
number  will  characterize  the  value  of  v0  for  the  transition  from  laminar  to 
turbulent  flow  for  a bubble  of  air  rising  through  a pool  of  water  and  a bas- 
ketball falling  through  the  air.  Some  typical  critical  Reynolds  numbers  are 
given  in  Table  16-3  for  two  kinds  of  transition.  All  but  the  last  of  the  table 
entries specify the upper limit for the applicability of the v¹ rule of Stokes’
law. The last entry sets the upper limit for the v² rule of turbulent flow.
Since  large  Reynolds  numbers  imply  turbulent  flow  and  since  the  number 
is  proportional  to  the  product  of  the  free-stream  speed  v0  and  the  typical 




Table 16-3  Some Critical Reynolds Numbers

R (approx)   Phenomenon
10           Upper limit for strict conformance to Stokes’ law for a sphere
1200         Onset of turbulent flow in a cylindrical pipe with an irregular inlet
3000         Onset of turbulent flow in a long cylindrical pipe
20,000       Onset of turbulent flow in a pipe with entrance section of optimized shape
3 × 10⁵      Upper limit for v² law [Eq. (16-33)]


dimension d, the v¹ rule of Stokes’ law, Eq. (16-32), applies to small bodies
moving slowly. The v² empirical rule of Eq. (16-33) applies to larger bodies
moving more rapidly, while rules involving still higher powers of v apply to
still larger bodies moving still more rapidly.


A nuclear  submarine  is  100  m long.  The  shape  of  its  hull  is  roughly  cylindrical,  with 
a diameter  of  15  m.  When  it  is  submerged,  it  cruises  at  a speed  of  about  40  knots, 
or  20  m/s.  Is  the  flow  of  water  around  the  hull  laminar  or  turbulent? 

■ Even though the water in question is seawater rather than pure water, the values
of the viscosity η and density ρ for pure water are a sufficiently good approximation
for the purposes of calculating the Reynolds number. From Table 4-3 you
have η = 1 × 10⁻³ Pa·s. And the density of water is ρ = 1 × 10³ kg/m³. In choosing
a value for d to use in Eq. (16-34), you must guess at some value between the length
of 100 m and the diameter of 15 m, so you can try d = 30 m. Equation (16-34) then
gives you

$$ R = \frac{\rho v_0 d}{\eta} = \frac{1 \times 10^{3}\ \text{kg/m}^3 \times 20\ \text{m/s} \times 30\ \text{m}}{1 \times 10^{-3}\ \text{Pa}\cdot\text{s}} = 6 \times 10^{8} \approx 1 \times 10^{9} $$


Referring  to  Table  16-3,  you  see  that  this  value  is  far  above  the  critical  Reynolds 
numbers  given  for  transition  from  laminar  to  turbulent  flow,  so  the  flow  must  be 
turbulent  and  the  drag  force  is  probably  proportional  to  a power  of  v greater  than 
the  square.  Is  the  flow  of  water  around  a ship  ever  laminar  for  practical  purposes? 
Suppose  you  replace  the  engine  of  a ship  with  one  of  double  power  output.  Will  this 
affect  the  maximum  speed  appreciably? 
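The estimate in this example is easy to repeat for other bodies. The sketch below computes the Reynolds number for the submarine and compares it with the critical values of Table 16-3; the function name and the list layout are assumptions made for illustration, and the choice d = 30 m is the same guess used in the example.

```python
def reynolds_number(rho, v0, d, eta):
    """Reynolds number R = rho*v0*d/eta for free-stream speed v0 and typical size d."""
    return rho * v0 * d / eta


# Critical Reynolds numbers from Table 16-3 (approximate values).
TABLE_16_3 = [
    (10.0, "upper limit for Stokes' law for a sphere"),
    (1200.0, "onset of turbulence, pipe with irregular inlet"),
    (3000.0, "onset of turbulence, long cylindrical pipe"),
    (20000.0, "onset of turbulence, pipe with optimized entrance"),
    (3.0e5, "upper limit for the v^2 law"),
]

# Values used in the submarine example.
rho_water = 1.0e3    # kg/m^3
eta_water = 1.0e-3   # Pa*s
v0 = 20.0            # m/s
d = 30.0             # m, guessed typical dimension

R = reynolds_number(rho_water, v0, d, eta_water)
print(f"R = {R:.1e}")

for r_crit, phenomenon in TABLE_16_3:
    relation = "above" if R > r_crit else "below"
    print(f"  {relation} R = {r_crit:g}: {phenomenon}")
```

The computed value is about 6 × 10⁸, which is of the order of 10⁹ and lies above every entry in the table. Trying d = 15 m or d = 100 m instead changes R by less than a factor of 4, so the conclusion does not depend on the guess.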


16-7  DYNAMICS OF IDEAL FLUIDS

Section 16-6 was concerned with fluids in motion. Our attention was focused,
however, on the frictional forces which remove mechanical energy
from the system and hence tend to make it come to rest. Here we adopt the
point  of  view  which  has  proved  so  fruitful  in  studying  systems  of 
particles — we  ignore  friction.  That  is,  we  assume  that  the  fluid  under  study 
has  zero  viscosity.  As  an  additional  simplification,  we  assume  that  the  fluid  is 
incompressible  as  well.  Such  a fluid  is  called  an  ideal  fluid. 

Since  the  constituent  elements  of  a fluid  have  mass  and  since  they  exert 
forces  on  one  another,  fluids  can  possess  kinetic  and  potential  energy  just  as 
solids can. However, we do not usually consider the motion of fluids in discrete
blobs. Rather, we are interested in the way in which they flow in a continuous
stream, as in a pipe. It is therefore most useful to consider the energy
per unit volume of the ideal fluid rather than the total energy.




Fig.  16-18  A tube  of  flow.  Typical 
streamlines  within  the  tube  are  shown 
as dashed lines. All fluid passing through
the  tube  must  first  penetrate  the  planar 
cross  section  M and  later  the  planar 
cross  section  N. 


But  even  if  we  imagine  a certain  volume  of  an  ideal  (and  therefore 
incompressible)  fluid  to  be  flowing  through  a system  of  pipes,  tanks,  and  so 
forth,  the  shape  of  this  volume  generally  will  not  remain  constant.  If  the 
pipe  narrows,  for  instance,  a squat,  cylindrical  volume  of  fluid  will  become 
long and thin, as the “front” end of the cylinder enters the narrow region
first and speeds up first. To deal with this difficulty, we introduce the concept
of the tube of flow. We assume that the fluid flows steadily. In this
so-called  steady  state  each  microscopic  element  of  fluid  follows  a stream- 
line (see  the  definition  of  streamline  in  Sec.  16-6).  You  may  think  of  a tube 
of  flow  as  a bundle  of  streamlines,  as  illustrated  in  Fig.  16-18.  In  the  ab- 
sence of  viscosity,  all  elements  of  the  fluid  on  any  surface  normal  to  the 
streamlines flow at the same speed. Thus elements of fluid which are simultaneously
located on the plane surface M will later find themselves simultaneously
on the plane surface N.

It  follows  directly  from  the  definition  of  a streamline  that  no  fluid 
enters  or  leaves  the  tube  of  flow  through  its  sides.  However,  every  bit  of 
fluid  which  crosses  the  surface  M must  later  cross  the  surface  N.  Since  the 
flow  is  steady,  the  masses  of  fluid  crossing  these  surfaces  per  second  are  the 
same. 


This  is  true  for  steady  flow  even  if  the  fluid  is  compressible.  To  see  this,  imag- 
ine a hypothetical  tube  of  flow  in  which  the  fluid  density  is  uniform,  except  in  one 
region  where  it  has  some  greater  value,  because  the  pressure  there  is  greater.  Even 
though  this  region  contains  more  fluid  per  unit  volume  than  the  rest  of  the  tube  of 
flow,  matter  must  leave  the  region  at  the  same  rate  as  it  enters  or  else  the  local  den- 
sity will  change,  in  contradiction  to  what  we  mean  by  steady  flow. 


The rate at which mass crosses a surface is called the flux. More specifically,
the rate of flow of mass is called the mass flux Φ. That is, if in time dt
the mass dm crosses the surface, then the mass flux is defined to be

$$ \Phi = \frac{dm}{dt} \tag{16-35} $$


[See Sec. 12-6 for a different but related use of the concept of flux. In the
three-dimensional situation considered there, we used the symbol S to represent
the (energy) flux, that is, the (energy) flow per unit time across a
unit area. Here we use the symbol Φ to represent the total (mass) flux, that
is, the (mass) flow per unit time across a surface of arbitrary area a. The
relation between the two quantities is S = Φ/a.]

It is useful to reexpress Eq. (16-35) in terms of the density ρ, the mass
per unit volume. From Eq. (16-16) we have

$$ m = \rho V $$

where the mass m of fluid occupies a volume V. Since the density ρ is a
constant for an incompressible fluid, we can substitute this expression into
Eq. (16-35), writing dm = ρ dV to obtain

$$ \Phi = \rho \frac{dV}{dt} \tag{16-36} $$

It is often convenient to consider flux Φ as a signed scalar, with flux
into a closed region having a positive value and flux outward having a negative
value. For the region enclosed by M and N in Fig. 16-18, you can see that

$$ \Phi_N = -\Phi_M \tag{16-37} $$




Since  the  flux  through  the  walls  of  the  tube  is  zero,  Eq.  (16-37)  can  be 
written  in  the  more  general  form 


Fig.  16-19  A tube  of  flow  shown  in 
profile.  All  fluid  located  on  the  surface 
M at  a certain  instant  lies  at  a time  dt 
later  on  another  surface  located  a dis- 
tance ds  downstream.  The  local  speed  of 
the  fluid  is  vM  = ds/dt.  A similar  state- 
ment can  be  made  about  fluid  which  at 
a certain  instant  lies  on  the  surface  N. 
But  if  the  area  aN  of  surface  N is  not  the 
same  as  the  area  aM  of  surface  M,  then 
vN ≠ vM.


$$ \sum_{\text{entire surface}} \Phi = 0 \tag{16-38} $$

where  “entire  surface"  includes  M,  N,  and  the  boundary  of  the  tube  of  flow 
between  them.  Either  Eq.  (16-37)  or  (16-38)  is  called  the  continuity  equa- 
tion. The  continuity  equation  says  simply  that  the  net  amount  of  fluid  en- 
tering the  tube  of  flow  is  zero,  provided  that  fluid  is  neither  created  nor 
destroyed  within  the  tube.  It  is  true  of  any  closed  region  in  a steadily  flow- 
ing fluid,  provided  that  there  are  no  sources,  or  sinks,  of  fluid  within  the 
region — that  is,  places  where  fluid  “appears”  or  “disappears.”  The  conti- 
nuity equation  is  of  fundamental  importance  in  the  theory  of  fluids,  where 
it  is  quite  clear  what  is  flowing.  As  you  will  see  in  Chap.  20,  it  is  equally  im- 
portant in  the  theory  of  electricity,  where  what  is  “flowing”  is  not  fluid  but 
the  much  more  abstract  entity  called  electric  field. 


Associated  with  the  fluid  is  a velocity.  Consider  an  imaginary  surface, 
moving  with  the  fluid,  which  passes  through  the  stationary  surface  M at  a 
certain  moment.  At  a time  dt  later,  the  moving  surface  will  have  passed 
downstream  an  infinitesimal  distance  ds,  as  shown  in  Fig.  16-19.  Since  ds  is 
infinitesimal,  the  cross-sectional  area  of  the  tube  of  flow  will  not  be  appre- 
ciably different  from  aM,  its  value  at  M.  The  volume  dV  of  the  space  con- 
tained between  M and  the  moving  surface  contains  all  the  fluid  which  has 
passed  through  the  surface  M in  the  time  dt,  and  no  other  fluid  has  entered 
this  space.  The  volume  of  fluid  passing  M in  time  dt  is  thus 

dV  = aM  ds 

Equation (16-36) can therefore be written in the form

$$ \Phi_M = \rho a_M \frac{ds}{dt} $$

Since vM = ds/dt is the speed of fluid flow at M, this can be written

$$ \Phi_M = \rho a_M v_M \tag{16-39} $$

The same argument can be applied at any other location along the tube of
flow, so that in general at any location the flux has magnitude

$$ |\Phi| = \rho\, a(s)\, v(s) \tag{16-40} $$

where a and v are functions of the distance s along the curved path followed
by the fluid. [Compare this equation with the analogous equation,
Eq. (12-62) for energy flux, which in our present notation would be written
|Φ|/a = ρv.]

Now take the case of an incompressible fluid, where ρ is constant.
Combining Eq. (16-40) with Eq. (16-37), the continuity equation for any
two surfaces M and N, immediately yields

$$ \frac{v_M}{v_N} = \frac{a_N}{a_M} \tag{16-41} $$


That  is,  for  an  incompressible  fluid  the  flow  speed  is  inversely  proportional 
to  the  cross-sectional  area  of  the  tube  of  flow. 
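The continuity relation of Eq. (16-41) can be tried out on a pipe that narrows. In the sketch below the pipe areas and the inlet speed are assumed values, and the function names are hypothetical; the mass flux is computed from Eq. (16-40) at both cross sections to confirm that its magnitude is the same.

```python
def mass_flux(rho, area, speed):
    """Magnitude of the mass flux, Eq. (16-40): |Phi| = rho*a*v, in kg/s."""
    return rho * area * speed


def speed_at_N(v_M, a_M, a_N):
    """Flow speed at surface N from Eq. (16-41): v_N = v_M * a_M / a_N."""
    return v_M * a_M / a_N


# Assumed illustrative values (not from the text).
rho_water = 1.0e3   # kg/m^3
a_M = 4.0e-4        # m^2, cross-sectional area at M
a_N = 1.0e-4        # m^2, cross-sectional area at N (narrower)
v_M = 0.50          # m/s, flow speed at M

v_N = speed_at_N(v_M, a_M, a_N)
print(f"v_N = {v_N:.2f} m/s")
print(f"|Phi| at M = {mass_flux(rho_water, a_M, v_M):.3f} kg/s")
print(f"|Phi| at N = {mass_flux(rho_water, a_N, v_N):.3f} kg/s")
```

Reducing the area to one-fourth of its original value makes the fluid move four times as fast, while the mass crossing each surface per second is unchanged.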



Fig.  16-20  Diagram  for  the  derivation 
of  Bernoulli's  theorem.  A region  MN  of 
a tube  of  flow  is  shown.  Fluid  enters 
the  region  through  surface  M at  speed 
vM.  Its  local  pressure  is  pM,  and  M lies 
at  a height  yM  with  respect  to  an  arbi- 
trary reference  level  y = 0.  Fluid  leaves 
the region through surface N, where the
corresponding parameters are vN, pN,
and yN.


We  are  now  ready  to  apply  energy  considerations  to  an  incompressible 
fluid  passing  through  a tube  of  flow.  This  will  lead  to  a general  expression 
which  relates  the  change  in  pressure  of  the  fluid,  the  change  in  its  speed, 
and  the  change  in  its  vertical  position  as  the  fluid  flows.  Through  any 
cross-sectional area such as those labeled M and N in Fig. 16-20, there constantly
passes a mass flux whose magnitude is |Φ|. If the pressure of the
fluid  at  M is  pM,  there  must  be  a force  FM  = pMaM  exerted  on  the  fluid  above 
M by  the  entering  fluid.  According  to  Eq.  (8-4),  the  power  P expended  by  a 
force  F applied  to  an  object  along  the  direction  of  its  motion  is  related  to 
the  force  and  the  speed  v of  the  point  of  application  of  the  force  by  the  re- 
lation P = Fv.  Hence  the  power  necessary  to  move  the  fluid  is 

$$P_M = F_M v_M$$

or

$$P_M = p_M a_M v_M$$

This power represents energy flowing into the fluid in the region MN. According
to Eq. (16-39), the fluid speed $v_M$ at M is given by $v_M = |\Phi|/\rho a_M$.
Since $\Phi_M$, the flux into the region MN at M, has a positive value, this can be
written $v_M = \Phi_M/\rho a_M$. Thus the input power to the region MN is

$$P_M = p_M \frac{\Phi_M}{\rho}$$


At the same time, mass flux $\Phi_N$ of equal magnitude but opposite sign is
leaving the region MN at N. The power required to expel this fluid has the
negative value

$$P_N = p_N \frac{\Phi_N}{\rho} = -p_N \frac{\Phi_M}{\rho}$$


It  represents  energy  flowing  out  of  the  region  MN.  The  net  power  input  P to 
the  region  MN  is  thus 


$$P = P_M + P_N = (p_M - p_N)\frac{\Phi_M}{\rho}$$

And since $\Phi_M = |\Phi|$, this can be written

$$P = (p_M - p_N)\frac{|\Phi|}{\rho} \tag{16-42}$$


Let  us  assume  for  the  sake  of  argument  that  the  value  of  P in  Eq. 
(16-42)  is  positive.  That  is,  whatever  is  driving  fluid  into  the  region  MN 
must  do  work  on  it.  (The  ultimate  source  of  this  work  might  be  a pump.) 
Since  the  fluid  is  ideal,  this  input  tends  to  increase  the  energy  of  the  fluid  in 
the  region  MN.  But  we  have  stipulated  that  the  system  is  in  the  steady  state. 
This  means  that  the  energy  content  of  the  region  MN  (or  any  other  region 
in  the  system)  cannot  change.  If  this  condition  is  to  be  satisfied,  the  fluid 
must  transport  out  of  region  MN  an  amount  of  energy  per  unit  time  equal 
to the power input. This the fluid can do if the sum of its kinetic energy and
its  potential  energy  as  it  passes  out  of  the  region  at  N is  greater  than  the 
corresponding  sum  as  it  passes  into  the  region  at  M.  More  specifically,  the 
total  mechanical  energy  of  the  fluid  which  leaves  the  region  per  unit  time  at 
N minus  the  total  mechanical  energy  of  the  fluid  which  enters  the  region 
per  unit  time  at  M must  be  equal  to  the  power  input  to  the  region. 


728  Mechanics  of  Continuous  Media 


Equation  (16-42)  already  gives  a quantitative  expression  for  the  power 
input to region MN. (If the value of the term $p_M - p_N$ is negative, that is, if
the fluid pressure is greater at N than at M, the power “input” is negative
and  thus  signifies  a power  output  from  the  region.)  In  order  to  write  quan- 
titatively the  equation  described  in  words  at  the  end  of  the  previous  para- 
graph, we  must  find  quantitative  expressions  for  the  rates  at  which  the  fluid 
carries  energy  into  and  out  of  the  region.  We  do  this  separately  for  kinetic 
and  potential  energy,  and  then  we  take  the  sum  of  the  two  to  find  the  rate 
of  transport  of  total  energy. 

First consider the kinetic energy. In a time $dt$, an amount of fluid
having mass $dm$ enters the region MN at M, where it has a speed $v_M$. The
kinetic energy of this fluid is thus $dK_M = dm\, v_M^2/2$. Thus the rate at which
kinetic energy enters the region at M is

$$\frac{dK_M}{dt} = \frac{dm}{dt}\,\frac{v_M^2}{2}$$


But at M, $dm/dt = \Phi_M$, so we have

$$\frac{dK_M}{dt} = \tfrac{1}{2}\Phi_M v_M^2$$


for the rate at which kinetic energy enters MN at M. An identical argument
is made at N. Here the mass flux is $\Phi_N = -\Phi_M$, and the fluid flows with
speed $v_N$. This leads to the corresponding expression for the rate $dK_N/dt$ at
which kinetic energy enters MN at N. It has the negative value

$$\frac{dK_N}{dt} = \tfrac{1}{2}\Phi_N v_N^2 = -\tfrac{1}{2}\Phi_M v_N^2$$

The negative value signifies the fact that kinetic energy is actually leaving
MN at N.

The net rate $dK/dt$ at which the fluid carries energy into the region MN
in the form of kinetic energy is $dK/dt = dK_M/dt + dK_N/dt$. Consequently,
the net rate at which the fluid carries kinetic energy out of the region is

$$-\frac{dK}{dt} = -\frac{dK_N}{dt} - \frac{dK_M}{dt}$$

or

$$-\frac{dK}{dt} = \tfrac{1}{2}\Phi_M (v_N^2 - v_M^2) = \tfrac{1}{2}|\Phi| (v_N^2 - v_M^2) \tag{16-43}$$


In  the  particular  case  shown  in  Fig.  16-20,  this  represents  a net  outflow  of 
kinetic  energy,  since  vN  > vM. 

Similarly, there is a change in the potential energy, since as fluid enters
MN at the height $y_M$ (with respect to an arbitrary reference level), an equal
amount leaves at the generally different height $y_N$. A calculation analogous
to the calculation above for $-dK/dt$ yields a rate $-dU/dt$ at which the fluid
carries potential energy out of region MN. This rate is given by

$$-\frac{dU}{dt} = -\frac{dU_N}{dt} - \frac{dU_M}{dt}$$

or

$$-\frac{dU}{dt} = |\Phi|\, g\, (y_N - y_M) \tag{16-44}$$




where  g is  the  acceleration  of  gravity.  In  the  particular  case  shown  in  Fig. 
16-20,  this  represents  a net  outflow  of  potential  energy,  since  yN  > yM. 


We  have  now  evaluated  P,  the  flow  of  power  into  the  region  MN,  and 
$-dK/dt - dU/dt$, the rate of flow of mechanical energy out of the region.
To  satisfy  the  steady-state  condition,  these  quantities  must  be  equal,  so  that 
the  total  energy  content  of  the  region  may  remain  constant.  That  is,  the 
steady-state  condition  requires 


$$P = -\frac{dK}{dt} - \frac{dU}{dt} \tag{16-45}$$


The  three  quantities  in  this  equation  are  given,  respectively,  by  Eqs. 
(16-42),  (16-43),  and  (16-44).  Inserting  those  values  into  Eq.  (16-45)  gives 

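$$(p_M - p_N)\frac{|\Phi|}{\rho} = \tfrac{1}{2}|\Phi|(v_N^2 - v_M^2) + |\Phi|\, g\,(y_N - y_M)$$

Canceling the common factor $|\Phi|$ and multiplying through by $\rho$, we obtain

$$p_M - p_N = \tfrac{1}{2}\rho (v_N^2 - v_M^2) + \rho g (y_N - y_M) \tag{16-46}$$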


This  equation  tells  us  that  in  the  steady  state  the  power  put  into  region  MN 
due to the pressure difference between its ends results in a change in the
speed  and/or  height  of  the  fluid  passing  through  it.  These  changes  just  suf- 
fice to  remove  energy  from  MN  at  a rate  which  keeps  the  energy  content  of 
the  region  constant. 
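
The steady-state bookkeeping behind Eq. (16-46) can be checked numerically. The sketch below picks conditions at M and a speed and height at N, solves Eq. (16-46) for the pressure at N, and then confirms that the power input of Eq. (16-42) equals the sum of the outflow rates of Eqs. (16-43) and (16-44). The function names and all numerical values (water, a mass flux of 1.0 kg/s, and so on) are assumptions chosen only for illustration.

```python
import math

def downstream_pressure(p_m, v_m, v_n, y_m, y_n, rho, g=9.8):
    """Solve Eq. (16-46) for p_N, given conditions at M and the speed and height at N."""
    return p_m - 0.5 * rho * (v_n**2 - v_m**2) - rho * g * (y_n - y_m)

def power_terms(p_m, p_n, v_m, v_n, y_m, y_n, rho, flux, g=9.8):
    """Return (P, -dK/dt, -dU/dt) from Eqs. (16-42), (16-43), and (16-44)."""
    power_in = (p_m - p_n) * flux / rho
    kinetic_out = 0.5 * flux * (v_n**2 - v_m**2)
    potential_out = flux * g * (y_n - y_m)
    return power_in, kinetic_out, potential_out

# Assumed values: water speeding up from 1.0 m/s to 4.0 m/s while rising 2.0 m.
rho_water = 1.0e3   # kg/m^3
flux = 1.0          # kg/s, magnitude of the mass flux
p_m = 2.0e5         # Pa
p_n = downstream_pressure(p_m, 1.0, 4.0, 0.0, 2.0, rho_water)
P, k_out, u_out = power_terms(p_m, p_n, 1.0, 4.0, 0.0, 2.0, rho_water, flux)

print(f"p_N = {p_n:.0f} Pa")
print(f"P = {P:.2f} W, -dK/dt = {k_out:.2f} W, -dU/dt = {u_out:.2f} W")
# Steady-state condition, Eq. (16-45): P = -dK/dt - dU/dt
assert math.isclose(P, k_out + u_out)
```

Because the mass flux multiplies every term, changing `flux` rescales all three rates together and leaves the pressure relation untouched, which is why it cancels in Eq. (16-46).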


We  have  taken  here  a point  of  view  subtly  different  from  the  one  taken  in  all 
previous  energy  calculations.  Up  to  now,  we  have  spoken  of  the  energy  (kinetic  or 
potential)  possessed  by  matter.  It  is  still  true  here  that  the  energy  involved  is  the 
mechanical  energy  of  matter,  namely,  the  energy  of  the  fluid.  But  because  of  the 
continuous  nature  of  the  flow,  it  is  more  convenient  to  think  in  terms  of  the  energy 
content  of  a region.  The  specific  fluid  contained  in  the  region  changes  from  instant 
to  instant.  But  because  of  the  steady  flow,  the  region  does  not  look  different  from 
instant  to  instant.  So  we  argue  in  terms  of  the  energy  content  of  the  region  — 
energy  which  is  actually  possessed  by  different  parts  of  the  fluid  as  time 
passes — instead  of  the  energy  content  of  a moving  “package”  of  fluid.  By  doing  so 
we  take  advantage  of  the  steady  state  to  write  what  appears  superficially  to  be  an 
energy  conservation  equation,  Eq.  (16-45).  But  neither  the  region  MN  nor  the  fluid 
system  as  a whole  is  an  isolated  system,  and  in  fact  energy  is  not  conserved.  The 
pump  or  other  device  driving  the  fluid  through  the  system  continually  increases 
the  energy  of  the  fluid.  Nevertheless,  the  energy  content  of  region  MN  does  not 
change  with  time,  and  Eq.  (16-45)  holds.  This  pseudoconservation  equation  is 
called  a steady-state  equation.  We  make  further  use  of  such  equations  in  the  study 
of  electric  current  flow. 


We now rewrite Eq. (16-46) by subtracting $p_M - p_N$ from both sides.
This gives

$$p_N - p_M + \tfrac{1}{2}\rho (v_N^2 - v_M^2) + \rho g (y_N - y_M) = 0$$

Defining the differences $\Delta p = p_N - p_M$, $\Delta v^2 = v_N^2 - v_M^2$, and $\Delta y = y_N - y_M$,
we can write this equation in the form

$$\Delta p + \tfrac{1}{2}\rho\, \Delta v^2 + \rho g\, \Delta y = 0 \tag{16-47}$$

Since region MN was defined arbitrarily, this equation applies to the differences
between any two locations in the fluid. It is called Bernoulli’s




Fig. 16-21 Application of Bernoulli’s theorem
to steady flow in a nonlevel pipe of uniform
cross section. The theorem gives the change
in pressure from $p_1$, the value at a location
having height $y_1$, to $p_2$, the value at a location
having height $y_2$. The speed of the fluid does


theorem,  after  Daniel  Bernoulli  (1700-1782),  a noted  mathematician  and 
physicist  from  a family  of  many  distinguished  Swiss  mathematicians,  physi- 
cists, and  other  scholars.  Bernoulli  made  significant  contributions  to  the 
theory  of  differential  equations  and  to  the  theory  of  fluids. 

Bernoulli’s  theorem  has  a number  of  important,  simple,  special  cases. 
If we set $v = 0$ everywhere (the hydrostatic case), then $\Delta v^2 = 0$, and

$$\Delta p = -\rho g\, \Delta y \tag{16-48}$$


This  is  just  Eq.  (16-17)  written  in  a slightly  different  form.  Even  if  the  flow 
speed  is  not  zero  but  is  the  same  everywhere  (as  is  the  case  for  a pipe  of  con- 
stant cross  section),  Eq.  (16-48)  still  holds.  This  is  illustrated  in  Fig.  16-21. 
While  such  a case  differs  from  the  hydrostatic  case  in  that  the  fluid  has 
kinetic  energy,  the  kinetic  energy  does  not  change  as  the  fluid  moves  along. 
Thus  any  increase  in  the  gravitational  potential  energy  of  the  fluid  per  unit 
volume [the third term in Eq. (16-47)] must be accompanied by a numerically
equal decrease in the pressure, just as in the hydrostatic case. This is
illustrated  in  Example  16-9. 
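
A minimal numerical sketch of Eq. (16-48) for flow at constant speed follows. The function name and the sample case (water rising 3.0 m in a pipe of uniform cross section) are assumptions made for illustration.

```python
def pressure_change_uniform_pipe(delta_y, rho, g=9.8):
    """Eq. (16-48): pressure change for a height change delta_y at constant flow speed."""
    return -rho * g * delta_y

# Assumed example: water (rho = 1.0e3 kg/m^3) rising 3.0 m in a uniform pipe.
dp = pressure_change_uniform_pipe(3.0, 1.0e3)
print(f"Δp = {dp:.0f} Pa")
```

The flow speed does not appear at all, which is the point of the special case: only the change in height matters.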


EXAMPLE  16-9 


The  tank  shown  in  Fig.  16-22  is  kept  filled  with  water  to  a depth  of  8.0  m. 

a.  Find  the  speed  with  which  the  jet  of  water  emerges  from  the  small  pipe 
just  at  the  bottom  of  the  tank. 

■ You apply Bernoulli’s theorem to the differences $\Delta v^2$, $\Delta p$, and $\Delta y$ between
the  locations  t and  b in  Fig.  16-22.  Since  the  tank  is  large  and  the  pipe  is  small,  you 
can  neglect  the  speed  with  which  water  at  the  top  of  the  tank  descends  through  the 
tank  to  replace  the  water  flowing  out  via  the  jet  at  the  bottom.  That  is,  the  water  has 
negligible  speed  until  it  is  actually  in  the  outlet  pipe.  In  applying  Eq.  (16-47),  you 
can  therefore  write 


$$\Delta v^2 = v_b^2 - 0$$

so that

$$\Delta v^2 = v_b^2$$


The  water  surface  in  the  tank  is  at  atmospheric  pressure.  But  so  is  the  water  jet, 
which  consists  of  water  that  has  emerged  from  the  tank  and  is  not  subject  to  the 






Fig.  16-22  Illustration  for  Example 
16-9.  The  faucet  keeps  the  tank  filled 
to  the  top  as  water  flows  out  through 
the  pipe  at  the  bottom  and  drips  out 
through  the  pipe  just  below  the  surface. 


Fig.  16-23  A beach  ball  is  kept  sus- 
pended in  the  blast  of  air  issuing  from 
a vacuum  cleaner  hose.  The  ball  is  shown 
at  a moment  when  it  has  moved  to  the 
right  of  the  center  of  the  air  blast.  In 
this  application  of  the  Venturi  effect, 
the  speed  v2  of  the  air  passing  the  ball 
on  the  right  is  less  than  the  speed  v1 
of  the  air  passing  on  the  left.  The  air 
pressure  is  therefore  greater  on  the  right 
than  on  the  left,  and  the  ball  experiences 
a net  force  Fnet  which  tends  to  restore 
it  to  the  center  of  the  air  blast. 


hydrostatic  pressure  of  the  water  in  the  tank.  Since  the  change  in  atmospheric  pres- 
sure over  a height  difference  of  8.0  m is  negligible,  you  can  write 

$$\Delta p = 0$$

Equation (16-47) thus becomes $\tfrac{1}{2}\rho\, \Delta v^2 + \rho g\, \Delta y = 0$, or

$$v_b^2 = -2g\, \Delta y$$

or

$$v_b = \sqrt{2g(-\Delta y)} \tag{16-49}$$

Comparing  water  at  the  top  of  the  tank  and  at  the  level  of  the  jet,  you  have 

$$\Delta y = -8.0\ \text{m}$$

Inserting the numerical values given, you have

$$v_b = \sqrt{2 \times 9.8\ \text{m/s}^2 \times 8.0\ \text{m}} = 13\ \text{m/s}$$ ■

b. The upper pipe in Fig. 16-22 is located just under the surface of the water.
Its  free  end  is  plugged  except  for  a small  hole  through  which  water  drips.  Neg- 
lecting air  resistance,  find  the  speed  vt  of  a drop  from  the  upper  pipe  just  as  it  falls 
past  the  bottom  of  the  tank. 

■ You have, in fact, already solved this part of the example. Equation (16-49),
which uses Bernoulli’s theorem to find the speed of the water in the jet leaving the
tank at its bottom, is identical to the expression for the speed of an object which falls
from rest through a vertical distance $-\Delta y$. Thus you have for the speed of the drops
as they pass the bottom of the tank

$$v_t = 13\ \text{m/s}$$

And in general,

$$v_t = v_b$$

A physical  explanation  for  the  identity  of  the  speeds  will  be  given  shortly.  The  fact 
that  a liquid  emerges  from  a tank  at  a given  depth  at  the  same  speed  which  it  would 
acquire  in  falling  from  rest  through  the  same  vertical  distance  is  called  Torricelli’s 
theorem. 
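
The equality of the two speeds in Example 16-9 can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, and the free-fall speed is computed independently from the fall time rather than from Eq. (16-49), so that the comparison is not circular.

```python
import math

def efflux_speed(depth, g=9.8):
    """Torricelli's theorem, Eq. (16-49), with depth = -Δy."""
    return math.sqrt(2.0 * g * depth)

def free_fall_speed(height, g=9.8):
    """Speed after falling from rest through `height`, found from v = g t."""
    fall_time = math.sqrt(2.0 * height / g)
    return g * fall_time

v_b = efflux_speed(8.0)     # jet at the bottom of the tank
v_t = free_fall_speed(8.0)  # drop from the upper pipe
print(f"v_b = {v_b:.1f} m/s, v_t = {v_t:.1f} m/s")
```

Both values round to the 13 m/s quoted in the example when expressed to two significant figures.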


Suppose,  now,  that  a fluid  flows  through  a pipe  which  is  level,  so  that 
Ay  = 0.  However,  let  there  be  a variation  in  cross-sectional  area,  so  that 
$\Delta v^2 \neq 0$. Equation (16-47), applied to the differences in the quantities $p$
and $v^2$ between two parts of the pipe of different cross section, then takes
the  form 


$$\Delta p = -\tfrac{1}{2}\rho\, \Delta v^2 \tag{16-50}$$

This  equation  tells  us  that  an  increase  in  fluid  flow  speed  corresponds  to  a 
decrease  in  pressure.  This  is  the  Venturi  effect. 

The  fact  that  the  Venturi  effect  conflicts  with  “commonsense”  notions  is  the 
basis  of  many  parlor  tricks.  In  one  of  these,  a small  sheet  of  paper  may  be  made  to 
adhere  to  the  end  of  a spool  by  blowing  through  the  opposite  end,  in  spite  of  the 
fact  that  “intuition”  tells  you  that  you  should  be  blowing  the  paper  away.  In  a 
variation  beloved  by  vacuum  cleaner  salespeople,  the  blast  of  air  from  the  end  of 
the  vacuum  cleaner  hose,  directed  upward,  can  be  made  to  keep  a large  rubber  ball 
apparently  suspended  in  midair.  Friction  with  the  surrounding  air  slows  the  outer 
part  of  the  airstream  more  than  the  inner  part.  If  the  ball  moves  sideways  out  of  the 
center  of  the  airstream,  the  streamlines  divide  unequally  around  the  ball,  with  the 
faster-moving  air  on  the  inner  side.  This  is  illustrated  in  Fig.  16-23.  The  higher 




Fig.  16-24  Schematic  drawing  of  a Venturi 
meter.  The  operation  of  the  device  is  explained 
in  Example  16-10. 


pressure  of  the  relatively  slow-moving  air  then  pushes  the  ball  back  toward  the 
center  of  the  airstream. 

A quantitative  illustration  of  the  Venturi  effect  is  the  Venturi  meter  of 
Fig.  16-24,  which  can  be  used  variously  to  measure  fluid  fluxes  or  speeds. 
Its  use  is  illustrated  in  Example  16-10. 


EXAMPLE  16-10 


Air  flows  through  the  horizontal  main  tube  of  the  Venturi  meter  of  Fig.  16-24  from 
left  to  right.  If  the  U-tube  of  the  meter  contains  mercury,  find  the  mercury-level 
difference  h between  the  two  arms.  Let  the  radii  of  the  wide  and  narrow  parts  of  the 
main tube be $r_1 = 1.0$ cm and $r_2 = 0.50$ cm, respectively, and let the speed of the air
entering the meter be $v_1 = 15.0$ m/s. The density of air is $\rho_{\text{air}} = 1.3$ kg/m$^3$, and that
of mercury is $\rho_{\text{merc}} = 13.6 \times 10^3$ kg/m$^3$.

■ Since the air moves horizontally, you have $\Delta y = 0$, and Eq. (16-47) becomes

$$\Delta p = -\tfrac{1}{2}\rho_{\text{air}}\, \Delta v^2 \tag{16-51}$$


In order to find $\Delta v^2$, you begin with Eq. (16-41), $v_2/v_1 = a_1/a_2$, where $a_1$ and $a_2$ are
the cross-sectional areas of the two parts of the main tube. This gives you

$$v_2 = v_1 \frac{a_1}{a_2} = v_1 \frac{r_1^2}{r_2^2}$$

Thus  you  have 


$$\Delta v^2 = v_2^2 - v_1^2 = v_1^2 \left[\left(\frac{r_1}{r_2}\right)^4 - 1\right]$$


Inserting  this  value  into  Eq.  (16-51)  leads  to  the  relation 


$$\Delta p = -\tfrac{1}{2}\rho_{\text{air}} v_1^2 \left[\left(\frac{r_1}{r_2}\right)^4 - 1\right] \tag{16-52}$$


This  difference  in  pressure  between  the  two  ends  of  the  mercury-containing  U-tube 
produces  a mercury-level  difference  between  the  two  arms.  Specifically,  the  extra 
hydrostatic  pressure  produced  by  the  extra  column  of  mercury  in  the  right  arm  just 
compensates  for  the  higher  pressure  of  the  air  passing  above  the  left  arm.  You  can 
apply  Eq.  (16-17),  which  in  the  present  notation  can  be  written 

$$|\Delta p| = \rho_{\text{merc}}\, g h$$

Taking  the  absolute  value  of  both  sides  of  Eq.  (16-52)  and  combining  the  result  with 
this  equation,  you  have 


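$$\tfrac{1}{2}\rho_{\text{air}} v_1^2 \left[\left(\frac{r_1}{r_2}\right)^4 - 1\right] = \rho_{\text{merc}}\, g h$$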




Solving  for  the  mercury-level  difference  h,  you  obtain 


$$h = \frac{\rho_{\text{air}} v_1^2}{2\rho_{\text{merc}}\, g} \left[\left(\frac{r_1}{r_2}\right)^4 - 1\right] \tag{16-53}$$


Inserting  the  numerical  values  given,  you  find  the  result 

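$$h = \frac{1.3\ \text{kg/m}^3 \times (15.0\ \text{m/s})^2}{2 \times 13.6 \times 10^3\ \text{kg/m}^3 \times 9.80\ \text{m/s}^2} \left[\left(\frac{1.0\ \text{cm}}{0.50\ \text{cm}}\right)^4 - 1\right] = 0.016\ \text{m} = 1.6\ \text{cm}$$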
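
The arithmetic of Example 16-10 can be reproduced with the short Python sketch below, which evaluates Eq. (16-53) and also inverts it to recover the inlet speed from a measured level difference, as one would when using the meter. The function names are hypothetical; the numerical inputs are those given in the example.

```python
import math

def mercury_level_difference(v1, r1, r2, rho_fluid, rho_gauge, g=9.80):
    """Eq. (16-53): level difference h in the U-tube of a Venturi meter."""
    return rho_fluid * v1**2 / (2.0 * rho_gauge * g) * ((r1 / r2)**4 - 1.0)

def inlet_speed(h, r1, r2, rho_fluid, rho_gauge, g=9.80):
    """Eq. (16-53) solved for v1, as when the meter is used to measure speed."""
    return math.sqrt(2.0 * rho_gauge * g * h / (rho_fluid * ((r1 / r2)**4 - 1.0)))

h = mercury_level_difference(15.0, 1.0e-2, 0.50e-2, 1.3, 13.6e3)
print(f"h = {100 * h:.1f} cm")
print(f"v1 recovered from h: {inlet_speed(h, 1.0e-2, 0.50e-2, 1.3, 13.6e3):.1f} m/s")
```

Note that only the ratio of the radii enters, so the radii may be supplied in any common unit.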


Fig.  16-25  Schematic  drawing  of  an 
airspeed  indicator.  This  is  a modifica- 
tion of  the  Venturi  meter  which  meas- 
ures the  difference  in  pressure  between 
the  freely  flowing  air  at  B and  the  stag- 
nant air  at  A. 


The  reason  for  the  paradoxical  Venturi  effect — the  lowering  of  the  in- 
ternal pressure  of  a fluid  as  its  speed  is  increased — is  that  the  pressure  of  a 
fluid  is  related  to  its  potential  energy.  Note  that  every  term  in  Eq.  (16-47), 
$\Delta p + \tfrac{1}{2}\rho\, \Delta v^2 + \rho g\, \Delta y = 0$, has the dimensions of energy per unit volume.
The second term, $\tfrac{1}{2}\rho\, \Delta v^2$, is the kinetic energy change per unit volume of
fluid as it passes from the initial to the final location (for example, from M
to N in Fig. 16-20). The third term, $\rho g\, \Delta y$, is the gravitational potential energy
change per unit volume of fluid as it passes from the initial to the final
location. The first term in the equation, the quantity $\Delta p$, is the potential energy
change per unit volume of fluid due to the change in pressure as the
fluid  passes  from  the  initial  to  the  final  location.  That  is,  potential  energy  is 
stored  in  a fluid  when  the  fluid  is  subjected  to  pressure. 

Consider,  for  instance,  the  mechanical  energy  of  a small  volume  ele- 
ment of  water  as  it  passes  through  the  system  discussed  in  Example  16-9 
and  illustrated  in  Fig.  16-22.  When  the  water  is  at  the  top  of  the  tank,  it  pos- 
sesses gravitational  potential  energy  relative  to  a reference  location  taken  at 
the  bottom  of  the  tank.  As  the  water  descends  through  the  tank,  its  gravita- 
tional potential  energy  decreases.  But  the  total  mechanical  energy  of  the 
volume  element  is  conserved  because  the  potential  energy  associated  with 
the  pressure  increases  by  an  equal  amount.  When  the  water  passes  through 
the jet, that potential energy is converted into an equal amount of kinetic
energy. 

But  now  consider  what  happens  in  a level  pipe,  like  that  in  Fig.  16-24, 
when  an  increase  in  flow  speed  results  in  an  increase  in  the  kinetic  energy 
per  unit  volume  of  the  fluid.  Since  the  pipe  is  level,  this  increase  in  kinetic 
energy  can  arise  only  from  a decrease  in  the  potential  energy  associated 
with  pressure.  This  is  what  we  have  called  the  Venturi  effect. 
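
The energy bookkeeping just described for the tank of Example 16-9 can be tabulated numerically. The sketch below evaluates the three energy-per-unit-volume terms at the surface, at the bottom inside the tank, and in the jet. The function name and the value used for atmospheric pressure are assumptions made for illustration, and the water inside the tank is treated as being at rest, as in the example.

```python
def energy_per_volume(p, v, y, rho, g=9.8):
    """Pressure, kinetic, and gravitational terms (each in J/m^3) and their sum."""
    kinetic = 0.5 * rho * v**2
    gravitational = rho * g * y
    return p, kinetic, gravitational, p + kinetic + gravitational

rho = 1.0e3       # kg/m^3, water
g = 9.8           # m/s^2
depth = 8.0       # m, from Example 16-9
p_atm = 1.01e5    # Pa (assumed standard atmosphere)
v_jet = (2.0 * g * depth) ** 0.5   # Eq. (16-49)

stages = {
    "surface of tank": energy_per_volume(p_atm, 0.0, depth, rho),
    "bottom, inside tank": energy_per_volume(p_atm + rho * g * depth, 0.0, 0.0, rho),
    "jet": energy_per_volume(p_atm, v_jet, 0.0, rho),
}
for name, (p, k, u, total) in stages.items():
    print(f"{name:20s} p={p:9.0f}  K={k:8.0f}  U={u:8.0f}  total={total:9.0f}")
```

The total is the same at all three locations; only its division among the pressure, kinetic, and gravitational terms changes.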

An important variant on the Venturi meter is the airspeed indicator illustrated
in Fig. 16-25. Air moving with free-stream speed $v_0$ passes the two
openings A and B. At B, the air flows by essentially unimpeded, and the pressure
$p_B$ is essentially the hydrostatic pressure $p_0$ which would be measured by
a barometer, that is, $p_B = p_0$. But in the steady state, when the mercury in the
U-tube is in equilibrium as shown, the speed of the air at A must be $v_A = 0$,
since the tube presents an obstacle to the passage of air. The streamlines
representing the path of the oncoming air split as shown. But the air precisely
at A, called the stagnation point, is at rest. Since it was previously
moving along the streamline at the free-stream speed $v_0$, there is a loss of
kinetic energy. This must be compensated by an increase in potential energy
and hence an increase of pressure to a value $p_A$ which is greater than
$p_0$. The pressure $p_A$ is called the stagnation pressure.






As  in  Example  16-10,  the  pressure  difference  between  the  two  arms  of 
the  U-tube  is  given  by  Bernoulli’s  equation  in  the  special  form  of  Eq. 
(16-51).  This  can  be  written 

$$\Delta p = p_A - p_0 = -\tfrac{1}{2}\rho_{\text{air}}\, \Delta v^2$$

And since $\Delta v^2 = v_A^2 - v_0^2 = 0 - v_0^2$, we have $\Delta v^2 = -v_0^2$. Thus the difference
between the stagnation pressure and the hydrostatic pressure is

$$p_A - p_0 = \tfrac{1}{2}\rho_{\text{air}} v_0^2$$

We have already noted that $p_B = p_0$. Consequently, the difference in pressure
between points A and B is

$$p_A - p_B = \tfrac{1}{2}\rho_{\text{air}} v_0^2 \tag{16-54}$$

That is, the pressure difference is proportional to the square of the airspeed
$v_0$, the free-stream speed of the air flowing past the airspeed indicator.
In Fig. 16-25 the pressure difference is shown schematically as being
measured by means of a mercury-filled U-tube. However, it can be measured
by any suitable gauge, which can be calibrated directly in units of
speed. In actual practice, it is necessary to compensate for the fact that the
density of air $\rho_{\text{air}}$ is itself a function of altitude.
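
Calibrating such a gauge in units of speed amounts to solving Eq. (16-54) for $v_0$. The Python sketch below does this for an assumed U-tube reading of 1.0 cm of mercury and for two air densities, to show why the density correction matters. The function name, the reading, and the lower density value of 0.9 kg/m$^3$ are illustrative assumptions.

```python
import math

def airspeed(delta_p, rho_air):
    """Eq. (16-54) solved for v0, where delta_p = p_A - p_B in pascals."""
    if delta_p < 0:
        raise ValueError("stagnation pressure cannot be below the free-stream pressure")
    return math.sqrt(2.0 * delta_p / rho_air)

# Assumed reading: a 1.0-cm mercury-level difference in the U-tube.
rho_merc, g = 13.6e3, 9.80
delta_p = rho_merc * g * 0.010          # hydrostatic relation, Eq. (16-17)
for rho_air in (1.3, 0.9):              # 0.9 kg/m^3 is an assumed thinner-air value
    print(f"rho_air = {rho_air} kg/m^3 -> v0 = {airspeed(delta_p, rho_air):.1f} m/s")
```

The same pressure reading corresponds to a higher airspeed in thinner air, so a gauge calibrated for one density reads low at a smaller density.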

In  this  section  we  have  considered  mainly  the  flow  of  ideal  fluids,  whose  den- 
sity may  be  considered  to  be  constant.  In  particular,  Bernoulli’s  theorem  depends 
on  this  condition.  Why?  For  systems  containing  liquids  only,  the  condition  is  well 
met,  since  the  compressibility  of  liquids  is  negligible  under  commonly  encoun- 
tered conditions.  Even  in  the  case  of  gas  flow,  Bernoulli’s  theorem  is  often  not  a 
bad  approximation.  In  the  Venturi  meter  discussed  in  Example  16-10,  for  instance, 
the pressure difference was sufficient to support a mercury column of height $h = 1.6$ cm.
If the pressure of the incoming air was 1 atm (sufficient to support a mercury
column of height 76 cm), the fractional change in pressure was 1.6 cm/76 cm =
2 percent. The magnitude of the fractional change in density must be the same.
Why?  Hence  Bernoulli’s  theorem  produces  a result  whose  accuracy  is  acceptable 
for  many  purposes. 
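
The 2 percent figure can be reproduced in pascals as well as in centimeters of mercury. The sketch below is a plain arithmetic check; the function name is hypothetical, and the mercury density and value of g are those used in Example 16-10.

```python
rho_merc = 13.6e3   # kg/m^3
g = 9.80            # m/s^2

def mercury_column_pressure(height_m):
    """Pressure supported by a mercury column of the given height, Eq. (16-17)."""
    return rho_merc * g * height_m

dp = mercury_column_pressure(0.016)   # Venturi pressure difference, h = 1.6 cm
p_in = mercury_column_pressure(0.76)  # 1 atm expressed as a 76-cm column
fraction = dp / p_in
print(f"Δp = {dp:.0f} Pa, p = {p_in:.3g} Pa, fractional change = {100 * fraction:.1f} percent")
```

Since both pressures are proportional to the column height, the ratio reduces to 1.6 cm/76 cm regardless of the values used for the density of mercury and g.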

However,  in  cases  where  the  change  in  fluid  flow  speed  is  quite  large,  the 
variation  in  the  density  of  the  fluid  must  be  taken  into  consideration,  and  Ber- 
noulli’s theorem  no  longer  applies.  Even  a liquid  may  experience  a pressure  drop 
so  great  that  it  begins  to  boil  spontaneously.  This  phenomenon  is  called  cavita- 
tion. Cavitation  is  a problem  in  marine  propellers,  where  it  can  cause  serious  effi- 
ciency losses  and  even  damage  to  the  propeller.  It  must  be  minimized  by  careful 
design. 


EXERCISES 

Group  A 

16-1. Bending. A heavy weight hangs from one end of
a plank  whose  other  end  is  embedded  in  a wall.  As  a re- 
sult, the  plank  bends  somewhat. 

a.  Which  part  of  the  plank  is  under  tension?  under 
compression? 

b.  Is  there  any  part  of  the  plank  that  is  neither 
stretched  nor  compressed? 

16-2.  Measuring  Young’s  modulus.  An  iron  wire  1.0  m 
long  and  1.0  mm  in  diameter  is  attached  to  a hook.  When 


a 1.0-kg mass is hung from the other end, the wire’s length
increases  by  0.059  mm.  What  is  the  value  of  Young’s  mod- 
ulus for  the  wire? 

16-3.  Two  in  line.  A 1.0-m  length  of  aluminum  wire  is 
attached  to  a hook.  A 1.0-m  length  of  brass  wire  is  welded 
to  the  free  end  of  the  aluminum  wire,  and  a 10-kg  mass  is 
attached  to  the  free  end  of  the  brass  wire.  If  both  wires 
have  diameters  of  1.0  mm,  what  will  be  the  total  increase 
in  length? 


Exercises  735 


16-4.  The  Magdeburg  hemispheres.  Practical  vacuum 
pumps,  capable  of  exhausting  most  of  the  air  from  a 
closed  container,  were  first  developed  in  the  seventeenth 
century.  Exploiting  one  such  pump  which  he  had  devel- 
oped, Otto  von  Guericke  invented  the  so-called  Magde- 
burg hemispheres  as  a dramatic  demonstration  of  the 
existence  of  air  pressure.  Two  hemispherical  metal  shells 
of  equal  radius  R are  fitted  together  rim  to  rim  so  that 
they  form  an  airtight  sphere.  An  exhaust  tube  mounted 
on  one  of  them  is  connected  to  the  vacuum  pump.  When 
substantially  all  of  the  air  has  been  pumped  out  of  the 
spherical  container,  a valve  on  the  tube  is  closed.  A heavy 
ring  on  each  of  the  hemispheres  is  then  hitched  to  a team 
of  strong  horses.  If  the  value  of  R is  large  enough,  and  if 
the  system  is  leakproof,  the  horses  cannot  pull  the  hemi- 
spheres apart. 

A student  wishes  to  calculate  the  force  with  which  the 
atmosphere  presses  one  of  the  hemispheres  against  the 
other. He multiplies the atmospheric pressure, $p_{\text{atm}}$, by the
surface area of the hemisphere, $2\pi R^2$. Why is this incorrect?
What is the correct result?

16-5.  Height  of  a barometer  column.  Standard  atmo- 
spheric pressure  can  support  a column  of  mercury  760 
mm  high. 

a. Show that this is equivalent to $1.01 \times 10^5$ Pa.

b.  How  high  a column  of  water  can  the  standard 
atmosphere support? The density of mercury is $13.6 \times 10^3$ kg/m$^3$,
and the density of water is $1.00 \times 10^3$ kg/m$^3$.

16-6.  Sea  of  air.  The  density  of  the  earth’s  atmo- 
sphere actually  diminishes  gradually  with  height  above 
the  earth’s  surface.  But  imagine  instead  that  it  is  uniform 
in  density  and  has  a well-defined  top,  as  a lake  does.  If  the 
density  of  the  imaginary  atmosphere  were  equal  to  the 
density  of  the  actual  atmosphere  at  sea  level,  1.29  kg/m3, 
and  its  sea-level  pressure  were  the  same  as  that  of  the 
actual  atmosphere,  what  would  be  its  height?  Compare 
this  value  with  the  heights  of  some  actual  mountains. 

16-7.  Pneumatic  lift.  The  piston  of  an  automobile  lift 
of  the  kind  used  in  service  stations  is  25.0  cm  in  diameter. 
What  gauge  pressure  (excess  pressure  over  atmospheric) 
does  it  require  to  lift  a 1500-kg  car  by  having  compressed 
air  push  against  the  piston?  (The  gauge  pressure  is  the 
pressure  measured  by  a tire  gauge.) 

16-8.  Expansion  of  a rising  bubble.  A gas  bubble  is  rising 
through  a considerable  depth  of  water.  As  the  bubble 
rises, the gas pressure $p_g$ in the bubble continually adjusts
itself to equal the water pressure outside the bubble.

a. If the bubble is formed at an initial depth $d_i$ under
the surface with an initial volume $V_i$, find its volume $V$ as a
function of its depth $d$. The pressure at the surface is that
of the atmosphere, $p_{\text{atm}}$. Neglect any change in the mass or
temperature  of  the  gas  in  the  bubble;  that  is,  assume 
Boyle’s  law  is  applicable. 

b. A bubble originates at a depth $d = 100$ m. Let $g = 9.8$ m/s$^2$.
Assume a uniform water density of $1.0 \times 10^3$
kg/m$^3$ and a pressure of 1.0 atm at the surface. What is
the  volume  of  the  bubble  as  it  breaks  the  surface? 

16-9. Mississippi mud. A typical riverborne silt particle
has a radius of 20 μm = $2.0 \times 10^{-5}$ m and a density
of $2 \times 10^3$ kg/m$^3$.

a.  Find  the  terminal  speed  with  which  such  a particle 
will  settle  to  the  bottom  of  a motionless  volume  of  water. 
(Unless  the  speed  of  internal  fluid  motions  is  smaller  than 
this  settling  speed,  the  silt  particles  will  not  settle  to  the 
bottom.  Hint:  See  Example  4-12  and  allow  for  buoyancy.) 

b.  Suppose  that  you  filled  a one-liter  soda-pop  bottle 
with  water  from  a muddy  river,  such  as  the  lower  Missis- 
sippi. After  all  internal  motions  of  the  water  itself  had 
stopped,  about  how  long  would  it  take  for  all  the  silt  to 
settle  to  the  bottom?  (Hint:  This  time  is  accurately  given 
by  the  ratio  of  water  depth  to  terminal  speed,  because 
each  silt  particle  reaches  its  terminal  speed  in  a time  very 
short  compared  to  the  total  settling  time.) 

c.  In  the  lower  part  of  its  course,  the  Mississippi 
River  is  typically  6 m deep  and  flows  at  about  1.5  m/s. 
Suppose  that  the  river  is  thoroughly  laden  with  silt  as  it 
passes  Natchez  (which  is  the  case)  and  that  there  is  no  ad- 
ditional mixing  due  to  internal  fluid  motions  once  the 
river  passes  Natchez  (which  is  not  actually  the  case).  How 
far  downstream  from  Natchez  would  the  river  water  first 
become  clear  to  the  bottom? 

16-10.  A child’s  garden  of  Reynolds  numbers. 

a.  For  each  of  the  following  motions,  a typical  speed  v 
and  characteristic  dimension  d are  given.  Compute  the 
corresponding  Reynolds  number.  For  motions  in  air,  use 
$\rho = 1.2$ kg/m$^3$ and $\eta = 1.8 \times 10^{-5}$ Pa·s for the density
and viscosity, respectively. For motions in water, use the
values $\rho = 1.0 \times 10^3$ kg/m$^3$ and $\eta = 1.0 \times 10^{-3}$ Pa·s.

(i) a peregrine falcon in a hunting dive ($v = 70$ m/s;
$d = 0.15$ m)

(ii) a minnow swimming in a quiet stream (1.0 m/s;
0.030 m)

(iii) a paramecium moving about in a pond
($1.0 \times 10^{-3}$ m/s; $2.0 \times 10^{-4}$ m)

(iv) a pitched baseball (30 m/s; $9.0 \times 10^{-2}$ m)

(v) a rifle bullet fired underwater ($6.0 \times 10^2$ m/s;
$2.0 \times 10^{-2}$ m)

(vi) airborne dust particles settling at terminal speed
on a calm day ($2.0 \times 10^{-4}$ m/s; $1.0 \times 10^{-6}$ m)

(vii)  a cruising  dirigible  (10  m/s;  50  m) 

b.  Construct  a logarithmic  scale  for  Reynolds 
numbers, ranging from $R = 10^{-5}$ to $R = 10^{10}$. Mark and
label  the  appropriate  location  on  your  scale  for  each  of  the 
motions  listed  in  part  a.  With  the  help  of  Table  16-3,  indi- 
cate which  motions  should  involve  laminar  flow,  which 
motions  should  involve  turbulent  flow  described  by  a 
quadratic  drag  law,  and  which  motions  are  probably  too 
“rapid”  to  obey  a quadratic  drag  law. 


736  Mechanics  of  Continuous  Media 


16-11.  Delivery  capacity  of  a pipe. 

a.  Calculate  the  speed  at  which  the  flow  of  water  in  a 
long  cylindrical  pipe  of  diameter  2.0  cm  becomes  turbu- 
lent. Assume  that  the  temperature  is  20°C,  and  refer  to 
Tables  4-3  and  16-3  for  the  necessary  data. 

b.  When  water  is  flowing  through  the  pipe  at  the  crit- 
ical speed  calculated  in  part  a,  what  is  the  rate  at  which  the 
pipe delivers water to a tank at its end? Express your
answer in m$^3$/s and in liters/min.

c.  Suppose  someone  suggests  that  the  water  delivery 
rate  of  the  pipe  to  the  tank  be  increased  by  increasing  the 
pressure  produced  by  the  pump  which  drives  water 
through  the  system.  Why  would  you  advise  against  this? 

16-12. Dimensional check. Show that the Reynolds
number $R = \rho v_0 d/\eta$ is dimensionless.


16-13. Hydraulic press. The apparatus shown in Fig.
16E-13 is filled with a liquid, and the mass of the body
resting on piston A is 1.00 kg. If pistons A and B, both of
negligible mass, lie at the same level and are stationary,
then according to Bernoulli's theorem $p_A = p_B$, where p
stands for pressure. If the cross-sectional area of piston A
is 1.00 cm$^2$, then $p_A = 9.8 \times 10^4$ Pa, and this must also be
the value of $p_B$.

Fig. 16E-13

a. If the system is at rest when the body resting on
piston B has a mass of 100 kg, what must be the
cross-sectional area of piston B?

b. Show that energy conservation holds (neglecting
friction) if the imposition of a small additional downward
force at A leads to the slow descent of piston A and the
corresponding rise of piston B. This device is known as the
hydraulic press.


16-14. Under pressure. In Fig. 16E-14, the system is
filled to height h with a liquid of density $\rho$. The
atmospheric pressure is $p_{atm}$. Neglecting fluid friction, evaluate
the pressure of the fluid at each of the points labeled 1, 2, 4,
and 5. Compare the pressure at point 3 with that at point 5.

Fig. 16E-14


Group  B 

16-15. Combined stresses. A bar is subject to a tensile
stress, $\sigma = F/a$, as in Fig. 16E-15. Consider a thin planar
slab within the solid making an angle $\theta$ with the bottom of
the bar. For this plane:

a. What is the force on its upper surface perpendicular
to the plane? Parallel to the plane?

b. What is the tensile stress at this plane? The shear
stress?

c. For what value of $\theta$ is the tensile stress a maximum?
The shear stress a maximum?

16-16.  Alternative  derivation  of  Pascal's  principle.  At  any 
given  point  in  a fluid,  the  pressure  is  the  same  in  all  direc- 
tions. This  can  be  shown  to  follow  from  the  fact  that  the 
force  acting  on  any  surface  in  a fluid  at  rest  must  be  at 
right  angles  to  the  surface  so  that  there  may  be  no  shear 
stress. 

Fig. 16E-16

Figure 16E-16 represents an infinitesimal triangular
prism of fluid anywhere in a fluid in equilibrium. Hence
the forces have been drawn at right angles to the surface
on which each acts. Two forces that would be labeled $F_x$
have been neglected. Why?

a. Show that $F_s \sin\theta = F_y$ and $F_s \cos\theta = F_z$.

b. Using your answer to part a, show that $p_s = p_y$ and
$p_s = p_z$ so that $p_s = p_y = p_z$, where $p_y$, $p_z$, and $p_s$ are the
pressures on the faces perpendicular to the y axis and the
z axis, and the pressure on the slanting face, respectively.

16-17.  Force  and  torque  on  a dam.  The  length  of  a dam 
is  L and  its  height  is  H.  Its  vertical  cross-sectional  area  is 
thus  LH  = A. 

a. Show that the total force exerted by the water
against the dam equals $\rho g A H/2$, where $\rho$ is the density of
the water.

b. Show that the torque about the bottom edge of the
dam due to the water is equal to $\rho g A H^2/6$.
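
Both results can be checked numerically by slicing the wetted face of the dam into thin horizontal strips and summing the force and torque on each. The sketch below assumes the water reaches the top of the dam and that only the gauge pressure $\rho g y$ at depth y matters (atmospheric pressure acts on both sides). The function name, the value $g = 9.8$ m/s$^2$, and the sample dimensions used to exercise it are illustrative assumptions, not data from the exercise.

```python
def dam_force_and_torque(L, H, rho=1.0e3, g=9.8, n=10000):
    """Sum the water force and the torque about the bottom edge over n strips."""
    dy = H / n
    force = 0.0
    torque = 0.0
    for i in range(n):
        y = (i + 0.5) * dy           # depth of the strip's midline below the surface
        dF = rho * g * y * L * dy    # gauge pressure times strip area
        force += dF
        torque += dF * (H - y)       # lever arm measured up from the bottom edge
    return force, torque

# Hypothetical dam: 100 m long, 20 m high.
L, H = 100.0, 20.0
A = L * H
F, torque = dam_force_and_torque(L, H)
print(F, 1.0e3 * 9.8 * A * H / 2)          # compare with rho g A H / 2
print(torque, 1.0e3 * 9.8 * A * H**2 / 6)  # compare with rho g A H^2 / 6
```

The ratio of torque to force is H/3, so the resultant force acts one third of the way up from the bottom.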


Exercises  737 


16-18. Relative densities. The U-shaped tube in Fig.
16E-18 contains water and carbon tetrachloride. The
height AB is 2.5 cm; AC equals 4.0 cm. What is the density
ratio $\rho_{\mathrm{CCl_4}}/\rho_{\mathrm{H_2O}}$?

Fig. 16E-18 (U-tube; one arm labeled carbon tetrachloride)


16-19.  Liquid  level.  A liquid  cannot  withstand  a 
shear  stress.  How  does  this  imply  that  the  surface  of  a 
liquid at rest must be level, that is, normal to the gravitational
force?


16-20.  Good  thing  train  soup  is  never  hot!  A bowl  of 
soup  rests  on  a table  in  the  dining  car  of  a train.  If  the 
acceleration  of  the  train  is  g/4  in  the  forward  direction, 
what  angle  does  the  surface  of  the  soup  make  with  the 
horizontal?  (Hint:  See  Exercise  16-19.) 
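
A one-line numerical check is possible if one assumes, following the hint, that in the frame of the train the soup surface sets itself perpendicular to the effective gravitational acceleration, whose horizontal and vertical components are a and g. The tilt then satisfies $\tan\theta = a/g$. The function name below is illustrative.

```python
import math

def surface_tilt_deg(a_over_g):
    """Tilt of a liquid surface from the horizontal, assuming tan(theta) = a/g."""
    return math.degrees(math.atan(a_over_g))

print(surface_tilt_deg(0.25))   # train accelerating at g/4
```

Note that the result depends only on the ratio a/g, not on the density of the soup or the shape of the bowl.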


16-21.  Up,  down,  sideways.  A long  capillary  (a  thin 
tube)  with  uniform  internal  diameter  and  closed  at  one 
end  is  positioned  horizontally.  See  Fig.  16E-21.  In  that 
position,  the  air  trapped  in  the  closed  end  by  mercury  in 
the  capillary  occupies  30  cm  and  the  mercury  50  cm,  as 
indicated.  What  will  be  the  length  of  the  column  of  trapped 
air  when  the  tube  is  turned  so  that  it  is  vertical  with  the 
open  end  up?  with  the  open  end  down?  Assume  that  the 
temperature  of  the  system  remains  unchanged  and  that 
the  atmospheric  pressure  at  the  time  of  the  experiment 
is  1.0  atm. 


Fig. 16E-21 (trapped air, 30 cm; mercury, 50 cm)


16-22. Journey to the center of the earth. In a simplified
model, the innermost part of the earth is described as a
core of molten iron, with a radius of 3500 km. The density
of the core ranges from $9 \times 10^3$ kg/m$^3$ at its outer boundary
to $12 \times 10^3$ kg/m$^3$ at the center. (The average density
of the earth as a whole is about $5.5 \times 10^3$ kg/m$^3$.) At these
pressures, the bulk modulus of iron is $12 \times 10^{12}$ Pa. Calculate
the increase in pressure between the edge of the core
and the center of the earth.


16-23.  Deep-sea  litterbug.  As  the  research  submarine 
Alvin  descends  through  a depth  where  the  pressure  is  200 
atm,  the  pilot  sees,  suspended  freely  in  the  water,  an  open, 
water-filled plastic container that had been lost on a previous
dive. The “normal” density of the plastic at the surface
is $\rho = 0.75 \times 10^3$ kg/m$^3$. The bulk modulus of
seawater is $2.2 \times 10^9$ Pa.

a.  What  is  the  bulk  modulus  of  the  plastic? 

b.  What  would  happen  to  the  container  if  Alvin 
nudged  it  upward  or  downward?  (Is  the  equilibrium 
stable?) 

c.  Would  you  expect  to  see  large  amounts  of  debris 
suspended  at  various  depths  throughout  the  ocean?  Why 
or  why  not? 


16-24. Relating the elastic moduli. Derive Eq. (16-28),
$1/Y = 1/(3G) + 1/(9B)$, from Eqs. (16-26) and (16-27).

16-25. Steel spheres, I. A number of tiny spheres made
of steel with density $\rho_s$, and having various radii $r_s$, are
released from rest just under the surface of a tank of water,
whose density is $\rho$. They fall under the combined action of
the net gravitational force (the weight minus the buoyant
force) and viscous drag.

a. Show that the net gravitational force acting on a
sphere has magnitude $(4\pi/3)\, r_s^3 (\rho_s - \rho) g$.

b. Assuming that the fluid flow around each descending
sphere is laminar, find the terminal speed v of a
sphere in terms of $r_s$, $\rho_s$, $\rho$, and the viscosity $\eta$ of the water.

c. Find the Reynolds number corresponding to the
speed v found in part b. Use the sphere diameter $2r_s$ as
the “characteristic length.” For what range of radii $r_s$ is it
correct to assume strictly laminar flow? (See Table 16-3.)

d. Obtain numerical results for the quantities found
in parts b and c, given that $\rho_s = 7.9 \times 10^3$ kg/m$^3$.


16-26.  Reynolds  numbers  and  the  rotating  viscometer. 

a.  Evaluate  the  Reynolds  number  of  the  flow  in  the 
rotating-cylinder  viscometer  of  Example  16-7.  Assume 
that the density of castor oil is $0.96 \times 10^3$ kg/m$^3$. Compare
your result with the “critical” Reynolds numbers given in
Table  16-3. 

b.  Do  you  think  that  the  assumption  of  laminar  flow 
in  Example  16-7  is  a valid  one? 

c.  How  could  you  use  the  viscometer  to  check  the  as- 
sumption of  laminar  flow? 

d.  Suppose  you  wished  to  use  the  same  viscometer  to 
check  the  viscosity  of  a sample  of  water.  Could  you  run  the 
turntable  at  the  same  speed?  If  not,  approximately  what 
speed  should  you  use? 


16-27.  Down  the  drain.  A rectangular  tank  with  cross- 
sectional  area  A is  filled  with  a liquid  to  a depth  h.  There 
is  an  opening  of  area  a in  the  bottom  of  the  tank.  Show 
that the time required for the tank to empty is
$T = (A/a)\sqrt{2h/g}$.
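
The closed-form result can be compared with a direct step-by-step integration. The sketch below assumes Torricelli's theorem for the exit speed, $v = \sqrt{2gh}$, together with continuity, $A\,dh/dt = -a v$, which is reasonable when a is much smaller than A. The function names, the time step, and the sample tank dimensions are illustrative choices, not values from the exercise.

```python
import math

def drain_time_numeric(A, a, h0, g=9.8, dt=0.01):
    """Step the liquid level down until the tank is empty; return elapsed time."""
    h, t = h0, 0.0
    while h > 0.0:
        v_out = math.sqrt(2.0 * g * h)   # Torricelli's theorem at the opening
        h -= (a / A) * v_out * dt        # continuity: A dh/dt = -a v_out
        t += dt
    return t

def drain_time_formula(A, a, h0, g=9.8):
    """Closed-form emptying time T = (A/a) sqrt(2 h / g)."""
    return (A / a) * math.sqrt(2.0 * h0 / g)

# Hypothetical tank: 1.0 m^2 cross section, 10 cm^2 opening, 1.0 m deep.
print(drain_time_numeric(1.0, 1.0e-3, 1.0))
print(drain_time_formula(1.0, 1.0e-3, 1.0))
```

The two numbers agree to within the accuracy of the simple stepping scheme.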




Fig.  16E-30 


16-28.  Vena  contracta.  In  Fig.  16E-28,  the  opening 
through  which  water  leaves  the  vessel  has  a sharp  edge. 
When  this  is  the  case,  the  cross-sectional  area  of  the 
stream  to  the  right  of  the  opening  is  smaller  than  the 
opening.  The  water  has  not  completed  its  acceleration  at 
the  opening,  but  continues  to  speed  up  for  a short  dis- 
tance past  it.  This  causes  the  stream  to  narrow  and  the 
narrowed  region  is  called  the  vena  contracta.  For  sharp 
openings,  the  observed  minimal  cross-sectional  area  for 
which  Torricelli’s  theorem  holds  is  0.62  times  the  actual 
opening  area.  Suppose  the  vessel  is  filled  to  a height  h. 
What  is  the  flux  through  the  opening? 

Fig. 16E-28 (contracted stream area labeled 0.62 A)


16-29.  A Venturi  flowmeter.  Figure  16E-29  illustrates 
a Venturi  meter,  which  can  measure  the  rate  at  which 
water  is  being  delivered.  The  difference  in  pressure 
between  the  locations  where  the  areas  of  the  cross  sections 
are  A and  a is  indicated  by  the  difference  d in  the  water 
levels  in  tubes  connected  to  the  pipe  at  these  places.  Using 
Bernoulli’s theorem, show that the volume of water delivered
per unit time, dV/dt, is given by
$dV/dt = A a \sqrt{2gd/(A^2 - a^2)}$.



Fig.  16E-29 
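
A quick consistency check on the Venturi formula is to evaluate it and confirm that the resulting speeds at the two cross sections satisfy Bernoulli's theorem with the measured head difference, $v_a^2 - v_A^2 = 2gd$. The sketch below does this. The function name, the value of g, and the sample areas and level difference are illustrative assumptions.

```python
import math

def venturi_flow_rate(A, a, d, g=9.8):
    """Volume flow rate dV/dt = A a sqrt(2 g d / (A^2 - a^2)), in m^3/s."""
    return A * a * math.sqrt(2.0 * g * d / (A**2 - a**2))

# Hypothetical meter: A = 20 cm^2, a = 5 cm^2, level difference d = 10 cm.
A, a, d, g = 20.0e-4, 5.0e-4, 0.10, 9.8
Q = venturi_flow_rate(A, a, d, g)
v_wide, v_narrow = Q / A, Q / a       # continuity: Q = A v_wide = a v_narrow
print(Q)
print(v_narrow**2 - v_wide**2, 2.0 * g * d)   # the two should agree
```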

16-30.  A siphon.  A tube  of  uniform  cross  section  is 
used  to  siphon  water  from  a vessel,  as  in  Fig.  16E-30.  The 
atmospheric pressure is $p_{atm} = 1.0 \times 10^5$ Pa.

a. Derive an expression for the speed with which the
water leaves the tube at B.

b. If $h_2 = 3.0$ m, what is the speed with which water
flows out at B?

c. For this value of $h_2$, what is the greatest value of
$h_1$ for which the siphon will work?


16-31.  Falling  water.  Water  leaves  a faucet  with  a 
downward  velocity  of  3.0  m/s.  As  the  water  falls  below 
the faucet, it accelerates with acceleration g. The cross-sectional
area of the water stream leaving the faucet is 1.0 cm$^2$.
What is the cross-sectional area of the stream 0.50 m
below  the  faucet? 
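
The two ingredients here are free fall, $v^2 = v_1^2 + 2gy$, and the equation of continuity, $A_1 v_1 = A v$. A minimal sketch of the calculation follows. The function name and the value $g = 9.8$ m/s$^2$ are assumptions made for illustration, and the narrowing of the stream by surface tension is ignored.

```python
import math

def stream_area(A1, v1, y, g=9.8):
    """Cross-sectional area of a falling stream a distance y below the faucet."""
    v = math.sqrt(v1**2 + 2.0 * g * y)   # speed after falling a distance y
    return A1 * v1 / v                   # continuity: A1 v1 = A v

# Values from the exercise: 1.0 cm^2 at 3.0 m/s, evaluated 0.50 m below.
print(stream_area(1.0, 3.0, 0.50), "cm^2")
```

Because the area is returned in whatever units are used for `A1`, no conversion to square meters is needed.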

16-32.  Pumping  power.  A pump  draws  water  from  a 
reservoir  and  sends  it  through  a horizontal  hose.  Since  the 
water  starts  at  rest  and  is  set  into  motion  by  the  pump,  the 
pump must deliver power P to the water when the flow
rate is $\Phi$, even if fluid friction is negligible. A new pump is
to be ordered which will pump water through the same
system at a rate $\Phi' = 2\Phi$. What must be the power P' of
the new pump? Assume that friction is still negligible.


Group  C 

16-33. Torsion fiber. Figure 16E-33a illustrates the torsion
of a hollow cylindrical shell of length L. The upper
end is rigidly clamped. AB is a line drawn along the surface
of the cylinder parallel to its axis. Two antiparallel
forces each of magnitude F/2 twist the lower end of the
shell through an angle $\theta$, so that the point B moves to B'
and the line AB is now the line AB'. The radius of the
cylindrical shell is r and its thickness is $\Delta r$, which is very
much less than r.

Torsion is not a new type of deformation. Rather, it is
a case of pure shear, as can be seen by imagining the cylindrical
shell slit along AB' and “unrolled” open as in Fig.
16E-33b.

a. Show that the magnitude of the twisting force is related
to the angle of torsion according to the equation

$$F = \frac{2\pi r^2\, \Delta r\, G}{L}\,\theta$$

where G is the shear modulus.




Fig.  16E-33 


b. Show that the torque T produced by the two forces
is given by the expression

$$T = \frac{2\pi r^3\, \Delta r\, G}{L}\,\theta$$

c. Now consider a solid cylinder of radius R and
length L under torsion. It may be considered to be made
up of an infinite number of infinitesimally thick cylindrical
shells, each of some radius $0 < r \le R$ and of
thickness dr. Using the result of part b, write an expression
for the torque dT required to twist one such shell through
an angle $\theta$.

d. Integrate the expression found in part c, and thus
obtain an expression for the torque T' required to twist a
solid cylindrical wire through an angle $\theta$. Does this expression
conform to the rotational form of Hooke’s law, given
in Sec. 10-2?

e.  Describe  an  experiment  by  means  of  which  a tor- 
sion pendulum  of  the  type  described  in  Sec.  10-2  can  be 
used  to  determine  the  shear  modulus  G of  the  material  of 
which  a cylindrical  wire  is  made.  This  method  of  mea- 
suring the  shear  modulus  was  first  devised  by  the  French 
engineer  and  physicist  Charles  Augustin  de  Coulomb  in 
the  late  eighteenth  century.  An  extremely  important  ap- 
plication is  described  in  Chap.  20. 

16-34.  Mercury  column.  A narrow  cylindrical  tube  of 
length  L,  open  at  both  ends,  is  immersed  halfway  into  a 
cylinder of mercury. See Fig. 16E-34. The protruding end
of  the  tube  is  covered  and  the  tube  raised  until  its  lower 
end  is  just  below  the  surface  of  the  mercury  in  the  cylinder. 
The  mercury  barometer  height  giving  the  atmospheric 
pressure  is  H.  (Hint:  Refer  to  Exercise  16-5.) 

a.  What  is  the  height  x of  the  mercury  column  re- 
maining in  the  raised  tube? 

b.  Calculate  the  numerical  value  of  x if  L = 50.0  cm 
and  H = 76.0  cm. 




Fig.  16E-34 


16-35.  Spinning  bowl.  An  ordinary  kitchen  mixing 
bowl  is  partially  filled  with  water,  and  placed  on  a turn- 
table so  that  its  center  coincides  with  the  center  of  the 
turntable. The turntable is then made to rotate with angular
speed $\omega$. Show that when equilibrium is reached the
surface  of  the  water  assumes  the  shape  of  a paraboloid  of 
revolution.  (A  paraboloid  of  revolution  is  the  three- 
dimensional  surface  swept  out  when  a parabola  is  rotated 
about  its  axis  of  symmetry.  Hint:  Refer  to  the  result  of 
Exercise  16-20.) 

16-36.  The  barometric  equation  for  an  isothermal  atmo- 
sphere. Suppose  that  the  temperature  of  the  atmosphere 
were  the  same  everywhere. 

a. Show that the variation of pressure p with height h
is given by

$$p = p_0 e^{-(\rho_0 g/p_0)h}$$

where $\rho_0$ and $p_0$ are the density and pressure at the bottom
of the atmosphere.

b. If $\rho_0 = 1.223$ kg/m$^3$ (the density of air at sea level)
and $p_0 = 1.013 \times 10^5$ Pa, at what height is $p = \tfrac{1}{2}p_0$?
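
For an isothermal atmosphere the pressure falls exponentially with a characteristic "scale height" $p_0/(\rho_0 g)$, so the height at which the pressure has dropped to any chosen fraction of $p_0$ follows by taking a logarithm. The sketch below uses the sea-level values quoted in part b. The function name, the value $g = 9.8$ m/s$^2$, and the choice of one half as the sample fraction are assumptions made for illustration.

```python
import math

def height_for_pressure_fraction(fraction, p0=1.013e5, rho0=1.223, g=9.8):
    """Height (m) at which p/p0 equals `fraction` in an isothermal atmosphere."""
    scale_height = p0 / (rho0 * g)       # roughly 8.5 km for these values
    return -scale_height * math.log(fraction)

print(height_for_pressure_fraction(0.5))   # height where the pressure is halved
```

Since the real atmosphere cools with height, this figure should be read as an estimate for the model, not as a measured altitude.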

16-37.  Side  by  side.  Two  adjacent  samples  of  gas  in  a 
cylindrical  chamber  are  separated  by  an  airtight  partition, 
which  is  initially  held  clamped  in  one  place.  As  shown  in 
Figure 16E-37, sample A has an initial pressure $p_A$ and occupies
volume $V_A$, while sample B has pressure $p_B$ and occupies
volume $V_B$. The total volume accessible to the two
samples is a constant $V_T$, so that $V_B = V_T - V_A$. The
clamped partition is released, and the system adjusts itself
to an equilibrium in which the final pressures $p'_A$ and $p'_B$
are equal. The (common) final temperature of the gas
samples is the same as the (common) initial temperature.




Fig. 16E-37 (fixed total volume $V_T$)


a. Find $p'_A$, $p'_B$, $V'_A$, and $V'_B$ in terms of $p_A$, $p_B$, $V_A$, and
$V_B$.

b. Obtain numerical values for $p'_A/p_A$ and for $V'_A/V_A$
for the case $p_B = 3p_A$ and $V_B = 2V_A$.
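
Because the initial and final temperatures are the same, one reasonable approach is to apply Boyle's law to each sample separately ($pV$ unchanged) and require equal final pressures with the total volume fixed. The sketch below carries out that bookkeeping under the assumption that both gases behave ideally. The function name is illustrative, and the pressures and volumes are expressed in units of $p_A$ and $V_A$.

```python
def equilibrate(pA, VA, pB, VB):
    """Return (common final pressure, final volume of A, final volume of B)."""
    p_final = (pA * VA + pB * VB) / (VA + VB)   # from pA VA + pB VB = p' (VA + VB)
    VA_final = pA * VA / p_final                # Boyle's law for sample A
    VB_final = pB * VB / p_final                # Boyle's law for sample B
    return p_final, VA_final, VB_final

# Case of part b, in units where pA = 1 and VA = 1.
p_final, VA_final, VB_final = equilibrate(1.0, 1.0, 3.0, 2.0)
print(p_final, VA_final, VB_final)
```

The two final volumes add up to the original total, which is a useful check on the algebra.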

16-38. Steel spheres, II. Consider one of the sinking
steel spheres described in Exercise 16-25. Assume that the
sphere is small enough that the flow is laminar, even at
terminal speed. Furthermore, assume that as the sphere
accelerates from rest to its terminal speed, the viscous
drag force is given by Stokes’ law, Eq. (16-32), at each instant,
with $v_0$ being the instantaneous speed $v_s$ of the
sphere.

a. If the sphere is released from rest at time t = 0,
find its speed $v_s$ as a function of time t.

b. Find the distance $d_s$ through which the sphere
descends in time t.

c. How long is required for the sphere to reach each
of the following fractions of its terminal speed? (1)
$1 - 1/e = 0.63$; (2) 0.90; (3) 0.99.

d. What are the distances that correspond to the
times found in part c?

e. Evaluate numerically the results of parts c and d
for a sphere of radius $r_s = 50\ \mu\mathrm{m} = 5.0 \times 10^{-5}$ m. How
many times its own diameter does this sphere descend before
reaching 99 percent of its terminal speed?
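
The numerical parts of this exercise can be organized as follows. The sketch assumes that Eq. (16-32) is the usual Stokes form $F = 6\pi\eta r v$ and that only the sphere's own mass is accelerated, so that the speed approaches its terminal value exponentially with time constant $\tau = 2 r_s^2 \rho_s/(9\eta)$. The function names and the value $g = 9.8$ m/s$^2$ are illustrative. The steel density comes from Exercise 16-25 and the water properties from Exercise 16-10.

```python
import math

def stokes_fall(r, rho_s=7.9e3, rho=1.0e3, eta=1.0e-3, g=9.8):
    """Return (terminal speed in m/s, time constant in s) for a sinking sphere."""
    v_T = 2.0 * r**2 * (rho_s - rho) * g / (9.0 * eta)   # net weight = Stokes drag
    tau = 2.0 * r**2 * rho_s / (9.0 * eta)               # mass / (6 pi eta r)
    return v_T, tau

def time_to_fraction(f, tau):
    """Time at which v = f * v_T, from v = v_T (1 - exp(-t/tau))."""
    return -tau * math.log(1.0 - f)

def distance_fallen(t, v_T, tau):
    """Integral of v(t) from 0 to t."""
    return v_T * (t - tau * (1.0 - math.exp(-t / tau)))

v_T, tau = stokes_fall(5.0e-5)
for f in (1.0 - 1.0 / math.e, 0.90, 0.99):
    t = time_to_fraction(f, tau)
    print(f"{f:.2f}  t = {t:.2e} s  d = {distance_fallen(t, v_T, tau):.2e} m")
```

Dividing the last distance by the diameter $2r_s$ gives the ratio asked for in part e.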

16-39.  Tiny  bubbles.  Show  that  when  a small  gas 
bubble  is  formed  underwater,  if  buoyancy  were  the  only 
important  force,  the  bubble  would  have  an  initial  upward 
acceleration  much  greater  in  magnitude  than  g.  The 
rapid  rise  of  small  bubbles  can  be  observed  in  a glass  of 
carbonated beverage. (Hint: Refer to Exercise 16-38a,
and assume that the gas is approximately at atmospheric
pressure so that its density is $\rho_g = 1.22$ kg/m$^3$.)

16-40. Deriving the general form of Stokes’ law by dimensional
analysis. Assume that the drag force F acting on a
sphere falling through a fluid depends only on the viscosity
$\eta$ of the fluid, the radius r of the sphere, and the velocity
v of the sphere. The force must then be given by an
equation of the form $F = (\text{constant})\,\eta^x r^y v^z$, where x, y,
and z are exponents whose values you would like to determine
and the constant is dimensionless. By considering
the known dimensions of the quantity F, and the known
dimensions of the quantities $\eta$, r, and v, show that x = y =
z = 1, so that the force is given by an equation of the form
$F = (\text{constant})\,\eta r v$. Compare this result with Stokes’ law,
Eq. (16-32).
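
Matching the powers of mass, length, and time on the two sides gives three linear equations for the three exponents, and these can be solved mechanically. The sketch below sets up that system from the dimensions of each quantity and solves it by elimination with exact fractions. The helper name `solve_exponents` and the tuple layout (M, L, T) are choices made here for illustration.

```python
from fractions import Fraction

# Exponents of (M, L, T) in the dimensions of each quantity.
DIM_F = (1, 1, -2)      # force: M L T^-2
DIM_ETA = (1, -1, -1)   # viscosity (Pa*s): M L^-1 T^-1
DIM_R = (0, 1, 0)       # radius: L
DIM_V = (0, 1, -1)      # speed: L T^-1

def solve_exponents():
    """Solve eta^x r^y v^z = F dimensionally; return (x, y, z)."""
    # One augmented row per base dimension: [eta, r, v | F].
    rows = [[Fraction(DIM_ETA[i]), Fraction(DIM_R[i]),
             Fraction(DIM_V[i]), Fraction(DIM_F[i])] for i in range(3)]
    for col in range(3):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, 3) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]
        # Eliminate this column from every other row.
        for r in range(3):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * p for x, p in zip(rows[r], rows[col])]
    return tuple(row[3] for row in rows)

print(solve_exponents())
```

Dimensional analysis fixes the exponents but says nothing about the dimensionless constant, which has to come from a full calculation or from experiment.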


16-41.  Poiseuille’s  law.  In  Fig.  16-14,  the  speed  of  the 
fluid  in  laminar  flow  is  proportional  to  the  first  power  of 
the  distance  from  the  stationary  plate.  This  is  also  approx- 
imately true  in  the  rotating-cylinder  viscometer  of  Fig. 
16-15. However, when fluid flows through a cylindrical
pipe of radius R and length L in a direction parallel to the
pipe  axis,  the  result  is  otherwise  because  of  the  different 
geometry.  Imagine  the  fluid  in  the  pipe  to  be  subdivided 
into  cylindrical  shells  of  infinitesimal  thickness.  These 
shells  correspond  to  the  laminae  of  Fig.  16-14.  The  outer- 
most shell,  in  contact  with  the  pipe  itself,  moves  with  neg- 
ligible speed.  Shells  of  successively  smaller  radii  move  with 
successively  greater  speeds,  being  driven  by  the  pressure 
difference  between  the  ends  of  the  pipe. 

The shear stress at any location within the fluid is still
given by Eq. (16-29), expressed in the differential form

$$\sigma_s = -\eta\,\frac{dv}{dr}$$

This stress has the same value at any point on the surface
of a cylinder of radius r (where r < R) and surface area
$2\pi r L$. The retarding force exerted on that cylinder by the
fluid outside it is thus of magnitude $F = \sigma_s\, 2\pi r L$. Since in
the steady state none of the fluid within the cylinder of
radius r is accelerating, the net force exerted on the cylinder
must be zero. This is possible only if the retarding
force is equal and opposite to the driving force due to the
pressure difference $\Delta p$ between the ends of the cylinder.
That driving force is given by the product $\Delta p\,\pi r^2$, where
$\pi r^2$ is the area of either end of the cylinder.

a. Show that the above argument leads to the equation

$$dv = -\frac{\Delta p}{2\eta L}\, r\, dr$$

b.  Integrate  this  equation  to  obtain  the  relation 
between  v and  r for  laminar  flow  in  a cylindrical  pipe. 
(Hint: Set the limits of integration at the inner surface of
the  pipe,  where  the  fluid  speed  is  zero,  and  at  the  surface 
of  an  arbitrary  cylinder  of  radius  r,  where  the  fluid  speed 
is  v.) 

c. Show that if the density of the fluid is $\rho$, the fluid
flux through the pipe is given by

$$\Phi = \frac{\pi \rho\, \Delta p\, R^4}{8 \eta L}$$

This relation is known as Poiseuille’s law.
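
The $R^4$ dependence can be checked numerically by summing the mass carried by thin cylindrical shells of fluid. The sketch below assumes the parabolic speed profile $v(r) = \Delta p\,(R^2 - r^2)/(4\eta L)$ that results from integrating the equation of part a with $v = 0$ at the pipe wall. The function names and the sample pipe are illustrative, and the water properties are those of Exercise 16-10.

```python
import math

def velocity(r, R, dp, eta, L):
    """Assumed laminar speed profile: zero at the wall, maximum on the axis."""
    return dp * (R**2 - r**2) / (4.0 * eta * L)

def mass_flux_numeric(R, dp, eta, L, rho, n=2000):
    """Sum rho * v * (2 pi r dr) over n thin cylindrical shells."""
    dr = R / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        total += rho * velocity(r, R, dp, eta, L) * 2.0 * math.pi * r * dr
    return total

def mass_flux_poiseuille(R, dp, eta, L, rho):
    """Closed form: pi rho dp R^4 / (8 eta L)."""
    return math.pi * rho * dp * R**4 / (8.0 * eta * L)

# Hypothetical pipe: R = 1.0 mm, L = 1.0 m, pressure difference 100 Pa, water.
args = (1.0e-3, 100.0, 1.0e-3, 1.0, 1.0e3)
print(mass_flux_numeric(*args), mass_flux_poiseuille(*args))
```

Doubling the radius at fixed pressure difference multiplies the flux by sixteen, which is the practical content of the law.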


16-42.  Mariotte’s  bottle.  If  water  runs  out  of  an  opening 
near  the  bottom  of  a full  container  with  an  open  top, 
the  exit  speed  of  the  water  will  decrease  as  the  level  of  the 
water  drops.  If  a steady  rate  of  flow  is  desired,  the  arrange- 
ment shown  in  Figure  16E-42,  called  Mariotte’s  bottle,  can 
be  used.  The  bottle  is  initially  completely  full  of  water. 
When  the  water  level  has  fallen  so  that  it  lies  a (decreasing) 
distance $h + h_1$ above B within the bottle, and a (fixed)
distance  h above  B within  the  tube  (actually  at  the  bottom 
of  the  tube): 

a.  What  is  the  pressure  at  A? 




b.  Apply  Bernoulli’s  theorem  to  obtain  the  speed  of 
flow  at  B. 

c.  What  is  the  pressure  of  the  air  trapped  in  the 
bottle  above  the  water  surface? 

d. What happens to this pressure as water flows out
at B? What causes this pressure change?


16-43.  Letting  it  out.  The  opening  near  the  bottom  of 
the  vessel  in  Fig.  16E-43  has  an  area  a.  A disk  is  held 
against  the  opening  to  keep  the  liquid,  of  density  p,  from 
running  out. 


Fig.  16E-43 


a.  With  what  force  does  the  liquid  press  on  the  disk? 

b. The disk is moved away from the opening a short
distance. The liquid squirts out, striking the disk inelastically.
After striking the disk, the water drops vertically
downward. Show that the force exerted by the water on
the disk is twice the force in part a.

16-44.  Undershot  water  wheel.  In  Fig.  16E-44,  a steady 
stream  of  water  of  cross-sectional  area  a and  speed  v strikes 
one  of  the  vanes  of  an  undershot  water  wheel  in  an  ap- 
proximately normal  direction. 

a.  If  the  vanes  are  moving  with  speed  V,  what  is  the 
magnitude  of  the  force  exerted  by  the  water  stream  on  the 
vane?  Assume  that  the  water  drops  vertically  from  the 
vane  after  impact. 

b.  What  is  the  power  obtained  from  the  wheel? 

c. What is the desired relation between V and v for
maximum  power? 


Fig.  16E-44 


d.  What  is  the  efficiency  of  the  system  at  maximum 
power?  [Efficiency  is  defined  to  be  the  ratio  of  output 
power  to  input  power  (or  input  energy  per  unit  time).] 

16-45.  Leaky  can.  A water-filled  can  sits  on  a table. 
The  water  squirts  out  of  a small  hole  in  the  side  of  the  can, 
located  a distance  y below  the  water  surface.  The  height  of 
the  water  in  the  can  is  h. 

a.  At  what  distance  x from  the  base  of  the  can, 
directly  below  the  hole,  does  the  water  strike  the  table  top? 
Neglect  air  resistance. 

b. How far from the bottom of the can must a second
small  hole  be  located  if  the  water  coming  out  of  this  hole  is 
to  have  the  same  range  x? 

c.  How  far  from  the  surface  of  the  water  must  the 
hole  be  located  to  give  the  maximum  range? 
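
The three parts can be explored numerically by combining Torricelli's theorem for the exit speed with the time of free fall from the hole to the table. The sketch below assumes that the water leaves the hole horizontally and that the bottom of the can is at table level, so the hole is a height h - y above the table. The function name and the sample depth h = 0.20 m are illustrative.

```python
import math

def jet_range(y, h, g=9.8):
    """Horizontal range of water leaving a hole a depth y below the surface."""
    v = math.sqrt(2.0 * g * y)               # Torricelli exit speed
    t_fall = math.sqrt(2.0 * (h - y) / g)    # time to fall the height h - y
    return v * t_fall                        # g cancels in the product

h = 0.20
for y in (0.05, 0.10, 0.15):
    print(y, jet_range(y, h))
```

The printed ranges are the same for y = 0.05 m and y = 0.15 m, which illustrates the symmetry that part b relies on, and the largest range in this list occurs for the hole at mid-depth.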




17

The Phenomenology of Heat


17-1 THE PHENOMENOLOGICAL APPROACH


This  chapter  is  concerned  principally  with  two  concepts — temperature  and 
heat — which  appear  at  first  glance  to  lie  outside  mechanics,  the  subject 
matter  of  this  book  as  it  has  been  developed  so  far.  But  the  mechanical 
properties  of  substances  depend  very  much  on  their  temperature.  And 
physical  systems  can  be  made  to  do  mechanical  work  by  transferring  heat 
into  and  out  of  them.  (Indeed,  specially  designed  systems  called  heat 
engines,  such  as  the  gasoline  engine  and  the  steam  turbine,  are  intended  to 
exploit this fact to the greatest possible advantage.) These are only two
examples of the many ways in which temperature and heat are closely linked
with  mechanics,  even  from  the  most  casual  point  of  view. 

In  this  chapter,  we  deal  with  temperature  and  heat,  and  their  applica- 
tion to  mechanical  systems,  in  a phenomenological  way.  That  is,  we  describe 
a number  of  important  experimental  observations  in  a systematic,  quantita- 
tive way.  But  we  do  not  attempt  here  to  understand  them  in  depth,  in  terms 
of  newtonian  mechanics.  For  example,  suppose  we  heat  a specific  mechan- 
ical system  — using  the  term  “heat”  in  the  everyday  sense — to  a certain 
temperature.  We  can  define  heat  and  temperature  quantitatively  in  terms 
of  the  observed  changes  that  occur  in  the  system.  From  a phenome- 
nological point  of  view,  “temperature”  is  nothing  more  nor  less  than  the 
scale  reading  at  the  end  of  the  mercury  column  in  a particular  mechanical 
system  called  a thermometer,  and  “heat”  is  whatever  it  is  that  must  be  “put 
into”  the  system  in  order  to  raise  its  temperature.  (We  will  soon  discuss 
these  matters  more  explicitly.) 

The  quantities  thus  defined  phenomenologically  can  then  be  used  to 
quantitatively  describe  the  behavior  of  a wide  variety  of  mechanical 
systems.  The  resulting  descriptions,  while  they  can  be  quite  precise  and 
very useful, do not tell us what temperature and heat really are. That is,
they say nothing as to how temperature and heat are related in a fundamental,
logical way to such now-familiar quantities as energy or momentum
or to the framework of newtonian mechanics in general. Nevertheless,
the systematic description to which this chapter is devoted
furnishes  an  essential  background  for  just  such  a deeper  understanding. 
This  is  developed  in  Chap.  18  in  terms  of  the  behavior  of  macroscopic 
systems  as  seen  from  the  microscopic  point  of  view.  For  example,  we  will 
relate  the  pressure,  volume,  temperature,  and  quantity  of  a gas  confined  in 
a container  to  the  collisions  of  the  molecules  of  the  gas  with  the  container 
walls  and  with  one  another.  Out  of  this  study  will  emerge  an  understanding 
of  the  microscopic  structure  of  gases.  (This  is  not  a trivial  matter,  as  is  evi- 
denced by  the  fact  that  the  very  word  “gas”  was  coined  in  1632  as  a deriva- 
tive of  the  Greek  word  chaos.) 

Thus  in  Chap.  18  the  subject  matter  of  this  chapter  is  explicitly  incor- 
porated into  newtonian  mechanics.  Temperature,  heat,  and  other  related 
quantities  defined  phenomenologically  in  this  chapter  are  then  redefined 
in  more  general  and  fundamental  terms. 


17-2 TEMPERATURE

Although we take the concept of temperature almost for granted, the path
of reasoning that leads from the qualitative idea to an unambiguous,
universally applicable definition is a moderately intricate one. Such a
definition depends in an essential way on the microscopic theory developed in
Chap.  18,  while  the  derivation  of  the  theory  depends  on  at  least  some  prior 
understanding  of  temperature.  Thus  we  will  have  to  raise  ourselves  by  our 
own  bootstraps.  Fortunately,  we  will  be  able  to  do  so  without  resorting  to 
circular  reasoning. 

A rough-and-ready  measurement  of  temperature  is  essential  in  all 
sorts  of  human  activity,  notably  in  cooking,  metallurgy,  and  pottery  manu- 
facture. But  there  are  some  fairly  delicate  cases,  too,  where  temperature 
measurement  is  important.  As  an  example,  the  Paduan  physician  Sanc- 
torius  was  probably  the  first  to  use  a crude  thermometer  to  obtain  consist- 
ent, objective,  and  quantitative  readings  of  the  small  variations  in  human 
body  temperature  for  aid  in  medical  diagnosis  (although  mothers  have 
doubtless  been  feeling  their  children’s  foreheads  for  millennia).  Credit  for 
the  invention  of  the  thermometer  Sanctorius  used  was  claimed  by  his 
fellow  professor  Galileo,  although  it  was  probably  invented  independently 
by  others  at  about  the  same  time. 

If  temperature  measurement  is  to  be  useful,  serviceable  thermometers 
must  be  built  and  a temperature  scale  established.  The  first  problem  is  a 
practical  one:  to  devise  a thermometer  which  is  rugged,  convenient,  and 
reliable and which gives reproducible results over a long period. The
mercury-in-glass  thermometer,  shown  schematically  in  Fig.  17-1,  is  a very 
satisfactory  solution  of  this  problem  over  a wide  range  of  applications.  As 
the  temperature  changes,  the  mercury  expands  or  contracts  slightly  in  a re- 
producible fashion.  (We  discuss  thermal  expansion  in  Sec.  17-5.  Here  it  suf- 
fices to  be  aware  of  the  phenomenon.)  While  the  glass  envelope  expands 
and  contracts  as  well,  the  mercury  does  so  to  a greater  extent.  Since  the 
relatively  large  volume  of  mercury  in  the  bulb  can  expand  only  into  the 
very  fine  (narrow)  capillary  tube,  small  changes  in  this  volume — and  hence 
small  temperature  changes — can  be  measured. 


744  The  Phenomenology  of  Heat 


Fig. 17-1 Schematic illustration of the mercury-in-glass thermometer.
The thin-walled bulb at the bottom is connected to the very
fine, thick-walled, highly uniform capillary tube on which the temperature
scale is etched. Just as in the barometer, the space above
the mercury column is evacuated. Most of the mercury is contained
in the bulb, whose volume is much larger than that of the capillary.
When the bulb is immersed in the medium whose temperature is to
be measured, the mercury soon comes into thermal equilibrium
with its surroundings; that is, the mercury and the surrounding
medium come to the same temperature. In this process, the temperature
of the mercury must usually change, and its volume
changes slightly in a corresponding manner. Specifically, the mercury
expands when its temperature is increased and contracts
when its temperature is decreased, and the final volume depends
on the temperature in a reproducible fashion. Because the capillary
tube is very fine, a small change in the total volume of the mercury
results in a readily measurable change in the position of the
end of the mercury column. The temperature is read by noting the
location along the scale of the end of the mercury column. (The
glass envelope also expands and contracts with changing temperature,
but this effect is small compared to the expansion and contraction
of the mercury.)


The  first  fairly  accurate  liquid-in-glass  thermometers  were  devised 
about  1720  by  Daniel  G.  Fahrenheit  (1686-1736).  His  development  of 
techniques  for  drawing  fine  capillary  tubes  of  highly  uniform  diameter  and 
then  filling  and  evacuating  them  was  a technological  triumph  of  his  time. 

The second problem in practical thermometry is to devise a temperature
scale which can be reproduced at will, so that persons measuring temperatures
at different places and times can compare their results. A fairly
satisfactory but completely empirical recipe for doing this is as follows:

1. Pick two phenomena which are believed always to occur at reproducible
temperatures.

2. Make marks on the scale of a thermometer, constructed in a specified
fashion, when it is immersed successively in the two media in which the
chosen phenomena are occurring.

3. Divide the distance between the two marks into a convenient
number of equal intervals by equally spaced marks. Each interval represents
one degree (°) of the temperature scale.
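The three-step recipe amounts to a linear interpolation between two marks. A minimal sketch follows, assuming the position of the end of the liquid column is measured in arbitrary length units along the tube; the function name `make_scale` and the numerical mark positions are hypothetical.

```python
def make_scale(mark_low, mark_high, value_low=0.0, value_high=100.0):
    """Build a reading function from two fixed-point marks (steps 1 and 2).

    mark_low, mark_high: column positions at the two fixed points.
    value_low, value_high: temperatures assigned to those fixed points.
    """
    if mark_high == mark_low:
        raise ValueError("the two fixed-point marks must be at different positions")
    # Step 3: equal divisions between the marks, so the reading is linear in position.
    degrees_per_unit_length = (value_high - value_low) / (mark_high - mark_low)

    def reading(position):
        return value_low + (position - mark_low) * degrees_per_unit_length

    return reading


if __name__ == "__main__":
    # Hypothetical marks at 20.0 mm and 220.0 mm along the tube.
    scale = make_scale(20.0, 220.0)
    print(scale(120.0))  # a column end halfway between the marks
```

Note that the result depends on the thermometer "constructed in a specified fashion": a different working liquid would divide the same interval slightly differently.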


In  following  this  recipe,  Fahrenheit  defined  as  the  zero  of  temperature 
the  lowest  temperature  he  could  then  achieve  in  the  laboratory,  that  of  an 
ice-salt  mixture.  That  is,  he  immersed  the  bulb  of  his  thermometer  in  an 
ice-salt  mixture  and  then  made  a mark  on  its  scale  which  he  labeled  0°.  For 
the  upper  defining  temperature,  which  he  chose  to  call  100°,  he  took 
normal human body temperature. One Fahrenheit degree is then automatically
just 1/100 of the temperature difference between the two defining
temperatures,  which  are  called  fixed  points. 

It  was  apparent  quite  early  that  the  phenomena  chosen  by  Fahrenheit 
to  define  his  fixed  points  were  not  accurately  reproducible.  As  it  happens, 
the  lowest  temperature  that  can  be  obtained  with  an  ice-salt  mixture  is 
about 6 Fahrenheit degrees lower than the best he could do. The fact that
human body temperature is about 98.6°, and not 100°, has to do with the
limits  of  accuracy  of  Fahrenheit’s  early  thermometers,  as  well  as  with  the 
variability  of  body  temperature. 

The  Fahrenheit  scale  was  therefore  redefined  by  using  the  melting 
and  boiling  “points”  of  water  at  atmospheric  pressure  as  the  new  fixed 
points. In order to make the newly defined scale approximately reconcilable
to the old one, the new points were defined to be 32° and 212°, respectively,
which is about what they had been on the old scale. We will soon
discuss  other  very  similar  redefinitions  of  temperature  scales. 

The Fahrenheit scale is almost exclusively of historical interest today;
our  discussion  is  intended  mostly  to  show  the  essentially  arbitrary  nature  of 
the  choice  of  fixed  points  and  to  illustrate  the  practical  considerations 
which  lead  to  particular  choices  for  these  points.  The  Celsius  scale,  which 
we  discuss  next,  is  used  almost  exclusively  in  the  civilized  world.  Only  the 
United  States  and  a few  other  countries  still  use  the  Fahrenheit  scale  to  a 
limited  and  rapidly  diminishing  extent. 

The  Swedish  astronomer  Anders  Celsius  (1701-1744)  was  among  the 
first  to  realize  fully  the  importance  of  using  easily  reproducible  fixed  points 
exclusively  (as  we  have  done  above  in  the  redefinition  of  the  Fahrenheit 
scale).  Taking  the  melting  point  of  ice  and  the  boiling  point  of  water  at 
atmospheric  pressure  as  the  fixed  points,  Celsius  chose,  as  had  Fahrenheit, 
to  divide  the  interval  between  his  fixed  points  into  100  degrees.  Since  both 
the  interval  and  the  fixed  points  are  different  from  Fahrenheit’s  original 
ones, the two scales differ in the sizes of their degrees as well as in the
location of their zero points.
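Because both scales are fixed by the same two phenomena, a conversion between them follows directly from the fixed-point values. The sketch below assumes the redefined Fahrenheit scale (32° and 212° at the melting and boiling points of water) and the original Celsius assignment (0° and 100°); the function names are illustrative.

```python
# Fixed-point values on each scale (melting point, boiling point of water).
ICE_F, STEAM_F = 32.0, 212.0
ICE_C, STEAM_C = 0.0, 100.0

# Size of one Celsius degree measured in Fahrenheit degrees: 180/100.
F_DEGREES_PER_C_DEGREE = (STEAM_F - ICE_F) / (STEAM_C - ICE_C)


def celsius_to_fahrenheit(t_c):
    """Shift to the Celsius zero, rescale the degree, shift to the Fahrenheit zero."""
    return ICE_F + (t_c - ICE_C) * F_DEGREES_PER_C_DEGREE


def fahrenheit_to_celsius(t_f):
    """Inverse of celsius_to_fahrenheit."""
    return ICE_C + (t_f - ICE_F) / F_DEGREES_PER_C_DEGREE


if __name__ == "__main__":
    print(fahrenheit_to_celsius(98.6))   # normal body temperature
    print(celsius_to_fahrenheit(100.0))  # boiling point of water
```

The two constants in the conversion correspond to the two differences noted above: the ratio 180/100 is the difference in degree size, and the offset 32 is the difference in zero point.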

It  is  a common  misnomer  to  call  the  Celsius  scale  the  centigrade  scale. 
Strictly  speaking,  any  scale  which  has  100  degrees  between  its  fixed  points 
is a centigrade (that is, a 100-degree) scale. In this sense the original Fahrenheit
scale is a centigrade scale. The Celsius scale used to be the particular
centigrade scale in which the fixed points are taken to be the freezing and
boiling points of water, but even this is no longer true, except approximately.
The Celsius scale has been redefined in terms of the absolute temperature
scale in a manner which we discuss later. This redefinition involves
only one arbitrarily chosen fixed point (the other being absolute
zero). The old fixed points were abandoned as being too difficult to reproduce
with sufficient accuracy. The fixed point now used is the triple point
of water. This is the unique combination of pressure and temperature at
which alone water can exist simultaneously in the three phases: solid (ice), liquid
(water), and gas (water vapor). Since this can happen at one pressure
only,  there  is  no  possibility  that  an  error  in  pressure  control  will  result  in  a 
temperature  error. 

Such  a scale  can  no  longer  be  spoken  of  properly  as  a centigrade  scale. 
The temperature $t_{tp}$ of the triple point is defined to be exactly 0.01 degrees
Celsius. We write this definition as

$$t_{tp} = 0.01^{\circ}\mathrm{C}$$

(It  is  conventional  to  refer  to  a specific  temperature  on  the  Celsius  scale  in 
the  form  “degrees  Celsius”  (°C).  But  in  referring  to  a difference  between  two 
temperature  measurements  on  this  scale,  we  refer  to  “Celsius  degrees.”) 
The  freezing  point  of  water  at  atmospheric  pressure  lies  quite  close  to  0.01 
Celsius  degrees  below  the  triple-point  temperature — that  is,  quite  close  to 
0°C. The boiling point of water at atmospheric pressure is close to 100°C,
so that the temperature difference between the freezing and boiling points
is close to (but not exactly) 100 Celsius degrees. But that is simply a convenience;
neither of these two points is used in defining the temperature
scale. 


17-3 CHARLES’ LAW

Now that we have an empirical (that is, experimentally based) definition of
temperature, let us consider quantitatively the connection between temperature
and such other quantities as pressure and volume. On the basis of the
Boyle’s-law experiment described in Sec. 16-4, we derived an empirical
rule connecting pressure with volume under conditions of constant temperature.
We now consider a very similar (if somewhat idealized) experiment in
which temperature is varied and the volume is observed under conditions
of constant pressure. The experimental apparatus, depicted schematically in
Fig. 17-2, is called the constant-pressure gas thermometer. A drop of liquid,
which is free to move in the long horizontal tube, defines the volume of the
gas trapped to the left of it. The pressure of that gas must be equal to atmospheric
pressure, since the drop will move until it is. With such a device it is
easy to make the qualitative observation that the volume of the trapped gas
increases with increasing temperature and decreases with decreasing temperature.
(Indeed, this was the basis of Galileo's primitive thermometer.)

In order to make this observation quantitative, first we place the gas
thermometer in a bath at the triple-point temperature of water. With $t_{tp}$ =
0.01°C, we carefully measure the volume $V_{tp}$ of the trapped gas. Using an
empirical thermometer of the sort described in Sec. 17-2, say a mercury-in-glass
one, we prepare a series of baths at various temperatures, both
above and below 0.01°C. Immersing the gas thermometer successively in
these baths, we find that each temperature increase (or decrease) of 1
Celsius degree results in a corresponding increase (or decrease) of the gas
volume by $(1/273.16)V_{tp}$. That is, if a series of measurements of volume $V$
is made at various temperatures $t$ with the pressure held constant for all
measurements, each volume $V$ is related to the corresponding temperature
$t$ by the relation

$$V = V_{tp}\left(1 + \frac{t - t_{tp}}{273.16}\right) \qquad \text{for constant pressure and mass} \tag{17-1}$$

The  qualification  of  constant  mass  arises  from  the  fact  that  the  gas  within 
the  apparatus  is  trapped.  That  is,  no  gas  can  enter  or  leave  the  apparatus. 
This  empirical  expression  is  known  as  Charles’  law,  or  Gay-Lussac’s  law, 
after Jacques A. C. Charles (1746-1823) and Joseph Louis Gay-Lussac
(1778-1850), both of whom made significant inquiries into the thermal
properties of gases.

Fig. 17-2 A constant-pressure gas thermometer. A fixed quantity
of gas is confined to the left of the liquid drop in the horizontal arm
of the device. The gas pressure must be equal to that of the atmosphere.
As the temperature changes, the volume of the confined gas
changes as shown, relative to its value $V_{tp}$ at the triple-point temperature
$t_{tp}$ = 0.01°C.
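Charles’ law in the form of Eq. (17-1) is simple to evaluate. The sketch below is a minimal illustration, not part of the original treatment; the function name `volume_at` and the sample volume are assumptions.

```python
T_TP_CELSIUS = 0.01   # triple-point temperature of water, in degrees Celsius
DIVISOR = 273.16      # each Celsius degree changes the volume by V_tp / 273.16


def volume_at(t_celsius, v_tp):
    """Volume of the trapped gas at temperature t_celsius, by Eq. (17-1).

    v_tp is the volume measured at the triple point; pressure and mass are
    assumed constant, and the gas is assumed to remain far from liquefaction.
    """
    return v_tp * (1.0 + (t_celsius - T_TP_CELSIUS) / DIVISOR)


if __name__ == "__main__":
    v_tp = 100.0  # hypothetical triple-point volume, in cubic centimeters
    for t in (-50.0, 0.01, 50.0, 100.0):
        print(f"t = {t:7.2f} C  ->  V = {volume_at(t, v_tp):7.2f} cm^3")
```

The volume is returned in whatever unit is used for `v_tp`, since Eq. (17-1) only fixes the ratio of the two volumes.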

Like  Boyle’s  law,  Charles'  law  is  very  general  in  the  sense  that  it  applies 
to  all  gases,  as  long  as  the  temperature  is  well  above  the  liquefaction  point 
of  the  gas  in  question  and  the  pressure  is  not  too  high.  This  is  something  of 
a surprise.  While  we  might  have  expected  all  gases  to  contract  on  cooling, 
we would probably not have predicted in advance that they would all contract
at the same rate.

If  we  were  to  carry  out  the  experiment  with  a large  variety  of  gases,  we 
would see one after another “drop out” by liquefying as the temperature
was lowered. Once liquefied, a substance is no longer a gas, and Charles’
law  no  longer  applies  to  it.  The  remaining  gases,  however,  would  continue 
to  obey  Charles’  law.  On  the  basis  of  this  observation,  we  assert  that  insofar 
as  a gas  resists  liquefaction,  it  approximates  the  behavior  of  what  we  call  an 
ideal  gas — one  which  obeys  Charles’  law  perfectly  at  all  temperatures.  In 
this  sense,  some  gases  are  closer  to  ideal  than  others.  Helium,  in  particular, 
does not liquefy at atmospheric pressure until its volume is only 1.5 percent
of $V_{tp}$, a point reached at about -269°C, which is only about 4.2 Celsius degrees
above the temperature at which we would expect its volume to go to
zero, according to Eq. (17-1), if it did not liquefy. Thus helium comes very
close  to  approximating  an  ideal  gas. 
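The figure of 1.5 percent quoted for helium can be checked by evaluating Eq. (17-1) at t = -269°C. This is only a worked evaluation using the values already given:

```latex
\frac{V}{V_{tp}} = 1 + \frac{t - t_{tp}}{273.16}
                 = 1 + \frac{-269 - 0.01}{273.16}
                 \approx 1 - 0.985
                 = 0.015
```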

Charles’  law  predicts  that  if  we  could  somehow  invent  an  ideal  gas  to 
fill  the  gas  thermometer,  the  trapped  volume  would  vanish  completely  at  a 
temperature 273.16 Celsius degrees below the triple-point temperature of
water $t_{tp}$. Since $t_{tp}$ = 0.01°C, this would occur at the Celsius temperature
$t$ = -273.15°C. Thus $t$ = -273.15°C is the lowest conceivable temperature,
at  least  for  gases.  It  is  the  best  possible  choice  for  a zero  point  on  an  absolute 
temperature  scale.  If  we  use  such  a scale,  all  temperatures  must  be  positive, 
at  least  for  gases. 

This absolute zero is the basis for the Kelvin scale of temperature; we
define it to be 0 kelvin (0 K). But we must still define the size of the degree
of temperature, called the kelvin. In order to make it quite closely the same
as the old centigrade degree, we define the Kelvin temperature of the triple point
of water $T_{tp}$ to have the value

$$T_{tp} = 273.16\ \mathrm{K}$$

as  already  noted.  The  conversion  between  Kelvin  and  Celsius  temperatures 
is  then  given  by  a simple  equation  and  its  inverse,  which  in  fact  define  the 
temperature  t on  the  Celsius  scale  in  terms  of  the  temperature  T on  the 
Kelvin  scale.  These  are 

T = t + 273.15  and  t = T - 273.15  (17-2) 

Thus  0°C  = 273.15  K.  And  a temperature  difference  of  1 Celsius  degree  is 
equal  to  a difference  of  1 K. 
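Equations (17-2) translate directly into a pair of conversion functions. The following is a minimal sketch; the function names are illustrative, and the check against negative Kelvin temperatures is an added safeguard rather than something the equations themselves require.

```python
KELVIN_OFFSET = 273.15  # from Eq. (17-2): T = t + 273.15


def celsius_to_kelvin(t_celsius):
    """Kelvin temperature T corresponding to Celsius temperature t."""
    return t_celsius + KELVIN_OFFSET


def kelvin_to_celsius(t_kelvin):
    """Celsius temperature t corresponding to Kelvin temperature T."""
    if t_kelvin < 0:
        raise ValueError("Kelvin temperatures of a gas cannot be negative")
    return t_kelvin - KELVIN_OFFSET


if __name__ == "__main__":
    print(celsius_to_kelvin(0.01))   # triple point of water
    print(kelvin_to_celsius(0.0))    # absolute zero
```

Because the two scales differ only by an additive constant, a temperature difference has the same numerical value on both.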

In older literature, the standard nomenclature was °K (degrees Kelvin) or occasionally
°A (degrees absolute) rather than simply K. The degree sign, however,
is really a historical appendage. It tends to be misleading in that it suggests that
temperature is a quantity possessing dimensions. Inspection of Eq. (17-1), however,
will convince you that it is really a dimensionless number. The kelvin is
named after William Thomson, Lord Kelvin (1824-1907), who was probably the
first to suggest the use of an absolute temperature scale. Kelvin made an enormous
number of important contributions to science and engineering, notably in thermodynamics
and electricity.

The use of the Kelvin scale makes possible a simplification of Eq.
(17-1). Using the second of Eqs. (17-2) and noting that $t_{tp}$ = 0.01°C, we
obtain for the volume $V$ at any Kelvin temperature $T$ the relation
$V = V_{tp}[1 + (T - 273.15 - 0.01)/273.16]$, or

$$V = V_{tp}\,\frac{T}{273.16} \qquad \text{for constant pressure and mass} \tag{17-3}$$

This equation shows that the volume of an ideal gas is directly proportional to its
absolute temperature. In order to stress the fundamental importance of this
point, we rewrite Charles’ law in the general form $V/T = V_{tp}/273.16$, or

$$\frac{V}{T} = \text{constant} \qquad \text{for constant pressure and mass} \tag{17-4}$$
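The algebra connecting Eq. (17-1) to Eq. (17-3) is brief but easy to miss. Written out, with no assumptions beyond Eqs. (17-1) and (17-2):

```latex
V = V_{tp}\left[1 + \frac{T - 273.15 - 0.01}{273.16}\right]
  = V_{tp}\left[\frac{273.16 + (T - 273.16)}{273.16}\right]
  = V_{tp}\,\frac{T}{273.16}
```

The two constants 273.15 and 0.01 combine to cancel the 273.16 in the numerator, which is why the Kelvin form contains no additive term.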

In  Example  17-1,  Boyle’s  law  and  Charles’  law  are  used  to  describe  the 
behavior  of  a quantity  of  confined  gas  as  its  pressure  and  its  temperature 
are  changed. 


EXAMPLE 17-1

The cylinder shown in Fig. 17-3 is equipped with a leakproof piston. A pointer attached
to the piston rod and a scale provide a means of measuring the cylinder volume
$V$ at any time. The cylinder is also fitted with a pressure gauge, so that the pressure
$p$ of the gas trapped inside it can also be measured.

a. The apparatus is immersed in a bath of ice water. Its temperature is thus
$T_1$ = 273 K. The position of the piston is adjusted until the cylinder volume is $V_1$ =
1.000 liter (L) (1 L = $1 \times 10^{-3}$ m$^3$). The pressure gauge shows that the gas pressure
inside is $p_1$ = 1.000 atm. With the apparatus still immersed in ice water, the piston
is pulled out until the pressure gauge reads $p_2$ = 0.333 atm.

A gas  heater  under  the  bath  is  then  turned  on.  All  the  ice  melts,  and  the  bath 
temperature  then  slowly  rises  until  the  water  begins  to  boil.  The  system  thus  comes 
to a final temperature $T_3$ = 373 K. During this process, the piston is moved as necessary
to keep the pressure reading constant. Thus the final pressure reading is $p_3$ =
$p_2$ = 0.333 atm. What is the final cylinder volume $V_3$ read on the scale?

Fig. 17-3 Illustration for Example 17-1.

Fig. 17-4 Schematic diagram of two processes carried out on the system shown in Fig. 17-3
and described in Example 17-1. Each of the two processes involves a temperature change at
constant pressure and a pressure change at constant temperature. It is shown in Example 17-1
that if the initial conditions are the same, the final conditions will be the same, regardless of
the order in which the changes are carried out. It is only necessary that the temperature and
pressure changes be the same in the two cases.

■ The process has two parts, shown schematically in Fig. 17-4a. In the first
part, the temperature of the system remains constant, being fixed at the freezing
point of water. Since the piston is leakproof, the mass of the gas in the cylinder remains
constant throughout. Consequently, you can use Boyle’s law of Eq. (16-20b),
$pV$ = constant, to determine the volume $V_2$ of the cylinder which corresponds to
the pressure $p_2$. Since for constant temperature and mass the product $pV$ remains
constant, you have

$$p_2V_2 = p_1V_1$$

or

$$V_2 = V_1\,\frac{p_1}{p_2}$$


In the second part of the process, the pressure of the system remains constant.
Consequently, you can use Charles’ law to determine the volume $V_3$ from the volume
$V_2$, the corresponding temperature $T_2$, and the final temperature $T_3$. Expressing
the temperatures on the Kelvin scale makes possible the use of Charles’ law
in the convenient form of Eq. (17-4). Since for constant pressure and mass Charles’
law in this form requires that the ratio $V/T$ remain constant, you have

$$\frac{V_3}{T_3} = \frac{V_2}{T_2}$$

or

$$V_3 = V_2\,\frac{T_3}{T_2}$$


Inserting into this equation the value of $V_2$ just obtained by applying Boyle’s law to
the first part of the process, you find

$$V_3 = V_1\,\frac{p_1}{p_2}\,\frac{T_3}{T_2} \tag{17-5a}$$

Since the temperature does not change in the first part of the process, you have
$T_2 = T_1$. And since the pressure does not change in the second part of the process,
you have $p_2 = p_3$. Making these substitutions in Eq. (17-5a), you have

$$V_3 = V_1\,\frac{p_1}{p_3}\,\frac{T_3}{T_1} \tag{17-5b}$$




When you insert the numerical values, you obtain

$$V_3 = 1.000\ \mathrm{L} \times \frac{1.000\ \mathrm{atm}}{0.333\ \mathrm{atm}} \times \frac{373\ \mathrm{K}}{273\ \mathrm{K}} = 4.10\ \mathrm{L}$$


b. The apparatus is again immersed in a bath of ice water, with the initial conditions
the same as those in part a. Its temperature is $T_1' = T_1$ = 273 K, the cylinder
volume is $V_1' = V_1$ = 1.000 L, and the gas pressure is $p_1' = p_1$ = 1.000 atm. This time,
the system is first heated until the water in the bath boils. During this part of the
process, the piston is moved as necessary to maintain the pressure reading constant,
so that when the temperature reaches $T_2'$ = 373 K, the pressure is $p_2' = p_1'$. With the
apparatus still immersed in boiling water, the piston is pulled out until the pressure
gauge reads $p_3'$ = 0.333 atm. What is the final cylinder volume $V_3'$ read on the scale?

■ Again, the process has two parts, shown schematically in Fig. 17-4b. In both
parts, again the mass of the gas in the cylinder remains constant. In the first part,
the pressure remains constant, so that Charles’ law can be used to determine the
volume $V_2'$ of the cylinder which corresponds to the temperature $T_2'$. You have

$$\frac{V_2'}{T_2'} = \frac{V_1'}{T_1'}$$

or

$$V_2' = V_1'\,\frac{T_2'}{T_1'}$$


In the second part of the process, the temperature of the system remains constant.
Consequently, you can use Boyle’s law to determine the volume $V_3'$ which corresponds
to the final pressure $p_3'$. You have

$$p_3'V_3' = p_2'V_2'$$

or

$$V_3' = V_2'\,\frac{p_2'}{p_3'}$$


Inserting into this equation the value of $V_2'$ just obtained by applying Charles’ law to
the first part of the process, you find

$$V_3' = V_1'\,\frac{T_2'}{T_1'}\,\frac{p_2'}{p_3'} \tag{17-6a}$$


Since the pressure does not change in the first part of the process, you have
$p_2' = p_1'$. And since the temperature does not change in the second part of the
process, you have $T_3' = T_2'$. Making these substitutions in Eq. (17-6a), you have

$$V_3' = V_1'\,\frac{T_3'\,p_1'}{T_1'\,p_3'} \tag{17-6b}$$


But the initial volume, temperature, and pressure are the same as those in part a.
And the final temperature and pressure are also the same as those in part a. Hence
you have

$$V_3' = V_1\,\frac{T_3}{T_1}\,\frac{p_1}{p_3} \tag{17-7a}$$

or, comparing with Eq. (17-5b),

$$V_3' = V_3 \tag{17-7b}$$

And since you found $V_3$ = 4.10 L in part a, you have also

$$V_3' = 4.10\ \mathrm{L}$$

Does it follow from the above discussion that the intermediate volumes $V_2$ and $V_2'$
are equal as well?


Example 17-1 shows that the final volume of the system does not depend
on the order in which the operations of temperature change and
pressure change are carried out. Indeed, in both parts of the example it
was possible to write an equation which expressed the final volume of the
system in terms of the final temperature and pressure and the initial volume,
temperature, and pressure. While the final values were found by
working through an intermediate state represented by $p_2$, $V_2$, and $T_2$ (or $p_2'$,
$V_2'$, and $T_2'$), those intermediate quantities did not appear in the result. This
suggests  a very  important  principle,  which  we  verify  in  this  and  the  next 
two  chapters:  The  pressure,  volume,  and  temperature  of  a fixed  mass  of 
gas  are  interrelated  in  a way  that  does  not  depend  on  the  particular  process 
through  which  they  were  attained. 
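The two routes of Example 17-1 can be followed step by step in a short script. This is a sketch under the example's own assumptions (fixed mass of gas, ideal behavior); the helper names `boyle_step` and `charles_step` are illustrative.

```python
def boyle_step(volume, p_from, p_to):
    """Constant temperature and mass: pV stays constant."""
    return volume * p_from / p_to


def charles_step(volume, t_from, t_to):
    """Constant pressure and mass: V/T stays constant (Kelvin temperatures)."""
    return volume * t_to / t_from


# Initial and final conditions from Example 17-1.
V1, P1, T1 = 1.000, 1.000, 273.0   # liters, atmospheres, kelvins
P3, T3 = 0.333, 373.0              # atmospheres, kelvins

# Part a: lower the pressure first, then heat.
v2_a = boyle_step(V1, P1, P3)
v3_a = charles_step(v2_a, T1, T3)

# Part b: heat first, then lower the pressure.
v2_b = charles_step(V1, T1, T3)
v3_b = boyle_step(v2_b, P1, P3)

if __name__ == "__main__":
    print(f"part a: intermediate {v2_a:.3f} L, final {v3_a:.2f} L")
    print(f"part b: intermediate {v2_b:.3f} L, final {v3_b:.2f} L")
```

Running the script lets the intermediate volumes of the two routes be compared directly with the final ones.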


17-4 THE EQUATION OF STATE OF AN IDEAL GAS


In  Secs.  16-4  and  17-3,  we  obtained  two  quantitative  relations  among  the 
variables  describing  the  condition  of  a fixed  quantity  of  gas: 

$$pV = \text{constant} \qquad \text{for } T = \text{constant (Boyle's law)} \tag{17-8a}$$

$$\frac{V}{T} = \text{constant} \qquad \text{for } p = \text{constant (Charles' law)} \tag{17-8b}$$


We  can  write  an  equation  which  includes  all  this  information,  without  the 
special  restrictions  placed  on  each  of  the  above  equations.  This  equation, 
valid  for  a fixed  quantity  of  gas,  is 

$$\frac{pV}{T} = \text{constant} \tag{17-9}$$

Note  that  if  we  impose  either  of  the  special  conditions  (T  or  p held  fixed), 
we  get  back  the  corresponding  “law.” 
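The recovery of the two special cases can be written out explicitly. Here the symbol C is introduced only for this illustration, to stand for the constant in Eq. (17-9):

```latex
\frac{pV}{T} = C
\quad\Longrightarrow\quad
\begin{cases}
pV = CT = \text{constant} & \text{if } T \text{ is held fixed (Boyle's law)} \\[6pt]
\dfrac{V}{T} = \dfrac{C}{p} = \text{constant} & \text{if } p \text{ is held fixed (Charles' law)}
\end{cases}
```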

The empirical constant in Eq. (17-9) must contain as a factor the
amount  of  matter  present  in  the  gas.  To  see  that  this  is  so,  consider  two 
containers  of  equal  volume  V which  are  separated  by  a removable  partition. 
Suppose that each container holds the same kind of gas at the same pressure
$p$ and temperature $T$. Since everything else is equal, the two containers
must  hold  equal  amounts  of  gas.  Now  remove  the  partition.  The  resulting 
large  container  holds  twice  as  much  gas  as  either  of  the  original  containers. 
Its  volume  is  2V,  while  the  pressure  and  temperature  are  still  p and  T.  The 
quantity  on  the  left  side  of  Eq.  (17-9)  is  thus  doubled.  So  the  quantity  on  the 
right  side  of  the  same  equation  must  also  be  doubled.  Since  nothing  else 
has changed, this must be because a doubling of the amount of matter results
in a doubling of the value of the constant. Thus we can rewrite Eq.
(17-9) in the more explicit but still somewhat ambiguous form

$$\frac{pV}{T} = (\text{amount of matter}) \times (\text{residual constant}) \tag{17-10a}$$


The quantity we have called “amount of matter” can be expressed as a mass,
as we have always done up to this point. This is certainly consistent with the
restrictions we have placed on Boyle’s and Charles’ laws, which are valid
only for gases whose mass is fixed. However, Charles’ law in the form of Eq.
(17-3), $V = V_{tp}(T/273.16)$, suggests that the mass of the gas is not the quantity
of primary interest. This equation holds (in the range of conditions
under  which  a gas  approximates  the  ideal  gas)  for  all  types  of  gases.  But  the 
mass  of  an  individual  gas  molecule  differs  from  one  type  of  gas  to  another. 

Indeed, if the mass of an individual gas molecule is not relevant, the
only possible quantity we can insert into Eq. (17-10a) having to do with the
amount of matter present is the number of molecules present. (This number
is proportional to the total mass of the gas for any particular gas.) We
denote the number of molecules by the symbol $N$ and the residual constant
by the symbol $k$. Equation (17-10a) can then be written as $pV/T = Nk$. Dividing
by $N$ yields

$$\frac{pV}{NT} = k \tag{17-10b}$$

The argument leading to Eq. (17-10b) may be tested experimentally. If
it is correct, the quantity $k$ must have the same value (within experimental
error) when evaluated by measurement on many different gases, as long as
their behavior approximates that of an ideal gas. Appropriate measurements
do, in fact, yield this result. The quantity $k$ is called Boltzmann’s constant.
Its value is measured to be

$$k = 1.3807 \times 10^{-23}\ \mathrm{J/K} \tag{17-11}$$


EXAMPLE  17-2 

Show  that  the  units  given  for  Boltzmann’s  constant  in  Eq.  (17-11)  are  correct. 

■ Since $k$ is given by Eq. (17-10b), its units must be the same as those of the quantity
$pV/NT$ to which it is equal. You thus have

$$\text{Units of } k = \frac{(\text{force/area})(\text{volume})}{(\text{dimensionless number})(\text{kelvin})} = \frac{(\text{force} \cdot \text{length})}{(\text{kelvin})} = \frac{(\text{energy})}{(\text{kelvin})}$$

In SI, the unit of energy is the joule. Hence you have

$$\text{Units of } k = \mathrm{J/K}$$


Boltzmann’s  constant  is  named  after  the  Austrian  Ludwig  Boltzmann 
(1844-1906). One of the giants of nineteenth-century theoretical physics, Boltzmann
was a staunch advocate of the molecular theory of matter and of the kinetic
theory (developed in Chap. 18) which stems from it. He suffered severe fits of depression,
partly as a result of the reluctance of many of his colleagues to accept his
views. Ironically, he committed suicide on the eve of the collapse of resistance to
the  molecular  picture  of  matter. 

It is customary to write Eq. (17-10b) in the standard form

pV  = NkT  (17-12) 

This  equation  is  quite  general.  The  only  restriction  is  that  the  gas  whose 
behavior it describes must approximate an ideal gas. (For present purposes,
this restriction means that the temperature of the gas must be well
above its liquefaction temperature and that its pressure must not be too
great, say, 1 atm or less. The ideal gas is defined more precisely in Chap.
18.)  Equation  (17-12)  is  therefore  called  the  equation  of  state  of  an  ideal 
gas,  or  the  ideal-gas  law.  Example  17-3  demonstrates  a simple  application 
of  the  ideal-gas  law  in  the  form  of  Eq.  (17-12). 


EXAMPLE 17-3

How  many  molecules  are  present  in  1.00  L of  air  at  room  temperature  (300  K)  and 
atmospheric  pressure? 

■ From Eq. (17-12) you have

$$N = \frac{pV}{kT}$$

In order to solve this equation numerically, the quantities must all be expressed in
consistent units. You know, from Sec. 16-3, that 1 atm = $1.013 \times 10^{5}$ Pa. Also, you
have 1 L = $1 \times 10^{-3}$ m$^3$. Thus you can write, to three significant figures,

$$N = \frac{1.01 \times 10^{5}\ \mathrm{N/m^2} \times 1.00 \times 10^{-3}\ \mathrm{m^3}}{1.38 \times 10^{-23}\ \mathrm{J/K} \times 300\ \mathrm{K}} = 2.45 \times 10^{22}\ \text{molecules}$$
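The calculation of Example 17-3 can be reproduced in a few lines, with the unit conversions made explicit. This is a minimal sketch; the function name `number_of_molecules` is illustrative, and the constants are the values quoted in the text.

```python
K_BOLTZMANN = 1.3807e-23   # J/K, Eq. (17-11)
PA_PER_ATM = 1.013e5       # pascals in 1 atm (Sec. 16-3)
M3_PER_LITER = 1.0e-3      # cubic meters in 1 liter


def number_of_molecules(pressure_pa, volume_m3, temperature_k):
    """Solve the ideal-gas law pV = NkT for N, with all quantities in SI units."""
    return pressure_pa * volume_m3 / (K_BOLTZMANN * temperature_k)


if __name__ == "__main__":
    n = number_of_molecules(1.0 * PA_PER_ATM, 1.00 * M3_PER_LITER, 300.0)
    print(f"N = {n:.3e} molecules")
```

Carrying the unrounded constants through the calculation gives a result that rounds to the value quoted in the example.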


There  is  a way  of  rewriting  the  ideal-gas  law,  Eq.  (17-12),  completely  in 
terms  of  macroscopic  quantities.  This  is  often  convenient,  since  in  dealing 
with  macroscopic  quantities  of  gas  (or  other  matter)  it  is  awkward  to 
express  the  quantity  in  terms  of  the  number  of  molecules  present.  In  order 
to  do  this,  we  proceed  as  follows.  The  atomic  mass  unit,  introduced  in 
Example 15-6, is defined to be exactly one-twelfth the mass of an atom of
carbon-12, the most common isotope of carbon. Experimental measurement
shows the atomic mass unit u to have the value u = $1.661 \times 10^{-27}$ kg.
For any pure substance consisting of identical molecules, a certain quantity
of that substance possesses a mass, measured in kilograms, which is numerically
equal to the mass of any one of its molecules, measured in atomic
mass units. This quantity of the substance is defined to be 1 kilomole (kmol).
For example, the molecules of hydrogen consist of two atoms each of
hydrogen. If a sample consists of the pure isotope hydrogen-1, whose
atomic mass is 1.008 u, then the mass of 1 kmol of the sample is 2 ×
1.008 kg = 2.016 kg.

The  mass  of  1 kmol  of  a substance  can  be  defined  even  if  one  or  more  of  the 
chemical elements comprising it consist of a mixture of several isotopes. In this
case, the average mass of the atoms present is used in place of the mass of any particular
isotope. For example, the gas hydrogen chloride (HCl) consists of molecules
each comprising one hydrogen atom (average mass 1.0 u) and one chlorine
atom  (average  mass  35.5  u).  The  average  mass  of  the  molecules  is  thus  1.0  u + 
35.5  u = 36.5  u.  Therefore  the  mass  of  1 kmol  of  HCl  is  36.5  kg. 
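The bookkeeping for the mass of 1 kmol is a weighted sum over the atoms in the molecule. A minimal sketch follows, using only the average atomic masses quoted in the text; the dictionary representation of a molecular formula and the function name are assumptions made for illustration.

```python
# Average atomic masses in atomic mass units, as quoted in the text.
AVERAGE_ATOMIC_MASS_U = {"H": 1.0, "Cl": 35.5}


def kilomole_mass_kg(formula):
    """Mass in kilograms of 1 kmol of a substance.

    formula maps each element symbol to the number of such atoms per molecule,
    for example {"H": 1, "Cl": 1} for hydrogen chloride. By the definition of the
    kilomole, the result equals the molecular mass in atomic mass units.
    """
    return sum(AVERAGE_ATOMIC_MASS_U[element] * count
               for element, count in formula.items())


if __name__ == "__main__":
    print(kilomole_mass_kg({"H": 1, "Cl": 1}))  # hydrogen chloride, HCl
```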

Because  of  the  way  in  which  we  have  defined  the  kilomole,  1 kmol  of  any 
substance  contains  the  same  number  of  molecules  as  1 kmol  of  any  other  substance. 
This  is  made  evident  by  the  following  calculation,  in  which  we  evaluate  the 
number  of  molecules  present  in  1 kmol  of  an  arbitrary  substance.  We  have 

$$\text{Number of molecules in 1 kmol} = \frac{\text{mass of 1 kmol of molecules (in kg)}}{\text{mass of 1 molecule (in kg)}}$$


754  The  Phenomenology  of  Heat 


The  denominator  of  this  fraction  can  be  expressed  in  the  form  [mass  of  1 
molecule  (in  u)]  x [1  u (in  kg)].  Thus  we  have 

$$\text{Number of molecules in 1 kmol} = \frac{\text{mass of 1 kmol of molecules (in kg)}}{\text{mass of 1 molecule (in u)}} \times \frac{1}{\text{1 u (in kg)}}$$

The first fraction on the right side of this equation has the numerical value
1, because of the definition of the quantity 1 kmol. Thus the second
fraction, the reciprocal of the atomic mass unit expressed in kilograms, is equal
to the number of molecules in 1 kmol. This number, which is a universal
constant, is called Avogadro’s number A. The value of A is determined
experimentally to be

$$A = 6.022 \times 10^{26}$$  (17-13)

Just as 1 kmol is defined to be the quantity of a substance whose mass in
kilograms is numerically equal to the mass of its individual molecules in atomic mass
units, 1 mole (mol) is defined to be the quantity of a substance whose mass in
grams is numerically equal to the mass of its individual molecules in atomic mass
units. One mole contains 10⁻³ as many molecules as 1 kmol, that is, 6.022 × 10²³
molecules. In chemical practice, where substances are handled experimentally
more often in gram quantities than in kilogram quantities, this is the value usually
quoted for Avogadro’s number. In this book, however, we use the kilomole
exclusively, so that the proper value for Avogadro’s number will be that expressed in
Eq. (17-13).
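As a quick numerical check of the argument above, the reciprocal of the atomic mass unit in kilograms can be evaluated directly. This is only a sketch; the variable names are illustrative, and because the quoted value of u is rounded to four figures, the result agrees with Eq. (17-13) to about three significant figures.

```python
U_KG = 1.661e-27   # atomic mass unit in kg, value quoted in the text

# Number of molecules in 1 kmol is the reciprocal of u expressed in kg.
avogadro_per_kmol = 1.0 / U_KG

# One mole contains 1/1000 as many molecules as one kilomole.
avogadro_per_mol = avogadro_per_kmol * 1e-3

print(f"molecules per kmol: {avogadro_per_kmol:.3e}")
print(f"molecules per mol:  {avogadro_per_mol:.3e}")
```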

Since  1 kmol  of  any  substance  contains  A molecules,  n kmol  of  the 
same  substance  must  contain  nA  molecules.  If  we  call  the  total  number  of 
molecules  present  N,  it  follows  that  N = nA.  Substituting  this  value  of  N 
into  the  ideal-gas  law,  pV  = NkT,  we  have 

pV  = nAkT 

But  Avogadro’s  number  A and  Boltzmann’s  constant  k are  both  universal 
constants.  So  we  may  as  well  lump  them  together  into  a single  constant.  We 
define  the  universal  gas  constant  R to  be 

R = Ak  (17-14a)

The numerical value of R is

R = 6.022 × 10²⁶ × 1.3807 × 10⁻²³ J/K

or

R = 8.314 × 10³ J/K  (17-14b)

Expressing  the  ideal-gas  law  in  terms  of  R , we  have 

pV  = nRT  (17-15) 

Unlike Boltzmann’s constant, the universal gas constant R can be
determined directly by measuring the pressure p, the volume V, and the
temperature T of a macroscopic sample containing a known number n of kilomoles
of  some  gas  under  conditions  in  which  its  behavior  approximates  that  of  an 
ideal  gas. 
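The two routes to R described above, the product Ak and a macroscopic measurement of p, V, n, and T, can be compared in a short sketch. The function name and the sample data are assumptions made for illustration: the figures correspond roughly to 1 kmol of a nearly ideal gas at 273.15 K and atmospheric pressure, not to a measurement reported in the text.

```python
A = 6.022e26               # Avogadro's number, molecules per kmol, Eq. (17-13)
K_BOLTZMANN = 1.3807e-23   # Boltzmann's constant in J/K

R = A * K_BOLTZMANN        # universal gas constant, Eq. (17-14a)

def gas_constant_from_measurement(p, V, n, T):
    """Infer R from p (Pa), V (m^3), n (kmol), and T (K) using pV = nRT."""
    return p * V / (n * T)

# Illustrative (assumed) data: 1 kmol of a nearly ideal gas at 273.15 K and
# 1.013e5 Pa occupies roughly 22.4 m^3.
R_measured = gas_constant_from_measurement(p=1.013e5, V=22.4, n=1.0, T=273.15)

# The macroscopic and microscopic forms agree when nR = Nk.
n = 2.5                    # kmol, arbitrary sample size
N = n * A                  # number of molecules in the sample

print(f"R from A*k     = {R:.1f} J/K")
print(f"R from p, V, T = {R_measured:.1f} J/K")
print(f"nR = {n * R:.4e} J/K, Nk = {N * K_BOLTZMANN:.4e} J/K")
```

The two values of R differ by less than one percent here only because the assumed volume is rounded; the sketch is meant to show the procedure rather than a precise determination.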

The universal gas constant R plays the same role in equations
describing the macroscopic behavior of a gas as Boltzmann’s constant k plays in
equations  describing  the  microscopic  behavior  of  the  gas.  The  relation 


17-4  The  Equation  of  State  of  an  Ideal  Gas  755 


between  them  thus  provides  a vital  bridge  between  the  macroscopic  and 
microscopic  views  of  matter.  This  relation  can  be  obtained  by  comparing 
Eq.  (17-15)  with  the  microscopic  form  of  the  ideal-gas  law  given  by  Eq. 
(17-12),  pV  = NkT.  It  is  evident  from  this  comparison  that  the  relation  is 

nR  = Nk  (17-16) 

That is, the product of the number of kilomoles present in a sample of gas and the
universal gas constant is equal to the product of the number of molecules present in the
same sample and Boltzmann’s constant.


From  the  point  of  view  of  describing  completely  what  are  called  the 
thermodynamic  properties  of  any  system  consisting  of  an  assemblage  of 
matter,  it  often  suffices  to  specify  the  pressure,  volume,  amount  of  matter 
present, and temperature of the system. When these quantities are
specified, it is said that the state of the system is specified. An equation which
relates these four quantities, as Eq. (17-12) or Eq. (17-15) does when the
system comprises an approximately ideal gas, is called an equation of
state. For matter in any form other than an ideal gas, the equation of state
is always more complicated. In the case of solids, liquids, polymers (such as
commercial plastics), or other more complex forms of matter, it may be
very complicated indeed. This is a consequence of the fact that the
intermolecular forces are complicated. We will see in Chap. 18 that it is possible
to derive the equation of state of an ideal gas from first principles alone,
that is, from simple calculations involving the mechanics of particles. It is
this fact, taken together with the fact that the behavior of real gases is often
well approximated by the behavior of the ideal gas, which makes it
worthwhile to study ideal gases at length. In more complicated cases it is usually
not possible to derive the equation of state entirely from first principles,
and the equation retains some of the qualities of an empirical rule.
Nevertheless, such quasi-empirical equations of state can be very useful not only
in calculating practical results, but also in leading to a deeper
understanding of the structure of matter.


17-5  THERMAL EXPANSION OF SOLIDS AND LIQUIDS


We saw in Sec. 17-3 that at constant pressure the volume of a gas increases
by about 1 part in 300 for every 1-degree increase in temperature. Solids
and liquids usually expand with increasing temperature, too, although at a
considerably smaller rate. Typically, a 1-degree change in temperature
produces a variation in the length of a piece of solid material of the order
of 1 part in 10⁵. This is not a large change, but the relative incompressibility
of solids (and of liquids as well) makes this small change manifest itself in
substantial forces, if the material is suitably confined.

As we have done before, we begin our inquiry in a purely
phenomenological way. Knowing as we do that solids expand when heated, we
attempt to make this knowledge quantitative. When a rod of length l is
heated through a temperature change ΔT, its length increases by an
amount Δl. Observation shows that for a very large class of solids, at
temperatures within the realm of ordinary experience, the fractional
expansion Δl/l is quite closely proportional to ΔT. This relation can be written in
the form of the equation




$$\frac{\Delta l}{l} = \alpha\,\Delta T$$  (17-17a)


Table 17-1  Typical Coefficients of Linear Expansion

Material                       T (in K)         α (in 10⁻⁶ K⁻¹)
Aluminum                       293              25.5
Calcite                        273-358
  Parallel to crystal axis                      25.1
  Perpendicular to axis                         5.6
Copper                         298-373          16.8
Hard rubber                    298-308          84.2
Glass (soft)                   300 (approx.)    8.5
Invar                          293              0.9
Steel                          313              10.5 (typical)
Quartz (fused)                 273-303          0.42
Wood                           275-307
  Along grain                                   2.5-6.6
  Across grain                                  26-54

where α is a proportionality constant. If we consider an infinitesimal
temperature change dT instead of the finite temperature change ΔT, the
increase in length of the bar will be the infinitesimal quantity dl. Under these
circumstances, Eq. (17-17a) assumes the form dl/l = α dT. We solve this
equation for the proportionality constant α and obtain

$$\alpha = \frac{1}{l}\,\frac{dl}{dT}$$  (17-17b)

This equation may be regarded as the definition of the quantity α, which is
called the coefficient of linear expansion. Its value may be determined
empirically for each material. Table 17-1 gives some typical values of α.
Several points are apparent from inspection of the table. The first, and most
striking, is how little the coefficient of expansion varies from material to
material. Metals generally have relatively small values of α, and nonmetals
have larger ones. Polymers (an example in the table is hard rubber) tend to
have rather large values of α, but wood is an exception to this general
statement. Finally, anisotropic substances (such as calcite) have different values
of α for different directions. However, we consider only isotropic
substances quantitatively.

Example 17-4 explores some of the mechanical aspects of thermal
expansion.


EXAMPLE 17-4

A rod made of the steel listed in Table 17-1 is 2.50 m long at 300 K and has a
circular cross section of diameter 2.00 cm.

a. Find the increase in length when the temperature is increased to 350 K.

■ From Eq. (17-17a) you have

$$\Delta l = \alpha l\,\Delta T$$

Taking the value of the linear expansion coefficient α given for steel in Table 17-1,
together with the values given for the rod length l and the temperature change ΔT,
you find from this equation the length change

$$\Delta l = 10.5 \times 10^{-6}\ \mathrm{K^{-1}} \times 2.50\ \mathrm{m} \times 50\ \mathrm{K} = 1.31 \times 10^{-3}\ \mathrm{m} = 1.31\ \mathrm{mm}$$

Such a change is rather easy to detect.




b. If the rod is rigidly clamped in a strong holder at 300 K and the rod (but not
the bulk of the holder) is heated to 350 K, find the stress along the axis of the rod
and the magnitude of the force required to hold it. Take Young’s modulus to be
Y = 2.00 × 10¹¹ N/m², and assume that the rod is not stressed beyond its elastic
limit.

■ If the rod were not clamped, its length would be increased by 1.31 × 10⁻³ m.
The uniaxial stress σ in the rod is the same as if the rod had been allowed to expand
freely and had then been squeezed back to its original length. This process would
involve a strain ε = Δl/l. And since Young’s modulus is defined to be Y = σ/ε, you
have

$$\sigma = \epsilon Y = \frac{\Delta l}{l}\,Y$$

Using the given numerical values of l and Y and the value of Δl calculated in part a,
you obtain

$$\sigma = \frac{1.31 \times 10^{-3}\ \mathrm{m}}{2.50\ \mathrm{m}} \times 2.00 \times 10^{11}\ \mathrm{N/m^2} = 1.05 \times 10^{8}\ \mathrm{N/m^2}$$

The magnitude F of the force is the product of the stress and the cross-sectional
area. The area is πr², and the radius r is one-half the rod diameter, 2.00 cm. So you
have

$$F = 1.05 \times 10^{8}\ \mathrm{N/m^2} \times [\pi \times (1.00 \times 10^{-2}\ \mathrm{m})^2] = 3.30 \times 10^{4}\ \mathrm{N}$$

This is more than 3 tons.

c. How much mechanical energy is stored in the rod by heating it? That is, how
much mechanical work can it do when it is unclamped with the temperature at
350 K?

■ If you assume that the rod is not compressed beyond its elastic limit, it will
obey Hooke’s law when allowed to expand. According to Eq. (7-58), the potential
energy stored in such a system is given by

$$U = \tfrac{1}{2}\,k\,(\Delta l)^2$$

where k is the force constant given by

$$k = \frac{F}{\Delta l}$$

Combining the two equations, you have

$$U = \tfrac{1}{2}\,F\,\Delta l$$

The numerical value is

$$U = \tfrac{1}{2} \times 3.30 \times 10^{4}\ \mathrm{N} \times 1.31 \times 10^{-3}\ \mathrm{m} = 21.6\ \mathrm{J}$$
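The three parts of Example 17-4 chain together, so they are convenient to check in one short script. This is a minimal sketch using the numbers given in the example; the variable names are illustrative.

```python
import math

alpha = 10.5e-6            # linear expansion coefficient of steel, 1/K (Table 17-1)
length = 2.50              # rod length in m
diameter = 2.00e-2         # rod diameter in m
delta_T = 350.0 - 300.0    # temperature change in K
young = 2.00e11            # Young's modulus in N/m^2

# Part a: free expansion, from Eq. (17-17a).
delta_l = alpha * length * delta_T

# Part b: stress needed to squeeze the rod back to its original length,
# and the corresponding force on the circular cross section.
stress = (delta_l / length) * young
area = math.pi * (diameter / 2) ** 2
force = stress * area

# Part c: elastic energy stored, U = (1/2) F delta_l.
energy = 0.5 * force * delta_l

print(f"delta_l = {delta_l * 1e3:.2f} mm")
print(f"stress  = {stress:.3e} N/m^2")
print(f"force   = {force:.3e} N")
print(f"energy  = {energy:.1f} J")
```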


Fig. 17-5 An isotropic material is fabricated into a cube of side l and
volume l³, as shown by the solid lines. When its temperature is increased by an
amount ΔT, it expands. According to Eq. (17-17a), each side of original length
l expands by an amount Δl = lα ΔT, so that its final length is
l + Δl = l(1 + α ΔT). Thus the cube now has an increased volume (l + Δl)³,
as shown by the dashed lines. The fractional change in volume can be expressed
in terms of a volume coefficient of expansion γ. As explained in the text,
γ = 3α.

In the case of a long, thin rod, the linear expansion is of greatest
interest. But the rod expands in girth as well as in length. For a solid of more
general shape, and for all fluids, we are interested primarily in the increase
in volume rather than that of a specific dimension. Like the fractional
linear expansion, the fractional volume expansion (that is, the ratio of the
volume change ΔV to the original volume V) is for very many materials
proportional to the temperature change ΔT. Thus, in analogy to Eq.
(17-17a) we can write


$$\frac{\Delta V}{V} = \gamma\,\Delta T$$  (17-18a)


where γ is a proportionality constant. Again considering an infinitesimal
temperature change dT instead of the finite change ΔT, we find that Eq.
(17-18a) assumes the form dV/V = γ dT. Solving for the proportionality
constant γ, we obtain

$$\gamma = \frac{1}{V}\,\frac{dV}{dT}$$  (17-18b)

This equation may be regarded as the definition of the quantity γ, which is
called the coefficient of volume expansion, or the bulk expansion
coefficient.

For an isotropic solid or a liquid, there is a simple relation between γ
and α. Consider the cube of side l shown in Fig. 17-5. Its original volume is
V = l³. As its temperature is increased, its volume expands at a rate

$$\frac{dV}{dT} = \frac{d(l^3)}{dT}$$

Evaluating the derivative of l³ with respect to T, we have d(l³)/dT =
3l² dl/dT, or

$$\frac{dV}{dT} = 3l^2\,\frac{dl}{dT}$$

Inserting this value of dV/dT into Eq. (17-18b) and again using the fact that
V = l³, we have

$$\gamma = \frac{1}{l^3}\,3l^2\,\frac{dl}{dT} = 3\,\frac{1}{l}\,\frac{dl}{dT}$$

But according to Eq. (17-17b), α = (1/l)(dl/dT). Thus we have

$$\gamma = 3\alpha$$  (17-19)
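The relation between the volume and linear coefficients can also be checked numerically by expanding a cube and comparing the fractional volume change with three times the fractional length change. The sketch below uses the value of the linear coefficient listed for copper in Table 17-1 and an arbitrary 1-m cube; the variable names are illustrative.

```python
alpha = 16.8e-6    # linear expansion coefficient of copper, 1/K (Table 17-1)
delta_T = 1.0      # temperature change in K
side = 1.0         # original side of the cube in m (arbitrary)

# Each side grows to l(1 + alpha * delta_T), so the volume grows to its cube.
new_side = side * (1 + alpha * delta_T)
fractional_volume_change = (new_side**3 - side**3) / side**3

# Volume coefficient inferred from the finite change, compared with 3 * alpha.
gamma_estimate = fractional_volume_change / delta_T
relative_error = abs(gamma_estimate - 3 * alpha) / (3 * alpha)

print(f"gamma estimate = {gamma_estimate:.6e} per K")
print(f"3 * alpha      = {3 * alpha:.6e} per K")
print(f"relative error = {relative_error:.1e}")
```

The small residual comes from the terms of second and third order in α ΔT, which the derivative-based argument drops in the limit of an infinitesimal temperature change.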

Table 17-2 lists the volume coefficients of expansion γ for selected liquids
at T = 293 K.

Example  17-5  discusses  the  operation  of  the  familiar  mercury-in-glass 
thermometer  described  in  Fig.  17-1  in  terms  of  the  coefficient  of  volume 
expansion. 


Table 17-2  Coefficient of Volume Expansion for Typical Liquids at 293 K

Substance                    γ (in 10⁻⁶ K⁻¹)
Ethanol (grain alcohol)      1120
Bromine                      1132
Glycerine                    505
Mercury                      181.9
Water                        207




EXAMPLE 17-5

Figure 17-6 shows a mercury-in-glass thermometer whose bulb has a volume V =
75.0 mm³. If the capillary tube has a diameter of 0.100 mm, how far will the end of
the mercury column move when the temperature, which is initially near room
temperature, increases by an amount ΔT = 1.00 K? Neglect the expansion of the glass
and the contribution to the total mercury volume V of the small amount of mercury
already in the capillary tube.

Fig. 17-6 Schematic drawing of a mercury-in-glass thermometer, discussed in
Example 17-5. (Labels in the figure: vacuum, glass envelope, mercury, ΔV.)

■ You have from Eq. (17-18a) a volume increase ΔV given by

$$\Delta V = \gamma V\,\Delta T$$

Using the value of the coefficient of volume expansion γ given in Table 17-2 for
mercury near room temperature, you have

$$\Delta V = 181.9 \times 10^{-6}\ \mathrm{K^{-1}} \times 75.0\ \mathrm{mm^3} \times 1.00\ \mathrm{K} = 1.36 \times 10^{-2}\ \mathrm{mm^3}$$

If the radius of the capillary is r, the volume of a segment of the capillary
having length Δl is πr² Δl. Thus the length required to accommodate the additional
mercury volume is

$$\Delta l = \frac{\Delta V}{\pi r^2}$$

Using this expression to calculate the numerical value of Δl, you obtain

$$\Delta l = \frac{1.36 \times 10^{-2}\ \mathrm{mm^3}}{\pi \times (5.00 \times 10^{-2}\ \mathrm{mm})^2} = 1.73\ \mathrm{mm}$$

Why do you not need to convert the volume into units of cubic meters before
carrying out the calculation?
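The thermometer calculation can be sketched in Python, working in millimeters throughout as the example does. The variable names are illustrative, and the expansion of the glass is neglected, as in the example.

```python
import math

gamma = 181.9e-6            # volume expansion coefficient of mercury, 1/K (Table 17-2)
bulb_volume = 75.0          # bulb volume in mm^3
capillary_diameter = 0.100  # capillary diameter in mm
delta_T = 1.00              # temperature rise in K

# Extra mercury volume produced by the temperature rise.
delta_V = gamma * bulb_volume * delta_T

# Length of capillary needed to hold that extra volume: delta_V / (pi r^2).
radius = capillary_diameter / 2
rise = delta_V / (math.pi * radius**2)

print(f"delta_V = {delta_V:.3e} mm^3")
print(f"rise    = {rise:.2f} mm")
```

Because the script carries the unrounded volume change through to the end, its result can differ in the last digit from a hand calculation that first rounds the volume change to three significant figures.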


It should be noted that the expansion coefficients α and γ are
dependent on temperature. As has been done in Tables 17-1 and 17-2, the
temperature range over which their values have been measured must always be
specified.  (Where  a single  temperature  is  specified,  the  temperature  range 
of  measurement  was  small  and  no  variation  of  the  coefficient  over  that 
range  was  detected.) 

There are exceptions to the general rule that materials expand when
heated. Most notable is water, which contracts by a fractional volume of
about 1 part in 10⁴ when the temperature increases from its melting point
of 0°C to 4°C, where water attains its maximum density and begins to
expand “normally.” Materials which exhibit this behavior are called icelike.
This is because they all are characterized by the fact that the solid is less
dense than the liquid (remember that ice floats!). Other than water, the
most common icelike materials are the heavy metals bismuth and
antimony. The icelike property is turned to advantage by including these
metals in alloys used for precision casting, notably in printing. Most
materials contract on freezing and cooling and thus produce castings with
rounded edges. But type metal (an alloy of lead, tin, antimony, and
sometimes bismuth and copper) expands on freezing, forcing its way into the
corners of the mold and producing sharp castings.

A final  generalization  may  be  made  concerning  expansion  coefficients 
of  materials.  They  all  tend  to  zero  at  low  temperatures.  We  will  see  in  Chap. 
19  that  this  is  a consequence  of  a fundamental  property  of  matter. 




17-6  HEAT

Heat and temperature are very closely related concepts, and they are often
confused. Part of this confusion arises from the nomenclature we use,
which we inherit from a day when temperature as an independent idea did
not exist at all. When we raise the temperature of an object, for example,
we say that we are making it hotter. There is nothing wrong with that,
provided we know what we are talking about.

A large  part  of  the  confusion  between  heat  and  temperature  comes 
from the intimate qualitative connection between “putting heat into” an
object (whatever that may mean in the microscopic sense) and raising its
temperature. This is analogous to the connection between putting water into a
container and raising its water level, two things which are related but are
not the same. Indeed, the analogy between raising the water level in a
container by adding water and raising the temperature of an object by heating
it,  is  very  useful  in  understanding  what  happens  from  a phenomenological 
point  of  view.  But  like  all  analogies,  it  must  ultimately  fail  when  it  is  pushed 
too  far.  This  failure  and  the  deeper  understanding  of  the  nature  of  heat 
which  arises  from  it  are  discussed  in  Sec.  17-7. 

The distinction between heat and temperature was first made clear by
the  Scottish  chemist  and  physician  Joseph  Black  (1728-1799).  We  now 
make  that  distinction  in  phenomenological  terms. 

It  takes  a greater  volume  of  water  to  raise  the  water  level  in  a container 
of  large  cross-sectional  area  by  a certain  amount  than  to  raise  the  water 
level  by  the  same  amount  in  a container  of  small  cross-sectional  area.  We 
will  use  this  commonplace  observation  to  illustrate  by  analogy  the  fact  that 
one object requires more of the quantity called “heat” to raise its
temperature by a certain amount than does another object. That is, what is true of
the water capacity of containers is also true of the “heat capacity” of objects
in general. In the case of adding water to containers, we can measure
directly both the volume of the added water and the change in water level
which results from this addition in a specific container. In the analogous
case, where heat is “added” to a body, we can measure the resulting change
in temperature by using a thermometer. But there is no direct way to
measure the “added heat” as we can measure the added volume of water.
Rather, we must work backward from the measurement of temperature
change to infer the change in “heat content” of an object when its
temperature is changed.

Let us carry the analogy a little farther. If a container has a
cross-sectional area a(y) at level y, the volume of additional water ΔV required to
fill it from some initial level y_i to a final level y_f is

$$\Delta V = \int_{y_i}^{y_f} a(y)\,dy$$

In the same way, the amount of heat ΔH required to raise the temperature
of an object from an initial value T_i to a final value T_f is given by the
expression

$$\Delta H = \int_{T_i}^{T_f} C(T)\,dT$$  (17-20)

The empirical quantity C(T) is called the heat capacity of the object. In
general, it is a function of the temperature T of the object.
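When the heat capacity varies with temperature, the integral defining the heat required generally has to be evaluated numerically. The sketch below does this with the trapezoidal rule. The function names and the linearly rising heat capacity are hypothetical, chosen only so that the result can be checked by hand; they do not describe any real material.

```python
def heat_required(heat_capacity, T_i, T_f, steps=1000):
    """Approximate the integral of C(T) dT from T_i to T_f (trapezoidal rule)."""
    dT = (T_f - T_i) / steps
    total = 0.0
    for k in range(steps):
        T_left = T_i + k * dT
        # Average the heat capacity over each small temperature step.
        total += 0.5 * (heat_capacity(T_left) + heat_capacity(T_left + dT)) * dT
    return total

def example_capacity(T):
    """Hypothetical heat capacity in kcal/K, rising linearly with T (in K)."""
    return 2.0 + 0.01 * (T - 300.0)

delta_H = heat_required(example_capacity, 300.0, 350.0)
delta_H_constant = heat_required(lambda T: 2.0, 300.0, 350.0)

print(f"variable C: {delta_H:.2f} kcal")
print(f"constant C: {delta_H_constant:.2f} kcal")
```

For this assumed C(T) the integral can be done exactly: 2.0 × 50 + 0.005 × 50² = 112.5 kcal, against 100 kcal when C is held at its initial value.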

Unlike the volume change ΔV and the cross-sectional area a(y), which
we understand outside the context of the water analogy, the quantities ΔH
and C(T) have meaning only in terms of Eq. (17-20). At this point,
therefore, we can define ΔH and C(T) only relative to some standard object. To
do this, we return to the water analogy. Suppose for some reason we could
not measure the volume of a quantity of water directly, but only the water
levels in a series of containers. We could still calibrate the cross-sectional
areas of the containers in terms of one standard container. We might, for
example, siphon water from the standard container into another arbitrary
container until the water levels were the same and water ceased to flow.
Suppose that the water-level change Δy in the arbitrary container and the
level change Δy_s in the standard container are sufficiently small that both
cross-sectional areas are essentially constant. We define “1 area unit” to be
the cross-sectional area of the standard container. In terms of this unit, the
cross-sectional area of the arbitrary container is a area units.

Since the increase in the volume of water in one of the containers must
be equal to the decrease in the volume of water in the other, we can
equate their magnitudes. This gives us

$$(a\ \text{area units})\,\Delta y = (1\ \text{area unit})\,\Delta y_s$$

Solving for a, we obtain

$$a = \frac{\Delta y_s}{\Delta y}$$

In  the  case  of  heat,  there  is  an  operation  analogous  to  siphoning  water 
from  a container  having  a higher  water  level  to  another  container  having  a 
lower water level. This operation is the placing in close contact of two
objects having different temperatures. The temperature of the “hotter”
object will decrease, and that of the “colder” object will increase until their
temperatures  are  the  same.  (If  the  two  objects  are  both  quantities  of  water, 
for  example,  this  operation  can  be  accomplished  simply  by  mixing  them.) 

The standard “container” is taken to be 1 kg of water at 15°C. (This
temperature is chosen in part because of its convenience and in part
because the heat capacity of water changes relatively slowly with temperature
at this temperature.) In analogy with the equation displayed immediately
above, which gives the value of the cross-sectional area a of an arbitrary
container relative to a standard container, the heat capacity C of an object in
the temperature range close to 15°C is found by measuring its temperature
change ΔT and the temperature change ΔT_s of 1 kg of water when the two
are brought into close contact until their temperatures are the same. We
have

$$C = \frac{\Delta T_s}{\Delta T}$$

In the light of the preceding discussion, we can define “quantity of
heat.” The quantity of heat required to raise the temperature of 1 kg of water from
14.5°C to 15.5°C is called a kilocalorie (kcal). It is sometimes also called a large
Calorie (Cal); this is the dietician’s calorie.


The  above  definition  of  the  kilocalorie  is  no  longer  the  primary  one.  You  will 
see in Sec. 17-7 that heat is a form of energy, and the kilocalorie is therefore
defined in terms of the joule. But for this purely phenomenological discussion, the
definition  above  is  adequate. 

Another  unit  frequently  used  is  the  calorie,  or  small  calorie  (cal).  Its  value  is 




one-thousandth that of the kilocalorie, so that 1 cal = 10⁻³ kcal. How much water
can  be  raised  in  temperature  from  14.5°C  to  15.5°C  by  1 cal  of  heat? 

The British thermal unit (Btu), still used in U.S. engineering practice, is the
amount of heat required to raise the temperature of 1 pound of water from 63°
Fahrenheit to 64° Fahrenheit. In terms of the kilocalorie, its value is 1 Btu = 0.252 kcal.
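The relations among these heat units are easy to encode. The sketch below also checks the quoted size of the Btu against its definition. Two inputs are assumptions not taken from the text: the pound is taken as 0.4536 kg, and the specific heat capacity of water near 63°F is taken to be 1 kcal/(kg·K). The function names are illustrative.

```python
CAL_PER_KCAL = 1000.0    # 1 kcal = 1000 cal
KCAL_PER_BTU = 0.252     # value quoted in the text

def kcal_to_cal(kcal):
    return kcal * CAL_PER_KCAL

def btu_to_kcal(btu):
    return btu * KCAL_PER_BTU

# Rough check of the Btu from its definition (assumed conversion factors):
KG_PER_POUND = 0.4536            # assumed mass of 1 pound in kg
KELVIN_PER_FAHRENHEIT = 5.0 / 9.0
WATER_KCAL_PER_KG_K = 1.0        # assumed constant near room temperature
btu_estimate_kcal = KG_PER_POUND * KELVIN_PER_FAHRENHEIT * WATER_KCAL_PER_KG_K

print(f"1 Btu  = {btu_to_kcal(1.0):.3f} kcal (quoted)")
print(f"1 Btu ~= {btu_estimate_kcal:.3f} kcal (from definition)")
print(f"10 Btu = {kcal_to_cal(btu_to_kcal(10.0)):.0f} cal")
```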

The heat capacity C(T) of all substances changes abruptly and
significantly when they undergo melting, boiling, or similar phase changes. Even
in the absence of such changes, the heat capacities of all substances decrease
rapidly with decreasing temperature when the temperature is low enough.
(For most substances, “low enough” means at temperatures well below
room temperature.) Otherwise, however, the heat capacity for most
substances varies quite slowly with temperature and can therefore be regarded
as constant for many practical purposes. When this is the case, Eq. (17-20)
can be simplified to obtain

$$\Delta H = C \int_{T_i}^{T_f} dT = C\,(T_f - T_i)$$

Calling ΔT = T_f − T_i, we write this in the compact form

$$\Delta H = C\,\Delta T$$  (17-21)

In  the  particular  case  where  C has  the  value  appropriate  to  a sample  of 
matter  consisting  of  1 kg  of  water  in  the  temperature  range  between  14.5°C 
and 15.5°C, Eq. (17-21) becomes the definition of quantity of heat ΔH. To see
this, compare Eq. (17-21) applied to this special case with the italicized
statement used to define the quantity of heat called a kilocalorie.


It seems plausible (and it is borne out by experimentation) that the
heat capacity of a homogeneous object is directly proportional to its mass m.
We therefore define the specific heat capacity c of a substance as its heat
capacity per unit mass:

$$c = \frac{C}{m}$$  (17-22)

In terms of this quantity (which is the one invariably tabulated) we can
rewrite Eq. (17-21) in the form

$$\Delta H = cm\,\Delta T$$  (17-23)

That is, the quantity of heat ΔH required to change the temperature of a
homogeneous object whose mass is m, and which is made of a substance whose specific heat
capacity is c, by an amount ΔT is given by the product of the specific heat capacity, the
mass, and the temperature change. If the temperature dependence of c is not
negligible, we can use an equation like Eq. (17-22) to rewrite Eq. (17-20) in
the more general form

$$\Delta H = m \int_{T_i}^{T_f} c(T)\,dT$$  (17-24)
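For a constant specific heat capacity, the heat required is simply the product of the specific heat capacity, the mass, and the temperature change. The sketch below applies this to a few substances, using specific heat capacities in kcal/(kg·K) that are numerically equal to the specific heat ratios listed in Table 17-3. The function name and the sample mass and temperature change are illustrative.

```python
# Specific heat capacities in kcal/(kg K), numerically equal to the
# specific heat ratios in Table 17-3.
SPECIFIC_HEAT = {
    "water": 1.0,
    "aluminum": 0.214,
    "copper": 0.0921,
    "lead": 0.0306,
}

def heat_required_kcal(substance, mass_kg, delta_T):
    """Heat in kcal needed to change the temperature by delta_T (K),
    assuming the specific heat capacity is constant over that range."""
    return SPECIFIC_HEAT[substance] * mass_kg * delta_T

# Heat needed to warm 2.00 kg of each substance by 30.0 K.
for name in SPECIFIC_HEAT:
    print(f"{name:9s}: {heat_required_kcal(name, 2.00, 30.0):7.3f} kcal")

copper_heat = heat_required_kcal("copper", 2.00, 30.0)
```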

The units of specific heat capacity are kilocalories per kilogram-kelvin
[kcal/(kg·K)] when SI units are used in defining it phenomenologically,
as we have just done. Tables often quote the specific heat capacity in units
of calories per gram-degree Celsius [cal/(g·°C)]. However, the specific heat
capacity of any substance has the same numerical value in either set of units.
Can you see why? The numerical value of the quantity c is also often given
in terms of the specific heat ratio, that is, the ratio of the specific heat
capacity of a substance to that of water at 288 K = 15°C. This ratio also has
the same numerical value as the specific heat capacity, since for water at that
temperature we have by definition c = 1 kcal/(kg·K) = 1 cal/(g·°C). The
specific heat ratio is dimensionless since it is the ratio of two specific heat
capacities. (You have probably encountered the specific heat ratio in your
previous studies under the name “specific heat.” We do not use this name
because it is imprecise and tends to be confusing.) Some typical specific heat
ratios are given in Table 17-3. The specific heat ratios of different substances
range over about two orders of magnitude for temperatures in the vicinity
of room temperature.

Table 17-3  Specific Heat Ratios of Selected Substances

Substance            T (in K)     Specific heat ratio
Water                288          1 (by definition)
Ice                  271          0.502
Steam (1 atm)        383          0.481
Aluminum             293          0.214
Bromine
  Solid              260          0.088
  Liquid             286-318      0.107
Copper               293          0.0921
Gold                 291          0.0312
Lead                 293          0.0306
Lithium              373          1.041
Mercury              293          0.03325
Sodium chloride      273          0.204
Ammonia (liquid)     293          1.125
Ethanol              298          0.581

Equation (17-23), ΔH = cm ΔT, was derived on the basis of an
analogy between a heat experiment and an experiment involving containers of
water. The latter was a thought experiment in which water was transferred
between a standard container and an arbitrary container. By doing this, it
was possible to determine the cross-sectional area of the arbitrary container
relative to that of the standard container. But the analogous heat
experiment is by no means a thought experiment. Rather, it is a standard method
for determining the heat capacity of an object. Examples 17-6 and 17-7
explore the principles of this method, which is called calorimetry (a word
derived from Latin and Greek roots meaning “heat measurement”).


EXAMPLE 17-6

A 5.00-kg  lump  of  lead  having  a temperature  of  90.0°C  is  dropped  into  an  insulated 
container  called  a calorimeter.  The  calorimeter  holds  10.00  kg  of  water  at  an  initial 
temperature  of  20.0°C.  Neglecting  the  heat  capacity  of  the  container,  find  the  final 
temperature  of  the  system. 

■ You know from experience that the system will come to equilibrium (that is,
no further temperature changes will take place in any part of the system) when all
parts of it are at the same final temperature T_f. If you neglect any flow of heat into
or out of the system comprising the lead and the water, all the heat that flows out of
the lead as it cools must flow into the water and warm it. Thus you have

$$\Delta H_{\text{lead}} = -\Delta H_{\text{water}}$$




Assuming the specific heat capacities of lead and water to be constant over the
temperature range of the experiment, you use Eq. (17-23) to obtain

$c_{\text{lead}}\, m_{\text{lead}}\, \Delta T_{\text{lead}} = -c_{\text{water}}\, m_{\text{water}}\, \Delta T_{\text{water}}$

Dividing both sides of this equation by the quantity $c_{\text{water}}$, you obtain

$\dfrac{c_{\text{lead}}}{c_{\text{water}}}\, m_{\text{lead}}\, \Delta T_{\text{lead}} = -m_{\text{water}}\, \Delta T_{\text{water}}$

The fraction on the left side of this equation is the specific heat ratio of lead, which
you can obtain from Table 17-3. In terms of the final system temperature $T_f$, the
temperature changes are

$\Delta T_{\text{lead}} = T_f - 90.0^\circ\text{C}$ and $\Delta T_{\text{water}} = T_f - 20.0^\circ\text{C}$

Using  these  numerical  quantities  together  with  the  given  masses  of  the  lead  and  the 
water,  you  have 

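$0.0306 \times 5.00\ \text{kg} \times (T_f - 90.0^\circ\text{C}) = -10.00\ \text{kg} \times (T_f - 20.0^\circ\text{C})$

or

$\dfrac{T_f - 90.0^\circ\text{C}}{T_f - 20.0^\circ\text{C}} = \dfrac{-10.00\ \text{kg}}{0.0306 \times 5.00\ \text{kg}} = -65.4$

or

$T_f - 90.0^\circ\text{C} = -65.4\,T_f + 1310^\circ\text{C}$

or

$66.4\,T_f = 1400^\circ\text{C}$

Thus

$T_f = \dfrac{1400^\circ\text{C}}{66.4} = 21.1^\circ\text{C}$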

Because of the relatively large specific heat capacity of the water, and because
the  mass  of  the  water  is  greater  than  that  of  the  lead,  the  temperature  change  of  the 
water  is  relatively  small.  Of  common  substances,  water  has  the  largest  specific  heat 
capacity,  while  metals  in  general  have  rather  small  specific  heat  capacities.  Among 
the  metals,  the  specific  heat  capacity  tends  to  decrease  with  increasing  density.  We 
return  to  this  point  in  Chap.  18. 
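The heat balance of Example 17-6 can also be solved once and for all for the final temperature. The following is a minimal sketch, assuming constant specific heats, no phase change, and a container of negligible heat capacity; the function name `final_temperature` and its argument names are illustrative choices, not notation from the text.

```python
def final_temperature(c_ratio, m_obj, t_obj, m_water, t_water):
    """Equilibrium temperature (deg C) of an object dropped into water.

    Solves  c_ratio * m_obj * (Tf - t_obj) = -m_water * (Tf - t_water)  for Tf,
    where c_ratio is the specific heat ratio of the object relative to water.
    Assumes constant specific heats, no phase change, and no heat exchange
    with the container or the surroundings.
    """
    water_equivalent = c_ratio * m_obj   # mass of water with the same heat capacity, kg
    return (water_equivalent * t_obj + m_water * t_water) / (water_equivalent + m_water)


# Data of Example 17-6: 5.00 kg of lead at 90.0 C, 10.00 kg of water at 20.0 C.
t_final = final_temperature(0.0306, 5.00, 90.0, 10.00, 20.0)
print(f"Tf = {t_final:.1f} C")
```

Written this way, the final temperature is a weighted average of the two initial temperatures, with the lead weighted by the mass of water that would have the same heat capacity. That is why the result lies so close to the initial temperature of the water.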


Fig.  17-7  Melting  a block  of  ice.  It  is 
evident  from  the  way  the  system  is  set 
up  that  there  is  a steady  flow  of  heat  into 
the  ice. 


When  heat  flows  into  or  out  of  an  object,  its  temperature  usually 
changes  in  a smooth  and  steady  fashion.  An  important  exception  to  this 
statement,  however,  is  the  phenomenon  called  change  of  phase.  The  most 
familiar  phase  changes  are  those  in  which  a substance  transforms  from 
solid to liquid (melting), from liquid to gas (evaporation or boiling), from solid
to gas (sublimation), from gas to solid or liquid (condensation), and from liquid
to solid (freezing). Such changes of phase violate our primitive observation
at the beginning of this section, according to which a flow of heat into
or  out  of  a substance  is  accompanied  by  a change  in  temperature.  This  may 
seem  paradoxical,  in  view  of  our  assertion  that  a flow  of  heat  can  be  de- 
tected only  indirectly,  through  a change  in  temperature.  However,  the  ad- 
dition or  extraction  of  heat  appears  to  be  required  for  the  change  of  phase 
itself,  when  it  occurs  at  constant  pressure.  This  can  be  seen  by  observing, 
for  example,  the  melting  of  a block  of  ice  into  which  heat  is  made  to  flow  at 
a constant  rate,  as  shown  in  Fig.  17-7.  As  the  ice  melts  and  water  appears, 
the  temperature  of  the  system  remains  at  0°C  (provided  the  system  is  well 
stirred  to  avoid  the  formation  of  hot  spots,  so  that  it  really  has  a well- 
defined  temperature).  The  amount  of  liquid  water  increases  at  a constant 
rate.  Only  when  the  last  bit  of  ice  disappears  does  the  temperature  begin  to 
rise. 
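The behavior just described can be sketched numerically. The example below is only an illustration under stated assumptions: it follows a sample that starts as ice below 0°C, uses the specific heat of ice quoted in Example 17-7 and the latent heat of melting from Table 17-4, treats the specific heats as constant, and does not model boiling. The function name `temperature_after_heating` is hypothetical.

```python
C_ICE = 0.502    # kcal/(kg C), specific heat of ice as used in Example 17-7
C_WATER = 1.0    # kcal/(kg C), specific heat of liquid water
L_MELT = 79.7    # kcal/kg, latent heat of melting of water (Table 17-4)


def temperature_after_heating(mass, t_start, heat_in):
    """Temperature (deg C) of a sample of ice after heat_in kcal has flowed in.

    The sample starts as ice at t_start <= 0 C. Boiling is not modeled, so the
    result is meaningful only while the water stays below 100 C.
    """
    heat_to_reach_zero = C_ICE * mass * (0.0 - t_start)
    if heat_in < heat_to_reach_zero:          # the ice is still warming
        return t_start + heat_in / (C_ICE * mass)
    heat_in -= heat_to_reach_zero
    heat_to_melt = L_MELT * mass
    if heat_in <= heat_to_melt:               # melting: temperature stays at 0 C
        return 0.0
    heat_in -= heat_to_melt                   # all the ice is gone; the water warms
    return heat_in / (C_WATER * mass)


# 1 kg of ice starting at -10 C, with heat supplied in equal steps of 20 kcal.
# With a constant rate of heat flow, equal steps of heat are equal steps of time.
for step in range(7):
    q = 20.0 * step
    print(f"{q:6.1f} kcal -> {temperature_after_heating(1.0, -10.0, q):6.2f} C")
```

The printed temperatures stay at 0°C over the whole range of heat input in which ice and water coexist, and rise again only after the full latent heat has been supplied.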




Table 17-4  Latent Heats for Selected Substances

Substance    Phase change    L (in kcal/kg)
---------    ------------    --------------
Water        Melting          79.7
             Boiling         539.6
Ammonia      Boiling         327.1
Copper       Melting          42
Lead         Melting           5.86
Mercury      Melting           2.82
             Boiling          65
Ethanol      Melting          24.9
             Boiling         204
Bromine      Boiling          43.7
Helium       Boiling           6.0
Nitrogen     Melting           6.09
             Boiling          47.6

Because the heat flowing into or out of the sample undergoing a phase
change seems to “go into hiding” (that is, it produces no temperature
change), it is called latent (that is, hidden) heat. When a phase change is carried
out on a sample of a particular substance at constant pressure, the
amount of heat $\Delta H$ required is proportional to the mass $m$ of the sample;
that is, $\Delta H \propto m$. The proportionality constant linking $\Delta H$ to $m$ is characteristic
of the substance and is called the latent heat $L$. In terms of the latent
heat, we can write the relation

$\Delta H = Lm$  (17-25)

The  units  of  L are  kilocalories  per  kilogram  (kcal/kg).  Table  17-4  gives  val- 
ues of  the  latent  heat  for  selected  substances.  Water  is  again  atypical  in  that 
it  has  rather  large  latent  heats  of  both  melting  and  boiling. 
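Equation (17-25) is simple enough to apply directly to the entries of Table 17-4. The sketch below copies a few of those entries into a dictionary; the dictionary layout and the function names `heat_for_phase_change` and `mass_changed` are illustrative assumptions, not part of the text.

```python
# Latent heats in kcal/kg, copied from Table 17-4 (only a few rows are included).
LATENT_HEAT = {
    ("water", "melting"): 79.7,
    ("water", "boiling"): 539.6,
    ("lead", "melting"): 5.86,
    ("nitrogen", "boiling"): 47.6,
}


def heat_for_phase_change(substance, change, mass):
    """Heat in kcal needed to take `mass` kg through the phase change, Eq. (17-25)."""
    return LATENT_HEAT[(substance, change)] * mass


def mass_changed(substance, change, heat):
    """Mass in kg that `heat` kcal can take through the phase change."""
    return heat / LATENT_HEAT[(substance, change)]


print(heat_for_phase_change("water", "melting", 1.00))   # heat to melt 1 kg of ice
print(heat_for_phase_change("water", "boiling", 1.00))   # heat to boil away 1 kg of water
print(mass_changed("lead", "melting", 58.6))             # lead melted by 58.6 kcal
```

The second function is the same relation solved for the mass. It is the form needed when the available heat is known and you want to find how much of the sample changes phase.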

In  Example  17-7  a calorimeter  is  used  to  illustrate  the  fact  that  the  heat 
required  to  melt  ice  can  come  from  the  surrounding  water,  which  is  cooled 
in  the  process. 


EXAMPLE  17-7  

A 3.00-kg block of ice has temperature −10.0°C. It is dropped into a calorimeter (an
insulated container of negligible heat capacity) holding 5.00 kg of water whose
temperature is 40.0°C. Will the ice all melt?

■ Before the ice can begin to melt, it must be warmed to 0°C. The temperature
change is $\Delta T_{\text{ice}} = 0^\circ\text{C} - (-10.0^\circ\text{C}) = 10.0^\circ\text{C}$. This requires a heat input $\Delta H_1$.
According to Eq. (17-23), it is given by

$\Delta H_1 = c_{\text{ice}}\, m_{\text{ice}}\, \Delta T_{\text{ice}}$

$= 0.502\ \text{kcal/(kg}\cdot{}^\circ\text{C)} \times 3.00\ \text{kg} \times 10.0^\circ\text{C} = 15.1\ \text{kcal}$

Once the ice has been warmed to its melting point, the process of melting requires
an additional heat input $\Delta H_2$. According to Eq. (17-25), this is given by

$\Delta H_2 = Lm$

$= 79.7\ \text{kcal/kg} \times 3.00\ \text{kg} = 239\ \text{kcal}$

The source of heat is the water. In cooling through a temperature change
$\Delta T_{\text{water}} = 0^\circ\text{C} - 40.0^\circ\text{C} = -40.0^\circ\text{C}$, the water gives up an amount of heat $\Delta H_3$.
Again using Eq. (17-23), you have

$\Delta H_3 = c_{\text{water}}\, m_{\text{water}}\, \Delta T_{\text{water}}$

$= 1\ \text{kcal/(kg}\cdot{}^\circ\text{C)} \times 5.00\ \text{kg} \times (-40.0^\circ\text{C}) = -200\ \text{kcal}$

The negative sign implies that heat is flowing out of the water. This heat made
available through the cooling of the water, $|\Delta H_3| = 200$ kcal, is not sufficient to melt all
the ice. Indeed, the heat still available when the ice begins to melt is 200 kcal
− 15.1 kcal = 185 kcal. This is enough to melt

$\dfrac{185\ \text{kcal}}{79.7\ \text{kcal/kg}} = 2.32\ \text{kg of ice}$

There will thus be 0.68 kg of ice left. Its temperature, and that of the surrounding
water, which comprises the original water and that produced by the melting ice, will
be 0°C.


The device called the calorimeter always operates on the principle
suggested by Examples 17-6 and 17-7. A system undergoes a process involving
the transfer out of or into it of a certain quantity of heat, which is to
be measured. The heat is made to flow into or out of a known quantity of
matter (usually a fluid and often water) having a known initial temperature
and specific heat ratio. In simple cases this is done as in the examples, by
immersing the system to be tested directly into the standard fluid. From the
temperature change of the standard fluid, the amount of transferred heat
can be calculated. In practice, corrections must be made for the heat capacity
of the calorimeter container itself and for heat flow across the insulating
barrier which isolates the calorimeter from its surroundings. Calorimetry is
used to measure not only heat capacities and latent heats of phase
changes, but also the heats of combustion of foods or fuels and the heat
flow involved in many other processes.


17-7  THE MECHANICAL EQUIVALENT OF HEAT

In  Sec.  17-6  we  developed  the  concept  of  heat  and  related  it  to  that  of  tem- 
perature. Through  such  experimentally  derived  quantities  as  the  heat 
capacity  and  the  thermal  expansion  coefficients,  the  two  concepts  become 
very  useful  in  describing  the  properties  of  matter  and  their  variation.  How- 
ever, nothing  has  yet  been  said  about  what  heat  is.  The  water  analogy  of 
Sec.  17-6  has  been  very  helpful  in  developing  the  concept  phenome- 
nologically. Thus  it  is  tempting  to  speculate  that  heat  is  itself  a kind  of  fluid, 
whose  flow  into  and  out  of  various  physical  objects  underlies  temperature 
changes,  phase  changes,  and  other  so-called  thermal  phenomena. 
Eighteenth-century  investigators  mostly  took  this  view  and  gave  the  conve- 
nient name  caloric  to  this  “heat  fluid”  or  “caloric  fluid.” 

In  what  follows,  we  consider  some  of  the  evidence  that  this  caloric 
theory  of  heat  fails  to  meet  the  test  of  experimental  verification.  The  same 
experimental  tests  give  strong  support  to  the  view  that  heat  is  a form  of  en- 
ergy. What  distinguishes  this  form  of  energy  from  others  is  that  it  involves 
random  motion  of  the  microscopic  parts  of  the  macroscopic  system  which 
“contains”  the  heat.  (The  concept  of  random  motion  is  discussed  in  detail 
in  Chap.  18.)  This  view  is  called  the  kinetic  theory  of  heat.  In  this  and  the  fol- 
lowing two  chapters,  you  will  see  its  power  to  interpret  physical  phenomena 
in  ever  broader  and  deeper  terms.  As  was  mentioned  at  the  beginning  of 
this  chapter,  it  makes  possible  an  understanding  of  phenomena  involving 
heat  entirely  in  terms  of  the  principles  of  mechanics. 




It  has  been  known  since  prehistoric  times  that  there  is  a connection 
between  friction  and  heat.  Primitive  people  have  made  fire  by  frictional 
means  for  at  least  600,000  years.  Nails  driven  into  wood  are  heated  sub- 
stantially; hammering  or  drilling  on  metal  can  produce  enough  heat  to 
make  the  metal  red-hot.  And  milkmaids  have  known  for  a long  time  that 
freshly  churned  butter  is  considerably  warmer  than  the  cream  from  which 
it  is  made.  (We  will  soon  see  the  elegant  scientific  use  to  which  Joule  put  this 
observation.) The idea which these observations suggest, namely that heat is a
form of motion, dates back at least as far as classical Greek times.

In  1620,  the  great  English  philosopher  Francis  Bacon  (1561-1626)  inquired 
into  the  nature  of  heat  on  the  basis  of  the  above  observations,  taken  together  with  a 
large  number  of  others.  His  explicit  purpose  was  to  make  the  study  of  heat  into  a 
model  science,  conforming  to  the  rules  of  scientific  investigation  which  he  had 
devised.  Although  his  method  was  open  to  question  from  the  modern  point  of 
view,  he  concluded:  “Heat  is  a motion,  expansive,  restrained,  and  acting  in  its 
strife  upon  the  smaller  particles  of  bodies.  ...  If  in  any  natural  body  you  can  ex- 
cite a[n]  . . . expanding  motion,  and  can  so  repress  this  motion  and  turn  it  back 
upon  itself,  . . . you  will  undoubtedly  generate  heat.” 

One way of elaborating Bacon’s point of view is to postulate that an increase
of temperature of a body implies its molecules are moving faster.
(This motion is random from molecule to molecule, so that the center of
mass of the body does not move at all.) It is possible to measure the mechanical
energy $\Delta E$ put into a system by friction. If the system is inside a
calorimeter, we can also find the increase $\Delta H$ in the “heat content” of the
system from the measured temperature increase and the known heat
capacity of the system. For that particular experiment, we can then write
the empirical relation

$\Delta H = J\,\Delta E$  (17-26)

where $J$ is the proportionality constant determined by the experiment. If $\Delta H$
is measured in kilocalories and $\Delta E$ in joules, the units of $J$ must be kilocalories
per joule (kcal/J).

Now suppose that the experiment is carried out with different systems
and with different friction mechanisms. And suppose that (within experimental
error) the value of J is always the same. Such a result constitutes
strong (though indirect) experimental evidence that heat is indeed a
microscopic form of mechanical energy. Moreover, Eq. (17-26) takes on the
character of a universal relation. The quantity J becomes the conversion
factor which relates the arbitrarily defined unit of “heat” (which is now
better called heat energy, or energy in the form of random microscopic
motion) to the fundamental unit of energy, the joule.

The earliest semiquantitative effort of note in this direction was the
series  of  experiments  performed  in  the  1780s  and  1790s  by  Rumford. 

Benjamin Thompson, Count Rumford (1753-1814), is one of the most
remarkable personages in the history of science. Born of poor parents in colonial
Massachusetts, he died a Count of the Holy Roman Empire. His second wife (he
had abandoned his first when he left the United States) was the widow of the
immortal chemist Lavoisier and was herself an accomplished chemist and leader of
Parisian intellectual society. Besides being a physicist, chemist, engineer,
nutritionist, and agronomist of the first rank, Rumford was an immensely versatile
inventor, a social engineer and reformer, a master Tory spy and a double agent, an
extraordinarily corrupt but resourceful politician, a military administrator and
strategist, a leading popularizer of science and technology, a philanthropist who
detested the common people, a maker of innumerable enemies, and a thoroughgoing
rogue. His biography reads like a novel; if offered as fiction, it would almost
certainly be rejected as too improbable.

One  of  Rumford’s  duties  as  director  of  the  Bavarian  state  arsenal  was 
to  oversee  the  boring  of  cannon.  In  this  process,  a solid  bronze  casting  was 
bored  out  to  the  proper  size  by  a cutting  tool  on  a lathelike  boring  machine. 
The  motive  power  for  the  machine  was  furnished  by  a team  of  horses 
through  a system  of  belts  and  pulleys.  As  the  boring  tool  cut  chips  out  of 
the  bronze,  much  heat  was  evolved.  According  to  the  caloric  theory  of  heat, 
this  came  about  because  the  metal  could  not  hold  as  much  caloric  when  cut 
into  thin  chips  as  it  had  in  the  form  of  a solid  chunk,  just  as  a sponge  sliced 
into  very  thin  slivers  could  not  hold  much  water. 

The  heat  produced  was  so  considerable  that  it  was  necessary  to  provide 
cooling.  This  was  done  simply  by  immersing  the  casting  and  the  cutting 
tool  in  a tank  of  water.  When  the  boring  had  proceeded  for  a while,  the 
water  became  hot  enough  to  boil.  Rumford  used  the  rate  of  boiling  as  a 
rough  indication  of  the  rate  of  evolution  of  heat. 

As  the  cutting  proceeds,  the  tool  gets  dull,  and  the  cutting  rate  de- 
creases even  though  the  horses  continue  to  work  at  the  same  rate.  Rumford 
noticed  that  the  rate  of  boiling  remained  constant  even  though  the  rate  at 
which  metal  chips  were  produced  (and  presumably  therefore  also  the  rate 
at  which  caloric  leaked  out)  diminished. 

Rumford  repeated  the  experiment  with  a completely  dull  tool  which 
did  not  cut  at  all.  Nonetheless,  the  water  continued  to  boil  at  the  same  rate. 
Indeed,  the  rate  of  boiling  appeared  to  have  more  to  do  with  the  rate  at 
which  the  horses  worked  than  with  the  details  of  how  the  tool  cut  the  metal. 
Rumford  argued  that  the  caloric  theory  could  not  account  for  these  experi- 
mental observations  satisfactorily.  In  particular,  he  could  continue  to  boil 
water  indefinitely  with  the  dull  tool.  It  was  not  consistent  with  the  theory  to 
argue  that  the  casting  was  an  inexhaustible  source  of  caloric. 

Rather,  Rumford  argued,  the  mechanical  work  performed  by  the 
horses  in  moving  the  tool  against  the  resistance  of  friction  was  transformed 
into  an  equivalent  amount  of  random  microscopic  motion,  or  heat. 
Although  Rumford  was  in  general  a meticulous  experimenter  who  made 
careful  measurements,  he  appears  never  to  have  made  any  effort  to  refine 
this  particular  experiment.  Nevertheless,  his  persistent  demonstrations, 
together  with  his  flair  for  showmanship,  were  of  great  importance  in  re- 
viving interest  in  the  kinetic  theory  of  heat — that  is,  the  view  that  heat  is 
nothing  more  or  less  than  a macroscopic  manifestation  of  random  microscopic  mo- 
tion. 


The best estimate of the value of the constant J in Eq. (17-26), made on
the basis of Rumford’s results, leads to the relation 1 kcal = 5700 J. This
value for the kilocalorie is about 35 percent higher than the modern value.
Accurate  measurements  of  the  mechanical  equivalent  of  heat  were  not 
made  until  some  decades  after  Rumford’s  work.  Between  1840  and  the 
1870s,  Joule  performed  a long  series  of  classical  experiments  in  which  dif- 
ferent forms  of  energy  were  converted  into  heat  in  a variety  of  ways. 

James Prescott Joule (1818-1889) was a member of a well-to-do family of
brewers in Manchester. He was partially crippled by a spinal ailment, and since he
was judged unfit to participate in the family business, he devoted himself to
scientific investigations. In this, he was perhaps the last of the gifted amateurs who had
dominated British physics from the death of Newton.

One  of  Joule’s  early  experimental  arrangements  is  shown  in  Fig.  17-8. 
The  weights  in  Fig.  77  of  this  engraving  turned  the  pulleys  and  thus  ro- 
tated the  paddles  in  the  vessel,  which  were  immersed  in  water  or  mercury. 
In  a related  experiment,  the  same  weights  rubbed  two  metal  parts  together. 
(These  are  the  small  plates  e and  b in  Joule’s  Fig.  75.)  From  the  heat  capac- 
ity of  the  liquid  and  the  container,  and  the  temperature  rise,  the  heat  was 
calculated  (various  corrections  were  made  for  radiation  losses  and  other 
small  effects).  The  mechanical  energy  input  could  be  calculated  from  the 
vertical  distance  traversed  by  the  known  weights.  Joule  concluded  that 
772 foot-pounds of mechanical work was equivalent to 1 Btu of heat. Restated
in terms of modern units, this equivalence is about 4150 J = 1 kcal. This
is  not  far  different  from  more  precise  modern  values.  Modern  measure- 
ments have  become  so  refined,  and  confidence  in  the  kinetic  theory  is  so 
complete  for  this  and  other  compelling  reasons,  that  the  kilocalorie  is  now 
defined  in  terms  of  the  joule.  By  definition,  the  relation  is  exactly 

$1\ \text{kcal} \equiv 4186\ \text{J}$  (17-27)

This  definition  has  been  chosen  to  be  in  close  concord  with  the  best  mea- 
surements made  in  terms  of  the  old,  and  independent,  definitions  of  the 
two  units.  (There  is  a tendency  in  modern  physical  practice  to  dispense  en- 
tirely with  the  kilocalorie  as  a unit  of  heat.  It  is  still  in  favor  with  chemists 
and  engineers,  however,  and  will  likely  be  with  us  for  a long  time.) 
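Because Eq. (17-27) is a definition, converting between the two units is a single multiplication or division. The short sketch below also compares the estimate based on Rumford's results, 1 kcal = 5700 J, with the defined value; the function names are illustrative and not taken from the text.

```python
J_PER_KCAL = 4186.0   # joules per kilocalorie, fixed by the definition in Eq. (17-27)


def kcal_to_joules(kcal):
    """Convert an amount of heat from kilocalories to joules."""
    return kcal * J_PER_KCAL


def joules_to_kcal(joules):
    """Convert an amount of energy from joules to kilocalories."""
    return joules / J_PER_KCAL


def percent_above_defined(j_per_kcal):
    """Percentage by which a measured value of J/kcal exceeds the defined value."""
    return 100.0 * (j_per_kcal - J_PER_KCAL) / J_PER_KCAL


print(kcal_to_joules(200.0))                     # 200 kcal expressed in joules
print(f"{percent_above_defined(5700.0):.0f} %")  # estimate from Rumford's results
```

The second line of output reproduces, to within rounding, the statement that the value derived from Rumford's work is roughly 35 percent too high.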

Example  17-8  considers  some  of  the  details  of  Joule’s  experiments. 


EXAMPLE 17-8

In one of Joule’s experiments he obtained the following data (expressed in modern
units):

Mass of driving weights (labeled e in Joule's Fig. 77): 26.32 kg
Total distance of fall of the weights: 31.85 m
Heat capacity of paddlewheel apparatus: 6.316 kcal/°C
Temperature rise of paddlewheel apparatus: 0.316°C

Find the value of J, the constant in Eq. (17-26). Take the acceleration of gravity at
Joule’s laboratory in Manchester to be 9.812 m/s², and neglect small corrections.

■ The kinetic energy input to the calorimeter, which is converted to heat by the
friction of the paddles against the water, is equal to the loss of gravitational
potential energy by the falling weights. You thus have

$\Delta E = mg\,\Delta h = 26.32\ \text{kg} \times 9.812\ \text{m/s}^2 \times 31.85\ \text{m} = 8225\ \text{J}$

The heat input to the system is, according to Eq. (17-21), $\Delta H = C\,\Delta T$, where $\Delta T$
is measured in kelvins. But $\Delta T = \Delta t$, where $\Delta t$ is measured in Celsius degrees. So
you have

$\Delta H = C\,\Delta t = 6.316\ \text{kcal/}^\circ\text{C} \times 0.316^\circ\text{C} = 1.996\ \text{kcal}$

You thus have

$J = \dfrac{\Delta H}{\Delta E} = 2.427 \times 10^{-4}\ \text{kcal/J}$




Fig.  17-8  Reproduction  of  Joule’s  illustrations  of  his  apparatus  for  determining  the  mechan- 
ical equivalent  of  heat.  In  his  Fig.  77,  the  heavy  weights  ee  drive  the  paddles  of  the  churnlike 
calorimeter  in  the  center  through  a system  of  pulleys  and  cords.  The  details  of  the  calorimeter 
are  shown  in  Figs.  69  and  70.  The  temperature  rise  of  the  water  inside  the  calorimeter  is  mea- 
sured by  means  of  a precise  thermometer  (not  shown)  which  is  inserted  through  bushing  b in 
Fig.  71.  The  total  mechanical  work  done  by  the  weights  is  found  by  observing  the  total  dis- 
tance through  which  they  descend,  using  the  scales  kk  shown  in  Fig.  77.  A clutch  mechanism 
makes  it  possible  to  disconnect  the  paddles  inside  the  calorimeter  from  the  driving  mecha- 
nism. Thus  the  weights  can  be  “wound  up”  and  allowed  to  descend  numerous  times  in  the 
course  of  a single  experiment.  Joule’s  Figs.  72,  73,  and  74  illustrate  a similar  apparatus  in 
which  mercury  was  used  as  the  fluid  instead  of  water.  In  the  apparatus  of  Figs.  75  and  76,  the 
friction  takes  place  between  two  solid  disks  e and  b.  Within  experimental  error,  all  these 
methods of converting mechanical work to heat by means of friction lead to the same value of
the proportionality constant $J$ in the equation $\Delta H = J\,\Delta E$.


Taking  the  reciprocal  of  this  number,  you  find  that 

1 kcal  = 4121  J 

which  differs  from  the  value  now  established  by  definition  by  1.6  percent. 
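The arithmetic of Example 17-8 can be checked in a few lines. This is a minimal sketch using only the data quoted in the example and neglecting the small corrections, as the example does; the function name `mechanical_equivalent` and the variable names are illustrative.

```python
def mechanical_equivalent(mass, g, drop, heat_capacity, temp_rise):
    """Return (delta_e in J, delta_h in kcal, J in kcal/J) for a paddlewheel run.

    delta_e is the gravitational potential energy lost by the weights, and
    delta_h is the heat inferred from the temperature rise of the apparatus.
    """
    delta_e = mass * g * drop              # mechanical energy input, J
    delta_h = heat_capacity * temp_rise    # heat appearing in the calorimeter, kcal
    return delta_e, delta_h, delta_h / delta_e


# Data of Example 17-8 (modern units).
delta_e, delta_h, j_const = mechanical_equivalent(26.32, 9.812, 31.85, 6.316, 0.316)
joules_per_kcal = 1.0 / j_const
shortfall = 100.0 * (4186.0 - joules_per_kcal) / 4186.0   # percent below Eq. (17-27)

print(f"delta E = {delta_e:.0f} J, delta H = {delta_h:.3f} kcal")
print(f"J = {j_const:.3e} kcal/J, so 1 kcal = {joules_per_kcal:.0f} J")
print(f"difference from the defined value: {shortfall:.1f} percent")
```

Keeping the energy and the heat as separate return values makes it easy to repeat the calculation for other runs and to see whether the ratio stays the same within experimental error.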

Joule’s  experiment  does  not  identify  heat  with  energy,  but  it  does 
suggest  such  an  identity  very  strongly.  If  the  phenomena  associated  with 
flow  of  the  phenomenologically  defined  quantity  “heat”  are  always  asso- 
ciated with  a precisely  proportional  transfer  of  the  fundamental  quantity 
energy,  there  must  at  least  be  a strong  connection  between  the  two.  The 
precise  nature  of  this  connection — or  identity — is  a major  concern  of 
Chap.  19. 




EXERCISES 


Group  A 

17-1.  Convenient  conversion. 

a.  Find  an  equation  for  converting  Fahrenheit  to 
Celsius  temperature. 

b.  At  what  temperature  is  the  numerical  value  the 
same  on  both  scales? 

17-2.  Air  densities:  winter  versus  summer.  How  dense  is 
the  air  on  a cold  winter  day  (255  K),  as  compared  to  the  air 
on  a hot  summer  day  (310  K)?  Assume  identical  pressures 
and  chemical  compositions. 

17-3.  A pressure relief valve. A 3.0-m³ chamber contains
8.4 kg of nitrogen gas. The chamber is equipped
with a pressure relief valve which is adjusted to open when
the total gas pressure reaches 5.0 atm = 5.1 × 10⁵ Pa.
What is the maximum temperature to which this system
can be heated without activating the valve? The mass of a
nitrogen molecule is 28 u.

17-4.  Pressure comparison. Gas chamber A has a volume
of 1.0 m³ and a temperature of 350 K. It contains
2.0 kg of argon gas. Gas chamber B has a volume of
3.0 m³ and a temperature of 300 K. It contains 1.0 kg of
helium gas. Which chamber has the higher gas pressure?
What is the ratio of pressures? The mass of an argon atom
is 40 u, and that of a helium atom is 4 u.

17-5.  Ideal  gas.  One  liter  of  an  ideal  gas  has  a mass  of 
1.98  g at  temperature  0°C  and  pressure  1.00  atm.  What  is 
the  mass  of  1 L of  the  same  gas  at  27°C  and  1.05  atm? 

17-6.  Universal gas constant. The density of argon at
the pressure of one standard atmosphere and the temperature
of exactly 0°C is 1.7837 kg/m³. A kilomole of the gas
has a mass of 39.948 kg. From these data, calculate the
numerical value of the universal gas constant R.

17-7.  Thin air. The density of air at sea level is 1.223
kg/m³ when the pressure is one standard atmosphere and
the temperature is 15.7°C. What is its density at an altitude
of 8.000 km where the pressure is 0.3609 atm and the
temperature is −29.7°C? Assume that air is an ideal gas.

17-8.  Allow  for  expansion.  The  suspended  roadway  of 
a steel  bridge  is  1500  m long.  The  ends  rest  on  rollers  to 
allow  for  expansion.  In  summer,  the  temperature  may  go 
to 40°C, in winter to −20°C. How much allowance must be
made  for  expansion  at  each  end  of  the  roadway? 

17-9.  Snug rim. A steel rim is to be shrink fitted onto a
wooden wagon wheel 1.000 m in diameter. If the diameter
of the rim is 0.998 m, what is the minimum number of
Celsius degrees that the rim must be heated to fit on the
wheel?

17-10.  Count your calories. How many calories are required
to change exactly 1 g of ice at −10°C to steam at
atmospheric pressure and 120°C?


17-11.  Hot  mulled  water.  A blacksmith  plunges  a red 
hot  2.0-kg  horseshoe  at  1200°C  into  8.0  kg  of  water  at 
50°C.  How  much  steam  will  be  produced?  Take  the  aver- 
age specific  heat  ratio  of  iron  to  be  0.108. 

17-12.  How  to  stay  thin  on  2000  kcal  a day.  What  is  the 
average power expenditure of a person whose daily food
intake  has  an  energy  equivalent  of  2000  kcal? 

17-13.  Hot shot. A quantity of lead shot is placed in a
cardboard tube 1.0 m long. See Fig. 17E-13. When the
tube is turned end for end 15 times, the rise in temperature
of the lead shot is measured and found to be 1.0°C.
What value does this crude experiment give for the mechanical
equivalent of heat? What is the main source of
error?

Fig. 17E-13

17-14.  A warm shower? How much higher is the temperature
of a 50-m waterfall at the bottom than at the top?
Consider only the conversion of gravitational potential energy
into heat.

Group  B 

17-15.  Constant pressure or constant volume. A cylinder
whose inside diameter is 4.00 cm contains air compressed
by a piston of mass m = 13.0 kg, which can slide freely in
the cylinder. See Fig. 17E-15. The entire arrangement is
immersed in a water bath whose temperature can be controlled.
The system is initially in equilibrium at temperature
Ti = 20°C. The initial height of the piston above the
bottom of the cylinder is hi = 4.00 cm.

Fig. 17E-15

a.  The  temperature  of  the  water  bath  is  gradually  in- 
creased to  a final  temperature  Tf  = 100°C.  Calculate  the 
height  hf  of  the  piston. 

b.  Starting  from  the  same  initial  conditions  specified 
in  part  a,  the  temperature  is  again  gradually  raised,  and 
weights  are  added  to  the  piston  to  keep  its  height  fixed  at 
hi.  Calculate  the  mass  that  has  been  added  when  the  tem- 
perature has  reached  Tf  = 100°C. 

17-16.  To  get  off  the  ground. 

a.  Pilots  of  light  planes  must  be  careful  to  calculate 
their  loads  on  warm  days.  Why  must  pilots  leaving  from  or 
landing  at  high  elevation  (for  example,  Denver,  Colorado, 
or  Mexico  City)  be  particularly  careful? 

b.  Compare  the  density  of  the  air  at  0°C  to  the  den- 
sity at  30°C.  Assume  identical  pressures. 

c.  Compare the density of the air at Logan Airport in
Boston (elevation 0 m) at 0°C to the density of air at Stapleton
Field in Denver (elevation 1600 m) at 30°C. At constant
temperature, the atmospheric pressure p obeys
approximately the equation $p = p_0 e^{-z/8150}$ if the elevation
z is expressed in meters, and $p_0$ is the atmospheric pressure
at 0 m.

17-17.  Oxygen tank. A pressure gauge indicates the
difference between atmospheric pressure and pressure
inside the tank. The gauge on a 1.00-m³ oxygen tank reads
30 atm. After some use of the oxygen, the gauge reads
25 atm. How many cubic meters of oxygen at normal atmospheric
pressure were used? There is no temperature
change during the time of consumption.

17-18.  When gas samples meet. Two samples x and y of
the same ideal gas are in adjacent chambers, separated
by a thermally insulating partition. The initial volumes,
pressures, and temperatures of the samples are Vx, Vy,
px, py, and Tx, Ty, respectively. The partition is removed,
and the single chamber of combined gas is brought to a
final temperature Tf.

a.  Assume  that  the  additional  volume  made  available 
by  the  removal  of  the  partition  is  negligible,  so  that  the 
final  volume  is  Vx  + Vy.  Find  the  final  pressure  pf.  Give 
your  result  in  terms  of  Vx,  Vy,  px,  py,  Tx,  Ty,  and  Tf. 

b.  Suppose  that  the  removal  of  the  partition  makes 
available  an  additional  volume  Vp , so  that  the  final  volume 
is  Vx  + Vy  + Vp.  Modify  the  result  of  part  a to  allow  for 
this. 

17-19.  True  pressure  from  a suspect  barometer.  A mer- 
cury barometer  of  the  type  described  in  Example  16-5 
reads  740  mm.  Because  of  the  low  reading,  it  is  suspected 
that  some  air  is  present  in  the  space  above  the  mercury. 
The  space  is  60  mm  long.  The  open  end  of  the  barometer 
is  lowered  further  into  the  mercury  reservoir.  When  the 
barometer  reading  is  730  mm,  the  space  above  the  mer- 
cury is  40  mm  long.  What  is  the  true  atmospheric  pres- 
sure? 


17-20.  Connected bulbs. A glass bulb of volume
400 cm³ is connected to another of volume 200 cm³ by
means of a tube of negligible volume. The bulbs contain
dry air and are both at a common temperature and pressure
of 20°C and 1.000 atm. The larger bulb is immersed
in steam at 100°C; the smaller in melting ice at 0°C. Find
the final common pressure.

17-21.  Mercury thermometer. Let $\Delta l_1$ be the length of
the column in a mercury thermometer which corresponds
to a rise in temperature $\Delta t$ if the expansion of the glass is
neglected. Let $\Delta l_2$ be the actual length which allows for the
expansion of the glass. Calculate the numerical value of
$(\Delta l_1 - \Delta l_2)/\Delta l_1$. Only the bulb is immersed in the object
whose temperature is being measured.

17-22. Warm barometer. The temperature of a barometer
increases by $\Delta T$. The pressure of the air remains constant
at $p_0$. Show that the height read by the barometer
changes by $\Delta h = \gamma h \Delta T$, where h was the height reading
before the temperature change and $\gamma$ is the coefficient of
volume expansion. The expansion of the glass is negligible.

17-23.  Buckled  rail.  A steel  rail  30  m long  is  firmly  at- 
tached to  the  roadbed  only  at  its  ends.  The  sun  raises  the 
temperature  of  the  rail  by  50°C,  causing  the  rail  to  buckle. 
Assuming  that  the  buckled  rail  consists  of  two  straight 
parts  meeting  in  the  center,  calculate  how  much  the  center 
of  the  rail  rises. 

17-24.  Melt  the  ice.  Initially  48.0  g of  ice  at  0°C  is 
in  an  aluminum  calorimeter  can  of  mass  2.0  g,  also  at  0°C. 
Then  75.0  g of  water  at  80°C  is  poured  into  the  can.  What 
is  the  final  temperature? 

17-25. Bunsen-burner temperature. A student performed
the following experiment to estimate the temperature
of a Bunsen-burner flame. He heated a 10-g iron nail
for  some  time  in  the  flame  and  then  plunged  the  nail  into 
100  g of  water  at  10°C.  The  water  temperature  rose  to 
20°C.  What  result  did  he  get  for  the  temperature  of  the 
flame? 

17-26.  Heat  of  fusion  of  ice.  In  an  experiment  to  deter- 
mine the  latent  heat  of  fusion  of  ice,  200.0  g of  water  at 
30.0°C in an iron can of mass 200.0 g is cooled by the addition
of ice to a temperature of 10.0°C. The can was
weighed  at  the  end  of  the  experiment  and  found  to  have 
increased  in  mass  by  50.0  g. 

a.  Calculate  the  latent  heat  of  fusion  of  ice. 

b.  What  is  the  advantage  of  stopping  the  addition  of 
ice  when  the  water  temperature  is  10.0°C?  Room  tempera- 
ture is  20.0°C. 

17-27.  Steam  bath.  In  an  experiment,  50.0  g of  ice  at 
−40°C is mixed with 11.0 g of steam at 120°C (and 1 atm
pressure).  Neglecting  any  heat  exchange  with  the  sur- 
roundings, what  is  the  final  temperature? 




Fig.  17E-31 


Group  C 

17-28. A gaseous jack-in-the-box. A box of interior volume
$V_b$ has a heavy airtight hinged lid of mass $M_l$ and
area $A_l$. The box contains $n_b$ kmol of a perfect gas at temperature
$T_0$. The box is inside a chamber which also contains
an additional $n_c$ kmol of the gas at the same temperature.
The gas in the chamber occupies a volume $V_c$.

a. Find the pressure $p_b$ in the box in terms of $n_b$, $V_b$,
and $T_0$.

b. Find the pressure $p_c$ in the chamber in terms of $n_c$,
$V_c$, and $T_0$.

c. Initially the hinged lid is closed. Show that this requires
that $p_b - p_c \le M_l g/A_l$.

d. If the whole system is heated, at what temperature
$T_l$ will the gas pressure lift the hinged lid?

e. Suppose that starting from $T_0$, the system is heated
to a temperature $T' > T_l$, and then cooled back to temperature
$T_0$. Assume that the hinged lid recloses as soon
as the (changing) pressure in the box fails to exceed the
(changing) chamber pressure by $M_l g/A_l$. Let $n_b'$ denote the
number of kilomoles remaining in the box after the
heating and recooling. Show that

$$n_b' = \frac{\dfrac{M_l g V_b V_c}{A_l R T'} + (n_b + n_c)V_b}{V_b + V_c}$$

f. Show that the expression for $n_b'$ given in part
e approaches the limiting value $(n_b + n_c)V_b/(V_b + V_c)$ as
$T' \to \infty$. Can you suggest a simple interpretation for this
result?

17-29.  Up  the  chimney.  A chimney  is  50  m high.  The 
outside  air  temperature  is  0°C.  The  fire  heats  the  air  in  the 
chimney  to  an  average  temperature  of  273°C.  Given  this 
information, is it possible to calculate the speed of the
air  rising  in  the  chimney?  Write  a paragraph  stating  how 
you  could  perform  the  calculation  or  why  you  could  not. 

17-30.  Pendulum  clock. 

a. If the length l of a simple pendulum is increased
by  an  infinitesimal  amount  dl,  what  will  be  the  fractional 
decrease  in  the  frequency? 

b.  An  approximation  to  a simple  pendulum  is  a light 
steel rod of length l, supporting a much heavier concentrated
weight. By what fraction would the rod’s length increase
for an infinitesimal rise in temperature dT?

c.  The  pendulum  clock  described  in  part  b keeps 
accurate  time  at  15°C.  How  many  seconds  per  day  would  it 
lose  if  the  temperature  were  25°C?  The  frequency  of  an 
accurate  clock  is  86,400  s/day. 

17-31. Coefficient of expansion for a liquid. When the expansion
of a liquid in a vessel is measured to obtain the
coefficient $\gamma$, what is actually obtained directly is $\gamma$ relative
to the material of which the container is made. Figure
17E-31 illustrates an apparatus from which the correct
value of $\gamma$ can be found without any knowledge of $\gamma$ for
the container.

[Fig. 17E-31 labels: water at temperature t°C; melting ice.]

a. Show that $\gamma = (h_t - h_0)/h_0 t$.

b. If $h_t - h_0 = 1.0$ cm, $h_0 = 100$ cm, and $t = 20$°C,
calculate the value of $\gamma$ for the liquid.

17-32. Build a better balance wheel. The balance
(timing) wheel of a mechanical wrist watch has a frequency
of oscillation given by

$$\nu = \frac{1}{2\pi}\sqrt{\frac{k}{MG^2}} = \frac{1}{2\pi G}\sqrt{\frac{k}{M}}$$

where G is its gyration radius; see Eq. (10-27). The wrist
watch keeps accurate time at 25°C. How many seconds
would it gain a day at −25°C if the balance wheel was
made of steel? If it was made of Invar?

17-33. Hot capillary tube. A thread of liquid in a uniform
capillary tube is of length L, as measured by a ruler.
The temperature of the tube and thread of liquid is raised
by $\Delta T$. Show that the increase in the length of the thread,
again measured with a ruler, is $\Delta L = (\gamma - 2\alpha)L\,\Delta T$, where
$\gamma$ is the coefficient of volume expansion of the liquid and $\alpha$
is the coefficient of linear expansion of the tube material.

17-34. Bimetallic strip. A bimetallic strip consists of
two thin metal strips of different material welded
together. When heated, the strip curves as shown in Fig.
17E-34. Prove that $R = d/[(\alpha_2 - \alpha_1)\Delta T]$, where R is the
radius of curvature of the strip, $\alpha_2$ and $\alpha_1$ are the coefficients
of linear expansion of its two constituents, d is the
thickness of each, and $\Delta T$ is the increase in temperature.

Fig. 17E-34




Fig.  17E-35 


17-35.  Temperature-independent  pendulum  clock.  The 
pendulum  of  a clock  illustrated  in  Fig.  17E-35  has  a bob 
consisting  of  glass  tubes  containing  mercury  supported  by 
a steel  rod  80.0  cm  long.  If  the  period  is  to  be  unaffected 
by  temperature  changes,  how  high  should  the  mercury 
column  be  in  each  of  the  tubes? 

17-36.  Hot  rods.  An  aluminum  rod  and  a steel  rod, 
both  50  cm  long  and  with  the  same  cross-sectional  area, 
are  placed  end  to  end  between  two  rigid  supports.  The 
temperature  is  raised  20°C.  What  is  the  stress  in  either 
rod? Young’s modulus for steel is $21 \times 10^{10}$ N/m$^2$.
Young’s modulus for aluminum is $7.0 \times 10^{10}$ N/m$^2$.

17-37. Specific heat capacity of metals at low temperatures.
At low temperatures, the specific heat capacity of metals
can be expressed as $c = k_1 T + k_3 T^3$, where T is in K. For
Cu, $k_3 = 2.48 \times 10^{-7}$ cal/(g·K$^4$), $k_1 = 2.75 \times 10^{-6}$
cal/(g·K$^2$). How much heat energy is required to raise the
temperature of a 15-g block of Cu from 5 K to 30 K?




18 

Kinetic  Theory 
and  Statistical 
Mechanics 


18-1 THE IDEAL-GAS MODEL

We have arrived at the equation of state of an ideal gas by summarizing the
results of careful experimentation. That is, so far the equation describes a
purely empirical relation among its variables. Now we consider the matter
from the point of view of newtonian mechanics for the purpose of explaining
the relation. Attempts to do this were made by many people,
starting from the time that Boyle first enunciated his rule concerning the
inverse proportionality between the pressure of a gas and its volume. Each
attempt involved applying fundamental physical laws to a model of a gas.

A model  is  one  of  the  most  powerful  tools  available  to  the  physical  sci- 
entist. A model  of  an  actual  physical  system  is  a simpler  system  — usually  an 
imaginary  one — whose  behavior  is  supposed  to  be  relevant  to  what 
happens  in  the  actual  system.  While  a model  may  be  a figment  of  the  imagi- 
nation, it  has  one  incomparable  virtue:  It  is  simple  enough  that  its  behavior 
can  be  analyzed  in  some  detail  by  using  the  basic  laws  of  physics.  If  we  as- 
sume that  a model  does  not  contain  logical  contradictions  and  is  not  in  con- 
flict with  these  basic  laws,  its  validity  is  judged  by  the  extent  to  which  the 
analysis  leads  to  descriptions  of  its  behavior  which  agree  with  experimental 
observations  made  on  the  actual  system.  The  better  the  model,  the  more  its 
predicted  properties  have  in  common  with  the  observed  properties  of  the 
system  it  models.  Yet  even  for  a very  good  model  it  should  not  be  thought 
that  the  model  is  the  system.  Rather,  the  model  behaves  like  the  system  in 
important  respects. 

Most  of  the  early  models  for  a gas  were  static.  That  is,  they  assumed 
that  the  component  parts  of  a gas  were  at  rest.  Some  of  these  models  as- 
sumed that  a gas  was  composed  of  molecules,  and  others  assumed  that  it 
was continuous. There were two main kinds of difficulties with the static
models. First, when their behavior was analyzed and compared with accurate
experiments, it became necessary to invent all sorts of special hypotheses,
many of which were quite implausible. Special hypotheses tend to spoil
the  main  purpose  of  any  model,  which  is  to  explain  the  behavior  of  a 
system  — in  this  case  a gas — on  the  basis  of  well-founded  physical  laws  only. 
Second,  the  models  tended  to  be  so  complex  that  it  was  difficult  or  impos- 
sible to  make  quantitative  predictions  as  to  how  real  gases  should  behave. 
This  precluded  stringent  tests  of  the  models  themselves. 

The  kinetic  models  of  gases  that  we  will  now  discuss  do  not  share  in 
these  difficulties.  They  make  two  basic  assumptions: 

1.  Gases  are  composed  of  a large  number  of  individual  molecules. 

2.  The  molecules  are  in  constant  motion. 

Both  of  these  assumptions  have  been  amply  confirmed  since  1900  by  direct 
experimental observation.

Kinetic  models  predict  many  aspects  of  the  behavior  of  gases  quite  ac- 
curately. Even  more  encouraging  is  the  fact  that  it  is  clear  from  a kinetic 
model  itself  why  it  fails  under  those  circumstances  when  it  does  fail.  It  is 
then  possible  to  extend  the  model  very  fruitfully  by  elaboration.  More- 
over, kinetic  models  make  it  possible  to  understand  the  concept  of  temper- 
ature, so  far  an  isolated  one,  in  terms  of  the  laws  of  mechanics.  Still  further, 
it  is  possible  to  show  that  heat,  an  even  more  elusive  quantity,  is  a particular 
form  of  energy. 

A kinetic  model  of  a gas  was  first  proposed  by  Hermann  in  1716.  In  1738,  pre- 
dictions concerning  its  behavior  were  obtained  by  applying  newtonian  me- 
chanics, in  the  work  of  Daniel  Bernoulli.  This  work  was  published,  but  it  lay 
fallow  for  over  a century.  This  was  due  largely  to  the  contemporary  distaste  for 
assuming  the  existence  of  molecules.  No  one  had  ever  seen  a molecule,  and  it  then 
appeared  highly  unlikely  that  anyone  ever  would.  Today,  when  we  have  a very 
large  number  of  independent  means  for  detecting  and  studying  individual  mole- 
cules, atoms,  and  even  smaller  entities,  it  may  seem  naive  to  be  so  suspicious  of 
molecules.  But  we  cannot  say  that  the  skepticism  of  scientists  of  earlier  days  was 
unwarranted.  Indeed,  it  is  the  very  sort  of  skepticism,  concerning  that  which  is 
unobserved  (or  unobservable),  on  which  science  must  almost  certainly  be  founded 
if  it  is  to  make  progress. 

During  the  nineteenth  century  all  sorts  of  indirect  evidence  for  the  existence 
of  molecules  piled  up,  both  in  physics  and  in  chemistry.  And  during  that  period 
the  utility  of  kinetic  models  of  gases  (and  hence  their  validity)  was  well  estab- 
lished by  the  far-ranging  and  exceedingly  detailed  predictions  of  the  behavior  of 
gases  to  which  they  lead.  Such  predictions  were  made  by  Maxwell,  Boltzmann, 
Gibbs,  Helmholtz,  and  others. 

We  now  describe  the  simplest  kinetic  model  of  a gas  and  the  one  that 
we  analyze  in  Sec.  18-2.  It  is  called  the  ideal-gas  model.  In  addition  to  the 
two  assumptions  common  to  all  kinetic  models — that  gases  consist  of  many 
molecules,  which  are  in  constant  motion — the  ideal  gas  model  assumes 
that: 

3.  A gas  molecule  is  so  small  that  it  can  be  considered  as  an  ideal 
particle  — a body  of  nonzero  mass  but  zero  size. 

4.  The  only  forces  acting  between  gas  molecules  and  other  objects  are 
contact  forces. 




The  justification  of  assumption  3 is  that  gases  are  so  very  much  less  dense 
than  liquids  or  solids  that  the  size  of  a gas  molecule  is  at  almost  all  times 
negligibly  small  compared  to  the  separation  between  it  and  any  other  gas 
molecule.  And  its  size  is  certainly  always  negligible  compared  to  the  dimen- 
sions of  the  container  holding  the  gas.  There  are  two  points  to  be  made  in 
justification  of  assumption  4.  One  is  that  a gas  molecule  has  a very  small 
mass.  Hence  the  gravitational  forces  acting  between  it  and  anything  else 
separated  from  it  by  an  appreciable  distance  will  be  so  weak  that  they 
can  be  neglected.  The  other  point  is  that  a gas  molecule  has  no  net  electric 
charge.  So  when  a gas  molecule  is  at  an  appreciable  distance  from  some 
other  object,  electric  forces  will  not  be  exerted  between  them. 

These  two  assumptions  allow  us  to  neglect  any  interaction  between  the 
gas  molecules  in  a container  of  gas.  Assumption  4 says  that  two  molecules 
can  exert  forces  on  each  other  only  at  the  instant  when  they  are  in  contact 
in  a collision,  and  assumption  3 says  that  there  are  no  collisions  because 
molecules  have  no  size.  Thus  in  the  ideal-gas  model  a molecule  travels  with 
constant  momentum  in  a straight  path  through  a container  of  gas  and  never 
interacts  with  another  molecule. 

However,  the  ideal-gas  model  does  allow  for  interactions  between  the 
molecules  of  a gas  and  the  walls  of  the  container  holding  the  gas.  At  the  in- 
stant when  a molecule  moves  up  to  a wall  and  attempts  to  enter  it,  forces 
arise  which  are  quite  strong  on  the  scale  of  objects  of  molecular  mass.  They 
are  electric  in  nature,  but  we  need  not  be  concerned  with  their  details  at 
this  point.  Their  effect  is  to  produce  a contact  force  which  acts  on  the  gas 
molecule  in  the  direction  away  from  the  wall  and  causes  it  to  bounce  back 
into the region where the gas is contained. At the same time, the gas molecule
exerts an equal but oppositely directed force on the wall. This force,
directed into the wall, and the similar forces exerted on the wall by other
gas molecules when they bounce from it, give rise to the gas pressure
exerted on the wall. We evaluate the pressure in Sec. 18-2.


18-2 KINETIC THEORY OF THE IDEAL GAS

Now we will apply newtonian mechanics to the ideal-gas model in order to
derive the equation of state of an ideal gas. In so doing, we work with what
is called the kinetic theory of the ideal gas.

We  have  assumed  that  there  are  no  interactions  among  the  molecules 
in  the  ideal-gas  model.  Hence  each  one  acts  precisely  as  it  would  if  none  of 
the  others  were  present.  This  makes  possible  an  extremely  important  sim- 
plification. We  can  start  by  studying  a box  containing  just  one  single  mole- 
cule and  derive  an  equation  of  state  for  such  a “gas.”  The  extension  of  the 
theory  to  a box  containing  a very  large  number  of  molecules  then  becomes 
a simple  matter  of  addition. 

For  the  moment  we  assume  that  each  wall  of  the  box  is  perfectly  rigid. 
That  is,  we  assume  that  no  molecule  of  the  wall  can  move  with  respect  to 
any  other  such  molecule,  so  that  an  entire  wall  acts  as  a single  body.  And  in 
comparison  to  the  mass  of  the  gas  molecule,  the  wall  can  be  considered  as 
having  infinite  mass. 

When  the  gas  molecule  collides  with  a wall  of  the  box  containing  it,  the 
wall  exerts  a force  on  the  gas  molecule,  as  described  in  Sec.  18-1.  As  a re- 
sult, momentum  is  transferred  to  the  gas  molecule  so  that  it  bounces  off  the 
wall. Momentum conservation requires that momentum of the same magnitude
be transferred to the wall. Now for any body of mass M, its kinetic
energy K is related to the magnitude p of its momentum by the equation
$K = p^2/2M$. [To obtain this relation, write $K = Mv^2/2$ and multiply the
right side by M/M to obtain $K = (Mv)^2/2M$. Then write $p = Mv$.] This
equation can be applied to calculate the kinetic energy $K = p^2/2M$ transferred
to the wall of mass M when the gas molecule bounces from it, transferring
in the process momentum of magnitude p to the wall. The calculation
tells us that K can be considered to be zero since we can consider M to
be infinite. And since there is no energy transferred to the wall from the
gas molecule, there can be no energy transferred to the gas molecule from
the wall. Our assumption, that a wall of the box is perfectly rigid and infinitely
massive in comparison to the mass of a gas molecule, leads to the
conclusion that the total mechanical energy of the molecule leaving the wall
after a collision is the same as when the molecule approaches the wall before
the collision.

Fig. 18-1 A molecule of mass m moving with velocity v in a cubical box of
edge length L.
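As a rough numerical check of the rigid-wall argument, the following Python sketch evaluates the fraction of a molecule's kinetic energy that a wall of finite mass M would take up in a head-on elastic collision. The closed form 4mM/(m + M)^2 is the standard one-dimensional elastic-collision result, not something derived in this section, and the function name, the oxygen-molecule mass, and the wall masses are illustrative assumptions.

```python
def wall_energy_fraction(m, M):
    """Fraction of a molecule's kinetic energy transferred to a wall of
    mass M (initially at rest) in a head-on elastic collision.

    Standard 1-D elastic-collision result. For M >> m it reduces to
    about 4m/M, which is what K = p^2/2M gives with p = 2mv.
    """
    return 4.0 * m * M / (m + M) ** 2


# Assumed, illustrative value: mass of an O2 molecule (32 u) in kg.
M_O2 = 5.31e-26

for wall_mass in (1e-24, 1e-12, 1.0):  # kg, hypothetical wall masses
    fraction = wall_energy_fraction(M_O2, wall_mass)
    print(f"M = {wall_mass:.0e} kg -> fraction transferred = {fraction:.3e}")
```

The fraction shrinks toward zero as the wall mass grows, which is the sense in which an infinitely massive wall takes up momentum but no energy.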

Figure  18-1  depicts  the  single-molecule  “gas.”  For  convenience,  the 
box  containing  the  gas  is  a cube  of  edge  length  L.  The  molecule  has  mass  m 
and  at  the  instant  illustrated  is  moving  with  velocity  v.  Every  so  often,  the 
molecule  bounces  from  one  of  the  walls  in  a collision  which  does  not 
change  its  total  mechanical  energy.  We  describe  these  collisions  by  saying 
the  molecule  collides  elastically  with  the  walls.  Since  no  force  is  exerted  on 
the  molecule  except  at  the  walls  of  the  box,  we  can  take  the  potential  energy 
associated  with  the  molecule  to  have  the  value  zero  everywhere  within  the 
walls.  Then  the  total  energy  of  the  molecule  is  the  same  as  its  kinetic  en- 
ergy, and  we  can  say  that  as  it  bounces  elastically  from  the  walls,  it  main- 
tains constant  kinetic  and  total  energies.  (It  should  be  pointed  out  that  if  the 
collisions  of  the  molecule  with  the  walls  were  inelastic,  so  that  it  lost  energy 
in  each  collision,  then  in  time  it  would  have  no  energy.  The  molecule  then 
would  not  have  the  constant  motion  required  by  the  ideal-gas  model.  Thus 
the assumptions which lead to elastic collisions are consistent with the
ideal-gas  model,  but  ones  which  lead  to  inelastic  collisions  would  not  be 
consistent  with  the  model.) 


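In terms of its components along unit vectors $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, $\hat{\mathbf{z}}$ aligned with the
edges of the box in Fig. 18-1, the velocity vector $\mathbf{v}$ of the molecule can be
written

$$\mathbf{v} = v_x\,\hat{\mathbf{x}} + v_y\,\hat{\mathbf{y}} + v_z\,\hat{\mathbf{z}}$$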

Suppose  the  molecule  hits  the  right-hand  wall  of  the  box.  We  simplify  the 
calculations,  without  affecting  their  final  results,  by  assuming  that  the  force 
exerted  on  the  molecule  by  a wall  is  always  directed  normal  to  the  wall  and 
toward  the  interior  of  the  box.  In  this  case  the  force  on  the  molecule  acts  in 
the  negative  x direction.  Hence  it  changes  only  the  x component  of  the  mol- 
ecule’s momentum  and  therefore  only  the  x component  of  its  velocity,  vx. 
The  force  simply  reverses  the  sign  of  the  value  of  vx.  This  must  be  true  in 
order that the molecule’s speed $v = (v_x^2 + v_y^2 + v_z^2)^{1/2}$ be the same after it
hits  the  wall  as  it  is  before,  so  that  its  kinetic  energy  remains  constant. 
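The sign reversal described above can be sketched in a few lines of Python. The function names and the velocity components are illustrative assumptions; the point is only that flipping the sign of the x component leaves the speed, and hence the kinetic energy, unchanged.

```python
import math


def reflect_from_x_wall(v):
    """Elastic bounce from a wall normal to the x axis: only the sign of
    the x component of the velocity changes."""
    vx, vy, vz = v
    return (-vx, vy, vz)


def speed(v):
    """Magnitude of a velocity vector given as (vx, vy, vz)."""
    return math.sqrt(sum(component ** 2 for component in v))


# Hypothetical velocity components in m/s, chosen for illustration only.
v_before = (278.0, -120.0, 95.0)
v_after = reflect_from_x_wall(v_before)

print("before:", v_before, "speed:", speed(v_before))
print("after: ", v_after, "speed:", speed(v_after))
```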

After  the  molecule  strikes  the  right-hand  wall,  it  moves  off  to  the  left. 
Which  of  the  five  other  walls  it  will  strike  next  we  cannot  tell  in  general,  but 
we  know  that  it  must  soon  strike  the  left-hand  wall.  Any  intermediate  colli- 
sions with  other  walls  will  not  change  the  x component  of  its  velocity  since, 
according  to  our  assumption  that  the  forces  are  normal,  these  walls  cannot 
exert  forces  on  the  molecule  in  the  x direction.  Thus  since  the  distance 
between the right- and left-hand walls is L, the time required for the molecule
to travel between them is $L/|v_x|$, where $|v_x|$ is the magnitude of the x
component of its velocity. After the molecule strikes the left-hand wall, the
sign of this component of velocity is reversed again, but its magnitude is
again unchanged. Hence the molecule takes the same amount of time,
$L/|v_x|$, to travel back to the right-hand wall, regardless of intermediate collisions
with other walls. The total time $\Delta t$ between collisions with the
right-hand wall is therefore given by the expression

$$\Delta t = \frac{2L}{|v_x|} \tag{18-1}$$

The value of $\Delta t$ is determined in Example 18-1 for a typical case.


EXAMPLE 18-1

At room temperature (T = 300 K) a typical oxygen molecule has a velocity component
in the x direction of magnitude $|v_x| = 278$ m/s. If a single such molecule were
confined in a cubical box of edge length L = 1.00 m, how many times per second
would it collide with the right-hand wall?

From Eq. (18-1) you have

$$\Delta t = \frac{2 \times 1.00\ \text{m}}{278\ \text{m/s}} = 7.19 \times 10^{-3}\ \text{s}$$

This is the number of seconds per collision. The number of collisions per second is
its reciprocal:

$$\frac{1}{\Delta t} = 139\ \text{s}^{-1}$$

The collisions are very frequent on the scale of everyday experience.
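The arithmetic of Example 18-1 can be reproduced with a short Python sketch. The function name is an illustrative assumption; the inputs are the values quoted in the example.

```python
def round_trip_time(L, vx_abs):
    """Eq. (18-1): time between successive collisions with the same wall,
    for box edge length L (m) and x-velocity magnitude vx_abs (m/s)."""
    return 2.0 * L / vx_abs


dt = round_trip_time(1.00, 278.0)  # seconds per collision
rate = 1.0 / dt                    # collisions per second

print(f"dt = {dt:.2e} s, rate = {rate:.0f} collisions per second")
```

Changing L or the velocity magnitude shows directly how the collision rate scales: it is proportional to $|v_x|$ and inversely proportional to L.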


The  force  exerted  on  the  right-hand  wall  by  the  molecule  is  very  non- 
uniform.  Most  of  the  time  there  is  no  force  at  all,  but  at  intervals  At  = 
2L/\vx\  apart  there  is  a large  force.  This  is  depicted  schematically  by  the 
series  of  uniformly  spaced  “spikes”  in  Fig.  18-2.  What  we  would  really  like 
to  find,  though,  is  the  average  force  exerted  by  the  molecule  on  the  wall, 
because  we  ultimately  will  relate  it  to  the  pressure  exerted  on  the  wall.  This 
is  represented  in  Fig.  18-2  by  the  constant-value  horizontal  line,  the  area 
under  which  is  equal  to  that  under  the  spiked  curve. 


Fig. 18-2 Spikes representing the time dependence of the magnitude of the
force exerted on the right-hand wall of a box by a molecule bouncing between
its walls. The horizontal line represents the magnitude of the average force
$\langle F \rangle$ exerted on the wall.


We can find the average force $\langle \mathbf{F} \rangle$ exerted by the molecule on the
right-hand wall by calculating the momentum $\Delta(m\mathbf{v})$ transferred to the wall
by the molecule in one collision with the wall, dividing by the time interval
$\Delta t$ between collisions, and then invoking Newton's second law in the form

$$\langle \mathbf{F} \rangle = \frac{\Delta(m\mathbf{v})}{\Delta t} \tag{18-2}$$


Before it collides with the wall, the x component of the molecule’s velocity
has the value $v_x = |v_x|$. Afterward it has the value $v_x = -|v_x|$. The change in
this velocity component is its final value minus its initial value, that is,
$-|v_x| - |v_x| = -2|v_x|$. The change in the x component of the molecule’s
momentum is the product of its mass m and this quantity. So the x component
of the momentum changes by the amount $-2m|v_x|$. Since there are no
changes in the y or z components of the molecule’s momentum in its collision
with the right-hand wall, this accounts for all the momentum transferred
from the wall to the molecule. The law of momentum conservation
requires that the momentum transferred from the molecule to the wall be
just the negative of this quantity. Hence if we write the momentum transferred
to the wall as the vector $\Delta(m\mathbf{v})$, the value of this vector is given by
the expression

$$\Delta(m\mathbf{v}) = 2m|v_x|\,\hat{\mathbf{x}}$$


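Dividing by the time interval $\Delta t = 2L/|v_x|$ and then using Eq. (18-2), we
find the average force exerted on the right-hand wall to be

$$\langle \mathbf{F} \rangle = \frac{2m|v_x|\,\hat{\mathbf{x}}}{2L/|v_x|} = \frac{m(|v_x|)^2}{L}\,\hat{\mathbf{x}}$$

Since $(|v_x|)^2 = v_x^2$, this can be written

$$\langle \mathbf{F} \rangle = \frac{m v_x^2}{L}\,\hat{\mathbf{x}} \tag{18-3}$$

This force is exerted on the wall in the outward direction.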
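Equation (18-3) can be checked by brute force: count the impulses delivered to the right-hand wall over a long time and divide by the elapsed time. The Python sketch below does this for a single molecule. The function name, the starting position at the left-hand wall, the oxygen-molecule mass, and the 10-s observation time are all illustrative assumptions.

```python
import math


def average_force_on_right_wall(m, vx, L, t_total):
    """Average force (N) on the wall at x = L from one molecule that starts
    at x = 0 moving toward that wall and bounces elastically between walls
    at x = 0 and x = L for t_total seconds."""
    speed_x = abs(vx)
    t_hit = L / speed_x             # time of the first hit on the right wall
    round_trip = 2.0 * L / speed_x  # Eq. (18-1)
    impulse = 0.0
    while t_hit <= t_total:
        impulse += 2.0 * m * speed_x  # momentum given to the wall per hit
        t_hit += round_trip
    return impulse / t_total


M_O2 = 5.31e-26  # kg, assumed mass of an O2 molecule
VX = 278.0       # m/s, value used in Example 18-1
L_BOX = 1.00     # m

f_counted = average_force_on_right_wall(M_O2, VX, L_BOX, t_total=10.0)
f_formula = M_O2 * VX ** 2 / L_BOX  # magnitude of Eq. (18-3)

print(f"counted: {f_counted:.4e} N, Eq. (18-3): {f_formula:.4e} N")
```

For observation times much longer than the round-trip time, the counted value and the formula differ by at most the contribution of a single collision, so the agreement improves as the observation time grows.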


In  a formal  sense,  we  can  calculate  the  pressure  p exerted  on  the  wall 
by  the  molecule  by  dividing  the  magnitude  of  this  average  force  by  the  area 
of  the  wall.  It  may  seem  artificial  to  speak  of  a pressure  (which  we  usually 
think  of  as  a steady  and  distributed  effect)  produced  by  the  periodic  impact 
of  a point  molecule  at  one  place  or  another  on  a large  wall.  But  under  ordi- 
nary circumstances,  the  round-trip  time  At  is  so  short  on  the  macroscopic 
time  scale  that  even  a single  molecule  would  seem  to  produce  a “continu- 
ous” bombardment.  That  is,  a macroscopic  device  designed  to  measure  the 
force  would  not  respond  to  the  fluctuations,  but  would  read  the  average 
value.  More  important,  however,  is  the  fact  that  eventually  we  will  be 
dealing  with  a vast  number  of  molecules.  Each  molecule  will  be  colliding 
with  the  walls  independently  of  all  the  others.  These  collisions  will  be  so 
well  spread  over  time,  and  over  the  surface  of  the  wall,  that  the  total  ob- 
served effect  will  be  steady,  and  uniform  everywhere  on  the  wall.  The  situa- 
tion is  indicated  schematically  in  Fig.  18-3. 

Fig. 18-3 Spikes representing the time dependence of the magnitude of the
force exerted on the right-hand wall of a box containing three molecules
moving with velocities having x components of different magnitudes $|v_x|$. The
molecule with the smallest value of $|v_x|$ produces a force, when it bounces
from the wall, whose magnitude is smaller than that produced by the molecule
with the largest value of $|v_x|$ by a factor equal to the ratio of these
quantities. Also, the frequency at which it bounces is smaller than that of
the other molecule by the same factor. These two effects, taken together, mean
that its contribution to the total average force, of magnitude $\langle F \rangle$, is
smaller than that of the other molecule by the square of the factor.

Thus it makes some sense to calculate the pressure exerted on the
right-hand wall, of area $L^2$, by the “one-molecule gas.” Because this pressure
is exerted in the x direction by a molecule to which we can assign the
label 1, we write it as $p_{1x}$. Since pressure is defined to be force per unit area,
the value of this pressure is

$$p_{1x} = \frac{|\langle \mathbf{F} \rangle|}{L^2}$$

Using Eq. (18-3), we obtain

$$p_{1x} = \frac{m v_x^2/L}{L^2} = \frac{m v_x^2}{L^3}$$

or

$$p_{1x} = \frac{m v_x^2}{V} \tag{18-4}$$

where $V = L^3$ is the volume of the container.
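A short Python sketch makes the scaling in Eq. (18-4) concrete: halving $|v_x|$ halves both the impulse per collision and the collision frequency, so the single-molecule pressure drops by a factor of 4. The function name and the numerical inputs (an assumed oxygen-molecule mass and the box of Example 18-1) are illustrative.

```python
import math


def single_molecule_pressure(m, vx, L):
    """Eq. (18-4): pressure (Pa) on a wall from one molecule of mass m (kg)
    with x-velocity component vx (m/s) in a cubical box of edge L (m)."""
    volume = L ** 3
    return m * vx ** 2 / volume


M_O2 = 5.31e-26  # kg, assumed mass of an O2 molecule

p_fast = single_molecule_pressure(M_O2, 278.0, 1.00)
p_slow = single_molecule_pressure(M_O2, 139.0, 1.00)  # half the velocity

print(f"p_fast = {p_fast:.3e} Pa, p_slow = {p_slow:.3e} Pa")
print(f"ratio p_slow / p_fast = {p_slow / p_fast:.2f}")
```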

It  makes  complete  sense  to  calculate  the  pressure  if  we  put  a very  large 
number  N of  molecules  in  the  box.  Because  the  molecules  act  completely 
independently  of  one  another,  we  can  write  the  total  pressure  px  that  they 
exert  on  the  right-hand  wall  of  the  box  as  the  sum  of  the  separate 
pressures  — that  is,  partial  pressures — exerted  by  the  individual  molecules. 
Labeling  the  typical  molecule  as  j,  we  have 

$$p_x = \sum_{j=1}^{N} p_{jx} = \sum_{j=1}^{N} \frac{m_j v_{jx}^2}{V} \tag{18-5}$$


This use of the concept of partial pressure is quite consistent with what chemists
call the law of partial pressures. The law states that the pressure of a gas consisting
of a mixture of substances is equal to the sum of the pressures which would
be exerted by each gaseous substance separately. Here, of course, our gas consists
of a “mixture” of “gases” each of which is an individual molecule.

Taking all the molecules in the box to have the same mass by setting
$m_j = m$ for all j, we can pass this common value, as well as the volume V of
the box, through the summation sign. We obtain

$$p_x = \frac{m}{V}\sum_{j=1}^{N} v_{jx}^2 \tag{18-6}$$

The quantity $v_{jx}^2$, the square of the x component of velocity of a molecule,
will not be the same for all molecules. The molecules of a gas move in a
random manner, not in a uniform manner. Nevertheless, we can define an
average of this quantity over the collection of molecules. We write the
average as $\langle v_x^2 \rangle$ and evaluate it by adding the $v_{jx}^2$ and then dividing by the
total number N of molecules. That is,

$$\langle v_x^2 \rangle = \frac{\sum_{j=1}^{N} v_{jx}^2}{N} \tag{18-7}$$


Next  we  solve  Eq.  (18-7)  for  the  summation  over  j,  to  find 


N(v*) 


j=i 


Then we use this relation to substitute the quantity $N \langle v_x^2 \rangle$ for the
summation in Eq. (18-6), and we get

$p_x = \dfrac{Nm}{V} \langle v_x^2 \rangle$  (18-8)


Equation  (18-8)  gives  the  total  pressure  exerted  on  the  right-hand  wall 
of  the  container  by  all  the  molecules.  What  about  the  left-hand  wall?  The 
pressure  there  must  be  the  same,  although  it  is  exerted  in  the  negative  x 
direction.  Aside  from  the  dictates  of  symmetry,  note  that  Eq.  (18-8)  con- 
tains only  the  square  of  vx,  not  vx  itself.  The  square  is  always  positive  and 
therefore  independent  of  direction. 
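
As a numerical check on Eqs. (18-5) through (18-8), the following Python sketch adds up the partial pressures of a collection of molecules and compares the sum with the expression written in terms of the average. It is an illustration only: the molecular mass, the volume, the number of molecules, and the Gaussian spread of the velocity components are assumed values chosen for convenience, not quantities given in the text.

```python
import random

def pressure_by_sum(m, V, vx_list):
    """Eq. (18-5) with equal masses: add the partial pressures m*vx^2/V."""
    return sum(m * vx * vx / V for vx in vx_list)

def pressure_by_average(m, V, vx_list):
    """Eq. (18-8): p_x = (N m / V) <vx^2>, with <vx^2> taken from Eq. (18-7)."""
    N = len(vx_list)
    mean_vx2 = sum(vx * vx for vx in vx_list) / N
    return N * m / V * mean_vx2

# Illustrative values only (assumptions, not taken from the text).
rng = random.Random(0)
m = 6.6e-27   # kg, roughly the mass of a helium atom
V = 1.0e-3    # m^3
vx_list = [rng.gauss(0.0, 1000.0) for _ in range(10_000)]  # m/s

p_sum = pressure_by_sum(m, V, vx_list)
p_avg = pressure_by_average(m, V, vx_list)

# Reversing every velocity leaves the result unchanged, because only the
# squares of the components enter the expression.
p_reversed = pressure_by_average(m, V, [-vx for vx in vx_list])

print(p_sum, p_avg, p_reversed)
```

The two ways of computing the pressure agree to within rounding error, and the reversed set of velocities gives the same pressure, which is the numerical counterpart of the remark that the left-hand wall feels the same pressure as the right-hand wall.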

There  are  still  four  more  walls  to  account  for.  Or  rather,  there  are  two 
additional  pairs  of  walls  to  account  for,  since  the  argument  immediately 
above  demonstrates  that  the  pressure  on  any  wall  is  the  same  as  that  on  the 
opposite  wall.  The  expressions  for  the  pressure  py  exerted  on  the  front  and 
back  walls  and  for  the  pressure  pz  exerted  on  the  top  and  bottom  walls  can 
be  derived  just  as  we  have  done  for  the  left  and  right  walls.  The  expressions 
are 


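$p_x = \dfrac{Nm}{V} \langle v_x^2 \rangle \qquad p_y = \dfrac{Nm}{V} \langle v_y^2 \rangle \qquad p_z = \dfrac{Nm}{V} \langle v_z^2 \rangle$  (18-9)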


The three equations for the pressures on the three pairs of walls are
identical in form.

We  can  show  that  the  pressures  themselves  are  identical  in  value  by 
considering  what  we  mean  by  the  average  of  the  square  of  the  velocity  com- 
ponent appearing  in  each  of  these  expressions  for  the  pressures.  Ac- 
cording to  Eqs.  (18-9),  the  pressures  do  not  depend  on  the  square  of  the 
velocity  component  of  any  particular  molecule.  This  is  an  important  simpli- 
fication because  the  velocity  components  differ  from  one  molecule  to  the 
next.  For  example,  one  molecule  may  be  moving  very  fast  in  the  x direction 
but  quite  slowly  in  the  y and  z directions.  Another  may  be  moving  fast  in 
the  x and  z directions  but  slowly  in  the  y direction.  But  whatever  may  be 
true  of  a single  molecule,  we  are  dealing  here  with  the  squares  of  the  velocity 
components  averaged  over  a very  large  number  of  molecules.  Since  there  is  no  reason 
for  one  direction  to  be  different  from  any  other,  these  averages  must  be  equal.  Hence 
we  must  have 

$\langle v_x^2 \rangle = \langle v_y^2 \rangle = \langle v_z^2 \rangle$  (18-10)

Using  these  equalities  in  Eqs.  (18-9),  we  get 

$p_x = p_y = p_z = p$  (18-11)

We have thus derived a form of Pascal's law of Sec. 16-3, which states that
pressure is uniform in all directions. We therefore drop the subscripts on
the pressure and write it as p.


The  next  step  in  deriving  the  equation  of  state  for  an  ideal  gas  is  to 
write  the  expression  for  the  square  of  the  velocity  of  the  jth  molecule  in 
terms  of  the  sum  of  the  squares  of  its  components.  The  three-dimensional 
Pythagorean  theorem  gives 

$v_j^2 = v_{jx}^2 + v_{jy}^2 + v_{jz}^2$

We  average  this  relation  over  the  molecules  in  the  box,  just  as  in  Eq.  (18-7), 
to  obtain 


$\langle v^2 \rangle = \langle v_x^2 + v_y^2 + v_z^2 \rangle$

Then  we  use  the  fact  that  the  average  of  a sum  of  terms  equals  the  sum  of 
their  averages.  (The  operation  of  taking  an  average  has  this  property  be- 
cause of  its  close  relation  to  the  operation  of  taking  a summation,  and  the 
latter  certainly  has  the  property.)  Hence  we  can  write 

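$\langle v^2 \rangle = \langle v_x^2 \rangle + \langle v_y^2 \rangle + \langle v_z^2 \rangle$

Using Eqs. (18-10), we can write this as

$\langle v^2 \rangle = \langle v_x^2 \rangle + \langle v_x^2 \rangle + \langle v_x^2 \rangle$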

or 

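$\langle v_x^2 \rangle = \dfrac{1}{3} \langle v^2 \rangle$  (18-12)

Finally, we set $p_x = p$ in Eq. (18-8), to obtain

$p = \dfrac{Nm}{V} \langle v_x^2 \rangle$

and then employ Eq. (18-12). The result is

$p = \dfrac{1}{3} \dfrac{Nm}{V} \langle v^2 \rangle$

Multiplying the right side by 2/2, we express it as

$p = \dfrac{2}{3} \dfrac{N}{V} \dfrac{m \langle v^2 \rangle}{2}$  (18-13)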
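
The two averaging steps used above can be tried out numerically. The sketch below is an illustration under an assumption the text does not make explicitly: each velocity component is drawn independently from the same zero-mean Gaussian distribution, which is one simple way of making all directions statistically equivalent. The units are arbitrary.

```python
import random

rng = random.Random(1)
N = 200_000

# Assumption: independent, identically distributed components, so that no
# direction is preferred over any other.
velocities = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
              for _ in range(N)]

mean_vx2 = sum(vx * vx for vx, vy, vz in velocities) / N
mean_vy2 = sum(vy * vy for vx, vy, vz in velocities) / N
mean_vz2 = sum(vz * vz for vx, vy, vz in velocities) / N
mean_v2 = sum(vx * vx + vy * vy + vz * vz for vx, vy, vz in velocities) / N

# The average of a sum equals the sum of the averages.
sum_of_means = mean_vx2 + mean_vy2 + mean_vz2

print(mean_vx2, mean_vy2, mean_vz2)
print(mean_v2, sum_of_means, mean_v2 / 3.0)
```

The average of the sum and the sum of the averages agree to rounding error for any set of velocities. The equality of the three component averages, and hence Eq. (18-12), holds only approximately for a finite sample and becomes better as N grows.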


Let us analyze the physical significance of the factors on the right side
of Eq. (18-13). The quantity $m \langle v^2 \rangle / 2 = \langle mv^2/2 \rangle$ is the average kinetic energy
of a gas molecule. It also is the average total energy of the gas molecule
since, as was explained near the beginning of this section, we can take the
potential energy of a gas molecule to be zero. So, using the symbol $\langle e \rangle$ for
the average total energy of a gas molecule, we have

$\langle e \rangle = \dfrac{m \langle v^2 \rangle}{2}$

The quantity $Nm \langle v^2 \rangle / 2 = N \langle e \rangle$ is the total energy in the gas because it is the
product of the average total energy of a gas molecule and the number of
molecules in the gas. And $(N/V) m \langle v^2 \rangle / 2 = N \langle e \rangle / V$ is the total energy
divided by the volume of the gas. In other words, $N \langle e \rangle / V$ is the energy
density $\rho$ of the gas. That is, we define

$\rho = \dfrac{N \langle e \rangle}{V}$  (18-14)

Thus we have $\frac{2}{3}(N/V) m \langle v^2 \rangle / 2 = \frac{2}{3}(N/V) \langle e \rangle = \frac{2}{3}\rho$. We now can see that Eq.
(18-13) states that the pressure p exerted by the ideal gas on a wall of the box
containing it is equal to two-thirds the total energy density $\rho$ of the gas:

$p = \dfrac{2}{3} \rho$  (18-15a)

It  is  a remarkable  result  of  the  kinetic  theory  that  the  pressure  (a  surface  ef- 
fect) should  be  directly  proportional  to  the  energy  density  (a  volume  ef- 
fect). 

The  energy  density  in  air,  which  approximates  an  ideal  gas  very  well,  is 
evaluated  in  Example  18-2. 


EXAMPLE  18-2 

Find the energy density of air at atmospheric pressure ($p = 1.013 \times 10^5$ Pa).

■ From Eq. (18-15a) you have

$p = \dfrac{2}{3} \rho$

or

$\rho = \dfrac{3}{2} \times 1.013 \times 10^5\ \text{N/m}^2 = 1.520 \times 10^5\ \text{J/m}^3$

If  the  energy  in  1 m3  of  air  were  completely  recoverable  in  the  form  of  macroscopic 
mechanical  energy,  it  would  be  enough  energy  to  lift  a body  of  1-ton  mass  more 
than  15  m.  Conversely,  if  you  started  with  a container  holding  1 m3  of  air  at  atmo- 
spheric pressure,  you  would  have  to  make  it  do  that  much  mechanical  work  in  some 
fashion,  in  order  to  extract  all  the  energy. 
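
The arithmetic of Example 18-2, including the remark about lifting a 1-ton body, can be reproduced in a few lines of Python. The sketch assumes that "1 ton" means a metric ton of 1000 kg and takes the acceleration of gravity as 9.8 m/s²; neither value is stated in the example.

```python
# Energy density of air at atmospheric pressure, from Eq. (18-15a): p = (2/3) rho.
p = 1.013e5               # Pa (= N/m^2), value given in Example 18-2
rho = 1.5 * p             # J/m^3

# Height to which the energy in 1 m^3 of air could lift a body.
g = 9.8                   # m/s^2 (assumed value)
body_mass = 1000.0        # kg (assumption: "1 ton" is a metric ton)
energy_in_one_m3 = rho * 1.0              # J
lift_height = energy_in_one_m3 / (body_mass * g)   # m, from E = m g h

print(f"energy density = {rho:.4g} J/m^3")
print(f"lift height    = {lift_height:.1f} m")
```

The lift height comes out a little above 15 m, in agreement with the statement in the example.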


Multiplying Eq. (18-15a) through by V and then using Eq. (18-14), we
obtain

$pV = \dfrac{2}{3} N \langle e \rangle$  (18-15b)

It  appears  that  we  are  on  the  right  track  toward  our  goal  of  relating  the 
mathematical  behavior  of  the  ideal-gas  model  to  the  actual  behavior  of  a 
container  of  a real  gas.  Since  the  product  of  the  pressure  p and  the  volume 
V is  equal  to  a certain  quantity,  this  looks  very  much  like  Boyle’s  law.  It  will 
be  identical  with  Boyle’s  law,  as  far  as  its  experimental  consequences  are 
concerned,  provided  that  the  quantity  on  the  right  side  of  the  equation  is 
held constant. This will be the case if two things are constant: the number N
of gas molecules in the container and the average energy per molecule $\langle e \rangle$.
The  first  of  these  conditions  is  satisfied  because  the  container  is  closed.  The 
second  is  satisfied  because  each  molecule  maintains  a constant  total  energy 
as  it  bounces  between  the  perfectly  rigid  and  infinitely  massive  walls  of  the 
container.  Thus  the  kinetic  theory  of  the  ideal  gas  does  yield  Boyle's  law  as 
a necessary  result. 

We are now ready to make a full-fledged comparison between the results
deduced from the kinetic theory, given by Eq. (18-15b), and the summary
of empirical results given by the ideal-gas law, Eq. (17-14). We write
the two equations together for comparison:

$pV = NkT$  (describes empirically the results of many measurements on real gases)  (18-16)

$pV = \dfrac{2}{3} N \langle e \rangle$  (predicts behavior of an ideal gas from energy and momentum conservation)  (18-17)

What  must  be  true  if  Eq.  (18-17)  is  to  be  a useful  description  of  real  gases, 
and  not  just  of  the  hypothetical  model  on  the  basis  of  which  it  was  derived? 
The  right  sides  of  the  two  equations  must  be  equal.  Equating  them  gives 

$\dfrac{2}{3} N \langle e \rangle = NkT$

where k is Boltzmann's constant and T is the absolute temperature of the
gas. Solving for $\langle e \rangle$, we obtain

$\langle e \rangle = \dfrac{3}{2} kT$  (18-18)

The average energy of a molecule in an ideal gas is proportional to the absolute
temperature of the gas, the proportionality constant being 3/2 times Boltzmann's constant.
Thus the value of a microscopic quantity $\langle e \rangle$ is reflected directly in the value
of a macroscopic quantity T.
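
To get a feeling for the size of the energies involved, Eq. (18-18) can be evaluated for a few temperatures. The short sketch below uses the SI value of Boltzmann's constant; the particular temperatures are arbitrary choices made for illustration.

```python
K_B = 1.380649e-23  # Boltzmann's constant in J/K

def average_energy(T):
    """Average energy per molecule of an ideal gas, Eq. (18-18)."""
    return 1.5 * K_B * T

# Temperatures chosen only as examples.
for T in (77.0, 300.0, 1000.0):
    print(f"T = {T:7.1f} K   <e> = {average_energy(T):.3e} J")
```

At room temperature the average energy per molecule is about 6 × 10⁻²¹ J, and it doubles when the absolute temperature doubles.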

We can write $N \langle e \rangle$, the number of molecules in the gas times the
average total energy per molecule, as the total energy E in the gas, so that
$E = N \langle e \rangle$. Then using Eq. (18-18), we have

$E = \dfrac{3}{2} NkT$  (18-19a)

The total energy of an ideal gas is proportional to its absolute temperature, and the
proportionality constant is 3/2 times the product of the number of molecules in the gas
and Boltzmann's constant.

Another  expression  for  the  total  energy  E can  be  obtained  by  em- 
ploying Eq.  (17-16),  Nk  = nR,  where  n is  the  number  of  kilomoles  of  gas  in 
the  container  and  R is  the  universal  gas  constant.  Then  we  have 

$E = \dfrac{3}{2} nRT$  (18-19b)

This is a truly remarkable statement: The total energy of an ideal gas is
proportional to its absolute temperature, and the proportionality constant is 3/2 times the
product of the quantity of gas, expressed in kilomoles, and the universal gas constant.
These  quantities,  to  which  the  energy  of  a gas  is  related,  all  have  meaning 
independent  of  the  kinetic  theory. 
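
The equivalence of Eqs. (18-19a) and (18-19b) can be checked with numbers. The sketch below evaluates the total energy of one kilomole of ideal gas both ways. The quantity of gas and the temperature are arbitrary illustrative choices; the values of R and of Avogadro's number per kilomole are those used in Example 18-3.

```python
K_B = 1.380649e-23   # Boltzmann's constant, J/K
R = 8.314e3          # universal gas constant, J/(kmol K)
A = 6.02e26          # Avogadro's number, molecules per kilomole

def total_energy_from_moles(n, T):
    """Eq. (18-19b): E = (3/2) n R T, with n in kilomoles."""
    return 1.5 * n * R * T

def total_energy_from_molecules(N, T):
    """Eq. (18-19a): E = (3/2) N k T, with N the number of molecules."""
    return 1.5 * N * K_B * T

n = 1.0      # kmol (illustrative)
T = 273.2    # K (illustrative)
E_nRT = total_energy_from_moles(n, T)
E_NkT = total_energy_from_molecules(n * A, T)

print(f"E from nRT = {E_nRT:.4e} J")
print(f"E from NkT = {E_NkT:.4e} J")
```

The two results agree to within the rounding of the constants, and both are roughly 3.4 × 10⁶ J for a kilomole of gas at the ice point.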

The  results  obtained  from  the  kinetic  theory  show  that  temperature, 
which  up  to  now  has  been  an  “orphan”  quantity  unrelated  to  the  funda- 
mental quantities  mass,  length,  and  time,  may  be  redefined  in  a funda- 
mental way.  Our  definitions  of  temperature  so  far  have  depended  on 
some  particular  thermometer,  be  it  even  so  idealized  a device  as  the  ideal 
gas  thermometer.  We  may  now  use  Eq.  (18-18)  to  supplant  such  empirical 
definitions  of  temperature  with  the  more  fundamental  definition  of  the 
absolute  temperature  T : 

$T = \dfrac{2 \langle e \rangle}{3k}$  (18-20)

We  will  not  be  so  rash  as  to  throw  away  our  thermometers;  they  are  far  too 
useful.  It  is  easy  to  insert  a thermometer  into  a container  of  gas,  but  much 
harder  to  find  the  energy  of  all  the  individual  molecules  and  then  to 
average  them  to  obtain  (e).  What  is  important  about  the  redefinition,  how- 
ever, is  that  it  deepens  the  meaning  of  the  ideal-gas  law,  pV  = NkT  = nRT. 
Up  to  now  it  has  been  an  empirical  relation  between  the  fundamentally  de- 
fined quantities  p and  V,  on  one  hand,  and  the  empirical  quantity  T,  on  the 
other.  It  is  now  a relation  among  quantities  which  can  all  be  understood  in 
fundamental  terms. 
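
Although measuring the energy of every molecule is impractical in the laboratory, it is easy in a computer model. The following sketch applies Eq. (18-20) to a list of molecular energies. To have something to average, it generates the energies from random velocities; the Gaussian form of the velocity components, the molecular mass, and the temperature used to set their spread are all assumptions made for the purpose of the illustration.

```python
import math
import random

K_B = 1.380649e-23  # Boltzmann's constant, J/K

def temperature_from_energies(energies):
    """Eq. (18-20): T = 2<e> / (3k), with <e> the mean of the energies."""
    mean_energy = sum(energies) / len(energies)
    return 2.0 * mean_energy / (3.0 * K_B)

# Assumed model gas: Gaussian velocity components whose spread corresponds
# to 300 K for molecules of mass m (illustrative values).
rng = random.Random(3)
m = 6.6e-27        # kg
T_true = 300.0     # K
sigma = math.sqrt(K_B * T_true / m)

energies = []
for _ in range(100_000):
    vx, vy, vz = (rng.gauss(0.0, sigma) for _ in range(3))
    energies.append(0.5 * m * (vx * vx + vy * vy + vz * vz))

T_estimate = temperature_from_energies(energies)
print(f"temperature recovered from the energies: {T_estimate:.1f} K")
```

The recovered temperature is close to, but not exactly, 300 K, because a finite sample of molecules gives only an estimate of the average energy.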


Furthermore, the results obtained from the kinetic theory cement the
connection between heat and energy that was suggested by Joule's experiment.
We know that raising the temperature of a gas can be accomplished
by heating the gas. In fact, measurements show that through the range of
conditions where a gas behaves like an ideal gas, its heat capacity has a
constant value C. Hence the definition of heat in Eq. (17-21), $\Delta H = C\,\Delta T$,
shows that the amount of heat $\Delta H$ added to the gas is proportional to its
increase in temperature $\Delta T$. And Eq. (18-19b) shows that when the temperature
of the gas increases by this amount, its total energy content increases
by $\Delta E = \frac{3}{2} nR\,\Delta T$. Taken together, these two relations show that when an
amount of heat $\Delta H$ is added to the gas, its total energy content is
increased by a proportional amount $\Delta E$. This is in agreement with Eq. (17-26),
$\Delta H = J\,\Delta E$, the proportionality suggested by Joule's experiment.

To  summarize,  in  adding  heat  to  an  ideal  gas,  we  add  energy  to  the 
gas.  This  heat  energy  is  distinguished  from  other  forms  of  energy  by  the 
fact  that  it  is  contained  in  the  random  motion  of  the  gas  molecules.  The 
contribution  made  by  an  individual  molecule  of  the  ideal  gas  to  the  heat  en- 
ergy contained  in  the  gas  is  the  kinetic  energy  of  the  molecule.  But  in  con- 
sidering the  entire  gas,  there  is  a very  important  distinction  to  be  made 
between  the  heat  energy  in  the  gas  and  what  would  properly  be  called  the 
kinetic  energy  of  the  gas.  If  a box  filled  with  an  ideal  gas  at  a very  low  tem- 
perature is  stationary  with  respect  to  an  observer,  the  observer  would  say 
that  the  gas  contains  very  little  energy  of  any  type.  If  the  box  of  very  cold 
gas  is  then  set  into  motion  with  respect  to  the  observer  at  a high  speed,  the 
observer  would  say  that  the  gas  has  an  appreciable  kinetic  energy  because 
all  its  molecules  are  moving  together.  If  the  box  remains  stationary  but  the 
gas  is  heated  to  a high  temperature,  the  observer  would  say  that  the  gas  has 
appreciable  heat  energy  because  all  its  molecules  are  moving  at  random.  In 
both  cases  the  gas,  considered  as  a whole,  contains  energy,  and  the  energy 
results  from  the  motion  of  its  molecules.  But  when  the  motion  is  organized, 
the  energy  of  the  gas  as  a whole  is  called  kinetic  energy;  when  the  motion  is 
random,  this  energy  is  called  heat  energy. 


18-3  IMPROVEMENTS TO THE KINETIC THEORY

The ideal-gas model assumes that molecules are of zero size. Such
molecules cannot collide with one another, and so there cannot be a transfer of
energy among them. Furthermore, in developing the kinetic theory of the
ideal  gas,  we  assumed  that  the  massive  walls  of  the  box  containing  the  gas 
were  perfectly  rigid.  This  assumption  precludes  any  transfer  of  energy 
between  the  gas  molecules  and  the  walls  of  the  box.  No  molecule  ever  hits 
another  molecule;  it  only  bounces  elastically  between  the  walls  of  the  box, 
always  maintaining  whatever  total  energy  it  has  at  any  initial  instant. 

Fig. 18-4 The mixing of molecules of a gas having a high temperature $T_1$ and
molecules of a gas having a low temperature $T_2$, if we assume that the
molecules are ideal point particles and the walls of the container are perfectly
rigid. (a) Before. (b) After. The arrows represent molecular velocities, showing
high-speed molecules in the high-temperature gas and low-speed molecules
in the low-temperature gas.

There is a serious difficulty with this picture. Imagine a container
divided into two equal parts by a removable partition, initially in place as in
Fig. 18-4a. Both parts contain the same number of molecules of the same
gas. By using heaters inside the container, the gas in compartment 1 is
brought to temperature $T_1$ and that in compartment 2 to temperature $T_2$,
with $T_1 > T_2$. Because of the proportionality between temperature and
average total energy per molecule, this means $\langle e \rangle_1 > \langle e \rangle_2$. We then remove
the partition and allow the gas molecules to mix, as in Fig. 18-4b. If the
gases were ideal, so that their molecules did not exchange energy with one
another, and if the container walls were perfectly rigid, so that the
molecules did not exchange energy with the walls, all the molecules would retain
whatever total energy they had before the partition was removed.
Consequently, we would end up with two intimately mixed but distinct
populations of gas molecules: one with molecules of average total energy $\langle e \rangle_1$ and
the other with molecules in which this quantity has the value $\langle e \rangle_2$. Our
everyday experience with the mixing process suggests that this is not what
happens. We have every reason to expect that in a very short time after the
removal of the partition the molecules from the two sides will have merged
into a single population of gas molecules with an average total energy
intermediate between $\langle e \rangle_1$ and $\langle e \rangle_2$, and hence with a temperature intermediate
between $T_1$ and $T_2$. After all, we know that hot air added to cold air yields
warm air.
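
If the two populations do merge into one, the intermediate temperature can be estimated from energy bookkeeping alone. The sketch below assumes, as a simplification that goes beyond the text, that no net energy is gained or lost by the container walls, so that the total energy of the gas is the same before and after mixing. The particular temperatures and numbers of molecules are illustrative.

```python
K_B = 1.380649e-23  # Boltzmann's constant, J/K

def average_energy(T):
    """Eq. (18-18): average energy per molecule at temperature T."""
    return 1.5 * K_B * T

def temperature(mean_energy):
    """Eq. (18-20): temperature corresponding to an average energy."""
    return 2.0 * mean_energy / (3.0 * K_B)

def mixed_temperature(N1, T1, N2, T2):
    """Temperature of the merged population, assuming the total energy of
    the gas is unchanged by the mixing (no net exchange with the walls)."""
    total_energy = N1 * average_energy(T1) + N2 * average_energy(T2)
    return temperature(total_energy / (N1 + N2))

# Equal numbers of molecules on the two sides, as in Fig. 18-4.
T_mixed = mixed_temperature(1.0e22, 400.0, 1.0e22, 200.0)
print(f"mixed temperature = {T_mixed:.1f} K")
```

With equal numbers of molecules the result is simply the mean of the two starting temperatures; with unequal numbers it is the average weighted by the number of molecules on each side.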

In  order  for  this  to  happen,  there  must  be  some  way  for  gas  molecules 
to  exchange  energy,  in  contradiction  to  what  we  have  assumed.  Is  there  a 
way  to  allow  this  to  happen  without  doing  essential  damage  to  the  suc- 
cessful results  already  obtained?  There  are,  in  fact,  two  ways,  each  of  which 
is  suggested  by  realistic  physical  considerations.  First  we  will  consider  how 
gas  molecules  exchange  energy  indirectly  by  means  of  the  walls  of  the 
vessel  containing  the  gas. 


Now,  the  walls  of  a container  cannot  be  perfectly  rigid  on  the  molecu- 
lar scale.  The  walls  are,  themselves,  composed  of  atoms  (or  possibly  ions) 
which  are  bound  to  their  equilibrium  positions  by  electric  forces  acting 
between  each  of  these  particles  and  its  neighbors.  The  forces  are  much  like 
the  forces  that  would  be  produced  by  the  set  of  springs  illustrated  in  Fig. 
18-5.  When  a gas  molecule  strikes  the  surface  of  a wall,  it  strikes  one  atom 
(or  at  most  a few  neighbors)  and  then  rebounds  into  the  volume  occupied 
by  the  gas.  In  this  process  the  struck  atom  (or  small  group  of  atoms)  acts 
like  a body  whose  mass  is  not  infinitely  large  compared  to  that  of  the  gas 
molecule.  So  energy — as  well  as  momentum — can  be  exchanged  between  a 
gas  molecule  and  a wall. 


Fig.  18-5  A model  of  the  atomic  structure 
of  the  solid  material  of  which  the  wall  of  a 
container  is  made. 


Consider  a single-compartment  box  containing  a gas  and  an  internal 
heater.  The  heater  is  turned  on  so  that  the  temperature  of  the  gas  is  rapidly 
raised  to  some  high  value,  and  then  the  heater  is  turned  off.  At  this  point 
the  temperature  of  the  walls  of  the  box  is  much  lower  than  that  of  the  gas. 
This  means  that  the  average  energy  of  an  atom  in  the  wall  is  much  smaller 
than  that  of  a molecule  of  the  gas.  But  the  discrepancy  does  not  persist. 
The  constant  bombardment  by  rapidly  moving  gas  molecules  soon  increases 
the  oscillatory  motion  of  the  atoms  in  the  walls  and  thereby  increases  their 
average  energy.  This  increase  in  energy  is  at  the  expense  of  the  average  en- 
ergy of  the  gas  molecules.  That  is,  in  most  collisions  energy  is  transferred 
from  a gas  molecule  to  a wall  atom.  The  energy  transfer  does  not  continue 
in  this  direction  indefinitely,  however.  As  the  energy  of  the  atoms  in  the 
walls increases, significant energy begins to flow in the other direction as
well.  Eventually,  the  energy  flow  from  walls  to  gas  is  equal  to  that  from  gas 
to  walls,  and  there  is  no  further  net  transfer  of  energy.  In  this  situation  it  is 
said  that  the  gas  and  the  walls  of  the  box  are  in  thermal  equilibrium. 

While  approaching  thermal  equilibrium,  the  temperature  of  the  gas 
decreases  as  the  energy  of  its  molecules  decreases,  and  the  temperature  of 
the  walls  increases  as  the  energy  of  its  atoms  increases.  When  thermal  equilib- 
rium is  achieved,  the  temperatures  of  the  two  parts  of  the  system  have  reached  a 
common  value. 

In  thermal  equilibrium,  no  energy  flows  from  the  gas  to  the  walls  or 
from  the  walls  to  the  gas,  on  the  average.  But  energy  usually  is  transferred 
from  a gas  molecule  to  a wall  atom,  or  in  the  other  direction,  in  an  individual 
collision.  The  point  is  that  saying  the  gas  and  the  walls  have  a certain  tem- 
perature says  something  only  about  the  average  value  of  the  energy  of  a gas 
molecule  and  the  average  energy  of  a wall  atom.  The  energy  of  specific 
particles  of  either  type  can  have  values  which  depart  significantly  from  the 
average  energy  for  that  type.  Consider  a gas  molecule  whose  energy  is  ap- 
preciably smaller  than  the  average  for  all  the  gas  molecules,  and  assume 
that  it  happens  to  collide  with  a wall  atom  whose  energy  is  appreciably 
larger  than  the  average  for  all  the  wall  atoms.  Then  the  particular  wall 
atom  will  be  moving  more  rapidly  than  the  average  wall  atom,  and  the  par- 
ticular gas  molecule  will  be  moving  more  slowly  than  the  average  gas  mole- 
cule, when  the  two  collide.  In  the  collision  the  wall  atom  will  likely  hit  the 
gas  molecule  so  as  to  knock  it  back  from  the  wall  at  a higher  speed  than 
when  it  approached  the  wall.  Energy  has  been  transferred  from  the  wall 
atom  to  the  gas  molecule.  It  can  just  as  well  go  the  other  way  if  a gas  mole- 
cule whose  energy  content  is  appreciably  larger  than  the  average  for  gas 
molecules  strikes  a wall  atom  whose  energy  content  is  appreciably  smaller 
than  the  average  for  wall  atoms. 

The  continual  exchanges  of  energy  between  the  gas  and  the  wall  in 
individual  collisions  are  the  mechanism  which  keeps  the  two  in  thermal 
equilibrium,  once  thermal  equilibrium  has  been  achieved.  If,  by  chance,  a 
sequence  of  individual  collisions  occurs  in  which  the  energy  flow  is  pre- 
dominantly in  a single  direction,  then  the  temperature  of  the  gas  will  shift 
slightly  one  way  and  that  of  the  wall  will  shift  slightly  the  other  way.  But  this 
means  that  the  particles  in  the  part  of  the  gas-plus-walls  system  that  has  had 
a small  increase  in  temperature  will  be  moving  a little  more  rapidly  than  be- 
fore. The  opposite  is  true  for  the  particles  in  the  other  part.  As  a conse- 
quence, they  will  be  moving  in  such  a way  that  in  the  next  sequence  of  colli- 
sions energy  will  tend  to  flow  predominantly  in  the  other  direction,  and  so 
the temperatures of the two parts of the system will tend to come back into
balance. The continual small energy exchanges in individual collisions
make  the  thermal  equilibrium  self-regulating. 
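
The approach to thermal equilibrium described above can be imitated with a very crude computer model. The sketch below is a toy, not a description of real collisions: in each "collision" a randomly chosen gas molecule and a randomly chosen wall atom pool their energies and divide the pool between them in a random proportion. That rule, the numbers of particles, and the starting energies (in arbitrary units) are all assumptions made for illustration. The rule conserves energy in every collision and transfers energy in either direction.

```python
import random

def equilibrate(gas, wall, n_collisions, rng):
    """Toy collision rule (an assumption): the two partners pool their
    energies, and the pool is split between them at random."""
    for _ in range(n_collisions):
        i = rng.randrange(len(gas))
        j = rng.randrange(len(wall))
        pooled = gas[i] + wall[j]
        share = rng.random()
        gas[i] = share * pooled
        wall[j] = pooled - gas[i]

rng = random.Random(2)
gas = [10.0] * 2000   # hot gas: energy per molecule, arbitrary units
wall = [1.0] * 2000   # cold wall: energy per atom, arbitrary units

total_before = sum(gas) + sum(wall)
equilibrate(gas, wall, 100_000, rng)
total_after = sum(gas) + sum(wall)

gas_mean = sum(gas) / len(gas)
wall_mean = sum(wall) / len(wall)
print(f"average energy of a gas molecule: {gas_mean:.2f}")
print(f"average energy of a wall atom:    {wall_mean:.2f}")
print(f"total energy before and after:    {total_before:.1f}, {total_after:.1f}")
```

After many collisions the two averages settle near a common value while individual particles continue to exchange energy in both directions, and the energies of individual particles differ widely from the average.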

Now  let  us  reconsider  the  two-compartment  box  in  Fig.  18-4,  assuming 
that  the  walls  are  initially  at  a temperature  less  than  that  of  the  gas  in  either 
compartment.  What  actually  happens  after  the  partition  is  removed  is  that 
the  gas  with  the  initially  higher  temperature  loses  energy  to  the  walls  by 
means  of  collisions  of  its  molecules  with  the  atoms  of  the  walls.  The  same  is 
true for the gas with the initially lower temperature, but it loses energy at
a lower rate. The flow of energy continues until the gas from one
compartment comes into thermal equilibrium with the walls and the gas from the
other compartment does the same. Then the entire system has achieved
some common temperature. At this point both gases are in thermal
equilibrium with the walls, and so they are in thermal equilibrium with each other.
Since the gases have the same temperature, they have the same average
energy per molecule. Each gas has exchanged energy with the walls, and
so  the  gases  have,  in  effect,  exchanged  energy  with  each  other.  This  indirect 
exchange  of  energy  between  the  two  gases  allows  them  to  come  into  thermal 
equilibrium  with  each  other.  The  process  which  has  taken  place  is  an  ex- 
ample of  what  is  called  the  zeroth  law  of  thermodynamics:  If  two  objects  are 
each  in  thermal  equilibrium  with  a third,  then  they  are  in  thermal  equilibrium  with 
each  other. 

There  is  also  a direct  exchange  of  energy  between  the  gases  from  the  two 
compartments  in  Fig.  18-4.  Although  the  ideal-gas  model  assumes  gas  mol- 
ecules to  be  particles  of  zero  size,  this  is  not  true  of  real  gases.  Because  their 
sizes  are  not  zero,  molecules  of  a real  gas  do  collide  with  one  another. 
These  collisions  lead  to  the  transfer  of  energy  between  the  two  gases  and 
thereby  provide  a mechanism  for  them  to  come  into  thermal  equilibrium 
with each other.

We  close  this  section  by  describing  briefly  how  the  kinetic  theory  deri- 
vation of  the  ideal-gas  law  in  Sec.  18-2  can  be  modified  to  take  into  account 
the size of real gas molecules. The nonzero size produces two effects. One
is  that  it  makes  the  molecules  of  the  gas  collide  with  each  other,  as  has  just 
been  mentioned.  This  is  in  contrast  to  what  was  assumed  in  the  derivation 
of  Sec.  18-2.  But  these  collisions  cause  no  change  in  the  results  of  the  deri- 
vation. The  law  of  momentum  conservation  requires  that  in  any  collision 
between  gas  molecules  their  total  momentum  be  unchanged.  A collision 
only  redistributes  the  momentum  between  the  two  molecules  of  the  gas,  and 
this  does  not  change  the  total  rate  at  which  momentum  is  transferred  to  the 
walls of the container by the two molecules in their collisions with these
walls. The same is true of the entire set of molecules and of the total rate of
momentum transfer from the entire set of molecules to the walls. Since this
total rate of momentum transfer gives the total force exerted on a wall and
the total force divided by the area of a wall gives the pressure exerted on it,
there  is  no  change  in  the  pressure. 
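
The statement that a collision only redistributes momentum between the two molecules can be illustrated with the simplest case, a head-on elastic collision along a line. The sketch below uses the standard one-dimensional elastic-collision formulas; restricting the motion to one dimension, and the particular masses and velocities, are simplifying assumptions made for the illustration.

```python
def elastic_collision_1d(m1, v1, m2, v2):
    """Velocities after a head-on elastic collision of two particles
    moving along a line (one-dimensional simplification)."""
    total_mass = m1 + m2
    u1 = ((m1 - m2) * v1 + 2.0 * m2 * v2) / total_mass
    u2 = ((m2 - m1) * v2 + 2.0 * m1 * v1) / total_mass
    return u1, u2

# Two molecules of the same gas (equal masses); values are illustrative.
m = 2.0
v1, v2 = 3.0, -1.0
u1, u2 = elastic_collision_1d(m, v1, m, v2)

momentum_before = m * v1 + m * v2
momentum_after = m * u1 + m * u2
energy_before = 0.5 * m * v1 ** 2 + 0.5 * m * v2 ** 2
energy_after = 0.5 * m * u1 ** 2 + 0.5 * m * u2 ** 2

print(u1, u2)
print(momentum_before, momentum_after)
print(energy_before, energy_after)
```

For equal masses the two molecules simply trade velocities, so the total momentum and the total energy that they carry toward the walls are the same as if they had passed through each other.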

The  second  consequence  of  the  nonzero  size  of  real  gas  molecules  is  a 
reduction  of  the  volume  accessible  to  any  molecule  from  the  volume  that 
would  be  accessible  if  the  molecules  had  zero  size,  as  assumed  in  the 
ideal-gas  model.  This  reduction  can  be  handled  by  the  simple  expedient  of 
subtracting  from  the  volume  V of  the  container  a small  volume  b , but 
making no other modification in the derivation of Sec. 18-2. The result is to
produce an equation of state for a gas comprising molecules of nonzero
size which has the form

$p(V - b) = NkT$  (18-21a)

or equally well,

$p(V - b) = nRT$  (18-21b)

This is called the Clausius equation of state, after Rudolf Clausius, the
German physicist who first proposed it in the 1850s.

Fig. 18-6 A pair of impenetrable spherical molecules of radii r, at their
closest possible approach. The dashed sphere of radius 2r represents the
region of space from which the center of one molecule is excluded by the
presence of the other molecule. Its volume is $\frac{4}{3}\pi(2r)^3 = 8 \cdot \frac{4}{3}\pi r^3$. In a gas
containing N molecules, there are N/2 such pairs. Thus the total inaccessible
volume is $(N/2)\,8 \cdot \frac{4}{3}\pi r^3 = 4N \cdot \frac{4}{3}\pi r^3$. This volume is designated as b in
Eq. (18-22).

Figure 18-6, and the argument given in its caption, shows that if the
gas molecules are impenetrable spheres of radius r and volume $\frac{4}{3}\pi r^3$ and if
there are N of them in the gas, then the inaccessible volume b has the value

$b = 4N\left(\dfrac{4}{3}\pi r^3\right)$  (18-22)

Example 18-3 shows how Eqs. (18-21a), (18-21b), and (18-22) can be
used to determine the radius of the monatomic molecule helium.


EXAMPLE  18-3 

A quantity of helium gas equal to $1.230 \times 10^{-3}$ kmol is pumped into a container
whose volume is $1.000 \times 10^{-3}$ m³. With the temperature maintained at 0.0°C, the
pressure is measured to be $2.822 \times 10^{6}$ Pa.

a.  Use  these  data  to  determine  the  volume  b in  the  Clausius  equation  of  state. 

b.  Then  use  the  value  of  b in  Eq.  (18-22)  to  determine  the  radius  of  the  helium 
molecule. 

■ a. Solving Eq. (18-21b) for b, you obtain

$b = V - \dfrac{nRT}{p}$


Substituting  the  given  values  for  the  volume  V,  number  of  kilomoles  n,  temperature 
T,  pressure  p,  and  the  standard  value  of  the  universal  gas  constant  R,  you  have 


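$b = 1.000 \times 10^{-3}\ \text{m}^3 - \dfrac{1.230 \times 10^{-3} \times 8.314 \times 10^{3}\ \text{J/K} \times 273.2\ \text{K}}{2.822 \times 10^{6}\ \text{Pa}}$

$\phantom{b} = 1.000 \times 10^{-3}\ \text{m}^3 - 0.990 \times 10^{-3}\ \text{m}^3$

or

$b = 0.010 \times 10^{-3}\ \text{m}^3 = 1.0 \times 10^{-5}\ \text{m}^3$

b. Solving Eq. (18-22) for the radius r of the molecule gives you

$r = \left(\dfrac{3b}{16\pi N}\right)^{1/3}$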


Now N, the number of molecules in the gas, is the product of n, the number of
kilomoles, and Avogadro's number A, the number of molecules per kilomole. Thus

$N = nA = 1.23 \times 10^{-3} \times 6.02 \times 10^{26}$

Substituting in this value and the value of b, you find

$r = \left(\dfrac{3 \times 1.0 \times 10^{-5}\ \text{m}^3}{16\pi \times 1.23 \times 10^{-3} \times 6.02 \times 10^{26}}\right)^{1/3}$

or

$r = 9.3 \times 10^{-11}\ \text{m}$

This  result  provides  a fairly  accurate  determination  of  the  radius  of  a helium 
molecule  because  such  a molecule  is  very  much  like  an  impenetrable  sphere,  as  as- 
sumed in  Eq.  (18-22).  Its  shape  is  spherical  since  the  helium  molecule  consists  of  a 
single  helium  atom.  Furthermore,  a molecule  of  the  noble  gas  helium  has  a quite 
distinct  boundary  inside  which  another  helium  molecule  finds  it  very  difficult  to 
penetrate. 
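
The two steps of Example 18-3 are easily repeated by machine, which also makes it clear how sensitive the result is to the data. The following sketch uses only the equations and numbers of the example; the function names are chosen here for illustration.

```python
import math

R = 8.314e3   # universal gas constant, J/(kmol K)
A = 6.02e26   # Avogadro's number, molecules per kilomole

def excluded_volume(p, V, n, T):
    """Eq. (18-21b) solved for b:  b = V - nRT/p."""
    return V - n * R * T / p

def molecular_radius(b, N):
    """Eq. (18-22) solved for r:  r = (3b / (16 pi N))**(1/3)."""
    return (3.0 * b / (16.0 * math.pi * N)) ** (1.0 / 3.0)

# Data of Example 18-3.
n = 1.230e-3   # kmol
V = 1.000e-3   # m^3
T = 273.2      # K
p = 2.822e6    # Pa

b = excluded_volume(p, V, n, T)
N = n * A
r = molecular_radius(b, N)

print(f"b = {b:.2e} m^3")
print(f"r = {r:.2e} m")
```

Because b is the small difference between two nearly equal volumes, a change of a fraction of a percent in the measured pressure changes b substantially. The radius is less sensitive, since it depends only on the cube root of b.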


18-4  HEAT CAPACITY AND EQUIPARTITION

One of the striking successes of the kinetic theory of gases is its ability, even
in its simplest form, to predict the heat capacity of monatomic gases. This is
a major subject of this section.

In  Sec.  17-6  we  introduced  the  idea  of  the  heat  capacity  per  unit  mass 
of  some  material,  called  the  specific  heat  capacity  c,  through  Eq.  (17-23). 
Rearranging  that  equation  to  obtain  an  explicit  expression  for  its  value,  we 
have 

$c = \dfrac{1}{m} \dfrac{\Delta H}{\Delta T}$


Here m is the mass of the material, $\Delta H$ is the amount of heat added to it,
and $\Delta T$ is its temperature increase. Subsequently we have seen that
supplying heat to a gas is a matter of supplying energy to it. Thus we can just as
well express the specific heat capacity c of a gas in terms of the amount of
energy added to it, $\Delta E$, in raising its temperature by the amount $\Delta T$,
divided by the mass m of the gas:

$c = \dfrac{1}{m} \dfrac{\Delta E}{\Delta T}$

Even if c is a function of temperature, we can still use this relation to
evaluate it by taking the limit as $\Delta T$ approaches zero. Doing so, we obtain

$c = \dfrac{1}{m} \dfrac{dE}{dT}$  (18-23)

When the heat capacity per unit mass c is expressed in this form, the
proper SI units for c are joules per kilogram-kelvin [J/(kg·K)].

There  is  another  way  of  expressing  heat  capacity  which  we  will  find 
particularly useful here. It is to define a heat capacity per molecule, called the
molecular heat capacity c'. Its value is

$$c' = \frac{1}{N} \frac{dE}{dT} \tag{18-24}$$


In  this  expression  N is  the  number  of  molecules  present  in  the  gas,  E is  its 
heat  energy  content,  and  T is  its  temperature. 

792  Kinetic  Theory  and  Statistical  Mechanics 


Table 18-1  Molecular Heat Capacity of Monatomic Gases at Constant Volume

Gas          Approximate temperature (in K)    c'v
Helium        300                              1.506k
Neon          300                              1.520k
Argon         300                              1.508k
Sodium       1100                              1.512k
Potassium    1200                              1.521k
Mercury       650                              1.503k

As  is  discussed  in  detail  in  Chap.  19,  if  the  container  holding  a gas  ex- 
pands as  the  gas  is  heated,  then  not  all  the  energy  supplied  to  the  system 
goes  into  increasing  the  energy  content  of  the  gas.  Some  of  it  goes  into  the 
work  done  by  the  walls  of  the  container  as  they  push  against  whatever  is 
restraining  them.  To  avoid  this  complication,  here  we  consider  only  cir- 
cumstances in  which  a gas  is  confined  to  a container  of  constant  volume  as 
the gas is heated. In such circumstances the molecular heat capacity is
written as c'v and called the molecular heat capacity at constant volume.


The kinetic theory makes a very direct prediction of the molecular
heat capacity at constant volume for an ideal gas. The prediction is contained
in Eq. (18-19a):

$$E = \frac{3}{2} NkT$$

Calculate the derivative of E with respect to T, keeping N fixed because the
gas is confined. The result is

$$\frac{dE}{dT} = \frac{3}{2} Nk$$

Applying this to Eq. (18-24), we obtain immediately

$$c'_v = \frac{3}{2} k \quad \text{for an ideal gas} \tag{18-25}$$

The kinetic theory of an ideal gas predicts a molecular heat capacity at constant volume
with the temperature-independent value 3/2 times Boltzmann's constant k.

Table 18-1 lists measured values of c'v for a variety of monatomic
gases. In each case the conditions of the measurement are such that the
gas obeys well the ideal-gas law, pV = NkT. That is, the pressure is approximately
1 atm, and the temperature is well above the liquefaction point of
the gas. If the match between this new prediction of the kinetic theory for
the ideal gas and the measurements were perfect, the values of c'v in the last
column of the table would all be 1.500k. In fact, the largest deviation from
this value is a little more than 1 percent.
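The size of the deviations can be checked directly from the tabulated values. The following sketch uses the entries of Table 18-1 expressed in units of k; the names `measured`, `predicted`, and `deviations` are illustrative only.

```python
# Measured c'v for monatomic gases, in units of Boltzmann's constant k
# (values copied from Table 18-1).
measured = {
    "Helium": 1.506,
    "Neon": 1.520,
    "Argon": 1.508,
    "Sodium": 1.512,
    "Potassium": 1.521,
    "Mercury": 1.503,
}

predicted = 1.5  # kinetic-theory value, c'v = (3/2) k

# Percentage deviation of each measurement from the prediction.
deviations = {
    gas: 100.0 * (value - predicted) / predicted
    for gas, value in measured.items()
}

worst_gas = max(deviations, key=lambda gas: abs(deviations[gas]))

for gas, percent in deviations.items():
    print(f"{gas:10s} {percent:5.2f} %")
print(f"largest deviation: {worst_gas}")
```

Every measured value lies slightly above 1.5k, and the largest deviation belongs to potassium.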


For  polyatomic  gases  the  experimental  values  of  c'v  are  considerably 
larger  than  the  value  1.500k  predicted  by  the  kinetic  theory  of  an  ideal  gas. 
See  Table  18-2.  Clausius  was  the  first  to  suggest  the  explanation  of  this  fact 
which  follows. 

The molecular heat capacity at constant volume is a measure of how
much energy is absorbed by the molecules of a gas when the temperature
of the gas increases a certain amount. For monatomic gases the temperature
increase is the macroscopic expression of the increase in the average
kinetic energy of the random motion of the centers of mass of the gas molecules.
For polyatomic gases it must also be true that the temperature increase
measures the increase in the kinetic energy of the center-of-mass
motion of the molecules. After all, it is the fact that the centers of mass of
gas molecules are moving that causes them to collide with the walls of a container
and produce a pressure proportional to the temperature, whether
they are monatomic or polyatomic molecules. It follows from the excellent
agreement between the value predicted by the kinetic theory of the ideal
gas for c'v and the experimental values for monatomic gases that the only
way in which monatomic molecules absorb the energy supplied to heat the
gas is through an increase in the kinetic energy of their random center-of-mass
motions. This must be so since the theory cannot account for another
way to absorb energy. In contrast, the fact that the experimental values
of c'v for polyatomic gases are larger than those predicted by the theory
indicates that polyatomic molecules must have additional ways of absorbing
energy. If this is so, more energy will have to be supplied to a polyatomic
gas in order to produce a certain temperature increase since only part of
the energy will go into increasing the center-of-mass motion that registers
as a temperature increase.

Table 18-2  Molecular Heat Capacity of Polyatomic Gases at Constant Volume

Gas                     Atoms per molecule    Approximate temperature (in K)    c'v
Hydrogen (H2)           2                     300                               2.45k
Nitric oxide (NO)       2                     300                               2.51k
Oxygen (O2)             2                     300                               2.50k
Water (steam) (H2O)     3                     800                               3.54k
Ammonia (NH3)           4                     300                               3.42k
Carbon dioxide (CO2)    3                     300                               3.43k

Fig. 18-7  Schematic representations of (a) a monatomic molecule and (b) a diatomic
molecule.

Figure 18-7a shows schematically a monatomic molecule, such as helium.
It absorbs some of the energy supplied to heat the gas of which it is a
part by means of an increase in the kinetic energy of motion of its center of
mass. An expression for ε, that part of the total energy of the molecule
which comes from the energy it absorbs, contains three terms:

$$\epsilon = \frac{1}{2} m v_x^2 + \frac{1}{2} m v_y^2 + \frac{1}{2} m v_z^2 \tag{18-26}$$

Here  m is  the  mass  of  the  molecule,  and  vx,  vy,  vz  are  the  three  components 
of  the  velocity  of  its  center  of  mass.  (We  exclude  the  possibility  that  ab- 
sorbed energy  goes  into  kinetic  energy  of  rotation  about  a diameter  of  the 
molecule  because  the  results  of  measurements  of  c'v  given  in  Table  18-1 
show  this  does  not  happen,  as  discussed  above.  Newtonian  mechanics  pro- 
vides no  explanation  of  why  it  does  not  happen.  Quantum  mechanics  does, 
but  we  cannot  go  into  the  explanation  here.) 

Figure 18-7b is a schematic representation of a polyatomic molecule
containing  two  atoms  of  the  same  species,  such  as  hydrogen.  Just  as  the 
monatomic  helium  molecule  does,  the  polyatomic  molecule  absorbs  some 
of  the  energy  supplied  to  heat  the  gas  of  which  it  is  a part  by  means  of  an 
increase  in  the  kinetic  energy  of  motion  of  its  center  of  mass.  But  in  addi- 
tion, it  absorbs  some  energy  by  means  of  an  increase  in  the  kinetic  energy 
of  its  rotation  about  its  center  of  mass.  (For  the  moment  we  assume  the 




spacing  between  the  atoms  is  fixed.)  The  expression  for  the  part  of  the  mol- 
ecule’s total  energy  originating  in  the  energy  it  absorbs  contains  five  terms: 

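$$\epsilon = \frac{1}{2} m v_x^2 + \frac{1}{2} m v_y^2 + \frac{1}{2} m v_z^2 + \frac{1}{2} I_1 \omega_1^2 + \frac{1}{2} I_2 \omega_2^2 \tag{18-27}$$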

The  first  three  terms  represent  the  kinetic  energy  of  motion  of  the  center 
of mass, just as for a monatomic molecule. In the last two terms, I1 and I2
are the molecule's moments of inertia for rotation about the axes labeled 1
and 2 in the figure. Axes 1 and 2 are perpendicular to each other, and both
are perpendicular to axis 3 extending along a common diameter of the
atoms in the molecule. (The molecule does not rotate about a diameter of
both atoms because of the same quantum-mechanical property that prevents
a monatomic molecule from rotating about a diameter of its single
atom.) The quantities ω1 and ω2 are the components of the molecule's
angular velocity along axes 1 and 2. Thus the last two terms represent the
kinetic energy of rotation of the molecule about axes perpendicular to the
one that is a diameter of both atoms.

If you review briefly the steps involved in deriving the result

$$c'_v = \frac{3}{2} k$$

for the constant-volume molecular heat capacity of an ideal gas or, equally
well, a monatomic gas, you will see that the 3 in the factor 3/2 arises because,
on the average, molecules of the gas absorb the same amount of energy in
each of three different ways. Each of these corresponds to one of the three
terms in the energy expression of Eq. (18-26). Next note that the values of
c'v listed in Table 18-2 for hydrogen, and other gases with diatomic molecules,
are all close to the value

$$c'_v = \frac{5}{2} k$$

Here  the  numerator  in  the  fraction  multiplying  k is  5.  And  there  are  five 
different  ways  that  the  molecule  has  for  absorbing  energy,  corresponding 
to  the  five  terms  in  Eq.  (18-27).  This  is  not  a coincidence,  but  a consequence 
of  an  important  theorem,  which  we  now  consider. 


According  to  the  theorem  of  equipartition  of  energy,  if  molecules  are  in 
thermal  equilibrium  with  their  surroundings,  then  on  the  average  they  absorb  an 
equal  amount  of  energy  in  each  way  that  they  have  of  absorbing  that  energy.  The 
name  of  the  theorem  reflects  the  fact  that  the  energy  absorbed  by  mole- 
cules is  partitioned  (divided)  equally,  on  the  average,  between  the  different 
ways  the  molecules  have  of  absorbing  energy.  [The  equipartition  theorem 
applies  only  if  the  terms  in  the  expression  for  the  molecule’s  total  energy 
are  each  proportional  to  the  square  of  a velocity  component  or  to  the 
square  of  a coordinate — including  angular  velocity  components  and  angu- 
lar coordinates.  All  the  cases  we  consider  satisfy  this  restriction.  See  Eqs. 
(18-26)  and  (18-27),  and  also  Eq.  (18-28).] 

We  will  give  some  justification  to  the  equipartition  theorem  soon.  But 
first we apply it to a gas of hydrogen molecules. The theorem requires that
each  of  the  five  different  ways  which  molecules  of  the  gas  have  of  absorbing 
energy  receive,  on  the  average,  the  same  amount  of  energy,  providing  the 
molecules  are  in  thermal  equilibrium  with  one  another.  Thus  when  5 units 
of  energy  is  absorbed  by  the  molecules  of  the  gas  in  equilibrium,  3 units 
goes  into  increasing  the  kinetic  energy  of  center-of-mass  motion,  while  2 
units  goes  into  increasing  the  kinetic  energy  of  rotation.  Hence  only  3 parts 




in  5 of  the  absorbed  energy  will  be  the  increase  in  kinetic  energy  of 
center-of-mass  motion  that  leads  to  a temperature  increase.  To  put  it  an- 
other way,  5 energy  units  must  be  added  to  the  gas  to  produce  the  same 
temperature  increase  that  would  be  produced  by  adding  3 energy  units  if 
the gas were monatomic. As a consequence c'v is 5/3 times as large as the
value of this quantity for a monatomic gas. Thus the equipartition theorem
predicts the value c'v = (5/3)(3/2)k = (5/2)k, in good agreement with the results
of measurement quoted in Table 18-2.

Figure 18-7b indicates another possible motion of the hydrogen molecule.
In this motion the two atoms move with respect to the molecular
center of mass so that the separation between their centers oscillates about
its equilibrium value. The atoms oscillate like two equal balls connected to
opposite ends of a spring. Can molecules in a hydrogen gas absorb energy
by means of an increase in the energy of this motion? Not at T = 300 K, the
temperature at which the value c'v ≈ (5/2)k quoted in Table 18-2 was measured.
We can say this because the factor 5/2 has been completely accounted
for by the two motions already discussed.

But at much higher temperatures experimental evidence indicates that
absorption of heat energy into vibrational motion does take place. (The absence
of this absorption, except at very high temperatures, is a phenomenon
of quantum mechanics. It is explained in Example 18-6 at the end of
Sec. 18-5.) At T ≈ 3000 K the measured value of c'v is quite close to (7/2)k.
This is interpreted to mean that in these circumstances the expression for
the total absorbed energy of a hydrogen molecule contains seven terms.
The expression is

$$\epsilon = \frac{1}{2} m v_x^2 + \frac{1}{2} m v_y^2 + \frac{1}{2} m v_z^2 + \frac{1}{2} I_1 \omega_1^2 + \frac{1}{2} I_2 \omega_2^2 + \frac{1}{2} \mu u^2 + \frac{1}{2} k d^2 \tag{18-28}$$

The next-to-last term on the right side of this equation is the kinetic energy
of vibrational motion of the molecule, evaluated by using the reduced-mass
procedure of Sec. 11-4. That is, μ is the reduced mass of one of the hydrogen
atoms, and u² is the square of its speed of vibration relative to the
other atom of the molecule. In the last term, k is a constant (not Boltzmann's
constant) that plays the same role as the force constant in a harmonic
oscillator consisting of a body attached to one end of a spring. The quantity d
is the difference between the center-to-center separation of the two atoms and
the equilibrium value of that separation. Thus the last term is the potential
energy stored in the "spring" (actually, in the electric interaction between the
two atoms) involved in the vibrational motion.

Each  term  in  Eq.  (18-28)  corresponds  to  a different  way  that  molecules 
of  hydrogen  gas,  in  thermal  equilibrium  at  a very  high  temperature,  have 
of  absorbing  energy.  According  to  the  equipartition  theorem,  they  absorb 
the  same  average  amount  of  energy  in  each  way.  Hence  7 units  of  energy 
must  be  added  to  the  gas  to  produce  the  temperature  increase  that  would 
result from adding 3 units if the gas were monatomic. And therefore the
molecular heat capacity at constant volume should have the value
c'v = (7/3)(3/2)k = (7/2)k, as is confirmed by measurement.

The  equipartition  theorem  can  be  proved  from  very  general  arguments.  But 
the  proof  is  above  the  level  of  this  book,  so  we  justify  it  by  the  following  consider- 
ations: 

1.  In  cases  where  a molecule  absorbs  energy  only  by  increasing  the  kinetic 

energy  of  its  center-of-mass  motion  [as  in  Eq.  (18-26)],  the  equipartition  of  this  en- 




ergy  among  the  x,  y,  and  z components  of  this  motion  reflects  the  fact  that  there 
must  be  symmetry  among  the  x,  y,  and  z directions.  This  is  essentially  the  same 
argument  as  used  in  Sec.  18-2  to  derive  the  ideal-gas  law  from  kinetic  theory. 

2.  When  the  molecule  can  also  absorb  energy  by  increasing  the  kinetic  energy 
of  rotation  about  its  center  of  mass  [as  in  Eq.  (18-27)],  then  we  can  say  that  a 
self-regulation  process — much  like  the  one  described  in  Sec.  18-3  to  explain 
thermal  equilibrium — operates  to  keep  the  kinetic  energy  partitioned  equally 
among  the  terms  associated  with  the  various  components  of  the  motion  of  its 
center  of  mass  and  of  the  rotation  about  its  center  of  mass.  For  instance,  if  by 
chance  the  molecule  happens  to  gain  rotational  energy  in  excess  of  the  average 
value,  then  when  it  next  collides  with  the  wall  of  the  container  (or  with  another 
molecule),  it  is  likely  that  it  will  lose  some  of  this  energy  and  gain  some  energy  of 
motion  of  its  center  of  mass.  Think  of  what  would  happen  if  a dumbbell  spinning 
rapidly  about  its  axis  were  thrown  slowly  at  a rigid  wall. 

3.  Next  consider  a molecule  in  which,  in  addition,  vibrational  motion  is  pos- 
sible [as  in  Eq.  (18-28)].  The  equipartition  of  absorbed  energy  between  the  poten- 
tial and  kinetic  energies  associated  with  the  vibration  is  easy  to  understand.  Just 
look  at  Fig.  8-16,  which  is  a plot  of  the  potential  and  kinetic  energies  of  a har- 
monic oscillator  over  several  cycles  of  oscillation.  As  was  noted  when  the  figure 
was  presented,  it  shows  that  even  for  a single  harmonic  oscillator  the  potential  en- 
ergy averaged  over  any  cycle  equals  the  kinetic  energy  averaged  over  the  same 
cycle. 
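The equality of the cycle-averaged kinetic and potential energies in item 3 can be checked numerically. The sketch below is an illustration under assumed values: the mass, spring constant, and amplitude are arbitrary choices, and `cycle_averages` is a hypothetical helper, not something defined in the text.

```python
import math


def cycle_averages(mass=1.0, spring_k=4.0, amplitude=0.5, steps=10000):
    """Average kinetic and potential energy of x(t) = A cos(wt) over one cycle."""
    omega = math.sqrt(spring_k / mass)      # angular frequency
    period = 2.0 * math.pi / omega
    kinetic_sum = 0.0
    potential_sum = 0.0
    for i in range(steps):
        t = (i + 0.5) * period / steps      # midpoint of each time slice
        x = amplitude * math.cos(omega * t)
        v = -amplitude * omega * math.sin(omega * t)
        kinetic_sum += 0.5 * mass * v * v
        potential_sum += 0.5 * spring_k * x * x
    return kinetic_sum / steps, potential_sum / steps


avg_kinetic, avg_potential = cycle_averages()
print(f"average kinetic energy   = {avg_kinetic:.6f}")
print(f"average potential energy = {avg_potential:.6f}")
```

Both averages come out equal to one-quarter of the spring constant times the square of the amplitude, that is, half of the oscillator's total energy.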


We can obtain two very useful results by noting that the molecular heat
capacity at constant volume is just the rate of change with temperature of
the average energy content of gas molecules. That is,

$$c'_v = \frac{d\langle\epsilon\rangle}{dT} \tag{18-29}$$

Then we note that in all the cases discussed the value of c'v is observed to be

$$c'_v = \frac{\mathcal{N}}{2} k \tag{18-30}$$

where $\mathcal{N}$ is the number of terms in the expression for the energy ε of a molecule.
For each term in the expression for the energy content of the molecules of a
gas, there is a contribution of k/2 to the molecular heat capacity at constant volume of
the gas, k being Boltzmann's constant. This is an important generalization of
Eq. (18-25), c'v = (3/2)k, which we derived for an ideal monatomic gas using the
kinetic theory in its simplest form.
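The counting rule of Eq. (18-30) is simple enough to tabulate. The following sketch applies it to the three gas cases discussed so far and compares the five-term prediction with the diatomic entries of Table 18-2; the function name `molecular_heat_capacity` and the dictionary labels are illustrative choices.

```python
from fractions import Fraction

BOLTZMANN = 1.38e-23  # Boltzmann's constant k, in J/K


def molecular_heat_capacity(n_terms):
    """c'v in units of k: one half for each term in the energy expression."""
    return Fraction(n_terms, 2)


# Number of terms in Eqs. (18-26), (18-27), and (18-28).
energy_terms = {
    "monatomic (translation)": 3,
    "diatomic (translation + rotation)": 5,
    "diatomic (translation + rotation + vibration)": 7,
}

for label, n_terms in energy_terms.items():
    c_over_k = molecular_heat_capacity(n_terms)
    c_joule = float(c_over_k) * BOLTZMANN
    print(f"{label}: c'v = {c_over_k} k = {c_joule:.3e} J/K")

# Diatomic entries of Table 18-2 (measured near 300 K), in units of k.
measured_diatomic = {"H2": 2.45, "NO": 2.51, "O2": 2.50}
predicted = float(molecular_heat_capacity(5))
for gas, value in measured_diatomic.items():
    print(f"{gas}: measured {value:.2f}k, predicted {predicted:.2f}k")
```

Keeping the result as an exact fraction of k makes the comparison with the tables direct, since the tables also quote c'v as a multiple of k.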

To obtain the second result, we note also that if we write the average
value of the energy ⟨ε⟩ as

$$\langle\epsilon\rangle = \frac{\mathcal{N}}{2} kT \tag{18-31}$$

then applying Eq. (18-29) produces immediately the observed values of c'v
in Eq. (18-30). Hence Eq. (18-31) must give a correct description of the
average energy of the molecules (or of atoms if the molecules are monatomic)
in a gas.

In fact, it can be shown that Eq. (18-31) applies to atoms, molecules, or
entities of any type that are in a solid, liquid, gas, or any state. In words, this
form of the theorem of equipartition of energy says the following: If an entity
is in thermal equilibrium with its surroundings at absolute temperature T, then
for each term in the expression for its energy content ε there is a contribution of kT/2 to
the average value ⟨ε⟩ of that energy, k being Boltzmann's constant.




Let  us  use  this  form  of  the  equipartition  theorem  to  predict  the  heat 
capacity of a solid. Since it has been fruitful to imagine a diatomic molecule
as  a pair  of  balls  connected  by  a spring,  we  extend  the  picture  to  a solid  by 
considering  it  as  a very  large  number  of  balls  connected  by  a cubical  net- 
work of  springs.  This  is  just  the  picture  of  a solid  that  was  presented  in  Fig. 
18-5.  Each  ball  represents  an  atom.  Each  spring  represents  the  electric  in- 
teraction between  neighboring  atoms.  When  the  atoms  are  in  thermal  equi- 
librium at  a temperature  greater  than  absolute  zero,  each  oscillates  about  a 
certain  position  as  a three-dimensional  harmonic  oscillator.  The  expression 
for the energy content ε of one of them contains six terms, a kinetic energy
and a potential energy for each of the three coordinates. That is, the oscillator
energy is

$$\epsilon = \frac{1}{2} m v_x^2 + \frac{1}{2} k x^2 + \frac{1}{2} m v_y^2 + \frac{1}{2} k y^2 + \frac{1}{2} m v_z^2 + \frac{1}{2} k z^2 \tag{18-32}$$

According to the equipartition theorem, the atoms of the solid have an
average energy content ⟨ε⟩ given by

$$\langle\epsilon\rangle = \frac{\mathcal{N}}{2} kT$$

Here $\mathcal{N}$ = 6 because there are six terms in Eq. (18-32). We adapt Eq.
(18-29) to the case at hand by writing it as

$$c' = \frac{d\langle\epsilon\rangle}{dT}$$

In this equation c' represents the heat capacity per atom of the solid. The
subscript v, implying the constant-volume restriction, has been dropped
because when a solid is heated, its volume changes very little even if it is
unconstrained. Differentiating the equation for ⟨ε⟩, we obtain

$$c' = \frac{6}{2} k$$

or

$$c' = 3k \tag{18-33}$$

Table 18-3 gives experimental values of c' for a variety of solids. The
correspondence with our prediction is striking. The fact that for many
solids c' ≈ 3k at a temperature in the vicinity of or higher than room temperature
was discovered experimentally by the chemist P. L. Dulong
(1785-1838) and the physicist A. T. Petit (1791-1820), and it is called the
Dulong-Petit law. In general, metals obey this law, as well as most nonmetals.

Table 18-3  Heat Capacities per Atom of Various Solids

Solid         Approximate temperature (in K)    c'
Aluminum      400                               3.05k
Gold          300                               2.99k
Iodine        240                               3.09k
Lead          300                               3.09k
Phosphorus    300                               2.96k
Silver        400                               3.06k

It  is  in  agreement  with  the  theory  to  find  that  most  nonmetallic  solids 
conform  to  the  Dulong-Petit  law.  But  it  is  puzzling  that  most  metals  con- 
form as  well,  because  metals  are  known  to  have  within  them  free  electrons 
whose numbers are comparable to the number of atoms. If these free electrons
acted like the molecules of an ideal gas, they would contribute an additional
(3/2)k per electron to the heat capacity of the solid, giving a total heat
capacity per atom significantly larger than that predicted by the Dulong-Petit
law. The unlooked-for conformance of metals to this law, as well as
the nonconformance of certain nonmetals such as diamond and graphite,
can be explained only by employing quantum mechanics to modify the expression
for the energy ε absorbed by the individual constituents of the
solid.

You  should  use  the  more  detailed  understanding  you  now  have  of  the 
behavior  of  atoms  in  a solid  to  again  go  through  the  argument  in  Sec.  18-3 
concerning  energy  transfer  between  gas  molecules  and  wall  atoms  for  a gas 
in  thermal  equilibrium  with  the  walls  of  its  container. 


EXAMPLE  18-4 

Find  the  heat  capacity  per  kilogram  of  copper  by  assuming  that  the  Dulong-Petit 
law  applies  to  it.  Then  compare  your  results  with  the  experimental  value  given  in 
Table  17-3  and  comment  on  the  applicability  of  the  law  in  this  case.  The  atomic 
weight  of  copper  is  63.5.  That  is,  1 kmol  of  copper  has  a mass  of  63.5  kg. 

■ First you must relate the heat capacity per kilogram, c, to the heat capacity per
atom, c'. In words, the relation is

$$\frac{\text{heat capacity}}{\text{kilogram}} = \frac{\text{heat capacity}}{\text{atom}} \times \frac{\text{atoms}}{\text{kilomole}} \times \frac{\text{kilomoles}}{\text{kilogram}}$$

or

$$\frac{\text{heat capacity}}{\text{kilogram}} = \frac{\text{heat capacity/atom} \times \text{atoms/kilomole}}{\text{kilograms/kilomole}}$$

In symbols, it is

$$c = \frac{c'A}{W}$$

where A is Avogadro's number and W is numerically equal to the atomic weight and
has the units kilograms per kilomole (kg/kmol). Note that the heat capacity per kilogram
is inversely proportional to the atomic weight.

Assuming the Dulong-Petit law applies, you write c' = 3k, where k is Boltzmann's
constant. Then you have

$$c = \frac{3kA}{W}$$

The numerical value is

$$c = \frac{3 \times 1.38 \times 10^{-23}\ \text{J/K} \times 6.02 \times 10^{26}\ \text{kmol}^{-1}}{63.5\ \text{kg} \cdot \text{kmol}^{-1}}$$

or

$$c = 392\ \text{J/(K} \cdot \text{kg)}$$

Before comparing this with the datum in Table 17-3, you must express c in
terms of kilocalories, instead of joules. Using Eq. (17-27), 1 kcal = 4186 J, you have

$$c = 392\ \text{J/(K} \cdot \text{kg)} \times \frac{1\ \text{kcal}}{4186\ \text{J}}$$

or

$$c = 0.0936\ \text{kcal/(K} \cdot \text{kg)}$$

The experimental value for c, quoted in Table 17-3 in terms of the specific heat
ratio, is

$$c = 0.0921\ \text{kcal/(K} \cdot \text{kg)}$$

The  good  agreement  shows  that  the  Dulong-Petit  law  applies  well  to  copper. 
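The calculation in this example can be reproduced in a few lines. This is a minimal sketch using the constants quoted in the example; the function name `dulong_petit_specific_heat` is an illustrative choice.

```python
BOLTZMANN = 1.38e-23      # Boltzmann's constant k, in J/K
AVOGADRO = 6.02e26        # Avogadro's number A, in atoms per kilomole
JOULES_PER_KCAL = 4186.0  # conversion factor from Eq. (17-27)


def dulong_petit_specific_heat(atomic_weight):
    """Heat capacity per kilogram, c = 3kA/W, in J/(K kg).

    atomic_weight is W in kg/kmol, numerically equal to the atomic weight.
    """
    return 3.0 * BOLTZMANN * AVOGADRO / atomic_weight


c_joule = dulong_petit_specific_heat(63.5)   # copper
c_kcal = c_joule / JOULES_PER_KCAL
measured_kcal = 0.0921                       # value quoted from Table 17-3

print(f"c = {c_joule:.0f} J/(K kg)")
print(f"c = {c_kcal:.4f} kcal/(K kg)")
print(f"difference from measurement: "
      f"{100.0 * (c_kcal - measured_kcal) / measured_kcal:.1f} %")
```

Because the script does not round c to 392 J/(K·kg) before converting, its last digit in kilocalories can differ slightly from the hand calculation. The predicted value is within about 2 percent of the measured one.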


18-5  THE  BOLTZMANN  FACTOR

We turn now from the kinetic theory to a closely related but more general
theory of the behavior of a system containing a large number of objects,
such as gas molecules. The more general theory is called statistical mechanics.

Much  has  been  learned  when  large  numbers  of  molecules  are  dealt 
with  by  considering  average  values.  For  instance,  we  have  been  able  to  de- 
fine and  evaluate  the  very  important  macroscopic  quantity  temperature  in 
terms  of  the  average  energy  of  the  molecules  of  a gas.  But  the  average  is 
only  the  simplest  and  most  familiar  of  a number  of  important  quantities 
which  describe  a collection  of  objects,  such  as  the  molecules  of  a gas.  And 
the  average  does  not  convey  all  the  useful  information  there  is  to  know 
about  a collection  of  objects. 

To  give  an  example,  consider  the  two  groups  of  10,000  persons  whose 
ages  are  graphed  in  Fig.  18-8.  Half  of  the  first  group  consists  of  persons 
who  are  10  years  old,  and  the  other  half  comprises  persons  who  are  69 
years  old.  The  second  group  consists  of  equal  numbers  of  persons  who  are 
35,  36,  37,  . . . , 44  years  old.  In  each  case  the  average  age  of  the  group  is 
⟨A⟩ = 40 years. However, the death rate of the first group will be much
greater  than  that  of  the  second  because  death  rate  does  not  depend  linearly 
on  age  but  instead  increases  more  and  more  rapidly  as  old  age  is  ap- 
proached. 
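The two populations can be compared numerically. The sketch below rests on one assumption drawn from the way ages are recorded in Fig. 18-8: a person listed as age A has passed the Ath birthday but not the next, so is on average A + 0.5 years old. The helper `mean_and_spread` and the use of the standard deviation as a measure of spread are illustrative additions, not part of the text.

```python
import math


def mean_and_spread(distribution):
    """Mean age and standard deviation for a {age: count} distribution.

    Assumes persons listed at age A are, on average, A + 0.5 years old.
    """
    total = sum(distribution.values())
    mean = sum((age + 0.5) * count
               for age, count in distribution.items()) / total
    variance = sum(count * (age + 0.5 - mean) ** 2
                   for age, count in distribution.items()) / total
    return mean, math.sqrt(variance)


# Population 1: half are 10 years old, half are 69 years old.
group1 = {10: 5000, 69: 5000}
# Population 2: equal numbers at each age from 35 through 44.
group2 = {age: 1000 for age in range(35, 45)}

mean1, spread1 = mean_and_spread(group1)
mean2, spread2 = mean_and_spread(group2)

print(f"population 1: mean {mean1:.1f} yr, spread {spread1:.1f} yr")
print(f"population 2: mean {mean2:.1f} yr, spread {spread2:.1f} yr")
```

The two means coincide at 40 years, while the spreads differ by roughly a factor of 10, which is the kind of information the average alone does not carry.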

In  the  above  example,  important  information  is  contained  in  the  distri- 
bution of  ages  which  is  not  conveyed  by  the  average  age.  A complete  descrip- 
tion of  the  distribution  is  given  by  its  distribution  function.  For  instance, 
the distribution functions n1(A) and n2(A) of Fig. 18-8 give the number of
persons of every possible age comprising the sample populations. The unit
of age is taken to be the year, and the distribution functions divide their
sample populations into subgroups that each span one year. The function
n1(A) specifies the number of persons per unit age that have age A for population
1, and the function n2(A) has the same role for population 2. In this
section we develop the distribution function n(ε) for a large number of
identical objects which are in thermal equilibrium with one another. The
quantity ε is the energy of an object, and the distribution function gives the
number of objects per unit energy that have energy ε. If used appropriately,
the results that we will obtain pertain to objects of any type, such as
gas molecules.

Like the age distribution functions of Fig. 18-8, the energy distribution
functions which we will consider must, in principle, be bar-graph functions.
This is because the number of objects in any sample on which we actually
make measurements is finite and we can measure the energy of each object
only within finite limits. But although finite, the number of objects is very
large in most circumstances, and the energy resolution of a measurement
(that is, the ability to distinguish small differences in energy) can be very
good. So it will be reasonable for us to approximate the bar-graph function
by a continuous function, at a certain point in the development.

Fig. 18-8  Age distribution functions n1(A) and n2(A) for two populations of 10,000
persons each. The age distribution functions are represented by bar graphs because
of the way in which the ages are specified. A person who is 38 years old, for
example, has passed his or her 38th birthday but has not yet attained the 39th birthday.
[Horizontal axes: age A (in yr), from 0 to 70, with ⟨A⟩ marked at 40.]

At  the  beginning  of  Sec.  18-3  we  discussed  qualitatively  how  two  popu- 
lations of  ideal-gas  molecules  which  are  mixed  come  to  thermal  equilibrium 
by  means  of  a sequence  of  energy  exchanges.  (These  were  the  indirect  ex- 
changes from  gas  molecule  to  wall  atom  to  gas  molecule.)  Let  us  analyze  a 
quantitative  “experiment”  in  a similar  vein.  Imagine  an  isolated  system  con- 
sisting of  a large  number  of  separated  harmonic  oscillators.  Each  of  these 
identical  oscillators  is  a body  connected  to  a spring  and  able  to  move  only 
along  a fixed  line.  Any  of  these  oscillators  is  able  to  interact  with  any  other. 
Hence  the  exchanges  of  energy  between  the  oscillators,  which  are  required 
to  maintain  thermal  equilibrium,  can  take  place.  We  make  a great  simplifi- 
cation in  the  analysis,  without  affecting  the  basic  results  obtained  from  it, 
by  assuming  that  the  energy  exchanges  take  place  between  oscillators  in- 
teracting only  two  at  a time,  and  never  three  or  more  at  a time. 

The  mechanism  of  the  interaction  is  not  important.  But  if  you  want  a picture 
of  how  it  might  take  place,  you  can  visualize  one  free  body  that  moves  very  slowly 
along  a random  path  until  it  happens  upon  an  oscillator.  The  body  in  the  oscillator 
collides  with  the  free  body,  giving  it  part  of  or  all  the  oscillator’s  energy.  The  free 
body  then  carries  this  energy  until  by  chance  it  collides  with  the  body  in  another 
oscillator.  Essentially  all  its  energy  is  given  to  this  oscillator,  and  the  free  body 
moves  away  very  slowly  along  some  random  path.  Eventually  it  comes  across  an- 
other oscillator  and  starts  another  energy  exchange  process. 

We  have  no  idea  which  oscillator  will  interact  with  which  other  oscil- 
lator first,  which  pair  will  interact  next,  and  so  forth.  All  we  know  is  that  it  is 
possible  for  any  one  to  interact  with  any  other.  We  therefore  make  the  plau- 
sible assumption  that,  in  the  absence  of  a reason  to  think  otherwise,  it  is 
equally  probable  that  any  oscillator  will  interact  with  any  other  oscillator.  This  is 
called a postulate of equal a priori probabilities. (A priori is a Latin term
meaning  “before  the  fact.”)  The  assumption  is  the  same  as  the  one  which 




leads  us  to  believe  that  heads  and  tails  are  equally  likely  in  a coin-flipping 
game  or  that  a playing  card  pulled  from  a shuffled  deck  is  just  as  likely  to 
be  the  jack  of  diamonds  as  the  six  of  hearts.  We  therefore  feel  justified  in 
choosing  pairs  of  oscillators  at  random  when  we  need  to  decide  which  oscil- 
lator is  going  to  interact  with  which  other  one  next. 

When  two  oscillators  interact,  any  fraction  of  the  energy  of  one  can  be 
transferred  to  the  other.  The  first  oscillator  may  give  up  a large  part  of  its 
energy  to  the  second,  for  example,  or  the  second  may  give  up  a small  part 
of  its  energy  to  the  first.  We  assume  energy  is  conserved  in  each  of  these  in- 
teractions, so  that  the  total  energy  of  the  isolated  system  of  oscillators  re- 
mains constant.  Thus  the  sum  of  the  energies  of  two  oscillators  cannot 
change  in  an  interaction  between  them.  But  we  have  no  idea  how  the  in- 
teraction redistributes  this  energy  between  them.  Therefore  again  we  in- 
voke a postulate  of  equal  a priori  probabilities,  and  we  assume  that  in  each 
interaction  all  possible  redistributions  of  energy  are  equally  probable.  Hence  we 
choose  at  random  a fraction  whose  value  can  lie  equally  well  anywhere 
between 0 and 1, let the final energy of one of the interacting oscillators be
this  fraction  of  the  sum  of  their  initial  energies,  and  let  the  other  oscillator 
have  whatever  energy  remains. 

In  the  “experiment”  we  begin  by  assigning  each  oscillator  of  the  system 
a certain  amount  of  energy.  That  is,  we  start  by  arbitrarily  imposing  some 
energy  distribution  function  on  the  system.  Next  two  oscillators  are  picked  at 
random,  and  their  energy  is  redistributed  between  them  at  random.  This 
process  is  repeated  a number  of  times.  We  then  stop,  inspect  the  energy  of 
each  oscillator,  and  from  these  energies  determine  the  energy  distribution 
function  of  the  system  at  this  stage.  Then  we  continue  the  energy  redistri- 
bution processes  until  we  stop  again  to  determine  the  new  energy  distribu- 
tion function.  And  so  on. 

What  we  find  is  that  in  general  the  energy  distribution  function 
changes  from  the  form  we  imposed  on  it  initially.  The  change  is  rapid  at 
first,  but  then  becomes  more  gradual  until  the  distribution  function  settles 
down  to  an  equilibrium  form.  After  it  reaches  the  equilibrium  form,  contin- 
uing the  process  of  randomly  redistributing  energy  between  pairs  of  oscil- 
lators only  makes  the  energy  distribution  function  execute  small  fluctua- 
tions about  the  equilibrium  form. 

Our  “experiment”  is  not  just  a thought  experiment.  It  is  also  an  experi- 
mental simulation  that  can  be  carried  out  on  a programmable  pocket 
calculator — provided  that  the  calculator  has  enough  addressable  storage 
registers  to  keep  track  of  the  energy  of  each  of  a reasonably  large  number 
of  oscillators.  Or  a simulation  can  be  carried  out  on  a small  computer.  A 
simulation  is  possible  because  there  is  a quite  simple  way  to  make  a com- 
puting device  generate  the  uniformly  distributed,  random  numbers  needed  to 
pick  out  oscillators  for  interaction  and  redistribute  their  total  energy 
among  them.  An  exercise  at  the  end  of  this  chapter  indicates  the  steps  in- 
volved in  programming  a device  to  carry  out  the  simulation. 
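
The simulation described above can be sketched in a few lines of Python. This is
only an illustrative sketch, not the program used to produce the figures of this
section: the function names `simulate` and `histogram`, the bin width of one
energy unit, and the random seed are all assumptions made for the example. It
starts 80 oscillators at energy 4, performs 800 random pairwise redistributions,
and prints a bar graph next to the count expected in each bin from the
decreasing exponential with average energy 4.

```python
import math
import random

def simulate(energies, n_interactions, rng):
    """Redistribute energy at random between randomly chosen pairs of oscillators."""
    energies = list(energies)
    n = len(energies)
    for _ in range(n_interactions):
        # Pick two distinct oscillators; every pair is equally likely.
        i, j = rng.sample(range(n), 2)
        total = energies[i] + energies[j]
        # Uniformly distributed fraction between 0 and 1.
        fraction = rng.random()
        energies[i] = fraction * total
        energies[j] = total - energies[i]  # energy of the pair is conserved
    return energies

def histogram(energies, bin_width=1.0):
    """Count the oscillators in each energy bin of the given width."""
    counts = {}
    for e in energies:
        k = int(e // bin_width)
        counts[k] = counts.get(k, 0) + 1
    return counts

rng = random.Random(0)            # fixed seed (an arbitrary choice) for repeatability
n_osc, mean_energy = 80, 4.0
final = simulate([mean_energy] * n_osc, 800, rng)
counts = histogram(final)

for k in range(max(counts) + 1):
    # Expected count in bin [k, k+1): integral of (N/<e>) exp(-e/<e>) over the bin.
    expected = n_osc * (math.exp(-k / mean_energy) - math.exp(-(k + 1) / mean_energy))
    print(f"{k:2d}-{k + 1:2d}  {'#' * counts.get(k, 0):<24s} expected {expected:5.1f}")
```

Changing the initial list of energies (for example, giving one oscillator all
320 units) lets you repeat the comparison for a different starting
distribution.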

Figure  18-9  is  an  energy  distribution  function  n(e)  showing  the 
number  of  oscillators  per  unit  energy  at  energy  e.  It  was  obtained  in  an 
experimental  simulation  run  as  follows.  The  system  contained  80  oscil- 
lators. Initially,  each  was  assigned  the  same  energy,  e = 4,  with  energy 
measured  in  some  arbitrary  unit.  Thus  a bar  graph  of  the  energy  distribu- 
tion function  initially  imposed  on  the  system  would  consist  of  a single  bar  at 
e = 4 of  height  80.  There  had  been  a total  of  800  interactions  between 


802  Kinetic  Theory  and  Statistical  Mechanics 


Fig. 18-9  Energy distribution function obtained in an experimental simulation
in which all the oscillators were initially assigned the energy ε = 4. Thus the
average energy per oscillator is ⟨ε⟩ = 4.


pairs  of  oscillators  when  the  data  in  the  figure  were  recorded.  Hence  the 
system  had  had  ample  opportunity  to  approach  the  equilibrium  distribu- 
tion of  energy  among  its  oscillators.  This  was  confirmed,  subsequently,  by 
letting  the  energy  redistribution  process  continue,  stopping  it  periodically 
to  record  the  new  distribution  function.  It  was  found  always  to  maintain 
the  general  form  indicated  in  the  figure  by  the  continuous  curve.  All  that 
happened  was  that  the  subsequent  distribution  functions  fluctuated  mildly 
about  the  continuous  curve.  Thus,  within  the  accuracy  of  the  experiment, 
the  continuous  curve  describes  the  equilibrium  energy  distribution  func- 
tion. 

The  continuous  curve  is  also  a plot  of  the  decreasing  exponential  func- 
tion 

\[ n(\epsilon) = \frac{80}{4}\, e^{-\epsilon/4} \tag{18-34a} \]

The  number  80  is  the  number  of  oscillators  in  the  system.  The  number  4 is 
the  average  energy  per  oscillator.  It  has  this  value  since  initially  each  of  the 
80  oscillators  had  energy  4,  so  initially  the  total  energy  in  the  system  was 
320.  The  total  energy  is  not  changed  by  the  energy-conserving  interactions. 
Hence  at  all  times  the  average  energy  of  the  oscillators  is  320/80  = 4.  Using 
N to  represent  the  number  of  oscillators  in  the  system  and  (e)  to  represent 
their  average  energy,  we  can  express  the  decreasing  exponential  function 
as 


\[ n(\epsilon) = \frac{N}{\langle\epsilon\rangle}\, e^{-\epsilon/\langle\epsilon\rangle} \tag{18-34b} \]

Evidently  the  equilibrium  energy  distribution  function  is  not  one  in 
which  all  oscillators  have  equal  energies.  But  what  happens  if  we  start  with 
a different  initial  distribution?  Figure  18-10  shows  the  result,  after  1200  in- 
teractions, of  starting  with  the  same  total  energy  of  320  units.  But  in  this 
simulation  oscillator  40  was  given  all  the  energy  to  begin  with,  and  the 
others  were  given  zero  initial  energy.  Thus  a bar  graph  of  the  initially  im- 
posed energy  distribution  function  would  contain  a single  bar  at  e = 320  of 
height  1.  At  first,  most  of  the  randomly  chosen  interactions  are  between 
two  oscillators  which  both  have  zero  energy,  and  these  interactions  have  no 
effect  on  the  distribution  function.  It  therefore  takes  more  interactions  for 
the  system  to  reach  equilibrium.  Nevertheless,  the  same  decreasing  expo- 
nential function 


\[ n(\epsilon) = \frac{80}{4}\, e^{-\epsilon/4} = \frac{N}{\langle\epsilon\rangle}\, e^{-\epsilon/\langle\epsilon\rangle} \]

gives  an  equally  good  description  of  the  equilibrium  energy  distribution 
function. 

What  happens  if  the  average  energy  (e)  of  the  oscillators  is  changed? 
Figure  18-11  shows  results  obtained  by  rerunning  the  simulation  with  each 
oscillator  given  an  initial  energy  e = 2,  so  that  (e)  = 2.  This  energy  distribu- 
tion function  was  recorded  after  800  interactions.  Figure  18-12  shows  the 
energy  distribution  function  recorded  when  oscillator  40  was  initially  given 
energy  e = 160,  the  others  were  given  no  energy  initially,  and  1200  interac- 
tions were  allowed  to  take  place.  In  this  run  also  the  average  energy  of  the 
oscillators  was  (e)  = 160/80  = 2.  Again,  the  two  bar  graphs  of  Figs.  18-11 
and  18-12  look  very  similar,  and  again  they  are  well  fitted  by  the  same  de- 
creasing exponential  curve.  Here  the  curve  is  that  of  the  function 

\[ n(\epsilon) = \frac{80}{2}\, e^{-\epsilon/2} \tag{18-35a} \]

But  this  function  can  also  be  written  as 

\[ n(\epsilon) = \frac{N}{\langle\epsilon\rangle}\, e^{-\epsilon/\langle\epsilon\rangle} \tag{18-35b} \]

The form is identical to that of Eq. (18-34b).

It  is  not  surprising  that  the  curves  of  Figs.  18-11  and  18-12  are  steeper 
than  those  of  Figs.  18-9  and  18-10.  There  is  less  average  energy,  and  so  the 
energies  of  individual  oscillators  tend  to  lie  at  lower  values.  However,  the 
total  number  of  oscillators  is  the  same  in  all  cases,  so  that  the  total  areas 
under  the  curves  must  be  the  same.  In  order  for  this  condition  to  be  satis- 
fied, n(e)  approaches  a larger  value  as  e approaches  zero  in  the  cases  of 
Figs.  18-11  and  18-12  than  in  the  cases  of  Figs.  18-9  and  18-10. 

Whatever  the  average  energy,  the  equilibrium  energy  distribution 
function  appears  to  be  strongly  weighted  in  favor  of  low  energies.  That  is,  a 
quite  small  number  of  oscillators  with  energies  considerably  larger  than  the 
average  value  (e)  are  “balanced”  by  a much  larger  number  with  energies 
less  than  (e).  There  are  two  ways  to  see  in  a general  manner  why  this  should 
be  so.  First,  an  oscillator  is  just  as  likely  to  lose  as  to  gain  energy  in  any  given 



Fig. 18-11  Energy distribution function obtained in an experimental simulation
in which all the oscillators were initially assigned the energy ε = 2. Thus
the average energy per oscillator is ⟨ε⟩ = 2.

Fig. 18-12  Energy distribution function obtained in an experimental simulation
in which all the oscillators were initially assigned zero energy except
oscillator 40, which was assigned the energy ε = 160. Thus the average energy
per oscillator is ⟨ε⟩ = 2.




interaction.  It  therefore  takes  a fortuitous — and  relatively  improbable  — 
series  of  interactions  for  an  oscillator  to  acquire  an  energy  considerably  in 
excess  of  the  average.  And  if  one  oscillator  does  obtain  such  a large  energy, 
there  is  correspondingly  less  energy  available  for  distribution  among  the 
other  oscillators,  which  therefore  tend  to  be  crowded  toward  the  low- 
energy  end  of  the  distribution.  Second,  consider  a particular  interaction 
between  two  oscillators.  After  the  interaction,  neither  of  these  oscillators 
can  ever  have  an  energy  greater  than  the  sum  of  the  initial  energies  of  the 
two.  The  sum  thus  acts  as  a high-energy  cutoff  on  the  possibilities  for  that 
interaction.  But  all  lower  energies  are  possible  for  one  of  the  oscillators. 
The  smaller  a postinteraction  energy  is  for  one  of  the  oscillators,  the  more 
likely  it  is  to  lie  below  the  cutoff  and  thus  be  allowable.  In  particular,  the 
value ε = 0 is always allowable and thus is the most probable energy, as the
graphs  of  Figs.  18-9  through  18-12  suggest. 


At this stage we do not have an explanation of why the equilibrium energy
distribution function has the particular form given in Eqs. (18-34b) and
(18-35b). That is, at present we cannot justify the decreasing exponential
factor $e^{-\epsilon/\langle\epsilon\rangle}$ in these equations. But we can
justify the constant factor N/⟨ε⟩. The value of the constant factor was not
chosen to achieve the best fit to the bar graphs. Rather, it was chosen to
normalize the curves. That is, the constant factor in the equilibrium energy
distribution function of Eqs. (18-34b) and (18-35b) was chosen to satisfy the
condition that the sum of the number of oscillators at all energies must be
equal to the total number of oscillators N. This sum can be calculated by
taking the number of oscillators per unit energy n(ε), multiplying by the
energy interval dε, and then integrating over all possible values of ε. Since
the value obtained must be N, we must have the normalization condition


\[ \int_0^\infty n(\epsilon)\, d\epsilon = N \tag{18-36} \]


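We can readily prove that the function n(ε) of Eqs. (18-34b) and (18-35b)
satisfies the normalization condition. Substituting the function into Eq. (18-36)
and using the fact that ⟨ε⟩ is a constant to write (1/⟨ε⟩) dε as d(ε/⟨ε⟩), we have

\[ \int_0^\infty \frac{N}{\langle\epsilon\rangle}\, e^{-\epsilon/\langle\epsilon\rangle}\, d\epsilon = N \int_0^\infty e^{-\epsilon/\langle\epsilon\rangle}\, d(\epsilon/\langle\epsilon\rangle) \]

If we now write x = ε/⟨ε⟩, the integral on the right side assumes a form to which
we can apply Eq. (7-22) and find its value. That is, we have for this integral

\[ \int_0^\infty e^{-x}\, dx = \left(-e^{-x}\right)_{x=\infty} - \left(-e^{-x}\right)_{x=0} = 0 - (-1) \]

or

\[ \int_0^\infty e^{-x}\, dx = 1 \]

Using this value, we obtain

\[ \int_0^\infty \frac{N}{\langle\epsilon\rangle}\, e^{-\epsilon/\langle\epsilon\rangle}\, d\epsilon = N \tag{18-37} \]

as required.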
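
The normalization condition can also be checked numerically. The following is a
minimal sketch, assuming the values N = 80 and ⟨ε⟩ = 4 of the first simulation;
the helper names `n_eq` and `integrate`, the midpoint rule, and the cutoff of
the upper limit at 50⟨ε⟩ (beyond which the integrand is negligible) are
choices made for this illustration only. It also evaluates the average energy
implied by the distribution as a consistency check.

```python
import math

def n_eq(eps, n_osc, mean_energy):
    """Equilibrium distribution n(eps) = (N/<eps>) exp(-eps/<eps>)."""
    return (n_osc / mean_energy) * math.exp(-eps / mean_energy)

def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation to the integral of f from a to b."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

n_osc, mean_energy = 80, 4.0
upper = 50 * mean_energy  # stands in for infinity; exp(-50) is negligible

# Total number of oscillators: integral of n(eps) d(eps).
total = integrate(lambda e: n_eq(e, n_osc, mean_energy), 0.0, upper)

# Average energy: integral of eps * n(eps) d(eps), divided by the total number.
average = integrate(lambda e: e * n_eq(e, n_osc, mean_energy), 0.0, upper) / total

print(f"number of oscillators = {total:.4f}")
print(f"average energy        = {average:.4f}")
```

Within the accuracy of the numerical integration, the first number reproduces N
and the second reproduces ⟨ε⟩.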




The decreasing exponential factor $e^{-\epsilon/\langle\epsilon\rangle}$ in Eqs. (18-34b) and (18-35b)
can  be  justified,  too.  We  will  do  this  by  using  the  techniques  of  statistical 
mechanics  to  derive  the  factor.  But  as  a preliminary,  we  must  develop  a few 
simple  concepts  of  statistics. 


If  you  flip  a coin,  a postulate  of  equal  a priori  probabilities  tells  you 
that  heads  (H)  and  tails  (T)  are  equally  likely.  On  this  basis,  you  can  predict 
that  a large  number  of  flips  will  result  in  approximately  equal  numbers  of 
heads and tails. We define the normalized a priori probability of an outcome
(or its probability, for short) to be the predicted fraction of the total
number  of  trials  which  result  in  that  outcome.  In  this  case,  for  example,  the 
probability  of  heads,  P(H),  is 


\[ P(\mathrm{H}) = \frac{\text{predicted number of heads}}{\text{total number of coin flips}} = \frac{1}{2} \tag{18-38} \]


Of course, P(T) = 1/2 also. This means not that the final result of a series of
flips will always be exactly 50 percent heads and 50 percent tails, but rather
that this is the most likely result. The greater the departure of a particular
result from the most likely result, the less likely it is to be observed. (We
return to this point in Sec. 18-7.)


Let  us  throw  four  coins  at  once.  Coin  1 will  fall  in  a certain  way,  coin  2 
in  a certain  way,  and  so  on.  Suppose,  for  example,  that  the  result  of  the 
throw  is  as  follows: 


Coin:    1  2  3  4

Result:  H  T  T  T

This  arrangement,  which  is  one  specific  possible  outcome  of  a four-coin 
toss,  is  called  by  physicists  a microstate.  Every  possible  microstate  results 
from  a fall  of  four  coins,  each  of  which  is  equally  likely  to  fall  heads  or  tails. 
Thus  for  the  system  of  four  coins  all  microstates  are  equally  probable.  The 
same  is  true  for  any  other  system.  That  is,  microstates  are  always  specified 
in  such  a way  that,  for  any  given  system,  all  microstates  are  equally  probable. 

What is the probability of the microstate specified above? We know
that the probability of coin 1 falling heads as required is 1/2. Now we also
require that coin 2 fall tails. But of all the times coin 1 falls heads, coin 2
simultaneously falls tails in only half of them. Thus the probability of the
two coins falling as desired is 1/2 × 1/2 = 1/4. But of all the times this
happens, coin 3 falls tails, as required, only half the time. So the first
three coins fall as desired for this microstate with a probability
1/4 × 1/2 = 1/8. By extending the argument to require that coin 4 also fall
tails, we find the probability of the microstate to be

\[ P(\mathrm{HTTT}) = \frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = \frac{1}{16} \]

If,  as  in  this  example,  a microstate  depends  on  the  joint  occurrence  of  two  or 
more  independent  outcomes  (here  the  falling  of  the  individual  coins),  its 
probability  is  the  product  of  the  individual  probabilities  of  the  independent 
outcomes.  That  is,  the  so-called  joint  probability  is  given  by  the  product 

\[ P(\mathrm{HTTT}) = P(\mathrm{H})\,P(\mathrm{T})\,P(\mathrm{T})\,P(\mathrm{T}) \]

In general,



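Table 18-4  Microstates of a Four-Coin Toss

Coin:  1  2  3  4

       H  H  H  H     Macrostate “four heads”: One microstate
       - - - - - - - - - - - - - - - - - - - - - - - - - - - -
       H  H  H  T
       H  H  T  H     Macrostate “three heads”: Four microstates
       H  T  H  H
       T  H  H  H
       - - - - - - - - - - - - - - - - - - - - - - - - - - - -
       H  H  T  T
       H  T  H  T
       H  T  T  H     Macrostate “two heads”: Six microstates
       T  H  H  T
       T  H  T  H
       T  T  H  H
       - - - - - - - - - - - - - - - - - - - - - - - - - - - -
       H  T  T  T
       T  H  T  T     Macrostate “one head”: Four microstates
       T  T  H  T
       T  T  T  H
       - - - - - - - - - - - - - - - - - - - - - - - - - - - -
       T  T  T  T     Macrostate “zero heads”: One microstate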


\[ P(\text{outcome 1 and outcome 2 and} \ldots \text{and outcome } N) = P(\text{outcome 1})\,P(\text{outcome 2}) \cdots P(\text{outcome } N) \tag{18-39} \]

We can check this result by listing all possible outcomes of the four-coin
toss and counting them. They are listed in Table 18-4. There are 16
possibilities, all being equally probable microstates. The microstate HTTT is
one of them, and so its probability is indeed

\[ P(\mathrm{HTTT}) = \frac{1}{16} \]

Suppose,  now,  that  you  are  playing  a game  in  which  you  bet  on  the 
number  of  heads  that  come  up  in  a four-coin  toss.  If  your  bet  is  that  one 
head  and  three  tails  will  turn  up,  you  do  not  particularly  care  which  coin 
falls  which  way,  but  only  that  any  one  coin,  and  only  that  one,  fall  heads. 
Physicists call such an outcome a macrostate. In Table 18-4, all the
microstates belonging to the same macrostate are listed together, and the
macrostates are separated by dashed lines. There are four equally probable
ways (in other words, microstates) in which one head and three tails can
come up. Since any one of these ways will do, the total probability P(one
head) of this occurrence is the sum of the four equal probabilities
P(HTTT), P(THTT), P(TTHT), and P(TTTH). That is,

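\[ P(\text{one head}) = P(\mathrm{HTTT}) + P(\mathrm{THTT}) + P(\mathrm{TTHT}) + P(\mathrm{TTTH}) = \frac{1}{16} + \frac{1}{16} + \frac{1}{16} + \frac{1}{16} = \frac{1}{4} \]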

And  since  all  the  microstates  have  the  same  probability,  we  can  also  calcu- 
late the  probability  of  the  macrostate  as  follows: 

\[ P(\text{one head}) = 4P(\mathrm{HTTT}) = 4 \times \frac{1}{16} = \frac{1}{4} \]

So in a fair game the odds against the one-head macrostate are 3 to 1: it is
expected to turn up in one toss out of four.
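
The counting of microstates and macrostates for the four-coin toss can be
reproduced by brute-force enumeration. The sketch below is only an
illustration; the variable names (`microstates`, `multiplicity`, `p_macro`) are
chosen for this example. It lists every microstate, groups the microstates by
number of heads, and applies the rule that the probability of a macrostate is
the sum of the probabilities of its equally probable microstates.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Every microstate of a four-coin toss: one H or T for each of coins 1 to 4.
microstates = list(product("HT", repeat=4))

# All microstates are equally probable.
p_micro = Fraction(1, len(microstates))

# A macrostate is labeled by its number of heads; count its microstates.
multiplicity = Counter(state.count("H") for state in microstates)

# Probability of a macrostate = (number of its microstates) x (probability of one).
p_macro = {heads: count * p_micro for heads, count in multiplicity.items()}

for heads in sorted(p_macro, reverse=True):
    print(f"{heads} heads: {multiplicity[heads]} microstates, probability {p_macro[heads]}")
```

Using exact fractions rather than floating-point numbers keeps the results in
the same form as the hand calculation.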

We have just deduced and made use of the following rule: The probability
of a macrostate is the number of microstates included in it multiplied by the
probability of any one of the microstates. The rule is used also in Example 18-5.


EXAMPLE  18-5 

What  is  the  probability  of  throwing  a 3 with  a pair  of  dice? 

■ Each  die  has  six  sides  numbered  1 through  6 (by  means  of  spots).  You  employ 
a postulate  of  equal  a priori  probabilities  to  predict  that  each  of  the  six  numbers  is 
an equally probable result of a throw. Thus the probability of each is 1/6. When you
throw both dice, the probability of any microstate must be the joint probability

\[ P(\text{any microstate}) = \frac{1}{6} \times \frac{1}{6} = \frac{1}{36} \]

You  now  count  the  microstates  which  comprise  macrostate  “3.”  There  are  exactly 
two: 

Die:   1   2

       1   2
                 Macrostate “3”: Two microstates
       2   1

Applying the rule that we have deduced, we evaluate the probability P(3) of the
macrostate for throwing a 3 by multiplying the number of its microstates, 2, by
the probability of any of them, 1/36. That is,

\[ P(3) = 2 \times \frac{1}{36} = \frac{1}{18} \]

If  you  are  willing  to  spend  some  time  throwing  a pair  of  dice  and  recording  the 
number of times you throw a 3, as well as the total number of throws, you can test
this  prediction.  When  you  have  accumulated  enough  data  for  your  test  to  be  statisti- 
cally significant,  you  will  find  the  prediction  to  be  correct. 
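
If you would rather let a computer throw the dice, the following sketch does
both the counting and the test. It is an illustration under stated assumptions:
the number of simulated throws (36,000) and the random seed are arbitrary
choices, and the variable names are invented for the example. The exact
probability is obtained by enumerating the 36 microstates; the simulated
frequency should lie close to it but will fluctuate from run to run if the seed
is changed.

```python
import random
from fractions import Fraction
from itertools import product

# Enumerate the 36 equally probable microstates of a pair of dice.
microstates = list(product(range(1, 7), repeat=2))
p_micro = Fraction(1, len(microstates))

# Microstates belonging to macrostate "3".
threes = [state for state in microstates if sum(state) == 3]
p_three = len(threes) * p_micro
print("microstates of macrostate 3:", threes)
print("predicted probability:", p_three)

# Simulated throws as a stand-in for the experiment with real dice.
rng = random.Random(1)   # arbitrary seed, for repeatability
n_throws = 36_000
hits = sum(1 for _ in range(n_throws)
           if rng.randint(1, 6) + rng.randint(1, 6) == 3)
frequency = hits / n_throws
print(f"observed frequency in {n_throws} throws: {frequency:.4f}")
```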


One  of  the  features  of  the  rule  for  calculating  the  probability  of  a 
macrostate  plays  a very  important  role  throughout  the  remainder  of  this 
chapter.  It  is  this:  The  probability  of  a certain  macrostate  of  the  system  is  propor- 
tional to  the  number  of  microstates  included  in  that  macrostate. 

We  are  now  ready  to  apply  these  statistical  ideas  to  a system  compris- 
ing a set  of  many  identical  objects  in  thermal  equilibrium.  In  this  work  we 
will derive the decreasing exponential factor in Eqs. (18-34b) and (18-35b),
which  is  called  the  Boltzmann  factor.  In  addition  to  being  the  principal  topic 
of  this  section,  the  Boltzmann  factor  is  central  to  the  theory  of  statistical 
mechanics.  It  is  of  great  importance  because  it  plays  a key  role  in  the 
description  of  microscopic  systems  in  almost  all  fields  of  physics.  The  Boltz- 
mann factor  is  also  of  great  importance  in  applications  of  physics,  such  as 
chemistry  and  electrical  engineering.  For  example,  the  operation  of  the 
transistor  depends  on  the  way  in  which  the  Boltzmann  factor  governs  the 
distribution of energy among the microscopic entities which carry electric
current  through  a transistor. 

Consider  a system  containing  a set  of  objects  which  interact  to  maintain 
thermal  equilibrium.  The  total  energy  of  the  system  has  the  value  E,  and 
this  value  remains  constant  because  the  system  is  isolated  from  its  sur- 
roundings. All  the  objects  in  the  system  are  identical,  and  there  are  a very 
large  number  N of  them.  They  can  be  harmonic  oscillators,  atoms,  mole- 
cules, or  anything,  providing  only  that  they  satisfy  what  can  be  called  the 
independence  requirement.  That  is,  we  require  that  the  fact  that  one  ob- 
ject happens  to  have  a particular  energy  have  no  direct  influence  on  the 




probability  of  another  object’s  having  the  same  energy.  (Of  course,  there 
will  always  be  an  indirect  influence  if  one  object  has  an  energy  greater  than 
half  the  total  energy  of  the  system.  In  this  case  another  object  cannot  have 
the  same  energy  because  if  it  did,  the  sum  of  the  energies  of  the  two  objects 
would  exceed  the  available  energy.  But  this  indirect  influence  has  to  do 
with  energy  conservation  and  is  not  a violation  of  the  independence  re- 
quirement.) The  independence  requirement  is  satisfied  for  all  systems  in 
the  newtonian  domain  and  by  systems  in  the  quantum  domain  in  any  of  the 
cases  with  which  we  will  be  concerned.  But  there  are  very  important  cases 
in  the  quantum  domain  where  the  requirement  is  not  satisfied  and  where 
the  results  we  will  obtain  from  this  argument  therefore  do  not  apply. 

Let one of the objects have energy ε₁. Specify nothing about the individual
energies of the other objects. Although the individual energies of
these N − 1 objects are unknown, it is known that they must have total
energy E − ε₁. This energy can be distributed among the N − 1 objects in a
great many ways, each of which is an equally probable microstate. The
number of such microstates depends on E − ε₁. Since E is a constant, we
can also say that the number of microstates depends on ε₁. The probability
of the macrostate in which one of the objects has energy ε₁ is proportional
to the number of these microstates, all of which are included within the
macrostate. Hence, this probability depends on ε₁ and can be written P(ε₁).
The functional dependence of P(ε₁) on ε₁ is yet to be determined.

Now let some other object have a specified energy, the energy ε₂, and
apply the argument of the preceding paragraph. We conclude immediately
that the probability of this macrostate can be written as P(ε₂). The two
probabilities are given by the same function since the objects are identical.
But here the function is evaluated for the energy ε₂.

Next consider a situation in which we specify that some object has
energy ε₁ and that some other object has energy ε₂. Specifying that some
object has one energy ε₁ does not affect the probability that some other object
has some other energy ε₂ because the objects satisfy the independence
requirement. In other words, the probabilities remain P(ε₁) and P(ε₂) since
they are independent probabilities. In view of this independence, the joint
probability that an object has energy ε₁ and another object has energy ε₂ is
given by the product P(ε₁)P(ε₂).

Now we modify slightly the first argument to find a different expression
for this joint probability. When two objects have energies ε₁ and ε₂, the
remaining N − 2 objects of the system must have energy E − (ε₁ + ε₂). Just
as in the first argument, we say that there are a large number of ways in
which this energy can be distributed over the N − 2 objects, each being an
equally probable microstate of the macrostate in which one object has
energy ε₁ and another has energy ε₂. The number of these microstates
depends on E − (ε₁ + ε₂) or, since E is a constant, simply on ε₁ + ε₂. Since the
probability of the macrostate is proportional to the number of microstates,
it depends on ε₁ + ε₂. We write this probability as P(ε₁ + ε₂). The symbol P
used for this function is the same as that used before in order to indicate
that the functional dependence of this probability on its argument (the
quantity within parentheses) is the same as that of the functions P(ε₁) and
P(ε₂).

This  is  true  even  though  in  the  present  case  the  microstates  involved  in  deter- 
mining the  probability  of  a macrostate  are  microstates  of  a collection  of  N - 2 ob- 
jects of  unspecified  energies,  whereas  in  the  earlier  case  we  were  concerned  with 
microstates  of  collections  of  N — 1 objects  of  unspecified  energies.  To  see  this, 




consider  a situation  in  which  the  total  energy  of  the  objects  of  unspecified  en- 
ergies has  the  same  value  in  both  cases.  Then  by  comparing  the  present  case  to  the 
earlier  case,  we  find  there  will  be  fewer  microstates  in  the  macrostate  because 
there  are  only  N - 2 objects  of  unspecified  energies  instead  of  N - 1.  But  there 
will  also  be  fewer  microstates  altogether — that  is,  for  all  macrostates.  Thus  the 
normalized  probability  of  a microstate  within  the  macrostate  will  be  greater. 
When  N is  large  enough  that  the  difference  between  N — 2 and  N — 1 is  very 
small  compared  to  N,  the  decrease  in  the  number  of  microstates  within  the  mac- 
rostate is  just  compensated  by  the  increase  in  the  probability  of  each  microstate, 
and  there  is  no  change  in  the  probability  of  the  macrostate. 

We have found two different expressions for the probability that some
object has energy ε₁ and some other has energy ε₂. The first is P(ε₁)P(ε₂).
The second is P(ε₁ + ε₂). Since the two express the same thing, their values
are equal:

\[ P(\epsilon_1)\,P(\epsilon_2) = P(\epsilon_1 + \epsilon_2) \tag{18-40} \]

Equation (18-40) shows that the probability function P(ε) is subject to a very
strong mathematical restriction: The product of the values of the function for
any two particular arguments must equal the value of the function for the sum of
those two arguments. The only class of mathematical functions for which this
is true is the class of exponential functions. It is most convenient to use e,
the base of natural logarithms, as the base in writing such a function. (Doing
so involves no loss of generality. An exponential function written to any other
base a can always be converted to the base e by using the relation
$a^x = e^{x \ln a}$.) Consequently, the probability of finding an object at
energy ε must be either of the form $P(\epsilon) = Ce^{\beta\epsilon}$ or of
the form $P(\epsilon) = Ce^{-\beta\epsilon}$, where C and β are positive
constants yet to be determined. The form with the positive exponent would
satisfy the mathematical conditions imposed on the function. But it would mean
that the probability of finding an object with a certain energy ε increases
without limit as ε increases. This would imply that the system of objects had
infinite energy in all circumstances. So the form $Ce^{\beta\epsilon}$ must be
rejected on physical grounds. We conclude that the probability P(ε) that an
object has energy ε is given by

\[ P(\epsilon) = C e^{-\beta\epsilon} \tag{18-41} \]

The  following  considerations  explain  why  P(e)  decreases  with  increas- 
ing e.  The  larger  the  energy  e of  one  object  in  the  system  whose  total  en- 
ergy is  E,  the  smaller  the  energy  E — e that  remains  for  the  other  N — 1 ob- 
jects of  the  system.  The  smaller  the  energy  E — e,  the  fewer  ways  there  are 
for  it  to  be  shared  among  the  N — 1 objects.  That  is,  the  fewer  the  number 
of  microstates  that  belong  to  the  macrostate  in  which  one  object  has  energy 
e.  Since  the  probability  of  the  macrostate  is  proportional  to  the  number  of 
its  microstates,  that  probability  becomes  smaller  as  e becomes  larger.  And 
the  probability  of  the  macrostate  is  just  equal  to  P(e). 
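
The decrease in the number of microstates can be made concrete with a small
counting exercise. The sketch below rests on an assumption that is not part of
the argument above: the energy is taken to come in identical, indivisible
units, so that the ways of sharing it can be counted exactly. (The number of
ways to share q units among m distinguishable objects is the binomial
coefficient C(q + m − 1, m − 1).) The values N = 80 and E = 320 are borrowed
from the simulation, and the names `ways` and `prob` are invented for the
example. The ratio of successive probabilities comes out nearly constant for
energies small compared to E, which is the behavior of a decreasing
exponential.

```python
from fractions import Fraction
from math import comb

def ways(units, objects):
    """Ways to share `units` identical energy units among `objects` distinguishable objects."""
    return comb(units + objects - 1, objects - 1)

n_objects, total_energy = 80, 320   # assumed values, matching the simulation

# All microstates of the whole system, used to normalize the probabilities.
all_ways = ways(total_energy, n_objects)

def prob(eps):
    """Probability that one chosen object holds exactly `eps` units."""
    # The other N - 1 objects share the remaining E - eps units.
    return Fraction(ways(total_energy - eps, n_objects - 1), all_ways)

probs = [prob(eps) for eps in range(total_energy + 1)]

# Ratio of successive probabilities for the first 20 energies.
ratios = [float(probs[k + 1] / probs[k]) for k in range(20)]

for k in range(0, 20, 5):
    print(f"eps = {k:2d}   P = {float(probs[k]):.5f}   P(eps+1)/P(eps) = {ratios[k]:.4f}")
```

The slow drift in the ratio reflects the finite size of the system; it shrinks
as N and E are made larger with E/N held fixed.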

It  is  most  useful  to  say  that  P(e)  is  the  probability  that  “a  single-object 
state  at  energy  e is  occupied.”  A single-object  state  is  a complete  specifica- 
tion of  whatever  quantities  must  be  known  in  order  to  know  everything  of 
interest  about  the  particular  condition  of  a single  object.  Saying  that  a 
single-object  state  at  energy  e is  occupied  amounts  to  saying  that  the  condi- 
tion of  the  object  is  specified  by  the  quantities  associated  with  that  state.  Since 
the  single-object  state  is  at  energy  e,  in  saying  that  it  is  occupied  we  specify 
that  the  energy  of  the  object  is  e.  The  utility  of  this  way  of  speaking  will 
become  more  apparent  as  we  continue. 




We next evaluate the constant β in Eq. (18-41) by focusing our attention
on objects of a definite type and then using that equation to calculate
the  average  energy  of  these  objects.  Since  Eq.  (18-41)  applies  to  objects  of 
any  type  that  satisfy  the  independence  requirement,  we  can  use  any  such 
type  we  want.  We  will  use  the  simplest  type — harmonic  oscillators.  To  cal- 
culate (e),  the  average  over  a distribution  of  identical  harmonic  oscillators 
of  the  energy  e of  each  oscillator,  we  take  every  possible  value  of  e,  multiply 
it  by  the  number  of  oscillators  having  this  value,  sum  all  these  products, 
and  then  divide  by  the  total  number  of  oscillators.  (This  procedure  for  cal- 
culating a “weighted  average”  is  just  the  one  you  would  follow  in  calcu- 
lating the  average  age  (A)  of  the  persons  in  some  general  population.  It  can 
also  be  applied  to  the  particularly  simple  populations  in  Fig.  18-8  and  leads 
to  the  values  of  (A)  already  quoted  for  these  populations.  Try  it.) 
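The weighted-average recipe just described can be sketched in a few lines of Python. This is only an illustration: the function name `weighted_average` and the three-age population are invented here and are not the populations of Fig. 18-8.

```python
def weighted_average(values, counts):
    """Average of `values`, each weighted by how many members have that value."""
    total_members = sum(counts)
    if total_members == 0:
        raise ValueError("population is empty")
    # Multiply each value by the number of members having it, sum, then divide.
    return sum(v * n for v, n in zip(values, counts)) / total_members

# Hypothetical population: 3 people aged 20, 5 aged 30, 2 aged 50.
ages = [20, 30, 50]
people = [3, 5, 2]
mean_age = weighted_average(ages, people)  # (60 + 150 + 100) / 10 = 31.0
```

Replacing ages by energies and head counts by numbers of oscillators gives exactly the average ⟨ε⟩ that the text goes on to compute.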

Consider  an  energy  interval  from  e to  e + de  whose  size  de  is  small 
enough  that  there  is  very  little  change  in  the  value  of  P(e)  over  the  interval. 
(Physically,  the  energy  interval  is  not  infinitesimal  since  it  must  contain  a 
number  of  single-object  states.  But  the  energy  interval  is  small  enough  to 
be  treated  mathematically  as  an  infinitesimal.  Hence  the  symbol  de  is 
appropriate.)  The  probability  that  any  one  oscillator  has  an  energy  in  the 
interval  is  the  probability  P(e)  that  one  of  its  single-object  states  in  the  in- 
terval is  occupied,  multiplied  by  the  number  of  these  states  contained  in  the 
interval.  And  the  number  of  single-object  states  contained  in  the  interval  is 
G de , where  G is  the  number  of  single-object  states  per  unit  energy  and  de  is 
the  size  of  the  energy  interval. 

Now the single-object states of a harmonic oscillator are uniformly
distributed in energy. That is, G has the same value for all values of e. This
reasonable-sounding statement is equivalent to the assumption that every
possible  redistribution  of  energy  between  a pair  of  interacting,  identical 
harmonic  oscillators  is  equally  probable  in  the  experimental  simulation 
considered  earlier.  To  see  this,  note  that  if  the  value  of  G were  not  inde- 
pendent of  e but,  say,  had  a maximum  in  a certain  range  of  e,  then  there 
would  be  a tendency  for  the  redistributions  of  energy  to  be  such  that  one  of 
the  oscillators  ends  up  with  an  energy  in  this  range  just  because  there  are 
more  single-object  states  there.  In  Chap.  31  simple  quantum  mechanics  will 
be  used  to  analyze  the  behavior  of  a harmonic  oscillator.  There  you  will  see 
that  the  lack  of  a dependence  of  G on  e is  not  just  a reasonable-sounding 
statement  or  an  assumption.  It  is  a necessary  consequence  of  the  properties 
of  a harmonic  oscillator.  Furthermore,  you  will  see  that  any  harmonic  oscil- 
lator which  is  not  in  the  quantum  domain  has  many  single-object  states  in 
even  a small  energy  interval  de.  Thus  it  makes  sense  to  speak  of  there  being 
a number  of  these  states  in  the  interval  because  we  assume  the  harmonic  os- 
cillators are  not  in  the  quantum  domain. 

We  continue  with  our  task  of  evaluating  (e).  Since  G de  is  the  number 
of  single-object  states  in  the  energy  interval  de  and  since  P(e)  is  the  proba- 
bility that  any  particular  one  of  these  states  in  the  interval  is  occupied,  the 
probability  that  some  single-object  state  in  the  interval  is  occupied  is  P(e)G  de. 
This  is  also  the  probability  that  one  oscillator  will  have  an  energy  in  the 
interval.  Since  there  are  a total  of  N oscillators  in  the  system,  the  total 
number  having  an  energy  in  the  interval  is  NP(e)G  de.  As  we  said  before,  we 
calculate  the  average  energy  of  an  oscillator  by  multiplying  this  quantity  by 
e,  summing  over  all  possible  values  of  e,  and  then  dividing  by  the  total 
number of oscillators in the system. Because there are many single-object
states in the energy interval de, we can perform summations by integrating.
Thus we have

$$\langle\epsilon\rangle = \frac{\int_0^\infty \epsilon\, N P(\epsilon)\, G\, d\epsilon}{\int_0^\infty N P(\epsilon)\, G\, d\epsilon}$$  (18-42)


The  numerator  on  the  right  side  of  this  equation  is  the  energy  content  of 
the  system  in  the  interval  de , integrated  over  all  e.  So  it  is  the  total  energy 
content  of  the  system.  The  denominator  is  the  total  number  of  oscillators  in 
the  interval  de,  integrated  over  all  e.  Hence  it  is  the  total  number  of  oscil- 
lators in  the  system.  Thus  Eq.  (18-42)  can  be  interpreted  as  evaluating  the 
average  energy  (e)  of  an  oscillator  in  the  system  by  dividing  the  total  energy 
content  of  the  system  by  the  total  number  of  oscillators  it  contains. 

Using  Eq.  (18-41)  to  evaluate  P(e),  we  have 


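$$\langle\epsilon\rangle = \frac{\int_0^\infty \epsilon\, N C e^{-\beta\epsilon}\, G\, d\epsilon}{\int_0^\infty N C e^{-\beta\epsilon}\, G\, d\epsilon}$$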

Now  the  quantities  N,  C,  and,  in  particular,  G do  not  depend  on  e.  Thus 
they  may  be  taken  through  the  integral  signs  and  then  canceled,  to  simplify 
the  fraction  to 


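$$\langle\epsilon\rangle = \frac{\int_0^\infty \epsilon\, e^{-\beta\epsilon}\, d\epsilon}{\int_0^\infty e^{-\beta\epsilon}\, d\epsilon}$$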


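To facilitate evaluation of the integrals, we multiply both numerator and
denominator of the fraction by two factors of β, and we use the fact that β
does not depend on ε so that it may be moved at will through integral and
differential signs. We obtain

$$\langle\epsilon\rangle = \frac{\int_0^\infty \beta\epsilon\, e^{-\beta\epsilon}\, d(\beta\epsilon)}{\beta \int_0^\infty e^{-\beta\epsilon}\, d(\beta\epsilon)}$$

Then setting x = βε, we have

$$\langle\epsilon\rangle = \frac{\int_0^\infty x\, e^{-x}\, dx}{\beta \int_0^\infty e^{-x}\, dx}$$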

The  integral  in  the  denominator  has  been  shown  in  Eq.  (18-37)  to  have 
the  value  1.  The  integral  in  the  numerator  is  a difficult  one,  but  its  value 
can  be  obtained  immediately  by  consulting  almost  any  table  of  definite  inte- 
grals. This  value  also  is  1.  So  we  finally  obtain  the  result 


$$\langle\epsilon\rangle = \frac{1}{\beta}$$

or

$$\beta = \frac{1}{\langle\epsilon\rangle}$$  (18-43)

For the system of harmonic oscillators, the constant β equals the reciprocal
of their average energy ⟨ε⟩.
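If you would rather not rely on a table of integrals, the two integrals can be checked numerically. The sketch below uses a plain trapezoidal rule and cuts the infinite upper limit off at a large finite value. The helper names (`integrate`, `mean_energy`), the cutoff, and the sample value of β are assumptions made for this illustration.

```python
import math

def integrate(f, lower, upper, steps=100_000):
    """Composite trapezoidal rule for f on [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.5 * (f(lower) + f(upper))
    for i in range(1, steps):
        total += f(lower + i * h)
    return total * h

def mean_energy(beta, cutoff=50.0):
    """Ratio of the two integrals for <eps>, truncated at eps = cutoff / beta."""
    upper = cutoff / beta  # assumption: exp(-50) is negligible, so the tail is dropped
    numerator = integrate(lambda eps: eps * math.exp(-beta * eps), 0.0, upper)
    denominator = integrate(lambda eps: math.exp(-beta * eps), 0.0, upper)
    return numerator / denominator

integral_x_exp = integrate(lambda x: x * math.exp(-x), 0.0, 50.0)  # numerator integral
integral_exp = integrate(lambda x: math.exp(-x), 0.0, 50.0)        # denominator integral
beta = 2.5                                                         # arbitrary sample value
average = mean_energy(beta)
```

Both `integral_x_exp` and `integral_exp` should come out very close to 1, and `average * beta` should be very close to 1 for any positive β, in line with Eq. (18-43).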


As remarked immediately below Eq. (18-42), the quantity NP(ε)G dε is
the total number of oscillators in the system which will be found in the
energy interval from ε to ε + dε. The number per unit energy, n(ε), is this
quantity divided by dε. Thus we have

$$n(\epsilon) = N P(\epsilon)\, G$$  (18-44)

Using Eq. (18-43) in Eq. (18-41) to evaluate P(ε), we obtain

$$n(\epsilon) = N C G\, e^{-\epsilon/\langle\epsilon\rangle}$$

We can write this in a simpler way by taking advantage of the fact that N, C,
and G are constants for a particular system of harmonic oscillators with a
particular total energy and temperature. Hence we can group them into a
constant K = NCG and write

$$n(\epsilon) = K e^{-\epsilon/\langle\epsilon\rangle}$$  (18-45a)


The value of K can be determined by applying the normalization condition
of Eq. (18-36),

$$\int_0^\infty n(\epsilon)\, d\epsilon = N$$

The calculation in small print following Eq. (18-36) shows that its value
must be

$$K = \frac{N}{\langle\epsilon\rangle}$$  (18-45b)

Note that since K = NCG, this expression for K gives NCG = N/⟨ε⟩, or

$$C = \frac{1}{G\langle\epsilon\rangle}$$

Thus the value of the constant C in Eq. (18-41) for the general form of the
probability P(ε) has been determined in terms of quantities describing the system of
harmonic oscillators, just as the constant β has been so determined. But C is very
much less important than β because the latter leads to a result of universal
significance, as we will soon explain.


Now we can make a comparison between our present derivation and
the earlier experimental simulation. Using Eq. (18-45b) in Eq. (18-45a), we
have the following expression for n(ε), the number of oscillators per unit
energy:

$$n(\epsilon) = \frac{N}{\langle\epsilon\rangle}\, e^{-\epsilon/\langle\epsilon\rangle}$$

This result, derived from statistical theory, is identical to the one obtained
from the experimental simulation and described by Eqs. (18-34b) and
(18-35b).


We  can  use  the  equipartition  theorem  to  connect  the  average  energy  (e) 
of  an  oscillator  with  the  absolute  temperature  T of  the  system  in  which  it 
is in thermal equilibrium. Since each of the oscillators with which we are
dealing consists of a body connected to a spring and moving in one
dimension along a fixed line, the expression for its energy ε is

$$\epsilon = \tfrac{1}{2} m v^2 + \tfrac{1}{2} k x^2$$

Because  the  oscillators  are  not  in  the  quantum  domain,  they  can  absorb  en- 
ergy by  increasing  their  vibrational  motions,  no  matter  what  the  tempera- 
ture. (Compare  this  with  the  more  complicated  situation,  discussed  in  Sec. 
18-4,  that  occurs  with  harmonic  oscillators  in  the  quantum  domain,  such  as 
hydrogen  molecules.)  So  this  expression  is  one  that  gives  the  energy  e 
of  an  oscillator.  Since  there  are  two  terms  in  the  expression,  the  second 
statement  of  the  equipartition  theorem  says  that 

$$\langle\epsilon\rangle = \tfrac{2}{2} kT = kT$$

where k is Boltzmann's constant. Using this in Eq. (18-43), we obtain

$$\beta = \frac{1}{kT}$$  (18-46)

The constant β has a value given by the reciprocal of the product of
Boltzmann's constant k and the absolute temperature T of the system. Although
we have found this result by using a system containing objects of a
particular type (macroscopic harmonic oscillators), it actually applies to a system
containing objects of any type. Some experimental justification of this
statement is presented in Sec. 18-6. An exercise at the end of this chapter
provides theoretical justification.

Using our evaluation of β in the general expression for the probability
P(ε) that a single-object state at energy ε is occupied, Eq. (18-41), we can
write the proportionality

$$P(\epsilon) \propto e^{-\epsilon/kT}$$  (18-47)

We do not put in the constant C needed to write an equality. This constant,
which serves to make P(ε) a normalized probability, has a value which
varies from case to case. In contrast to β, there is no way to write an explicit
formula for C that applies to all cases. But this is of no consequence. The
basic result of this section is contained in the proportionality. This
proportionality between P(ε) and $e^{-\epsilon/kT}$ is the most important relation obtained in
statistical mechanics. It is important because it is used frequently in almost
all fields of physics. The relation says this: If there are a number of objects, of
any nature, in thermal equilibrium at absolute temperature T, then the probability
P(ε) that a single-object state of one of these objects at energy ε is occupied decreases
exponentially with increasing values of ε in proportion to the factor $e^{-\epsilon/kT}$, where k
is Boltzmann's constant. The factor $e^{-\epsilon/kT}$ is called the Boltzmann factor.

Example  18-6  makes  use  of  the  Boltzmann  factor. 


EXAMPLE  18-6 

Measurements of the light emitted by a source containing hydrogen gas (that is,
measurements of the "spectrum" of this light made by using techniques discussed in
Chap. 28) are interpreted to show that a hydrogen molecule can vibrate. In this
motion, the separation between the centers of its two atoms performs harmonic
oscillations about an equilibrium value (as discussed qualitatively in Sec. 18-4). Associated
with this vibrational motion is an energy ε. The measurements indicate that only
certain values of ε occur. The first four are

$\epsilon_1 = 4.37 \times 10^{-20}$ J
$\epsilon_2 = 13.11 \times 10^{-20}$ J
$\epsilon_3 = 21.85 \times 10^{-20}$ J
$\epsilon_4 = 30.59 \times 10^{-20}$ J

Fig. 18-13 A diagram representing the energies εj of the first four states of
vibrational motion of a hydrogen molecule.

These  energies  are  plotted  in  Fig.  18-13  as  horizontal  lines  whose  distances  above 
the  line  e = 0 are  proportional  measures  of  the  corresponding  energies.  Each  of 
these  energies  corresponds  to  a single-object  state  of  the  vibrational  motion  of  the 
molecule. You should not be surprised to see that these states are uniformly
distributed in energy, since this is characteristic of any harmonic oscillator. But you may be
very  surprised  that  each  is  well  separated  in  energy  from  its  neighbors.  That  is,  the 
energies  have  a discrete  set  of  values.  This  is  characteristic  of  a harmonic  oscillator 
whose  size  is  in  the  atomic  or  molecular  range,  so  that  the  oscillator  is  in  the 
quantum  domain.  The  phenomenon  is  described  by  saying  that  the  energy  is  quan- 
tized. Energy  quantization  is  explored  at  length  in  Chap.  31. 

a. A sample of hydrogen gas is in thermal equilibrium at room temperature,
T = 300 K. Use the Boltzmann factor to calculate P(ε2)/P(ε1), the ratio of the
probability that a molecule in the gas will occupy its single-object state at energy ε2 to the
probability that it will occupy its state at energy ε1.

b. Use the results obtained in part a to evaluate the average vibrational energy
⟨ε⟩ of a molecule.

c. Repeat parts a and b for T = 600 K. Then calculate the contribution of the
vibrational motion to the molecular heat capacity at constant volume, c'v, at room
temperature.

d. Calculate P(ε2)/P(ε1) and P(ε3)/P(ε1) at T = 10,000 K. Then predict ⟨ε⟩ and
the contribution of vibrational motion to c'v.

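■ a. The probability ratio is given by the ratio of the Boltzmann factors. That is,

$$\frac{P(\epsilon_2)}{P(\epsilon_1)} = \frac{e^{-\epsilon_2/kT}}{e^{-\epsilon_1/kT}} = e^{-(\epsilon_2 - \epsilon_1)/kT}$$

Now

$$\epsilon_2 - \epsilon_1 = 13.11 \times 10^{-20}\ \text{J} - 4.37 \times 10^{-20}\ \text{J} = 8.74 \times 10^{-20}\ \text{J}$$

And the value of kT at room temperature is

$$kT = 1.38 \times 10^{-23}\ \text{J/K} \times 300\ \text{K} = 4.14 \times 10^{-21}\ \text{J}$$

Thus

$$\frac{\epsilon_2 - \epsilon_1}{kT} = \frac{8.74 \times 10^{-20}\ \text{J}}{4.14 \times 10^{-21}\ \text{J}} = 21.1$$

So you have

$$\frac{P(\epsilon_2)}{P(\epsilon_1)} = e^{-21.1} = 6.86 \times 10^{-10}$$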
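The arithmetic of part a is easy to reproduce on a computer. The following is a minimal sketch that uses the energies and the value of k quoted in this example. The function name `boltzmann_ratio` and the constant names are chosen here for illustration only.

```python
import math

K_BOLTZMANN = 1.38e-23  # J/K, the value used in the worked example

def boltzmann_ratio(e_upper, e_lower, temperature):
    """P(e_upper) / P(e_lower) for two single-object states at the given temperature."""
    return math.exp(-(e_upper - e_lower) / (K_BOLTZMANN * temperature))

# First three vibrational energies of the hydrogen molecule quoted above, in joules.
E1, E2, E3 = 4.37e-20, 13.11e-20, 21.85e-20

ratio_300 = boltzmann_ratio(E2, E1, 300.0)        # room temperature
ratio_10000 = boltzmann_ratio(E2, E1, 10_000.0)   # the temperature used in part d
```

Because the exponent is kept to full precision, `ratio_300` will differ slightly in its last digits from a value computed with the rounded exponent 21.1.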

b. You can see from the result just obtained that the probability of a molecule
occupying its single-object state at energy ε2 is extremely small relative to the
probability of its occupying the one at energy ε1. And a moment's consideration will show
you that the relative probability of its occupying the one at energy ε3 is completely
negligible. Thus to an approximation very much better than the accuracy of the
quoted values, you can say that essentially all the molecules are in their single-object
states at energy ε1. Their average vibrational energy is therefore

$$\langle\epsilon\rangle = \epsilon_1 = 4.37 \times 10^{-20}\ \text{J}$$

c. At T = 600 K, you have

$$kT = 2 \times 4.14 \times 10^{-21}\ \text{J}$$

and

$$\frac{P(\epsilon_2)}{P(\epsilon_1)} = e^{-21.1/2} \approx e^{-10.6} = 2.49 \times 10^{-5}$$

Since this relative probability is still extremely small, you again obtain, to a high
degree of accuracy,

$$\langle\epsilon\rangle = \epsilon_1 = 4.37 \times 10^{-20}\ \text{J}$$

Comparing the results found for ⟨ε⟩ at T = 600 K with those found at T =
300 K, you see that essentially no energy has been absorbed by the gas through an
increase in the vibrational motion of its molecules as a result of the temperature
increase. Thus you can conclude that the vibrational motion makes no contribution to the
molecular heat capacity at constant volume of hydrogen gas at room temperature. This
conclusion agrees with the direct measurements of heat capacity discussed in Sec. 18-4.

These considerations provide insight into what is meant in Sec. 18-4 by phrases
like "the number of ways of absorbing energy" or "the number of terms in the
expression for the energy content," in referring to the equipartition theorem. The
internal structure of hydrogen molecules makes them harmonic oscillators. But they
are harmonic oscillators whose size puts them in the quantum domain. As a result,
their states of vibrational motion are well separated in energy. In fact, the
separation between the lowest state and the one above it, ε2 - ε1, is about 20 times larger
than kT at room temperature. Because of the exponential nature of the Boltzmann
factor, in these circumstances the oscillators are essentially confined to their lowest
energy states of vibrational motion at temperatures up to, and well above, room
temperature. They cannot absorb any part of the energy supplied to heat hydrogen
gas, by means of an increase in their vibrational motions which "promotes" that
motion to the next higher energy. Such a "promotion" requires energy ε2 - ε1 much
larger than the energy kT, which is comparable to the energy available per
molecule. Hence in this temperature range vibrational motion is not to be counted in
applying the equipartition theorem.

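d. At T = 10,000 K, the value of kT is

$$kT = 1.38 \times 10^{-23}\ \text{J/K} \times 1.00 \times 10^{4}\ \text{K} = 1.38 \times 10^{-19}\ \text{J}$$

Now you have

$$\frac{\epsilon_2 - \epsilon_1}{kT} = \frac{8.74 \times 10^{-20}\ \text{J}}{1.38 \times 10^{-19}\ \text{J}} = 0.633$$

and

$$\frac{P(\epsilon_2)}{P(\epsilon_1)} = e^{-0.633} = 0.531$$

Also, you have

$$\frac{\epsilon_3 - \epsilon_1}{kT} = 2 \times 0.633$$

and

$$\frac{P(\epsilon_3)}{P(\epsilon_1)} = e^{-2 \times 0.633} = 0.281$$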

At T = 10,000 K, the energy ε2 - ε1 is only about half the energy kT. So a
molecule has a quite appreciable chance of occupying the vibrational single-object state
at energy ε2. The same is true of the one at energy ε3. Indeed, calculation shows that
P(εj)/P(ε1) does not fall below 1 percent until j exceeds 7. (This assumes that the
spacing in ε between adjacent single-object states continues to be essentially constant
as j increases through 7. But the assumption is not critical.) Thus at this quite
high temperature the widely separated states of vibrational motion combined with
the effect of the Boltzmann factor do not operate to prevent a hydrogen molecule
from occupying its single-object states of higher energy by absorbing energy by
means of increased vibrational motion.

This conclusion gives you justification in arguing that the equipartition
theorem can be applied to hydrogen molecules at T = 10,000 K, with the vibrational
motion taken into account, to calculate a value of ⟨ε⟩ that is at least approximately
correct. Hence you may predict that on the average the energy of a molecule has a
contribution of approximately $\tfrac{1}{2}kT$ from the kinetic energy of vibration and a
contribution of approximately $\tfrac{1}{2}kT$ from the potential energy of vibration. Thus the
average energy of vibration should be approximately $\tfrac{2}{2}kT$, or

$$\langle\epsilon\rangle \simeq kT$$

If  you  write  a program  to  make  a calculator  or  computer  do  the  numerical 
work,  you  will  find  it  easy  to  verify  this  prediction  by  calculating  the  value  of 

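$$\langle\epsilon\rangle = \frac{\epsilon_1 P(\epsilon_1) + \epsilon_2 P(\epsilon_2) + \cdots + \epsilon_j P(\epsilon_j)}{P(\epsilon_1) + P(\epsilon_2) + \cdots + P(\epsilon_j)} = \frac{\epsilon_1 + \epsilon_2 P(\epsilon_2)/P(\epsilon_1) + \cdots + \epsilon_j P(\epsilon_j)/P(\epsilon_1)}{1 + P(\epsilon_2)/P(\epsilon_1) + \cdots + P(\epsilon_j)/P(\epsilon_1)}$$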

What  is  the  justification  for  this  equation?  How  is  it  related  to  Eq.  (18-42)?  At  what 
value  of  j should  you  stop? 

You have concluded that at T = 10,000 K the equipartition theorem can be
used to predict an approximate value for ⟨ε⟩ by taking vibrational motion into
account. So you can also use the theorem to conclude that the contribution of
vibrational motion to c'v is approximately $\tfrac{2}{2}k = k$ at this quite high temperature. If you
have written a program to evaluate ⟨ε⟩, you can verify this prediction by using it to
calculate (⟨ε⟩ at 11,000 K - ⟨ε⟩ at 10,000 K)/(11,000 K - 10,000 K). As discussed in Sec. 18-4,
direct measurements of c'v also confirm this prediction.
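One possible form of the program suggested above is sketched here in Python. It rests on assumptions that go beyond the measured data: the level spacing is taken to stay constant at ε2 - ε1 for all j, the sum is cut off after an arbitrarily chosen 200 levels, and all function and constant names are invented for this sketch.

```python
import math

K_BOLTZMANN = 1.38e-23   # J/K, value used in the example
E1 = 4.37e-20            # J, lowest vibrational energy quoted in the example
SPACING = 8.74e-20       # J, e2 - e1; assumed to hold between all adjacent levels

def mean_vibrational_energy(temperature, n_levels=200):
    """Weighted average of the level energies, using ratios P(e_j)/P(e_1) as weights."""
    kT = K_BOLTZMANN * temperature
    energies = [E1 + j * SPACING for j in range(n_levels)]
    # Relative weight of each level: P(e_j)/P(e_1) = exp(-(e_j - e_1)/kT).
    weights = [math.exp(-(e - E1) / kT) for e in energies]
    return sum(e * w for e, w in zip(energies, weights)) / sum(weights)

def vibrational_heat_capacity(t_low, t_high):
    """Finite-difference estimate of the vibrational contribution to c'v."""
    rise = mean_vibrational_energy(t_high) - mean_vibrational_energy(t_low)
    return rise / (t_high - t_low)

mean_300 = mean_vibrational_energy(300.0)
mean_10000 = mean_vibrational_energy(10_000.0)
cv_vibration = vibrational_heat_capacity(10_000.0, 11_000.0)
```

Under these assumptions `mean_300` should be indistinguishable from ε1, `mean_10000` should come out close to kT, and `cv_vibration` should come out close to k. Changing `n_levels` shows how quickly the sum converges, which bears on the question of where to stop.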


18-6 THE MAXWELL-BOLTZMANN SPEED DISTRIBUTION


In  this  section  we  use  the  Boltzmann  factor  to  find  the  distribution  function 
n(v)  for  the  speed  v of  molecules  of  an  ideal  gas  in  thermal  equilibrium  at  a 
certain  temperature.  This  quantity  gives  the  number  of  molecules  per  unit 
speed  as  a function  of  the  speed.  It  provides  much  more  information  about 
what  happens  in  an  ideal  gas  than  we  have  at  present.  All  we  know  now  is 
the  average  value  of  the  kinetic  energy  of  the  ideal-gas  molecules  — that  is, 
the  average  value  of  something  which  is  proportional  to  the  square  of  the 
molecular  speed. 

The speed distribution function can be written as a product of three
factors, in a manner analogous to Eq. (18-44) for the energy distribution
function n(ε). The quantity n(ε) is the number of oscillators per unit energy
at energy ε. The equation is n(ε) = NP(ε)G, where N is the total number of
oscillators, P(ε) is the probability that a single-object state of an oscillator at
energy ε is occupied, and G is the number of these states per unit energy.
In the present case the independent variable is the speed v, not the energy
ε. So we write the following equation for n(v), the number of molecules per
unit speed:

n(v)  = NP(v)G(v)  (18-48) 


Here N is the total number of molecules, P(v) is the probability that a
single-object state of an ideal-gas molecule at the energy corresponding to
the speed v is occupied, and G(v) is the number of these states per unit
speed interval. Equation (18-48) makes the evidently correct statement that
the number of molecules per unit speed equals the total number of
molecules times the probability that any one will be in a single-object state with a
certain speed times the number of such states per unit speed.

Fig. 18-14 A construction used to show that G(v), the number of different
single-object states of an ideal-gas molecule per unit speed, is proportional to
the square of the speed, v². The spherical shell of inner radius v and outer
radius v + dv extends all the way around the origin, although only the part in the
positive octant of the vx, vy, vz coordinate system is shown. The shell is filled
with cubes of equal, small edge lengths δvx, δvy, δvz, of which only two are
shown. All velocity vectors v whose tips lie anywhere in a single cube are
considered to describe the same single-object state of the molecule. So there are as
many such states in the speed interval v to v + dv as there are cubes in the
spherical shell.

Since an ideal-gas molecule has only translational motion, the relation
between its energy ε and its speed v is ε = mv²/2, with m being its mass. The
probability that a single-object state at energy ε = mv²/2 is occupied is
proportional to the Boltzmann factor $e^{-\epsilon/kT} = e^{-mv^2/2kT}$, where k is
Boltzmann's constant and T is the absolute temperature of the gas. Thus we
have

$$P(v) \propto e^{-mv^2/2kT}$$  (18-49)

The  number  of  single-object  states  per  unit  speed  interval  for  an 
ideal-gas  molecule  is  written  in  Eq.  (18-48)  as  G(v)  because  it  depends  on 
the  molecule’s  speed  v.  (This  is  in  contrast  to  the  number  of  these  states  per 
unit  energy  interval  for  a harmonic  oscillator,  which  does  not  depend  on 
the  oscillator’s  energy,  and  so  is  written  as  G.)  The  fact  that  the  number  of 
single-object  states  per  unit  speed  depends  on  the  speed  can  be  seen  by 
considering  two  points.  First,  an  ideal-gas  molecule  with  a particular  speed  v 
can  have  many  different  velocities  v.  Each  velocity  that  is  measurably  dif- 
ferent from  other  velocities  corresponds  to  a different  single-object  state  of  the 
molecule.  This  follows  from  the  definition  of  a single-object  state:  a com- 
plete specification  of  whatever  quantities  must  be  known  in  order  to  know 
the  particular  condition  of  the  object.  Such  a specification  involves  not  only 
how  fast  the  molecule  is  moving  (the  magnitude  of  its  velocity  vector)  but 
also  the  direction  in  which  it  is  moving  (the  direction  of  its  velocity  vector). 
Second,  the  number  of  measurably  different  velocities — and  hence  of  dif- 
ferent single-object  states  — in  a speed  range  v to  v + dv  depends  on  v.  This 
point  is  demonstrated  in  the  following  paragraph. 

Figure 18-14 illustrates a velocity vector v extending from the origin of
a set of axes vx, vy, vz in a velocity space. Its components lie, respectively,
somewhere within the equal ranges vx to vx + δvx, vy to vy + δvy, and vz to
vz + δvz. Hence the tip of the vector lies within a little cube of equal edge
lengths δvx, δvy, and δvz, as shown. This velocity vector represents one
single-object  state  of  the  ideal-gas  molecule.  Any  other  vector  whose  tip  lies 
in  the  same  cube  is  so  nearly  like  the  one  shown  that  the  difference  in  the 
behaviors  of  the  molecule  specified  by  the  two  vectors  is  not  considered 
enough  to  be  measurable.  That  is,  such  a vector  represents  the  same 
single-object  state  of  the  molecule.  But  a vector  whose  tip  lies  somewhere  in 
the  adjacent  cube  drawn  in  the  figure  is  sufficiently  different  that  a vector 
satisfying  this  criterion  specifies  a different  single-object  state  of  the  mole- 
cule. Now  consider  the  spherical  shell  centered  on  the  origin  with  inner 
radius  v and  outer  radius  v + dv.  The  number  of  different  single-object 
states  in  the  speed  range  v to  v + dv  is  just  the  number  of  little  cubes  in  this 
shell.  It  is  given  by  the  volume  of  the  shell  divided  by  the  volume  of  a cube. 
We  do  not  have  to  specify  the  volume  of  a cube  (in  other  words,  what  we 
consider  to  be  a measurable  velocity  difference)  because  the  point  of  inter- 
est is  simply  that  the  number  of  single-object  states  is  proportional  to  the 
volume of the shell. This volume is its surface area 4πv² multiplied by its
thickness dv. Thus the number of different single-object states in the speed
range v to v + dv is proportional to 4πv² dv or, equally well, to v² dv. The
number per unit speed, which is the quantity G(v), is proportional to v².
Hence we conclude that

$$G(v) \propto v^2$$  (18-50)

The  quantity  G(v)  is  often  called  the  density-of-states  factor. 
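The cube-counting argument can be tried out directly. The sketch below fills velocity space with cubes of unit edge and counts those whose centers fall in a shell between speeds v and v + dv. Using the cube center to decide which shell a cube belongs to, the unit edge length, and the function names are all choices made here for illustration.

```python
import math

def count_states_in_shell(v, dv, cube_edge=1.0):
    """Count cubes of edge `cube_edge` whose centers lie at a speed in [v, v + dv)."""
    n_max = int((v + dv) / cube_edge) + 1
    inner_sq = (v / cube_edge) ** 2
    outer_sq = ((v + dv) / cube_edge) ** 2
    count = 0
    for ix in range(-n_max, n_max + 1):
        for iy in range(-n_max, n_max + 1):
            for iz in range(-n_max, n_max + 1):
                r_sq = ix * ix + iy * iy + iz * iz  # squared speed, in units of cube_edge
                if inner_sq <= r_sq < outer_sq:
                    count += 1
    return count

def shell_volume(v, dv):
    """Exact volume of the spherical shell between radii v and v + dv."""
    return 4.0 / 3.0 * math.pi * ((v + dv) ** 3 - v ** 3)

states_slow = count_states_in_shell(20.0, 1.0)
states_fast = count_states_in_shell(40.0, 1.0)
```

Doubling the speed at fixed dv should roughly quadruple the count, which is the content of Eq. (18-50). The agreement is only approximate because dv here is not small compared with v and the cubes are not small compared with the shell thickness.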

Using the proportionalities of Eqs. (18-49) and (18-50) in Eq. (18-48),
which is the expression for the number of molecules per unit speed n(v) =
NP(v)G(v), we have

$$n(v) \propto N e^{-mv^2/2kT}\, v^2$$

Since the total number of molecules N is fixed, we can write this as

$$n(v) \propto v^2 e^{-mv^2/2kT}$$

Then we can introduce a proportionality constant K to convert it to the
equality

$$n(v) = K v^2 e^{-mv^2/2kT}$$  (18-51)

The  value  of  K can  be  determined  by  applying  the  normalization  condition 
of  Eq.  (18-36).  But  often  it  is  not  necessary  to  know  K.  The  formula  we 
have  obtained  for  n(v)  is  called  the  Maxwell-Boltzmann  speed  distribu- 
tion. It  was  obtained  first  by  James  Clerk  Maxwell  in  1859  and  later  in  a 
much  more  general  way  (similar  to  the  approach  we  have  used)  by  Boltz- 
mann. 
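When a value of K is wanted, the normalization integral can be done in closed form. The standard result is K = 4πN(m/2πkT)^(3/2), which is quoted here rather than derived in the text above. The sketch below evaluates n(v) with that constant and checks numerically that the area under the curve equals N. The function names, the finite upper limit of 5000 m/s, and the rounded values of Avogadro's number and the molecular weight of oxygen are assumptions of this illustration.

```python
import math

K_BOLTZMANN = 1.38e-23         # J/K
AVOGADRO_PER_KMOL = 6.022e26   # molecules per kilomole (rounded)

def speed_distribution(v, mass, temperature, n_molecules=1.0):
    """n(v) = K v^2 exp(-m v^2 / 2kT), with K fixed by normalization to n_molecules."""
    a = mass / (2.0 * K_BOLTZMANN * temperature)
    k_norm = 4.0 * math.pi * n_molecules * (a / math.pi) ** 1.5
    return k_norm * v * v * math.exp(-a * v * v)

def integrate(f, lower, upper, steps=20_000):
    """Composite trapezoidal rule for f on [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.5 * (f(lower) + f(upper))
    for i in range(1, steps):
        total += f(lower + i * h)
    return total * h

OXYGEN_MASS = 32.0 / AVOGADRO_PER_KMOL  # kg per O2 molecule, molecular weight taken as 32.0

# Area under the curve for N = 1; speeds above 5000 m/s contribute negligibly at 300 K.
area = integrate(lambda v: speed_distribution(v, OXYGEN_MASS, 300.0), 0.0, 5000.0)
```

With `n_molecules=1.0` the curve is a probability density in speed, so `area` should be very close to 1.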

Having  been  derived  for  an  ideal  gas,  the  Maxwell-Boltzmann  speed 
distribution  applies  most  accurately  to  gases  which  most  accurately  approx- 
imate an  ideal  gas.  These  are  the  monatomic  gases.  But  it  also  can  be  used 
for  a polyatomic  gas  as  an  approximation  which  is  accurate  insofar  as  the 
gas  satisfies  the  ideal-gas  law,  pV  = NkT.  This  statement  can  be  justified  by 
repeating  the  derivation,  considering  only  the  part  of  the  energy  of  a mole- 
cule that  arises  from  its  center-of-mass  motion. 

Figure 18-15 is a plot of the speed distribution n(v), evaluated from Eq.
(18-51) for T = 300 K and with m the mass of an oxygen molecule.
(Although these molecules are diatomic, at room temperature and
pressure oxygen behaves very nearly as an ideal gas.) For small values of v, the
density-of-states factor v² increases with increasing v faster than the
Boltzmann factor $e^{-mv^2/2kT}$ decreases. So their product increases. It reaches a
peak, and then, as the exponential Boltzmann factor takes over, it
descends. In the descending region the speed distribution is not an
exponentially decreasing function of v; rather it is an exponentially decreasing
function of v².

Fig. 18-15 The Maxwell-Boltzmann speed distribution n(v) for oxygen at
T = 300 K, plotted as n(v) (relative) against v (in m/s). The curve is not
normalized. In other words, the ordinate shows only relative values of n(v).
The most probable speed vmp and the root-mean-square speed vrms are
indicated on the speed axis. Note that vmp corresponds to the peak of the
curve. Can you explain qualitatively why vrms > vmp?

Fig. 18-16 Maxwell-Boltzmann speed distributions n(v) for a container of
helium gas at T = 300 K and T = 6000 K, plotted as n(v)/N [in (m/s)^-1]
against v (in m/s). The curves are normalized so that the scale of the ordinate
is the same for both. That is, the container holds the same number of
molecules at both temperatures. When multiplied by the total number N of
molecules in the gas, a value of the ordinate is the number of molecules in a
1-m/s range of speed.

Because T appears in the exponent of Eq. (18-51), the Maxwell-
Boltzmann distribution is very sensitive to temperature. In Fig. 18-16, the
speed distribution is plotted for a gas of the monatomic molecule helium at
T = 300 K and T = 6000 K. To facilitate comparison, both curves have
been normalized to correspond to the same number of molecules in the
gas. That is, the value of the constant K in both has been adjusted so that
the areas under both curves have the same value.

In the speed distribution of Fig. 18-15, two particular speeds are
indicated. Each is useful in certain cases where it is sufficient to characterize the
speed of the molecules in a gas by a single value. One is the most probable
speed vmp. The other is the root-mean-square speed vrms. Equations for vmp
and vrms are obtained in Example 18-7.


EXAMPLE  18-7 

a.  Find  an  expression  for  the  speed  t>mp  at  which  the  Maxwell-Boltzmann  speed 
distribution  has  its  maximum. 

■ At the maximum of the speed distribution $n(v)$, its slope is zero. Hence the
value of $v_{mp}$, the speed at which the maximum occurs, can be obtained by calculating
$dn(v)/dv$, setting it equal to zero, and then finding the value of $v$ that satisfies
the resulting equation. Employing the rule for differentiating the product of two
functions, you have

$$\frac{dn(v)}{dv} = \frac{d}{dv}\left(Kv^2 e^{-mv^2/2kT}\right) = K\left(2v\,e^{-mv^2/2kT} - \frac{mv^3}{kT}\,e^{-mv^2/2kT}\right)$$

or

$$\frac{dn(v)}{dv} = Kv\,e^{-mv^2/2kT}\left(2 - \frac{mv^2}{kT}\right) = 0$$


The  last  equality  is  satisfied  for  a speed  v at  which 


$$2 - \frac{mv^2}{kT} = 0$$


Solving for this speed and calling it $v_{mp}$, you have

$$v_{mp} = \sqrt{\frac{2kT}{m}} \tag{18-52}$$


This is the most probable speed $v_{mp}$. It is the value that you would most probably
obtain in a measurement of the speed of an ideal-gas molecule of mass $m$ in a gas at
temperature $T$. ■

b. Find an expression for the speed $v_{rms} = (\langle v^2 \rangle)^{1/2}$, the square root of the mean
(that is, average) of the square of the speeds of molecules in an ideal gas at temperature $T$.

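■ You can make use of the kinetic theory results obtained in Sec. 18-2, specifically
Eq. (18-18):

$$\langle \epsilon \rangle = \frac{3}{2} kT$$

Here $\langle \epsilon \rangle$ is the average energy of the ideal-gas molecules. Since for any such
molecule its energy is $\epsilon = mv^2/2$, you have

$$\langle \epsilon \rangle = \frac{m \langle v^2 \rangle}{2}$$

Hence

$$\frac{m \langle v^2 \rangle}{2} = \frac{3kT}{2}$$

or

$$\langle v^2 \rangle = \frac{3kT}{m}$$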


Taking the square root of both sides of this equation and writing $(\langle v^2 \rangle)^{1/2} = v_{rms}$, you
have the following expression for the root-mean-square speed $v_{rms}$:

$$v_{rms} = \sqrt{\frac{3kT}{m}} \tag{18-53}$$


Can you explain why $v_{rms}$ is larger than $v_{mp}$? ■

c. Air is composed principally of the diatomic molecule nitrogen. Evaluate $v_{mp}$
and $v_{rms}$ for nitrogen at room temperature, T = 300 K. The molecular weight of
nitrogen is 28.0. That is, 1 kmol of nitrogen has a mass of 28.0 kg.

■ The  mass  of  a nitrogen  molecule  is 

$$\frac{\text{mass}}{\text{molecule}} = \frac{\text{mass/kilomole}}{\text{molecules/kilomole}}$$

or

$$m = \frac{W}{A}$$

where $W$ is numerically equal to the molecular weight and has units of kilograms
per kilomole (kg/kmol) and where $A$ is Avogadro's number. Thus

$$m = \frac{28.0 \text{ kg} \cdot \text{kmol}^{-1}}{6.02 \times 10^{26} \text{ kmol}^{-1}}$$

or

$$m = 4.65 \times 10^{-26} \text{ kg}$$

Using this value of $m$, you evaluate the most probable speed $v_{mp}$ thus:

$$v_{mp} = \sqrt{\frac{2kT}{m}} = \sqrt{\frac{2 \times 1.38 \times 10^{-23} \text{ J/K} \times 300 \text{ K}}{4.65 \times 10^{-26} \text{ kg}}}$$

or

$$v_{mp} = 422 \text{ m/s}$$


You can then obtain the root-mean-square speed $v_{rms}$ most easily by comparing
Eqs. (18-52) and (18-53). They show that

$$v_{rms} = \sqrt{\frac{3}{2}}\, v_{mp} = 1.22\, v_{mp}$$

Hence

$$v_{rms} = 1.22 \times 422 \text{ m/s}$$

or

$$v_{rms} = 517 \text{ m/s}$$

This is about 50 percent larger than the speed of sound in nitrogen gas. Why is $v_{rms}$
comparable to the speed of sound?
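The arithmetic of part c can be reproduced in a few lines. This is only a sketch: it uses Eqs. (18-52) and (18-53) with the constants quoted in the example, and the function names are invented here for illustration.

```python
import math

K_BOLTZMANN = 1.38e-23    # J/K, as used in the example
AVOGADRO_KMOL = 6.02e26   # molecules per kmol, as used in the example


def molecular_mass(weight_kg_per_kmol):
    """Mass of one molecule, m = W / A."""
    return weight_kg_per_kmol / AVOGADRO_KMOL


def most_probable_speed(mass, temperature):
    """Eq. (18-52): v_mp = sqrt(2kT/m)."""
    return math.sqrt(2.0 * K_BOLTZMANN * temperature / mass)


def rms_speed(mass, temperature):
    """Eq. (18-53): v_rms = sqrt(3kT/m)."""
    return math.sqrt(3.0 * K_BOLTZMANN * temperature / mass)


m_n2 = molecular_mass(28.0)
v_mp = most_probable_speed(m_n2, 300.0)
v_rms = rms_speed(m_n2, 300.0)
print(f"m = {m_n2:.2e} kg, v_mp = {v_mp:.0f} m/s, v_rms = {v_rms:.0f} m/s")
```

Computing $v_{rms}$ directly from Eq. (18-53), rather than from the rounded factor 1.22, gives the same 517 m/s quoted in the example.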


In Sec. 18-5 we proved that the Boltzmann factor has the form $e^{-\beta\epsilon}$ for a collection
of objects of any nature. But we proved that $\beta = 1/kT$ only for harmonic oscillators
not in the quantum domain. There is an enormous amount of physical
theory based on using the factor $e^{-\epsilon/kT}$ not just for macroscopic harmonic oscillators
but also for microscopic atoms and molecules. For instance, the Maxwell-Boltzmann
speed distribution is obtained by using the Boltzmann factor to calculate
occupation probabilities, and the speed distribution is supposed to apply to
molecules. This circumstance makes it possible to test experimentally the applicability
of the Boltzmann factor to molecules by comparing the speed distribution
predicted by the Maxwell-Boltzmann theory with the measured speed distribution.

To  make  accurate  measurements  of  the  speed  distribution  is  not  easy.  The 
first  real  attempt  was  carried  out  around  1920  by  Stern.  Subsequent  experiments 
by  Zartman  and  Ko  from  1930  to  1934,  and  by  others,  led  to  improved  results.  The 
best  results  to  date  are  those  obtained  by  Miller  and  Kusch  in  1955.  The  apparatus 
is  sketched  in  Fig.  18-17.  A vacuum  chamber  surrounding  the  entire  apparatus  is 
not  shown.  A small  oven,  whose  temperature  can  be  controlled  very  accurately, 
contains  a supply  of  thallium  metal.  The  temperature  is  sufficiently  high  that  a 
vapor  made  up  of  monatomic  molecules  of  the  metal  fills  the  oven.  The  pressure 
(about $10^{-6}$ atm) is low enough that the molecules approximate an ideal gas very
well. There is a small hole in the side of the oven, through which molecules
“leak.” That is, a molecule which happens to be heading for the hole simply keeps
going. The hole is sufficiently small, however, to introduce only a very small deviation
from the condition of thermal equilibrium in which the vapor would find
itself in a completely enclosed chamber.

Fig. 18-17 Apparatus used by Miller and
Kusch to measure the speed distribution of an
ideal gas. The entire region is evacuated.

Fig. 18-18 The speed distribution $n(v)$ for thallium molecules. It is unnormalized
and is plotted versus $v/v_{mp}$, with $v_{mp}$ being the most probable speed.
The circles and triangles are data obtained for T = 870 K and T = 944 K,
respectively. The Maxwell-Boltzmann theory predicts that plotting the two
sets of data points in this way should make them coincide. They do. The
curve is the Maxwell-Boltzmann speed distribution $n(v)$ plotted versus
$v/v_{mp}$. It agrees with the data to within the accuracy expected of the experiment.

The  emerging  molecules  which  happen  to  be  headed  toward  the  cylinder  are 
not  stopped  by  the  collimators.  The  cylinder,  in  which  a number  of  grooves  are  cut 
at  an  angle  to  the  cylinder  axis,  is  rotating  rapidly  and  uniformly  at  an  accurately 
determinable  speed.  If  a molecule  headed  for  the  cylinder  happens  to  reach  it 
when  the  groove  is  in  the  right  position,  the  molecule  enters  the  groove.  It  cannot 
pass  the  entire  length  of  the  groove,  however,  unless  its  speed  and  the  rotational 
speed  of  the  cylinder  are  so  matched  that  the  molecule  progresses  along  the 
groove  just  as  the  groove  rotates  into  position. 

Any molecule meeting these conditions emerges from the end of the cylinder
and strikes the detector, where it is counted. From a plot of the counting rate
versus the cylinder speed, a plot can be obtained of the flux of molecules $S(v)$
versus the molecular speed $v$. The quantity $S(v)$ is the number of molecules per
unit speed striking the detector per second. Just as in Eq. (12-56), it is related to the
density $\rho(v)$ of molecules per unit speed in the beam, and their speed $v$, as follows:

$$S(v) = \rho(v)\,v$$

But $\rho(v)$ is proportional to $n(v)$, the number of molecules per unit speed in the
oven. So

$$S(v) \propto n(v)\,v$$

or

$$n(v) \propto \frac{S(v)}{v}$$


The points in Fig. 18-18 are the values of $n(v)$ obtained from the measurements,
and  the  curve  is  the  prediction  of  the  Maxwell-Boltzmann  speed  distribution.  The 
magnificent  agreement  confirms  the  correctness  of  applying  the  Boltzmann  factor 
to  molecules  that  act  as  an  ideal  gas,  for  the  purpose  of  calculating  occupation 
probabilities. 
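The reason the data for the two oven temperatures coincide in Fig. 18-18 can be made explicit with a short substitution. The sketch below assumes the form $n(v) = Kv^2 e^{-mv^2/2kT}$ of the speed distribution together with Eq. (18-52); the dimensionless variable $x$ is introduced here only for illustration.

```latex
\begin{align*}
x &\equiv \frac{v}{v_{mp}}, \qquad v_{mp}^2 = \frac{2kT}{m}
  \quad\Longrightarrow\quad \frac{mv^2}{2kT} = x^2 \\
n(v) &= K v^2 e^{-mv^2/2kT} = K v_{mp}^2 \, x^2 e^{-x^2} \;\propto\; x^2 e^{-x^2}
\end{align*}
```

The temperature enters only through the overall factor $Kv_{mp}^2$, which does not matter in an unnormalized plot. Every oven temperature should therefore give the same curve, with its maximum at $x = 1$.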



18-7 DISORDER AND ENTROPY

In systems containing a large number of objects (such as molecules), nature
seems to favor disorder over order. That is, if initially the objects are ordered
in some way and then the system is isolated from external influence,
they tend to become disordered with the passage of time. We investigate
several striking aspects of the tendency toward disorder in this section,
introducing in due course the concept of entropy to provide a quantitative
measure of the disorder in a system. Our investigation uses two tools we
have employed earlier in this chapter: experimental simulation and the
Boltzmann occupation probability factor.

First  let  us  consider  two  examples  of  processes  occurring  in  nature 
which  demonstrate  the  tendency  toward  disorder.  These  natural  processes 
are: 


1.  A box  with  a partition  contains  a hot  ideal  gas  on  one  side  and  a cool 
ideal  gas  of  the  same  type  on  the  other.  Then  the  partition  is  removed.  In 
time  the  hot  and  cool  gases  mix  intimately,  resulting  in  a warm  gas  with  a 
single  temperature.  The  molecules  in  the  system  originally  were  ordered  in 
that  high-speed  molecules  were  on  the  hot  side  of  the  box  and  low-speed 
molecules were on the cool side. After the two populations of molecules
are  allowed  to  mix  and  come  into  equilibrium  through  energy  exchanges 
with  the  walls  of  the  box,  the  molecules  are  no  longer  ordered  according 
to  speed.  In  other  words,  the  molecules  have  become  disordered. 

2.  A quantity  of  alcohol  is  poured  carefully  on  top  of  a container  of 
water.  When  some  time  has  passed,  the  alcohol  and  water  become 
thoroughly  mixed.  In  this  case  the  molecules  of  the  system  originally  were 
ordered  with  those  of  one  type  in  one  region  and  those  of  the  other  type  in 
another  region.  But  as  they  come  into  equilibrium  through  their  interac- 
tions, this  order  according  to  type  disappears,  and  so  the  molecules  become 
disordered. 

Corresponding  to  each  natural  process  is  an  inverse  process  which  does 
not  occur  in  nature: 

1'. A box is filled with an ideal gas at a certain temperature $T$. Thus
the molecules have a most probable speed $v_{mp}$ equal to $(2kT/m)^{1/2}$. But some
individual molecules have a speed higher than $v_{mp}$, and some have a speed
lower than that value. In the course of their random motions, the high-speed
molecules suddenly find themselves on the left side of the box, while
at the same time the low-speed molecules find themselves on the right side.
We quickly slide in a partition separating the two sides and end up with two
separate containers of gas, one hot and the other cold.

2'.  A moonshiner  has  a vat  of  fermented  corn-squeezin’s  consisting 
of  about  10  percent  alcohol  mixed  with  90  percent  water.  At  some  instant 
the  random  movement  of  the  alcohol  molecules  around  the  vat  has 
brought  all  of  them  to  the  top.  The  moonshiner  quickly  skims  off  the  al- 
cohol and  avoids  the  necessity  of  chopping  wood  for  his  still. 

We  will  never  see  one  of  the  unnatural  processes  actually  occur.  Never- 
theless, there  is  a way  in  which  we  can  “watch”  one.  All  we  need  do  is  to  set 
up  its  inverse  process.  Since  the  inverse  process  is  a natural  process,  there  is 
no  difficulty  in  doing  this.  Then  we  take  a motion  picture  of  the  natural 
process  and  look  at  it  while  the  film  is  run  backward  through  the  projector. 
Of  course,  no  one  will  be  fooled.  It  will  be  apparent  immediately  to  all 




Fig. 18-19 A box divided into two
halves by a partition containing a small
hole covered by a movable slide. Initially
$n_l$ molecules are put in the left half and
$n_r$ in the right half. Then the hole is
opened.


Fig. 18-20 If a number is chosen at
random from a uniformly distributed
set of numbers that ranges from 0 to 1
in value, the probability that it will be in
the shaded part of the range is just the
ratio of the extent of that part to the
total extent of the range. The ratio is
numerically equal to $n_l/(n_l + n_r)$.


watchers that the film is being run backward. The intuitive understanding
of how nature works will tell everyone seeing the movies that the projectionist
is artificially making time “flow backward” to create an illusion.
Indeed, the situation can be summarized by a very strong statement. We can
say that the natural direction of a process (the direction toward
disorder) serves to define the natural direction of the flow of time.

These  ideas  can  be  demonstrated  by  carrying  out  an  experimental  sim- 
ulation on  a programmable  calculator  (of  the  same  type  used  for  numerical 
calculations  in  other  chapters)  or  on  a small  computer.  The  “experiment”  is 
pictured  in  Fig.  18-19.  A box  (in  which  the  temperature  is  everywhere  the 
same)  is  divided  down  its  center  by  a partition  with  a small  hole.  Keeping 
the  hole  closed  by  a movable  slide,  you  put  a certain  number  of  molecules  in 
the  left  half  and  some  other  number  of  molecules  in  the  right  half.  All  the 
molecules  are  identical,  and  they  are  molecules  of  an  ideal  gas,  so  they  do 
not  interact  with  one  another.  Now  you  open  the  hole.  As  the  molecules 
bounce  between  the  walls  of  their  halves  of  the  box,  each  has  the  same 
chance of fortuitously taking a path that leads it through the hole. So there
is  an  equal  chance  that  any  molecule  in  either  half  of  the  box  will  be  the 
next  one  to  pass  through  the  hole  and  end  up  in  the  other  half.  In  the 
experiment  you  monitor  the  number  of  molecules  in  the  left  half  as 
that  number  changes  each  time  a molecule  passes  through  the  hole  in  one 
direction  or  the  other. 

Consider an instant when there are $n_l$ molecules in the left half of the
box and $n_r$ in the right half. Since each molecule has the same chance to be
the next one to go through the hole, the probability that the next molecule
to do so will be one in the left half is just the number in the left half at that
instant divided by the total number. Thus the probability is $n_l/(n_l + n_r)$ that
the next thing which will happen is that a molecule in the left half of the
box will go to the right half.

To see how the process is simulated numerically, consider choosing at
random a number $u$ from a set of numbers distributed uniformly in the
range extending from 0 to 1. The chance that in so doing you will get one
whose value lies in any limited part of that range is the extent of that part
divided by the total extent of the range, 1. Hence the probability that the
randomly chosen number will be in the part extending from zero to the
value $n_l/(n_l + n_r)$ is equal to the extent of that part, $n_l/(n_l + n_r)$. See Fig.
18-20. Comparing this conclusion to the one in the preceding paragraph,
you see that the probability that a random number $u$ uniformly distributed
in the range 0 to 1 will have a value $u < n_l/(n_l + n_r)$ is the same as the
probability that the next event is for a molecule in the left half of the box to
go to the right half.

This parallelism makes it possible to simulate the “experiment” in a
manner that is in complete agreement with the laws of probability and the
properties of ideal-gas molecules, by programming a calculating device so
that  at  each  stage  in  a sequence  of  calculations  it  goes  through  the  following 
steps.  (1)  Generate  a random  number  from  a uniformly  distributed  set  of 
random  numbers  with  values  between  0 and  1.  (2)  Test  it  against  the  cur- 
rent value  of  the  fraction  of  molecules  in  the  left  half  of  the  box,  the 
numbers  of  molecules  in  the  two  halves  being  stored  in  two  registers  of  the 
device.  (3)  “Move”  a molecule  from  left  to  right  if  the  random  number  is 
smaller  than  the  fraction  by  subtracting  1 from  the  register  storing  the 
number  of  molecules  in  the  left  half,  or  do  the  opposite  if  the  random 




Fig. 18-21 A simulation of the experiment
using equipment depicted in Fig.
18-19. Initially there were 60 molecules
in the left half of the box and none in
the right half. The points plot successive
values of the number of molecules in
the left half.


number  is  larger  than  the  fraction.  You  will  find  a program  that  carries 
out  this  procedure  in  the  Numerical  Calculation  Supplement.  It  is  called 
the  molecules-in-a-box  program.  The  program  is  used  in  Example  18-8. 
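The three steps just described translate almost line for line into a short program. What follows is a minimal sketch in Python, not the program from the Numerical Calculation Supplement; the function name, the seed argument, and the choice of 400 passages are assumptions made for illustration.

```python
import random


def molecules_in_a_box(n_left, n_right, n_steps, seed=None):
    """Return the successive values of n_left over n_steps passages."""
    rng = random.Random(seed)
    history = [n_left]
    for _ in range(n_steps):
        # Step 1: a random number uniformly distributed between 0 and 1.
        u = rng.random()
        # Step 2: the current fraction of molecules in the left half.
        fraction_left = n_left / (n_left + n_right)
        # Step 3: move one molecule left to right if u is smaller than
        # the fraction; otherwise move one molecule right to left.
        if u < fraction_left:
            n_left -= 1
            n_right += 1
        else:
            n_left += 1
            n_right -= 1
        history.append(n_left)
    return history


# Hypothetical run: 60 molecules on the left, none on the right.
history = molecules_in_a_box(60, 0, 400, seed=1)
late_average = sum(history[200:]) / len(history[200:])
print(history[:10])
print(late_average)
```

Because the random number is always less than 1 and never less than 0, the program cannot move a molecule out of an empty half, so neither count ever becomes negative.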


EXAMPLE  18-8 

Run the molecules-in-a-box program, taking the initial numbers of molecules in the
left and right halves of the box to be $n_l = 60$ and $n_r = 0$.

■ The results obtained in the simulation are plotted in Fig. 18-21 as a set of points
showing the successive values of $n_l$, the number of molecules in the left half of the
box. The first passage of a molecule through the hole must result in a decrease in $n_l$,
since there are initially no molecules in the right half of the box. As soon as there is
at least one molecule in the right half, it is possible that the next event will be the passage
of a molecule back to the left half, with an increase in $n_l$. But this is a highly
unlikely event at first, because there are so few molecules in the right half and so
many in the left half. So, as you can see from the figure, the value of $n_l$ at first
decreases monotonically. As the number of molecules in the right half increases, eventually
one of them happens to be the one chosen by chance to go through the hole.
At this point there is a small upward fluctuation superimposed on the continuing
downward trend in $n_l$. And as $n_l$ approaches 30 (half the total number of molecules
in the box), its fluctuations become more pronounced and its downward trend
less pronounced. Ultimately, $n_l$ fluctuates about an average value of 30.


We  can  say  that  the  molecules  in  the  box  of  Example  18-8  initially  have 
a high  degree  of  order  because  they  are  all  in  one  half  of  the  box.  It  can 
be  said  just  as  well  that  they  initially  have  a low  degree  of  disorder.  After  the 
experimental  simulation  has  run  for  a while,  the  molecules  distribute  them- 
selves rather  equally  in  both  halves  of  the  box.  This  happens  spontaneously — 
the  system  is  completely  isolated  from  external  influences  once  you  start 
the  calculating  device  to  simulate  opening  the  hole  in  the  partition.  Thus 
the  molecules  lose  their  initial  order.  That  is,  their  degree  of  disorder 
increases. The simulation demonstrates very well the tendency of an
isolated  system  toward  disorder. 

It  also  demonstrates  how  natural  processes  can  be  used  to  determine 
the  natural  direction  of  the  flow  of  time.  If  you  look  at  Fig.  18-21,  you  can 
tell  immediately  that  it  is  plotted  with  time  increasing  to  the  right.  Imagine 
your  reaction  if  someone  showed  you  a movie  of  the  molecules  in  the  box  in 
which the initial value of $n_l$ were 26 (the value at the end of the run in Example
18-8), with $n_l$ subsequently fluctuating around 30 for a minute or
two  and  then  spontaneously  building  up  to  a value  of  60.  You  would  know 




that the film was being run backward. Nature defines the direction of the “arrow
of  time”  by  the  tendency  toward  disorder  in  systems  containing  many  bodies. 

This is striking because the behavior of systems containing only a few bodies
is not sensitive to whether time is increasing or decreasing. For instance, Newton’s
second law, $F = m\,d^2x/dt^2$, is unchanged if $t$ is replaced by $-t$ since
$d^2x/d(-t)^2 = d^2x/dt^2$. You can see this if you look again at any of the satellite
trajectories in Chap. 11. If you have not looked at them recently, you
may  not  remember  the  direction  of  rotation  of  the  satellite  about  the  cen- 
tral body.  In  fact,  either  of  the  two  directions  is  possible.  Therefore,  if  you 
saw  a motion  picture  of  this  two-body  system  with  the  satellite  rotating  in  a 
certain  direction,  you  would  not  be  able  to  tell  whether  you  were  seeing 
what  actually  happened  or  a movie  in  which  the  satellite  was  started  in  the 
opposite  direction  but  with  the  direction  of  time  reversed  by  running  the 
film backward. (The laws of quantum mechanics have the same independence
with respect to time reversal as do those of newtonian mechanics.)
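The time-reversal symmetry of the two-body problem can also be seen numerically. The sketch below is an illustration only: it integrates a satellite orbit with the velocity Verlet method (chosen here because the method is itself time-reversible), then reverses the velocity and applies exactly the same law of motion again. The units with $GM = 1$, the initial conditions, and the step size are all hypothetical.

```python
def acceleration(x, y, gm=1.0):
    """Inverse-square attraction toward a central body at the origin."""
    r_cubed = (x * x + y * y) ** 1.5
    return -gm * x / r_cubed, -gm * y / r_cubed


def velocity_verlet(state, dt, n_steps):
    """Advance (x, y, vx, vy) through n_steps time steps of length dt."""
    x, y, vx, vy = state
    ax, ay = acceleration(x, y)
    for _ in range(n_steps):
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        ax, ay = acceleration(x, y)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
    return x, y, vx, vy


start = (1.0, 0.0, 0.0, 1.2)   # hypothetical bound orbit, GM = 1
x, y, vx, vy = velocity_verlet(start, 0.001, 5000)

# "Run the film backward": reverse the velocity, keep the same law of motion.
xb, yb, vxb, vyb = velocity_verlet((x, y, -vx, -vy), 0.001, 5000)
print(xb, yb, -vxb, -vyb)
```

The reversed run retraces the orbit and returns to the starting point, apart from rounding error. Nothing in the motion distinguishes the forward film from the backward one.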

Example  18-9  will  illustrate  to  you  the  distinction  between  many-body 
and  few-body  systems  by  emphasizing  that  the  tendency  toward  disorder  is 
a statistical  effect  that  operates  only  in  systems  where  there  is  a large  enough 
number  of  bodies  to  make  possible  a meaningful  distinction  between  order 
and  disorder. 


EXAMPLE 18-9

Run the molecules-in-a-box program with the initial number of molecules in the left
and right halves of the box being $n_l = 6$ and $n_r = 0$.

■ Figure  18-22  is  a plot  of  the  results  obtained  from  the  simulation.  With  6 mol- 
ecules in  the  box,  fluctuations  dominate  and  no  trend  from  order  to  disorder  can  be 
discerned.  In  contrast  to  the  results  obtained  for  60  molecules  in  the  box,  the  results 
obtained  here  could  not  be  used  to  determine  the  direction  of  the  “arrow  of  time.” 
That  is,  you  cannot  tell  by  inspecting  Fig.  18-22  whether  it  is  plotted  with  time 
increasing  to  the  right  or  to  the  left.  It  is  evident  that  the  molecules-in-a-box  system 
must  be  considered  a few-body  system  when  it  contains  6 molecules  and  a many- 
body  system  when  the  number  of  molecules  contained  is  60. 
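One way to put numbers on the difference between the 6-molecule and 60-molecule systems is to ask how often, in the long run, all the molecules would be found back in the left half. The sketch below assumes that after many passages each molecule is independently equally likely to be in either half of the symmetrical box, so that the occupation of the left half follows a binomial distribution. That assumption, and the function name, are supplied here for illustration and are not derived in the example.

```python
from math import comb


def equilibrium_probability(n_left, n_total):
    """Chance of finding n_left of n_total molecules in the left half,
    assuming each molecule is equally likely to be in either half."""
    return comb(n_total, n_left) / 2 ** n_total


p_all_left = {}
for n_total in (6, 60):
    p_all_left[n_total] = equilibrium_probability(n_total, n_total)
    p_even_split = equilibrium_probability(n_total // 2, n_total)
    print(n_total, p_all_left[n_total], p_even_split)
```

Under this assumption, 6 molecules are all on the left about once in every 64 observations, so the fully ordered state recurs routinely. For 60 molecules the corresponding chance is less than one in $10^{18}$.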


Fig. 18-22 Results obtained
from an experimental simulation
beginning with 6 molecules in the
left half of the box and none in
the right half.

If you start watching the molecules-in-a-box system when there are 60
molecules in the left half and none in the right half, the chances are overwhelming
that you will see the number in the left half drop spontaneously
and then hover around 30. But if you start watching it when there are
30 molecules in each half, there is a completely negligible chance that you
will see $n_l$ spontaneously build up to 60. (The probability of $n_l$ increasing
monotonically from 30 to 60 has the extremely small value $1.2 \times 10^{-21}$.)
Why does the many-body system exhibit the tendency toward disorder?
Two different explanations can be given:

1. The value of $n_l$ will not increase from 30 to 60 since as soon as $n_l$ becomes
appreciably larger than 30 it is appreciably more likely that the passage
of a molecule through the hole will decrease $n_l$ rather than increase it.
The closer $n_l$ comes to 60, the greater the chance that the next change will
be a decrease. For $n_l$ actually to reach 60 would take a whole sequence of
very unlikely happenings.

2. The value of $n_l$ will not go to 60 since the system has a very large
number of equally probable microstates but only one of them is included
in the macrostate describing 60 molecules in the left half of the box.
This statement is completely analogous to the statement that in Table
18-4 the macrostate “four heads” has only one microstate. But included
in the macrostate in which 30 molecules are in the left half are very
many microstates. The reason is that it makes no difference which of the
molecules are among the 30 in the left half. Thus there are many different
distributions of the molecules between the two halves that all correspond to
30 in the left half. Each of these is a microstate within the same macrostate
for 30 molecules in the left half. Here there is a complete analogy to the
statement in Table 18-4 that the macrostate “two heads” has many microstates.
Since the probability of a macrostate is proportional to the number
of its microstates, it follows that there is a much greater probability of having
30 molecules in the left half of the box than 60.
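Both explanations can be made quantitative with a few lines of arithmetic. The sketch below is an illustration with invented function names. For explanation 1 it multiplies the step-by-step probabilities $(60 - n_l)/60$ that the next passage raises $n_l$. For explanation 2 it assumes the molecules can be told apart, so that the number of microstates with $n_l$ molecules on the left is the binomial coefficient "60 choose $n_l$".

```python
from fractions import Fraction
from math import comb


def monotonic_climb_probability(n_start, n_end, n_total):
    """Probability that n_left rises by 1 at every passage from n_start to n_end."""
    p = Fraction(1)
    for n_left in range(n_start, n_end):
        # n_left rises only if the molecule chosen is one in the right half.
        p *= Fraction(n_total - n_left, n_total)
    return p


# Explanation 1: an unbroken climb from 30 to 60.
p_climb = monotonic_climb_probability(30, 60, 60)
print(float(p_climb))   # about 1.2e-21, the value quoted in the text

# Explanation 2: microstate counts for the two macrostates.
w_60 = comb(60, 60)   # all 60 molecules in the left half
w_30 = comb(60, 30)   # 30 molecules in each half
print(w_60, w_30)
```

The macrostate with all 60 molecules on the left contains a single microstate, while the 30-30 macrostate contains more than $10^{17}$ of them.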

In  fact,  the  macrostate  in  which  there  are  30  molecules  in  the  left  half 
of  the  box  (so  that  the  identical  molecules  are  distributed  symmetrically 
between  the  symmetrical  halves  of  the  box)  is  the  macrostate  with  the 
greatest  number  of  microstates.  The  situation  is  just  like  the  one  seen  in 
Table  18-4,  where  the  macrostate  “two  heads”  (the  one  about  which  the 
other  possibilities  are  distributed  symmetrically)  is  the  macrostate  with  the 
greatest  number  of  microstates.  Since  the  probability  of  a macrostate  is  pro- 
portional to  the  number  of  its  microstates,  the  macrostate  in  which  there 
are  30  molecules  in  the  left  half  of  the  box  is  the  most  probable  macrostate. 
This  is  the  macrostate  toward  which  the  system  tends  to  evolve,  because  it  is 
the  one  of  highest  probability. 

The  macrostate  of  a system  having  the  highest  probability  is  called  the 
equilibrium  macrostate.  The  name  is  appropriate  since  the  natural  evolu- 
tion of  a system  leads  it  toward  its  equilibrium  macrostate.  In  other  words, 
it  can  be  said  that  a system  “seeks”  its  equilibrium  macrostate.  But  there  are 
always  fluctuations  which  cause  departures  from  this  trend.  The  smaller 
the  number  of  bodies  in  the  system,  the  greater  is  the  significance  of  these 
fluctuations. 

The first of the explanations that we have given for the behavior of the
molecules  in  a box  employs  ideas  used  in  the  numerical  calculations  of  the 
experimental  simulation.  The  second  uses  ideas  developed  in  Sec. 
18-6 — ideas  better  suited  to  analytical  calculations.  But  both  show  that  the 
tendency  for  disorder  in  a many-body  system  is  a consequence  of  the  laws 
of  probability.  Hence  it  is  a property  to  be  understood  on  the  basis  of  statis- 
tical mechanics. 




Our  goal  is  to  use  statistical  mechanics  to  link  the  microscopic  proper- 
ties of  a system  with  its  macroscopic  behavior.  A necessary  step  in  the  direc- 
tion of  this  goal  is  to  give  quantitative  expression  to  the  qualitative  idea  of 
disorder.  This  is  done,  along  the  lines  of  explanation  2 above,  by  relating 
the  disorder  of  a system  in  a certain  macrostate  to  the  number  of  micro- 
states belonging  to  that  macrostate.  It  is  a reasonable  thing  to  do  since  the 
most  probable  macrostate  is  the  one  with  the  greatest  number  of  micro- 
states and  the  most  probable  macrostate  also  is  the  one  of  greatest  disorder. 

The amount of disorder is measured by a quantity to which is assigned
the symbol $S$ and the name entropy. (The name was introduced by
Clausius, who coined it from the Greek word trope, meaning transformation.
We will see in Chap. 19 how entropy is connected with the changes, or
“transformations,” in systems.) If we use the symbol $w$ for the number of microstates
in a macrostate of a system, then we are saying that $S$ should increase
as $w$ increases. However, it proves most convenient to define the new
quantity $S$ not to be directly proportional to $w$. Instead it is defined to be
proportional to $\ln w$ (ln is the logarithm to the base $e$). Some justification for
this is found in the observation that in real systems $w$ is usually an
extremely large quantity, but $\ln w$ is of more manageable proportions. The
real convenience of so defining $S$ will be seen later here, as well as in Chap.
19. As for the proportionality constant, we take it to be Boltzmann’s constant
$k$, for reasons which will become clear later. Thus, by definition, the
entropy $S$ of a macrostate containing $w$ microstates has the value

$$S = k \ln w \tag{18-54}$$

The  entropy  of  a macrostate  of  a system  often  is  spoken  of  simply  as  the  en- 
tropy of  the  system. 
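As a concrete illustration of Eq. (18-54), the sketch below evaluates the entropy of three macrostates of the 60-molecule box. It assumes, as in explanation 2 above, that the number of microstates with $n_l$ molecules in the left half is the binomial coefficient "60 choose $n_l$". The function name and the choice of macrostates are hypothetical.

```python
from math import comb, log

K_BOLTZMANN = 1.38e-23   # J/K


def entropy(w):
    """Eq. (18-54): S = k ln w for a macrostate containing w microstates."""
    return K_BOLTZMANN * log(w)


entropies = {}
for n_left in (60, 45, 30):
    w = comb(60, n_left)       # assumed microstate count for this macrostate
    entropies[n_left] = entropy(w)
    print(n_left, w, entropies[n_left])
```

The fully ordered macrostate has $w = 1$ and therefore $S = 0$, and the entropy grows as the molecules spread more evenly between the halves. Although $w$ ranges over some 17 orders of magnitude here, $\ln w$ stays below 40, which is the "manageable proportions" the text refers to.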

We  can  summarize  our  basic  conclusions  to  this  point  by  saying  that 
the  observed  tendency  toward  disorder  as  a system  approaches  equilibrium 
is  a result  of  the  fact  that  the  more  disordered  macrostates  are  the  more 
probable  ones  because  they  are  the  ones  with  more  microstates.  Further- 
more, we  can  say  that  since  the  entropy  of  a system  increases  as  its  disorder 
increases,  the  tendency  toward  increased  disorder  is  a tendency  toward  in- 
creased entropy.  Hence  the  entropy  of  an  isolated  system  increases  as  the  system  ap- 
proaches its  equilibrium  macrostate.  This  is  one  statement  of  the  second  law  of 
thermodynamics.  Derived  here  as  a consequence  of  statistical  mechanics,  it 
becomes  one  of  the  foundation  stones  of  thermodynamics  in  Chap.  19. 

We can be more specific by considering two separated subsystems, subsystem
1 in its equilibrium macrostate at temperature $T_1$ and subsystem 2 in
its equilibrium macrostate at temperature $T_2$. At some instant the two are
placed in thermal contact to form a total system. At that instant the total
system is not in its equilibrium macrostate because its two subsystems have
different temperatures. But as time passes, their temperatures come into
equality at some value intermediate between $T_1$ and $T_2$. When this has happened,
the total system has attained its equilibrium macrostate.

Since  at  the  instant  the  total  system  is  formed  it  is  in  a nonequilibrium 
macrostate,  the  total  system  is  in  a macrostate  that  is  less  probable  than  the 
equilibrium  macrostate.  And  since  the  probability  of  a macrostate  is  propor- 
tional to  the  number  of  its  microstates,  this  means  that  the  macrostate  of 
the  total  system  at  the  instant  of  formation  has  fewer  microstates  than  the 
equilibrium macrostate. In other words, the initial entropy $S_i$ of the total
system is less than its final entropy $S_f$. Thus the change in entropy of the
total system, $\Delta S = S_f - S_i$, is positive as it approaches its equilibrium
macrostate.

When the total system has reached its equilibrium macrostate, $\Delta S$ will
henceforth be zero, because the system will remain in that macrostate and so
its entropy will remain constant. Thus it is true of any isolated system that

$$\Delta S \ge 0 \tag{18-55}$$

This  is  another  expression  of  the  second  law  of  thermodynamics. 

What happens to the individual entropies of the subsystems? We give a partial
answer by first proving that the entropy $S$ of the total system is at all
times given by the sum of the entropies $S_1$ and $S_2$ of its two subsystems.
The proof is simple. Let $w_1$ be the number of microstates of subsystem 1
belonging to a certain macrostate of the total system, and let $w_2$ be the
number of microstates of subsystem 2 belonging to that macrostate. For each
microstate of subsystem 1, it is possible for subsystem 2 to be in any of
$w_2$ different microstates. Since it is possible for subsystem 1 to be in any
of $w_1$ different microstates, the total number of different possibilities is
$w_1 w_2$. This quantity is $w$, the number of microstates in the total
system. That is,

$$w = w_1 w_2$$

Now evaluate the entropy of the total system. Using Eq. (18-54), we find

$$S = k \ln w = k \ln (w_1 w_2)$$

But the logarithm of the product of two quantities is the sum of their
logarithms. Hence we have

$$S = k \ln w_1 + k \ln w_2$$

or, using Eq. (18-54) again,

$$S = S_1 + S_2 \tag{18-56}$$

Entropy  is  additive.  This  property,  which  will  be  very  useful  in  the  study  of 
macroscopic  systems,  is  a principal  motivation  for  defining  entropy  to  be 
the  logarithm  of  the  number  of  microstates. 
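
The additivity argument can be checked numerically. The following is a minimal sketch, not part of the original text: the microstate counts `w1` and `w2` are hypothetical values chosen only for illustration, and the function name `entropy` is an assumption.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def entropy(w):
    """Entropy S = k ln w of a macrostate that has w microstates."""
    return K_BOLTZMANN * math.log(w)


# Hypothetical microstate counts for the two subsystems (illustration only).
w1 = 10**6
w2 = 10**9

# Microstates of the total system multiply, so w = w1 * w2 ...
s_total = entropy(w1 * w2)

# ... while the entropies add: S = S1 + S2.
s_sum = entropy(w1) + entropy(w2)

print(s_total, s_sum)
```

The two printed values agree to within floating-point rounding, which is just the statement that the logarithm turns the product $w_1 w_2$ into a sum.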

We can employ the additivity of entropies to write Eq. (18-55), as the total
system approaches its equilibrium macrostate, in the form

$$\Delta (S_1 + S_2) > 0$$

or

$$\Delta S_1 + \Delta S_2 > 0$$

Thus the sum of the changes in entropy of the two subsystems is positive in
the process of approach to the equilibrium macrostate by the total system.
But at present we cannot say anything about $\Delta S_1$ or $\Delta S_2$
individually.

Now we derive a very important relation among the change in the entropy of a
system, the change in its energy, and the temperature of the system. Among
other things, this relation will make it possible for us to calculate in
specific cases the individual changes $\Delta S_1$ and $\Delta S_2$ in the
entropies of two subsystems when they are brought into thermal contact and the
total system then attains its equilibrium macrostate. The sum of these changes
then gives the change $\Delta S$ in the entropy of the total system.




Fig. 18-23 A schematic illustration of an isolated system containing many
bodies. The total system is considered as a subsystem containing a single
body $b$ and a subsystem $s$ that contains all the other bodies.


An isolated system containing many bodies in its equilibrium macrostate at
temperature $T$ is indicated schematically in Fig. 18-23. The figure also
indicates that the system can be divided into two subsystems by picking out as
one subsystem one of these bodies, body $b$, and calling everything that
remains the subsystem $s$. Let body $b$ occupy one of its single-object states
at energy $\epsilon_b$. Then subsystem $s$ has whatever remains of the total
energy of the system. We write the energy of subsystem $s$ as $E_s$. There are
two ways to express the probability that this situation occurs. One is to say
that it is proportional to the number of microstates belonging to the
macrostate describing the situation. Since body $b$ is in one single-object
state, this number is the same as the number of microstates belonging to the
macrostate in which subsystem $s$ has energy $E_s$. We designate this number
as $w(E_s)$. The second way of expressing the probability that the situation
occurs is to say that it is the probability of the single-object state at
energy $\epsilon_b$ being occupied by body $b$, which is proportional to the
Boltzmann factor $e^{-\epsilon_b/kT}$. Since $w(E_s)$ and $e^{-\epsilon_b/kT}$
are proportional to the same thing, they must be proportional to each other.
Thus we have


$$w(E_s) = A e^{-\epsilon_b/kT} \tag{18-57a}$$


where  A is  some  proportionality  constant. 

Since the total system is isolated, its total energy has a fixed value $E_t$
given by

$$E_t = E_s + \epsilon_b$$

Using this to write $\epsilon_b = E_t - E_s$ in Eq. (18-57a), we obtain

$$w(E_s) = A e^{-(E_t - E_s)/kT} = A e^{-E_t/kT}\, e^{E_s/kT}$$

But $e^{-E_t/kT}$ is a constant since the temperature $T$ is also fixed. So we
can simplify what we have to the form

$$w(E_s) = B e^{E_s/kT} \tag{18-57b}$$


where  B is  another  constant. 

Now let us evaluate the entropy of subsystem $s$. It is

$$S_s = k \ln w(E_s) = k \ln \left( B e^{E_s/kT} \right)$$

Writing the logarithm of the product as the sum of the logarithms of its
factors, we get

$$S_s = k \ln B + k \ln e^{E_s/kT}$$

But $k \ln B = C$, yet another constant. And, by the definition of a
logarithm, $\ln e^{E_s/kT} = E_s/kT$. Thus we have

$$S_s = C + \frac{k E_s}{kT}$$

or

$$S_s = C + \frac{E_s}{T} \tag{18-58}$$

Now we find the relation between the change in the entropy $S_s$ of the
subsystem and the change in its energy $E_s$ that takes place if $\epsilon_b$
changes. We do this by differentiating $S_s$ with respect to $E_s$. In the
differentiation, we hold the temperature $T$ fixed at the value it has
throughout the total system. That is, we take the partial derivative with
respect to $E_s$ of all terms in Eq. (18-58), producing

$$\frac{\partial S_s}{\partial E_s} = \frac{1}{T}$$

Finally, we redefine the system of interest to be what we have been
considering as a subsystem. This system has total energy $E = E_s$, entropy
$S = S_s$, and temperature $T$. In this way we obtain the desired relation

$$\frac{1}{T} = \frac{\partial S}{\partial E} \tag{18-59}$$


This important relation follows from specifying the disorder in a system in
terms of the entropy, defined to be $S = k \ln w$. It shows that the
reciprocal $1/T$ of the temperature of a system is a measure of the rate
$\partial S/\partial E$ at which the disorder in the system changes as its
total energy $E$ changes. The equation $1/T = \partial S/\partial E$ may be
regarded as the definition of temperature. Note that up to this point we have
depended on an intuitive notion of what temperature is, bolstered by
Eq. (18-18), $\langle \epsilon \rangle = \frac{3}{2} kT$, which defines
temperature in terms of the average energy of the molecules for an ideal gas
only. But Eq. (18-59) defines the temperature of a system in terms of the
fundamental mechanical quantity $E$, and the entropy $S$, in a way which is
completely independent of the details of the system. Thus we can define a
temperature scale which is the same for all systems, something we take for
granted every time we read a thermometer! We use various forms of Eq. (18-59)
on a number of occasions in Chap. 19.
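
As a rough numerical illustration of Eq. (18-59), the sketch below starts from the microstate count $w(E_s) = B e^{E_s/kT}$ of Eq. (18-57b), forms $S = k \ln w$, and estimates $\partial S/\partial E$ by a centered finite difference. The constant $B = 1$, the temperature of 300 K, the energy of $50\,kT$, and the step size are all assumptions made for the example, as are the function names.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K
T_ASSUMED = 300.0           # temperature in K (assumed for illustration)
B_ASSUMED = 1.0             # proportionality constant B (assumed)


def microstates(e_s):
    """w(E_s) = B exp(E_s / kT), following the form of Eq. (18-57b)."""
    return B_ASSUMED * math.exp(e_s / (K_BOLTZMANN * T_ASSUMED))


def entropy(e_s):
    """S = k ln w(E_s)."""
    return K_BOLTZMANN * math.log(microstates(e_s))


# Energies of order kT keep exp() well within floating-point range.
e_s = 50.0 * K_BOLTZMANN * T_ASSUMED
d_e = 1.0e-3 * K_BOLTZMANN * T_ASSUMED

# Centered finite-difference estimate of dS/dE at fixed T.
ds_de = (entropy(e_s + d_e) - entropy(e_s - d_e)) / (2.0 * d_e)

# Eq. (18-59): 1/T = dS/dE, so the reciprocal should recover T.
t_recovered = 1.0 / ds_de
print(t_recovered)
```

The recovered temperature matches the assumed one to within rounding error, as it must, since $S_s = C + E_s/T$ is linear in $E_s$ at fixed $T$.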


One variation on Eq. (18-59) is obtained by evaluating the infinitesimal
change $dS$ in the entropy of a system at temperature $T$ occurring when an
infinitesimal amount of energy $dE$ flows into it. We have

$$dS = \frac{\partial S}{\partial E}\, dE$$

The partial derivative signifies that $T$ is considered to be a constant in
this infinitesimal process. Using Eq. (18-59), we can write this as

$$dS = \frac{dE}{T} \tag{18-60}$$

Another variation on Eq. (18-59) is found by looking at things from a
macroscopic point of view. Consider a case in which the flow of energy $dE$
into the system is exclusively in the form of heat energy $dH$. Then we write
$dE = dH$ so that Eq. (18-60) assumes the more restricted form

$$dS = \frac{dH}{T} \tag{18-61}$$

This equation gives a differential relation between entropy and the
macroscopic quantities heat and temperature. When $n$ subsystems are brought
into thermal contact, the second law of thermodynamics is written in these
terms as

$$dS = \sum_{j=1}^{n} dS_j = \sum_{j=1}^{n} \frac{dH_j}{T_j} \ge 0 \tag{18-62}$$
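
For two subsystems, Eq. (18-62) can be illustrated with a few lines of arithmetic. This is only a sketch under stated assumptions: a small quantity of heat `d_h` leaves the hotter subsystem and enters the colder one, the temperatures are treated as constant during the transfer, and the numerical values and the function name are hypothetical.

```python
def entropy_changes(d_h, t_cold, t_hot):
    """Apply dS = dH/T to each subsystem for a small heat transfer d_h.

    The colder subsystem receives +d_h and the hotter one receives -d_h.
    Returns (dS_cold, dS_hot, dS_total).
    """
    ds_cold = d_h / t_cold
    ds_hot = -d_h / t_hot
    return ds_cold, ds_hot, ds_cold + ds_hot


# Hypothetical values: 1 J flowing from a 373 K body to a 273 K body.
ds_cold, ds_hot, ds_total = entropy_changes(d_h=1.0, t_cold=273.0, t_hot=373.0)
print(ds_cold, ds_hot, ds_total)

# With equal temperatures the two terms cancel and dS is zero.
_, _, ds_equal = entropy_changes(d_h=1.0, t_cold=300.0, t_hot=300.0)
print(ds_equal)
```

Because the same $dH$ is divided by a smaller temperature for the colder subsystem, its entropy gain outweighs the entropy loss of the hotter one, and the sum is positive unless the temperatures are equal.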

Suppose, in particular, that we bring two subsystems into thermal contact to
make a total system. Their initial temperatures are $T_{1i}$ and $T_{2i}$,
with $T_{1i} < T_{2i}$, and they reach a final common temperature $T_f$. In so
doing, subsystem 1 gains heat and subsystem 2 loses the same amount of heat.
Thus there is no change in the heat energy of the total system. This follows
from a form of the law of energy conservation, which is known as the first
law of thermodynamics in Chap. 19. But there is a change $\Delta S$ in the
entropy of the total system:

$$\Delta S = \Delta S_1 + \Delta S_2 = \int_{T_{1i}}^{T_f} dS_1 + \int_{T_{2i}}^{T_f} dS_2$$

or

$$\Delta S = \int_{T_{1i}}^{T_f} \frac{dH_1}{T_1} + \int_{T_{2i}}^{T_f} \frac{dH_2}{T_2} \tag{18-63}$$

According to the second law of thermodynamics, the value of $\Delta S$ must
satisfy the inequality $\Delta S \ge 0$. Example 18-10 shows that it does, for
a specific case, by using Eq. (18-63) to evaluate $\Delta S$.


EXAMPLE 18-10

A copper  can,  of  negligible  heat  capacity,  contains  1.000  kg  of  water  just  above  the 
freezing  point.  A similar  can  contains  1.000  kg  of  water  just  below  the  boiling  point. 
The  two  cans  are  brought  into  thermal  contact.  Find  the  change  in  entropy  of  the 
cold  water,  of  the  hot  water,  and  of  the  total  system. 

■ To carry out the integrations required to evaluate the terms in Eq. (18-63),
you must express $dH_1$ and $dH_2$ in terms of $T$. In the present case you
can do so by writing $dE = dH$ in Eq. (18-23) and then solving for $dH$, to
obtain

$$dH = cm\, dT$$


This applies to either the hot water or the cold water if you use
$c = 4186$ J/(kg·K), the specific heat capacity of water, and $m = 1.000$ kg.
Since the heat capacities of the two systems are equal, the final temperature
will be the average of the initial temperatures:

$$T_f = \frac{T_{1i} + T_{2i}}{2} = \frac{273\ \text{K} + 373\ \text{K}}{2} = 323\ \text{K}$$


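You thus have

$$\Delta S = \Delta S_1 + \Delta S_2 = cm \int_{T_{1i}}^{T_f} \frac{dT_1}{T_1} + cm \int_{T_{2i}}^{T_f} \frac{dT_2}{T_2}$$

Making use of Eq. (7-21) to evaluate the integrals, you find

$$\begin{aligned}
\Delta S &= 4186\ \text{J/(kg}\cdot\text{K)} \times 1.000\ \text{kg} \left\{ \left[ (\ln T_1)_{T_1 = 323\ \text{K}} - (\ln T_1)_{T_1 = 273\ \text{K}} \right] + \left[ (\ln T_2)_{T_2 = 323\ \text{K}} - (\ln T_2)_{T_2 = 373\ \text{K}} \right] \right\} \\
&= 4186\ \text{J/K} \times \left[ \ln (323\ \text{K}/273\ \text{K}) + \ln (323\ \text{K}/373\ \text{K}) \right] \\
&= 4186\ \text{J/K} \times (0.168 - 0.144)
\end{aligned}$$

or

$$\Delta S = 703\ \text{J/K} - 603\ \text{J/K}$$

Thus you have $\Delta S_1 = 703$ J/K, $\Delta S_2 = -603$ J/K, and
$\Delta S = 100$ J/K.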

The  entropy  of  the  hot  water  decreases  on  cooling,  but  not  as  much  as  the  en- 
tropy of  the  cold  water  increases  on  warming.  So  the  entropy  of  the  total  system  in- 
creases. 
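
The arithmetic of this example can be reproduced with a short script. It is a sketch only: the variable names are assumptions, and because it uses unrounded logarithms, its results may differ from the three-figure values quoted above by a joule per kelvin or two.

```python
import math

C_WATER = 4186.0      # specific heat capacity of water, J/(kg K)
MASS = 1.000          # mass of each sample of water, kg
T1_INITIAL = 273.0    # cold water, K
T2_INITIAL = 373.0    # hot water, K

# Equal heat capacities, so the final temperature is the average.
t_final = (T1_INITIAL + T2_INITIAL) / 2.0

# Integrating dS = c m dT / T gives c m ln(T_final / T_initial).
ds_cold = C_WATER * MASS * math.log(t_final / T1_INITIAL)
ds_hot = C_WATER * MASS * math.log(t_final / T2_INITIAL)
ds_total = ds_cold + ds_hot

print(t_final, ds_cold, ds_hot, ds_total)
```

The cold-water term is positive, the hot-water term is negative and smaller in magnitude, and the total is positive, in agreement with the second law.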

The symmetry of the situation should make it apparent to you that $\Delta S$
will have a maximum value when $T_{1f} = T_{2f} = (T_{1i} + T_{2i})/2$, as
assumed. (You can give a numerical proof by evaluating $\Delta S$ for several
pairs of values of $T_{1f}$ and $T_{2f}$ which differ by a few degrees from
323 K. How would you prove the statement analytically?) This means that the
entropy $S$ of the total system will have a maximum value when its two equal
parts have reached a common temperature which is the average of their initial
temperatures.
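
The numerical check suggested in the example might be sketched as follows. The script assumes, as energy conservation with equal heat capacities requires, that the two final temperatures satisfy $T_{1f} + T_{2f} = T_{1i} + T_{2i}$; the grid of trial temperatures and all names are choices made for illustration.

```python
import math

C_WATER = 4186.0      # J/(kg K)
MASS = 1.000          # kg
T1_INITIAL = 273.0    # K
T2_INITIAL = 373.0    # K


def total_entropy_change(t1_final):
    """Delta S when the cold water ends at t1_final.

    Assumption: the heat gained by the cold water equals the heat lost by
    the hot water, so t2_final = T1_INITIAL + T2_INITIAL - t1_final.
    """
    t2_final = T1_INITIAL + T2_INITIAL - t1_final
    return C_WATER * MASS * (
        math.log(t1_final / T1_INITIAL) + math.log(t2_final / T2_INITIAL)
    )


# Trial values of T_1f from 313 K to 333 K in steps of 0.5 K.
candidates = [313.0 + 0.5 * i for i in range(41)]
best_t1 = max(candidates, key=total_entropy_change)

print(best_t1, total_entropy_change(best_t1))
```

On this grid the largest $\Delta S$ occurs at the trial value equal to the average of the initial temperatures, which is consistent with the symmetry argument; a finer grid would only narrow the bracket around the same point.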




Example  18-10  allows  us  to  establish  the  relation  between  the  concept 
of  the  equilibrium  macrostate,  introduced  in  this  section,  and  that  of  thermal 
equilibrium,  introduced  in  Sec.  18-3.  Immediately  after  the  two  parts  of  the 
total  system  are  put  into  thermal  contact,  the  total  system  is  not  in  thermal 
equilibrium  because  the  two  joined  parts  are  not  at  the  same  temperature. 
But  as  the  temperature  of  the  initially  cooler  part  increases  and  that  of  the 
initially  warmer  part  decreases,  the  two  parts  approach  a common  temper- 
ature. When  both  parts  have  the  same  temperature,  the  total  system  is  in 
thermal  equilibrium.  Furthermore,  the  total  system  is  not  in  its  equilibrium 
macrostate  immediately  after  its  two  parts  are  joined,  but  it  ends  up  in  the 
equilibrium  macrostate  when  they  have  the  same  temperature.  This  is  so 
because  the  entropy  of  the  total  system  is  then  a maximum.  A maximum 
entropy  means  that  it  is  in  a macrostate  of  maximum  probability,  and  this  is 
the  equilibrium  macrostate.  Hence  we  can  say  that  when  a system  is  in  thermal 
equilibrium,  the  system  is  in  its  equilibrium  macrostate. 

EXERCISES 

Group  A 

18-1.  Molecules  in  a gas  mixture.  One  kilomole  of  he- 
lium gas  and  one  kilomole  of  argon  gas  are  in  a tightly 
sealed  container  whose  volume  is  V = 20  m3.  The  gas 
mixture  is  allowed  to  come  to  equilibrium  at  room  tem- 
perature. 

a.  What  is  the  average  energy  of  one  of  the  helium 
molecules?  Of  one  of  the  argon  molecules? 

b. What is the average speed of one of the helium molecules? Of one of the
argon molecules? [One way to state the average speed is to use the “rms
(root-mean-square) speed,” $v_{\text{rms}} = \sqrt{\langle v^2 \rangle}$.]
Compare your answers with the speed of sound in air at standard temperature
and pressure, $v_s = 330$ m/s.

c.  Calculate  the  pressure  in  the  box. 

18-2. Ideal gases. Chamber A contains pure helium gas; chamber B contains
pure neon gas. The gas pressures and temperatures in the two chambers are the
same. Each gas consists of monatomic molecules and can be considered ideal.
The mass of a neon atom is five times the mass of a helium atom.

a.  Compare  the  number  of  molecules  per  unit  vol- 
ume in  the  two  chambers. 

b.  Compare  the  mass  per  unit  volume  in  the  two 
chambers. 

18-3. Argon gas. A container of volume 8.0 m³ contains an ideal gas at a
temperature of 300 K and a pressure of $2.0 \times 10^4$ Pa (≈ 0.20 atm).

a.  What  is  the  total  number  of  gas  molecules  in  the 
container? 

b.  What  is  the  number  of  molecules  per  unit  volume? 

c.  What  is  the  total  kinetic  energy  of  the  gas  mole- 
cules in  the  container? 

d. Compare the result of part c with the kinetic energy of a rifle bullet
(mass ≈ $3 \times 10^{-2}$ kg; speed ≈ $4 \times 10^2$ m/s).

e.  What  is  the  average  kinetic  energy  per  molecule  in 
the  gas? 


18-4. Thermal energies. An energy unit that is useful in the analysis of
atomic systems is the electron volt (eV), which is equal to
$1.60210 \times 10^{-19}$ J.

a.  What  is  the  temperature  of  an  ideal  gas  whose 
molecules  have  an  average  translational  kinetic  energy  of 
1.00  eV? 

b.  What  is  the  average  translational  kinetic  energy  (in 
eV)  of  the  molecules  in  an  ideal  gas  at  room  temperature 
(300  K)? 

18-5.  Avogadro’s  law.  Avogadro's  law  states  that  equal 
volumes  of  ideal  gases  at  the  same  temperature  and  pres- 
sure have  equal  numbers  of  molecules.  Show  that  this 
follows  from  Eq.  (18-13)  and  the  equipartition  theorem. 

18-6.  All  that  glitters  is  not  gold.  A collector  of  precious 
metals  compares  the  amount  of  heat  required  to  raise  by 
one  Celsius  degree  the  temperature  of  one  gram  of  gold 
and  one  gram  of  silver.  What  does  her  comparison  show? 

18-7.  Snake  eyes.  In  a throw  of  three  dice,  all  three 
showed  one  spot.  In  a second  throw,  the  red  die  showed  a 
two,  the  white  die  a three,  and  the  blue  die  a four. 

a.  Which  of  the  two  results  was  the  most  probable? 

b.  Calculate  the  probability  of  getting  three  ones  on  a 
single  throw  of  the  dice. 

c.  Calculate  the  probability  of  getting  a two,  a three, 
and  a four  on  a throw,  regardless  of  the  color  of  the  dice. 

18-8.  Not  likely. 

a.  There  are  n molecules  in  a container.  What  is  the 
probability  that,  on  examination,  all  the  molecules  will  be 
found  in  the  left  half  of  the  container? 

b.  One  kilomole  of  an  ideal  gas  at  0°C  and  1 atm  pres- 
sure contains  Avogadro’s  number  of  molecules  in  22.4  m3. 
How  many  molecules  are  there  in  one  cubic  centimeter? 

c.  Calculate  the  numerical  value  of  the  probability  in 
part  a as  a power  of  10  if  the  container  has  a volume  of 
one  cubic  centimeter. 




18-9.  Molecular  energies  and  speeds  in  the  terrestrial 
atmosphere.  The  hottest  naturally  occurring  air  tempera- 
ture at  the  earth's  surface  is  approximately  330  K;  the  cold- 
est air  temperature  is  about  185  K. 

a.  Express  the  average  kinetic  energy  per  molecule  at 
185  K as  a fraction  of  the  average  kinetic  energy  per  mole- 
cule at  330  K. 

b.  Find  the  corresponding  ratio  of  rms  molecular 
speeds. 

18-10.  Entropy  change  of  melting  ice.  Exactly  1 kg  of  ice 
at  its  melting  point  is  changed  to  water  without  increasing 
its  temperature.  How  much  has  its  entropy  increased? 

18-11. Velocity selector for molecules. In the apparatus illustrated in
Fig. 18-17, the length $l$ of the rotating drum is 20.0 cm and the angular
displacement $\phi$ between the entrance and exit of a molecule from the
groove is 5.0°. See Fig. 18E-11. Calculate the angular speed of the drum which
will select molecules with speeds of 300 m/s. How many rotations per minute is
this?


Group  B 

18-12. Ideal gas mixtures. In accordance with the law of partial pressures,
in a mixture of ideal gases at temperature $T$, the total pressure is
$p = \sum_j (n'_j kT) = (\sum_j n'_j) kT = n'kT$. Here $n'_j$ is the number
per unit volume of gas molecules of type $j$, and $n'$ is the overall number
per unit volume. Let $\rho'_j$ represent the mass per unit volume of gas
molecules of type $j$, and let $m_j$ represent the mass of each type $j$
molecule. The “molecular weight” $\mu_j$ of type $j$ molecules is given by
$\mu_j = A m_j$, where $A$ is Avogadro’s number.

a. Show that $n'_j = A \rho'_j / \mu_j = \rho'_j / m_j$.

b. The average molecular weight $\langle \mu \rangle$ and the average mass
per molecule $\langle m \rangle$ of a gas mixture are defined by
$\langle m \rangle = \langle \mu \rangle / A = \rho' / n'$, where $\rho'$ is
the total mass density: $\rho' = \sum_j \rho'_j$. Show that
$\langle m \rangle = (\sum_j n'_j m_j) / (\sum_j n'_j)$.

c. Show that the total pressure $p$ of a mixture of ideal gases is given by
$p = \rho' kT / \langle m \rangle$.

d. The average molecular weight $\langle \mu \rangle$ of the terrestrial
atmosphere is 28.97 kg/kmol. Find the total mass of the air in a gymnasium
where the temperature $T = 295$ K and the pressure $p = 1.00$ atm. The
gymnasium has dimensions 40 m × 30 m × 10 m. Compare the total mass you
obtain to the mass of an African bull elephant (6000 kg).

18-13.  Helium  chamber  and  mean  free  path.  A chamber 
contains  (monatomic)  helium  gas  at  a pressure  of  1.00  atm 
and  a temperature  of  273  K. 


a. What is the number per unit volume $n'$ of helium atoms in the chamber?

b. The average separation $d$ between any atom and its nearest neighbor is
given by the approximate equation $d \approx (n')^{-1/3}$. Evaluate $d$ for
the helium gas.

The average distance $\lambda$ that an atom can travel between successive
collisions with other atoms is called the mean free path. The mean free path
is given by the approximate equation $\lambda \approx 1/(n'\sigma)$, where
$\sigma$ is called the collision cross section. The collision cross section
does not depend on the gas density.

c. For helium atoms at ordinary temperatures, the cross section $\sigma$ is
approximately $1 \times 10^{-20}$ m². Evaluate the mean free path for the
sample of helium gas described above.

d. Suppose the helium chamber is 1.0 m in diameter. At the given temperature
of 273 K, how much would the pressure have to be reduced in order for the
mean free path to be equal to the chamber diameter?

18-14. Energy transfer by moving piston. Consider an ideal gas confined to a
cylinder, one end of which is closed by an extremely massive smooth piston of
area $A$. The $x$ axis is perpendicular to the face of the piston, which
moves with velocity $u\hat{x}$, compressing the gas. Assume that the piston’s
speed $u$ is much less than the average speed of a gas molecule.

a. A molecule of mass $m$ impinges on the piston with velocity
$\mathbf{v} = -|v_x|\hat{x} + v_y\hat{y} + v_z\hat{z}$. What is its velocity
after it strikes the piston? Hint: This question can be answered fairly easily
by examining the collision in a reference frame where the piston is at rest,
considering it to be elastic.

b.  What  is  the  change  in  kinetic  energy  of  the  mole- 
cule as  a result  of  its  collision  with  the  piston? 

c.  How  much  work  did  the  piston  do  on  the  mole- 
cule? 

d. How much work does the piston do on the entire gas during a time
$\Delta t$?

e. Now make the approximation that the speed $u$ of the piston is very much
less than the average molecular speed and show that the result of part d can
be expressed as $W = pA\,\Delta x$, where $\Delta x = u\,\Delta t$ is the
displacement of the piston and $p$ is the gas pressure. Why is this a
reasonable result?

18-15. A Clausius gas. Consider a gas of $N$ hard-sphere molecules which
obeys the Clausius equation of state. The total volume $V_N$ of the molecules
themselves is given by $V_N = N \frac{4}{3}\pi r^3 = Nv$, where $r$ is the
radius and $v$ the volume of each molecule.

a. Show that the Clausius equation of state, Eq. (18-21a), can be written as
$p(1 - 4V_N/V) = NkT/V$.

b. The Clausius equation is an accurate equation of state only when
$V_N \ll V$. Show that under this restriction, the following is an accurate
expression for the pressure:

$$p \approx \frac{NkT}{V}\left(1 + \frac{4V_N}{V}\right) = \frac{NkT}{V}\left(1 + \frac{4Nv}{V}\right)$$



18-16.  Equipartition  on  an  air  table.  A particular  air 
table  has  retaining  rails  that  are  kept  in  continual  steady 
vibration  with  the  help  of  an  electric  vibrator.  Pucks  on  this 
table  serve  as  a good  analogue  to  hard-sphere  molecules  in 
a chamber,  provided  that  the  two-dimensional  character 
of  the  air  table  is  properly  taken  into  account.  One  variety 
of  puck  used  on  this  table  is  in  the  form  of  a circular  disk 
of  radius  r and  uniform  density.  The  rim  of  each  disk  is 
rough,  so  that  these  pucks  can  exert  torques  upon  one  an- 
other during  collisions.  A second  variety  of  puck  has  the 
same  mass,  radius,  and  external  appearance  as  the  pucks 
just  described.  However,  this  second  variety  actually  has  a 
highly  nonuniform  mass  distribution,  with  essentially  all 
of  the  mass  concentrated  at  the  rim.  How  could  you  recog- 
nize these  pucks  if  they  were  moving  around  among  the 
other  variety  on  the  air  table? 

18-17.  Brownian  motion.  Any  particle  which  is  sus- 
pended in  a fluid  suffers  collisions  with  the  molecules  in 
the  fluid.  As  a result,  the  particle  exhibits  an  aimless  wan- 
dering called  Brownian  motion.  The  suspended  particle’s 
speed  and  direction  are  not  constant  — they  change  at 
each  collision.  Each  suspended  particle’s  rms  speed,  found 
as  the  square  root  of  the  time  average  of  the  squared  speed 
of  that  particular  particle,  can  be  shown  to  be  equal  to  the 
rms  speed  as  evaluated  in  the  standard  manner  given  in 
the  text. 

a.  With  the  help  of  the  equipartition  theorem  and  of 
the  equality  mentioned  above,  find  the  time-averaged  rms 
speed  of  a particle  of  mass  M suspended  in  a fluid  at  tem- 
perature T. 

b. Evaluate your results for a polyethylene sphere $1.0 \times 10^{-6}$ m in
diameter; this object would be just barely visible in a microscope. (The
density of polyethylene is 0.95 g/cm³; use a temperature of 300 K.)

c. For suspended spheres of a given density, how does the rms speed of
Brownian motion depend upon the radius of the sphere? That is, find the
exponent $y$ in the proportionality $v_{\text{rms}} \propto r^{-y}$.

d.  If  the  rotational  motions  of  a suspended  sphere 
are  included,  what  is  the  average  total  kinetic  energy  of  a 
suspended  sphere,  as  predicted  by  the  equipartition 
theorem? 

18-18. Rotating oxygen molecule. An oxygen molecule can be thought of as a
dumbbell. Using this model, estimate the angular speed of a typical molecule
in a container of oxygen at standard temperature and pressure. How does the
rotational speed of one of the atoms in the molecule compare with the average
translational speed of the entire molecule?

18-19.  Does  he  have  all  his  marbles ? There  are  three 
boxes  labeled  1,  2,  and  3 and  three  marbles  colored  red, 
green,  and  blue.  How  many  microstates  are  there  for  the 
macrostate  in  which 

a.  all  the  marbles  are  in  box  3? 

b.  two  marbles  are  in  box  3 and  one  is  in  box  2? 

c.  there  is  one  marble  in  each  box? 


18-20. Tossing dice: microstates and macrostates.

a. The microstate of a tossed die is easily indexed by the number of spots on
the upper face. What is the total number of distinct microstates of a single
tossed die? What is the probability of occurrence of any given one of these
when a die is tossed?

b. A die is tossed once and its upper face observed; the result is denoted by
the number of spots $T_1$. Then the die is tossed again and the result is
recorded as $T_2$. How many distinct outcomes $(T_1, T_2)$ are possible for
this double-toss experiment?

c. What are the relative likelihoods of occurrence of each of the double-toss
outcomes $(T_1, T_2)$? Justify your answer. What is the actual probability of
the particular outcome (4, 2) in a double-toss experiment? The various
outcomes $(T_1, T_2)$ are the microstates of the double-toss experiment.

d. Let $S = T_1 + T_2$. How many microstates correspond to each of the
following results (macrostates) in the double-toss experiment? What is the
probability of occurrence of each of the macrostates?

(i) $S \ge 9$    (iii) $4 \le S \le 6$

(ii) $S = 7$    (iv) $S \ne 3$

e. Find the probability of each of the following macrostates in the
double-toss experiment.

(i) $T_2 > T_1$    (iii) $T_2 - T_1 > 3$

(ii) $T_2 = T_1$    (iv) $|T_2 - T_1| \ge 2$.

18-21. States of a crystal in a magnetic field. Quantum mechanical
considerations tell us that when certain magnetic atoms are placed in a
strong magnetic field they can only have energies $-\epsilon_1$ and
$+\epsilon_1$. A crystal consisting of $N$ such atoms is placed in a strong
magnetic field.

a.  Describe  the  single  object  states,  the  microstates, 
and  macrostates  of  the  crystal. 

b.  The  crystal  is  in  thermal  equilibrium  at  absolute 
temperature  T.  Find  the  probability  of  each  single  object 
state,  the  average  energy  of  an  atom,  and  the  total  energy 
of  the  crystal. 

c.  Find  the  molecular  heat  capacity  of  the  crystal. 

18-22. Partition function. The partition function $Z$ of a single object in a
system is defined to be

$$Z = \sum_{\text{single-object states}} e^{-\beta \epsilon_i}$$

or, alternatively,

$$Z = \sum_{\text{energy levels}} G(\epsilon_i)\, e^{-\beta \epsilon_i}$$

In these expressions $\beta = 1/kT$, $\epsilon_i$ is the object’s energy, and
$G(\epsilon_i)$ is the density-of-states factor.

a. Show that the average energy of the object is given by

$$\bar{\epsilon} = -\frac{1}{Z}\frac{dZ}{d\beta} = -\frac{d \ln Z}{d\beta}$$

b. Show that this is also equal to

$$\bar{\epsilon} = (kT^2/Z)(dZ/dT)$$

18-23.  An  ideal  gas  in  an  imaginary  world.  Consider  an 
ideal  gas  in  a two-dimensional  world. 

a.  Modify  the  arguments  leading  to  Eqs.  (18-50)  and 
(18-51),  and  find  G(v)  and  n(v)  for  the  ideal  gas  in  two  di- 
mensions. 

b.  Find  the  most  probable  speed  of  a molecule  in  the 
two-dimensional  gas. 

c. Can you generalize these results to an imaginary world of $D$ dimensions
with $D > 3$?

18-24. Maxwell-Boltzmann energy distribution. The number of molecules in an
ideal gas having speeds between $v$ and $v + dv$ is $n(v)\,dv$, where $n(v)$
is the Maxwell-Boltzmann speed distribution function.

a. Using the fact that the kinetic energy of a molecule is
$\epsilon = \frac{1}{2}mv^2$, find the range of speeds that corresponds to the
range of kinetic energies between $\epsilon$ and $\epsilon + d\epsilon$. Then
express the number of molecules with kinetic energies in this range as
$n'(\epsilon)\,d\epsilon$ and find $n'(\epsilon)$.

b.  What  is  the  most  probable  kinetic  energy  of  a mol- 
ecule? 

18-25. Molecules in a box. The molecules-in-a-box simulation is started with
30 molecules in each half of the box. Show analytically that the probability
for the number $n_l$ of molecules in the left half of the box to
monotonically increase to 60 is $1.2 \times 10^{-21}$.

18-26. Entropy and a large heat reservoir. A large heat reservoir at 100°C in
contact with 1.0 kg of water warms it to 100°C. Show that there is an
increase in the entropy of the entire system.

18-27. Entropy and two large heat reservoirs. A quantity of heat $H$ is
transferred from a large heat reservoir at temperature $T_1$ to another large
heat reservoir at temperature $T_2$, with $T_1 > T_2$ required for
spontaneous transfer. The heat reservoirs have such large capacities that
there is no observable change in their temperatures. Show that the entropy of
the entire system has increased.


Group  C 

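18-28. A mixture of Clausius gases, I. Consider two monatomic hard-sphere
gases $\alpha$ and $\beta$ with atomic radii $r_\alpha$ and $r_\beta$. A pure
sample of gas $\alpha$ obeys the equation

$$p_{\alpha 0} = \frac{N_\alpha kT}{V - N_\alpha v_{\alpha\alpha}/2}$$

where $v_{\alpha\alpha} = \frac{4}{3}\pi (r_\alpha + r_\alpha)^3 = 8 \cdot \frac{4}{3}\pi r_\alpha^3$.
Expressed in words, $v_{\alpha\alpha}$ is the volume excluded in a collision
between two atoms of type $\alpha$. A pure sample of gas $\beta$ obeys the
completely analogous equation

$$p_{\beta 0} = \frac{N_\beta kT}{V - N_\beta v_{\beta\beta}/2}$$

a. Show that if gases $\alpha$ and $\beta$ are mixed, so that they must
coexist within the same volume $V$ at temperature $T$, then the total gas
pressure $p$ is given by

$$p = \frac{N_\alpha kT}{V - N_\alpha v_{\alpha\alpha}/2 - N_\beta v_{\alpha\beta}/2} + \frac{N_\beta kT}{V - N_\beta v_{\beta\beta}/2 - N_\alpha v_{\alpha\beta}/2}$$

where $v_{\alpha\beta} = \frac{4}{3}\pi (r_\alpha + r_\beta)^3$. (Hint:
Carefully account for the volume excluded by collisions between unlike
molecules.)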

b. Show that the pressure $p$ given in part a exceeds
the sum of the pressures $p_{\alpha 0}$ and $p_{\beta 0}$ which each gas would
exert if it were alone in the chamber. Can you suggest a
physical explanation for this result?

c. Show that $p > p_{\alpha 0} + p_{\beta 0}$ even if we suppose that
$r_\beta \to 0$. Can you account for this?

d. Show that if the two gases happen to be identical,
then the expression given in part a is in complete agreement
with the total pressure that would be obtained by
working directly with the equation for a single sample of
pure $\alpha$ gas (and using a total number of $\alpha$ molecules equal
to the sum $N_\alpha + N_\beta$). Such agreement is most certainly a
necessary condition for self-consistency; verifying the
agreement can be a source of confidence in a new and
unfamiliar equation, such as that given in part a.
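
The consistency check of part d, and the inequality of part b, can also be explored numerically before attempting the algebra. The Python sketch below encodes the pure-gas and mixture expressions as given in this exercise; the function names and all numerical values (radii, particle numbers, volume, and kT) are illustrative assumptions chosen only so that the excluded volume is small compared with V.

```python
import math

def excluded_volume(r1, r2):
    """Volume excluded in a collision between hard spheres of radii r1, r2."""
    return 4.0 / 3.0 * math.pi * (r1 + r2) ** 3

def pure_pressure(n_atoms, r, volume, kT):
    """Clausius equation for a pure hard-sphere gas."""
    return n_atoms * kT / (volume - n_atoms * excluded_volume(r, r) / 2)

def mixture_pressure(n_a, n_b, r_a, r_b, volume, kT):
    """Total pressure of the mixture, as written in part a."""
    v_aa = excluded_volume(r_a, r_a)
    v_bb = excluded_volume(r_b, r_b)
    v_ab = excluded_volume(r_a, r_b)
    p_a = n_a * kT / (volume - n_a * v_aa / 2 - n_b * v_ab / 2)
    p_b = n_b * kT / (volume - n_b * v_bb / 2 - n_a * v_ab / 2)
    return p_a + p_b

# Illustrative (assumed) values in SI units.
kT = 4.0e-21
V = 1.0e-3
Na, Nb = 1.0e22, 2.0e22
ra, rb = 1.5e-10, 2.5e-10

# Part d: identical gases must reproduce the pure-gas result for Na + Nb.
p_identical = mixture_pressure(Na, Nb, ra, ra, V, kT)
p_single = pure_pressure(Na + Nb, ra, V, kT)

# Part b: the mixture pressure exceeds the sum of the separate pressures.
p_mixed = mixture_pressure(Na, Nb, ra, rb, V, kT)
p_separate = pure_pressure(Na, ra, V, kT) + pure_pressure(Nb, rb, V, kT)

print(p_identical, p_single)
print(p_mixed, p_separate)
```

A numerical agreement of this kind does not replace the proof requested, but it is a quick way to catch a slip in the subscripts of the excluded volumes.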

18-29. A mixture of Clausius gases, II. Suppose that two
side-by-side chambers, each with the same volume $V_i$, are
separated by a removable partition. One chamber contains
a pure Clausius gas of type $\alpha$ (see Exercise 18-28),
while the other chamber contains pure $\beta$-type Clausius
gas. The pressures and temperatures in the two chambers
are equal and are given by $p_i$ and $T_i$, respectively. The
partition separating the gases is removed, allowing the
gases to mix and fill the combined volume $V_f = 2V_i$. The
final temperature $T_f$ equals the initial temperature $T_i$. (It
is possible to show that this does not require heat flow into
or from the walls.) Is the final total pressure $p_f$ necessarily
equal to the initial common pressure $p_i$? If so, why? If not,
why not, and under what circumstances, if any, will $p_f = p_i$?

18-30. Dulong-Petit law. The Dulong-Petit law, Eq.
(18-34), holds true surprisingly well for solids if the temperature
is high enough. To predict the behavior of the
heat capacity at lower temperatures, a quantum-mechanical
model must be used. One such model, originally used
by Einstein, assumes that all atoms in a solid vibrate at the
same frequency $\nu$. The total energy of a solid of $N$ atoms is
then the same as the energy of $3N$ one-dimensional oscillators.
The correct quantum-mechanical expression for
the average energy of this collection of oscillators is

$$\langle E \rangle = 3Nh\nu\left[\tfrac{1}{2} + \frac{1}{e^{\beta h\nu} - 1}\right]$$

where $\beta = 1/kT$ and $h = 6.63 \times 10^{-34}$ J·s.

a. Calculate the heat capacity per atom of the solid,
using this model. Show that it can be written as

$$c' = 3k\left(\frac{\Theta}{T}\right)^2 \frac{e^{\Theta/T}}{\left(e^{\Theta/T} - 1\right)^2}$$

where $\Theta = h\nu/k$.



b. Determine the behavior of $c'$ in the limit $T \gg \Theta$.

c. Determine the behavior of $c'$ in the limit $T \ll \Theta$.

d. Sketch a graph of your result. How does the prediction
of this model relate to the Dulong-Petit law?
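
To get a feel for the two limits before working them out analytically, the following Python sketch tabulates the heat capacity per atom in units of $k$ as a function of $T/\Theta$. It assumes the form $c' = 3k(\Theta/T)^2 e^{\Theta/T}/(e^{\Theta/T}-1)^2$ that follows from differentiating the energy expression above with respect to $T$; the function name and the sample temperatures are illustrative.

```python
import math

def einstein_c_over_k(t_over_theta):
    """Einstein-model heat capacity per atom, in units of k.

    The argument is the dimensionless temperature T / Theta.
    """
    x = 1.0 / t_over_theta  # x = Theta / T
    # math.expm1(x) evaluates e**x - 1 accurately even for small x.
    return 3.0 * x * x * math.exp(x) / math.expm1(x) ** 2

for t in (0.05, 0.1, 0.5, 1.0, 2.0, 10.0):
    print(f"T/Theta = {t:5.2f}   c'/k = {einstein_c_over_k(t):.4f}")
```

The tabulated values rise from nearly zero at low temperature toward 3 at high temperature, which is the behavior parts b through d ask you to establish and interpret.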

18-31. Equilibration of harmonic oscillators, I. Consider
two separate collections of harmonic oscillators. Collection
1 consists of $N_1$ interacting oscillators; the total energy of
collection 1 is $E_1$. Collection 2 consists of $N_2$ interacting
oscillators whose combined total energy is $E_2$.

a. What is the average energy $\langle\epsilon_1\rangle$ of the oscillators in
collection 1? Assuming that the collection has reached
equilibrium, what is the probability $P_1(\epsilon)\,d\epsilon$ that any particular
oscillator in collection 1 has energy between $\epsilon$ and
$\epsilon + d\epsilon$?

b. Give the analogous results, $\langle\epsilon_2\rangle$ and $P_2(\epsilon)\,d\epsilon$, for
collection 2.

c. Write down the number of oscillators per unit energy
$n_1(\epsilon)$ and $n_2(\epsilon)$ for the two collections. Suppose now
that the two collections are brought into “contact,” so that
each oscillator can interact with any of the other
$N_1 + N_2 - 1$ oscillators. At the instant when the collections are
brought together, the number of oscillators per unit energy
for the combined collection is certainly given by
$n_0(\epsilon) = n_1(\epsilon) + n_2(\epsilon)$.

d. What is the total energy $E$ of the combined collection?
Find $\langle\epsilon\rangle$, the average energy per oscillator in the
combined collection.

e. After the combined collection has reached equilibrium,
what is the probability $P(\epsilon)\,d\epsilon$ that any particular
oscillator in the collection has an energy between $\epsilon$ and
$\epsilon + d\epsilon$? What is the equilibrium energy distribution $n(\epsilon)$?

f. Show that $n(\epsilon) = n_0(\epsilon)$ if and only if $\langle\epsilon_1\rangle = \langle\epsilon_2\rangle$.

g. Show that if $\langle\epsilon_1\rangle < \langle\epsilon_2\rangle$, then $P(\epsilon) < P_1(\epsilon)$ for $\epsilon < \langle\epsilon_1\rangle$,
and that $P(\epsilon) < P_2(\epsilon)$ for $\epsilon > \langle\epsilon_2\rangle$.

18-32. Equilibration of harmonic oscillators, II. Consider
the two separate collections of oscillators described in
Exercise 18-31. Suppose that $N_1 = 1000$, $E_1$ = 2.0 x 10 1 J,
$N_2 = 2000$, and $E_2 = 1.0 \times 10^{-3}$ J.

a. Evaluate $\langle\epsilon_1\rangle$ and $\langle\epsilon_2\rangle$.

b. Carefully graph, to the same vertical and horizontal
scales, the energy distributions $n_1(\epsilon)$ and $n_2(\epsilon)$ for
$0 < \epsilon \le 2.0 \times 10^{-6}$ J.

c. Use your results for part b to construct a graph of
the energy distribution $n_0(\epsilon) = n_1(\epsilon) + n_2(\epsilon)$.

Suppose that the collections are now combined, as
described in Exercise 18-31.

d. Evaluate $E$ and $\langle\epsilon\rangle$.

e. Using the same scales as in parts b and c, carefully
graph the equilibrium energy distribution $n(\epsilon)$.

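18-33. Mean square deviation from the mean. You have
seen in Example 18-6 that the average value of the energy
of a single object is

$$\langle\epsilon\rangle = \frac{\sum_i \epsilon_i P(\epsilon_i)}{\sum_i P(\epsilon_i)} = \frac{\sum_i \epsilon_i e^{-\beta\epsilon_i}}{\sum_i e^{-\beta\epsilon_i}}$$

where $\beta = 1/kT$. The average value of any function of $\epsilon$
can be expressed in a similar way:

$$\langle f(\epsilon)\rangle = \frac{\sum_i f(\epsilon_i)\, e^{-\beta\epsilon_i}}{\sum_i e^{-\beta\epsilon_i}}$$

(Can you show this to be true?)

a. Show that

$$\langle\epsilon\rangle = -\frac{\partial}{\partial\beta} \ln\left(\sum_i e^{-\beta\epsilon_i}\right)$$

b. The energy of a gas fluctuates slightly from its
average value. The average fluctuation in energy is expressed
as the “mean square deviation from the mean”:
$\langle\Delta\epsilon^2\rangle = \langle(\epsilon - \langle\epsilon\rangle)^2\rangle$. Show that $\langle\Delta\epsilon^2\rangle$ can also be written
as $\langle\Delta\epsilon^2\rangle = \langle\epsilon^2\rangle - \langle\epsilon\rangle^2$.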
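
Both identities in this exercise can be checked numerically on a small set of energy levels before they are proved. The Python sketch below does so; the evenly spaced levels, the value of $\beta$, and the finite-difference step are arbitrary illustrative choices, and the derivative with respect to $\beta$ is approximated by a central difference rather than taken analytically.

```python
import math

def boltzmann_average(f, levels, beta):
    """Average of f(e) over the levels with weights exp(-beta * e)."""
    weights = [math.exp(-beta * e) for e in levels]
    return sum(f(e) * w for e, w in zip(levels, weights)) / sum(weights)

def log_sum(levels, beta):
    """ln of the sum of Boltzmann factors over the levels."""
    return math.log(sum(math.exp(-beta * e) for e in levels))

# Assumed illustrative system: 50 evenly spaced levels, arbitrary units.
levels = [float(i) for i in range(50)]
beta = 0.7

mean = boltzmann_average(lambda e: e, levels, beta)
mean_sq = boltzmann_average(lambda e: e * e, levels, beta)

# Part a: minus the beta-derivative of the log-sum (central difference).
step = 1e-6
mean_from_log = -(log_sum(levels, beta + step)
                  - log_sum(levels, beta - step)) / (2 * step)

# Part b: mean square deviation computed directly and via <e^2> - <e>^2.
var_direct = boltzmann_average(lambda e: (e - mean) ** 2, levels, beta)
var_identity = mean_sq - mean ** 2

print(mean, mean_from_log)
print(var_direct, var_identity)
```

The two numbers on each printed line should agree to within the accuracy of the finite-difference approximation and floating-point rounding.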

18-34.  Determining  Boltzmann’s  constant.  This  exer- 
cise has  to  do  with  one  method  of  determining  Boltz- 
mann’s  constant  k.  The  Boltzmann  factor  applies  not  only 
to  molecules  but  even  to  particles  large  enough  to  be 
visible  with  a microscope,  provided  they  are  numerous. 
The  French  physicist  Perrin  applied  the  equation  to  a 
colloidal  suspension  containing  particles  visible  as 
microscopic  specks.  The  particles  could  be  considered 
identical  and  were  maintained  at  constant  temperature. 

a. Show that the effective weight $m'g$ of a particle
of density $\rho$ immersed in a liquid of density $\rho_l$ is
$m'g = mg(\rho - \rho_l)/\rho$, where $m$ is the ordinary mass of the
colloidal particle.

b. Show that the Boltzmann factor results in the relation
$n = n_0 e^{-m'gh/kT}$, where $n$ and $n_0$ are the number of
particles per unit volume suspended in the liquid at heights
which differ by $h$.

c. The radius of the individual particles was too small
to measure directly even with the aid of a microscope.
Perrin let a film of the suspension evaporate. The colloidal
particles came together in rows and he could count
the number in a row. In one case, there were 34 in a row
0.020 mm long. Assume that they were spheres in contact.
It had previously been determined that for the particles
$\rho = 1.15$ g/cm³. Calculate $m$.

d. The particles were suspended in a solution for
which $\rho_l$ was equal to 1.10 g/cm³. Perrin focused the
microscope on one level of the fluid and counted the
number of particles visible. He raised the microscope tube
to focus on a higher level and did some more counting.
He found that $n/n_0$ was 2 when the microscope was raised
0.050 mm. The temperature was 20°C. What does this
data give for the value of $k$?
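
The arithmetic of parts c and d is easy to get wrong because of the unit conversions, so a short Python sketch of the calculation may help. It rests on two assumptions that are not fixed by the text: the value $g = 9.8$ m/s² and the reading of the factor of 2 as the ratio of the particle density at the lower level to that at the upper level, which is the reading consistent with the relation in part b. All variable names are illustrative.

```python
import math

g = 9.8                       # m/s^2, assumed value
T = 20.0 + 273.15             # K
rho_particle = 1.15e3         # kg/m^3  (1.15 g/cm^3)
rho_liquid = 1.10e3           # kg/m^3  (1.10 g/cm^3)
row_length = 0.020e-3         # m       (0.020 mm)
particles_in_row = 34
height_step = 0.050e-3        # m       (0.050 mm)
density_ratio = 2.0           # n(lower) / n(upper), assumed reading

# Part c: spheres in contact, so the row length is 34 diameters.
radius = row_length / particles_in_row / 2.0
mass = rho_particle * (4.0 / 3.0) * math.pi * radius ** 3

# Part a: effective (buoyancy-corrected) mass.
effective_mass = mass * (rho_particle - rho_liquid) / rho_particle

# Parts b and d: ln(ratio) = m' g h / (k T), solved for k.
k_estimate = effective_mass * g * height_step / (T * math.log(density_ratio))

print(f"m  = {mass:.3e} kg")
print(f"m' = {effective_mass:.3e} kg")
print(f"k  = {k_estimate:.3e} J/K")
```

With these inputs the estimate comes out within about ten percent of the accepted value of Boltzmann's constant, which gives a sense of the accuracy attainable with this method.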

18-35. Maxwell’s derivation. Maxwell’s original derivation
of the relation $P(v) \propto e^{-av^2}$ involved the use of a diagram
similar to Fig. 18-14. Suppose the probabilities of a
molecule having components of velocity $v_x$, $v_y$, and $v_z$ are
$P_x(v_x)$, $P_y(v_y)$, and $P_z(v_z)$, respectively.

a.  Why  must  the  form  of  these  three  functions  be  the 
same? 




b. If a molecule has these three components of velocity
simultaneously, what is the expression for this probability?
The choice of the direction of the axes is quite arbitrary.
If a different set of axes were used, $v_x$, $v_y$, and $v_z$
would be different and this would change each probability.
But for all sets of axes $v_x^2 + v_y^2 + v_z^2$ is constant, and the
probability expression in part b, which should be independent
of the choice of axes, can be written as
$F(v_x^2 + v_y^2 + v_z^2)$.

c. Equate these two expressions for the same probability
and show that the equation is satisfied by
$P_x(v_x) = Ce^{-av_x^2}$ and that
$F(v_x^2 + v_y^2 + v_z^2) = C^3 e^{-a(v_x^2 + v_y^2 + v_z^2)} = C^3 e^{-av^2}$.


18-36.  Maximize  the  entropy.  Prove  that  the  entropy  of 
the  total  system  considered  in  Example  18-10  maximizes 
when  its  two  equal  parts  have  reached  a common  temper- 
ature which  is  the  average  of  their  initial  temperatures. 

18-37. General proof that $\beta = 1/kT$. The text proves
that $P(\epsilon) = Ce^{-\beta\epsilon}$ is valid for a system containing objects of
any nature, but that the constant in the exponent has the
value $\beta = 1/kT$ only for macroscopic harmonic oscillators.
Prove that the relation $\beta = 1/kT$ applies to objects of any
nature, as follows. Let subsystem 1 consist of identical
macroscopic harmonic oscillators and subsystem 2 of identical
objects of an arbitrary type. The two are in thermal
equilibrium with each other, and together they form an
isolated system of constant energy $E$. First justify writing
the probability of a macrostate of the system in which
subsystem 1 has energy $E_1$ as $P(E_1) = w(E_1)/w_t$, with $w(E_1)$
being the number of its microstates and $w_t$ the total
number of microstates. Then show that
$P(E_1) = w_1(E_1)\,w_2(E - E_1)/w_t$, where $w_1$ and $w_2$ are the number of
microstates in subsystems 1 and 2. Next justify saying that
in thermal equilibrium $E_1$ has a value which makes $P(E_1)$
be at (or at least near) its maximum value. Maximize $P(E_1)$
with respect to $E_1$, and show that this leads to the relation

$$\frac{1}{w_1(E_1)}\frac{dw_1(E_1)}{dE_1} = \frac{1}{w_2(E_2)}\frac{dw_2(E_2)}{dE_2}$$

where $E_2 = E - E_1$. Then apply to each subsystem an
argument like the one leading to Eq. (18-57b), but without
assuming $\beta = 1/kT$, to show that $w_1(E_1) = B_1 e^{\beta_1 E_1}$ and
$w_2(E_2) = B_2 e^{\beta_2 E_2}$. Use this in the relation to obtain $\beta_1 = \beta_2$.
Then argue that since $\beta_1 = 1/kT$ because subsystem 1
consists of macroscopic harmonic oscillators, it follows
that $\beta_2 = 1/kT$ for subsystem 2 composed of objects of
arbitrary nature.
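
The algebra of the maximization step can be outlined as follows. This is only a sketch of the intermediate manipulation, written with $E_2 = E - E_1$ and with $w_t$ treated as a constant; it is not a substitute for the justifications the exercise asks for.

```latex
\begin{align*}
\frac{dP(E_1)}{dE_1}
  &= \frac{1}{w_t}\left[\frac{dw_1(E_1)}{dE_1}\,w_2(E_2)
     + w_1(E_1)\,\frac{dw_2(E_2)}{dE_2}\,\frac{dE_2}{dE_1}\right] = 0,
  \qquad \frac{dE_2}{dE_1} = -1, \\[4pt]
\Longrightarrow\quad
\frac{1}{w_1(E_1)}\frac{dw_1(E_1)}{dE_1}
  &= \frac{1}{w_2(E_2)}\frac{dw_2(E_2)}{dE_2}.
\end{align*}
```

The second line follows from the first on dividing through by the product $w_1(E_1)\,w_2(E_2)$.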


18-38. Proof of zeroth law of thermodynamics. Apply
parts of the argument outlined in Exercise 18-37 to a situation
in which three subsystems are in thermal equilibrium
and prove the zeroth law of thermodynamics.

Numerical 

18-39. Boltzmann factor simulation, I. This exercise requires
the availability of a programmable calculator with
approximately 100 indirectly addressable storage registers,
or a small computer. Consult the molecules-in-a-box
program in the Numerical Calculation Supplement to see
how to generate a uniformly distributed set of random
numbers in the range 0 to 1. Then write a program to
make the calculating device you use perform the experimental
simulation that is related to the Boltzmann factor.
At each stage of the sequence of calculations, the program
should make the device go through the following steps:
(1) Generate at random an integer uniformly distributed
from 1 through 80, and store the number. (2) Do it again,
and store the second random number. (3) Test to see if
the two random numbers are equal. If so, go to step 1. (4)
Sum the contents of registers whose labels are the two
random numbers. (5) Generate a uniformly distributed
random number in the range from 0 to 1. (6) Multiply the
value obtained in step 4 by the random number produced
in step 5, and also by one minus this random number. (7)
Store one of the values obtained in step 6 in one of the
registers used in step 4, and the other value in the other
register. (8) Go to step 1. Use this program to run an
experimental simulation like the one whose results are
plotted in Fig. 18-9, after entering the initial value of the
energy of each molecule in the corresponding storage register.
Compare a plot of your results with that figure.
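
On a present-day computer the eight steps listed above translate almost line for line into a short program. The Python sketch below is one possible rendering, not the program from the Numerical Calculation Supplement. A list stands in for the storage registers, and the initial energy of 4 units per oscillator, the number of exchanges, the random seed, and the unit-width histogram bins are all assumptions made for illustration.

```python
import random

def simulate(n_osc=80, initial_energy=4.0, n_exchanges=20000, seed=0):
    """Random pairwise energy sharing among n_osc oscillators."""
    rng = random.Random(seed)
    energy = [initial_energy] * n_osc       # one "register" per oscillator
    for _ in range(n_exchanges):
        i = rng.randrange(n_osc)            # step 1
        j = rng.randrange(n_osc)            # step 2
        if i == j:                          # step 3: draw a new pair
            continue
        total = energy[i] + energy[j]       # step 4
        r = rng.random()                    # step 5
        energy[i] = total * r               # steps 6 and 7
        energy[j] = total * (1.0 - r)
    return energy                           # step 8 is the loop itself

def histogram(energy, bin_width=1.0):
    """Count oscillators in energy bins of the given width."""
    counts = {}
    for e in energy:
        b = int(e // bin_width)
        counts[b] = counts.get(b, 0) + 1
    return counts

final = simulate()
counts = histogram(final)
for b in sorted(counts):
    print(f"{b:3d} <= e < {b + 1:3d}: {'*' * counts[b]}")
```

Because each exchange only redistributes the energy of one pair, the total energy is unchanged apart from rounding, which provides a simple check that the program is working.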

18-40.  Boltzmann  factor  simulation,  II.  Use  the  pro- 
gram written  in  Exercise  18-39  to  run  an  experimental 
simulation  like  the  one  whose  results  are  plotted  in  Fig. 
18-9,  but  with  half  the  oscillators  given  the  initial  energy 
e = 8 and  the  other  half  given  zero  initial  energy.  Com- 
pare a plot  of  your  results  with  that  figure. 

18-41.  Boltzmann  factor  simulation,  III.  Use  the  pro- 
gram written  in  Exercise  18-39  to  find  the  equilibrium  dis- 
tribution when  all  the  oscillators  are  given  the  initial 
energy  e = 6.  Explain  the  differences  between  this  dis- 
tribution and  the  ones  plotted  in  Figs.  18-9  and  18-11. 

18-42.  Boltzmann  factor  simulation,  IV.  Modify  the  pro- 
gram written  in  Exercise  18-39  so  that  there  are  only  8 
oscillators  in  the  system.  Then  run  an  experimental  simu- 
lation in  which  all  the  oscillators  are  given  the  initial  en- 
ergy e = 4.  Compare  your  results  with  those  plotted  in 
Fig.  18-9.  Explain  the  difference  between  the  two  results. 

18-43. Average vibrational energy. Write a program to
make a calculator, or small computer, evaluate the expression
given in Example 18-6d for $\langle\epsilon\rangle$, the average energy in
the vibrational motion of a hydrogen molecule at a high
temperature. Use it to evaluate this quantity at the temperature
10,000 K, and compare your result with the prediction
$\langle\epsilon\rangle \approx kT$.

18-44.  Vibrational  heat  capacity,  I.  Use  the  program 
written  in  Exercise  18-43  to  evaluate  the  vibrational  con- 
tribution to  the  molecular  heat  capacity  at  constant  vol- 
ume of  hydrogen  in  the  temperature  range  10,000  K to 
11,000  K,  using  the  expression  given  at  the  end  of  Ex- 
ample 18-6d.  Compare  your  results  with  the  prediction 
made  there  that  it  is  approximately  equal  to  k. 




18-45. Vibrational heat capacity, II. Use the program
written in Exercise 18-43 and the procedure of Exercise
18-44 to evaluate the vibrational contribution to the
molecular heat capacity at constant volume of hydrogen in
the following temperature ranges: 600 to 700 K, 1000 to
1100 K, 1500 to 1600 K, 2000 to 2100 K, 3000 to 3100 K,
4000 to 4100 K, 6000 to 6100 K, and 8000 to 8100 K. Use
your results to plot the temperature dependence of the
vibrational contribution to the molecular heat capacity of
hydrogen. Write a one-paragraph discussion of the relation
between your plot and the equipartition theorem.

18-46. Molecules-in-a-box simulation, I. Run the
molecules-in-a-box program with the initial numbers of
molecules in the left and right halves of the box being $n_l = 0$
and $n_r = 60$. Plot your results and compare them with those shown
in Fig. 18-21, commenting on their essential difference.

18-47. Molecules-in-a-box simulation, II. Run the
molecules-in-a-box program with the initial numbers of
molecules in the left and right halves of the box being $n_l = 0$
and $n_r = 6$. Plot your results and compare them with those shown
in Fig. 18-22, commenting on their essential similarity.

18-48. Molecules-in-a-box simulation, III. Run the
molecules-in-a-box program with the initial numbers of
molecules in the left and right halves of the box being
equal, for the following cases: $n_l = n_r = 2, 4, 8, 16$.
Stop the calculating device the first time $n_l$ equals the initial
value of $n_l + n_r$. Record the number of moves which
were required for all the molecules to go in an extreme
fluctuation to the left half of the box. The number of molecules
that are contained in a cube 10 cm in edge length at
room temperature and pressure exceeds $10^{20}$. How difficult
is it for an initial $n_l = 1 \times 10^{20}$, $n_r = 0$ distribution to
become an $n_l = \tfrac{1}{2} \times 10^{20}$, $n_r = \tfrac{1}{2} \times 10^{20}$ distribution? How
difficult is it for an initial $n_l = \tfrac{1}{2} \times 10^{20}$, $n_r = \tfrac{1}{2} \times 10^{20}$
distribution to become an $n_l = 1 \times 10^{20}$, $n_r = 0$ distribution?
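
For readers without access to the supplement's program, the experiment can be sketched in Python as below. The sketch assumes the same move rule as in the simpler checks of this chapter's simulation, namely that each move picks one molecule uniformly at random and transfers it to the other half; the function name, the random seed, and the cap on the number of moves are illustrative. The case $n_l = n_r = 16$ is deliberately left out of the loop because, under this rule, the number of moves needed is typically in the billions.

```python
import random

def moves_until_all_left(n_each, max_moves=10_000_000, seed=0):
    """Number of moves until every molecule is in the left half.

    Starts with n_each molecules in each half. Returns None if the
    extreme fluctuation has not occurred within max_moves moves.
    """
    rng = random.Random(seed)
    total = 2 * n_each
    n_left = n_each
    for move in range(1, max_moves + 1):
        # A uniformly chosen molecule lies in the left half with
        # probability n_left / total; it then moves to the other half.
        if rng.random() < n_left / total:
            n_left -= 1
        else:
            n_left += 1
        if n_left == total:
            return move
    return None

for n_each in (2, 4, 8):
    print(n_each, moves_until_all_left(n_each))
```

Repeating each case with several different seeds gives a feel for how rapidly the typical waiting time grows as the number of molecules is doubled, which is the point of the questions about $10^{20}$ molecules.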

18-49. Random walk. Write a program to make a calculator
or computer perform a one-dimensional “random
walk,” to simulate gas diffusion. At each stage of the calculation
a uniformly distributed random number in the
range 0 to 1 is generated, using the routine in the
molecules-in-a-box program, and compared to the
number $\tfrac{1}{2}$. If the random number is larger, the $x$ coordinate
of a molecule is increased by 1 by adding 1 to the register
storing its value. Otherwise it is decreased by 1 by
subtracting 1 from that register. In subsequent stages the
process is repeated, until you stop the calculating device.
Make a run in which the molecule takes 20 “random steps”
from the initial location $x = 0$, and record its final location.
Do the same thing for a total of 20 runs. Then make
a bar graph of the final locations of the molecules. Repeat,
allowing the molecules to take 80 steps in each walk. Write
a paragraph describing the relation between the average
distance between the initial and final locations and the
number of steps in the walk and explaining it. Write a second
paragraph explaining the connection between the
experimental simulation and the diffusion of a gas of one
species through another.
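
A minimal Python sketch of the random walk described in this exercise is given below. It uses Python's built-in generator instead of the routine in the molecules-in-a-box program, and the seed, the function name, and the choice of the mean absolute displacement as the measure of "average distance" are assumptions made for illustration.

```python
import random

def random_walk(n_steps, rng):
    """Final x coordinate after n_steps unit steps starting from x = 0."""
    x = 0
    for _ in range(n_steps):
        # Compare a uniform random number in [0, 1) with 1/2.
        if rng.random() > 0.5:
            x += 1
        else:
            x -= 1
    return x

rng = random.Random(1)
for n_steps in (20, 80):
    finals = [random_walk(n_steps, rng) for _ in range(20)]
    mean_abs = sum(abs(x) for x in finals) / len(finals)
    print(f"{n_steps} steps: final locations {sorted(finals)}")
    print(f"{n_steps} steps: mean distance from origin {mean_abs:.2f}")
```

With only 20 runs per case the averages scatter noticeably, so increasing the number of runs makes the dependence on the number of steps easier to see.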




Thermodynamics 


19-1  THERMODYNAMIC INTERACTIONS AND THE FIRST LAW OF THERMODYNAMICS


In  this  chapter,  we  continue  the  study  of  systems  consisting  of  very  many 
individual  entities,  such  as  the  molecules  of  a gas,  from  a mainly  macro- 
scopic point  of  view.  The  insights  gained  through  the  study  of  such  systems 
from  a microscopic  point  of  view  in  Chap.  18  enable  us  to  develop  much 
more  powerful  methods  and  to  reach  much  more  general  conclusions  than 
were  possible  on  the  basis  of  the  purely  empirical  approach  of  Chaps.  16 
and  17. 

For the most part, we focus our attention on systems which are in
thermal equilibrium. As was shown in Sec. 18-7, such a system is in (or very
near to) its equilibrium macrostate. In view of this restriction and because of
our almost exclusively macroscopic approach, we are concerned with equilibrium
macrostates only, and not with microstates or single-object states.
Therefore we frequently follow the practice standard in these circumstances
and substitute for the long term “equilibrium macrostate of the
system” the short form state of the system (or sometimes just state), which
in this chapter means exactly the same thing. The state of a system can be
altered by any of a broad class of processes to which the name thermodynamic
interactions is given. The study of such changes, when the system is
never allowed to depart significantly from thermal equilibrium, is called
equilibrium thermodynamics. (Provided the initial and final macrostates
of a system are in thermal equilibrium, it is possible to use equilibrium
thermodynamics to describe the gross changes which take place when the
system passes from one to the other through a series of macrostates which
deviate significantly from equilibrium. But the study of such processes
themselves requires the application of nonequilibrium thermodynamics, which
is not treated in this book.)




Several  simple  special  cases  of  thermodynamic  interactions  have 
already  been  discussed  in  Chap.  17.  Figure  19-1  illustrates  the  typical 
system  involved  in  Example  17-1.  A gas,  consisting  of  a very  large  number 
of  molecules,  is  confined  in  a cylinder  fitted  with  a gas-tight  but  frictionless 
piston  of  negligible  mass.  The  system  can  be  manipulated  in  a variety  of 
ways.  All  these  ways  involve  either  the  transfer  of  heat  into  (or  out  of)  the 
system,  or  the  performance  of  mechanical  work  on  (or  by)  the  system, 
or  both. 

Heat  is  transferred  into  (or  out  of)  the  system  by  placing  the  system 
in  contact  with  another  system,  such  as  a water  bath.  This  process  is  called 
thermal  interaction.  Thermal  interaction  tends  to  change  the  internal  en- 
ergy E of  the  system.  That  is,  it  tends  to  change  the  energy  of  the  system  as 
measured  by  an  observer  in  whose  reference  frame  the  system  as  a whole  is 
at  rest.  (See  the  discussion  at  the  end  of  Sec.  18-2.)  In  thermodynamic  in- 
teractions, we  are  interested  only  in  changes  in  the  internal  energy.  There- 
fore, we  need  not  be  concerned  with  that  part  of  the  internal  energy  which 
cannot  be  affected  by  manipulating  the  system.  In  the  system  of  Fig.  19-1, 
for  example,  it  is  not  necessary  to  consider  the  binding  energy  of  the 
atomic  nuclei  of  the  gas,  because  that  energy  can  be  neither  increased  nor 
decreased  by  warming  the  system  over  the  temperature  range  of  interest  or 
by  moving  the  piston. 

The  system  can  be  manipulated  also  by  doing  mechanical  work  on  it. 
This  can  be  done,  for  instance,  by  moving  the  piston  inward  against  the  re- 
sisting force  produced  by  the  pressure  of  the  gas.  This  type  of  manipula- 
tion also  tends  to  change  the  internal  energy  of  the  system.  Doing  work  on 
a system  always  involves  a change  in  one  or  more  external  parameters  of 
the  system.  An  external  parameter  is  some  macroscopic  quantity  (other 
than  temperature)  which  is  under  external  control.  In  this  example,  the 
significant  external  parameter  is  the  volume  of  the  cylinder,  which  can  be 
calibrated  in  terms  of  the  position  of  the  piston.  But  for  other  systems  it 
could  be  some  quite  different  physical  quantity,  such  as  the  strength  of  an 
externally  applied  magnetic  field. 

Fig. 19-1 A typical system for studying a thermodynamic interaction. A certain quantity
of gas is confined in a cylinder. Heat can be transferred into or out of the system by placing the
system in thermal contact with a water bath. The volume of the gas can be varied, and mechanical
work thus done, by moving the piston. The scale, the pressure gauge, and the thermometer
enable measurement of the volume, pressure, and temperature of the gas, respectively.

Now, the state of the system, that is, its equilibrium macrostate,
depends on its internal energy $E$, and the state changes as $E$ changes. This
is certainly true when the energy is known exactly, as is the case for a monatomic
ideal gas where, according to Eqs. (18-19) and (18-16),

$$E = \tfrac{3}{2}nRT = \tfrac{3}{2}pV \tag{19-1}$$

But it is also true when there is no simple way of determining $E$, as is
usually the case in more complicated systems. We are led by our confidence
in the law of energy conservation to assert that if $\Delta H$ is defined to be the
amount of heat flowing into the system and $\Delta W$ is the amount of work done
on the system by varying one or more external parameters, then the increase
$\Delta E$ in the internal energy of the system is given by the sum

$$\Delta E = \Delta H + \Delta W \tag{19-2}$$

This is the first law of thermodynamics. The sign convention is chosen so
that positive values of $\Delta H$ and $\Delta W$ both lead to a change $\Delta E$ in the internal
energy of the system which has a positive value. (Be careful in comparing
with other books, because there are several sign conventions in current
use.)
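
Since the sign convention is a frequent source of error, a small Python sketch may make it concrete. The first function encodes the first law exactly as written above, with heat flowing into the system and work done on the system both counted as positive. The second function is a hypothetical helper for translating from sources that quote the work done by the system instead. The function names and the numbers in the example are illustrative assumptions.

```python
def delta_internal_energy(heat_into_system, work_on_system):
    """First law in this chapter's convention: dE = dH + dW."""
    return heat_into_system + work_on_system

def delta_internal_energy_from_work_by(heat_into_system, work_by_system):
    """Same law when a source quotes the work done BY the system.

    Work done by the system is the negative of work done on it.
    """
    return delta_internal_energy(heat_into_system, -work_by_system)

# Example (assumed numbers): 100 J of heat flows into a gas while the gas
# does 40 J of work on a piston, so the work done ON the gas is -40 J.
print(delta_internal_energy(100.0, -40.0))
print(delta_internal_energy_from_work_by(100.0, 40.0))
```

Both calls describe the same process and give the same increase in internal energy; only the bookkeeping of the work term differs.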


In Chap. 18 we concentrated on situations involving thermal interaction
only. This corresponds to locking the piston in the cylinder in Fig. 19-1
and considering changes in internal energy due to heat flow $\Delta H$ only. For
an ideal monatomic gas in particular, we used such a process to derive Eq.
(18-25), which expresses $c'_v$, the molecular heat capacity of an ideal gas at
constant volume, in terms of Boltzmann’s constant $k$. The relation is
$c'_v = \tfrac{3}{2}k$. In taking the macroscopic point of view, it is more convenient to
use the kilomole rather than the molecule as the unit of matter. The molar
heat capacity at constant volume $c''_v$ is defined to be the heat capacity of 1
kmol of any substance when it is subjected to a process in which its volume
remains fixed. The molar quantity $c''_v$ is related to the molecular quantity $c'_v$
by the expression $c''_v = Ac'_v$, where $A$ is Avogadro's number, the number of
molecules per kilomole. The molar heat capacity of an ideal gas at constant
volume is thus

$$c''_v = \tfrac{3}{2}Ak$$

But the universal gas constant $R$ is defined by Eq. (17-14a) to be $R = Ak$. So
the molar heat capacity of an ideal gas at constant volume can be written

$$c''_v = \tfrac{3}{2}R \tag{19-3}$$
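
The chain from the molecular to the molar heat capacity can be followed numerically. The Python sketch below uses present-day values of Boltzmann's constant and Avogadro's number, the latter expressed per kilomole to match the unit of matter used in this chapter; these values and the variable names are supplied here for illustration and are not taken from the text.

```python
k = 1.380649e-23          # Boltzmann's constant, J/K
A = 6.02214076e26         # Avogadro's number, molecules per kilomole

c_molecular = 1.5 * k     # c'_v  = (3/2) k, per molecule
c_molar = A * c_molecular # c''_v = A c'_v, per kilomole
R = A * k                 # universal gas constant, J/(kmol K)

print(f"R      = {R:.1f} J/(kmol K)")
print(f"c''_v  = {c_molar:.1f} J/(kmol K)")
print(f"3R / 2 = {1.5 * R:.1f} J/(kmol K)")
```

The last two printed values coincide, which is just Eq. (19-3) evaluated numerically for a monatomic ideal gas.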

A process in which the volume of the system remains fixed is called
isometric. If the cylinder is heated isometrically, the pressure of the gas will
increase in conformity with the ideal-gas law in the form $p = (nR/V)T =
(\text{constant})T$. If the piston is then unlocked, it will move until the pressure $p$
inside the cylinder is equal to the external pressure $p_{\text{atm}}$. For a light,
frictionless piston this motion will be quite sudden. Work will be done in
pushing the external air out of the way, and turbulence will be generated both
inside and outside the cylinder. The motion of the piston may also generate
sound waves whose energy is dissipated far from the system. Then it will be
impossible to restore the piston to its original position without restoring the
lost energy to the system from some external source; the original state of
the system cannot be achieved by the system with its new, smaller energy.

The process just described typifies the class called irreversible processes.
(Other examples are processes 1' and 2' in Sec. 18-7.) Consider such
a process as taking place in a large system comprising the universe as a
whole. This universal system is divided into two subsystems. Subsystem 1 is
the system of interest, and subsystem 2 is the rest of the universe. The energy
of the universe as a whole has not changed as a result of the process.
But the entropy of the universe has increased in the conversion of the ordered
motion of the piston to the disordered motion of the two subsystems
comprising the universe as a whole. A spontaneous restoration of subsystem
1 to its original state would involve a decrease in entropy. But the
word “spontaneous” implies that subsystem 1 is to be either isolated from
the rest of the universe (subsystem 2) during the restoration or in both
thermal and mechanical equilibrium with the rest of the universe, which
consequently exerts no net influence on it. And the second law of thermodynamics
given by Eq. (18-55), $\Delta S \ge 0$, requires that the entropy of an isolated
system (or subsystem) always increase (if it is approaching equilibrium
from a nonequilibrium macrostate) or remain the same (if it is initially in an
equilibrium macrostate). The practical impossibility of spontaneous restoration
is what we mean by “irreversible.”

What  is  most  evident  from  the  macroscopic  point  of  view  is  the  way  in 
which  the  suddenness  of  the  expansion  process  leads  to  an  irrecoverable 
“escape”  of  energy  to  the  outside  world.  Such  a sudden  process  is  called 
nonquasistatic,  for  reasons  which  become  clear  in  the  discussion  of  quasi- 
static processes  immediately  below. 

The  “escape”  or  dissipation  of  energy  from  a system  can  be  avoided 
by  making  the  motion  of  the  system  gradual.  Instead  of  releasing  the 
piston,  it  is  possible  to  reduce  the  external  force  holding  it  in  place  by  an 
infinitesimal  amount.  The  piston  will  then  move  very  slowly.  The  system 
could  be  arranged  as  in  Fig.  19-2,  so  that  the  resisting  force  is  produced  by 
a weight,  which  is  raised  as  the  cylinder  expands.  Work  is  done  by  the  gas 
on  the  weight,  whose  potential  energy  is  increased.  The  idealized  process  is 
called  quasistatic  because  the  system  is  practically  (though  not  quite)  at 
rest — and  thus  in  mechanical  equilibrium — at  all  times.  Also,  because  the 
mechanical  process  takes  place  slowly,  the  temperature  changes  which  re- 
sult from  the  expansion  are  slow,  and  the  entire  system  maintains  a uni- 
form temperature.  Thus  the  system  is  in  thermal  equilibrium  as  well.  If  the 
force  exerted  by  the  weight-cam-gear  device  were  increased  by  an  infinites- 
imal amount,  the  energy  stored  in  the  weight  could  be  restored  to  the  gas 
by  recompressing  the  gas  to  its  original  equilibrium  macrostate.  Thus  this 
particular  quasistatic  process  is  also  a reversible  process. 


Fig.  19-2  A quasistatic  process.  The  gas  in  the 
cylinder  is  initially  at  atmospheric  pressure 
with  the  piston  at  the  position  labeled  A.  The 
system  is  then  heated  with  the  piston  locked  in 
place,  until  the  temperature  and  gas  pressure 
achieve  some  new,  higher  values.  The  piston  is 
then  released  and  moves  to  the  right  as  the  gas 
in  the  cylinder  expands.  Work  is  done  in 
raising the weight by means of the rack-and-pinion
gear, the cam, and the rope which
winds  up  on  the  cam.  As  the  cylinder  expands, 
the  gas  pressure  decreases  and  therefore  the 
force  exerted  by  the  piston  decreases  as  well. 
The  cam  is  shaped  so  that  the  force  exerted  by 
the  piston  at  every  moment  is  just  enough  to 
raise  the  weight.  The  expansion  of  the  gas 
therefore takes place very slowly, or quasistatically.


19-1  Thermodynamic  Interactions  and  the  First  Law  of  Thermodynamics  845 


Fig. 19-3 An isobaric, quasistatic process. By immersing the cylinder in a
series of water baths of gradually increasing temperature, the system is
warmed from initial temperature $T_i$ to final temperature $T_f$. The piston is
exposed to the atmosphere, and the gas pressure is always $p = p_{atm}$. The gas
expands, and the piston moves outward through a displacement Δy.


The  heating  and  expansion  of  the  system  of  Figs.  19-1  and  19-2  can 
also  be  made  to  take  place  simultaneously.  The  process  is  different  from 
the  one  just  discussed  in  important  ways.  In  Fig.  19-3,  the  cylinder  contains 
n kmol of ideal gas at an initial temperature T. The initial state is specified
by  the  equation  of  state  for  an  ideal  gas: 

pV  = nRT  (19-4) 

Now we change the temperature of the system quasistatically. In principle,
this could be done by lifting the system out of one water bath and immersing
it in a series of similar baths, each of which is very slightly warmer
than the last. Here again, the system is essentially in thermal and mechanical
equilibrium at all times. The piston remains free and exposed to the
outside air throughout this process. Any tendency for the pressure inside
the cylinder to change results immediately in a movement of the piston,
and the pressure remains constant. Such a process is called isobaric.

Let us consider the work done in an isobaric process. The work done
on the system in such a process is not zero, as it is in an isometric
(constant-volume) process. As the gas is warmed at constant pressure, it
must expand. In Fig. 19-3, the piston, whose area is A, moves outward
through a displacement given by the signed scalar Δy, which we take to
have a positive value for outward motion. As it does so, it exerts a constant
outward force against the outside air, whose pressure is $p = p_{atm}$. That
force is given by the signed scalar F = pA. The work done by the system on
the outside world (that is, the atmosphere) is thus given by the product
F Δy = pA Δy.

The force exerted by the atmosphere on the piston is equal in magnitude
to, and opposite in direction to, the force exerted by the piston on the
atmosphere. Its value is therefore -F = -pA. The work ΔW done on the
system by the outside world (the atmosphere) is thus given by

$$\Delta W = -pA\,\Delta y$$

Since the quantity A Δy is the volume change ΔV of the cylinder, the work


846  Thermodynamics 


ΔW can be written

$$\Delta W = -p\,\Delta V \tag{19-5a}$$

Fig. 19-4 A thermodynamic process depicted on a p-V diagram. As the volume
of the system changes from $V_A$ to $V_B$, the pressure p varies in a way which
is described by the curve joining the initial state A to the final state B of
the system. This curve is not the only possible path from A to B, and the
thermodynamic process depicted by it is not the only one by which the system
can pass from its initial to its final state.

Even  in  a process  in  which  the  pressure  does  not  remain  constant  as 
the  piston  moves  and  the  cylinder  volume  changes  — that  is,  a process 
which  is  not  isobaric  — the  work  AW  done  on  the  system  by  the  outside 
world  can  still  be  expressed  in  terms  of  the  pressure  and  the  volume 
change.  Consider  a part  of  the  process  in  which  the  piston  moves  through 
an infinitesimal displacement dy. The work dW done on the system is likewise
infinitesimal and is given by the expression

$$dW = -pA\,dy$$

Here the value of the pressure p is that appropriate to the particular position
of the piston. Since the quantity A dy is the infinitesimal volume change
dV of the cylinder, dW can be written

$$dW = -p\,dV \tag{19-5b}$$

Suppose that the entire process begins with the piston in such a position
that the volume of the cylinder has the value $V_i$ and ends with the
piston in such a position that the volume of the cylinder has the value $V_f$.
Then the total work ΔW done on the system by the outside world is

$$\Delta W = \int_{V_i}^{V_f} dW = -\int_{V_i}^{V_f} p\,dV \tag{19-5c}$$

If this equation is to be useful, the pressure p must be determined as a function
of the volume V. Fortunately, it is frequently possible to do this, as will
become evident in due course.
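When p is known as a function of V, the integral in Eq. (19-5c) can also be estimated numerically. The following is a minimal sketch, not part of the original text: the function name `work_on_system`, the trapezoidal rule, and the sample numbers are illustrative assumptions.

```python
def work_on_system(pressure, v_initial, v_final, steps=10_000):
    """Approximate the work done ON the system, -∫ p dV, by the trapezoidal rule.

    `pressure` is any function returning p (in Pa) for a volume V (in m^3).
    """
    dv = (v_final - v_initial) / steps
    integral = 0.0  # running estimate of ∫ p dV
    for k in range(steps):
        v_left = v_initial + k * dv
        v_right = v_left + dv
        integral += 0.5 * (pressure(v_left) + pressure(v_right)) * dv
    return -integral

# Made-up isobaric case: p = 1.0e5 Pa while V grows from 1.0 m^3 to 1.5 m^3.
isobaric_work = work_on_system(lambda v: 1.0e5, 1.0, 1.5)
print(isobaric_work)  # close to -p ΔV, the isobaric result of Eq. (19-5a)
```

For an expansion the returned value is negative, in keeping with the sign convention that ΔW is the work done on the system by the outside world.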

Any process for which p is, in fact, known as a function of V can be depicted
on a graph whose axes are V and p. Figure 19-4 shows such a graph,
which  is  called  a p-V  diagram.  Every  point  on  the  plane  of  the  graph  for 
which  p and  V are  positive  represents  a unique  combination  of  pressure 
and  volume.  If,  in  addition,  the  system  contains  a fixed  quantity  of  a certain 
substance  and  is  in  thermal  and  mechanical  equilibrium,  each  point  must 
have  associated  with  it  also  a unique  temperature,  given  by  the  equation  of 
state  of  that  substance.  Thus  the  state  of  the  system  is  completely  specified  for  every 
point  on  the  pV  plane. 

Suppose,  for  example,  that  the  system  contains  1 kmol  of  an  ideal  gas. 
Then  specifying  p and  V automatically  requires  that  the  temperature  be 
T = pV/R.  Depending  on  the  conditions  of  the  process,  the  system  can  be 
made  to  pass  slowly  from  state  A in  Fig.  19-4  to  state  B along  any  desired 
path.  Each  path  consists  of  a sequence  of  points  specifying  equilibrium 
macrostates  of  the  system.  And  for  each  such  state,  the  explicit  values  of  p 
and  V imply  also  specific  values  of  the  temperature  T,  the  internal  energy 
E,  and  (as  we  will  see  in  more  detail  presently)  the  entropy  S. 

According to Eq. (19-5c), the work done by the system on the outside
world in expanding from volume $V_A$ to the volume $V_B$ (that is, the negative
of the work done on the system by the outside world) is $\int_{V_A}^{V_B} p\,dV$.
This is just the area under the curve in Fig. 19-4. That area depends on the
particular path followed in the process. (Such a path represents the passage of
the system through a particular sequence of pairs of values p and V.) Thus,
although the states represented by points A and B on the p-V diagram each
have a unique internal energy E, there is no such thing as a unique “work
difference” between them.

How can this be? The answer lies in the first law of thermodynamics.
For a given energy difference ΔE between two states, any desired amount
of work ΔW can be done in passing from one to the other, provided the
proper amount of heat ΔH flows into (or out of) the system. Thus, while
the p-V diagram explicitly depicts the work done in a particular process
going from state A to state B in the form of the area under the particular
curve connecting A and B which describes the process, it implicitly requires a
fixed heat flow for any particular process curve.

The conclusion is that “heat H” and “work W” are not uniquely definable
quantities that can be used to specify the state of a system. Likewise,
ΔH, the heat flowing into the system from the outside world, and ΔW, the
work done on the system by the outside world, cannot be used separately to
specify the difference between two states. But their sum, which according
to the first law of thermodynamics, ΔE = ΔH + ΔW, is the internal energy
difference ΔE, can be used for this purpose. Any variable which, like ΔE, p,
or T, can be used to specify the state of a system is called a state variable.


19-2  ISOMETRIC AND ISOBARIC PROCESSES

Fig. 19-5 The cycle of a hypothetical heat engine depicted on a p-V diagram
(pressure p plotted against volume V). The system passes from state A to state
B along path 1 and then returns to state A along path 2.

Since the work done on a system by the outside world, as it is made to pass
from one state to another, depends on the path taken through a diagram
like the p-V diagram of Fig. 19-4, the heat flow into the system from the
outside world likewise depends on the path. In this observation lies the
possibility of heat engines. A heat engine is a device, usually (but not always)
a mechanical device in the familiar sense of the word, which cycles a
working fluid (such as an ideal gas) repeatedly around a closed curve on the
p-V diagram. By means of this process, heat energy can be converted into
mechanical work, or vice versa. Figure 19-5 shows such a hypothetical heat
engine cycle. The system expands from state A (where its volume is $V_A$) to
state B (where its volume is $V_B$) along the upper path 1, and then it returns
to state A along the lower path 2. As its pressure p and its volume V change
through one cycle, the work ΔW done on the system by the outside world is,
according to Eq. (19-5c),


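$$\Delta W = -\underset{\text{path 1}}{\int_{V_A}^{V_B}} p\,dV - \underset{\text{path 2}}{\int_{V_B}^{V_A}} p\,dV$$

This can be rewritten

$$\Delta W = -\left[\,\underset{\text{path 1}}{\int_{V_A}^{V_B}} p\,dV - \underset{\text{path 2}}{\int_{V_A}^{V_B}} p\,dV\right] \tag{19-6a}$$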


The  value  of  the  first  of  these  integrals  is  depicted  by  the  area  in  Fig.  19-5 
shaded  with  diagonal  hatching.  The  value  of  the  second  integral  is  given  by 
the area shaded with horizontal hatching. Thus the work ΔW done on the
system  in  one  cycle,  being  the  negative  of  the  difference  between  the  two 
shaded  areas,  is  the  negative  of  the  area  inside  the  closed  curve  comprising  the 
two  paths.  We  write  this  relation  for  the  work  done  on  the  system  as 


$$\Delta W = -\oint_{\text{closed curve}} p\,dV \tag{19-6b}$$

This  equation  is  applied  to  a specific  heat  engine  cycle  in  Example  19-1. 
The  cycle  involves  a sequence  of  isometric  or  isobaric  processes. 
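The enclosed-area rule of Eq. (19-6b) is easy to check numerically for any cycle whose legs are straight lines on the p-V diagram. The sketch below is an illustration only: the function name `work_on_system_per_cycle` and the triangular cycle are made up for the purpose and do not come from the text.

```python
def work_on_system_per_cycle(vertices):
    """Return ΔW = -∮ p dV for a cycle given as (V, p) points joined by straight legs.

    Each leg contributes the trapezoid area under it, counted as positive when
    V increases and negative when V decreases, so the sum is the enclosed area.
    """
    loop_integral = 0.0  # accumulates ∮ p dV
    count = len(vertices)
    for k in range(count):
        v1, p1 = vertices[k]
        v2, p2 = vertices[(k + 1) % count]  # the last leg returns to the start
        loop_integral += 0.5 * (p1 + p2) * (v2 - v1)
    return -loop_integral

# Hypothetical triangular cycle: expansion along the upper legs, return along the base.
triangle = [(1.0, 1.0e5), (2.0, 2.0e5), (3.0, 1.0e5)]   # (V in m^3, p in Pa)
print(work_on_system_per_cycle(triangle))          # negative: the engine does work
print(work_on_system_per_cycle(triangle[::-1]))    # same cycle run backward
```

Traversing the curve in the opposite sense reverses the sign, which corresponds to work being done on the working fluid rather than by it.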




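Fig. 19-6 A heat engine cycle discussed in Example 19-1. This not very
practical cycle is described by a rectangular curve on a p-V diagram.
[Figure axes: pressure p, with ticks at 1 × 10^5, 2 × 10^5, and 3 × 10^5,
plotted against volume V (in m^3) from 0 to 5; the corners of the rectangle
are labeled K, L, M, and N.]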


EXAMPLE 19-1

The cycle of a possible (but not very practical) heat engine is shown in Fig. 19-6. A
cylinder having initial volume 3.00 m^3 contains 0.100 kmol of helium gas at a
pressure of 2.00 × 10^5 Pa (about 2 atm). This state is represented by point K in the
figure. The system expands quasistatically and isobarically, doing work on some
external load, until its volume is 5.00 m^3 at point L. (In order to keep the pressure
constant as the volume is increased, the temperature must be increased. Thus heat
must be flowing into the system during this part of the cycle.) The piston is then
locked in position, fixing the volume, and the pressure is reduced quasistatically
and isometrically to 1.00 × 10^5 Pa at point M. (This is done by slowly cooling the
system, perhaps by placing it in a series of successively cooler baths.) The system is
then compressed quasistatically and isobarically by pushing in on the piston. (In
order to keep the pressure constant as the volume is decreased, the temperature
must be reduced still further.) At point N, with the volume returned to its initial
value of 3.00 m^3, the piston is again locked to fix the volume, and the pressure is
increased quasistatically and isometrically to its original value of 2.00 × 10^5 Pa, thus
completing the cycle. (In this last step, the temperature must again be increased.)
Find the work ΔW done on the engine in one cycle KLMNK.

■ The easiest way to proceed is to equate the work -ΔW done by the engine to the
area of the rectangle KLMN. That is, the integral displayed in Eq. (19-6b), which
is equal to -ΔW, has the value

$$-\Delta W = \Delta p\,\Delta V = (2.00 \times 10^5\ \text{Pa} - 1.00 \times 10^5\ \text{Pa}) \times (5.00\ \text{m}^3 - 3.00\ \text{m}^3) = 2.00 \times 10^5\ \text{J}$$

Therefore

$$\Delta W = -2.00 \times 10^5\ \text{J}$$

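Alternatively, you can get the same result by evaluating

$$\Delta W = -\int_{V_K}^{V_L} p\,dV - \int_{V_L}^{V_M} p\,dV - \int_{V_M}^{V_N} p\,dV - \int_{V_N}^{V_K} p\,dV$$

$$= -(2.00 \times 10^5\ \text{Pa} \times 2.00\ \text{m}^3) - 0 - [1.00 \times 10^5\ \text{Pa} \times (-2.00\ \text{m}^3)] - 0$$

or

$$\Delta W = -2.00 \times 10^5\ \text{J}$$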

The negative value of ΔW, the work done on the heat engine by the outside world,
means  that  the  heat  engine  does  positive  work  on  the  outside  world.  You  will  see  in 
Example  19-2  that  the  necessary  energy  is  supplied  to  the  engine  from  the  outside 
world  in  the  form  of  heat. 
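The leg-by-leg evaluation in Example 19-1 can be reproduced in a few lines. This is a sketch under the example's own numbers; the helper name `leg_work_on_system` and the tuple layout (volume, pressure) are choices made here for illustration.

```python
# Corner states of the cycle in Example 19-1 as (volume in m^3, pressure in Pa)
K = (3.00, 2.00e5)
L = (5.00, 2.00e5)
M = (5.00, 1.00e5)
N = (3.00, 1.00e5)

def leg_work_on_system(start, end):
    """Work done ON the gas, -∫ p dV, along one isobaric or isometric leg."""
    (v_start, p_start), (v_end, p_end) = start, end
    if v_start == v_end:
        return 0.0                          # isometric leg: dV = 0, no work
    if p_start != p_end:
        raise ValueError("leg is neither isobaric nor isometric")
    return -p_start * (v_end - v_start)     # isobaric leg: ΔW = -p ΔV

cycle = [K, L, M, N, K]
leg_works = [leg_work_on_system(a, b) for a, b in zip(cycle, cycle[1:])]
net_work_on_engine = sum(leg_works)
print(leg_works, net_work_on_engine)
```

Only the two isobaric legs contribute, and their sum is the negative of the rectangle's area, in agreement with both calculations in the example.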


It was James Watt who first recognized the connection between the enclosed
area of the p-V curve and the mechanical work output of a heat engine (which for
Watt meant the steam engine). Following up this idea, Watt invented the steam
engine indicator. In modified form, it is still used today in studies of reciprocating
engines of all kinds. One form of the device is shown in Fig. 19-7. The cord at Q is
attached to the piston rod or some other convenient oscillating part of the engine.
Thus the spring-loaded drum D rotates back and forth on its axis as the engine
runs, and the angular displacement of the drum is proportional to the instantaneous
volume of the cylinder. The pressure gauge on the left is attached by a tube
to the cylinder head, and thus the long arm reads the gas pressure in the cylinder.
A pen on the end of the arm writes on a sheet of paper wrapped around the
oscillating drum and held by the clips CL. It thus produces a p-V diagram. A typical
diagram produced by such a device for a single-acting steam engine is shown in
Fig. 19-8.

Fig. 19-7 A steam engine indicator, a device which traces the actual operating cycle of a heat
engine as a p-V diagram.


Fig. 19-8 A p-V diagram for an operating steam engine, traced by a device
like that shown in Fig. 19-7.


Having calculated the work input to a hypothetical heat engine in the
course of a cycle of operation, we now turn to a calculation of the corresponding
heat input. This requires that we know the heat capacity at constant
volume (that is, determined isometrically) and the heat capacity at
constant pressure (that is, determined isobarically). In Sec. 17-6, we defined
the heat capacity of an arbitrary substance in an empirical fashion,
but did not specify the conditions under which the warming process was to
take place. Here, however, it is important to be explicit in this regard.
Generalizing the definition we have already made in Eq. (17-3) for an ideal gas,
we therefore refine the definition of Sec. 17-6 and define the molar heat
capacity at constant volume $c_v''$ so that it satisfies the empirical equation

$$\Delta H = n c_v''\,\Delta T \qquad \text{for } V = \text{constant} \tag{19-7a}$$

In this equation, ΔH is the heat added to the system which contains n kmol,
and ΔT is the consequent temperature increase. Similarly, we define the
molar heat capacity at constant pressure $c_p''$ according to the empirical
equation

$$\Delta H = n c_p''\,\Delta T \qquad \text{for } p = \text{constant} \tag{19-7b}$$

In an isobaric (constant-p) process, the work done on a system by the
outside world is given by Eq. (19-5a), ΔW = -p ΔV. The first law of
thermodynamics, ΔE = ΔH + ΔW, can therefore be written in the special form

$$\Delta E = \Delta H - p\,\Delta V \qquad \text{for } p = \text{constant} \tag{19-8}$$

And since the process described by this equation is isobaric, we can also
substitute into it the value of ΔH given by Eq. (19-7b) to obtain

$$\Delta E = n c_p''\,\Delta T - p\,\Delta V \qquad \text{for } p = \text{constant} \tag{19-9}$$

The value of $c_p''$ depends on the equation of state of the substance to
which it applies. If the substance under consideration is a monatomic ideal
gas, we can combine the known equation of state, pV = nRT, with Eq.
(19-9) to evaluate $c_p''$ explicitly. Imagine n kmol of the ideal gas to be
confined in a cylinder, where it is subjected to an isobaric process. Since only V
and T remain variable, any change ΔV in the volume must be proportional
to a change ΔT in the temperature. We thus have

$$p\,\Delta V = nR\,\Delta T \qquad \text{for } p = \text{constant} \tag{19-10}$$

The internal energy of an ideal gas is determined entirely by its temperature.
For a monatomic ideal gas, according to Eq. (18-19), $E = \tfrac{3}{2} nRT$.
So any change ΔE in internal energy can be written

$$\Delta E = \tfrac{3}{2}\,nR\,\Delta T \tag{19-11}$$

Substituting Eqs. (19-10) and (19-11) into Eq. (19-9), we find

$$c_p'' = \tfrac{5}{2} R \tag{19-12a}$$
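The substitution can be written out step by step. The lines below are only a sketch of the intermediate algebra, using the symbols of Eqs. (19-9) through (19-11) and writing the molar heat capacity at constant pressure as $c_p''$:

```latex
\begin{align*}
\tfrac{3}{2}\,nR\,\Delta T &= n c_p''\,\Delta T - nR\,\Delta T
  && \text{Eq. (19-9), using Eqs. (19-10) and (19-11)} \\
n c_p''\,\Delta T &= \tfrac{3}{2}\,nR\,\Delta T + nR\,\Delta T
  = \tfrac{5}{2}\,nR\,\Delta T \\
c_p'' &= \tfrac{5}{2}\,R
\end{align*}
```

The extra term $nR\,\Delta T$ on the right of the second line is the work $p\,\Delta V$ that the gas does on the outside world as it expands at constant pressure.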

The molar heat capacity at constant pressure is greater than the molar heat
capacity at constant volume. For a monatomic ideal gas, the latter is given
by Eq. (19-3), which is

$$c_v'' = \tfrac{3}{2} R \tag{19-12b}$$

The difference is due to the fact that the gas expands when it is warmed at
constant pressure and therefore does work on the outside world. Consequently,
energy in the form of heat must be supplied to the system over
and above that required to increase its internal energy as its temperature is
increased. Therefore more heat input is required to produce a given temperature
increase than is the case when the system is warmed at constant
volume.

For a monatomic ideal gas, the difference between the molar heat
capacity at constant pressure and that at constant volume can be found by
subtracting Eq. (19-12b) from Eq. (19-12a). This gives

$$c_p'' - c_v'' = \tfrac{5}{2} R - \tfrac{3}{2} R$$

or

$$c_p'' - c_v'' = R \tag{19-13}$$

Why is it that for an incompressible fluid the heat capacity difference is
$c_p'' - c_v'' = 0$?




In  Example  19-2  the  two  molar  heat  capacities  just  discussed  are  used 
to  calculate  the  heat  input  required  to  drive  the  heat  engine  of  Example 
19-1  through  one  cycle  of  operation.  Thus  the  first  law  of  thermodynamics 
is used to connect the net work output -ΔW of the engine with the necessary
net heat input ΔH.


EXAMPLE  19-2 


For  the  heat  engine  of  Example  19-1,  find  the  temperatures  of  the  system  in  the 
states  represented  by  the  points  K,  L,  M,  and  N in  the  p-V  diagram  of  Fig.  19-6. 
Then calculate the heat input, the heat output, and the net heat flow ΔH into the
system  during  one  cycle. 

■ The working fluid is helium, and since the maximum pressure is only about 2
atm, you can assume that the ideal-gas law is a good approximation, provided the
temperature is always well above the boiling point, 4 K. For any state of the system,
the ideal-gas law pV = nRT gives you

$$T = \frac{pV}{nR}$$

For the state represented by point K you thus have

$$T_K = \frac{2.00 \times 10^5\ \text{Pa} \times 3.00\ \text{m}^3}{0.100\ \text{kmol} \times 8.31 \times 10^3\ \text{J/(kmol}\cdot\text{K)}} = 722\ \text{K}$$

At point L the volume is 5.00/3.00 that at K while the pressure is the same. So the
temperature $T_L$ is

$$T_L = 722\ \text{K} \times \frac{5.00}{3.00} = 1203\ \text{K}$$

At point M the pressure is 1.00/2.00 that at L, while the volume V is unchanged. So
you have for the temperature $T_M$

$$T_M = 1203\ \text{K} \times \frac{1.00}{2.00} = 602\ \text{K}$$

And at point N the pressure is 1.00/2.00 that at K, while the volume is unchanged,
so the temperature $T_N$ is

$$T_N = 722\ \text{K} \times \frac{1.00}{2.00} = 361\ \text{K}$$

There is heat flow along all parts of the cycle. As you saw in Example 19-1, heat
must flow into the system along paths NK and KL and out of it along paths LM and
MN. For each of these paths, you can calculate the heat flow by using the relation
$\Delta H = n c''\,\Delta T$, provided that you use in each case the value of the molar heat
capacity $c''$ which is appropriate to the process taking place and the proper
temperature difference $\Delta T_{if} = T_f - T_i$.

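Path NK is isometric and path KL is isobaric, so you have

$$\Delta H_{NKL} = n c_v''\,\Delta T_{NK} + n c_p''\,\Delta T_{KL} = nR\left(\tfrac{3}{2}\,\Delta T_{NK} + \tfrac{5}{2}\,\Delta T_{KL}\right)$$

Inserting the numerical values gives you

$$\Delta H_{NKL} = 0.100\ \text{kmol} \times 8.31 \times 10^3\ \text{J/(kmol}\cdot\text{K)} \times \left[\tfrac{3}{2}(722\ \text{K} - 361\ \text{K}) + \tfrac{5}{2}(1203\ \text{K} - 722\ \text{K})\right] = 1.45 \times 10^6\ \text{J}$$

Similarly, path LM is isometric and path MN is isobaric, so you have

$$\Delta H_{LMN} = n c_v''\,\Delta T_{LM} + n c_p''\,\Delta T_{MN} = nR\left(\tfrac{3}{2}\,\Delta T_{LM} + \tfrac{5}{2}\,\Delta T_{MN}\right)$$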



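Inserting the numerical values gives you

$$\Delta H_{LMN} = 0.100\ \text{kmol} \times 8.31 \times 10^3\ \text{J/(kmol}\cdot\text{K)} \times \left[\tfrac{3}{2}(602\ \text{K} - 1203\ \text{K}) + \tfrac{5}{2}(361\ \text{K} - 602\ \text{K})\right] = -1.25 \times 10^6\ \text{J}$$

Thus there is a net heat flow ΔH into the system given by

$$\Delta H = \Delta H_{NKL} + \Delta H_{LMN} = 1.45 \times 10^6\ \text{J} - 1.25 \times 10^6\ \text{J} = 2.0 \times 10^5\ \text{J}$$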

This  positive  value  indicates  that  the  net  heat  flow  is  indeed  into  the  system.  The 
value of ΔH is exactly the same as that of -ΔW, the net mechanical work output per
cycle  calculated  in  Example  19-1.  This  must  be  the  case  in  order  for  the  first  law  of 
thermodynamics  to  be  satisfied. 
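The bookkeeping of Example 19-2 can be repeated numerically as a check. The sketch below uses the example's data and the monatomic values of the molar heat capacities; the dictionary layout and the helper name `heat_into_system` are illustrative choices, not notation from the text.

```python
R = 8.31e3   # gas constant in J/(kmol*K), the value used in Example 19-2
n = 0.100    # amount of helium in kmol

# Corner states from Example 19-1: (pressure in Pa, volume in m^3)
states = {"K": (2.00e5, 3.00), "L": (2.00e5, 5.00),
          "M": (1.00e5, 5.00), "N": (1.00e5, 3.00)}

# Ideal-gas law solved for the temperature: T = pV/(nR)
T = {name: p * V / (n * R) for name, (p, V) in states.items()}

c_v = 1.5 * R   # molar heat capacity at constant volume, monatomic ideal gas
c_p = 2.5 * R   # molar heat capacity at constant pressure, monatomic ideal gas

def heat_into_system(start, end):
    """Heat flow n c ΔT along one leg, with c chosen by the kind of process."""
    (p1, V1), (p2, V2) = states[start], states[end]
    if V1 == V2:
        c = c_v          # isometric leg
    elif p1 == p2:
        c = c_p          # isobaric leg
    else:
        raise ValueError("leg is neither isometric nor isobaric")
    return n * c * (T[end] - T[start])

heat_in = heat_into_system("N", "K") + heat_into_system("K", "L")
heat_out = heat_into_system("L", "M") + heat_into_system("M", "N")
net_heat = heat_in + heat_out
print({name: round(value) for name, value in T.items()})
print(heat_in, heat_out, net_heat)
```

Because the temperatures are not rounded between steps here, the printed heats can differ from the example's three-figure values in the last digits, but the net heat still matches the net work output of Example 19-1.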


Examples 19-1 and 19-2 demonstrate the central principle used in
applying the first law of thermodynamics, ΔE = ΔH + ΔW, to the operation
of heat engines. As already noted, neither ΔH nor ΔW is a state variable
(a variable which can be used to describe the state of a system), and
therefore it is not possible to analyze the behavior of a thermodynamic
system in terms of either taken alone. But their sum ΔE is a state variable.
Thus its value depends on only the state of the system (represented by a
point on the p-V diagram), not on how it got there. Consequently, the net
change in the internal energy of the system over a closed cycle which begins
and ends at the same point is zero. And since for such a closed cycle ΔE =
0, the first law of thermodynamics becomes 0 = ΔH + ΔW, or

$$\Delta H = -\Delta W$$

In  words,  the  net  heat  input  to  the  system  over  a cycle  is  equal  to  the  net  work 
output  over  the  cycle.  This  general  statement  was  verified  for  a particular 
system  in  Examples  19-1  and  19-2. 

A heat engine is thus a converter of heat energy to mechanical energy. That is,
it converts the disordered microscopic energy of the working fluid (in
Examples 19-1 and 19-2, an ideal gas) into the ordered energy of motion of a
macroscopic object (in the same examples, the piston and whatever external
mechanism is linked to it).

An important relation between the molar heat capacity determined
isobarically and the molar heat capacity determined isometrically is their
ratio, called the specific heat ratio γ. It is defined to be

$$\gamma \equiv \frac{c_p''}{c_v''} \tag{19-14}$$

[This quantity has nothing to do with the shear strain defined in Eq.
(16-10), for which the same symbol is conventionally used.] For monatomic
ideal gases, the value of γ is (5R/2)/(3R/2), or

$$\gamma = \tfrac{5}{3} \qquad \text{for monatomic ideal gases}$$

The specific heat ratio γ will not have this value for polyatomic gases. But
the only property of ideal gases which is actually significant in evaluating
the molar heat capacity difference $c_p'' - c_v''$ is the equation of state, pV =
nRT. So if the behavior of a gas conforms to this equation, the molar heat
capacity difference given by Eq. (19-13), $c_p'' - c_v'' = R$, will be valid even if $c_v''$
is not equal to 3R/2. Table 19-1 gives values of $c_p'' - c_v''$ and γ for typical
gases. The last column of the table lists experimental values for the ratio
$(c_p'' - c_v'')/R$. A deviation of this value from 1 suggests that the behavior of
the gas does not conform precisely to the ideal-gas law.


Table 19-1  Specific Heat Ratios and Differences for Typical Gases

                         Atoms per   Approximate
Gas                      molecule    temperature (in K)   γ = c_p''/c_v''   (c_p'' - c_v'')/R
Helium (He)              1           300                  1.66              0.99
Argon (Ar)               1           300                  1.67              1.01
Sodium (Na)              1           1100                 1.68              1.03
Mercury (Hg)             1           650                  1.67              1.01
Hydrogen (H2)            2           500                  1.40              1.00
Nitric oxide (NO)        2           300                  1.40              1.00
Nitrogen (N2)            2           300                  1.40              1.00
Oxygen (O2)              2           300                  1.40              1.00
Steam (H2O)              3           800                  1.30              1.05
Carbon dioxide (CO2)     3           300                  1.30              1.04
Ammonia (NH3)            4           300                  1.31              1.06
Dry air                              273                  1.403

In Example 19-3 the value of γ is related to microscopic considerations.


EXAMPLE  19-3 

Evaluate the specific heat ratio γ of an approximately ideal gas as a function of $\mathcal{N}$,
the number of terms in the expression for the energy content of a gas molecule, by
using the theorem of equipartition of energy in the form of Eq. (18-30), $c_v' = \mathcal{N}k/2$.

■ If you multiply both sides of Eq. (18-30) by Avogadro’s number A to convert
from molecular quantities to molar quantities, and use the relations $c_v'' = A c_v'$ and
R = Ak, you obtain

$$c_v'' = \frac{\mathcal{N} R}{2}$$

And from Eq. (19-13) the molar heat capacity at constant pressure is

$$c_p'' = c_v'' + R = \frac{(\mathcal{N} + 2) R}{2}$$

Thus you have for $\gamma = c_p''/c_v''$ the value

$$\gamma = \frac{\mathcal{N} + 2}{\mathcal{N}} \tag{19-15}$$
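The result of Example 19-3 is simple to tabulate for a few values of the number of energy terms. A minimal sketch follows; the function name `specific_heat_ratio` is chosen here for illustration, and exact fractions are used so that no rounding enters until the final display.

```python
from fractions import Fraction

def specific_heat_ratio(energy_terms):
    """Return (N + 2)/N, the specific heat ratio for N energy terms per molecule."""
    if energy_terms <= 0:
        raise ValueError("the number of energy terms must be positive")
    return Fraction(energy_terms + 2, energy_terms)

for terms in (3, 5, 6, 7):
    gamma = specific_heat_ratio(terms)
    print(terms, gamma, round(float(gamma), 2))
```

The ratio decreases toward 1 as the number of terms grows, which is the trend to look for when comparing with measured values.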


The value of γ for gases tends downward as the complexity of the molecules
increases and the number $\mathcal{N}$ of energy terms participating in equipartition
tends to increase. You can see that this is so by referring to Table
19-1. The largest possible value of γ is 5/3 (corresponding to $\mathcal{N} = 3$, the
smallest possible value of $\mathcal{N}$) while the smallest possible value is greater
than 1. (In fact, the smallest observed values of γ are just under 1.1.) Here,
as in Tables 18-1 and 18-2, it appears that the molecules of diatomic gases
have five participating terms, since Table 19-1 gives γ = 1.40 =
7/5 = (5 + 2)/5 quite closely for all of them. The more complicated molecules
certainly seem to have values of $\mathcal{N}$ greater than 5, judging from the
values of γ in the table, but it is impossible to distinguish on this basis
between the possibilities $\mathcal{N} = 6$ (which would give γ = 4/3 = 1.33) and
$\mathcal{N} = 7$ (which would give γ = 9/7 = 1.29).


19-3  ISOTHERMAL AND ADIABATIC PROCESSES

So far we have considered isometric processes, in which the volume of a
system is kept constant while its state is changed, and isobaric processes, in
which the pressure is kept constant while the state is changed. A third
important type of process is the isothermal process, in which the temperature
is kept constant while the state is changed. In such a process, heat must be
able to flow freely into or out of the system while work is done. If the
gas-containing cylinder we have been using as an example is immersed in a
large water bath while the piston is moved slowly in and out, the process
will approximate an isothermal one quite well, provided the walls of the
cylinder conduct heat well. (Approximately isothermal processes are common
in real heat engines, though the details are different.)

Isothermal processes are particularly easy to describe for systems
comprising ideal gases. We have already noted that the internal energy E of a
fixed quantity of ideal gas depends on only its temperature. Thus there is
no energy change in an isothermal process involving an ideal gas. Since
ΔE = 0, the first law of thermodynamics, ΔE = ΔH + ΔW, can be written

$$\Delta H = -\Delta W \qquad \text{for } T = \text{constant, ideal gas} \tag{19-16}$$

That is, the amount of heat flowing into the system must be exactly equal to
the amount of work done by the system.

For  ideal  gases,  isothermal  processes  are  “Boyle’s-law  processes.”  Since 
T is  held  constant,  the  equation  of  state  becomes 

$$pV = \text{constant} \qquad \text{for } T = \text{constant, ideal gas} \tag{19-17}$$

which  is  Boyle’s  law.  Figure  19-9  is  a p-V  plot  of  such  a process  where  it  is 
assumed  that  the  system  (say,  the  cylinder)  contains  1 kmol  of  ideal  gas.  For 
a given value of the constant in Eq. (19-17), all points satisfying the equation
lie on a hyperbola. And since specifying the value of the constant fixes
the  temperature,  all  points  on  a given  hyperbola  correspond  to  the  same 
temperature.  The  curve  is  therefore  called  an  isotherm.  The  family  of 
isotherms  specifies  the  paths  for  all  possible  isothermal  processes  for  the 
system.  The  physical  meaning  of  the  isotherm  is  that  if  the  piston  is  moved 
quasistatically  in  and  out  with  the  cylinder  immersed  in  a water  bath,  all 
attainable  p-V  combinations  will  lie  on  the  same  isotherm. 
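A family of isotherms for an ideal gas can be generated directly from the equation of state. The sketch below is illustrative only: the function name `isotherm_pressures`, the sample temperatures, and the sample volumes are assumptions made here, with 1 kmol of gas taken as the default.

```python
R = 8.31e3  # gas constant in J/(kmol*K), the value used in this chapter's examples

def isotherm_pressures(temperature, volumes, n=1.0):
    """Pressures p = nRT/V (in Pa) along one isotherm for the given volumes (in m^3)."""
    return [n * R * temperature / v for v in volumes]

volumes = [1.0, 2.0, 4.0, 8.0]   # sample volumes in m^3
family = {T: isotherm_pressures(T, volumes) for T in (300.0, 600.0, 900.0)}

# Along each isotherm the product pV is the same at every point.
products = {T: [p * v for p, v in zip(pressures, volumes)]
            for T, pressures in family.items()}
for T, values in products.items():
    print(T, values)
```

Each temperature gives its own hyperbola, and doubling the volume along any one of them halves the pressure.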

If  the  working  fluid  in  the  cylinder  is  not  an  ideal  gas  but  is  some  other 
substance, the isotherms will not be hyperbolas but will be more complicated
curves. Nevertheless, if the equation of state is known, it is always
possible  to  draw  the  family  of  isotherms.  Even  if  the  equation  of  state  is  not 
known,  isotherms  can  be  traced  experimentally. 


The fourth important type of process is the adiabatic process. It represents the opposite extreme from the isothermal process, in which heat is allowed to flow freely in and out of the system so that the temperature can be maintained constant. (The power stroke in a gasoline engine approximates an adiabatic process.) In the adiabatic process, the system is surrounded by perfect thermal insulation, as shown in Fig. 19-10, so that no heat at all can


19-3  Isothermal  and  Adiabatic  Processes  855 


Fig. 19-9 A p-V plot of a family of isotherms for 1 kmol of an ideal gas. This family of hyperbolas satisfies a set of equations of the form $pV$ = constant, which is the form assumed by the ideal-gas law when the temperature is constant. Thus a system whose initial state is represented by a point on such an isotherm will always be represented by some point on that isotherm, as long as the temperature of the system does not change.


Fig. 19-10 Idealized representation of a system in which an adiabatic process can take place. The entire system is surrounded by perfect insulation, so that no heat can flow into or out of it. (The heat capacity of the insulation itself is assumed to be negligible.)


flow into or out of it. That is, an adiabatic process is defined to be one in which $\Delta H = 0$. [The word is based on the Greek adiabatos, meaning “impermeable (to heat).”] The first law of thermodynamics, $\Delta E = \Delta H + \Delta W$, implies that under such circumstances any work $\Delta W$ done on the system must result in a change $\Delta E$ in its internal energy. The first law thus becomes

$$\Delta E = \Delta W \qquad \text{for an adiabatic process} \tag{19-18}$$

What quantity is held constant in an adiabatic process? By definition, $T$ = constant in an isothermal process, just as $p$ = constant in an isobaric process and $V$ = constant in an isometric process. No heat flows into or out of a system in an adiabatic process. But we cannot express this fact by writing $H$ = constant, since $H$ is not a state variable. We can use calorimetric methods to measure the amount of heat $\Delta H$ flowing into or out of a system, but we cannot specify its “heat $H$” in a unique manner.

There is, however, a quantity which will serve our purpose: the entropy $S$. The entropy of a system is well and uniquely defined by Eq. (18-54),

$$S = k \ln w$$

where $w$ is the number of microstates of the system contained in the equilibrium macrostate of interest. Thus $S$ is a state variable. While it is not usually convenient to measure the entropy of a system in these microscopic terms, we have already proved that a change in entropy can be expressed macroscopically in terms of Eq. (18-61),

$$dS = \frac{dH}{T}$$


856  Thermodynamics 


Equation (18-61) is valid for any infinitesimal process. A finite quasistatic process is made up stepwise of such infinitesimal processes. The entropy change of the system, as it passes from an initial state denoted by the subscript $i$ to a final state denoted by the subscript $f$, is found by integrating both sides of the above equation to obtain

$$\Delta S = \int_i^f \frac{dH}{T}$$


But  since  no  heat  flows  into  or  out  of  a system  in  an  adiabatic  process,  the 
quantity  dH  is  always  zero,  and  the  process  can  be  specified  in  terms  of  the 
zero  change  in  the  well-defined  state  variable,  the  entropy  S.  Thus  we  have 
the  condition 


$$\Delta S = 0 \quad \text{or} \quad S = \text{constant} \qquad \text{for an adiabatic process} \tag{19-19}$$

(For  this  reason,  the  adiabatic  process  is  sometimes  called  “isentropic.”) 


What is the adiabatic law for ideal gases which corresponds to the isothermal Boyle’s law, $pV$ = constant? We can make a guess to begin with, based on a consideration of the physical situation. Let the gas in the insulated cylinder of Fig. 19-10 do mechanical work by expanding quasistatically while the piston pushes against some external resistance, as in Fig. 19-2. While the volume of the cylinder increases, the pressure will decrease, just as in the isothermal process. The reason is that the gas molecules will strike the walls less often on the average. But there is an additional effect. According to Eq. (19-18), the energy of the gas must also decrease, since work done by the system, $-\Delta W$, represents energy flowing out of it. (This does not happen in the isothermal process because the loss of energy through mechanical work is made up by a flow of heat energy into the cylinder.) As a result of the decrease in the energy of the gas, the average speed of the molecules decreases, and the pressure is decreased by this additional effect as well. Compared to the isothermal process, then, the pressure depends on the volume more strongly in the adiabatic process. Thus we must multiply the pressure by a factor which depends on the volume, but which increases more rapidly than the volume, if we wish the product of the factor with $p$ to remain constant. The simplest such factor is of the form $V^a$, where the constant $a$ is greater than 1. Thus we guess the adiabatic p-V relation to be of the form

$$pV^a = \text{constant} \qquad \text{for } S = \text{constant (adiabatic process)} \tag{19-20}$$

We now refine this argument by applying the equation of state of an ideal gas directly to the adiabatic form of the first law, $\Delta E = \Delta W$; see Eq. (19-18). If we expand the gas in a cylinder by moving the piston an infinitesimally small distance, the volume will be increased by an amount $dV$. Since the pressure remains essentially constant over this very small volume change, the work done on the gas by the outside world is given by Eq. (19-5b),

$$dW = -p\,dV$$

In an adiabatic process this produces an infinitesimal change in the internal energy of the gas, and the equation $\Delta E = \Delta W$ assumes the infinitesimal form $dE = dW$. Thus we have

$$dE = -p\,dV \tag{19-21}$$




In an ideal gas, the internal energy depends on only the temperature. Therefore the relation between $dE$ and $dT$ must be the same, regardless of the means by which the energy is changed. So even though the change $dE$ is effected by doing work on the system in a volume change instead of by adding heat to the system, so that we have $dE = dW$ instead of $dE = dH$, we can still write an equation identical to that obtained by setting $dE = dH$ in the differential form of Eq. (19-7a), $dH = nc'_v\,dT$. That is, the equation

$$dE = nc'_v\,dT \tag{19-22}$$

must hold. This is true (for ideal gases only) despite the fact that $c'_v$, the molar heat capacity at constant volume, was originally defined (and can be experimentally measured) in a constant-volume process. Combining Eqs. (19-21) and (19-22), we obtain

$$nc'_v\,dT = -p\,dV \tag{19-23}$$

In order to apply the equation of state of the ideal gas, $pV = nRT$, to this process, we must express it in differential form also. We have

$$d(pV) = nR\,dT$$

But applying the differential form of Eq. (2-15) to evaluate $d(pV)$ gives

$$d(pV) = p\,dV + V\,dp$$

So the differential form of the ideal-gas law is

$$p\,dV + V\,dp = nR\,dT \tag{19-24}$$

Since we want a relation of the form of Eq. (19-20), in which the temperature does not appear, we eliminate $dT$ between Eq. (19-24) and Eq. (19-23), to obtain

$$p\,dV + V\,dp = -nR\,p\,dV\,\frac{1}{nc'_v} \tag{19-25}$$

or

$$V\,dp = -\left(\frac{R}{c'_v} + 1\right) p\,dV$$

From Eq. (19-13), we have $R = c'_p - c'_v$. Thus the quantity in parentheses in the equation immediately above simplifies to yield

$$\frac{R}{c'_v} + 1 = \frac{c'_p - c'_v}{c'_v} + 1 = \frac{c'_p}{c'_v} = \gamma$$

where $\gamma$ is the specific heat ratio defined in Eq. (19-14). Hence Eq. (19-25) becomes

$$\frac{dp}{p} = -\gamma\,\frac{dV}{V} \tag{19-26}$$


We now integrate both sides of this differential equation between an arbitrary initial state characterized by the pressure and volume $p_i$ and $V_i$ and an arbitrary final state having pressure and volume $p_f$ and $V_f$. We have

$$\int_{p_i}^{p_f} \frac{dp}{p} = -\gamma \int_{V_i}^{V_f} \frac{dV}{V}$$

Evaluating the integrals by means of Eq. (7-21), we obtain

$$(\ln p)_{p=p_f} - (\ln p)_{p=p_i} = -\gamma\left[(\ln V)_{V=V_f} - (\ln V)_{V=V_i}\right]$$




Since the difference of two logarithms is equal to the logarithm of their ratio, the equation simplifies to

$$\ln\frac{p_f}{p_i} = -\gamma \ln\frac{V_f}{V_i}$$

We now use the rule that for any coefficient $-a$ and any argument $z$, we have $-a \ln z = \ln(z^{-a}) = \ln[(1/z)^a]$. This allows us to rewrite the right side of the equation immediately above in the form $\ln(V_i^\gamma/V_f^\gamma)$. So the equation becomes

$$\ln\frac{p_f}{p_i} = \ln\frac{V_i^\gamma}{V_f^\gamma}$$

Consequently, $p_f/p_i = V_i^\gamma/V_f^\gamma$. Rearranging terms, we obtain the result

$$p_f V_f^\gamma = p_i V_i^\gamma \qquad \text{for } S = \text{constant, ideal gas} \tag{19-27a}$$

That is, the product $pV^\gamma$ remains constant as $p$ and $V$ vary in an adiabatic process. Since the initial and final states are chosen arbitrarily, we can drop the subscripts and write

$$pV^\gamma = \text{constant} \qquad \text{for } S = \text{constant, ideal gas} \tag{19-27b}$$

That is, when an ideal gas is subjected to an adiabatic process (for which the entropy $S$ remains constant), the product of its pressure $p$ and its volume $V$ raised to the power of the specific heat ratio $\gamma$ remains constant. You should compare this with Boyle’s law, $pV$ = constant, which applies to ideal gases under the condition $T$ = constant.
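The integration leading from Eq. (19-26) to Eq. (19-27b) can also be checked numerically. The sketch below steps the differential relation dp/p = −γ dV/V forward in small volume increments and compares the product pV^γ at the end with its starting value. The function name `integrate_adiabat`, the step count, and the starting state are illustrative assumptions, and simple Euler stepping is used only because it mirrors the infinitesimal argument in the text.

```python
# Sketch: step dp = -gamma * p * dV / V forward and check that p * V**gamma
# stays (very nearly) constant, as Eq. (19-27b) states.
def integrate_adiabat(p_i, v_i, v_f, gamma, steps=100_000):
    """Return the final pressure after a quasistatic adiabatic volume change."""
    dv = (v_f - v_i) / steps
    p, v = p_i, v_i
    for _ in range(steps):
        p += -gamma * p / v * dv  # Eq. (19-26) rearranged as dp = -gamma p dV / V
        v += dv
    return p


gamma = 1.4                      # assumed value, typical of a diatomic ideal gas
p_i, v_i, v_f = 1.0e5, 1.0, 2.0  # Pa, m^3, m^3 (illustrative)
p_f = integrate_adiabat(p_i, v_i, v_f, gamma)
ratio = (p_f * v_f**gamma) / (p_i * v_i**gamma)
print(ratio)  # close to 1 if p V^gamma is conserved
```

The small departure of the ratio from 1 comes from the finite step size and shrinks as `steps` is increased.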

Example 19-4 applies Eq. (19-27a) to a specific situation.


EXAMPLE  19-4 

An insulated cylinder contains helium, for which $\gamma = \tfrac{5}{3}$, at an initial pressure of 2.00 atm. The piston is allowed to move outward quasistatically until the pressure inside the cylinder has fallen to 1.00 atm. What is the ratio of the final volume $V_f$ to the initial volume $V_i$?

■ You can write Eq. (19-27a) in the form

$$\frac{p_f}{p_i}\,V_f^{5/3} = \frac{1.00}{2.00}\,V_f^{5/3} = V_i^{5/3}$$

or

$$\left(\frac{V_f}{V_i}\right)^{5/3} = 2.00$$

Solving for the ratio of the final to the initial volume, you obtain

$$\frac{V_f}{V_i} = (2.00)^{3/5} = 1.52$$

Compare this result with that for the isothermal expansion between the same initial and final pressures, where the ratio is $V_f/V_i = 2$. Can you explain the fact that the volume ratios are not the same on the basis of the first law of thermodynamics?
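The arithmetic of Example 19-4 can be reproduced in a few lines. This is a minimal sketch; the helper names `adiabatic_volume_ratio` and `isothermal_volume_ratio` are hypothetical and simply restate Eq. (19-27a) and Boyle’s law solved for the volume ratio.

```python
# Sketch: volume ratio V_f / V_i for the same pressure drop along two paths.
def adiabatic_volume_ratio(p_i, p_f, gamma):
    """From p_f V_f**gamma = p_i V_i**gamma, Eq. (19-27a)."""
    return (p_i / p_f) ** (1.0 / gamma)


def isothermal_volume_ratio(p_i, p_f):
    """From Boyle's law, p_f V_f = p_i V_i."""
    return p_i / p_f


gamma_helium = 5.0 / 3.0
print(round(adiabatic_volume_ratio(2.00, 1.00, gamma_helium), 2))  # 1.52
print(isothermal_volume_ratio(2.00, 1.00))                         # 2.0
```

Only pressure ratios enter, so the pressures may be left in atmospheres.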


Adiabatic processes can be plotted on a p-V diagram in the same way as isothermal processes. For an ideal gas, the precise shape of the curve depends on the value of $\gamma$ for the gas involved. Figure 19-11 shows two adiabatic curves for 1 kmol of a monatomic ideal gas such as helium (for which $\gamma = \tfrac{5}{3}$) and two for a diatomic ideal gas such as oxygen (for which $\gamma = \tfrac{7}{5}$).




Fig. 19-11 Adiabatic curves shown on a p-V diagram (pressure $p$ in Pa). Isotherms are shown for comparison. All curves are for 1 kmol of gas. Two adiabatics represent the behavior of a monatomic ideal gas ($\gamma = \tfrac{5}{3}$), and two adiabatics represent the behavior of a diatomic ideal gas ($\gamma = \tfrac{7}{5}$). The isotherms represent, respectively, the behavior of any ideal gas at $T$ = 273 K and $T$ = 473 K. Any state of a particular gas is represented by a point on the p-V diagram. That point is the intersection of one adiabatic and one isotherm. The slope of the adiabatic is always steeper than that of the isotherm at the same point because the exponent $\gamma$ is always greater than 1. For a particular gas, two adiabatics and two isotherms can be used to make up a closed heat engine cycle. This is exemplified by the curve KLMNK, in which the adiabatics LM and NK are appropriate for a monatomic ideal gas.

Two isotherms are also shown for comparison. Because $\gamma$ is always greater than 1, an adiabatic curve, usually called simply an adiabatic, is always steeper than the isotherm passing through the same point.
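The steepness comparison can be made explicit with a short worked step. This is a supplementary sketch rather than part of the original derivation; it uses only Boyle’s law and Eq. (19-26), with subscripts $T$ and $S$ marking the quantity held constant.

```latex
% Slope of an isotherm: differentiate pV = constant
\left(\frac{dp}{dV}\right)_{T} = -\frac{p}{V}
% Slope of an adiabatic: Eq. (19-26), dp/p = -\gamma\, dV/V
\left(\frac{dp}{dV}\right)_{S} = -\gamma\,\frac{p}{V}
% Ratio of the two slopes at a common point (p, V)
\frac{(dp/dV)_{S}}{(dp/dV)_{T}} = \gamma > 1
```

At any shared point the adiabatic therefore falls $\gamma$ times as steeply as the isotherm.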


Since the equation of state links the quantities $p$, $V$, and $T$, the ideal-gas adiabatic law of Eqs. (19-27a) and (19-27b) can be rewritten in terms of any pair of these variables. One way to do this is to rewrite Eq. (19-27a) in the form

$$\frac{p_f V_f}{p_i V_i} = \left(\frac{V_i}{V_f}\right)^{\gamma-1}$$

Now apply the ideal-gas law in the form

$$\frac{p_f V_f}{p_i V_i} = \frac{T_f}{T_i}$$

to obtain

$$\frac{T_f}{T_i} = \frac{V_i^{\gamma-1}}{V_f^{\gamma-1}}$$

or

$$T_f V_f^{\gamma-1} = T_i V_i^{\gamma-1} \tag{19-28a}$$




This can also be written in the general form

$$TV^{\gamma-1} = \text{constant} \qquad \text{for } S = \text{constant, ideal gas} \tag{19-28b}$$

You can use a very similar algebraic manipulation to obtain the equivalent result

$$p_f^{(1/\gamma)-1}\,T_f = p_i^{(1/\gamma)-1}\,T_i \tag{19-29a}$$

which can also be written

$$p^{(1/\gamma)-1}\,T = \text{constant} \qquad \text{for } S = \text{constant, ideal gas} \tag{19-29b}$$

Example 19-5 applies Eqs. (19-28a) and (19-29a) to the system of Example 19-4.


EXAMPLE  19-5 


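In Example 19-4, the cylinder is initially at room temperature, $T_i$ = 300 K. Find the final temperature $T_f$.

■ Using Eq. (19-28a), you have

$$T_f\,(1.52\,V_i)^{\gamma-1} = 300\ \text{K} \times V_i^{\gamma-1}$$

or

$$T_f = 300\ \text{K} \times (1.52)^{1-\gamma} = 300\ \text{K} \times (1.52)^{-2/3} = 227\ \text{K}$$

Alternatively, you can begin with Eq. (19-29a) and write

$$p_f^{(1/\gamma)-1}\,T_f = p_i^{(1/\gamma)-1} \times 300\ \text{K}$$

or

$$T_f = 2^{(1/\gamma)-1} \times 300\ \text{K} = 2^{-2/5} \times 300\ \text{K} = 227\ \text{K}$$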
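Both routes of Example 19-5 can be checked side by side. In this sketch the variable names are illustrative, and the volume ratio is recomputed from the pressures rather than taken as the rounded value 1.52, so the two results agree to rounding error.

```python
# Sketch: final temperature of the helium in Example 19-5, computed two ways.
gamma = 5.0 / 3.0
t_initial = 300.0        # K
pressure_ratio = 2.00    # p_i / p_f

# Route 1: T V**(gamma - 1) = constant, Eq. (19-28a).
volume_ratio = pressure_ratio ** (1.0 / gamma)           # V_f / V_i, about 1.52
t_final_from_volume = t_initial * volume_ratio ** (1.0 - gamma)

# Route 2: p**(1/gamma - 1) T = constant, Eq. (19-29a).
t_final_from_pressure = t_initial * pressure_ratio ** (1.0 / gamma - 1.0)

print(round(t_final_from_volume), round(t_final_from_pressure))  # 227 227
```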


Adiabatic processes are quite common both in naturally occurring phenomena and in practical devices. In order to apply the rule which we have just derived for the behavior of ideal gases under adiabatic conditions, it is necessary to have accurate values of the specific heat ratio $\gamma$. These values are needed for mixtures of gases (such as air) as well as for pure gases.

From the fundamental point of view, a knowledge of the value of $\gamma$ gives insight into the molecular structure of a gas, particularly in a temperature range where that structure is changing because of dissociation or other physical or chemical processes. The study of the structure of stars, for example, depends crucially on the estimates made of the value of $\gamma$ for the gaseous material of which stars are largely composed.

A knowledge of the value of $\gamma$ is important in many practical applications as well. The mechanical engineer needs to know it in designing both turbines and reciprocating engines. The chemical engineer needs the information in the design of systems intended for gaseous reaction processes.

A particularly elegant experimental method for determining $\gamma$ is Rüchardt’s method (1929). The apparatus, shown in Fig. 19-12, consists of a glass tube having a very precise cylindrical inner bore, which is connected to a large bottle. The system is filled with the gas under study. An accurately round ball (usually a steel ball bearing) is chosen of such a size that it barely fits through the tube. The ball is dropped into the tube. As it falls, a small amount of gas leaks past the ball and serves to reduce frictional contact between the ball and the tube, in much the same way as friction is reduced in the puck-air table system. Since the leakage is relatively small, the fall of the ball into the tube compresses the gas. If the tube is long




Fig. 19-12 Apparatus for determination of $\gamma$ by using Rüchardt’s method.


enough,  the  ball  overshoots  the  equilibrium  point  at  which  its  weight  is  balanced 
by  the  excess  pressure.  As  a result,  the  ball  oscillates.  The  equilibrium  point 
moves  slowly  downward  with  time,  because  of  the  slow  leakage  of  gas  past  the 
ball.  But  with  a ball  and  tube  of  carefully  matched  size  it  is  possible  to  observe  10 
to  20  oscillations,  or  even  more. 

The oscillation is relatively rapid; the period is typically of the order of 1 s. As the pressure of the gas alternately rises and falls below atmospheric, there are small oscillations of temperature of the gas in the system above and below the ambient temperature. But they are of too short a duration for any appreciable amount of heat to flow out of or into the system. The system is thus essentially adiabatic, even though no special insulation is used.

Consider the ball as it falls down the tube through an infinitesimal displacement represented by the signed scalar $dz$, whose value is taken to be positive upward. As it does so, the ball compresses the gas. The gas therefore opposes the descent of the ball by acting on it with a force represented by the signed scalar $F$. The work done by the gas on the ball is thus given by the product $F\,dz$. As usual, however, we consider the work $dW$ done on the gas by the external world. This is given by the product of $dz$ with the reaction force $-F$ exerted on the gas by the ball. Hence we have

$$-F\,dz = dW \tag{19-30}$$

Substituting Eq. (19-5b), $dW = -p\,dV$, into this equation, we obtain

$$F = p\,\frac{dV}{dz} \tag{19-31}$$




We can easily find $dV/dz$. The cross-sectional area swept out by the ball is $\pi r^2$, where $r$ is the radius of the ball. Thus, as the ball moves through a distance $dz$, the volume of the trapped gas changes by an amount $dV = \pi r^2\,dz$, and

$$\frac{dV}{dz} = \pi r^2 \tag{19-32}$$


In Eq. (19-31), the pressure $p$ depends on the volume and thus on the position of the ball. Since the process is adiabatic, we can express the pressure as an explicit function of the volume by using Eq. (19-27a) and writing

$$p = (p_0 V_0^\gamma)\,V^{-\gamma} \tag{19-33}$$

The quantities $p_0$ and $V_0$ are the equilibrium values of the pressure and the volume, and the quantity in parentheses is thus a constant for the duration of the experiment. If we substitute Eq. (19-33) into Eq. (19-31), the net force exerted by the gas on the ball is given by

$$F = (p_0 V_0^\gamma)\,V^{-\gamma}\,\frac{dV}{dz} \tag{19-34}$$


We now show that the force $F$ is a linear function of the displacement of the ball from an equilibrium position. Thus the force will obey Hooke’s law. In order to see this, we evaluate the quantity $-dF/dz$ (which we hope will turn out to be a constant that we can identify as the force constant $k$ in the relation $dF = -k\,dz$) by taking the derivative of Eq. (19-34) with respect to $z$:

$$\frac{dF}{dz} = \frac{dF}{dV}\,\frac{dV}{dz} = \frac{d}{dV}\left[(p_0 V_0^\gamma)\,V^{-\gamma}\,\frac{dV}{dz}\right]\frac{dV}{dz}$$

But according to Eq. (19-32), $dV/dz = \pi r^2$, and this is a constant. So is $p_0 V_0^\gamma$. Thus we have

$$\frac{dF}{dz} = (p_0 V_0^\gamma)\,\frac{d}{dV}\left(V^{-\gamma}\right)(\pi r^2)^2 = -\gamma\,(\pi r^2)^2\,p_0 V_0^\gamma\,V^{-\gamma-1} \tag{19-35}$$


We now take advantage of the fact that the volume of the tube is very much smaller than that of the container to which it is attached. Thus there is little loss in precision in making the approximation that the volume $V$ at any instant is negligibly different from the equilibrium volume $V_0$. We therefore have $V_0^\gamma V^{-\gamma-1} = V_0^\gamma V_0^{-\gamma-1} = V_0^{-1}$, and Eq. (19-35) becomes

$$-\frac{dF}{dz} = \frac{\gamma \pi^2 r^4 p_0}{V_0} \tag{19-36}$$


This is the quantity linking the net force on the ball to its displacement from the equilibrium position at which it would “float” on the slightly compressed gas below it (if it were not for the slight leakage past the ball). And since the quantities on the right side of Eq. (19-36) are all positive constants, $-dF/dz$ is indeed a force constant. Hooke’s law is thus obeyed, and the motion of the ball must be harmonic, just as though it were attached to a spring. The period of oscillation is determined by the force constant $-dF/dz$ and the mass $m$ of the ball, according to the relation derived in Sec. 6-5 and given by Eq. (6-28a). Written in terms of the present notation, with $\tau$ being the period, that relation is

$$\frac{1}{\tau} = \frac{1}{2\pi}\sqrt{\frac{-dF/dz}{m}} \tag{19-37}$$

When this equation is solved for the quantity under the square root sign, it yields

$$-\frac{1}{m}\,\frac{dF}{dz} = \frac{4\pi^2}{\tau^2}$$



Substituting into this equation the expression for $-dF/dz$ given by Eq. (19-36), we obtain

$$\frac{\gamma \pi^2 r^4 p_0}{m V_0} = \frac{4\pi^2}{\tau^2}$$

Finally, this can be solved for the specific heat ratio $\gamma$:

$$\gamma = \frac{4 m V_0}{\tau^2 r^4 p_0} \tag{19-38}$$

Thus $\gamma$ is expressed entirely in terms of directly measurable quantities.
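The Hooke’s-law approximation behind Eq. (19-38) can be tested against the full adiabatic force. The sketch below integrates the ball’s motion with the pressure taken from Eq. (19-33), without linearizing, and compares the resulting period with the harmonic prediction from Eqs. (19-36) and (19-37). Everything here is an assumption made for illustration: the function names, the velocity-Verlet integrator, the 1-cm starting displacement, the value γ = 1.4, and the apparatus dimensions. Leakage past the ball and friction are ignored.

```python
import math


def predicted_period(m, r, v0, p0, gamma):
    """Harmonic period from Eqs. (19-36) and (19-37)."""
    force_constant = gamma * math.pi**2 * r**4 * p0 / v0
    return 2.0 * math.pi * math.sqrt(m / force_constant)


def simulated_period(m, r, v0, p0, gamma, z0=-0.01, dt=1e-4):
    """Period of the ball using the full adiabatic pressure, Eq. (19-33)."""
    area = math.pi * r**2

    def accel(z):
        # Net force = (p - p0) * area; the weight and atmospheric pressure
        # are already balanced at equilibrium, where p = p0.
        p = p0 * (v0 / (v0 + area * z)) ** gamma
        return (p - p0) * area / m

    z, v, t, a = z0, 0.0, 0.0, accel(z0)
    upward_crossings = []
    while len(upward_crossings) < 2:
        z_new = z + v * dt + 0.5 * a * dt * dt   # velocity-Verlet position step
        a_new = accel(z_new)
        v_new = v + 0.5 * (a + a_new) * dt       # velocity-Verlet velocity step
        if z < 0.0 <= z_new:
            # Linear interpolation for the instant the ball passes z = 0.
            upward_crossings.append(t + dt * (-z) / (z_new - z))
        z, v, a, t = z_new, v_new, a_new, t + dt
    return upward_crossings[1] - upward_crossings[0]


# Illustrative apparatus values (assumed): 8.268-g ball of radius 0.630 cm,
# 4.500e-3 m^3 of gas at 9.934e4 Pa, gamma = 1.4.
args = (8.268e-3, 0.630e-2, 4.500e-3, 9.934e4, 1.4)
tau_predicted = predicted_period(*args)
tau_simulated = simulated_period(*args)
print(tau_predicted, tau_simulated)
```

Because the swept volume is a tiny fraction of the container volume, the two periods agree closely, which supports replacing $V$ by $V_0$ in Eq. (19-35).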

The period $\tau$ is found by timing the interval required for a number of oscillations and dividing that time by the number of oscillations. The equilibrium pressure $p_0$ is slightly larger than the atmospheric pressure $p_{\text{atm}}$, because of the weight of the ball. When “floating” at equilibrium, the ball exerts an excess pressure $p' = F/\pi r^2 = mg/\pi r^2$, so that $p_0$ is given by

$$p_0 = p_{\text{atm}} + \frac{mg}{\pi r^2} \tag{19-39}$$

The equilibrium volume $V_0$ is a little harder to define unambiguously since the equilibrium position of the ball moves slowly down the tube during the course of the experiment, as gas leaks past it. However, a fair approximation can be made by observing the midpoints of the first and last oscillations in the timed interval and taking the point halfway between them as the upper boundary of the volume of enclosed gas $V_0$. Since the entire volume of the tube is small compared with that of the large container, the error thus introduced cannot be large. (In a variant of the experiment, a very slow flow of gas into the container through a side arm is adjusted to compensate for the leakage past the ball.)

The analysis of the Rüchardt method is applied to actual experimental data in Example 19-6.


EXAMPLE  19-6 

In performing Rüchardt’s experiment using air, you make the following measurements:

$p_{\text{atm}}$ = 0.9869 × 10$^5$ Pa

Volume of container = 4.448 × 10$^{-3}$ m$^3$

Correction for average volume of air trapped in tube = 5.19 × 10$^{-5}$ m$^3$

Mass $m$ of ball = 8.268 g

Acceleration of gravity $g$ = 9.803 m/s$^2$

Radius $r$ of ball = 0.630 cm

Period $\tau$ = 0.810 s

Find the value of $\gamma$ for air.

■ First you find $V_0$ by adding the tube volume correction to the volume of the container. You have

$$V_0 = 4.448 \times 10^{-3}\ \text{m}^3 + 0.052 \times 10^{-3}\ \text{m}^3 = 4.500 \times 10^{-3}\ \text{m}^3$$

You then use Eq. (19-39) to find $p_0$:

$$p_0 = 9.869 \times 10^{4}\ \text{Pa} + \frac{8.268 \times 10^{-3}\ \text{kg} \times 9.803\ \text{m/s}^2}{\pi \times (0.630 \times 10^{-2}\ \text{m})^2} = 9.934 \times 10^{4}\ \text{Pa}$$

You now evaluate $\gamma$, using Eq. (19-38):

$$\gamma = \frac{4 \times 8.268 \times 10^{-3}\ \text{kg} \times 4.500 \times 10^{-3}\ \text{m}^3}{(0.810\ \text{s})^2 \times (0.630 \times 10^{-2}\ \text{m})^4 \times 9.934 \times 10^{4}\ \text{Pa}} = 1.45$$
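The data reduction of Example 19-6 is short enough to script, which makes it easy to see how sensitive the result is to each measurement. The sketch below uses the measured values listed in the example; the variable names are illustrative.

```python
import math

# Measurements from Example 19-6, converted to SI units.
p_atm = 0.9869e5            # Pa
container_volume = 4.448e-3  # m^3
tube_correction = 5.19e-5    # m^3
ball_mass = 8.268e-3         # kg
g = 9.803                    # m/s^2
ball_radius = 0.630e-2       # m
period = 0.810               # s

v0 = container_volume + tube_correction                    # equilibrium volume
p0 = p_atm + ball_mass * g / (math.pi * ball_radius**2)    # Eq. (19-39)
gamma_air = 4.0 * ball_mass * v0 / (period**2 * ball_radius**4 * p0)  # Eq. (19-38)

print(v0, p0, round(gamma_air, 2))
```

Since the radius enters to the fourth power and the period to the second, small errors in those two measurements dominate the uncertainty in the computed ratio.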




Since air consists mainly of the diatomic gases nitrogen and oxygen, this experimental result may be reasonably compared to the theoretical value, which is exactly 1.4 for an ideal diatomic gas. The presence of about 1 percent of the monatomic gas argon tends to increase the value of $\gamma$ above 1.4, while the presence of lesser amounts of the triatomic gases water vapor and CO$_2$ tends to decrease it.


19-4 ENTROPY, TEMPERATURE, AND THERMODYNAMIC EFFICIENCY


Fig. 19-13 A hypothetical heat engine cycle displayed on a p-V diagram.


In Sec. 19-2 we saw that the mechanical work put out by a heat engine in one cycle of operation is equal to the net heat input. This is a consequence of the first law of thermodynamics, which can be viewed as a special form of the principle of energy conservation. Examples 19-1 and 19-2 demonstrated this point numerically for a specific example.

In a p-V diagram, the area enclosed by the engine cycle is equal to the mechanical work output. In Fig. 19-5, which is reproduced here as Fig. 19-13, that work output of a heat engine is shown as the difference between the work done by the working fluid on the outside world as it expands and the work done by the outside world on the working fluid as the latter is compressed back to its initial state, ready to begin another cycle. (In actual engines, the energy required to restore the working fluid to its initial state is often stored temporarily in a flywheel.) It is by manipulating the pressure, volume, and temperature of the working fluid that we make path 1 and path 2 as different as possible, and so maximize the net work output. Indeed, this is a major goal of engine design.

As you have already seen, heat input and output, and thus net heat input, are not displayed directly on a p-V diagram, but must be inferred by the first law of thermodynamics. Explicit calculation of the sort carried out in Example 19-2 depends on a knowledge of the equation of state of the working fluid. But it is possible to display the engine cycle graphically in a different way, which shows the heat input and output directly, regardless of the working fluid. To see how this is done, we reconsider the first law of thermodynamics from a somewhat more rigorous point of view than that taken in Secs. 19-1 and 19-2.

Suppose that the internal energy of a thermodynamic system changes by an infinitesimal amount $dE$. As we have already seen, this change can be effected if there is a flow of heat into the system from (or out of it to) the outside world, or if the system has work done on it by (or does work on) the outside world, or both. Calling the infinitesimal quantity of heat flowing into the system $dH$ and the infinitesimal amount of work done on it by the outside world $dW$, we write the first law of thermodynamics in the form

$$dE = dH + dW \tag{19-40}$$


In Sec. 19-1, we began with the fact that $dW$ could be written as $-p\,dV$. Since the path in the p-V plane gives $p$ as a function of $V$, we could integrate the function represented by that path to find the work done on the system in one cycle. This work is given by Eq. (19-6b):

$$\Delta W = -\oint_{\text{closed curve}} p\,dV$$


Let us do the same thing with the term $dH$ in Eq. (19-40). The aim is to depict graphically $\Delta H$, the net heat flowing into the engine in one cycle. Once that is done, we can specify a path along which to integrate. That is, we can specify a function whose integral yields $\Delta H$. We use Eq. (18-61),



Fig. 19-14 A hypothetical heat engine cycle displayed on a T-S diagram. The net heat input per cycle to the engine, $\Delta H$, is represented by the area enclosed by the closed curve comprising path 1 from A to B and path 2 from B to A. Compare with the p-V diagram of Fig. 19-13, in which the area enclosed by the curve represents the net work output per cycle from the engine, $-\Delta W$.


which relates $dH$ to the state variables temperature $T$ and entropy $S$:

$$dS = \frac{dH}{T} \tag{19-41}$$

The equation can be rewritten

$$dH = T\,dS \tag{19-42}$$

By using this relation and the relation $dW = -p\,dV$, the first law of thermodynamics, $dE = dH + dW$, assumes the form

$$dE = T\,dS - p\,dV \tag{19-43}$$


What has thus been achieved is to write the change $dE$ in the internal energy of the system entirely in terms of state variables. This was not done in Sec. 19-2, and we had to calculate the heat flow $\Delta H$ indirectly. But now we can depict the heat engine cycle in a T-S diagram analogous to the p-V diagram already used.

Figure 19-14 shows a T-S diagram, on which a hypothetical heat engine cycle is shown. The system passes from state A, whose entropy is $S_A$, to state B, whose entropy is $S_B$, along path 1. It then returns to state A along path 2. The heat flow $\Delta H_1$ into the system as it passes along path 1 is found by integrating Eq. (19-42) to obtain

$$\Delta H_1 = \int_{A\,(\text{path 1})}^{B} dH = \int_{S_A\,(\text{path 1})}^{S_B} T\,dS \tag{19-44a}$$

The magnitude of $\Delta H_1$ is given by the diagonally hatched area in Fig. 19-14. The heat flow $\Delta H_2$ into the system as it passes along path 2 is found similarly and is

$$\Delta H_2 = \int_{B\,(\text{path 2})}^{A} dH = \int_{S_B\,(\text{path 2})}^{S_A} T\,dS \tag{19-44b}$$

The magnitude of $\Delta H_2$ is given by the horizontally hatched area in Fig. 19-14. However, $\Delta H_2$ has a negative value, since $T$ is always positive and the integration proceeds from a greater to a smaller value of $S$, so that the value of $dS$ is always negative. (Physically speaking, the positive value of the quantity $-\Delta H_2$ represents a heat flow out of the system.) The net heat flow $\Delta H$ into the system over the entire cycle is therefore represented by the difference between the two areas, which is the area enclosed by the curve comprising paths 1 and 2. Expressed mathematically, it is

$$\Delta H = \oint_{\text{closed curve}} T\,dS \tag{19-45}$$

The system states A and B each have a well-defined entropy. According to Eq. (18-54),

$$S = k \ln[w(E)] \tag{19-46}$$

the entropy depends on $w$, the number of microstates belonging to the (macro)state. And $w$ depends on the internal energy $E$. Thus states A and B each have a well-defined internal energy $E$, whatever the working fluid may be.

In  the  special  case  of  an  ideal  gas — and  only  that  special  case — we  need  not 
appeal  to  an  argument  involving  the  entropy  to  establish  the  fact  that  every  point 
on  a T-S  diagram  specifies  a state  with  a well-defined  internal  energy.  For  an  ideal 
gas,  this  fact  follows  immediately  from  the  dependence  of  E on  T only.  And  T is 
specified  for  every  point  on  the  T-S  plane. 


866  Thermodynamics 


We have thus developed parallel ways of picturing the terms
$\Delta W = \int_{V_i}^{V_f} -p\,dV$ and $\Delta H = \int_{S_i}^{S_f} T\,dS$ in the
first law. Since $\Delta E = 0$ for a complete heat engine cycle, we can write the
first law, $\Delta E = \Delta H + \Delta W$, in the form

$$\Delta H = -\Delta W \tag{19-47}$$

This equality explains why the p-V and T-S curves for a given heat engine
are not independent of each other. It might therefore be argued that Eq.
(19-45) gives us nothing new, since the value of $\Delta H$ obtained in this way is
numerically equal to the value of $-\Delta W$ obtained from the calculation based
on Fig. 19-13. But there is a difference which is very important from both
the practical and theoretical points of view. We have already noted that a
part of the work done by the system in going from A to B is stored
externally (for example, in a flywheel or in the cam-weight mechanism of Fig.
19-2) and is then put back into the system to restore the system to its
original state at point A.

But the same restorability does not apply to the heat in Fig. 19-14. A
quantity of heat $\Delta H_1$ flows into the engine from a heat source. (For a large
steam turbine, the heat source is a boiler; in the familiar automobile engine
it is the burning of the fuel in the cylinder itself.) A quantity of heat
$-\Delta H_2$, usually called rejected heat, flows from the engine into a heat
sink. (For the steam turbine this is the so-called condenser; for the
automobile engine it is the surrounding air into which the exhaust gas passes.) The
magnitude $|\Delta H_1|$ of the heat flowing into the engine from the heat source is
greater than the magnitude $|-\Delta H_2|$ of the heat flowing out of the engine.
(This is evidenced in Fig. 19-14 by the fact that the area under path 1 is
greater than that under path 2.) Since this is the case, over an entire cycle
the total heat $\Delta H$ flowing into the engine is positive; that is,

$$\Delta H = \Delta H_1 + \Delta H_2 > 0$$

Unlike the part of the mechanical energy put out by the engine which
is stored temporarily in the flywheel and then put back into the engine, the
rejected heat $-\Delta H_2$ must be lost forever to the engine. To see why this is so,
we express the quantities $\Delta H_1$ and $\Delta H_2$ in terms of the temperature T and
entropy S. From Eq. (19-44a), we have

$$\Delta H_1 = \int_{A}^{B} dH = \int_{S_A}^{S_B} T\,dS \quad \text{(both integrals along path 1)}$$

The value of this integral depends on the specific path, that is, on the
specific functional dependence of T on S. But for any particular case the
integral has a specific value, and thus $\Delta H_1$ has a specific value. That value is
equal to the product of some average temperature $\langle T \rangle_1$ with the entropy
change $S_B - S_A$. That is, there is always some constant temperature $\langle T \rangle_1$
which satisfies the equation

$$\underbrace{\int_{S_A}^{S_B} T(S)\,dS}_{\text{path 1}} = \int_{S_A}^{S_B} \langle T \rangle_1\,dS = \langle T \rangle_1 \int_{S_A}^{S_B} dS = \langle T \rangle_1 (S_B - S_A)$$

(This is a special case of what mathematicians call the mean value theorem.
Note  that  the  path  need  be  specified  only  in  the  first  integral  in  the  series  of 


19-4  Entropy,  Temperature,  and  Thermodynamic  Efficiency  867 


equations  displayed  immediately  above.  Can  you  explain  why?)  Applying 
this result to the evaluation of $\Delta H_1$, we obtain

$$\Delta H_1 = \langle T \rangle_1 (S_B - S_A)$$


We follow a parallel argument to evaluate $\Delta H_2$. Beginning with Eq.
(19-44b), we have

$$\Delta H_2 = \int_{B}^{A} dH = \int_{S_B}^{S_A} T\,dS \quad \text{(both integrals along path 2)}$$


Here, as before, there is always some average temperature $\langle T \rangle_2$ satisfying
the equation

$$\underbrace{\int_{S_B}^{S_A} T\,dS}_{\text{path 2}} = \langle T \rangle_2 (S_A - S_B)$$

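So we have

$$\Delta H_2 = \langle T \rangle_2 (S_A - S_B) = -\langle T \rangle_2 (S_B - S_A)$$

The inequality $\Delta H_1 + \Delta H_2 > 0$ can now be written

$$\langle T \rangle_1 (S_B - S_A) - \langle T \rangle_2 (S_B - S_A) > 0$$

or, dividing by the positive quantity $S_B - S_A$,

$$\langle T \rangle_1 > \langle T \rangle_2$$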

That is, the average temperature at which heat flows into the engine as it performs
external work $-\Delta W$ in each cycle must be greater than the average temperature at
which heat flows out of the engine.
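The mean-temperature argument can be checked numerically. The following is a minimal sketch only: the two functions T(S) chosen for paths 1 and 2, the entropy values, and all function names are hypothetical and are not taken from Fig. 19-14. It integrates T dS along each path, forms the average temperatures, and compares them.

```python
def heat_along_path(T_of_S, S_start, S_end, n=10_000):
    """Approximate the integral of T dS from S_start to S_end (midpoint rule)."""
    dS = (S_end - S_start) / n
    return sum(T_of_S(S_start + (i + 0.5) * dS) for i in range(n)) * dS


def mean_temperature(T_of_S, S_start, S_end, n=10_000):
    """Average temperature <T> defined by: integral of T dS = <T> (S_end - S_start)."""
    return heat_along_path(T_of_S, S_start, S_end, n) / (S_end - S_start)


# Hypothetical entropies of states A and B, in J/K (assumed values).
S_A, S_B = 1.0, 3.0


def T_path1(S):
    """Assumed temperature (K) along path 1, the upper curve from A to B."""
    return 500.0 + 50.0 * (S - S_A) * (S_B - S)


def T_path2(S):
    """Assumed temperature (K) along path 2, the lower curve from B to A."""
    return 300.0 - 20.0 * (S - S_A) * (S_B - S)


dH1 = heat_along_path(T_path1, S_A, S_B)   # heat flow in along path 1 (positive)
dH2 = heat_along_path(T_path2, S_B, S_A)   # heat flow in along path 2 (negative)
T1 = mean_temperature(T_path1, S_A, S_B)   # <T>_1
T2 = mean_temperature(T_path2, S_B, S_A)   # <T>_2

print(f"dH1 = {dH1:.1f} J, dH2 = {dH2:.1f} J, net = {dH1 + dH2:.1f} J")
print(f"<T>_1 = {T1:.1f} K, <T>_2 = {T2:.1f} K")
```

For these assumed paths the net heat input is positive and the average temperature on path 1 exceeds that on path 2, as the inequality requires.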

This inequality explains why the rejected heat cannot be returned to
the working fluid for “reuse.” It cannot be because it is heat energy at a
reduced temperature, and it cannot flow back into the working fluid when
the engine has returned to state A, ready to accept more heat energy for
conversion into mechanical work. To put it another way, the working fluid,
left to itself, will not warm itself from the temperatures which characterize
it along path 2 to those of path 1 in Fig. 19-14. Note again that $|\Delta S|$ is the
same for all paths between A and B. Consequently, there can be no net heat
flow into the engine over a complete cycle (and hence no net work output)
unless the source temperature $\langle T \rangle_1$ is greater than the sink temperature
$\langle T \rangle_2$. So the working fluid temperatures along path 2 must be lower than
those along path 1 if the engine is to operate at all. We will soon see that this
involves a net increase in the entropy of the universe over one engine cycle,
even though the entropy of the engine itself is exactly the same every time
it is in, say, state A.

What leads to this rather abstract statement is a very practical question:
How can we characterize the efficiency of a heat engine? In its most general
sense, efficiency is a concept which applies more to economics than to
physical science. In many applications having nothing to do with physics, we are
concerned with getting the greatest possible yield for a given effort, or
expense, or both. In this general sense, we define efficiency $\eta$ (lowercase Greek
eta) to be

$$\eta = \frac{\text{what you get}}{\text{what you pay for}} \tag{19-48}$$

In the case of a heat engine, the efficiency so defined depends on not only
the intrinsic characteristics of the engine but also such things as the cost of
fuel, the capital costs of the installation (or the interest costs on the money
borrowed to pay for it), the labor costs of operation and maintenance, and
so forth. On the output side, this general efficiency depends on the value of
the work put out by the engine.

While all these things are of the greatest importance to the engineer
and the businessperson, the physicist confines the definition of efficiency to
the heat energy input $\Delta H_1$ to the engine (“what you pay for”) and the useful
mechanical energy output $-\Delta W$ (“what you get”). In these narrower but
specific terms, we define the thermodynamic efficiency $\eta$ to be

$$\eta = \frac{-\Delta W}{\Delta H_1} \tag{19-49}$$


Since the numerator and denominator of this fraction have the same units
(energy), the efficiency is a dimensionless number. It is very often
expressed as a percentage.


In actual engines, the useful work output is always diminished by such effects
as friction in the moving parts or the need to drive auxiliary machinery associated
with the engine, such as fuel feed and cooling water pumps or air draft blowers.
These can be minimized (though never completely eliminated) by careful
engineering design and practice. But while these effects are important, the engineer
and the physicist find it useful to distinguish between the overall efficiency,
which takes them into consideration, and the thermodynamic efficiency, which
ignores them. Naturally, the thermodynamic efficiency is always greater than the
overall efficiency. It is the thermodynamic efficiency that we consider here, and
specifically in Example 19-7.


EXAMPLE  19-7 

Find  the  thermodynamic  efficiency  of  the  engine  studied  in  Examples  19-1  and 
19-2. 

■ You saw in Example 19-1 that the output work per cycle was $-\Delta W = 2.0 \times 10^5$ J.
And in Example 19-2, the input heat per cycle was found to be
$\Delta H_{NKL} = 1.45 \times 10^6$ J. You use these values in Eq. (19-49) to find

$$\eta = \frac{2.0 \times 10^5\ \text{J}}{1.45 \times 10^6\ \text{J}} = 0.14$$

That is, only 14 percent of the input heat energy is converted by the engine into
mechanical energy. The other 86 percent of the heat energy is discarded to the outside
world as rejected heat.
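The arithmetic of Example 19-7 can be reproduced with a few lines of Python. This is only an illustrative sketch; the function name `thermodynamic_efficiency` is a hypothetical helper, and the two input values are the ones quoted from Examples 19-1 and 19-2.

```python
def thermodynamic_efficiency(work_out, heat_in):
    """Eq. (19-49): efficiency = (work output per cycle) / (heat input per cycle).

    work_out is the positive quantity -ΔW and heat_in is ΔH_1, both in joules.
    """
    if heat_in <= 0:
        raise ValueError("heat input must be positive")
    return work_out / heat_in


work_out = 2.0e5    # J, output work per cycle from Example 19-1
heat_in = 1.45e6    # J, input heat per cycle from Example 19-2

eta = thermodynamic_efficiency(work_out, heat_in)
rejected_heat = heat_in - work_out   # first law over one cycle

print(f"efficiency = {eta:.2f}")
print(f"rejected heat per cycle = {rejected_heat:.3g} J")
```

The rejected heat follows from the first law applied to a complete cycle: whatever input heat is not converted to work must leave the engine as heat.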


In the most general sense, a heat engine is a device which converts heat
energy into mechanical energy. (Sometimes the heat energy is converted
into some other kind of macroscopic energy. For example, in the so-called
magnetohydrodynamic generator the output is in the form of electric
energy.) To put it in microscopic language, a heat engine converts some of
the random energy of the molecules of the working fluid into
macroscopic organized energy. In most familiar engines, the output energy is in
the form of organized kinetic energy of moving parts. Since the engine is a
converter of energy from one form to another, and not a creator of energy,
the efficiency of a heat engine can under no circumstances be greater than
100 percent. A hypothetical engine which violated this rule would violate
the first law of thermodynamics in the form of Eq. (19-47). It is therefore
called a perpetual-motion machine of the first kind. If such an engine
could exist, it could not only run itself forever without energy input, but
also provide a steady stream of mechanical energy to the outside world. It
may be a sad fact that no such machine is possible, but at least we can save a
lot of time trying to invent one by knowing in advance that the task is
impossible. (A machine that runs forever without energy input does not
violate physical law, provided that no energy is extracted from it as it runs. The
electric current that can be made to flow in a ring of superconducting
material is in “perpetual motion” in this limited sense. Rotating mechanical
systems, operating in high vacuum and suspended magnetically, come
remarkably close to this situation as well. On a much larger scale, the
motion of the solar system is a striking example. But in all these cases the term
“efficiency” becomes meaningless, since no energy goes in and none comes
out.)


19-5 THE CARNOT ENGINE AND THE SECOND LAW OF THERMODYNAMICS


Historically, the question of efficiency of heat engines did not arise until
the steam engine had been in widespread use for some time. The earliest
engines, particularly those antedating Watt’s engines of the late
eighteenth century, were so very inefficient ($\eta$ = 1 percent or less) that it was
not clear that any direct connection at all existed between heat input and
mechanical energy output. With a rise in coal costs and particularly with
the ever-widening use of steam engines at locations far from coal mines,
efficiency became a matter of greater concern. A different but related
pressure toward consideration of efficiency came from the development of
steamboats (and later railroad locomotives) which had to carry their fuel,
and which consequently could not be practical until they could go a
reasonable distance on a full load of coal or wood.

But  in  1824,  even  before  the  equivalence  between  heat  and  energy  had 
been  clearly  established,  the  young  French  engineer  Nicolas  L.  Sadi  Carnot 
(1796-1832)  published  his  small  book  Reflections  on  the  Motive  Power  of  Fire 
and  on  the  Means  Suitable  for  Developing  It.  Although  Carnot  was  originally  an 
adherent  of  the  caloric  theory  of  heat,  he  had  doubts  which  later  led  him  to 
espouse the kinetic theory. Perhaps on account of his uncertainty concerning
the basic nature of heat, he couched his theory of heat engines in
terms  so  general  that  the  conclusions  did  not  depend  on  what  heat  was.  He 
followed  the  false  but  nevertheless  fruitful  analogy  between  the  way  that 
falling  water  propels  water  wheels  and  the  way  that  the  “fall  of  caloric” 
from  high-temperature  source  to  low-temperature  sink  propels  engines. 
On  this  basis,  he  came  to  essentially  correct  conclusions  which  we  will 
develop  immediately  below,  using  a more  modern  line  of  argument. 
Carnot  reached  these  correct  results  in  spite  of  the  fact  that  at  the  time  of 
publication  of  his  book  he  probably  did  not  realize  that  some  of  the  “falling 
heat”  is  converted  into  mechanical  energy.  Long  after  his  untimely  death  of 
cholera  in  1832,  a study  of  his  private  papers  revealed  that  he  had  later 
come  to  understand  this  point  clearly  in  the  light  of  the  equivalence  of  heat 
and  energy. 


Much confusion existed for many years as to the details (and even the title) of
Carnot’s work. It was largely ignored at the time of its publication. In 1834,
however, it evoked considerable attention when Clapeyron restated Carnot’s
arguments in vivid geometric form. Clapeyron’s memoir, whose title is similar to
Carnot’s, was for many years the main source for Carnot’s ideas because Carnot’s
book was very rare. The reprinting of the 1824 work, the painstaking historical
studies of Lord Kelvin, and the publication in 1878 of Carnot’s notes clarified the
matter of historical priorities. To this day, however, Carnot’s book is often
wrongly referred to by the title “On the Motive Power of Heat,” which is really
closer to the title of Clapeyron’s work.

Fig. 19-15 The Carnot cycle shown on a T-S diagram. Paths KL and MN are
isotherms, while paths LM and NK are adiabatics.

Carnot imagined an ideal engine whose cycle is particularly easy to
depict on a T-S diagram. The Carnot cycle is shown in Fig. 19-15. All heat
flows into the engine from the heat source at a single temperature; the
process is denoted by the isotherm KL. And all heat flows out of the engine
to the heat sink at a single temperature; the isotherm MN denotes this
process. The other two parts of the cycle are the adiabatics LM and NK, for
which S remains constant. In the special case where the working fluid is an
ideal gas, this cycle is identical to the one shown on the p-V diagram of Fig.
19-16, which is a retracing of the closed curve KLMNK from Fig. 19-11.

How might such a Carnot engine be operated? The idealized process is
shown in Fig. 19-17. Starting at point K in Fig. 19-15 (or in Fig. 19-16, if the
working fluid is an ideal gas), the engine is maintained at a temperature $T_{\mathrm{hi}}$
by keeping it in ideal thermal contact with a large heat reservoir (say a
steam bath) at that temperature. It is allowed to expand quasistatically
along path KL. In doing so, it does work on the outside world. And as we
have seen, heat flows into the engine during this process. At point L the
engine is removed from the heat reservoir and completely insulated
thermally from the outside world. It is allowed to expand quasistatically along
path LM and does more work on the outside world. When this adiabatic
expansion has cooled the working fluid of the engine to the temperature $T_{\mathrm{lo}}$,
at point M, the engine is placed in contact with a second large heat
reservoir at temperature $T_{\mathrm{lo}}$. This reservoir maintains the temperature
constant while external work is expended to compress the working fluid to
point N. Since the compression tends to increase the temperature of the
working fluid, heat must flow out of the engine during this process. At
point N, the engine is again isolated thermally, and more external work is
expended to compress the working fluid to its original state at point K.
During this process the temperature rises to $T_{\mathrm{hi}}$.

Fig. 19-16 Carnot cycle for an engine whose working fluid is
an ideal gas, represented on a p-V diagram. The cycle shown is
identical to the closed path KLMNK of Fig. 19-11. Unlike the
perfectly general Carnot cycle shown in the T-S diagram of
Fig. 19-15, this cycle applies to ideal gases only. On a p-V
diagram, the Carnot cycles for other kinds of working fluids have
different shapes.

Fig. 19-17 Visualization of the operation of a Carnot engine. The cycle is explained in the
text. The first illustration represents the process KL of Fig. 19-15 (isothermal expansion). The
second illustration represents the process LM (adiabatic expansion), the third the process MN
(isothermal compression), and the fourth the process NK (adiabatic compression).


We now calculate the efficiency of a Carnot engine. For a working
fluid of ideal gas, either Fig. 19-15 or Fig. 19-16 can be used to calculate the
heat flow of magnitude $|\Delta H_1|$ into the engine from the heat source, the heat
flow of magnitude $|\Delta H_2|$ out of the engine to the heat sink, and the net
work done during one cycle. You have seen in Examples 19-1, 19-2, and
19-7 that the first law of thermodynamics guarantees the complete
equivalence of the calculations. However, the calculation is most easily performed
by using the T-S diagram, since the cycle has rectangular form on this plot.
Much more importantly, the rectangular shape of the Carnot cycle on the
T-S diagram is independent of the working fluid, and the result is therefore
perfectly general. In the p-V diagram, to the contrary, the shape of the
Carnot cycle depends on the equation of state of the working fluid.

In Fig. 19-15, heat flows into the engine only during the isothermal
process KL, and the heat $\Delta H_1$ flowing into the engine from the heat source
is given by the area of the diagonally hatched rectangle. That is, Eq.
(19-44a) leads immediately to the result

$$\Delta H_1 = \int_{K}^{L} dH = \int_{S_K}^{S_L} T_{\mathrm{hi}}\,dS = T_{\mathrm{hi}} \int_{S_K}^{S_L} dS = T_{\mathrm{hi}} (S_L - S_K) \tag{19-50}$$

where the integrals are taken along the isothermal path $T = T_{\mathrm{hi}}$.

According to Eq. (19-47), $\Delta H = -\Delta W$, the work output $-\Delta W$ of the
engine during one cycle must be equal to the net heat input $\Delta H$ over the
cycle. The latter is given by Eq. (19-45),

$$\Delta H = \oint_{\text{closed curve}} T\,dS$$


Fig. 19-18 Plot of the thermodynamic efficiency $\eta^*$ of a Carnot engine as a
function of the ratio $T_{\mathrm{lo}}/T_{\mathrm{hi}}$ of the absolute temperature of the heat sink to that
of the heat source. (Vertical axis: Carnot efficiency $\eta^*$. Horizontal axis:
$T_{\mathrm{lo}}/T_{\mathrm{hi}}$, from 0 to 1.0.)


But we have already noted that the value of this integral is represented by
the area enclosed by the cycle on the T-S diagram. And for the Carnot
cycle, this is the area of the rectangle KLMN in Fig. 19-15. So we have

$$-\Delta W = \Delta H = (T_{\mathrm{hi}} - T_{\mathrm{lo}})(S_L - S_K) \tag{19-51}$$

We now apply the definition of thermodynamic efficiency, Eq. (19-49),
to the Carnot engine. Employing the symbol $\eta^*$ for the Carnot
efficiency (that is, the thermodynamic efficiency of a Carnot engine), we
have

$$\eta^* = \frac{-\Delta W}{\Delta H_1} = \frac{(T_{\mathrm{hi}} - T_{\mathrm{lo}})(S_L - S_K)}{T_{\mathrm{hi}} (S_L - S_K)}$$

or

$$\eta^* = \frac{T_{\mathrm{hi}} - T_{\mathrm{lo}}}{T_{\mathrm{hi}}} \tag{19-52}$$

This important result shows that the thermodynamic efficiency of a Carnot
engine depends on only the input (source) and output (sink) temperatures
(expressed as absolute temperatures). That is why Carnot’s uncertainties as
to the nature of heat were immaterial, as long as he understood the
meaning of temperature. Figure 19-18 is a plot of $\eta^*$ versus $T_{\mathrm{lo}}/T_{\mathrm{hi}}$.
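As an illustration of Eq. (19-52), the sketch below evaluates the Carnot efficiency for a pair of reservoir temperatures and checks it against the rectangle areas of the T-S diagram, Eqs. (19-50) and (19-51). The function name and all numerical values are assumptions chosen for the example, not data from the text.

```python
def carnot_efficiency(T_hi, T_lo):
    """Eq. (19-52): Carnot efficiency from absolute reservoir temperatures (kelvin)."""
    if not (0 < T_lo <= T_hi):
        raise ValueError("require 0 < T_lo <= T_hi, both in kelvin")
    return (T_hi - T_lo) / T_hi


# Assumed reservoir temperatures (K) and entropies at points K and L (J/K).
T_hi, T_lo = 600.0, 300.0
S_K, S_L = 1.0, 3.0

heat_in = T_hi * (S_L - S_K)              # Eq. (19-50): area under isotherm KL
work_out = (T_hi - T_lo) * (S_L - S_K)    # Eq. (19-51): area of rectangle KLMN

eta_star = carnot_efficiency(T_hi, T_lo)
print(f"Carnot efficiency = {eta_star:.2f}")
print(f"area ratio        = {work_out / heat_in:.2f}")

# A few points of the straight line plotted in Fig. 19-18: efficiency vs T_lo/T_hi.
for ratio in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
    print(f"T_lo/T_hi = {ratio:.1f}  ->  efficiency = {1.0 - ratio:.1f}")
```

Because the factor $S_L - S_K$ cancels, the result does not depend on the entropy values chosen, only on the two temperatures.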

The importance of the Carnot engine arises mainly from its
application to Carnot’s theorem: No heat engine operating between two heat reservoirs
having specified temperatures can be more efficient than a Carnot engine operating
between the same reservoirs. That is, Eq. (19-52) sets an absolute upper limit on
the efficiency of heat engines, and this limit is determined by the reservoir
temperatures only.

The theorem is proved by assuming that an engine more efficient than
a Carnot engine does exist. This assumption leads to a violation of the
second law of thermodynamics, Eq. (18-55), which states that all thermal
processes lead to either an increase or no change in the entropy of the
universe.

In preparing to prove Carnot’s theorem, the first thing to note is that a
Carnot engine is reversible in the sense defined in Sec. 19-1. That is, it can be
stopped at any point along its cycle (at any state through which it
passes) and restored to a neighboring state through which it has just
passed by simply reversing the process. As a consequence, the heat engine
can be run entirely in reverse. It is then called a heat pump. When it is run
in the direction KNMLK in either Fig. 19-15 or, if the working fluid is an
ideal gas, Fig. 19-16, it will remove heat from the low-temperature
reservoir (which thus becomes the heat source). And it will deposit heat in the
high-temperature reservoir (which thus becomes the heat sink). In order to
satisfy the first law, the difference in energy must be made up by doing net
work on the system. The magnitude of that work input is exactly equal to
the work output of the engine when it is running in the forward direction.
This is because both of these quantities are given by the area enclosed by
the curve KLMNK. (Note that less work is done by the heat pump in
expanding from N to M than must be done on it to compress it from L to K.)

Fig. 19-19 Schematic diagram of two idealized heat
engines. Each engine takes heat energy from a large
heat reservoir whose temperature is $T_{\mathrm{hi}}$. It rejects a
smaller amount of heat energy to a second large heat
reservoir whose temperature has the smaller value $T_{\mathrm{lo}}$.
Each engine puts out an amount of energy $-\Delta W$ per
cycle in the form of mechanical work. To satisfy the
first law of thermodynamics, the sum of the
magnitudes of the heat energy and the mechanical energy
flowing out of each engine through the two output
channels shown must be equal to the magnitude of the
heat energy flowing into it through the input channel.
This is represented graphically by the widths of the
channels, which are proportional to the amount of
energy flowing through them. Both engines are designed
to put out the same amount of work $-\Delta W$ per cycle.
Since the super engine on the right is more efficient
than the Carnot engine on the left, it takes in less heat
energy from the hotter reservoir and rejects less heat
energy to the cooler reservoir than does the Carnot
engine.

The proof of the theorem is as follows. Suppose that a Carnot engine
C, operating between heat reservoirs at temperatures $T_{\mathrm{hi}}$ and $T_{\mathrm{lo}}$, puts out
an amount of work $-\Delta W$ per cycle. In doing so, it must take in an amount
of heat $\Delta H_{C\,\mathrm{hi}}$ from the hotter reservoir and reject an amount of heat
$-\Delta H_{C\,\mathrm{lo}}$ to the cooler reservoir. The first law of thermodynamics in the
form of Eq. (19-47), $\Delta H = -\Delta W$, establishes a relation between the
engine’s work output per cycle and its net heat intake $\Delta H$ per cycle. We
have

$$\Delta H = \Delta H_{C\,\mathrm{hi}} + \Delta H_{C\,\mathrm{lo}} = -\Delta W \tag{19-53}$$

From the definition of the Carnot efficiency $\eta^*$, Eq. (19-49) gives us

$$\Delta H_{C\,\mathrm{hi}} = \frac{-\Delta W}{\eta^*} \tag{19-54}$$


for the necessary heat input to the engine per cycle. And combining this
equation with Eq. (19-53), we have for the heat output per cycle

$$-\Delta H_{C\,\mathrm{lo}} = \Delta H_{C\,\mathrm{hi}} + \Delta W = -\Delta W \left( \frac{1}{\eta^*} - 1 \right) \tag{19-55}$$


Since $-\Delta W$, the work output of the engine, has a positive value, and since
$\eta^* < 1$, the rejected heat $-\Delta H_{C\,\mathrm{lo}}$ has a positive value, as you would expect.

Now assume that we can design a more efficient “super engine” S,
whose efficiency is $\eta' > \eta^*$. In particular, let us design one which puts out
exactly the same amount of work per cycle, $-\Delta W$, as the Carnot engine. As
shown in Fig. 19-19, this engine will take in an amount of heat $\Delta H_{S\,\mathrm{hi}}$ from
the hotter reservoir. According to the same argument as that used for the
Carnot engine, we can write

$$\Delta H_{S\,\mathrm{hi}} = \frac{-\Delta W}{\eta'} \tag{19-56}$$


And the super engine will reject to the cooler reservoir an amount of heat
$-\Delta H_{S\,\mathrm{lo}}$ given by

$$-\Delta H_{S\,\mathrm{lo}} = -\Delta W \left( \frac{1}{\eta'} - 1 \right) \tag{19-57}$$

This quantity, like $-\Delta H_{C\,\mathrm{lo}}$, is positive in value.




Since the Carnot engine is reversible, we will use the more efficient
super engine to run the Carnot engine in reverse as a heat pump, as shown in Fig.
19-20a. And since we have designed the two engines so that the work
output of the super engine just suffices to run the Carnot engine as a heat
pump, no net work is done by the system consisting of the two engines and
the two heat reservoirs. However, heat is being transferred from reservoir
to reservoir. In each cycle, the super engine removes heat from the hotter
reservoir, taking heat $\Delta H_{S\,\mathrm{hi}}$ into itself and thus changing the reservoir’s
internal energy by an amount $-\Delta H_{S\,\mathrm{hi}}$. At the same time, the Carnot heat
pump puts heat into this reservoir, rejecting heat $-\Delta H_{C\,\mathrm{hi}}$ and thus changing
the reservoir’s internal energy by an amount $-\Delta H_{C\,\mathrm{hi}}$. In each cycle,
therefore, the net heat input $\Delta H_{\mathrm{hi}}$ into the hotter reservoir is

$$\Delta H_{\mathrm{hi}} = -\Delta H_{S\,\mathrm{hi}} - \Delta H_{C\,\mathrm{hi}} \tag{19-58}$$

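We now use Eqs. (19-56) and (19-54), with a sign reversal in the latter so
that $-\Delta W$ can represent the engine’s work output in both, obtaining

$$\Delta H_{\mathrm{hi}} = \frac{\Delta W}{\eta'} - \frac{\Delta W}{\eta^*} = -\Delta W \left( \frac{1}{\eta^*} - \frac{1}{\eta'} \right) \tag{19-59}$$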


Fig. 19-20 Arrangement for the proof of Carnot’s theorem. (a) The Carnot engine of Fig.
19-19 is run in reverse as a heat pump. It takes from the cooler reservoir the same amount of
heat energy which it rejected to that reservoir when running as an engine in Fig. 19-19. Also, it
rejects to the hotter reservoir the same amount of heat energy which it took from that
reservoir when running as an engine. In order to do this, it must have mechanical work $\Delta W$ done
on it. This work is supplied by the super engine, which is running just as it did in Fig. 19-19. (b)
The net result of the process shown in part a. The mechanical work produced by the
super engine is entirely consumed by the Carnot engine, so that the system as a whole neither
puts out nor takes in mechanical work. The super engine transfers less heat energy from the
hotter reservoir to the cooler reservoir than the Carnot heat pump transfers in the opposite
direction. Consequently, the net effect of the operation of the system is a transfer of heat
energy from the cooler to the hotter reservoir, as shown by detailed calculation in the text. For
reasons discussed there, this violates the second law of thermodynamics.




Since we have assumed $\eta' > \eta^*$, the quantity in parentheses has a positive
value. Also, remember that $-\Delta W$ represents the work output of the engine
and thus also has a positive value. So we have

$$\Delta H_{\mathrm{hi}} > 0 \tag{19-60}$$

This means that the net result of the process shown in Fig. 19-20a is an
increase in the internal energy of the hotter reservoir.

We determine the net heat input $\Delta H_{\mathrm{lo}}$ to the cooler reservoir in similar
fashion. In each cycle, the super engine rejects heat $-\Delta H_{S\,\mathrm{lo}}$ which flows to
the cooler reservoir, changing the reservoir’s internal energy by an amount
$-\Delta H_{S\,\mathrm{lo}}$, whose value is positive. At the same time, the Carnot heat pump
removes heat from this reservoir, taking heat $\Delta H_{C\,\mathrm{lo}}$ into itself and thus
changing the reservoir’s internal energy by an amount $-\Delta H_{C\,\mathrm{lo}}$, whose value
is negative. In each cycle, therefore, the net heat input $\Delta H_{\mathrm{lo}}$ into the cooler
reservoir is

$$\Delta H_{\mathrm{lo}} = -\Delta H_{S\,\mathrm{lo}} - \Delta H_{C\,\mathrm{lo}} \tag{19-61}$$

We now use Eqs. (19-57) and (19-55), with a sign reversal in the latter so
that $-\Delta W$ can represent the engine’s work output in both, obtaining

$$\Delta H_{\mathrm{lo}} = -\Delta W \left( \frac{1}{\eta'} - 1 \right) + \Delta W \left( \frac{1}{\eta^*} - 1 \right) = -\Delta W \left( \frac{1}{\eta'} - \frac{1}{\eta^*} \right) \tag{19-62}$$

Since we have assumed $\eta' > \eta^*$, the quantity in parentheses has a negative
value, and since $-\Delta W$ has a positive value, we have

$$\Delta H_{\mathrm{lo}} < 0 \tag{19-63}$$

which means that the net result of the process shown in Fig. 19-20a is a
decrease in the internal energy of the cooler reservoir. Indeed, comparing the
value of $\Delta H_{\mathrm{lo}}$ with that previously obtained for $\Delta H_{\mathrm{hi}}$, the internal energy
change of the hotter reservoir, we have

$$\Delta H_{\mathrm{lo}} = -\Delta H_{\mathrm{hi}} \tag{19-64}$$

Thus the net result of the entire process is the removal of a quantity of heat
from the cooler reservoir and its transfer to the hotter reservoir, as shown
in Fig. 19-20b. Your intuition probably tells you that this is impossible.
Indeed, this intuition, stated precisely, is called the Clausius statement of the
second law of thermodynamics: No process is possible in which nothing happens
except the transfer of heat from a cooler body to a warmer one.
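The bookkeeping of Eqs. (19-54) through (19-64) can be followed with numbers. The sketch below is only an illustration under stated assumptions: the function name, the work output of 100 J per cycle, and the two efficiencies are hypothetical values, and the "super engine" efficiency is deliberately chosen larger than the Carnot value to show the contradiction.

```python
def composite_heat_flows(work_out, eta_carnot, eta_super):
    """Net heat input to each reservoir when a super engine drives a Carnot heat pump.

    work_out is the positive work output -ΔW per cycle shared by both machines.
    Returns (dH_hi, dH_lo), the net heat inputs to the hotter and cooler reservoirs.
    """
    # Super engine running forward, Eqs. (19-56) and (19-57).
    super_heat_in = work_out / eta_super
    super_heat_rejected = work_out * (1.0 / eta_super - 1.0)

    # Magnitudes for the Carnot machine, Eqs. (19-54) and (19-55); as a heat pump
    # it delivers carnot_heat_hi to the hot reservoir and removes carnot_heat_lo
    # from the cold one.
    carnot_heat_hi = work_out / eta_carnot
    carnot_heat_lo = work_out * (1.0 / eta_carnot - 1.0)

    dH_hi = carnot_heat_hi - super_heat_in         # Eq. (19-59)
    dH_lo = super_heat_rejected - carnot_heat_lo   # Eq. (19-62)
    return dH_hi, dH_lo


# Assumed values: 100 J of work per cycle, Carnot efficiency 0.4, super engine 0.5.
dH_hi, dH_lo = composite_heat_flows(100.0, eta_carnot=0.4, eta_super=0.5)
print(f"net heat into hotter reservoir: {dH_hi:+.1f} J")
print(f"net heat into cooler reservoir: {dH_lo:+.1f} J")
```

With these assumed numbers the hotter reservoir gains as much heat as the cooler one loses, with no net work, which is exactly the outcome the Clausius statement forbids. Setting the two efficiencies equal makes both net flows vanish.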


But we can do better than simply asserting that this is a fundamental
law of physics. In Sec. 18-7, we defined entropy in microscopic terms and
deduced the second law in this form: In a thermal interaction, the entropy of an
isolated system (possibly consisting of several subsystems) must either increase or
remain the same. Let us show that the system of Fig. 19-20, in violating the
Clausius statement of the second law, also violates the microscopic statement.
We do this by calculating the entropy change of the system as an
amount of heat $|\Delta H|$ is transferred from the cooler to the hotter reservoir.
The internal energy of the cooler reservoir is changed by an amount
$-|\Delta H|$, and its entropy is changed by an amount


$$\Delta S_{lo} = -\frac{|\Delta H|}{T_{lo}} \tag{19-65a}$$




The internal energy of the hotter reservoir is changed by an amount
$|\Delta H|$, and its entropy is changed by an amount

$$\Delta S_{hi} = \frac{|\Delta H|}{T_{hi}} \tag{19-65b}$$

Thus  the  total  entropy  change  of  the  system  is 

$$\Delta S = \Delta S_{lo} + \Delta S_{hi} = |\Delta H|\left(\frac{1}{T_{hi}} - \frac{1}{T_{lo}}\right) \tag{19-66}$$

Since $T_{hi} > T_{lo}$, the quantity in parentheses has a negative value, and we
have

$$\Delta S < 0 \tag{19-67}$$

This is a violation of the second law of thermodynamics, which requires that
the entropy of an isolated system either increase or remain unchanged.
Hence the existence of a heat engine more efficient than a Carnot engine
amounts to a violation of the second law of thermodynamics, which
we derived from fundamental principles in Sec. 18-7. This proves Carnot’s
theorem: No heat engine can be more efficient than a Carnot engine
operating between the same two heat reservoirs.
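The sign argument of Eqs. (19-65a) through (19-67) is easy to try with numbers. The sketch below is illustrative only; the function name `entropy_change_of_transfer` and the values 50 J, 300 K, and 400 K are assumptions chosen for the example.

```python
def entropy_change_of_transfer(heat, t_lo, t_hi):
    """Total entropy change (J/K) when a quantity of heat |Delta H| = heat
    passes from the cooler reservoir (t_lo) to the hotter reservoir (t_hi)
    and nothing else happens. Temperatures are absolute, in kelvin."""
    dS_lo = -heat / t_lo      # Eq. (19-65a): cooler reservoir loses heat
    dS_hi = heat / t_hi       # Eq. (19-65b): hotter reservoir gains heat
    return dS_lo + dS_hi      # Eq. (19-66)


# Assumed illustrative values: 50 J moved from 300 K to 400 K
dS = entropy_change_of_transfer(50.0, 300.0, 400.0)
print("total entropy change:", dS, "J/K")
```

The result is negative for any $T_{hi} > T_{lo}$, as Eq. (19-67) states. Interchanging the two temperatures in the call describes ordinary heat flow from hot to cold, for which the same function returns a positive value.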

There is a straightforward corollary to Carnot’s theorem. Let us assume
that the non-Carnot engine in Fig. 19-19 (now no longer a “super
engine”) is also a reversible engine. Assume also that its efficiency is less than
that of the Carnot engine. If that is so, we can run the system so that the
Carnot engine drives the other engine in reverse as a heat pump. (This is
the inverse of the situation in Fig. 19-20a.) Here again, with the more efficient
engine running the less efficient one, we will obtain a net result of
pumping  heat  from  a low-  to  a high-temperature  reservoir  without  any 
work  input,  in  violation  of  the  second  law.  This  conclusion  follows:  All 
reversible  engines  have  the  same  efficiency  as  a Carnot  engine  operating  between  the 
same  heat  reservoirs.  (Of  course,  no  real  engine  is  exactly  reversible.  The 
existence  of  friction  ensures  this,  even  if  there  were  no  other  irreversible 
processes  taking  place.  Thus  a real  engine  is  certain  to  be  less  efficient  than 
a Carnot  engine  operating  between  the  same  reservoirs.) 

An engine which takes heat energy from a reservoir and converts it entirely
into mechanical energy does not violate the first law, which requires
only that the sum of heat energy and mechanical energy be conserved in
any conversion process. Such an engine does violate the second law, by
decreasing the entropy of the universe. It is therefore called a perpetual-motion
machine of the second kind. It extracts heat $\Delta H$ from the reservoir
at temperature T, thus decreasing the entropy of the reservoir by an
amount $\Delta S = \Delta H/T$. But since it rejects no heat elsewhere, there is no
other subsystem whose entropy can increase by at least a compensating
amount. This leads to the so-called Kelvin-Planck statement of the second
law of thermodynamics: No process is possible in which nothing happens except
the conversion of heat energy from a single reservoir into macroscopic work. Indeed,
a Carnot engine rejects just enough heat into the cooler reservoir that the
reservoir’s entropy is increased exactly as much as the entropy of the hotter
reservoir is decreased as the engine removes heat from it. Thus a Carnot
engine does not change the entropy of the universe. It is this consideration
which underlies the Carnot efficiency $\eta^* = (T_{hi} - T_{lo})/T_{hi}$ given by Eq.
(19-52), as Example 19-8 shows.
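The entropy balance just described can be explored numerically. The following Python sketch is an illustration under assumed values (1000 J of heat input per cycle, reservoirs at 400 K and 300 K); the function names are hypothetical. It computes the entropy change of the two reservoirs for one cycle of an engine of efficiency $\eta$, which rejects the fraction $1 - \eta$ of its heat input to the cooler reservoir.

```python
def carnot_efficiency(t_hi, t_lo):
    """Carnot efficiency (T_hi - T_lo)/T_hi, temperatures in kelvin."""
    return (t_hi - t_lo) / t_hi


def entropy_change_per_cycle(heat_in, eta, t_hi, t_lo):
    """Entropy change of the two reservoirs in one engine cycle (J/K).

    heat_in : heat taken from the hotter reservoir (positive)
    eta     : efficiency of the engine, so heat_in*(1 - eta) is rejected
    """
    heat_rejected = heat_in * (1.0 - eta)
    return -heat_in / t_hi + heat_rejected / t_lo


# Assumed illustrative values
t_hi, t_lo, heat_in = 400.0, 300.0, 1000.0
eta_star = carnot_efficiency(t_hi, t_lo)

for eta in (0.10, eta_star, 0.40, 1.00):
    dS = entropy_change_per_cycle(heat_in, eta, t_hi, t_lo)
    print(f"eta = {eta:.2f}  entropy change = {dS:+.3f} J/K")
```

In this sketch the entropy change is positive for $\eta < \eta^*$, zero at $\eta = \eta^*$, and negative for $\eta > \eta^*$. The case $\eta = 1$ corresponds to the perpetual-motion machine of the second kind, for which the entropy change is simply $-\Delta H/T_{hi}$.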



EXAMPLE 19-8

Show that a heat engine which produces no net change in the entropy of the
universe must have efficiency $\eta^*$.

■ An engine with efficiency $\eta$ takes in an amount of heat $\Delta H_{hi}$, whose value is
positive, from the hotter reservoir. The internal energy of the hotter reservoir is
thus changed by the amount $-\Delta H_{hi}$, whose value is negative. At the same time, the
engine rejects an amount of heat $-\Delta H_{lo}$, whose value is positive, to the cooler
reservoir. The internal energy of the cooler reservoir is thus changed by the amount
$-\Delta H_{lo}$, whose value is positive. If the entropy change of the entire system is to be
zero, you must have

$$\Delta S = 0 = \frac{-\Delta H_{hi}}{T_{hi}} + \frac{-\Delta H_{lo}}{T_{lo}}$$

But analogues to Eqs. (19-49) and (19-53) show that $\Delta H_{lo}$ is given by
$\Delta H_{lo} = \Delta H_{hi}(\eta - 1)$. Substituting this value into the equation immediately
above gives you

$$0 = \Delta H_{hi}\left(-\frac{1}{T_{hi}} - \frac{\eta - 1}{T_{lo}}\right)$$

Solving for $\eta$, you obtain

$$\eta = 1 - \frac{T_{lo}}{T_{hi}} = \frac{T_{hi} - T_{lo}}{T_{hi}} = \eta^*$$

which is the Carnot efficiency.


19-6 HEAT PUMPS, REFRIGERATORS, AND ENGINES


Just as a Carnot engine run in reverse acts as a heat pump, removing heat
from a low-temperature source and rejecting it to a high-temperature sink,
a properly designed real engine can be made to do the same thing. This is
the basis of the familiar refrigerator as well as the less common, practical
heat pump. The difference between the two is merely a matter of purpose.
The refrigerator is designed to maintain an enclosed space at a temperature
lower than the ambient temperature, so as to store food, to keep people
comfortable in summer, or for some similar reason. Thus the space
being cooled is itself the heat source, and the outside environment is the
heat sink. Harking back to the general definition of efficiency, that is,
(what you get)/(what you pay for), we characterize a refrigerator by its
refrigerator coefficient of performance $E_r$:


$$E_r = \frac{\text{heat removed from source}}{\text{work input to refrigerator}} \tag{19-68a}$$

In a heat pump, on the other hand, the aim is to warm an enclosed space by
extracting heat from the cooler outside world and rejecting it into the
enclosure, which thus becomes the heat sink. Here it is more meaningful to
define the heat pump coefficient of performance $E_{hp}$:


$$E_{hp} = \frac{\text{heat added to sink}}{\text{work input to heat pump}} \tag{19-68b}$$


These two coefficients of performance are closely related. For a
Carnot heat pump, the coefficient is written $E_{hp}^*$. It is simply the reciprocal
of the thermodynamic efficiency $\eta^*$ given by Eq. (19-52) for the same
device run in the forward direction as an engine. Since the thermodynamic
efficiency is always less than 1, the coefficient of performance is always
greater than 1. Using Eq. (19-52), we can write


$$E_{hp}^* = \frac{1}{\eta^*} = \frac{T_{hi}}{T_{hi} - T_{lo}} \tag{19-69a}$$


We can use the first law to rewrite the refrigerator coefficient of performance
$E_r$ given by Eq. (19-68a) in the form

$$E_r = \frac{\text{heat added to sink} - \text{work input}}{\text{work input}} = E_{hp} - 1$$

For a Carnot refrigerator, we therefore have from Eq. (19-69a) a coefficient
of performance $E_r^*$ given by

$$E_r^* = E_{hp}^* - 1 = \frac{T_{lo}}{T_{hi} - T_{lo}} \tag{19-69b}$$

In Fig. 19-21, the two coefficients of performance $E_{hp}^*$ and $E_r^*$ are plotted as
functions of $T_{lo}/T_{hi}$. Whereas the Carnot engine becomes more efficient as
this temperature ratio decreases (see Fig. 19-18), the Carnot heat pump
and refrigerator become more efficient as it increases.
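Equations (19-69a) and (19-69b) can be tabulated directly. The sketch below is a minimal illustration; the function names and the particular temperatures (a sink at 295 K and several source temperatures) are assumed for the example and are not taken from the figure.

```python
def carnot_cop_heat_pump(t_hi, t_lo):
    """Carnot heat pump coefficient of performance, Eq. (19-69a)."""
    return t_hi / (t_hi - t_lo)


def carnot_cop_refrigerator(t_hi, t_lo):
    """Carnot refrigerator coefficient of performance, Eq. (19-69b)."""
    return t_lo / (t_hi - t_lo)


# Assumed illustrative temperatures, in kelvin
t_hi = 295.0
rows = []
for t_lo in (235.0, 255.0, 275.0, 285.0):
    e_hp = carnot_cop_heat_pump(t_hi, t_lo)
    e_r = carnot_cop_refrigerator(t_hi, t_lo)
    rows.append((t_lo / t_hi, e_hp, e_r))
    print(f"T_lo/T_hi = {t_lo / t_hi:.3f}  E_hp* = {e_hp:6.2f}  E_r* = {e_r:6.2f}")
```

In every row the two coefficients differ by exactly 1, and both grow as the temperature ratio approaches 1, which is the trend described for Fig. 19-21.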

Qualitatively  speaking,  the  same  is  true  of  real  heat  pumps.  This  is  why  they 
are  most  useful  in  relatively  mild  climates,  where  their  coefficients  of  performance 
can be relatively large. The popularity of heat pumps depends in part on their
reversibility, not in the thermodynamic sense of the word but in the sense that the
same  unit  can  be  used  as  an  air  conditioner  (that  is,  a refrigerator)  in  summer.  We 
need  merely  interchange  the  source  and  sink  connections.  But  the  major  economic 
factors  in  heat-pump  heating  are  their  complexity  and  high  cost  compared  to  the 
simple  furnace  and  the  cost  of  electricity  relative  to  that  of  heating  fuels.  With  the 


Fig. 19-21 Plot of the refrigerator coefficient of performance $E_r^*$ and the
heat pump coefficient of performance $E_{hp}^*$, for a Carnot engine run in
reverse as a refrigerator (or heat pump). Plotted on the horizontal axis is
$T_{lo}/T_{hi}$, the ratio of the temperature of the heat source (in this case the
cooler reservoir) and the heat sink (the warmer reservoir).




increasing cost of petroleum products, the relative cost of electricity may fall as
more reliance is placed on nuclear power plants. But whatever the cost of
electricity, the cost of using it to drive a heat pump is $1/E_{hp}$ times the cost of
direct electric heating.
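The last statement follows directly from the definition in Eq. (19-68b): delivering a given quantity of heat requires a work input smaller by the factor $E_{hp}$. A minimal sketch, in which the function name and the value $E_{hp} = 3$ are assumptions made purely for illustration:

```python
def electrical_energy_needed(heat_delivered, cop_heat_pump):
    """Electrical energy a heat pump needs to deliver heat_delivered,
    from Eq. (19-68b): work input = heat added to sink / E_hp."""
    if cop_heat_pump <= 0:
        raise ValueError("coefficient of performance must be positive")
    return heat_delivered / cop_heat_pump


heat = 1.0e6                 # joules of heat wanted in the enclosure (assumed)
direct = heat                # direct electric heating: 1 J of heat per 1 J used
pumped = electrical_energy_needed(heat, 3.0)   # assumed E_hp = 3
cost_fraction = pumped / direct
print("heat pump uses", cost_fraction, "of the energy of direct heating")
```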

If a Carnot refrigerator were available for household use, it might typically
be required to keep food at an absolute temperature $T_{lo} = 255$ K in
its freezer section, when the household temperature was $T_{hi} = 295$ K. Its
coefficient of performance would thus be

$$E_r^* = \frac{255\ \text{K}}{295\ \text{K} - 255\ \text{K}} = 6.4$$

That is, each 1 J of input energy transfers 6.4 J of heat energy out of the
refrigerator compartment. Real household refrigerators have coefficients
of performance about half this large at best. One important reason for this
is a practical one. It is not economically feasible, from the point of view of
initial capital investment and mechanical reliability, to use a reversible
engine at all. The most common kind of household refrigerator works on
the cycle shown in Fig. 19-22. The compressor (a highly reliable electric
pump) compresses a readily liquefiable gas, usually one of the synthetic
fluorocarbons called Freons. The compression process is more or less
adiabatic. The heated, compressed liquid passes through a tube to a set of coils
outside the refrigerator. These radiator coils are usually in back, and it is
easy to feel the heat evolved by placing your hand near them. (It is important
to keep them reasonably clean and to not block the passage of air past
them, so as to minimize $T_{hi}$.) The compressed liquid, thus cooled to a
temperature not too far above room temperature, passes into an expansion


Fig. 19-22 (a) Schematic diagram of the operation of the most common form of
household refrigerator. (b) Both p-V and T-S diagrams of the thermodynamic “cycle”
of the refrigerator. Path 1 represents the (approximately) adiabatic compression of
the working fluid. (The adiabaticity of this process is more evident in the T-S plot.
Can you explain the “corner” on the p-V plot where the gas liquefies?) Along path 2
the working fluid, which has been heated well above room temperature in the
adiabatic compression process, cools to approximately room temperature in the
radiator coils. This process is isobaric, as can be seen from the p-V plot. The
working fluid is then sprayed through a nozzle located inside the freezer
compartment, vaporizing from liquid back to gas in its rapid expansion. This
Joule-Kelvin expansion is nonquasistatic, and the working fluid does not pass
through a series of well-defined equilibrium states. Therefore the process cannot
be represented as a curve on a state diagram such as a p-V or T-S diagram. When the
fluid is again approximately at equilibrium, its temperature and pressure have
decreased and its volume and entropy have increased, as represented by the point at
the beginning of path 4. Along path 4, the fluid warms further in an isobaric
process, as it absorbs heat from its surroundings, and returns to the compressor
for another cycle.




chamber located in the freezer compartment. There the liquid, still under
pressure, sprays through a small nozzle, evaporating into a much larger
volume at lower pressure. The result is a considerable drop in temperature,
and the cooled working fluid, now a gas, can absorb heat from the
refrigerator compartment. This Joule-Kelvin expansion process, though
it is realizable in a compact, mechanically simple, and very reliable device, is
nonquasistatic and highly irreversible thermodynamically. Substantial
efficiency is sacrificed to simplicity.

In the United States, new refrigerators are required by law to carry a figure of
merit expressing their efficiency. The figure used, however, takes into
consideration such other matters as the efficacy of the insulation in the walls.
It is thus not directly comparable to the coefficient of performance.

Not all real refrigerators are irreversible. The device most often used in
modern practice to liquefy helium gas employs a small reciprocating engine run in
reverse, in which the expanding gas does work against a piston. At still lower
temperatures, the magnetic cooling, or adiabatic demagnetization, refrigerator is
used. It has no moving parts at all, and yet it approximates a Carnot engine. The
sample to be cooled is placed in good thermal contact with a quantity of a
so-called paramagnetic salt. For our purposes, the paramagnetic salt may be
considered as containing a collection of atom-sized bar magnets, each mounted on a
tiny pivot; see Fig. 19-23. These magnets are in permanent thermal contact with the
salt considered as a whole, and particularly with the crystal system we have
idealized as an array of bodies connected by a network of springs. In the absence
of an external magnetic field, the magnets are oriented in all possible directions.
In this disordered state, the entropy of the magnets, and thus of the entire
system, is relatively high.

The  entire  system  is  well  insulated,  but  it  is  placed  in  good  thermal  contact 
with  a container  of  liquid  helium,  chilled  by  evaporation  to  about  1.2  K.  One  way 
of  doing  this  is  by  making  the  thermal  connection  with  a metal  rod. 


Fig. 19-23 Schematic diagram of the adiabatic demagnetization process used to achieve very
low temperatures. The first T-S diagram represents the isothermal magnetization of a sample
of paramagnetic salt (the “working fluid”) in thermal contact with a bath of liquid helium. The
magnetization process lines up the atomic magnets and thus orders them, reducing their
entropy, while heat flows to the bath. The second T-S diagram represents the demagnetization,
in which the alignment of the atomic magnets is destroyed, with their entropy increasing in
the process. But the sample as a whole is thermally isolated, and therefore the overall process
must be adiabatic. So heat flows to the atomic magnets from the sample as a whole, and the
sample temperature is reduced. (Panel labels: paramagnetic salt in the magnetic field at
T = 1.2 K; thermal contact broken and magnet turned off, with the temperature falling from
T = 1.2 K to T = $10^{-2}$ K.)




A large  external  magnet  is  now  turned  on,  and  the  little  bar  magnets  line  up 
along  the  external  magnetic  field.  The  order  of  the  system  is  thus  increased,  and 
the  entropy  is  decreased.  This  process  is  depicted  by  the  isothermal  path  KL  on  the 
T-S  diagram  in  Fig.  19-23. 

The thermal contact to the helium bath is broken, leaving the system
thermally isolated. Then the external magnet is turned off. The orientation of the
atomic magnets therefore returns to its original random state, and this means their
entropy must increase. Thus, as far as the subsystem comprising the atomic
magnets is concerned, the quantity $T\,\Delta S = \Delta H$ has a positive value (since the
temperature is always positive). But the system as a whole is thermally isolated,
and thus its internal energy must remain constant. So heat flows from the other
“components” of the system, notably the molecular vibrations, to the atomic
magnets. The result is a fall in the temperature. The entropy of the entire system
remains constant in this adiabatic process, and thus the system passes along path
LM on the T-S diagram in Fig. 19-23.

Very tiny amounts of heat are involved in magnetic cooling, but the temperature
drop is dramatic. By beginning at 1.2 K, it is possible to achieve temperatures
in the neighborhood of $10^{-2}$ K, a hundredfold reduction. The analogous process
using nuclear magnetism can be used to reach temperatures of order $10^{-6}$ K, the
lowest yet achieved.

The  process  is  not  usually  continued  cyclically,  and  so  the  rectangular  Carnot 
cycle  on  the  T-S  diagram  is  not  completed.  The  reason  is  that  the  heat  capacities  of 
materials  at  very  low  temperatures  are  so  small  that  it  is  not  usually  possible  to 
approximate  a large  heat  reservoir  with  which  the  paramagnetic  refrigerator  can 
interact  isothermally. 

Let us now compare real heat engine systems with the Carnot engine.
Perhaps the most obviously unrealistic property of the Carnot engine is the
way in which it is transferred from heat source to thermal isolation to heat
sink. In any real engine, the working fluid is brought to the mechanical
device (cylinder or turbine) which is the heart (and the most obvious part) of
the engine. But this difference is not thermodynamically significant. There
are other more important, if less visible, differences between real and
Carnot engines. Principal among these is the fact that it is rarely, if ever,
possible in a real system to approximate the isothermal Carnot heat
transfer condition. As a result, the ideal efficiency is determined by some
sort of average input and exhaust temperatures (which were called $\langle T \rangle_1$ and
$\langle T \rangle_2$, respectively, in Sec. 19-4) rather than by the highest and lowest
temperatures present during the cycle. This immediately reduces the efficiency
below the Carnot efficiency, $\eta^* = (T_{hi} - T_{lo})/T_{hi}$, given by Eq. (19-52).

In order to see how the conditions imposed by reality affect the design
and properties of heat engines, we consider very briefly two important
types. One is the steam turbine, used very widely in large electric
generating plants and ships. The other is the most common kind of gasoline
engine (known technically as the Otto engine), used in most automobiles
and countless other applications.

The steam turbine is currently the most important type of external-combustion
engine. In such an engine, the cycle does not take place entirely
in the turbine (or cylinder), but in various parts of the closed system
shown in Fig. 19-24. Steam is produced at the highest temperature practicable
in a boiler and passes under pressure to the turbine. There it expands
in approximately adiabatic fashion and emerges at a reduced temperature
and pressure to pass into the condenser. In the condenser, the coldest
substance readily available in quantity is used to cool the steam to the lowest




Fig. 19-24 Schematic diagram of a steam turbine system. The closed system is
typical of large external-combustion engines.


temperature  practicable.  The  boiler  and  condenser  temperatures  establish 
the  ideal  Carnot  efficiency  and  thus  limit  the  maximum  efficiency  which 
can  be  expected  from  the  system.  (This  is  one  reason  why  large  power 
plants  are  placed  near  rivers  or  large  bodies  of  water,  if  it  is  at  all  possible  to 
do  so.) 

In  the  condensation  process,  the  steam  condenses  to  liquid  water.  The 
condensed water is pumped back into the boiler by a feed pump. Since the
boiler  is  under  pressure,  the  water  must  be  compressed.  The  compression 
is  approximately  adiabatic.  In  a Carnot  engine,  this  process  would  result  in 
bringing  the  working  fluid  back  to  the  boiler  temperature.  But  water,  being 
a liquid,  changes  its  volume  and  temperature  little  under  compression. 
When  its  pressure  is  equal  to  that  in  the  boiler,  its  temperature  is  still  well 
below  that  of  the  steam  in  the  boiler.  Therefore  the  water  must  be  heated 
in  a process  as  close  to  reversible  as  possible.  This  is  usually  accomplished 
with  multistage  boilers,  aided  by  a series  of  preheaters  which  may  employ 
the  heat  from  the  stack  gases  or  the  low-pressure  steam  exhausted  from  the 
turbine.  But  even  if  the  heating  process  were  reversible,  it  would  still  take 
place  at  a temperature  whose  average  value  is  considerably  less  than  that  of 
the  boiler  itself.  The  thermodynamic  efficiency  is  thus  impaired. 

There  is  yet  another  important  limit  on  attainable  efficiency.  The 
highest  temperature  available  in  the  flame  of  common  fuels  is  quite 
high  — 3000  K is  not  difficult  to  achieve.  But  a practical  upper  limit  on 
boiler  temperature  is  set  by  the  strength  of  materials  at  high  temperature 
and  by  the  properties  of  water,  the  working  fluid.  Roughly  speaking,  this 
upper limit is $T_{hi} = 800$ K. It is not possible to take advantage of the much
higher Carnot efficiency which would result if $T_{hi}$ were the flame temperature.




If we take the condenser temperature $T_{lo}$ to be roughly 300 K, the
Carnot efficiency is about 0.63, or 63 percent. The actual efficiency
achieved by the most modern steam plants is about 49 percent. Any
substantial improvement will depend on new technology that allows the use of
much higher boiler temperatures.
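The figures quoted above are simple to reproduce. The sketch below uses only the temperatures and the plant efficiency given in the text (800 K, 300 K, a 3000 K flame, and 49 percent); the function name `carnot_efficiency` and the idea of comparing the plant with the Carnot limit as a single ratio are additions made for illustration.

```python
def carnot_efficiency(t_hi, t_lo):
    """Carnot efficiency (T_hi - T_lo)/T_hi for absolute temperatures."""
    return (t_hi - t_lo) / t_hi


t_condenser = 300.0   # K, condenser temperature from the text
t_boiler = 800.0      # K, practical upper limit on boiler temperature
t_flame = 3000.0      # K, flame temperature quoted as not hard to reach

eta_boiler = carnot_efficiency(t_boiler, t_condenser)
eta_flame = carnot_efficiency(t_flame, t_condenser)
actual = 0.49         # efficiency quoted for the most modern steam plants

print(f"Carnot limit at boiler temperature: {eta_boiler:.3f}")
print(f"Carnot limit at flame temperature:  {eta_flame:.3f}")
print(f"quoted plant efficiency / Carnot limit: {actual / eta_boiler:.2f}")
```

The first value, 0.625, is the "about 0.63" of the text. The second shows how much larger the ideal efficiency would be if the working fluid could be used at the flame temperature.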

Nuclear fission steam plants differ thermodynamically from fossil-fuel
plants only in the way that the water is heated. However, the hostile
environment in which parts of the heating system must operate, subject to
intense nuclear radiation, rules out the use of some of the strongest materials
(such as special steels). It also requires that the components be operated at
lower temperatures and pressures than would be possible in the absence of
the radiation. Consequently, nuclear plants are operated at substantially
lower temperatures $T_{hi}$ than fossil-fuel plants. Their overall efficiency is
currently about 30 percent, although improvement is still fairly rapid.


The gasoline engine presents a series of problems which are different
in detail, but still fall into the same general categories. The gasoline engine
is an internal-combustion engine, that is, one in which the heat is
produced by chemical reaction right inside the cylinder. Indeed, all parts of
the cycle take place in the engine proper. As a consequence, the working
fluid cannot recirculate, but is created anew in each engine cycle.

The idealized form of the most important type of gasoline-engine
cycle is called the four-stroke Otto cycle. Its operation is shown schematically
in Fig. 19-25a, and the p-V diagram of the cycle is shown in Fig.
19-25b. An air-fuel mixture is drawn into the cylinder during the outward
intake stroke of the piston. The intake valve then closes, and the mixture is
compressed approximately adiabatically during the compression stroke. Near
the point of maximum compression, a spark ignites the mixture, which
burns quite rapidly. Ideally, the combustion is so fast that the piston hardly
moves during its course. Thus the “heat input” to the engine occurs under
isometric conditions. The actual combustion process is far from quasistatic.
It is difficult to measure, or even define, the working fluid temperature
under such conditions. Locally, however, the temperature may reach 6000
K. The effective input temperature is far lower. It is further reduced by the
contact of the hot gas with the cylinder walls, which have a relatively high
heat capacity and are kept relatively cool. In the idealized picture, however,
the hot gas expands more or less adiabatically in the power stroke and then
escapes irreversibly through the exhaust valve, which opens at the end of
the stroke. Finally, the residual working fluid is pushed out of the cylinder
during the exhaust stroke, and a fresh quantity of air-fuel mixture is
drawn in.

Fig. 19-25 (a) Operation of the four-stroke Otto cycle, a typical
internal-combustion cycle, showing the intake stroke, compression stroke,
ignition, power stroke, opening of the exhaust valve, and exhaust stroke.
(b) The Otto cycle shown on a p-V diagram.

It can be shown that the ideal efficiency of the Otto cycle is given by the
expression

$$\eta = 1 - r^{1-\gamma} \tag{19-70}$$

where r, the compression ratio, is the ratio of the maximum volume of the
cylinder to its minimum volume and $\gamma$ is the specific heat ratio. (This result
applies to the adiabatic compression ratio of Carnot engines as well.) The
practical compression ratio is limited by the cost and availability of fuels
which do not ignite prematurely as they are heated by adiabatic compression.
With lead-free fuels, a practical limit is about r = 9. Since $\gamma \approx 1.4$, this
corresponds to an ideal efficiency $\eta \approx 58$ percent. In fact, attainable
efficiencies run rather lower than 30 percent. In any case, Otto engines are
rarely operated under the constant-load, constant-speed conditions which
maximize efficiency.
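The ideal efficiency quoted for r = 9 follows directly from Eq. (19-70). The short sketch below evaluates it; the function name `otto_efficiency` and the additional compression ratios in the loop are assumptions added for illustration, while r = 9 and the specific heat ratio 1.4 are the values used in the text.

```python
def otto_efficiency(compression_ratio, gamma=1.4):
    """Ideal Otto-cycle efficiency, Eq. (19-70): 1 - r**(1 - gamma)."""
    if compression_ratio <= 1.0:
        raise ValueError("compression ratio must exceed 1")
    return 1.0 - compression_ratio ** (1.0 - gamma)


eta_9 = otto_efficiency(9.0)
print(f"r = 9: ideal efficiency = {eta_9:.3f}")

# Other compression ratios, chosen only to show the trend
for r in (6.0, 8.0, 10.0, 12.0):
    print(f"r = {r:4.1f}: ideal efficiency = {otto_efficiency(r):.3f}")
```

For r = 9 the function returns about 0.585, in agreement with the 58 percent stated above. The ideal efficiency rises only slowly with further increases in r, which is part of the reason the fuel limitation on r is accepted in practice.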


19-7 THE THIRD LAW OF THERMODYNAMICS

If thermodynamics is to be treated as a self-contained science, it requires
one more fundamental law for its foundation. Like the zeroth, first, and
second laws of thermodynamics, the third law of thermodynamics can be
derived from statistical considerations. One way of stating the third law is
as follows: The absolute zero of temperature can, in principle, be approached as
closely as desired, but can never be attained in a finite number of steps.

A nonrigorous justification of this law follows. Imagine that we have a
Carnot refrigerator whose working fluid has a total heat capacity C, which
is, in general, a function of the temperature. This refrigerator is to be used
to cool a heat source to absolute zero. The heat extracted from that source
is rejected to a sink at a temperature $T_{hi}$ which is $\Delta T$ higher than $T_{lo}$, the
source temperature. As the source is cooled and $T_{lo}$ decreases, some other
refrigerator is used to cool the sink so as to keep $\Delta T$ constant, as shown in
Fig. 19-26. In each cycle, the Carnot refrigerator extracts an amount of
heat from the source given by

$$\Delta H = C\,\Delta T \tag{19-71}$$

as the working fluid is cycled through a temperature difference
$\Delta T = T_{hi} - T_{lo}$.

Fig. 19-26 A hypothetical apparatus for cooling a heat source to absolute
zero. The inability of this idealized apparatus to carry out its task in a finite
number of cycles exemplifies the third law of thermodynamics.

If we let $\Delta T$ be arbitrarily small, we can use Eq. (19-42), $dH = T\,dS$,
and write Eq. (19-71) in the form

$$T\,dS = C\,dT$$

Thus the entropy change of the heat source in one such cycle of the
refrigerator is

$$dS = \frac{C}{T}\,dT \tag{19-72}$$

If we cool the source from an initial temperature $T_i$ to a final temperature
$T_f$, we reduce its entropy by an amount

$$\Delta S = \int_{T_i}^{T_f} \frac{C}{T}\,dT \tag{19-73}$$

This is also the amount of entropy which “passes through” the reversible
Carnot refrigerator. That is, the refrigerator increases the entropy of the
heat sink by $\Delta S$.

Now  let  us  try  to  make  Tf  = 0.  That  is,  let  us  try  to  cool  the  heat  source 
to  absolute  zero.  We  then  have 

$$\Delta S = \int_{T_i}^{0} \frac{C}{T}\,dT \tag{19-74}$$

The  quantity  AS  on  the  left  side  of  this  equation  cannot  be  infinite. 

To see this, note that $\Delta S$ represents the difference $S(0) - S(T_i)$. The definition
of entropy is given by Eq. (18-54), $S = k \ln[w(E)]$, where k is Boltzmann’s
constant and w(E) is the number of microstates available to the system when its
internal energy is E. At any nonzero temperature $T_i$, w(E) is a finite number,
although it can be very large. Consequently, $S(T_i)$ is finite. And when T = 0, the
number of microstates cannot be less than 1, so we can discount the possibility
that $S(0) = k \ln 0 = -\infty$.

But  if  AS  is  not  to  be  infinite,  the  integrand  on  the  right  side  of  Eq. 
(19-74)  must  not  “blow  up”  as  T approaches  zero.  This  can  be  the  case  only 
if 

$$C \to 0 \quad \text{as} \quad T \to 0 \tag{19-75}$$

That  is,  the  heat  capacity  of  the  working  fluid  in  the  Carnot  refrigerator 
must  approach  zero  sufficiently  rapidly,  as  T approaches  zero,  to  ensure 
that  the  integral  has  a finite  value. 
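The convergence requirement on the integral in Eq. (19-74) can be checked numerically. The following is a minimal sketch under stated assumptions, not part of the original derivation: it compares two purely illustrative heat-capacity laws in arbitrary units, a constant $C$ and a hypothetical $C \propto T^3$, and the names `entropy_removed`, `constant_c`, and `cubic_c` are ours.

```python
import math


def entropy_removed(heat_capacity, t_initial, t_final, steps=20_000):
    """Approximate the integral of C(T)/T from t_final up to t_initial.

    Substituting u = ln T turns C(T)/T dT into C(T) du, so a midpoint
    rule on a uniform grid in u handles many decades of temperature.
    """
    u_lo, u_hi = math.log(t_final), math.log(t_initial)
    du = (u_hi - u_lo) / steps
    total = 0.0
    for n in range(steps):
        t_mid = math.exp(u_lo + (n + 0.5) * du)
        total += heat_capacity(t_mid) * du
    return total


def constant_c(t):
    """Assumed law 1: heat capacity independent of temperature."""
    return 1.0


def cubic_c(t):
    """Assumed law 2: heat capacity vanishing as T**3 (illustrative only)."""
    return t ** 3


# Lower the final temperature by many decades and watch the two integrals.
for t_final in (1e-2, 1e-4, 1e-8):
    print(t_final,
          entropy_removed(constant_c, 1.0, t_final),
          entropy_removed(cubic_c, 1.0, t_final))
```

With the constant law the result grows like $\ln(T_i/T_f)$ without limit as $T_f$ is lowered, whereas with the cubic law it settles toward a finite value. This is the sense in which $C$ must vanish "sufficiently rapidly."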

The physical significance is that in each cycle, the refrigerator is capable
of removing less heat from the source than it did in the previous
cycle.  Although  it  cannot  be  proved  rigorously  here,  the  total  heat  removed 
from  the  source  is  described  mathematically  by  a convergent  infinite  series. 
At  best,  an  infinite  number  of  terms,  representing  an  infinite  number  of 
refrigerator  cycles,  are  required  to  reach  absolute  zero.  Since  each  cycle  of 
a real  refrigerator  takes  a finite  amount  of  time  to  execute,  even  an  ideal 
Carnot  refrigerator  cannot  reach  absolute  zero  in  a finite  amount  of  time. 
However,  there  is  no  bar  in  principle  against  using  such  a refrigerator  to 
reach  an  arbitrarily  low  finite  temperature. 
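The behavior described in this paragraph can be illustrated with a toy model. The sketch below rests on assumptions that are ours and not the text's: the source has a hypothetical heat capacity $C = aT$, and each cycle lowers the temperature by a fixed fraction of its current value. The function name `simulate_cooling` and its parameters are likewise illustrative.

```python
def simulate_cooling(t_initial, fraction_per_cycle, n_cycles, a=1.0):
    """Toy model of repeated refrigerator cycles.

    Assumptions (not from the text): the heat capacity is C = a*T, and each
    cycle reduces the temperature by `fraction_per_cycle` of its current
    value. The heat removed in one cycle is the integral of C dT between
    the temperatures before and after the cycle.
    """
    temperatures = [t_initial]
    heats = []
    t = t_initial
    for _ in range(n_cycles):
        t_next = t * (1.0 - fraction_per_cycle)
        heats.append(0.5 * a * (t ** 2 - t_next ** 2))
        temperatures.append(t_next)
        t = t_next
    return temperatures, heats


temperatures, heats = simulate_cooling(1.0, 0.5, 30)
print("temperature after 30 cycles:", temperatures[-1])
print("total heat removed:", sum(heats))
```

In this model each cycle removes less heat than the one before, the running total of heat removed approaches the finite limit $aT_i^2/2$, and the temperature remains positive after any finite number of cycles.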

If a Carnot refrigerator cannot reach absolute zero in a finite number of cycles, no other refrigerator can do so, since we have seen that no other refrigerator can have a higher coefficient of performance. It is not necessary that the Carnot refrigerator operate on the basis of a gas in a cylinder. Note that the adiabatic demagnetization refrigerator discussed in Sec. 19-6 operates in principle as a Carnot refrigerator. Note also that the requirement of Eq. (19-75), that the specific heat of the working fluid approach zero as the temperature approaches absolute zero, applies to all matter, since no particular working fluid was specified.

Implicit in this discussion of the third law of thermodynamics is an alternative way of stating the law: As the temperature of a system approaches absolute zero, its entropy approaches a constant value $S_0$. This is a direct consequence of the microscopic definition of entropy, which was used explicitly in the discussion of Eq. (19-74). Because of the direct dependence of the third law on the microscopic definition of entropy, a deeper inquiry leads to the conclusion that the third law has an essentially quantum-mechanical nature. This is to be expected, since the microscopic behavior of the statistical world in which entropy is defined is quantum-mechanical.

We have considered, on the basis of statistical mechanics, all four of the fundamental laws that underlie all macroscopic thermodynamic interactions. Thermodynamics is thus grafted onto the main stem of physics. The four laws of thermodynamics may be summarized as follows:

0.  If  two  systems  are  in  thermal  equilibrium  with  a third  system,  then 
they  are  in  thermal  equilibrium  with  each  other. 

1. The change in the internal energy $E$ of a system, as it passes from one equilibrium state to another, is equal to the sum of the heat energy input $\Delta H$ via thermal interaction and the work $\Delta W$ done on the system by varying one or more external parameters:

$$\Delta E = \Delta H + \Delta W$$




2. In any thermodynamic interaction among the subsystems of an isolated system involving the transfer of heat, the performance of mechanical work, or both, the entropy of the system must either remain constant or increase:

$$\Delta S \geq 0$$

3.  The  absolute  zero  of  temperature  can  be  approached  but  cannot  be 
attained  by  any  physical  process. 

Because  of  the  underlying  statistical  nature  of  these  laws,  it  is  possible 
to  express  them  trenchantly,  if  indirectly,  in  the  language  of  the  gambling 
casino: 

0.  You  can’t  be  luckier  than  any  other  player. 

1.  You  can't  win  more  than  the  whole  pot. 

2.  You  can’t  do  any  better  than  to  break  even. 

3.  You  can’t  cash  in  all  your  chips  and  get  out  of  the  game. 


EXERCISES 


Group  A 

19-1. From water to steam. The volume of one kilogram of water at 100°C is about $1 \times 10^{-3}$ m$^3$. The volume of the vapor formed when it boils at this temperature and at standard atmospheric pressure is 1.671 m$^3$.

a.  How  much  work  is  done  in  pushing  back  the 
atmosphere? 

b.  How  much  is  the  increase  in  the  internal  energy 
when  the  liquid  changes  to  vapor? 

c.  Is  this  increase  in  internal  energy  an  increase  in 
kinetic  energy,  in  potential  energy,  or  in  both? 

19-2. It works! Exactly 1 kmol of air expands isothermally and reversibly at 300 K from 2.00 atm pressure to 1.00 atm. Calculate the work done by the gas.

19-3. Joule’s experiment. In Fig. 19E-3, A contains one kilomole of an ideal gas at pressure $p$ and temperature $T$. B is an evacuated space of the same volume. When the partition between the chambers is removed, the gas rushes into B, filling both chambers. Joule performed this experiment and could detect no temperature change in either the gas or the surroundings.


[Fig. 19E-3: Two chambers labeled A and B.]


a. Is the process quasistatic or irreversible? What is the value of $\Delta H$? of $\Delta W$? of $\Delta E$?

b.  Accepting  Joule’s  result,  how  does  E for  an  ideal 
gas  depend  on  the  volume  of  the  gas? 


19-4. Adiabatic expansion. In an adiabatic process, 1.00 kmol of oxygen at 5.00 atm pressure and 300 K temperature expands so that its final pressure is 1.00 atm. What are its final volume and temperature? Consider the gas as ideal.

19-5. Work done in compressing gas. From physical considerations, show that the work done in compressing adiabatically and reversibly $n$ kilomoles of an ideal gas is $W = nC_v'(T_f - T_i)$, where $T_f$ and $T_i$ are the final and initial temperatures.

19-6.  Diesel  engine.  The  compression  ratio  in  a diesel 
engine  is  16  to  1.  If  the  initial  temperature  of  the  air  being 
compressed  is  300  K,  what  is  its  temperature  at  the  end  of 
the  compression  stroke?  Consider  air  an  ideal  gas  under 
these  conditions. 

19-7. Liquefying helium. When helium gas is liquefied, it is initially cooled by adiabatic reversible expansion from 15.0 atm pressure to 1.00 atm pressure. Calculate the final temperature if the gas was initially at a temperature of 290 K.

19-8. Adiabatic compression. An insulated cylinder contains 0.200 m$^3$ of carbon dioxide at room temperature (300 K) and atmospheric pressure ($1.01 \times 10^5$ Pa). It is compressed adiabatically until the pressure is increased to $1.00 \times 10^6$ Pa. Find the final volume and temperature of the gas.

19-9. Monatomic or diatomic? As a sample of gas is allowed to expand quasistatically and adiabatically, its pressure drops from $1.20 \times 10^5$ Pa to $1.00 \times 10^5$ Pa, and its temperature drops from 300 K to 280 K. Is the gas monatomic or diatomic?




19-10. Fill in the blanks. The p-V diagram in Fig. 19E-10 represents a reversible cycle of operations performed by an ideal gas in which MN is an isothermal and NK an adiabatic. Fill in the chart for this cycle, using + to indicate an increase in the quantity listed, − to indicate a decrease, and 0 to indicate no change.

[Fig. 19E-10: p-V diagram of the cycle KLMN.]

Path    ΔH    ΔW    ΔE    ΔT    ΔS
KL
LM
MN
NK

19-11. Carnot engine, I. The work output of a Carnot engine is $1.00 \times 10^3$ J. Its operating temperatures are $T_{\mathrm{hi}} = 400$ K and $T_{\mathrm{lo}} = 300$ K.

a.  How  much  energy  is  absorbed  by  the  working  sub- 
stance during  the  isothermal  expansion? 

b.  How  much  energy  is  rejected  to  the  low- 
temperature  heat  reservoir? 

19-12.  Carnot  engine,  II.  A Carnot  engine  removes  16 
units  of  heat  from  the  high-temperature  reservoir.  Its  ef- 
ficiency is  1/4. 

a.  How  much  heat  does  it  give  up  to  the  low- 
temperature  reservoir?  What  is  the  work  output? 

b.  If  the  temperature  of  the  heat  source  is  400  K, 
what  is  the  temperature  of  the  heat  sink? 

19-13. Heat pump. A heat pump is to be used to heat a house to an inside temperature of 20°C. Compare its ideal coefficient of performance $E_p^*$ for the cases when the outside temperature is −10°C (14°F) and when it is +10°C (50°F).

19-14. Efficient performance. Show that the relation between the Carnot efficiency $\eta^*$ and the Carnot coefficient of performance $E_p^*$ is $E_p^* = (1 - \eta^*)/\eta^*$.


Group  B 

19-15. Work done by a heat engine. Imagine the cylinder and piston arrangement in Fig. 19-1 immersed in a large tank of water at a temperature of 300 K and containing 0.200 kmol of an ideal gas.

a.  What  volume  must  the  cylinder  have  so  that  the 
piston  does  not  move  when  it  is  in  contact  with  the  open 
air  at  sea  level? 

b.  Now  let  the  same  cylinder  and  piston  be  immersed 
in  a large  water  tank  at  a temperature  of  360  K,  and  con- 
strained from  expanding  or  contracting.  What  will  the 
new  value  be  for  the  pressure  of  the  gas  inside  the  cylin- 
der? 

c.  Finally,  let  the  piston  move  slowly  until  the  pres- 
sure in  the  cylinder  is  equal  to  atmospheric  pressure.  How 
much  work  has  been  done  by  the  piston  on  the  sur- 
roundings outside  of  the  cylinder? 

d.  Show  all  changes  in  state  of  the  gas  in  the  cylinder 
on  a p-V  diagram. 

19-16.  Moles  following  a straight-line  path. 

a. $n$ kmol of a monatomic ideal gas is taken quasistatically from state A to state C along the straight-line path shown in Fig. 19E-16. For this process, calculate the work $\Delta W$ done on the gas, the increase $\Delta E$ of its internal energy, and the heat $\Delta H$ added to the gas. Express all answers in terms of $p_A$ and $V_A$.

[Fig. 19E-16: p-V diagram.]


b.  Repeat  part  a if  the  gas  is  taken  quasistatically 
from  A to  C along  the  path  ABC. 

c.  Explain  the  similarities  and  differences  between 
the  results  of  parts  a and  b. 

19-17. Compressing an ideal gas. $n$ kmol of an ideal gas is compressed quasistatically at constant temperature from an initial volume $V_A$ to a final volume $V_B < V_A$. How much work was done on the gas to compress it? How much heat was added to the gas?

19-18. Cycling helium. A sample containing 1.00 kmol of the nearly ideal gas helium is put through the cycle of operations shown in Fig. 19E-18. BC is an isothermal, and $p_A = 1.00$ atm, $V_A = 22.4$ m$^3$, $p_B = 2.00$ atm.

[Fig. 19E-18: p-V diagram of the cycle.]

a. What are $T_A$, $T_B$, and $V_C$?


b.  Calculate  the  work  output  during  the  cycle. 

c.  Show  that  this  work  is  equal  to  the  net  heat  ab- 
sorbed by  the  gas. 

d.  Calculate  the  efficiency  of  the  cycle. 

19-19. Cycling an ideal gas. In the cycle shown in Fig. 19E-19, AB and CD are isotherms. The working substance may be considered an ideal gas.

[Fig. 19E-19: p-V diagram of the cycle ABCD.]

a. For each of the four parts of the cycle, are the quantities $\Delta H$, $\Delta W$, $\Delta E$, $\Delta T$, and $\Delta S$ positive or negative according to the sign convention used in the text?

b.  When  the  working  substance  returns  to  A after 
one  complete  cycle,  are  these  five  quantities  positive  or 
negative  compared  with  their  original  value? 

19-20. Slopes of adiabatics and isothermals. In a p-V diagram an adiabatic and an isothermal curve for an ideal gas intersect. Show that the absolute value of the slope of the adiabatic is $\gamma$ times that of the isothermal. Hence the adiabatic curve is steeper because the specific heat ratio $\gamma$ is greater than 1.

19-21. Isothermal expansion. $n$ kmol of an ideal gas expands isothermally from volume $V_A$ to volume $V_B > V_A$. Find the change in entropy of the gas.

19-22.  Work  on  an  ideal  gas. 

a. Show that the work on an ideal gas in an adiabatic expansion from $V_i$ to $V_f$ is $W = (p_f V_f - p_i V_i)/(\gamma - 1)$.


b.  Referring  to  Fig.  19-16  for  the  Carnot  cycle,  show 
that  the  net  work  along  the  adiabatics  NK  and  LM  is  zero 
by  using  the  result  obtained  for  part  a. 

19-23. Isothermal expansion, work, and entropy change. One kilomole of an ideal gas expands reversibly and isothermally to twice its original volume.

a. Show that the work done on the gas is equal to $-RT \ln 2$.

b.  What  is  the  change  in  entropy  of  the  gas? 

19-24. Two ways to calculate work. A sample of ideal gas expands adiabatically and quasistatically from an initial state of pressure $p_A$ and volume $V_A$ to a final state of pressure $p_B$ and volume $V_B$. Find the work done on the gas during this process and express it in terms of $p_A$, $V_A$, $p_B$, $V_B$, and $\gamma$

a. directly from $W = -\int p\,dV$.

b.  from  the  first  law  of  thermodynamics  and  Eqs. 
(19-13)  and  (19-14). 

19-25. Compressibility. The compressibility of a substance is a measure of that substance's deformability under pressure. It is defined as

$$\kappa = -\frac{1}{V}\frac{dV}{dp}$$

The compressibility of an ideal gas depends on the conditions under which it is compressed. Show that if an ideal gas is compressed isothermally its compressibility is $1/p$, whereas if it is compressed adiabatically its compressibility is $1/(\gamma p)$.

19-26. Heat engine, I. A monatomic ideal gas is used as the working fluid of a heat engine which operates quasistatically in the elliptical cycle ABCDA shown in Fig. 19E-26. (The area of an ellipse of semimajor axis $a$ and semiminor axis $b$ is $\pi ab$.)

[Fig. 19E-26: p-V diagram of the elliptical cycle ABCDA.]


a.  Find  the  work  output  for  one  cycle  of  operation  of 
this  engine. 

b. If the input part of the cycle is taken to be the arc ABC, find the heat energy input $\Delta H_1$.

c.  Find  the  thermodynamic  efficiency  of  this  engine. 




19-27. Heat engine, II. $n$ kmol of a diatomic ideal gas is used as the working fluid of a heat engine which operates quasistatically in the cycle ABCA shown in Fig. 19E-27. Here $p_B = 2p_A$, $V_C = 3V_A$, and the process BC is a straight line on a p-V diagram.

[Fig. 19E-27: p-V diagram of the cycle ABCA, with the volume axis marked at $V_A$ and $V_C = 3V_A$.]

a.  Find  the  work  output  for  one  cycle  of  operation  of 
the  engine. 

b.  Find  the  heat  input  and  the  thermodynamic  effi- 
ciency of  the  engine. 

c. Find the change of entropy of the gas for the processes AB, BC, and CA, and show that the change of entropy for a whole cycle is zero.

19-28. It’s illegal. Suppose a Carnot engine takes in heat $\Delta H_{C,\mathrm{hi}}$ at temperature $T_{\mathrm{hi}}$ and rejects heat $-\Delta H_{C,\mathrm{lo}}$ ($\Delta H_{C,\mathrm{lo}}$ is negative) at temperature $T_{\mathrm{lo}}$. Suppose a super engine of higher efficiency operates between the same temperatures and rejects heat $-\Delta H_{S,\mathrm{lo}}$ equal to $-\Delta H_{C,\mathrm{lo}}$.

a. Show that the heat absorbed by the super engine from the high temperature source, $\Delta H_{S,\mathrm{hi}}$, is greater than $\Delta H_{C,\mathrm{hi}}$.

b. Show that $-\Delta W_S > -\Delta W_C$.

c.  Show  that  the  existence  of  such  a super  engine 
would  make  it  possible  to  drive  a freighter  across  the 
ocean  with  the  only  heat  consumed  coming  from  the 
ocean  itself. 

d.  State  the  second  law  of  thermodynamics  in  the 
form  violated  by  the  super  engine  in  this  problem. 

19-29.  Ottomobile  engine.  This  exercise  demonstrates 
the  derivation  of  Eq.  (19-70)  for  the  efficiency  of  the 
ideal  Otto  cycle. 

Let $m$ be the mass of the mixture of gasoline vapor and air and $c_v$ the specific heat capacity. Then $H_{\mathrm{in}} = mc_v(T_D - T_C)$ and $|H_{\mathrm{out}}| = mc_v(T_E - T_B)$.

a. Show that the efficiency is equal to $\eta = 1 - (T_E - T_B)/(T_D - T_C)$.

b. Show that $T_C/T_D = T_B/T_E$ and therefore that $\eta = 1 - T_B/T_C$, where $T_B$ and $T_C$ are the temperatures at the beginning and end of the compression stroke.

c. Show that $\eta = 1 - r^{1-\gamma}$.

d. Show that an ideal Otto engine is less efficient than a Carnot engine operating between the source and sink temperatures $T_{\mathrm{hi}}$ and $T_{\mathrm{lo}}$.


19-30. Heat engine, III. A reversible heat engine using a diatomic ideal gas as a working fluid operates in the following cycle: Starting at pressure $p_A$ and volume $V_A$, the gas expands isothermally until its volume has increased to $V_B = 3V_A$. It is then cooled isometrically until its pressure has dropped sufficiently (to $p_C$) that it can be compressed adiabatically back to its initial state.

a. Sketch the cycle on a p-V diagram and find the pressure $p_C$ as a multiple of $p_A$.

b.  Find  the  thermodynamic  efficiency  of  the  engine. 

c.  If  the  engine  is  run  in  reverse  as  a refrigerator, 
find  its  coefficient  of  performance. 

d.  Repeat  parts  b and  c for  a Carnot  engine  operating 
with  the  same  maximum  and  minimum  temperatures. 
Compare  your  results  with  those  of  parts  b and  c. 

Group  C 

19-31. Isothermal compression. A monatomic ideal gas is compressed from an initial volume $V_i$ to a final volume $V_f$. During the compression, there is a transfer of heat which maintains the temperature of the gas at its initial value so that $T_f = T_i$. The sample contains $n$ kmol of the gas.

a. Find the initial and final pressures $p_i$ and $p_f$, in terms of $n$, $V_i$, $T_i$, and $V_f$.

b. What is the pressure $p(V)$ of the gas during this isothermal compression?

c. Find the work $\Delta W$ done on the gas during the compression. Express your result in terms of $n$, $T_i$, and the compression ratio $V_i/V_f$.

d. Find the heat $\Delta H$ added to the gas during the compression.

e. Compare the work found in part c to the constant total internal energy for the following values of $V_f$:

(i) $V_f = V_i/2$

(ii) $V_f = V_i/3$

(iii) $V_f = V_i/5$

19-32.  Van  der  Waals  equation  of  state. 

a. Derive an expression for the work done on a system undergoing isothermal compression (or expansion) from volume $V_1$ to $V_2$ for a gas which obeys the van der Waals equation of state, $(p + an^2/V^2)(V - bn) = nRT$.

b. The numerical values of $a$ and $b$ in SI units for hydrogen gas (H$_2$) are approximately 24,800 and 0.0266 respectively, and for oxygen (O$_2$) they are 138,000 and 0.0318. How large is the correction introduced by using the van der Waals equation of state when calculating the work required to compress 0.100 kmol of H$_2$ and of O$_2$ from 10.0 m$^3$ to 0.100 m$^3$ at temperatures of 300 K and 600 K?

19-33.  Specific  heat  ratio  for  an  ideal  gas  mixture. 

a. Ideal gas 1 is chemically pure (that is, homogeneous, or having only one type of molecule). It has a specific heat ratio $\gamma_1$. Ideal gas 2 is also pure; its specific heat ratio is $\gamma_2$. Suppose that a sample of an ideal gas consists of $f_1 n$ kmol of gas 1 mixed with $f_2 n$ kmol of gas 2, with $f_1 + f_2 = 1$. Then $f_1$ and $f_2$ are the fractional abundances (by number) of gases 1 and 2, and the mixture contains a total of $n$ kmol. Show that under adiabatic changes, the gas mixture obeys an equation of the form $pV^{\gamma} = \text{constant}$. Obtain an expression for $\gamma$ in terms of $\gamma_1$, $\gamma_2$, $f_1$, and $f_2$.

b. Generalize the result of part a to obtain an expression for the specific heat ratio $\gamma$ of a mixture of $K$ different pure gases $k$, with specific heat ratios $\gamma_k$ and fractional abundances $f_k$, for $k = 1, 2, \ldots, K$.

19-34. Hang gliders will get a rise out of this. In the main, the sun’s rays are not absorbed by the transparent atmosphere but are absorbed by the opaque ground. This heats the air in contact with the ground, which as a result becomes less dense than neighboring unheated air. The buoyant force makes the heated air rise. At higher altitudes, the pressure on this rising air decreases so that the rising air continues to expand. The expansion is essentially adiabatic because of the poor thermal conductivity of air. Assume that the air is dry so that no condensation occurs.

a. Show that the temperature gradient $dT/dh$, where $T$ is temperature and $h$ is height, is constant and equal to $-Mg(\gamma - 1)/(\gamma R)$, where $M$ is the mass of one kilomole of air.

b.  What  is  the  numerical  value  of  the  temperature 
gradient?  For  air,  M equals  28.8  kg/kmol. 

c.  What  would  be  the  finite  height  of  such  an  atmo- 
sphere if  the  ground  temperature  were  17°C? 

19-35. Height of the atmosphere. Exercise 19-34 explains why, to a first approximation, the earth’s atmosphere can be considered to behave adiabatically. Show that such an atmosphere has a finite height

$$h_m = \frac{\gamma}{\gamma - 1}\,\frac{p_0}{\rho_0 g}$$

and calculate $h_m$ if the sea-level density $\rho_0$ were 1.21 kg/m$^3$, corresponding to an average sea-level temperature of 17°C with $p_0$ equal to one standard atmosphere.

In the actual atmosphere, the rise of air due to its heating by the warm earth surface continues only to about 11 km. The region above this height is called the stratosphere.

19-36.  Adiabatic  pressurizations. 

a. A chamber contains $n$ kmol of an ideal gas whose specific heat ratio is $\gamma$. The initial volume, temperature, and pressure are $V_i$, $T_i$, and $p_i$, respectively. The gas is now adiabatically compressed until the pressure is $p_f$. Find the final volume $V_f$ and the final temperature $T_f$ in terms of $V_i$, $T_i$, $p_i$, $p_f$, and $\gamma$.

b. Obtain numerical values for the ratios $V_f/V_i$ and $T_f/T_i$ in the case of pure helium gas ($\gamma = \tfrac{5}{3}$) for a pressure ratio $p_f/p_i = 2$.

c. Obtain numerical values as in part b for pure nitrogen gas ($\gamma = \tfrac{7}{5}$).

d. Suppose the chamber described in part a contains a mixture of $n/2$ kmol of helium and $n/2$ kmol of nitrogen. Use the result of Exercise 19-33 to find the specific heat ratio for the mixture.

e. Obtain numerical values for $V_f/V_i$ and $T_f/T_i$ if the helium-nitrogen mixture described in part d is compressed to $p_f/p_i = 2$. Compare your results with the corresponding results in parts b and c.

19-37. Entropy of an ideal gas. Consider a sample consisting of $n$ kmol of an ideal gas with specific heat ratio $\gamma$. Let $S(T, V)$ represent its entropy; denote $S(T_0, V_0)$ by $S_0$.

a. Show that $S(T, V_0) = S_0 + [nR/(\gamma - 1)] \ln (T/T_0)$.

b. Show that $S(T_0, V) = S_0 + nR \ln (V/V_0)$.

c. Combine the results of parts a and b to show that

$$S(T, V) - S_0 = nR \left[ \frac{1}{\gamma - 1} \ln \frac{T}{T_0} + \ln \frac{V}{V_0} \right]$$

Justify your procedure carefully.

d.  Which  causes  a greater  entropy  increase  for  a 
sample  of  a monatomic  ideal  gas,  an  isothermal  doubling 
of  the  volume  or  an  isometric  doubling  of  the  absolute 
temperature? 

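19-38. Kelvin’s thermodynamic temperatures. The efficiency of a Carnot cycle does not depend on the nature of the working substance. (See Sec. 19-5.) That is,

$$\frac{\Delta H_{\mathrm{in}} - |\Delta H_{\mathrm{out}}|}{\Delta H_{\mathrm{in}}} = f(T_{\mathrm{hi}}, T_{\mathrm{lo}})$$

or

$$\frac{|\Delta H_{\mathrm{out}}|}{\Delta H_{\mathrm{in}}} = 1 - f(T_{\mathrm{hi}}, T_{\mathrm{lo}})$$

or

$$\frac{\Delta H_{\mathrm{in}}}{|\Delta H_{\mathrm{out}}|} = \frac{1}{1 - f(T_{\mathrm{hi}}, T_{\mathrm{lo}})} = F(T_{\mathrm{hi}}, T_{\mathrm{lo}})$$

In 1848, William Thomson, later Lord Kelvin, proposed that $F(T_{\mathrm{hi}}, T_{\mathrm{lo}})$ be taken equal to $\theta_{\mathrm{hi}}/\theta_{\mathrm{lo}}$. The quantities $\theta$ are called thermodynamic temperatures and would be independent of any particular substance. The thermodynamic temperatures $\theta$ can be shown to be the same as the temperatures $T$ used in the ideal gas relation $pV = nRT$.

a. Referring to Fig. 19-16, for a Carnot cycle using an ideal gas, calculate $\Delta H_{\mathrm{in}}$ for one kilomole of the gas in terms of $V_K$, $V_L$, and $T_{\mathrm{hi}}$.

b. Calculate $|\Delta H_{\mathrm{out}}|$ in terms of $V_N$, $V_M$, and $T_{\mathrm{lo}}$.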

c. With the aid of Eq. (19-28a), prove that $V_L/V_K = V_M/V_N$.

d. Show that $\Delta H_{\mathrm{in}}/|\Delta H_{\mathrm{out}}| = T_{\mathrm{hi}}/T_{\mathrm{lo}} = \theta_{\mathrm{hi}}/\theta_{\mathrm{lo}}$ so that the ideal gas temperatures are the same as Kelvin’s thermodynamic temperatures.

19-39. Cooling down. A new electric refrigerator has just been installed in a home, and its temperature is initially $T_{\mathrm{hi}}$, the temperature of the surroundings. The refrigerator’s interior and contents have a total heat capacity $C$ which does not vary with temperature in the range $T_{\mathrm{lo}} \leq T \leq T_{\mathrm{hi}}$. The refrigerator is turned on, and the interior temperature drops slowly and continuously until the desired value $T_{\mathrm{lo}}$ is reached.

a. Assuming that the internal temperature changes very little in each cycle of the refrigerant, show that the minimum electrical energy $W_{\min}$ required to complete the cooling process is given by

$$W_{\min} = C\,T_{\mathrm{hi}} \left[ \ln \left( \frac{T_{\mathrm{hi}}}{T_{\mathrm{lo}}} \right) - \left( 1 - \frac{T_{\mathrm{lo}}}{T_{\mathrm{hi}}} \right) \right]$$

b. Evaluate the ratio $W_{\min}/[C(T_{\mathrm{hi}} - T_{\mathrm{lo}})]$ for the case $T_{\mathrm{hi}} = 30$°C (303 K) and $T_{\mathrm{lo}} = 3$°C.


19-40.  Pumping  heat. 

a.  Find  the  coefficient  of  performance  of  a reversible 
heat  pump  which  is  being  used  to  maintain  a temperature 
of  17°C  (290  K)  in  an  enclosure  which  is  surrounded  by  air 
at  - 18°C  (255  K). 

b.  Suppose  that  air  from  a cave  is  available  for  use  as 
the  low-temperature  reservoir.  If  the  temperature  of  that 
air  is  8°C,  what  is  the  coefficient  of  performance  of  a 
reversible  heat  pump  being  used  to  maintain  a 17°C  inte- 
rior temperature? 

c.  What  percentage  savings  in  work  input  is  obtained 
by  utilizing  a cave-air  reservoir?  Hint:  The  rate  of  heat 
loss  from  the  enclosure  to  the  exterior  does  not  depend 
upon  what  is  being  used  as  the  low-temperature  reservoir 
in  the  heat-pump  cycle. 

19-41. Revived interest in old ideas. Spurred on by the search for more efficient uses of available fuels, interest has been revived in the improvement of the Stirling engine, a hot air engine invented by the Scottish engineer Robert Stirling. Operation of this engine is as follows. First air is compressed isothermally at an initial temperature $T_0$ from an initial pressure and volume $p_0$ and $V_0$ to new values $p_1$ and $V_1$. This isothermal compression is then followed by isometric heating to a new temperature $T_1$. Next the air is expanded, again isothermally, back to its initial volume $V_0$. Finally, the gas is cooled isometrically to the original temperature $T_0$.

a.  Derive  an  expression  for  the  efficiency  of  the 
Stirling  engine  in  terms  of  the  lower  and  upper  tempera- 
tures of  the  cycle  and  the  volumes  for  the  case  of  an  ideal 
monatomic  gas  as  the  working  fluid. 

b.  Draw  the  process  on  a p-V  diagram. 

c. On the same diagram indicate the departure from the ideal case for a gas which follows the van der Waals equation of state (see Exercise 19-32). How does this departure from the ideal-gas state affect the efficiency of the engine?

19-42. Entropy change recalculated microscopically. One kilomole of an ideal gas expands isothermally from volume $V$ to $2V$. Since the temperature is constant, there is no change in the energy, which is all kinetic. However, the spatial distribution of the molecules is different. At the end, half of the molecules are in $V$ on the left of the container. Originally all were in this $V$.

a. Show that the number $w$ of microstates which result from the possibility of a molecule being in the left or right half is

$$w = \frac{N!}{\left[ (N/2)! \right]^2}$$


b. Stirling’s formula for approximating $N!$ ($N$-factorial) when $N$ is large is $N! \approx N^N/e^N$. Evaluate $w$ using this valid approximation and use your result to calculate the entropy of the gas.

c.  How  many  microstates  are  there  in  the  macrostate 
in  which  all  the  molecules  are  in  the  left  half?  Calculate 
the  change  in  entropy  and  compare  the  result  with  that 
obtained  in  Exercise  19-23,  where  the  entropy  change  was 
calculated  macroscopically. 

19-43. Energy, specific heat, and entropy of a two-state system. A system has only two microstates, with energies $\epsilon_1 = 0$ and $\epsilon_2 = \epsilon$. The results of Chap. 18 can be used to show that the average energy $E(T)$ for this system is given by

$$E(T) = \frac{\epsilon\, e^{-\epsilon/kT}}{1 + e^{-\epsilon/kT}}$$

a. Find $C(T) = dE/dT$. Show that $C(T) \to 0$ as $T \to 0$ and as $T \to \infty$.

b. As $T \to 0$, the probability that the system is in the lower state approaches unity, so the entropy $S \to 0$ as $T \to 0$. What value should the entropy approach in the high-temperature limit ($kT \gg \epsilon$)?

c. Write an expression for $S(T)$ in terms of an integral over $dT'$ from 0 to $T$ of some quantity.
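A reader who has worked Exercise 19-43 may wish to check the limits numerically. The sketch below is ours, not the text's. It works in reduced units, with temperature measured as $t = kT/\epsilon$ and both heat capacity and entropy in units of $k$. It uses the heat capacity obtained by differentiating the $E(T)$ given in the exercise, and it assumes the entropy can be built up as the integral of $C/T'$ from a very low temperature, where the contribution is negligible. The function names are illustrative.

```python
import math


def heat_capacity_over_k(t):
    """C/k for the two-state system at reduced temperature t = kT/eps.

    Differentiating E = eps*exp(-x)/(1 + exp(-x)), with x = eps/(kT) = 1/t,
    gives C/k = x**2 * exp(-x) / (1 + exp(-x))**2.
    """
    x = 1.0 / t
    boltzmann = math.exp(-x)
    return x * x * boltzmann / (1.0 + boltzmann) ** 2


def entropy_over_k(t, t_min=1e-2, steps=20_000):
    """S/k as the integral of (C/k)/t' dt' from t_min up to t.

    Uses a midpoint rule in u = ln t'. Below t_min the integrand is of
    order exp(-1/t_min), so that part of the range is neglected.
    """
    u_lo, u_hi = math.log(t_min), math.log(t)
    du = (u_hi - u_lo) / steps
    total = 0.0
    for n in range(steps):
        t_mid = math.exp(u_lo + (n + 0.5) * du)
        total += heat_capacity_over_k(t_mid) * du
    return total


for t in (0.05, 0.5, 5.0, 50.0):
    print(t, heat_capacity_over_k(t), entropy_over_k(t))
print("ln 2 =", math.log(2))
```

The printed heat capacity is small at both ends of the temperature range and largest in between, and the entropy rises from nearly zero toward $\ln 2$ (in units of $k$), the value expected for two equally probable microstates.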




20

The Electric Force and the Electric Field


20-1 THE ELECTROMAGNETIC FORCE

We now begin a sequence of eight chapters that explore the properties and consequences of the electromagnetic force. This force is one of the four fundamental forces of nature. The other three are the gravitational force, the strong nuclear force, and the weak nuclear force. These four forces are said to be fundamental because every type of interaction between two objects of any species can be attributed to one or more of them.

The earlier chapters of this book abound with examples of the familiar gravitational force, the force observed in the attractive interaction between every pair of objects of macroscopic size, such as the earth and Newton’s falling apple or the earth and the moon. The two nuclear forces operate only between those parts of two objects whose separation is smaller than the radius of a typical atomic nucleus. Hence on the macroscopic scale of our everyday lives the existence of the nuclear forces is not directly evident in the way that the existence of the gravitational force is evident. Nevertheless, the nuclear forces have a key role in the operation of the universe. For instance, they are responsible for the processes which make the sun luminous.

The  electromagnetic  force  is  less  familiar  than  the  gravitational  force  (but 
much  more  familiar  than  the  nuclear  forces).  Yet  consequences  of  the  electromag- 
netic force  are  very  common.  It  is  likely  that  you  are  now  supported  against  the 
downward-acting  gravitational  force  that  the  earth  exerts  on  you  by  an  upward- 
acting  contact  force  exerted  on  you  by  a chair.  And  you  do  not  slide  off  the  seat  be- 
cause of  the  forces  of  contact  friction.  Both  of  these  familiar  macroscopic  forces  are 
consequences  of  electromagnetic  forces  operating  at  the  microscopic  level 
between  many  atoms  in  the  two  surfaces  in  contact. 

Two  bodies  exert  electromagnetic  forces  on  each  other  if  both  have  the 
attribute  known  as  electric  charge.  There  are  two  types  of  charge,  called 
negative and positive. An electron has a specific amount of negative charge,
and a proton has the same amount of positive charge. The nucleus of an
atom  contains,  in  addition  to  uncharged  neutrons,  a certain  number  of 
protons  (the  number  determines  the  type  of  atom).  When  the  atom  is  in  its 
normal  state,  there  are  as  many  electrons  surrounding  the  nucleus  as  there 
are  protons  in  the  nucleus.  In  this  situation,  it  is  said  that  the  atom  has  no 
net  electric  charge,  or  that  it  is  electrically  neutral.  All  other  circumstances 
being  the  same,  the  force  exerted  by  a certain  negative  charge  on  some 
other  charged  object  is  equal  in  magnitude  but  opposite  in  direction  to  the 
force  exerted  by  an  equal  positive  charge  on  the  object.  Now  the  circum- 
stances for  an  electron  and  a proton  in  an  atom  are  not  the  same  (for  in- 
stance, their  positions  and  velocities  differ).  Nevertheless,  in  a neutral  atom 
most  of,  or  all,  the  electromagnetic  forces  exerted  by  its  negatively  charged 
electrons  on  some  charged  object  external  to  the  atom  are  canceled  by  the 
electromagnetic  forces  exerted  by  its  equal  number  of  positively  charged 
protons  on  that  object.  As  a consequence,  direct  evidence  of  electromag- 
netic forces  exerted  by  two  separated  objects  on  each  other  may  be  difficult 
to  obtain  if  one  or  both  are  composed  entirely  of  neutral  atoms.  For  such 
evidence  usually  it  is  necessary  that  the  electrical  neutrality  of  both  objects  be 
disturbed  by  removing  or  adding  electrons.  A way  to  do  this  is  described 
soon. 

However,  if  two  neutral  objects  are  brought  into  intimate  contact,  then  elec- 
tromagnetic forces  of  appreciable  strength  develop  between  the  atoms  of  the  two 
in  the  region  of  contact.  This  occurs  because  the  electrons  in  the  atoms  can  get 
much  closer  to  each  other  than  can  the  nuclei.  Resulting  from  this  are  the  contact 
force  and  the  contact  friction  force,  mentioned  above. 

In  the  simplest  situation  in  which  two  objects  exert  electromagnetic 
forces  on  each  other,  both  have  a net  charge  because  in  each  the  total 
number  of  electrons  differs  from  the  total  number  of  protons.  Such  a situa- 
tion is  the  one  we  consider  in  this  chapter  and  the  next.  Furthermore,  in 
these  two  chapters  we  make  the  simplification  of  restricting  our  attention  to 
cases  in  which  one  of  these  objects  is  at  rest  with  respect  to  the  person  who  is 
observing  the  interaction  between  the  two.  (The  other  object  may  also  be  at 
rest,  or  it  may  be  moving.)  In  such  cases  each  of  the  electromagnetic  forces 
exerted  by  one  object  on  the  other  has  a form  that  we  call  the  electric  force. 
(Later we turn our attention to the so-called magnetic force. Electromagnetic
forces of this form are exerted by charged objects on each other when
both  are  moving  with  respect  to  the  observer.  Magnetic  forces  are  exerted 
also  when  there  is  a charged  object  that  is  moving  with  respect  to  an  ob- 
server and  an  electrically  neutral  metal  wire  through  which  electrons  move 
with  respect  to  the  observer  to  produce  an  electric  current.)  Our  discussion 
of  the  electric  force  will  lead  us  directly  to  the  closely  related  concept  of  the 
electric field. The properties of the electric force and of the electric field
constitute the principal topic of this chapter.

We  will  soon  see  that  in  some  regards  there  is  a great  similarity 
between  the  electric  force  and  the  gravitational  force.  But  there  are  also 
great  differences.  Perhaps  the  most  striking  is  this:  The  gravitational  forces 
exerted  by  two  bodies  on  each  other  are  always  attractive,  whereas  the  electric  forces 
exerted  by  two  bodies  on  each  other  are  sometimes  attractive  and  sometimes  repulsive. 
Whether  the  electric  force  is  attractive  or  repulsive  depends  on  the  relative 
signs  of  the  net  charges  in  the  two  bodies.  Specifically,  if  both  bodies  contain 
more electrons than protons, so that both have a net negative charge, then
each body exerts a repulsive electric force on the other. And if for both
bodies the number of protons exceeds the number of electrons, so that both
have a net positive charge, then, too, the force that each exerts on the other
is repulsive. But if the net charge of one body is negative and that of the
other is positive, then each exerts an attractive electric force on the other.
Succinctly put, like charges repel and unlike charges attract.

Fig. 20-1 An analogy to the emission and subsequent reabsorption of a
photon by an electron. The black dot represents a person throwing a
boomerang whose trajectory is shown by the directed curve.

Fig. 20-2 When two people exchange boomerangs as shown, the effect is the
same as if they exerted attractive forces on each other.

Fig. 20-3 The effect of two people exchanging boomerangs in the manner
depicted is the same as if each exerted a repulsive force on the other.

The  theory  of  quantum  electrodynamics  provides  a very  detailed  explanation 
of  what  happens  when  repulsive  electric  forces  operate  between  bodies  of  like  net 
charge  or  when  attractive  electric  forces  operate  between  bodies  of  unlike  net 
charge.  (Indeed,  it  is  the  most  precise  theory  in  physics  today.)  The  level  of  the 
theory  is  far  above  that  of  this  book.  But  an  analogy  can  be  presented  which, 
although  very  crude,  still  conveys  well  the  flavor  of  the  explanation.  In  quantum 
electrodynamics,  an  isolated  charged  particle  (like  a single  electron  or  a single 
proton)  is  continually  emitting  and  then  reabsorbing  a small  bundle  of  electro- 
magnetic radiation  called  a “photon.”  The  situation  is  quite  analogous  to  a person 
standing  on  the  slippery  surface  of  a frozen  lake  and  passing  time  by  throwing  out 
a boomerang  and  then  catching  it  as  it  comes  back.  When  the  boomerang  is 
thrown to the left, as in Fig. 20-1a, conservation of momentum requires that the
person  start  moving  to  the  right.  But  this  motion  is  soon  stopped  by  the  mo- 
mentum delivered  to  the  person  by  the  boomerang,  which  is  caught  when  it  is 
again  moving  to  the  left.  The  small  displacement  of  the  person  to  the  right  while 
the  boomerang  is  in  flight  is  canceled  by  the  oppositely  directed  displacement  that 
occurs in the next throw, which is illustrated in Fig. 20-1b. The process continues,
with  the  equivalent  properties  of  space  in  all  directions  making  all  directions  of 
throw  equally  likely. 

Now  imagine  placing  in  the  vicinity  of  the  first  person  a second  who  is  doing 
the  same  thing.  Assume  they  are  of  the  opposite  sex  so  that  when  they  notice  each 
other,  they  obey  the  tendency  of  opposite  sexes  to  attract.  The  way  they  do  this  is 
illustrated in Fig. 20-2a or b. Since space is no longer equivalent in all directions,
they  are  not  constrained  to  throw  their  boomerangs  in  all  directions  with  equal 
likelihood.  Instead  each  throws  a boomerang  in  the  direction  away  from  the  other, 
and  each  catches  the  other’s  boomerang  when  it  is  “on  its  way  back.”  If  you  ana- 
lyze the  momentum  transfers,  you  see  immediately  that  all  of  them  are  in  such  a 
direction  as  to  cause  the  two  persons  to  move  together  after  the  exchange  of  boo- 
merangs. It  is  as  if  attractive  forces  acted  between  them. 

Next  assume  the  two  persons  are  of  the  same  sex  and  feel  the  territorial  imper- 
ative to  keep  apart.  They  do  so  in  the  manner  illustrated  in  Fig.  20-3a  or  b.  Each 
throws  a boomerang  toward  the  other  and  catches  the  other’s  boomerang  “on  the 
way  out.”  Now  the  momentum  transfers  cause  them  to  move  apart.  In  effect,  a re- 
pulsion results  from  the  exchange. 

It  should  be  pointed  out  that  this  analogy  suffers  from  the  fundamental  defect 
that  it  depends  on  the  presence  of  air  in  the  region  occupied  by  the  two  persons, 
since  a boomerang  thrown  in  vacuum  will  travel  away,  never  to  return.  In  con- 
trast, electric  forces  are  exerted  between  two  charged  particles  in  vacuum.  A photon 
is  able  to  do  what  a boomerang  cannot  do. 


20-2 ELECTRIC CHARGE AND COULOMB’S LAW

Electrical phenomena have been known since time immemorial. Written
accounts of the way that certain pairs of substances, having been rubbed
together, will attract each other as well as attract small bits of substances
like paper stretch back as far as classical Greece. Before the modern era, amber
was one of the best available materials for demonstrating this phenomenon.
The attraction was therefore called electric after the Greek word elektron,
meaning amber.




Fig.  20-4  (a)  A metal  sphere  that  has 
been  given  a negative  charge  by  adding 
electrons  to  it.  ( b ) A metal  sphere  that 
has  been  given  a positive  charge  by 
removing  electrons  from  it.  In  either 
case,  whatever  charge  is  given  to  the 
conducting  body  ends  distributed  over 
its  surface.  The  spherical  shape  of  the 
body  causes  the  distribution  to  be  uni- 
form. That  is,  there  is  the  same  amount 
of  charge  on  each  equal  area  of  the  sur- 
face because  each  of  these  areas  is 
equivalent  for  a sphere.  Whether  the 
charges  given  to  the  conducting  body 
are  negative  or  positive,  they  distribute 
themselves  so  as  to  maximize  the 
spacings  between  near  neighbors.  In 
other  words,  when  there  is  an  excess  of 
electrons,  each  of  the  extra  electrons 
acts  like  a particle  under  the  influence 
of  repulsive  forces  exerted  on  it  by  all 
other  such  particles.  And  when  there  is 
a deficiency  of  electrons,  each  of  the 
missing  electrons  acts  like  a particle 
under  the  influence  of  repulsive  forces 
exerted  on  it  by  all  other  such  particles. 


It was not until the 1730s that it was clearly demonstrated that the
electric force could be either attractive or repulsive. This was first done by the
French investigator Charles Du Fay. Two objects made of the same
material (for example, two glass rods) can be charged electrically by brisk
rubbing with an object made of some other material (for example, a silk cloth).
When  this  is  done,  the  glass  rods  repel  each  other  and  attract  the  silk  cloth. 

Around  1750,  the  American  statesman-scientist  Benjamin  Franklin 
introduced  the  convention  that  when  a glass  rod  is  rubbed  with  a silk  cloth, 
the  glass  receives  a positive  charge  and  the  silk  receives  a negative  charge. 

We  now  know  that  when  a glass  rod  and  a silk  cloth  are  rubbed 
together,  electrons  are  transferred  across  the  surfaces  in  contact  from  the 
glass to the silk. Electric charge is not created in the process, just transferred.
As  a result,  the  silk  ends  up  with  more  electrons  than  protons  and  the  glass 
with more protons than electrons. For consistency with Franklin’s convention
that the net charge of the silk is negative and that of the glass is positive,
we must say that an electron has a negative charge and a proton has a positive
charge. This is the reason for the signs used to describe the different
types  of  charges  of  the  two  particles. 

By  the  middle  of  the  eighteenth  century,  it  had  become  clear  that  al- 
most all  familiar  solid  materials  can  be  divided  fairly  unambiguously  into  two 
classes. Materials in the first class are most commonly called insulators, and
those  in  the  second  class  are  called  conductors.  Glass  and  nearly  all  plastics 
are  insulators,  whereas  all  metals  are  conductors.  The  distinction  has  to  do 
with  the  mobility  of  electric  charge  placed  on  a material.  If  charges  are 
added  to  an  ideal  insulator,  they  remain  just  where  they  are  placed  initially 
because  charges  cannot  move  through  an  insulator  or  along  its  surface.  But  if 
charges  are  added  to  a conductor,  they  are  free  to  move. 

In  normal  circumstances  any  material  is  electrically  neutral  because  it 
contains  the  same  number  of  electrons  as  protons.  When  an  object  is  given 
a negative  charge — or,  as  it  is  said,  charged  negatively — electrons  are 
added  to  the  object.  The  object  is  given  positive  charge — that  is,  charged 
positively  — in  almost  all  cases  not  by  adding  protons  to  it  but  by  removing 
electrons  from  it.  The  reason  is  simply  that  it  is  easy  to  remove  electrons 
from  an  object  but  very  difficult  to  add  the  atomic  nuclei  in  which  protons 
are  found.  When  electrons  are  removed  from  the  object,  it  then  has  more 
protons  than  electrons  and  so  has  a positive  charge.  Since  in  effect  a defi- 
ciency of  negative  charge  is  equivalent  to  an  excess  of  positive  charge,  we 
can  speak,  and  think,  in  terms  of  positive  charge  being  added  to  a body, 
even  though  what  really  happens  is  that  negative  charge  has  been  removed 
from  it. 

When  a certain  region  of  a solid  body  is  charged  negatively  (by  adding 
electrons)  or  charged  positively  (by  removing  electrons),  what  happens 
next  depends  on  whether  the  body  is  an  insulator  or  a conductor.  If  the 
body  is  an  insulator,  the  charge  stays  where  it  is  placed  initially.  But  if  the 
body  is  a conductor,  any  charge  given  to  it  at  any  point  is  free  to  move 
through  the  body.  Within  a very  short  time  this  charge  moves  to  the  surface 
of  the  body,  if  it  is  not  initially  there,  and  spreads  over  the  surface.  After 
the charge given to the body has spread over its surface, the motion ceases.
The  charge  distribution  that  results  is  illustrated  in  Fig.  20-4  for  two  simple 
cases.  Whether  a conducting  body  is  charged  negatively  or  charged  posi- 
tively, the  charges  given  to  the  body  act  as  if  they  move  only  under  the 
influence of the repulsive forces they exert on each other because they are
all either negative or positive. That is, they spread over the surface of the
body. The interior of the conducting body remains electrically neutral.
Thus for our present purposes everything about a conductor can be
ignored except that it has a surface which determines where any charge
placed on it will reside.

The  key  factor  in  the  behavior  of  a conductor  is  that  its  interior  remains  elec- 
trically neutral,  no  matter  whether  it  is  charged  negatively,  charged  positively,  or 
uncharged.  A formal  proof  of  this  is  given  in  Sec.  20-6,  but  the  reason  why  it 
happens  can  be  explained  in  simple  terms  here.  In  its  normal  state  any  material  is 
electrically  neutral  overall,  since  there  is  as  much  positive  charge  in  the  nuclei  of 
its  atoms  as  there  is  negative  charge  in  the  atomic  electrons.  When  the  mate- 
rial is  a conductor,  some  of  these  electrons,  called  the  conduction  electrons,  are 
not  constrained  to  each  remain  in  a particular  atom.  Rather  the  conduction  elec- 
trons are  free  to  move  through  the  conductor.  The  conduction  electrons  form 
something  like  a cloud  of  negative  charge.  No  matter  what  the  state  of  charge  of 
the  conductor  overall,  in  equilibrium  the  cloud  of  conduction  electrons  must  be 
distributed  through  the  interior  so  that  any  small  interior  region  is  electrically 
neutral.  That  is,  the  total  number  of  electrons  in  the  region  equals  the  total  number 
of  protons.  To  see  this,  consider  what  would  happen  if  in  some  such  region  the 
cloud  of  conduction  electrons  were  not  dense  enough,  so  that  the  total  number  of 
electrons  was  smaller  than  the  total  number  of  protons  and  the  region  had  a net 
positive  charge.  Then  the  region  would  exert  attractive  forces  on  the  negatively 
charged  conduction  electrons  in  the  cloud  adjacent  to  it.  Some  of  these  electrons 
would  move  into  the  region.  The  motion  would  cease  only  when  the  region  be- 
came electrically  neutral.  The  opposite  would  happen  if  the  region  had  a net  nega- 
tive charge.  Thus  when  there  is  no  general  motion  of  conduction  electrons  in  a 
conductor,  every  part  of  the  interior  of  the  conductor  must  then  be  electrically 
neutral. 

The  argument  above  does  not  apply  at  the  surface  of  a conductor.  This  is  be- 
cause the  conditions  at  the  surface  of  a body  are  quite  different  from  the  condi- 
tions in  its  interior.  The  different  conditions  give  rise  to  electric  forces  acting  at 
the surface which have different (and more complicated) properties than those
acting  in  the  interior.  Hence  we  cannot  argue  that  the  surface  of  a conductor  re- 
mains electrically  neutral  in  all  circumstances.  In  fact,  it  certainly  does  not  do  so. 
Since  in  all  circumstances  the  interior  of  a conductor  is  electrically  neutral  when 
there  is  no  general  motion  of  charge,  any  charge  that  has  been  given  to  the  con- 
ductor must  then  reside  on  its  surface.  Where  else  can  this  charge  be? 

Most of the qualitative macroscopic properties of the electric force,
and  of  the  electrical  behavior  of  common  materials,  had  been  thoroughly 
investigated  and  applied  over  a wide  variety  of  circumstances  by  the  late 
eighteenth  century.  There  was  continual  improvement  in  the  apparatus 
used  to  produce  transfer  of  electric  charge  between  two  objects — that  is,  to 
produce  a negative  charge  on  one  and  an  equal  positive  charge  on  the 
other.  (In  subsequent  chapters  we  consider  modern  techniques,  such  as  the 
use  of  a battery.) 

As experience with electrical phenomena accumulated, it became evident
that significant further progress depended on determining quantitatively
the basic properties of the electric forces that two charged bodies
exert  on  each  other.  The  two  questions  which  required  answers  were: 

1.  How  does  the  electric  force  vary  with  the  distance  between  two  elec- 
trically charged  bodies? 

2.  How  does  the  force  vary  with  the  amount  of  electric  charge  on  each 
of  the  two  bodies? 




We  approach  these  questions  from  the  point  of  view  of  the  people  who 
first attacked and answered them. From such a point of view, it seems
reasonable as a beginning to look for analogies with Newton’s law of
gravitation, which, like the electric force, acts on pairs of bodies not in actual
contact with each other. According to Eq. (11-6) the gravitational law is given
by

$$F_{kj} = G \, \frac{m_j m_k}{r_{jk}^2}$$

The  magnitude  of  the  gravitational  force  Fkj  exerted  on  body  k of  mass  mk 
by body j of mass mj is inversely proportional to the square of the distance
rjk  from  body  j to  body  k,  and  G is  the  experimentally  determined  propor- 
tionality constant.  (The  bodies  must  be  small  compared  to  the  distance 
between  them  if  this  equation  is  to  have  a clear  meaning.)  Daniel  Bernoulli 
seems  to  have  been  the  first  to  suggest,  in  1760,  applying  the  inverse-square 
law,  by  analogy,  to  the  electric  force.  But  analogy  is  not  the  same  as  experi- 
ment. An  experiment  is  needed  to  see  whether  the  inverse-square  law  ap- 
plies to  the  electric  force  or,  if  not,  what  the  actual  functional  relationship  is 
between  the  force  and  the  distance  between  the  charged  bodies. 

The  second  question,  concerning  the  dependence  of  the  force  on  the 
quantity  of  charge  on  the  bodies,  is  a knottier  one.  How  can  we  determine 
this  dependence  if  we  have  not  devised  a way  to  measure  the  quantity  of 
charge?  The  answer  is  best  discussed  in  the  framework  of  the  experiment 
itself.  For  the  moment,  we  can  guess  that  there  exists  a physical  quantity 
which  we  will  call  electric  charge  q.  It  is  analogous  to  mass  in  the  sense  that  all 
material  objects  possess  more  or  less  of  it.  Only  experiment  can  indicate 
whether  it  enters  into  the  electric  force  law  in  the  same  way  that  mass  enters 
Newton’s  law  for  the  force  of  gravity. 

We  can  reformulate  the  two  questions  in  terms  of  the  hoped-for  anal- 
ogy with  Newton’s  law  of  gravitation  by  asking  a single  question:  Is  the  elec- 
tric force  law  of  the  form 

$$F_{kj} = (\text{constant}) \, \frac{|q_j| \, |q_k|}{r_{jk}^2} \tag{20-1}$$

In this equation, as in the equation expressing Newton’s law of gravitation,
Fkj is the magnitude of the force exerted on charged body k by the charge
on body j, |qj| and |qk| are the magnitudes of the charges on the two bodies,
and rjk is the distance from j to k.
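
The trial form of Eq. (20-1) can be explored numerically before any experiment is described. The short Python sketch below is only an illustration. The function name trial_electric_force and the choice of setting the unspecified constant to 1 (arbitrary units) are assumptions made here, not part of the text.

```python
def trial_electric_force(q_j, q_k, r_jk, constant=1.0):
    """Magnitude of the force in the trial law of Eq. (20-1).

    The value of `constant` is not fixed by the text at this point,
    so it defaults to 1 (arbitrary units) purely for illustration.
    """
    if r_jk <= 0:
        raise ValueError("separation must be positive")
    return constant * abs(q_j) * abs(q_k) / r_jk**2


# If the trial law holds, doubling the separation cuts the force to
# one quarter, and doubling one charge doubles the force.
f_near = trial_electric_force(2.0, -3.0, 1.0)
f_far = trial_electric_force(2.0, -3.0, 2.0)
f_more_charge = trial_electric_force(4.0, -3.0, 1.0)
print(f_near / f_far)          # ratio of forces at r and at 2r
print(f_more_charge / f_near)  # ratio of forces for charge 2q and charge q
```

Only the magnitudes of the charges enter, so the sketch says nothing about whether the force is attractive or repulsive.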

The  reformulated  question  we  have  posed  was  first  answered  with  a 
fair degree of precision by Charles Augustin de Coulomb (1736-1806). He
used an adaptation of the torsion pendulum (see Sec. 10-2), which he called
the torsion balance.

Figure  20-5  is  a drawing  of  Coulomb's  apparatus,  reproduced  from 
the  paper  reporting  the  results  of  his  measurements.  The  torsion  balance 
AC  is  suspended  from  an  adjustable  knob,  which  can  be  turned  and  whose 
position  can  be  noted  by  the  reading  of  a pointer  P on  an  angle  scale.  Cou- 
lomb’s torsion  fiber  is  a fine  bronze  wire.  The  cross  member  of  the  tor- 
sion balance,  called  the  beam,  must  be  made  of  a good  insulator.  Coulomb 
used  hardened  shellac.  On  one  end  of  the  beam  is  a spherical  ball  A,  which 
can  be  charged  and  on  which  the  force  is  to  be  measured.  Coulomb’s 
sphere  is  made  of  elder  bush  pith  (for  lightness)  which  has  been  coated 
with  gold  foil  to  make  its  surface  conducting.  The  counterweight  C on  the 
other  end  of  the  beam  is  a disk  of  light  cardboard  or  similar  material.  The 
disk shape maximizes air resistance as the balance beam swings and thus
helps to damp out quickly the oscillations of the beam. The entire torsion
balance is mounted in a glass case on which an angle scale is marked. Thus
air currents are excluded from the apparatus, and the position of the
balance beam can be readily measured.

Fig. 20-5 Coulomb’s illustration of his apparatus.

The first part of the Coulomb experiment is intended to determine the
dependence  of  the  electric  force  on  distance.  The  fixed  sphere  B,  which 
also  has  a conducting  surface,  is  given  an  excess  charge  by  touching  it  mo- 
mentarily with  a rod  that  has  been  charged  by  rubbing  against  some  other 
material. Sphere A, which is initially neutral, then begins to move toward
sphere B. This happens because mobile charge in
A can  move  freely  over  its  metal  surface,  and  does  so  in  such  a way  that  a 
certain  amount  of  charge  of  sign  opposite  to  that  of  the  charged  sphere  B 
concentrates  on  the  surface  of  A nearest  B , while  an  equal  amount  of 
charge  of  sign  the  same  as  that  of  B concentrates  on  the  surface  of  A far- 
thest from  B.  Sphere  B then  attracts  the  charge  of  opposite  sign  on  the  near 
surface  of  A and  repels  the  charge  of  like  sign  on  the  far  surface  of  A.  But 
the  attraction  is  stronger  than  the  repulsion  because  the  attracted  charge  is 
closer  than  the  repelled  charge.  Hence  there  is  a net  force  acting  on  A 
which  pulls  it  toward  B. 
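
The argument that the attraction wins over the repulsion can be checked with a rough numerical sketch. The model below is an assumption made for illustration only: the induced charges on A are lumped into two point charges of equal magnitude, one at distance d - a from B and one at distance d + a, and an inverse-square force with an arbitrary constant is taken for granted. Any force that weakens with distance would give a net attraction in the same way.

```python
def net_force_on_neutral_sphere(Q, q_induced, d, a, constant=1.0):
    """Net force on neutral sphere A; a positive value points toward B.

    Crude two-point model (hypothetical): induced charge of magnitude
    q_induced and sign opposite to Q sits at distance d - a from B,
    and an equal charge of the same sign as Q sits at distance d + a.
    """
    if not 0 < a < d:
        raise ValueError("need 0 < a < d")
    attraction = constant * abs(Q) * abs(q_induced) / (d - a) ** 2
    repulsion = constant * abs(Q) * abs(q_induced) / (d + a) ** 2
    return attraction - repulsion


# Illustrative numbers in arbitrary units.
net = net_force_on_neutral_sphere(Q=1.0, q_induced=0.1, d=5.0, a=1.0)
print(net)  # positive, so A is pulled toward B
```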

When  sphere  A touches  sphere  B , part  of  the  excess  charge  on  B 
immediately  moves  to  A.  This  is  a result  of  the  mutual  repulsions  between 
the  charges  on  B,  which  tend  to  make  them  keep  as  far  apart  as  possible. 
Now  both  spheres  A and  B have  an  excess  charge  of  the  same  sign. 

Sphere A is then repelled, and the balance beam moves from the
position shown in Fig. 20-6a. As the motion continues, the torsion fiber twists
until the equilibrium position shown in Fig. 20-6b is achieved. The knob is
then turned, further twisting the torsion fiber, until the balance beam is
returned to its original position, as in Fig. 20-6c. The angle θ1 is the angle
through which the torsion fiber is twisted. This angle determines the
torque exerted on the torsion balance beam. The force exerted on sphere A
as a result of the twist in the fiber is proportional to this torque. Also, the
force is equal and opposite to the electric force exerted on sphere A
because of the presence of sphere B, since sphere A is now in equilibrium.
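
The chain of reasoning from twist angle to force can be written out as a small Python sketch. It assumes, as for an ideal torsion fiber, that the torque is proportional to the twist angle, and that the force on sphere A acts perpendicular to the beam. The torsion constant kappa, the arm length, and the angle readings are hypothetical values chosen only to show the arithmetic.

```python
import math


def force_from_twist(kappa, theta_deg, arm):
    """Force on sphere A that balances the twisted fiber.

    kappa     : torsion constant of the fiber, torque per radian (hypothetical)
    theta_deg : twist angle of the fiber in degrees
    arm       : distance from the fiber to the center of sphere A
    Assumes torque = kappa * angle and a force perpendicular to the beam.
    """
    torque = kappa * math.radians(theta_deg)
    return torque / arm


# Two illustrative readings taken with the same fiber and the same beam.
f_1 = force_from_twist(kappa=1.0e-7, theta_deg=36.0, arm=0.10)
f_2 = force_from_twist(kappa=1.0e-7, theta_deg=144.0, arm=0.10)
print(f_2 / f_1)  # the force ratio equals the ratio of the twist angles
```

Because kappa and the arm length cancel in the ratio, comparing forces requires only the angle readings, which is why the experiment can be analyzed in terms of angles alone.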

The center-to-center distance r1 between the two charged spheres is
now  measured.  If  this  distance  is  large  compared  to  the  radii  of  the  spheres, 
then  the  force  acting  between  them  is  the  same  as  it  would  be  if  all  the  net 
charge  on  each  sphere  were  located  at  the  center  of  the  sphere. 


Fig. 20-6 Experimental demonstration that the magnitude of the electric force between two
charged metal balls is inversely proportional to the square of their center-to-center distance.




The  justification  for  this  statement  involves  two  things.  First,  when  the 
charges  added  to  an  isolated  metal  sphere  spread  over  its  surface  because  of  their 
mutual  repulsions,  they  end  up  in  a distribution  that  has  the  same  number  of 
charges  per  unit  area  everywhere.  This  uniform  distribution  is  a consequence  of 
the  symmetry  of  a spherical  surface.  For  such  a surface  any  other  distribution  does 
not  maximize  the  spacings  between  near  neighbors.  The  situation  becomes  more 
complicated  if  a second  metal  sphere  with  the  same  sign  of  charge  as  that  of  the 
first  sphere  is  brought  near  the  first  sphere.  Now  there  are  repulsions  between  the 
charges  in  one  sphere  and  those  in  the  other  which  compete  with  the  repulsions 
between  the  charges  on  each  sphere.  The  result  is  that  the  uniform  distribution  of 
charges  on  each  sphere  is  disturbed,  and  there  will  be  some  accumulation  of 
charges  in  each  sphere  at  the  part  of  the  surface  farthest  from  the  other  sphere.  But 
if  the  center-to-center  distance  between  the  spheres  is  large  compared  to  their 
radii,  the  competition  will  be  dominated  by  the  repulsions  between  the  charges  on 
each  sphere.  That  is,  on  each  sphere  the  charge  will  be  very  nearly  uniformly  dis- 
tributed over  the  surface. 

Second,  when  charges  are  distributed  uniformly  over  a spherical  surface, 
the  electric  force  produced  on  any  external  charged  object  is  identical  to  the  force 
that  would  be  produced  if  all  these  charges  were  concentrated  at  the  center.  This 
property  is  completely  analogous  to  one  involving  the  gravitational  force  and 
a uniform,  massive,  thin  spherical  shell.  It  was  justified  qualitatively  for  the 
gravitational  case  in  the  caption  to  Fig.  11-5.  It  is  proved  quantitatively  later 
in  this  chapter. 
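
Ahead of the quantitative proof, the second statement can be made plausible numerically. The sketch below is an illustration under stated assumptions: it takes an inverse-square force with an arbitrary constant, slices a uniformly charged spherical shell into thin rings, and sums the force components along the axis through an external unit charge. The function name, the number of rings, and the sample values are choices made here.

```python
import math


def shell_force_on_axis(Q, d, R, n=10000, constant=1.0):
    """Force on a unit charge a distance d from the center of a uniformly
    charged spherical shell of radius R and total charge Q (d > R).

    The shell is cut into n rings labeled by the polar angle alpha,
    measured from the line joining the center to the unit charge.
    """
    if d <= R:
        raise ValueError("the unit charge must lie outside the shell")
    d_alpha = math.pi / n
    total = 0.0
    for i in range(n):
        alpha = (i + 0.5) * d_alpha                # midpoint of the i-th ring
        dQ = 0.5 * Q * math.sin(alpha) * d_alpha   # charge carried by the ring
        s = math.sqrt(d * d + R * R - 2.0 * d * R * math.cos(alpha))
        # Only the component along the axis survives, by symmetry.
        total += constant * dQ * (d - R * math.cos(alpha)) / s**3
    return total


numeric = shell_force_on_axis(Q=1.0, d=3.0, R=1.0)
point_charge = 1.0 / 3.0**2   # same charge concentrated at the center
print(numeric, point_charge)
```

The two printed numbers agree to within the error of the numerical sum, as the statement in the text requires.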

Sphere B is now brought closer to sphere A, as shown in Fig. 20-6d.
Sphere  A is  thus  repelled  by  an  increased  electric  force,  and  it  swings  away, 
twisting  the  torsion  fiber  still  further.  The  adjustable  knob  is  again  twisted 
sufficiently  to  return  the  torsion  beam  to  its  original  position,  as  shown  in 
Fig. 20-6e. The new angle θ2 and the new center-to-center distance r2 are
measured. Since r2 < r1, the electric repulsive force exerted on sphere A is
greater than in the former case, and θ2 > θ1. That is, it takes a larger twist
of  the  torsion  fiber  to  balance  an  increased  electric  force. 

This  process  can  be  repeated  for  different  distances  and  for  different 
initial  charges.  (Care  must  be  taken,  however,  to  carry  out  the  measure- 
ments quickly,  so  as  to  minimize  leakage  of  charge  from  the  two  spheres.) 

Within the limits of the accuracy of the experiment, the data always
conform to the rule

$$\theta \propto \frac{1}{r^2} \tag{20-2}$$


Since the angle θ is proportional to the electric force exerted on sphere A
by  sphere  B , the  conclusion  is  that  the  strength  of  this  force  is  inversely  pro- 
portional to  the  square  of  the  center-to-center  distance  between  them,  all 
other  factors  being  held  constant.  If  the  separation  between  the  spheres  is 
large  compared  to  their  radii,  then  the  situation  is  the  same  as  if  all  the 
charge  on  each  sphere  were  located  at  the  central  point,  and  the  center- 
to-center  distance  becomes  simply  the  distance  between  the  two  point 
charges.  Thus  Eq.  (20-2)  is  an  experimental  demonstration  of  the  propor- 
tionality between  Fkj  and  1 /r]k  in  Eq.  (20-1).  The  accuracy  of  the  experi- 
ment is  not  very  high,  but  later  we  describe  an  experiment  which  confirms 
this  proportionality  to  an  extremely  high  degree  of  accuracy. 
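
As a worked illustration of the inverse-square rule, compare two balanced configurations of the torsion balance with the same charges on the spheres. The only assumption is the one already stated, that the twist angle is proportional to the force. The factor of one-half for the distance is an arbitrary illustrative choice, not a measured value.

```latex
\frac{\theta_2}{\theta_1} = \frac{F_2}{F_1} = \left(\frac{r_1}{r_2}\right)^2 ,
\qquad
r_2 = \tfrac{1}{2}\, r_1 \;\Longrightarrow\; \theta_2 = 4\,\theta_1 .
```

Halving the center-to-center distance should therefore require about four times the twist to restore the beam.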

The second part of the Coulomb experiment is intended to determine
whether the electric force $F_{kj}$ has the proportionality to $|q_j|$ or to $|q_k|$ found
in Eq. (20-1). The situation in Fig. 20-7a is the same as that in Fig. 20-6c.


Fig. 20-7 Experimental demonstration
that the magnitude of the electric force
between two charged metal balls is
proportional to the charge on one of
them.


Spheres A and B have been given charge of the same sign, and the balance
beam has been twisted back to its original position by adjusting the knob
from which the torsion fiber is suspended. Then an uncharged metal sphere
B' of dimensions and construction identical to those of the metal sphere B,
and also mounted on an insulating wand, is touched momentarily to sphere
B. When this happens, some of the charges on sphere B flow to sphere B'
because the repulsive forces that act between like charges make each
charge get as far away as possible from all the other charges. Since spheres
B and B' are identical, the symmetry of the situation dictates that this is
achieved when exactly half of the charge originally on B flows to B'.

Sphere B' is now withdrawn, as in Fig. 20-7c. Since the charge on B has
been reduced, the electric repulsive force acting on sphere A is reduced,
and the twisted torsion fiber can untwist partially, forcing sphere A closer
to sphere B. In Fig. 20-7d, the knob is adjusted to restore the torsion beam
to its original position with the center-to-center distance $r_2$, and the angle
$\theta_2$ is measured.

Within the limits of experimental accuracy, halving the charge on
sphere B halves the torque required to keep the torsion beam at its original
position. The electric force on sphere A is thus directly proportional to the
charge $|q_B|$ on sphere B. For reasons of symmetry, expressed in terms of
either Newton’s third law or the law of momentum conservation, the electric
force must be directly proportional also to the charge $|q_A|$ on sphere A.
(See Sec. 11-1 for a completely analogous discussion of this point with
respect to the gravitational force.)

The results of the two parts of the Coulomb experiment can be combined
in the mathematical statement, called Coulomb’s law,

$$F_{kj} = (\text{constant})\,\frac{|q_j||q_k|}{r_{jk}^2} \tag{20-3}$$


Coulomb’s law confirms the conjecture made in Eq. (20-1). The law applies
to two charged bodies whose dimensions are both very small compared to the
distance between them. Usually this is expressed by saying that it applies to
two point charges. The law states that the magnitude $F_{kj}$ of the electric force
exerted on charge $q_k$ by charge $q_j$ is proportional to the product $|q_k||q_j|$ of their
magnitudes and inversely proportional to the square of the distance $r_{jk}$ from charge j to
charge k. The direction of the force exerted on charge $q_k$ by charge $q_j$ is
determined by the rule that like charges repel and unlike charges attract.


If the proportionality constant in Eq. (20-3) is given some agreed-upon
value, then the equation can be used to determine the magnitude of the
unit of electric charge that corresponds to this value of the constant. In
principle this can be done by performing a Coulomb experiment with
spheres A and B identical so that it is easy to give both the same charge $|q|$.
Then $F_{kj}$ can be measured from the properties of the torsion balance and
the twist angle, and $r_{jk}$ can be measured. The equation is solved for $|q|$,
producing

$$|q| = \left(\frac{F_{kj}\,r_{jk}^2}{\text{constant}}\right)^{1/2}$$


Knowing the values of $F_{kj}$ and $r_{jk}$ and given the agreed-upon value of the
constant, we can then determine the value of $|q|$ in terms of the electric
charge unit corresponding to the value of the constant. This is tantamount
to determining the electric charge unit itself.

But  as  we  have  mentioned,  the  Coulomb  experiment  is  inaccurate.  This 
is one reason why Coulomb’s law is not used to define the unit of electric
charge. A far more accurate, and useful, definition is obtained by
means  of  measurements  involving  the  flow  of  electric  charge  called  electric 
current.  In  fact,  what  is  done  is  to  define  the  unit  of  current  by  means  that 
are  discussed  in  Chaps.  22  and  23.  Since  current  is  charge  flow  per  second, 
the  unit  of  charge  can  then  be  expressed  in  terms  of  the  unit  of  current. 
Thus  the  charge  unit  is  defined  indirectly  in  terms  of  the  definition  of  the 
current  unit. 

Once the charge unit is so defined, the value of the proportionality
constant in Coulomb’s law can be determined, since everything else has
then been specified. In SI units this proportionality constant is written as
$1/4\pi\epsilon_0$. (The symbol $\epsilon$ is the Greek letter epsilon, and $\epsilon_0$ is read “epsilon
naught.”) Thus Eq. (20-3) is written

$$F_{kj} = \frac{1}{4\pi\epsilon_0}\,\frac{|q_j||q_k|}{r_{jk}^2} \tag{20-4a}$$


The awkward form of the constant $1/4\pi\epsilon_0$ is the price paid so that a
convenient form will appear in equations, which we come across later, that
are more frequently used than Coulomb’s law.

The unit of electric charge is called the coulomb (C). It is defined to be
the amount of electric charge which flows past a given point in a wire in 1 s
when the current in the wire is 1 ampere (A). The best value of the constant
$1/4\pi\epsilon_0$ corresponding to this definition of the charge unit is compiled
from a variety of experimental results less direct but more accurate
than the Coulomb experiment. It is

$$\frac{1}{4\pi\epsilon_0} = 8.987551790 \times 10^9\ \mathrm{N \cdot m^2/C^2} \tag{20-4b}$$

For many practical purposes, the easy-to-remember value

$$\frac{1}{4\pi\epsilon_0} = 9 \times 10^9\ \mathrm{N \cdot m^2/C^2} \tag{20-4c}$$

is sufficiently accurate.


The  system  of  electrical  units  in  which  Coulomb’s  law  takes  the  form  of  Eq. 
(20-4a)  was  originally  introduced  by  the  Italian  engineer  Giovanni  Giorgi  shortly 
after  1900.  It  was  immediately  popular  with  electrical  engineers  because  of  its 
practical convenience. Physicists, however, were more reluctant to adopt the
system, mainly because it tends to obscure the fundamental connection between
electricity  and  magnetism.  But  the  advantages  of  the  system  greatly  outweigh  the 
disadvantages,  and  the  agreement  to  use  the  system  is  now  essentially  universal. 


Coulomb’s  law  has  one  complication  not  possessed  by  Newton’s  law  of 
gravitation.  The  electric  force  can  be  either  repulsive  (if  the  two  charges 
are  like)  or  attractive  (if  they  are  unlike).  We  can  take  care  of  the  direction 
of  the  force  automatically  by  writing  Coulomb’s  law  in  the  vector  form 


$$\mathbf{F}_{kj} = \frac{1}{4\pi\epsilon_0}\,\frac{q_j q_k}{r_{jk}^2}\,\hat{\mathbf{r}}_{jk} \tag{20-5}$$


Fig. 20-8 According to the vector form of Coulomb’s law, the direction
of the force $\mathbf{F}_{kj}$ exerted on charge $q_k$ by charge $q_j$ is the same as that of the unit
vector $\hat{\mathbf{r}}_{jk}$ from charge $q_j$ to charge $q_k$ when the charges have the same signs.
The force has the direction opposite to that of the unit vector $\hat{\mathbf{r}}_{jk}$ when the
charges have opposite signs. (Panels: $q_j q_k > 0$; $q_j q_k < 0$.)


Here $\mathbf{F}_{kj}$ is the vector specifying the direction and magnitude of the force
exerted on charge k by charge j; $q_j$ and $q_k$ are signed scalars giving the
magnitude and sign of their charges; $r_{jk}$ is the distance between j and k; and $\hat{\mathbf{r}}_{jk}$
is a unit vector in the direction from j to k. Figure 20-8 illustrates the
application of Eq. (20-5) to determine the direction of the force in all possible
cases.
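
The way the signed product of the charges fixes the direction in Eq. (20-5) can be sketched in a few lines of Python. This is only an illustration: the function name `coulomb_force`, the tuple representation of positions, and the 1-microcoulomb test charges are assumptions made here, not part of the text.

```python
import math

K_E = 8.99e9  # 1/(4*pi*eps0) in N*m^2/C^2, rounded from Eq. (20-4b)


def coulomb_force(q_j, q_k, pos_j, pos_k):
    """Force in newtons exerted on charge k by charge j, as an (x, y, z) tuple.

    Charges are signed values in coulombs; positions are (x, y, z) in meters.
    The name and argument layout are hypothetical, chosen for this sketch.
    """
    # Displacement from j to k; divided by its length it is the unit vector r_hat_jk.
    d = [pk - pj for pj, pk in zip(pos_j, pos_k)]
    r = math.sqrt(sum(c * c for c in d))
    if r == 0.0:
        raise ValueError("the two charges must be at different positions")
    # (1/4 pi eps0) q_j q_k / r^2, with one extra 1/r to normalize d.
    scale = K_E * q_j * q_k / r**3
    return tuple(scale * c for c in d)


# Like charges: the force on k is along r_hat_jk (repulsion).
f_like = coulomb_force(1e-6, 1e-6, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
# Unlike charges: the force on k is opposite to r_hat_jk (attraction).
f_unlike = coulomb_force(1e-6, -1e-6, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
# Interchanging j and k reverses the force, as Newton's third law requires.
f_back = coulomb_force(1e-6, 1e-6, (1.0, 0.0, 0.0), (0.0, 0.0, 0.0))
print(f_like, f_unlike, f_back)
```

Because the charges enter as signed scalars, no separate rule for attraction and repulsion is needed; the sign of the product $q_j q_k$ takes care of it.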

Example  20-1  will  give  you  an  idea  of  both  the  size  of  the  electric 
charge  unit  and  the  strength  of  the  electric  force. 


EXAMPLE  20-1 

How big is a coulomb? One way to get a feeling for it is to calculate the force on
either of two 1-C charges when they are separated by 1 m. Another way is to find
the magnitude of two equal charges, each of which experiences a force of
magnitude 1 N when they are separated by 1 m. Make both these calculations.

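■ Equation (20-4a) gives you

$$F = 9 \times 10^9\ \mathrm{N \cdot m^2/C^2} \times \frac{(1\ \mathrm{C})^2}{(1\ \mathrm{m})^2} = 9 \times 10^9\ \mathrm{N}$$

This is an enormous force. It is almost equal to the weight of a million-ton mass.
Rearranging Eq. (20-4a), you obtain

$$|q| = (4\pi\epsilon_0 F r^2)^{1/2} = \left[\frac{1}{9 \times 10^9\ \mathrm{N \cdot m^2/C^2}} \times 1\ \mathrm{N} \times (1\ \mathrm{m})^2\right]^{1/2} = 1 \times 10^{-5}\ \mathrm{C}$$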


Thus at the very human-scale separation of 1 m, the human-scale force of 1 N is
exerted by each of two charges on the other when their magnitudes are approximately
10 microcoulombs ($\mu$C).
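
The two calculations of Example 20-1 can be repeated numerically. The sketch below assumes the rounded constant of Eq. (20-4c); the function names are hypothetical and chosen only for this illustration.

```python
import math

K_E = 9e9  # N*m^2/C^2, the easy-to-remember value of Eq. (20-4c)


def force_magnitude(q1, q2, r):
    """Magnitude of the Coulomb force, Eq. (20-4a), in newtons."""
    return K_E * abs(q1) * abs(q2) / r**2


def equal_charge_for_force(force, r):
    """Magnitude of each of two equal charges that feel `force` at separation r."""
    return math.sqrt(force * r**2 / K_E)


f = force_magnitude(1.0, 1.0, 1.0)      # two 1-C charges, 1 m apart
q = equal_charge_for_force(1.0, 1.0)    # charges giving 1 N at 1 m
print(f"F   = {f:.1e} N")
print(f"|q| = {q:.2e} C")
```

The second result comes out slightly above $1 \times 10^{-5}$ C, which the example rounds to 10 microcoulombs.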


In 1891, the Irish physicist G. Johnstone Stoney suggested the name
electron for the (then hypothetical) indivisible charge of electricity.
(Earlier, in 1874, he had used electrochemical data, together with the very
crude value of Avogadro’s number then available, to estimate its magnitude
at $1 \times 10^{-20}$ C.) In 1909, the U.S. physicist Robert A. Millikan made
direct measurements on quite small numbers of electrons, using the beautiful
“oil drop” technique that is described in Chap. 21. He confirmed that
the electron charge has a negative value and obtained a magnitude which is



quite close to the currently accepted value. The magnitude of the electron
charge is expressed universally by the symbol e. In other words, when Eq.
(20-5) is used, the charge on an electron is written as $q = -e$. The most
reliable modern value for the magnitude of the electron charge is

$$e = 1.6021892 \times 10^{-19}\ \mathrm{C} \tag{20-6a}$$

For many practical purposes, the approximate value

$$e = 1.60 \times 10^{-19}\ \mathrm{C} \tag{20-6b}$$

is sufficiently accurate.

In 1897, the British physicist J. J. Thomson had performed a series of
experiments, which we discuss in detail in Sec. 23-3. From these
experiments he could deduce the ratio of the magnitude e of the electron charge
to the value $m_e$ of the electron mass. His value for the charge-to-mass ratio
was quite close to the modern value, which to three significant figures is

$$\frac{e}{m_e} = 1.76 \times 10^{11}\ \mathrm{C/kg} \tag{20-7}$$

Compared  to  the  charge-to-mass  ratio  of,  say,  one  of  the  charged  spheres 
in  Coulomb’s  experiment,  the  value  of  this  ratio  for  an  electron  is 
extremely  large. 

From Eqs. (20-6b) and (20-7), we can calculate the electron mass:

$$m_e = \frac{e}{e/m_e} = \frac{1.60 \times 10^{-19}\ \mathrm{C}}{1.76 \times 10^{11}\ \mathrm{C/kg}}$$

or

$$m_e = 9.11 \times 10^{-31}\ \mathrm{kg} \tag{20-8}$$

(More precisely put, the value quoted is the electron rest mass. But the
electrons in Thomson’s experiment moved at nonrelativistic speeds where
there is no meaningful distinction to be made between relativistic mass and
rest mass. That will be the case for all the motions with which we deal in our
treatment of the electromagnetic force. So we use simply the word “mass.”)
The mass of an electron is less than 0.1 percent of the mass of a hydrogen
atom and less than 0.001 percent of the mass of a uranium atom.


Many other types of microscopic charged particles are known to exist
in nature, although most are not stable and decay rapidly into other
particles. All have a charge equal to ±e if they are what are called elementary
particles (such as an electron or a proton) or ±e multiplied by some small
integer if they are particles (such as an atomic nucleus) formed from some
small number of elementary particles. And since a macroscopic charged
body is charged because it has a certain number of electrons more or less
than its normal complement, the charge on the body is equal to ±e
multiplied by some large integer. Thus the charge on any object can be only one
of the discrete set of values ±e, ±2e, ±3e, . . . . This feature is described
by saying electric charge is quantized.
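
To see how large the "large integer" is for an everyday charge, one can divide the charge by e. The short sketch below uses the 10-microcoulomb charge found in Example 20-1 and the rounded value of Eq. (20-6b); the function name is hypothetical.

```python
E_CHARGE = 1.60e-19  # magnitude of the electron charge in C, Eq. (20-6b)


def excess_electron_count(charge):
    """Approximate number of electrons added (charge < 0) or removed (charge > 0)."""
    return round(abs(charge) / E_CHARGE)


n = excess_electron_count(10e-6)  # a body carrying 10 microcoulombs
print(f"about {n:.2e} electrons")
```

The integer is of order $10^{13}$, so adding or removing one electron changes the charge by a fraction far too small to notice. This is why the charge on a macroscopic body appears to vary continuously even though it is quantized.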


The most important stable positively charged particle is the proton,
which has a charge $q = +e$. That the proton charge is precisely $+e$ is
evidenced by the results of very accurate experiments which show that the
hydrogen atom, consisting of one proton and one electron, has zero net



charge. The proton is much more massive than the electron. The proton
mass is approximately

$$m_p = 1.67 \times 10^{-27}\ \mathrm{kg} \tag{20-9}$$

or about $1836\,m_e$.

For every charged particle there exists an antiparticle, that is, a
particle having a mass identical to that of the particle and a charge of identical
magnitude but opposite sign. This assertion is supported by a great variety
of direct and indirect experimental evidence. The antiparticle of the
electron is called the positron; that of the proton is called the antiproton.
When an electron and a positron come together, they may annihilate each
other, that is, disappear. When this happens, two uncharged
particles, called photons, appear in the process. Each photon is a bundle of
electromagnetic radiation. Before the annihilation, the total charge present is
$-e + e = 0$. Afterward it is $0 + 0 = 0$. Conversely, it is possible under
proper circumstances to create electrons and positrons from photons, but
only in electron-positron pairs. The total charge present before such a
pair-production process is zero since photons are uncharged. It is also zero
afterward since the charge of an electron is the negative of that of a
positron. The essential point is that the total amount of electric charge in the universe
never changes, since charge can be created or destroyed only with the simultaneous
creation or destruction of an equal amount of opposite charge. This feature is
described by saying electric charge is conserved.

Example 20-2 applies Coulomb’s law to a simple model of the
hydrogen atom.


EXAMPLE 20-2

In a simplified view, the stable hydrogen atom consists of a single electron revolving
in a circular orbit around a much more massive proton, to which it is bound by the
attractive electric force exerted between these oppositely charged particles. (This
picture is very consciously analogous to that of a single planet revolving around a
much more massive sun, to which it is bound by the attractive gravitational force.)
The radius of the hydrogen atom is $r = 5.29 \times 10^{-11}$ m. Find the strength of the
force on the electron, its centripetal acceleration, and its orbital speed, assuming the
correctness of this picture of the atom and the applicability of Newton’s laws of
motion to systems of atomic size.

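■ Using Eq. (20-4a),

$$F = \frac{1}{4\pi\epsilon_0}\,\frac{|q_j||q_k|}{r^2}$$

you have

$$F = 8.99 \times 10^9\ \mathrm{N \cdot m^2/C^2} \times \frac{(1.60 \times 10^{-19}\ \mathrm{C})^2}{(5.29 \times 10^{-11}\ \mathrm{m})^2} = 8.23 \times 10^{-8}\ \mathrm{N}$$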


This is by no means a small force when you consider the tiny mass
$m_e = 9.11 \times 10^{-31}$ kg of the particle on which it is acting. To see this, you calculate
the centripetal acceleration by using Newton’s second law in the form

$$a = \frac{F}{m}$$

You have

$$a = \frac{8.23 \times 10^{-8}\ \mathrm{N}}{9.11 \times 10^{-31}\ \mathrm{kg}} = 9.03 \times 10^{22}\ \mathrm{m/s^2}$$


The orbital speed is given from the expression $a = v^2/r$ for centripetal acceleration.
It is

$$v = (ar)^{1/2}$$

or

$$v = (9.03 \times 10^{22}\ \mathrm{m/s^2} \times 5.29 \times 10^{-11}\ \mathrm{m})^{1/2} = 2.19 \times 10^6\ \mathrm{m/s}$$

Since this speed is less than 1 percent of the speed of light, you are consistent in
applying the nonrelativistic form of Newton’s second law.
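
The chain of calculations in Example 20-2 (force, then acceleration, then speed) is easy to check numerically. The sketch below uses the rounded constants quoted in this section; the value of the speed of light is an assumption added here for the final comparison, since it is not quoted in the example.

```python
import math

K_E = 8.99e9           # 1/(4*pi*eps0) in N*m^2/C^2
E_CHARGE = 1.60e-19    # C, Eq. (20-6b)
M_ELECTRON = 9.11e-31  # kg, Eq. (20-8)
R_ORBIT = 5.29e-11     # m, radius of the hydrogen atom in this model
C_LIGHT = 3.00e8       # m/s, assumed rounded value (not quoted in the example)

force = K_E * E_CHARGE**2 / R_ORBIT**2   # Coulomb's law, Eq. (20-4a)
accel = force / M_ELECTRON               # Newton's second law, a = F/m
speed = math.sqrt(accel * R_ORBIT)       # from a = v^2 / r

print(f"F   = {force:.3g} N")
print(f"a   = {accel:.3g} m/s^2")
print(f"v   = {speed:.3g} m/s")
print(f"v/c = {speed / C_LIGHT:.2%}")
```

With these three-figure inputs the last digit of each result can differ by one unit from the values printed in the example, which is ordinary rounding and does not affect the conclusion that the motion is nonrelativistic.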


20-3 ALPHA-PARTICLE SCATTERING

In Sec. 11-5, we made numerical studies of bodies moving under the influence
of the gravitational force. Much of the work we did there can be applied
to the study of a body moving under the influence of an electric force,
because the mathematical similarities between Newton’s law of gravitation
and Coulomb’s law of the electric force are very great. You can see this by
direct comparison:

gravitational force

$$\mathbf{F}_{kj} = -G\,\frac{m_j m_k}{r_{jk}^2}\,\hat{\mathbf{r}}_{jk}$$

electric force

$$\mathbf{F}_{kj} = \frac{1}{4\pi\epsilon_0}\,\frac{q_j q_k}{r_{jk}^2}\,\hat{\mathbf{r}}_{jk}$$


If  the  electron  obeyed  Newton’s  laws  of  motion  in  interactions  on  the 
atomic  scale  (and  we  see  in  Chap.  31  how  and  to  what  extent  it  does  not  do 
so),  the  only  difference  between  the  orbit  of  a planet  about  a star  and  the 
orbit of an electron about a proton would be one of scale. Something new is
introduced,  however,  when  we  consider  the  motion  of  a positively  charged 
particle  in  the  vicinity  of  another  positively  charged  particle.  The  force  is 
repulsive,  which  never  happens  in  the  gravitational  case.  This  is  the  situation 
in  alpha-particle  scattering. 

At the center of every atom there is a small, positively charged body
called the nucleus. When the atom is in its normal state, the nucleus is
surrounded by the number of negatively charged electrons that makes the
atom electrically neutral overall. An alpha particle is the nucleus of the
particular atom helium. That is, if the two electrons normally present in a
helium atom are removed, what remains is an alpha particle. An alpha
particle has a charge $q = +2e$ and a mass $m_\alpha = 6.65 \times 10^{-27}$ kg.

Alpha particles are emitted spontaneously by various radioactive
substances. The commonly used substance is radium, which emits alpha
particles whose speeds are about $1 \times 10^7$ m/s. If one of these high-speed alpha
particles  passes  through  a thin  foil  of  some  material,  experiments  show  that 
it  has  a small  chance  of  being  scattered  through  an  appreciable  angle. 
(Strictly  speaking,  the  word  “scatter”  refers  to  the  behavior  of  a large  number 
of  objects,  as  when  you  throw  a handful  of  rocks  against  a tree  and  they 
scatter.  Nevertheless,  we  will  follow  the  universal  usage  of  nuclear  and 
elementary-particle  physics  in  which  the  word  is  used  to  describe  what 
happens  when  a single  particle  interacts  with  some  other  object  and  is 
deflected.) 




Fig. 20-9 A cloud chamber photograph of an alpha particle scattering from an oxygen
nucleus. The nucleus is part of an atom of the gas filling the cloud chamber. Each of the long,
unforked “tracks” shows the path followed by an alpha particle emitted from a radioactive
source located to the left of the region photographed. These paths are straight, except near
their ends where the then slowly moving alpha particles experience multiple small-angle
scattering from atomic electrons. The forked track shows the single large-angle scattering
of interest. The longer prong extending generally upward from the apex of the fork is the
track of the scattered alpha particle and the shorter prong extending generally downward
is the track of the recoiling oxygen nucleus. This photograph was made by P. M. S. Blackett
in 1923.


Thus  occasionally  an  alpha  particle  emerges  from  the  foil  traveling 
along  a path  which  is  at  an  appreciable  angle  to  its  path  upon  entering.  The 
cloud  chamber  photograph  in  Fig.  20-9  shows  an  example  of  large-angle 
alpha-particle  scattering.  We  will  use  Coulomb’s  law  to  predict  some  of  the 
features  of  alpha-particle  scattering,  for  the  purpose  of  comparison  with 
those  observed  in  the  alpha-particle  scattering  experiments. 


An alpha particle passing through the atoms in a foil of material
interacts with both the electrons and the nuclei of these atoms because it has
electric charge and so do they. But for it to be scattered through an
appreciable angle in an encounter with any other particle, two requirements
must be met: (1) The particle from which it scatters must have a comparable
or greater mass; (2) there must be a strong force of some nature exerted
between it and the particle from which it scatters. Requirement 1 is
intuitively obvious if you imagine a bowling ball scattering from a billiard ball.
There will be only a slight deflection of the bowling ball from its original
path (that is, its scattering angle will be small) because the mass of the
billiard ball from which it scatters is small compared to its own mass.
Requirement 2 follows from the fact that appreciable momentum must be
transferred to the alpha particle from the particle scattering it if the
scattering angle is to be appreciable. For this to happen, a strong force must be
exerted between the two during their encounter.

Now the mass of an electron is very small compared to the mass of an
atom. Hence almost all the mass of any atom resides in its nucleus. Since an alpha
particle is itself a nucleus, it is very massive compared to an electron.
(Specifically, its mass is $m_\alpha \approx 7300\,m_e$.) Thus the electrons in the atoms of the
foil certainly fail to satisfy requirement 1. But the nuclei in the atoms satisfy
this requirement since they have mass comparable to that of the alpha
particle. So we turn our attention to these nuclei.


What about requirement 2? In comparison to the size of an atom, an
alpha particle, like any other atomic nucleus, is so very small that it can
be regarded as a charged point particle. (Later we will see how
alpha-particle scattering can be used to measure nuclear radii. These
measurements, and many others, show that the radius of a nucleus is about $10^{-5}$
times the radius of the atom containing that nucleus.) If in traveling
through the foil the alpha particle happens to pass very near the nucleus at
the center of one of the atoms in the foil, then a strong electric force will be
exerted between the positively charged alpha particle and the positively
charged nucleus. But it must be a close passage because Coulomb’s law


Fig. 20-10 An alpha particle approaching a nucleus with impact parameter $|y_0|$. (Labels: alpha particle; nucleus; target area for impact parameter to be less than $|y_0|$.)


shows that the strength of the electric force is inversely proportional to the
square of the separation between the alpha particle and the nucleus. (Note,
however, that a collision involving actual contact is not required.) If we call
any angle larger than 0.1 rad ≈ 6° an “appreciable angle,” then
calculations soon to be presented show that even for the highly charged gold
nucleus, the force it exerts on a passing alpha particle is large enough to
scatter it through an appreciable angle only if the alpha particle
approaches with an impact parameter less than about $3 \times 10^{-13}$ m. The
impact parameter is the magnitude of the quantity $y_0$ depicted in Fig. 20-10.
It is the distance between the initial trajectory of the alpha particle and the
trajectory which would lead to a head-on collision.

Since the impact parameter must be less than about $3 \times 10^{-13}$ m for
there to be a scattering at an appreciable angle, the alpha particle must pass
through a “target area” in the form of a circular disk of radius about
$3 \times 10^{-13}$ m. See Fig. 20-10. The area of this target is about
$\pi(3 \times 10^{-13}\ \mathrm{m})^2 \approx 3 \times 10^{-25}\ \mathrm{m^2}$. In contrast, the “target area” presented
to the alpha particle by the entire atom containing the nucleus is the area of
a circular disk of radius equal to the atomic radius. This radius is about
$2 \times 10^{-10}$ m for gold atoms, and so the associated target area is about
$\pi(2 \times 10^{-10}\ \mathrm{m})^2 \approx 1 \times 10^{-19}\ \mathrm{m^2}$.

Now an alpha particle passing through an atom in a gold foil is not
“aimed” at the nucleus. There is no way that this can be done. In other
words, on the atomic scale the alpha particle is “aimed” at random. Hence
the probability that it will pass close enough to a nucleus to be scattered
through an appreciable angle is given by the target area ratio
$3 \times 10^{-25}\ \mathrm{m^2}/(1 \times 10^{-19}\ \mathrm{m^2}) = 3 \times 10^{-6}$. The chance of scattering
through an appreciable angle in passing through one atom of the foil is
very small indeed.

A gold foil used in the experiment is made very thin in order to keep
the alpha particles from slowing significantly in their passage through it. A
typical foil is only about $1 \times 10^{-6}$ m thick. (It is not difficult to beat a gold
foil to this thickness.) Since the diameter of a gold atom is about
$4 \times 10^{-10}$ m, such a foil is about $1 \times 10^{-6}\ \mathrm{m}/(4 \times 10^{-10}\ \mathrm{m/atom}) \approx 3 \times 10^3$
atoms thick. Therefore, in passing through all the atoms in the foil, the
alpha particle has a total probability of being scattered through an
appreciable angle which is something like $3 \times 10^{-6} \times 3 \times 10^3 \approx 1 \times 10^{-2}$. This
estimate tells us that we do not have to consider two sequential scatterings
through appreciable angles in the passage of an alpha particle through the
thin foil. Since the probability of one such scattering occurring is only
about $1 \times 10^{-2}$, the probability of two of these independent events
occurring is only about $(1 \times 10^{-2})^2 = 1 \times 10^{-4}$. Also the estimate we have made
explains why most of the alpha particles seen in Fig. 20-9 do not experience
scattering through an appreciable angle as they go through the chamber.
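
The order-of-magnitude estimate above can be reproduced in a few lines. The sketch below uses the radii and foil thickness quoted in the text but, unlike the text, does not round to one significant figure at each step, so its results are somewhat smaller than $1 \times 10^{-2}$ and $1 \times 10^{-4}$ while agreeing in order of magnitude. The variable names are chosen here for illustration.

```python
import math

B_MAX = 3e-13          # m, largest impact parameter giving an appreciable angle
R_ATOM = 2e-10         # m, approximate radius of a gold atom
FOIL_THICKNESS = 1e-6  # m, typical thickness of the gold foil

nuclear_target = math.pi * B_MAX**2    # target area around one nucleus
atomic_target = math.pi * R_ATOM**2    # target area of one whole atom
p_single_atom = nuclear_target / atomic_target

atoms_thick = FOIL_THICKNESS / (2 * R_ATOM)  # atomic layers traversed
p_foil = p_single_atom * atoms_thick         # chance of one large-angle scattering
p_double = p_foil**2                         # chance of two independent ones

print(f"per atom:          {p_single_atom:.2e}")
print(f"layers traversed:  {atoms_thick:.0f}")
print(f"whole foil:        {p_foil:.1e}")
print(f"double scattering: {p_double:.1e}")
```

Either way, a second large-angle scattering is rarer than the first by a factor of a hundred or more, which is the point of the estimate.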

The means to calculate numerically the trajectories of alpha particles
scattering from gold nuclei have been almost completely developed in Secs.
11-4 and 11-5, where we considered the attractive inverse-square
gravitational force between two massive bodies instead of the repulsive
inverse-square electric force between two bodies with like charge. Adapting the
earlier work to the case at hand is a matter of using Coulomb’s law to
evaluate the constant $\alpha$ in the quantities $Q_x$ and $Q_y$ appearing in differential
equations just like Eqs. (11-29). The value of $\alpha$ will be negative in this case
since here the force is repulsive, instead of attractive. The adaptation is
carried out in Example 20-3.


EXAMPLE 20-3

Following the developments of Secs. 11-4 and 11-5, set up a pair of differential
equations that can be used to calculate the trajectory of an alpha particle which
passes near a gold nucleus.

■ Even though a gold nucleus is one of the most massive nuclei, its mass,
$m_{\mathrm{Au}} = 3.27 \times 10^{-25}$ kg, is only about 49 times larger than $m_\alpha = 6.65 \times 10^{-27}$ kg, the mass
of an alpha particle. Thus it is not a good approximation to assume the nucleus will
remain stationary during its interaction with an alpha particle. And such an
approximation would be still poorer for studying the scattering of alpha particles by nuclei
that are not among the most massive. But it is easy to avoid the approximation
completely by taking advantage of the reduced-mass procedure developed in Sec. 11-4.
You replace the nucleus of mass $m_n$ with a particle of infinite mass and substitute for
the alpha particle of mass $m_\alpha$ a particle whose reduced mass $\mu$ is evaluated, using Eq.
(11-20Y), to be

$$\mu = \frac{m_\alpha m_n}{m_\alpha + m_n} \tag{20-10}$$
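
A quick numerical evaluation of Eq. (20-10) for the alpha particle and gold nucleus shows how large the reduced-mass correction is. The function name below is an illustrative choice; the masses are those quoted in this example.

```python
M_ALPHA = 6.65e-27  # kg, mass of the alpha particle
M_GOLD = 3.27e-25   # kg, mass of the gold nucleus


def reduced_mass(m1, m2):
    """Reduced mass of a two-body system, Eq. (20-10)."""
    return m1 * m2 / (m1 + m2)


mu = reduced_mass(M_ALPHA, M_GOLD)
print(f"mass ratio m_Au / m_alpha = {M_GOLD / M_ALPHA:.1f}")
print(f"mu = {mu:.3e} kg  ({mu / M_ALPHA:.1%} of m_alpha)")
```

For gold the reduced mass is only about 2 percent smaller than the alpha-particle mass. The correction grows quickly for lighter target nuclei, which is the reason given above for not neglecting it.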


Next make a sketch of the situation at a particular instant, as in Fig. 20-11,
which is similar to Fig. 11-13. The nucleus (of infinite mass) is fixed at the origin,
and the alpha particle (of reduced mass $\mu$) is at a location specified by the vector $\mathbf{r}$.
The vector $\mathbf{F}$ represents the electric force on the alpha particle, and its x and y
components $F_x$ and $F_y$ are shown. The components of acceleration of the alpha particle
in the x and y directions are given by Newton’s second law:

$$\frac{d^2x}{dt^2} = \frac{F_x}{\mu} \tag{20-11a}$$

and

$$\frac{d^2y}{dt^2} = \frac{F_y}{\mu} \tag{20-11b}$$


These equations are analogous to Eqs. (11-26a) and (11-26b), which describe the
acceleration of a planet under the action of a gravitational force. If $\phi$ is the angle
between the positive x axis and $\mathbf{r}$, you have

$$F_x = F \cos\phi = F\,\frac{x}{(x^2 + y^2)^{1/2}} \tag{20-12a}$$


Fig. 20-11 The force $\mathbf{F}$ acting on an alpha particle when its position relative to the nucleus is $\mathbf{r}$.


and

$$F_y = F \sin\phi = F\,\frac{y}{(x^2 + y^2)^{1/2}} \tag{20-12b}$$


where F is the magnitude of the electric force. But according to Coulomb’s law, Eq.
(20-4a), the value of F is

$$F = \frac{1}{4\pi\epsilon_0}\,\frac{|q_\alpha||q_n|}{r^2} = \frac{q_\alpha q_n}{4\pi\epsilon_0}\,\frac{1}{x^2 + y^2} \tag{20-13}$$

where the positive quantities $q_\alpha$ and $q_n$ are the electric charges on the alpha particle
and nucleus, respectively.

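Combining Eqs. (20-12a), (20-12b), and (20-13) and applying the result to Eqs.
(20-11a) and (20-11b), you have

$$\frac{d^2x}{dt^2} = \frac{q_\alpha q_n}{4\pi\epsilon_0\mu}\,\frac{x}{(x^2 + y^2)^{3/2}} \tag{20-14a}$$

$$\frac{d^2y}{dt^2} = \frac{q_\alpha q_n}{4\pi\epsilon_0\mu}\,\frac{y}{(x^2 + y^2)^{3/2}} \tag{20-14b}$$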


Now you define the parameter $-\alpha$ to equal the multiplicative constant in Eqs.
(20-14a) and (20-14b). Then you can write

$$\alpha = -\frac{q_\alpha q_n}{4\pi\epsilon_0\mu} \tag{20-15}$$


Also you can write the exponent of the $x^2 + y^2$ term in Eqs. (20-14a) and (20-14b) as
$\beta$. Using these two quantities, you then put Eqs. (20-14a) and (20-14b) in the form

$$\frac{d^2x}{dt^2} = Q_x \tag{20-16a}$$

and

$$\frac{d^2y}{dt^2} = Q_y \tag{20-16b}$$

where

$$Q_x = -\alpha x (x^2 + y^2)^{\beta} \tag{20-17a}$$

and

$$Q_y = -\alpha y (x^2 + y^2)^{\beta} \tag{20-17b}$$


In this case of an inverse-square electric force you have $\beta = -\tfrac{3}{2}$, just as in the case
of the inverse-square gravitational force.


Equations (20-16) and (20-17) are identical to Eqs. (11-29) and (11-30)
and can be solved by employing exactly the same numerical procedure.
This procedure is given by Eqs. (11-21). In Examples 20-4 and 20-5 you
will  investigate  the  effect  on  the  alpha-particle  trajectory  of  varying  the 
impact  parameter.  The  examples  use  the  conditions  of  the  1909  experiment 
of  Rutherford,  Geiger,  and  Marsden,  which  are  best  discussed  further  in  the 
light  of  the  results  of  the  examples. 


EXAMPLE 20-4

Alpha particles emitted in the radioactive decay of radium have kinetic energy
equal to $7.63 \times 10^{-13}$ J. Using the central-force program in the Numerical Calculation
Supplement, find the trajectory of such a particle incident upon a gold nucleus
with impact parameter equal to $1.60 \times 10^{-14}$ m. This is roughly twice the radius
of the gold nucleus. The mass and charge of the alpha particle are
$m_\alpha = 6.65 \times 10^{-27}$ kg and $q_\alpha = +2e$; the mass and charge of the gold nucleus are
$m_{\mathrm{Au}} = 3.27 \times 10^{-25}$ kg and $q_{\mathrm{Au}} = +79e$.

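■ First you must compute the reduced mass $\mu$. From Eq. (20-10) it is

$$\mu = \frac{m_\alpha m_{\mathrm{Au}}}{m_\alpha + m_{\mathrm{Au}}} = \frac{6.65 \times 10^{-27}\ \mathrm{kg} \times 3.27 \times 10^{-25}\ \mathrm{kg}}{6.65 \times 10^{-27}\ \mathrm{kg} + 3.27 \times 10^{-25}\ \mathrm{kg}} = 6.52 \times 10^{-27}\ \mathrm{kg}$$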


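Then you must evaluate $\alpha$, the constant which determines the sign and strength of
the electric force. Using Eq. (20-15), you have

$$\alpha = -\frac{q_\alpha q_{\mathrm{Au}}}{4\pi\epsilon_0\mu} = -8.99 \times 10^{9}\ \mathrm{N\cdot m^2/C^2} \times \frac{2 \times 1.60 \times 10^{-19}\ \mathrm{C} \times 79 \times 1.60 \times 10^{-19}\ \mathrm{C}}{6.52 \times 10^{-27}\ \mathrm{kg}} = -5.60\ \mathrm{m^3/s^2}$$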
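The two evaluations above can be checked with a few lines of Python. This is only a sketch: the function and constant names are illustrative choices, not part of the Numerical Calculation Supplement, and the inputs are the rounded values quoted in this example, so the results should agree with the text only to within rounding.

```python
# Rounded values quoted in Example 20-4 (SI units).
K_COULOMB = 8.99e9     # 1/(4*pi*eps0), N*m^2/C^2
E_CHARGE = 1.60e-19    # elementary charge, C
M_ALPHA = 6.65e-27     # mass of the alpha particle, kg
M_GOLD = 3.27e-25      # mass of the gold nucleus, kg


def reduced_mass(m1, m2):
    """Reduced mass of two bodies, as in Eq. (20-10)."""
    return m1 * m2 / (m1 + m2)


def force_constant(q1, q2, mu):
    """The constant alpha of Eq. (20-15); it is negative for a repulsive force."""
    return -K_COULOMB * q1 * q2 / mu


mu = reduced_mass(M_ALPHA, M_GOLD)
alpha = force_constant(2 * E_CHARGE, 79 * E_CHARGE, mu)

print(f"mu    = {mu:.3e} kg")
print(f"alpha = {alpha:.2f} m^3/s^2")
```

Because the gold nucleus is about 50 times as massive as the alpha particle, the reduced mass differs from $m_\alpha$ by only about 2 percent.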


You also need values for the initial velocity components of the alpha particle.
Aligning the positive $x$ axis with the direction of its initial motion, as in Fig. 20-10,
allows you to write $(dy/dt)_0 = 0$. To determine $(dx/dt)_0$, you assume that initially the
alpha particle is far enough away from the nucleus that it has not lost an appreciable
amount of its kinetic energy as a result of the increase of the potential energy associated
with the electric force in the system. Then you evaluate $(dx/dt)_0$ in terms of
the given kinetic energy $K = \mu[(dx/dt)_0]^2/2$, as follows:


$$(dx/dt)_0 = \left(\frac{2 \times 7.63 \times 10^{-13}\ \mathrm{J}}{6.52 \times 10^{-27}\ \mathrm{kg}}\right)^{1/2} = 1.53 \times 10^{7}\ \mathrm{m/s}$$


(What  does  this  figure  tell  you  about  the  accuracy  of  using  newtonian  mechanics,  in- 
stead of  relativistic  mechanics?) 
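A quick way to explore the question just posed is to compare the initial speed with the speed of light. The sketch below assumes the standard value of $c$ and the standard Lorentz factor $\gamma = (1 - v^2/c^2)^{-1/2}$, neither of which is introduced in this section; the variable names are illustrative.

```python
import math

C_LIGHT = 2.998e8      # speed of light, m/s (standard value, not from this section)
K_ENERGY = 7.63e-13    # kinetic energy from Example 20-4, J
MU = 6.52e-27          # reduced mass from Example 20-4, kg

# Newtonian initial speed from K = mu * v0**2 / 2.
v0 = math.sqrt(2 * K_ENERGY / MU)

# Ratio to the speed of light and the corresponding Lorentz factor.
v_over_c = v0 / C_LIGHT
gamma = 1 / math.sqrt(1 - v_over_c**2)

print(f"v0        = {v0:.3e} m/s")
print(f"v0 / c    = {v_over_c:.3f}")
print(f"gamma - 1 = {gamma - 1:.4f}")
```

The speed is about 5 percent of the speed of light, so $\gamma$ differs from 1 by roughly a tenth of a percent, which suggests the newtonian treatment is adequate at the accuracy of this example.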

For initial coordinates of the alpha particle, you take $x_0 = -3.00 \times 10^{-13}$ m
and $y_0 = 1.60 \times 10^{-14}$ m. This makes its initial distance from the nucleus about
20 times the nuclear radius. So the electric force acting on the alpha particle initially
is about $1/(20)^2 = 1/400$ of its value at the nuclear surface, and it is reasonable to
assume that the alpha particle has not yet lost appreciable kinetic energy. The value
of $y_0$ corresponds to the specified impact parameter.

Finally, you choose a time interval $\Delta t = 1.50 \times 10^{-21}$ s. This choice is such that
the alpha particle would traverse the distance from its initial position to its closest
approach to the nucleus in about 15 time intervals if no electric force were exerted
on it. So you can expect to obtain about 15 points on the incoming part of the trajectory,
enough to give you a good idea of its properties.

You  thus  have  the  following  initial  values  and  parameters  to  enter  in  the 
storage  registers: 

$x_0 = -3 \times 10^{-13}$ (in m); $(dx/dt)_0 = 1.53 \times 10^{7}$ (in m/s); $y_0 = 1.6 \times 10^{-14}$ (in m);
$(dy/dt)_0 = 0$; $t_0 = 0$; $\Delta t = 1.5 \times 10^{-21}$ (in s); $\alpha = -5.6$ (in m$^3$/s$^2$); $\beta = -1.5$.

The  sequence  of  alpha-particle  positions  produced  by  the  calculating  device 
run  with  these  values  is  plotted  as  a series  of  dots  in  Fig.  20-12.  The  alpha  particle  at 
first  is  moving  along  a very  nearly  straight  line,  covering  very  nearly  equal  distances 
in  equal  time  intervals.  Thus  at  first  the  alpha  particle  maintains  a very  nearly  con- 
stant velocity.  This  indicates  that  it  is  not  losing  kinetic  energy,  as  has  been  assumed. 
But  as  the  alpha  particle  approaches  the  gold  nucleus,  the  repulsive  electric  force 
begins  to  act  more  and  more  strongly  on  it  and  slows  it  down  very  considerably,  so 
that  the  alpha  particle  covers  less  distance  in  the  time  interval.  Also  the  force  acts  to 
change  the  direction  of  the  alpha  particle’s  motion  so  that  it  is  deflected  through  a 
large  angle.  The  alpha  particle  then  speeds  up  as  it  leaves  the  vicinity  of  the  nucleus, 
being  pushed  away  by  the  repulsive  force.  The  final  path  of  the  alpha  particle  is  a 
straight line along which it moves at a speed equal to its initial speed. You can measure
the scattering angle $\theta$ directly from the graph. This is the angle between the
final and initial straight-line portions of the trajectory. In this case it has the value
$\theta = 108^\circ$.
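The calculation of this example can be imitated in a few lines of Python. The sketch below integrates Eqs. (20-16) with the force components of Eqs. (20-17), using the parameter values listed above. It is not the program of the Numerical Calculation Supplement: the stepping scheme is a velocity-Verlet method chosen here for convenience, the time step is made 100 times smaller than the $\Delta t$ of the example, and the function names and the stopping radius `r_stop` are hypothetical choices.

```python
import math

ALPHA = -5.60     # m^3/s^2, from Example 20-4 (negative means repulsive)
BETA = -1.5       # exponent for an inverse-square force
V0 = 1.53e7       # initial speed along +x, m/s
X0 = -3.00e-13    # initial x coordinate, m


def acceleration(x, y):
    """Return (Q_x, Q_y) as given by Eqs. (20-17a) and (20-17b)."""
    factor = -ALPHA * (x * x + y * y) ** BETA
    return factor * x, factor * y


def scattering_angle(y0, dt=1.5e-23, r_stop=3.0e-12):
    """Integrate Eqs. (20-16) and return the final direction of motion,
    in degrees, measured from the +x axis (the initial direction)."""
    x, y = X0, y0
    vx, vy = V0, 0.0
    ax, ay = acceleration(x, y)
    # Follow the particle until it is far from the nucleus again.
    while math.hypot(x, y) < r_stop:
        # Velocity-Verlet step (an assumed scheme, not Eqs. 11-21).
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        ax, ay = acceleration(x, y)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
    return math.degrees(math.atan2(vy, vx))


theta = scattering_angle(1.6e-14)
print(f"scattering angle = {theta:.1f} degrees")
```

With these inputs the result should land within a degree or so of the 108° read from the graph. Calling `scattering_angle` with other values of `y0` gives the corresponding angles for other impact parameters; for the largest ones the finite starting distance begins to matter, so close agreement with a graphical reading should not be expected.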


Fig. 20-12 An alpha particle from a radium source scattered by a gold nucleus. The impact
parameter is $1.6 \times 10^{-14}$ m. (Axes: $x$ and $y$, both in units of $10^{-13}$ m.)

Fig. 20-13 Scattering of alpha particles, originating from a radium source, when incident on
a gold nucleus with several impact parameters. (Vertical axis: $y$, in units of $10^{-13}$ m.)


EXAMPLE  20-5 

Repeat the calculation of Example 20-4, but use an impact parameter twice, four
times, eight times, and sixteen times the value $1.6 \times 10^{-14}$ m used in that example.

■ Figure 20-13 adds to the trajectory obtained in Example 20-4 those obtained
here. The trajectory for $y_0 = 3.2 \times 10^{-14}$ m is shown as a series of x’s. The scattering
angle for this trajectory is $\theta = 69^\circ$. For $y_0 = 6.4 \times 10^{-14}$ m, the trajectory is
shown as a series of crosses, and $\theta = 38^\circ$. For $y_0 = 12.8 \times 10^{-14}$ m the trajectory is
plotted as a series of open circles, and $\theta = 17^\circ$. And for $y_0 = 25.6 \times 10^{-14}$ m,
open triangles are used to plot the trajectory, and the scattering angle that results
is $\theta = 6^\circ$.

Why  does  the  scattering  angle  decrease  so  rapidly  with  increasing  impact 
parameter?  What  is  the  connection  between  the  fact  that  the  trajectories  are  hyper- 
bolas and  the  fact  that  the  alpha  particle  is  undergoing  central  force  motion  with  a 
positive  total  energy?  (See  Sec.  11-6.) 


Example 20-5 shows that alpha particles from the radioactive source
used in the experiments of Rutherford and his collaborators are scattered
through angles greater than 6° when passing through gold atoms only if
the impact parameter is less than about $3 \times 10^{-13}$ m. This is the figure we
used in the calculation preceding Example 20-3. There we concluded that
in passing through a gold foil with a thickness typical of that used in the
experiments, about 1 percent of the alpha particles would be scattered
through angles greater than 6°. If you consider the two trajectories in Fig.
20-13 with the smallest impact parameters, you should be able to modify
the calculation to conclude that something like 0.01 percent of the alpha
particles will be scattered through angles greater than 90°. This conclusion
is the central one in Rutherford’s landmark discovery of the existence of
atomic nuclei. The story is recounted in the material in small print that
follows.
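One way to carry out the suggested modification is sketched here. It rests on an assumption not restated in this section: that for a thin foil the fraction $f$ of alpha particles scattered through more than a given angle is proportional to the target area $\pi b^2$ associated with the corresponding impact parameter $b$. The value $b_{90^\circ} \approx 2.3 \times 10^{-14}$ m is a rough interpolation between the 108° and 69° trajectories of Fig. 20-13, not a computed result.

$$\frac{f(\theta > 90^\circ)}{f(\theta > 6^\circ)} \approx \left(\frac{b_{90^\circ}}{b_{6^\circ}}\right)^2 \approx \left(\frac{2.3 \times 10^{-14}\ \mathrm{m}}{3 \times 10^{-13}\ \mathrm{m}}\right)^2 \approx 6 \times 10^{-3}$$

Multiplying the 1 percent figure by this ratio gives $f(\theta > 90^\circ) \approx 0.006$ percent, which is of the same order of magnitude as the 0.01 percent quoted above.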




Fig.  20-14  The  apparatus  used  by 
Rutherford  and  his  collaborators. 


After  he  provided  strong  experimental  support  to  the  existence  of  electrons 
toward  the  close  of  the  nineteenth  century,  J.  J.  Thomson  suggested  that  the  atom 
was constructed something like a spherical “plum pudding,” what Americans
would  call  a “raisin  cake,”  with  the  “raisins”  being  electrons  and  the  “cake” 
being  a continuous  distribution  of  positive  charge.  Since  Thomson  knew  that  the 
mass  of  an  electron  is  very  small  compared  to  the  mass  of  an  atom,  he  knew  that  al- 
most all  the  atomic  mass  was  contained  in  the  positively  charged  material  he  be- 
lieved to  be  distributed  uniformly  over  the  atomic  volume.  Such  an  atom  would 
never  be  able  to  scatter  an  alpha  particle  through  an  appreciable  angle.  Its  elec- 
trons cannot  do  so  because  they  do  not  satisfy  the  comparable-or-greater-mass  re- 
quirement discussed  at  the  beginning  of  this  section.  The  positively  charged  mate- 
rial spread  over  the  atomic  volume  cannot  scatter  an  alpha  particle  because  it  does 
not  satisfy  the  strong-force  requirement.  This  is  because  an  alpha  particle  travel- 
ing through  this  material  is  never  close  enough  at  any  one  time  to  enough  positive 
charge  to  experience  a strong  electric  force.  But  Thomson’s  model  did  predict  that 
the  electrons  would  produce  very  small  angle  scattering  of  alpha  particles  passing 
through  a thin  foil. 

Ernest  Rutherford  was  born  in  New  Zealand,  but  studied  in  England  and  did 
most  of  his  scientific  work  there  and  in  Canada.  Around  1908  he  and  his  German 
associate  Hans  Geiger  (who  later  invented  the  Geiger  counter)  built  the  apparatus 
shown  in  Fig.  20-14  to  study  alpha-particle  scattering.  Inside  the  evacuated 
chamber, a stream of alpha particles emitted by the radium source R passes
through a narrow channel, whose outer end is at D. This “collimates” the alpha
particles into a narrow beam, which is directed onto a thin gold foil F. A scattered
alpha particle striking a small zinc sulfide screen S was detected by using the
microscope M to observe the tiny flash of light produced when it struck S. The
scattering  angle  being  studied  could  be  varied  by  rotating  the  microscope  and 
screen  about  the  vertical  axis. 

Geiger  performed  a very  long  series  of  experiments  in  which  he  counted  the 
number  of  scattered  alpha  particles  as  a function  of  scattering  angle.  But  he  con- 
centrated on  small  scattering  angles  because  this  is  where  the  Thomson  model 
predicted  that  the  scattered  alpha  particles  would  be  found.  Indeed,  Geiger  con- 
firmed that  almost  every  alpha  particle  striking  the  foil  passed  through  it  to 
emerge  on  a path  whose  direction  differed  from  the  direction  of  the  incident  beam 
by  an  angle  less  than  1°. 

Ernest  Marsden,  a 20-year-old  student,  had  just  come  to  the  laboratory,  ready 
to  begin  research.  Geiger  suggested  to  Rutherford  that  it  would  be  good  practice 
for  Marsden  to  look  for  alpha  particles  scattered  through  large  angles.  Neither  of 
them  really  expected  Marsden  to  find  anything  while  he  was  sharpening  his  tech- 
nical skills.  But  within  a few  days  he  had  seen  large-angle  scattering.  In  fact,  he 
found  that  about  0.01  percent  of  the  alpha  particles  incident  upon  the  foil  were 
scattered  at  an  angle  greater  than  90°.  In  other  words,  this  small — but 
nonzero — percentage  of  the  alpha  particles  emerged  from  the  side  of  the  foil  on 
which  they  were  incident.  As  Rutherford  wrote  in  1937,  “It  was  quite  the  most 
incredible  event  ...  in  my  life.  It  was  almost  as  incredible  as  if  you  fired  a 
15-inch  shell  at  a piece  of  tissue  paper  and  it  came  back  and  hit  you.” 

Overcoming  his  amazement  at  the  experimental  results,  Rutherford  went 
through  an  analysis  like  the  one  at  the  beginning  of  this  section  and  concluded 
that  the  results  could  be  explained  only  by  assuming  that  all  the  positive  charge  in 
an atom, and almost all the mass, are concentrated in a very small region he called
the atomic nucleus. (He placed the nucleus at the center of the atom from considerations
of symmetry.) This conclusion constituted the discovery of the nucleus, a
discovery  that  opened  a new  era  in  physics — and  in  world  history. 

In  detailed  calculations  that  he  made  in  1911,  Rutherford  proved  that 
the  probability  of  an  alpha  particle  scattering  at  any  particular  large  angle  is 
proportional  to  the  square  of  the  electric  charge  on  the  nuclei  of  the  atoms 
in  the  foil.  This  charge  is  written  as  Z(+e),  where  Z is  a positive  integer. 
The  relation  was  then  used  by  Rutherford's  collaborators  to  determine 
experimentally  the  value  of  Z for  a variety  of  atoms.  The  measured  values 
of  Z turned  out  to  equal  what  chemists  called  the  atomic  number  of  the 
atom,  that  is,  the  number  ordering  the  atom  in  the  chemical  periodic  table. 
Since in its normal state an atom has no net electric charge, if the nuclear
charge is Z(+e), the total electron charge must be Z(−e). So Z is also the
number  of  electrons  in  an  atom.  Thus  the  experiments  showed,  for  the 
first  time,  that  the  atomic  number  Z of  an  atom  is  the  number  of  electrons  in  the 
atom. 

Rutherford  also  showed  that  the  alpha-particle  scattering  experiments 
lead  to  the  very  important  conclusion  that  the  radius  of  a nucleus  is  smaller 
than that of the atom containing the nucleus by a factor of the order of magnitude of
$10^{-5}$. The way he did this is explained below in small print.

Rutherford’s  analytical  calculations  were  based  on  the  assumption  that  the 
force  acting  between  an  alpha  particle  and  a nucleus  is  always  an  electric  force 
between  two  point  charges — just  as  we  assumed  in  our  numerical  calculations. 
This basic assumption is justified if the charge on the alpha particle is distributed
with spherical symmetry, if the same is true of the charge on the nucleus, and if
the two never begin to overlap. (To be precise, the separation between their surfaces
actually must never be less than about $2 \times 10^{-15}$ m, so that the strong nuclear
force is never exerted between them. Rutherford did not know this.) If the assumption
is satisfied, then the force acting on the alpha particle will be just the electric
force  that  it  would  feel  if  all  its  charge  were  concentrated  at  its  center  and  all  the 
nuclear  charge  were  concentrated  at  the  nuclear  center.  Some  justification  for  this 
statement  was  given  in  small  print  above  Eq.  (20-2)  when  Coulomb's  experiment 
was  discussed.  Proof  is  given  later  in  this  chapter. 

But  if  at  the  distance  of  closest  approach  to  the  nucleus  the  distance  between 
the  centers  of  the  alpha  particle  and  nucleus  is  less  than  the  sum  of  their  radii, 
then  they  are  not  separated  and  the  basic  assumption  is  not  satisfied.  In  such  an 
event  the  measured  scattering  probability  may  be  expected  to  deviate  from  the 
probability  calculated  on  the  basis  of  the  assumption.  No  such  deviations  were 
seen  throughout  the  entire  range  of  scattering  angle  in  the  experiments  carried  out 
on  nuclei  of  high  atomic  number,  such  as  gold,  for  which  Z = 79.  The  reason  is 
that  such  nuclei  are  so  highly  charged  that  the  alpha  particle  cannot  overcome  the 
strong  repulsive  force  exerted  on  it  and  come  close  to  the  nucleus.  If  for  the 
$\theta = 108^\circ$ trajectory of Fig. 20-13 you measure the separation between the center of
the nucleus (the origin) and the center of the alpha particle at its point of closest
approach (the point on the trajectory closest to the origin), you will find that it is
about $46 \times 10^{-15}$ m. For a backward scattering trajectory (that is, one with
$\theta = 180^\circ$) the distance of closest approach will be a minimum, as you can see by
inspecting the trend shown in Fig. 20-12. Rutherford calculated the value of the
minimum distance of closest approach to be $42 \times 10^{-15}$ m. Hence he could conclude
that the agreement between the measured and predicted scattering probabilities
means that the sum of the radii of the alpha particle and the gold nucleus is
less than $42 \times 10^{-15}$ m.
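A minimum distance of this kind can be estimated from energy conservation. The sketch below assumes the standard expression $q_\alpha q_n / 4\pi\epsilon_0 r$ for the potential energy of two point charges, which is not derived in this section, and it uses the kinetic energy of Example 20-4 rather than whatever value Rutherford used; the source does not give his inputs. In a head-on collision the particle stops momentarily where all its kinetic energy $K$ has become potential energy. The function name is an illustrative choice.

```python
K_COULOMB = 8.99e9     # 1/(4*pi*eps0), N*m^2/C^2
E_CHARGE = 1.60e-19    # elementary charge, C
K_ENERGY = 7.63e-13    # alpha-particle kinetic energy of Example 20-4, J


def head_on_closest_approach(z_nucleus, kinetic_energy=K_ENERGY):
    """Center-to-center separation at which the initial kinetic energy
    equals the (assumed) point-charge potential energy."""
    q_alpha = 2 * E_CHARGE
    q_nucleus = z_nucleus * E_CHARGE
    return K_COULOMB * q_alpha * q_nucleus / kinetic_energy


d_gold = head_on_closest_approach(79)
print(f"gold (Z = 79): {d_gold:.2e} m")
```

With these inputs the gold result is about $4.8 \times 10^{-14}$ m, the same order as the $42 \times 10^{-15}$ m quoted above. The same function can be called with the atomic number of any other nucleus.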

Deviations between experiment and theory were seen in the scattering at
angles near 180° of alpha particles from nuclei of aluminum atoms, which have
the low atomic number $Z = 13$. The low charge on the nuclei reduces the repulsion
they exert on the alpha particles, allowing them to come closer, particularly
for scattering angles near $\theta = 180^\circ$. Rutherford correctly interpreted the deviations
to occur because the alpha particle and nucleus overlap slightly at the point
of closest approach. He then calculated the center-to-center separation at this
point for $\theta = 180^\circ$, obtaining $7 \times 10^{-15}$ m. He suggested that this value represents
an estimate of the sum of the radii of an alpha particle and an aluminum nucleus.
A variety of modern experiments (including the scattering of high-energy alpha
particles obtained from accelerators) show that the radii of the alpha particle and
of the aluminum nucleus are about $2 \times 10^{-15}$ m and $3 \times 10^{-15}$ m, respectively.
Their sum is in reasonable agreement with Rutherford’s estimate. (The agreement
is very good if $2 \times 10^{-15}$ m is added to account for the fact that the strong nuclear
force acts if the surfaces are closer than that amount.)

The value of about $3 \times 10^{-15}$ m for the radius of an aluminum nucleus is to be
compared with a value of about $1 \times 10^{-10}$ m for the radius of an aluminum atom.
(The comparison should not be pushed too far since neither a nucleus nor an atom
has a completely abrupt “edge” like a bowling ball. And some nuclei are noticeably
nonspherical, looking more like a football than a bowling ball. So characterizing a
nucleus or an atom as having a particular radius can be only an approximation.)
The two values lead to Rutherford’s conclusion that the radius of a nucleus is
smaller than that of the atom containing the nucleus by a factor of the order of
magnitude $10^{-5}$.


20-4 THE ELECTRIC FIELD AND ELECTRIC FIELD LINES


Fig. 20-15 The electric force $\mathbf{F}_t$ exerted on a positive test charge $q_t$ by a source
charge $q$, which is assumed to be positive. The vector $\mathbf{r}$ gives the location of
the test charge relative to the source charge. The quantity $\mathbf{F}_t/q_t$ is defined to
be the electric field $\boldsymbol{\mathcal{E}}$ of the source charge at location $\mathbf{r}$.


The electric field of a charged body is closely related to the electric force that
it exerts on any other charged body. In this section we introduce the electric
field as a device that makes the computation of the electric force more
convenient. Then we show how an electric field can be represented pictorially
in terms of what are called electric field lines. In Sec. 20-5 the properties
of electric field lines are used in a simple way to develop a very powerful
computational and conceptual principle known as Gauss’ law.

In Fig. 20-15 a fixed point charge $q$ is located at a certain position in
space. As a result of the presence of $q$, any other point charge $q_t$ will experience
a force given by Coulomb’s law. The magnitude and direction of this
force $\mathbf{F}_t$ will depend on the position vector $\mathbf{r}$ of $q_t$ relative to $q$. Depending
on the signs of $q$ and $q_t$, the direction of $\mathbf{F}_t$ is always either the same as or
opposite to the direction of $\mathbf{r}$ itself. In principle, we can determine its magnitude
$F_t$ by attaching a small spring scale to $q_t$ and moving it around so as
to measure $F_t$ as a function of $r$, the magnitude of its position vector. Also,
$F_t$ depends in direct proportion on the magnitude of $q_t$. If, for example, we
replace $q_t$ with another point charge $q'_t = q_t/2$, then at every position the
force on $q'_t$ will be just half that on $q_t$.

If we are mainly interested in exploring the effect of the point charge
$q$, which we call the source charge, we are not very interested in the value
of the point charge $q_t$, which we call the test charge. So let us divide $q_t$ out,
rewriting Coulomb’s law, $\mathbf{F}_t = (1/4\pi\epsilon_0)(qq_t/r^2)\hat{\mathbf{r}}$, in a form which gives
the force on the test charge per unit charge of the test charge. We have

$$\frac{\mathbf{F}_t}{q_t} = \frac{1}{4\pi\epsilon_0}\,\frac{q}{r^2}\,\hat{\mathbf{r}} \tag{20-18}$$

In this equation $\mathbf{F}_t$ is the force exerted on the test charge $q_t$ by the source
charge $q$, $r$ is the distance from the source charge to the test charge, and $\hat{\mathbf{r}}$ is
a unit vector in the direction from the source charge to the test charge. The
force per unit charge acting on a test charge located at some point is known
as the electric field $\boldsymbol{\mathcal{E}}$ at that point.




The use of this name is actually something of a misnomer. The electric field,
as you will see later in this chapter, is an entity which extends over all space. The
quantity $\boldsymbol{\mathcal{E}}$ is one of its most important properties. The official name of the quantity
$\boldsymbol{\mathcal{E}}$ is “electric field intensity.” Nevertheless, the name “electric field” is loosely
but universally employed for $\boldsymbol{\mathcal{E}}$. We adopt this name because we wish to reserve
the term “intensity” for a quite different class of quantities.

The value of the electric field $\boldsymbol{\mathcal{E}}$ is given by the definition

$$\boldsymbol{\mathcal{E}} \equiv \frac{\mathbf{F}_t}{q_t} \tag{20-19}$$


The electric field $\boldsymbol{\mathcal{E}}$ is a vector, since it is defined as a vector divided by
a scalar. The SI unit of its magnitude is newtons per coulomb (N/C).
Both the magnitude and the direction of $\boldsymbol{\mathcal{E}}$ at any location are determined
by Eq. (20-19). The direction is the same as that of the electric force acting
on a positive test charge located there. Note that although a test charge is intimately
involved in the definition of the electric field, the electric field is
nevertheless a property of the source charge only. It does not depend at all
on the test charge because the value $q_t$ of this charge has been removed
from $\boldsymbol{\mathcal{E}}$ by defining it as $\mathbf{F}_t$ divided by $q_t$. Thus the electric field is associated
with the source charge, not with the test charge.

But to determine experimentally the value of $\boldsymbol{\mathcal{E}}$ at some location in the space
surrounding the source charge, we must place a test charge at that location, measure
the force $\mathbf{F}_t$ exerted on it by the source charge, and then divide by $q_t$. In doing
this, we must be very sure that the equal but opposite reaction force exerted
by the test charge on the source charge does not cause the source charge to move.
Any movement would make $\boldsymbol{\mathcal{E}}$ have a value at the measurement location different
from the value it had there before the measurement. This restriction becomes particularly
significant when we try to survey the electric field associated with
source charges distributed on the surface of a conductor. If the test charge is too
big, it may result in a redistribution of the source charges.

We will call the electric field associated with a source charge the “electric
field of the source charge.” The electric field of a single point-source charge
$q$ can be found by combining Coulomb’s law, Eq. (20-18), and the definition
of electric field, Eq. (20-19), to obtain a form of Coulomb’s law that involves
the electric field explicitly. It is

$$\boldsymbol{\mathcal{E}} = \frac{1}{4\pi\epsilon_0}\,\frac{q}{r^2}\,\hat{\mathbf{r}} \tag{20-20}$$

If $q$ is positive, $\boldsymbol{\mathcal{E}}$ is everywhere directed away from the point-source
charge; if $q$ is negative, $\boldsymbol{\mathcal{E}}$ is everywhere directed toward the charge. Its
magnitude $\mathcal{E}$ at a distance $r$ from the charge is $q/4\pi\epsilon_0 r^2$.

To find the electric field of a set of $n$ point-source charges, we take advantage
of the experimental fact that the electric forces exerted on a test
charge $q_t$ by a number of other charges $q_1, q_2, \ldots, q_j, \ldots, q_n$ add vectorially,
just as is the case for all other types of forces. That is, we can write
the force $\mathbf{F}_t$ on $q_t$ as the sum

$$\mathbf{F}_t = \mathbf{F}_{t1} + \mathbf{F}_{t2} + \cdots + \mathbf{F}_{tj} + \cdots + \mathbf{F}_{tn} \tag{20-21}$$


Fig. 20-16 (a) A charge distribution and a charge $q$ at point $P$. (b) To find the
force $\mathbf{F}$ exerted by the charge distribution on the charge $q$, we first find the
electric field $\boldsymbol{\mathcal{E}}$ of the charge distribution at $P$. In this step Eq. (20-25) is used, and
there is no reference to charge $q$. (c) Then we find $\mathbf{F}$ from $\boldsymbol{\mathcal{E}}$ and $q$. In this
step Eq. (20-26) is used, and there is no reference to the charge distribution.


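Here $\mathbf{F}_{tj}$ is the force exerted on $q_t$ by the presence of the source charge $q_j$.
Evaluating the terms such as $\mathbf{F}_{tj}$ by means of Coulomb’s law, we have

$$\mathbf{F}_t = \frac{1}{4\pi\epsilon_0}\left(\frac{q_1 q_t}{r_{1t}^2}\,\hat{\mathbf{r}}_{1t} + \frac{q_2 q_t}{r_{2t}^2}\,\hat{\mathbf{r}}_{2t} + \cdots + \frac{q_j q_t}{r_{jt}^2}\,\hat{\mathbf{r}}_{jt} + \cdots + \frac{q_n q_t}{r_{nt}^2}\,\hat{\mathbf{r}}_{nt}\right) \tag{20-22}$$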


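where $\hat{\mathbf{r}}_{jt}$ is the unit vector in the direction from $q_j$ to $q_t$ and $r_{jt}$ is the distance
between them. Dividing by the common factor $q_t$ gives
us an expression for $\boldsymbol{\mathcal{E}}$, the electric field of the set of point-source charges:

$$\boldsymbol{\mathcal{E}} = \frac{\mathbf{F}_t}{q_t} = \frac{1}{4\pi\epsilon_0}\left(\frac{q_1}{r_{1t}^2}\,\hat{\mathbf{r}}_{1t} + \frac{q_2}{r_{2t}^2}\,\hat{\mathbf{r}}_{2t} + \cdots + \frac{q_j}{r_{jt}^2}\,\hat{\mathbf{r}}_{jt} + \cdots + \frac{q_n}{r_{nt}^2}\,\hat{\mathbf{r}}_{nt}\right) \tag{20-23}$$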


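This can be written in terms of the electric fields $\boldsymbol{\mathcal{E}}_j$ of the individual source
charges $q_j$. To do so, we use Eq. (20-20) and obtain

$$\boldsymbol{\mathcal{E}} = \boldsymbol{\mathcal{E}}_1 + \boldsymbol{\mathcal{E}}_2 + \cdots + \boldsymbol{\mathcal{E}}_j + \cdots + \boldsymbol{\mathcal{E}}_n \tag{20-24}$$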


Thus electric fields combine vectorially, just as electric forces combine according
to the vector addition of Eq. (20-21). In summation notation, Eq.
(20-23) assumes the form

$$\boldsymbol{\mathcal{E}} = \frac{1}{4\pi\epsilon_0}\sum_{j=1}^{n}\frac{q_j}{r_{jt}^2}\,\hat{\mathbf{r}}_{jt} \tag{20-25}$$


Equation (20-25) allows us to calculate the electric field $\boldsymbol{\mathcal{E}}$ of a set of
point-source charges at all positions in space.

Then the electric field at any position can be used to calculate the electric
force exerted on a charge located at that position. This force $\mathbf{F}$ is the
force per unit charge acting on a charge at that position, $\boldsymbol{\mathcal{E}}$, multiplied by
the amount of its charge, $q$. That is,

$$\mathbf{F} = q\boldsymbol{\mathcal{E}} \tag{20-26}$$

[This is just Eq. (20-19) with the subscript $t$ dropped because the charge $q$
on which the force is exerted is not a test charge.]


Thus the force exerted by a distribution of charges on some other
charge is calculated in two steps. First, Eq. (20-25) is used to evaluate the
electric field at the location of the other charge. Second, Eq. (20-26) is used
to evaluate the electric force on that charge resulting from the electric field
at its location. The computational convenience of this two-step procedure is
that once $\boldsymbol{\mathcal{E}}$ has been calculated from Eq. (20-25) for a certain set of
charges, it never has to be calculated again. When $\boldsymbol{\mathcal{E}}$ is known, Eq. (20-26)
can be used to calculate immediately the force $\mathbf{F}$ acting on any other charge
$q$, no matter what its sign or magnitude. The procedure is indicated schematically
in Fig. 20-16.
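The two-step procedure can be illustrated with a short Python sketch. The function names, the two-dimensional layout, and the particular source charges are hypothetical choices made for illustration; they are not taken from any example in the text. The first function evaluates the sum of Eq. (20-25) at a point, and the second applies Eq. (20-26).

```python
import math

K_COULOMB = 8.99e9  # 1/(4*pi*eps0), N*m^2/C^2


def electric_field(sources, point):
    """Step 1, Eq. (20-25): vector sum of the point-charge fields at `point`.

    `sources` is a list of (charge, (x, y)) pairs; the result is (E_x, E_y) in N/C.
    """
    ex = ey = 0.0
    px, py = point
    for q, (sx, sy) in sources:
        dx, dy = px - sx, py - sy       # vector from the source charge to the point
        r = math.hypot(dx, dy)
        scale = K_COULOMB * q / r**3    # q/r^2, with an extra 1/r for the unit vector
        ex += scale * dx
        ey += scale * dy
    return ex, ey


def force_on_charge(q, field):
    """Step 2, Eq. (20-26): the force on a charge q placed where the field is known."""
    return q * field[0], q * field[1]


# Hypothetical layout: +1 microcoulomb at the origin, -1 microcoulomb at x = 2 m.
sources = [(1.0e-6, (0.0, 0.0)), (-1.0e-6, (2.0, 0.0))]

field_mid = electric_field(sources, (1.0, 0.0))   # evaluated once
for q in (1.0e-6, -4.0e-6):                       # reused for any charge
    print(q, force_on_charge(q, field_mid))
```

The field at the midpoint is computed only once; the forces on charges of different sign and magnitude then follow from a single multiplication each, which is the convenience described above.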

If  only  a single  point  charge  is  the  source  of  the  electric  field  that  acts  on  an- 
other point  charge,  there  does  not  seem  to  be  much  practical  advantage  in  using 
the  two-step  procedure  instead  of  a direct  application  of  Coulomb’s  law.  This  may 
be  true  for  the  cases  presently  considered,  where  at  least  one  of  the  charges  always 
is  at  rest  with  respect  to  the  observer.  But  in  more  general  cases  involving  a pair  of 
point  charges,  the  two-step  procedure  made  possible  by  introducing  the  electric 
field  (and  also  the  magnetic  field  present  in  such  cases)  can  be  a necessity — not 
just  a convenience. 

Coulomb’s  law  describes  the  electric  forces  that  two  separated  point  charges 
exert  on  each  other.  If  you  reread  the  material  at  the  end  of  Sec.  4-5,  you  will  be 
reminded that such “action at a distance” forces fail to satisfy Newton’s law of action
and reaction if there is an abrupt change in a characteristic of one of the
charges  that  is  important  to  the  interaction  between  the  two.  Such  is  the  case  if 
one  charge  is  given  a sequence  of  accelerations  which  set  it  into  oscillatory  mo- 
tion. If  this  happens,  then  its  interaction  with  the  other  charge  will  cause  that 
charge  to  start  oscillating  too — but  only  after  a certain  delay.  Because  of  the  delay, 
the  force  which  the  first  charge  exerts  on  the  second  is  not  accompanied  by  an 
equal  but  opposite  force  exerted  by  the  second  charge  on  the  first,  in  violation  of 
the  law  of  action  and  reaction.  This  is  very  serious  since  the  law  follows  directly 
from  the  fundamental  law  of  momentum  conservation  and  the  fundamental  defi- 
nition of  force  as  rate  of  change  of  momentum. 

The  material  at  the  end  of  Sec.  4-5  sketches  the  way  that  the  difficulty  is  elimi- 
nated by  going  from  the  idea  that  two  charges  interact  directly  through  action- 
at-a-distance  forces  to  the  idea  that  they  interact  indirectly  by  means  of  a field.  In 
this  two-step  process,  the  oscillating  charge  interacts  with  its  own  field  and  sets 
up  an  oscillation  in  the  field.  The  oscillation  propagates  through  the  field,  eventu- 
ally arriving  at  the  other  charge.  There  an  interaction  occurs  between  the  field  and 
that  charge,  which  sets  the  charge  into  oscillation.  At  each  step  the  interaction 
takes  place  at  a particular  location — just  like  an  interaction  between  two  pucks 
colliding  on  an  air  table — and  a pair  of  forces  is  exerted  between  a charge  and  a 
field  which  do  satisfy  the  law  of  action  and  reaction.  In  this  picture  the  field 
carries  momentum  from  one  charge  to  the  other  by  means  of  the  photons  men- 
tioned at  the  end  of  Sec.  20-1.  Because  momentum  is  transferred  from  one  to 
the  other,  the  two  charges  exert  forces  on  each  other.  But  there  is  a delay  in- 
volved because  photons  do  not  travel  with  infinite  speed. 

By  introducing  the  concept  of  electric  field  and  the  associated  two-step  proce- 
dure in  the  simple  situation  we  deal  with  here,  we  are  laying  foundations  essential 
to  understanding  the  more  complicated  situations  we  treat  later. 

In  Example  20-6  the  electric  field  of  a set  of  point  charges  is  evaluated, 
and  then  used  to  calculate  the  electric  force  that  is  exerted  on  another 
charge. 


EXAMPLE 20-6

a. Three point charges, $q_1 = +1.00 \times 10^{-6}$ C, $q_2 = -2.00 \times 10^{-6}$ C, and
$q_3 = +3.00 \times 10^{-6}$ C, are fixed rigidly at the vertices of an isosceles triangle, as
shown in Fig. 20-17. Find the electric field $\boldsymbol{\mathcal{E}}$ at the midpoint $P$ of the base of
the triangle.

b. A point charge $q = -4.00 \times 10^{-6}$ C is moved to $P$. What electric force $\mathbf{F}$ acts
on this charge?

Fig. 20-17 Illustration for Example 20-6.


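■ a. You use Eq. (20-25), with $n = 3$, to write

$$\boldsymbol{\mathcal{E}} = \frac{1}{4\pi\epsilon_0}\left(\frac{q_1}{r_{1t}^2}\,\hat{\mathbf{r}}_{1t} + \frac{q_2}{r_{2t}^2}\,\hat{\mathbf{r}}_{2t} + \frac{q_3}{r_{3t}^2}\,\hat{\mathbf{r}}_{3t}\right)$$

Then define $x$ and $y$ axes as shown in the figure so that you can express $\mathbf{r}_{1t}$, the
vector from $q_1$ to the point $P$ where the field is to be evaluated, as
$\mathbf{r}_{1t} = (0.200\ \mathrm{m})(+\hat{\mathbf{x}})$, with $\hat{\mathbf{x}}$ being a unit vector in the positive $x$ direction.
Similarly, you express the other two vectors as $\mathbf{r}_{2t} = (0.200\ \mathrm{m})(-\hat{\mathbf{x}})$ and
$\mathbf{r}_{3t} = (0.300\ \mathrm{m})(-\hat{\mathbf{y}})$. Then you have

$$\boldsymbol{\mathcal{E}} = 8.99 \times 10^{9}\ \mathrm{N\cdot m^2/C^2}\left[\frac{+1.00 \times 10^{-6}\ \mathrm{C}}{(0.200\ \mathrm{m})^2}(+\hat{\mathbf{x}}) + \frac{-2.00 \times 10^{-6}\ \mathrm{C}}{(0.200\ \mathrm{m})^2}(-\hat{\mathbf{x}}) + \frac{+3.00 \times 10^{-6}\ \mathrm{C}}{(0.300\ \mathrm{m})^2}(-\hat{\mathbf{y}})\right]$$

$$= 8.99 \times 10^{9} \times \left(7.50 \times 10^{-5}\,\hat{\mathbf{x}} - 3.33 \times 10^{-5}\,\hat{\mathbf{y}}\right)\ \mathrm{N/C}$$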

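The magnitude of the electric field is obtained by using the Pythagorean theorem:

$$\mathcal{E} = 8.99 \times 10^{9} \times \left[(7.50 \times 10^{-5})^2 + (3.33 \times 10^{-5})^2\right]^{1/2}\ \mathrm{N/C} = 7.39 \times 10^{5}\ \mathrm{N/C}$$

The direction of the electric field can be specified in terms of the angle $\theta_{\mathcal{E}}$ between
its direction and the positive $x$ axis. Using the definition of the tangent of an angle,
you find its value to be

$$\theta_{\mathcal{E}} = \tan^{-1}\frac{\mathcal{E}_y}{\mathcal{E}_x} = \tan^{-1}\left(\frac{-3.33 \times 10^{-5}}{7.50 \times 10^{-5}}\right) = -23.9°$$

The direction of $\boldsymbol{\mathcal{E}}$ is shown in the figure.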

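b. If you know $\boldsymbol{\mathcal{E}}$ at the point $P$, calculating the force $\mathbf{F}$ which is exerted on a
charge $q$ placed at that point is simply a matter of using Eq. (20-26),

$$\mathbf{F} = q\boldsymbol{\mathcal{E}}$$

For the magnitude of $\mathbf{F}$ you have

$$F = |q|\,\mathcal{E} = 4.00 \times 10^{-6}\ \mathrm{C} \times 7.39 \times 10^{5}\ \mathrm{N/C} = 2.96\ \mathrm{N}$$

The direction of $\mathbf{F}$ is opposite to the direction of $\boldsymbol{\mathcal{E}}$, as shown in the figure, since $q$
has a negative value. Hence the angle $\theta_F$ between $\mathbf{F}$ and the positive $x$ axis is

$$\theta_F = 180° - |\theta_{\mathcal{E}}| = 180.0° - 23.9° = 156.1°$$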
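The two-step procedure of Example 20-6 (first sum the fields of the source charges, then multiply by the charge placed at $P$) can be sketched numerically. The Python below is only an illustration: the function name `field_at` and the choice of putting $P$ at the origin are assumptions made here, with the source positions read off from the vectors $\mathbf{r}_{1t}$, $\mathbf{r}_{2t}$, and $\mathbf{r}_{3t}$ given in the example.

```python
import math

K = 8.99e9  # 1/(4*pi*eps0) in N*m^2/C^2, the rounded value used in the example

def field_at(point, sources):
    """Return (Ex, Ey) at `point` by summing the Coulomb fields of point charges.

    `sources` is a list of (charge_in_C, (x_m, y_m)) pairs.
    """
    px, py = point
    ex = ey = 0.0
    for q, (sx, sy) in sources:
        dx, dy = px - sx, py - sy      # vector from the source charge to the field point
        r = math.hypot(dx, dy)
        ex += K * q * dx / r**3        # (q / r^2) times the x component of the unit vector
        ey += K * q * dy / r**3
    return ex, ey

# Assumed coordinates: P at the origin, source positions inferred from r_1t, r_2t, r_3t.
sources = [
    (+1.00e-6, (-0.200, 0.0)),   # q1, so that r_1t = (0.200 m)(+x)
    (-2.00e-6, (+0.200, 0.0)),   # q2, so that r_2t = (0.200 m)(-x)
    (+3.00e-6, (0.0, +0.300)),   # q3, so that r_3t = (0.300 m)(-y)
]

# Step 1: the field at P due to the three source charges.
ex, ey = field_at((0.0, 0.0), sources)
e_mag = math.hypot(ex, ey)
theta_e = math.degrees(math.atan2(ey, ex))

# Step 2: the force on the charge q placed at P, F = q E.
q_test = -4.00e-6
fx, fy = q_test * ex, q_test * ey
f_mag = math.hypot(fx, fy)
theta_f = math.degrees(math.atan2(fy, fx))

print(f"E = {e_mag:.3e} N/C at {theta_e:.1f} deg from +x")
print(f"F = {f_mag:.2f} N at {theta_f:.1f} deg from +x")
```

Because the script carries full precision through every step, its results can differ from the hand calculation in the third significant figure; the directions agree with those found in the example.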


It often happens in electrical studies that the situation involves a set of
source charges whose distribution can be regarded as continuous. The electric
field of such a distribution can be calculated by using the methods of
integral calculus. Figure 20-18 shows a region of space over which an electric
charge $q$ is distributed. We wish to find the electric field $\boldsymbol{\mathcal{E}}$ at point $P$,
where a test charge $q_t$ is located. As is usual in such situations, we divide the
distributed charge into infinitesimal elements $dq$, as shown in the figure.
Each of these elements can be considered as a point charge. So we can use
Coulomb’s law to write its contribution $d\mathbf{F}$ to the total force acting on $q_t$ as

$$d\mathbf{F} = \frac{1}{4\pi\epsilon_0}\,\frac{q_t\,dq}{r^2}\,\hat{\mathbf{r}}$$

Fig. 20-18 A continuous distribution of charge. The total charge $q$ is subdivided
in imagination into infinitesimal parts $dq$ contained in infinitesimal regions, one of
which is shown. Since these regions approximate points, Coulomb’s law can be
applied to calculating the electric field at the arbitrary point $P$ outside the charge
distribution.

Dividing $d\mathbf{F}$ by the amount $q_t$ of charge on the test charge, we find the force
per unit charge on the test charge. This is just $d\boldsymbol{\mathcal{E}}$, the contribution of $dq$ to
the electric field at the location of the test charge. In other words,
$d\boldsymbol{\mathcal{E}} = d\mathbf{F}/q_t$, so that

$$d\boldsymbol{\mathcal{E}} = \frac{1}{4\pi\epsilon_0}\,\frac{dq}{r^2}\,\hat{\mathbf{r}} \tag{20-27a}$$

Next we integrate over the source charge distribution to sum the individual
contributions of all the elements of charge in the distribution. We obtain

$$\int_{\text{charge distribution}} d\boldsymbol{\mathcal{E}} = \frac{1}{4\pi\epsilon_0}\int_{\text{charge distribution}} \frac{dq}{r^2}\,\hat{\mathbf{r}}$$

The integral on the left side is just the electric field $\boldsymbol{\mathcal{E}}$ of all the
continuously distributed charge $q$. So we have

$$\boldsymbol{\mathcal{E}} = \frac{1}{4\pi\epsilon_0}\int_{\text{charge distribution}} \frac{dq}{r^2}\,\hat{\mathbf{r}} \tag{20-27b}$$

You should compare this equation with Eq. (20-25), which is the analogous
summation for a set of discrete charges.

Because the integral on the right side of Eq. (20-27b) is an integral of a
vector quantity, it can be quite difficult to evaluate unless the charge
distribution has a high degree of symmetry. Example 20-7 considers a
distribution of charge which is symmetrical about a line, which makes it relatively
easy to use Eq. (20-27b) to evaluate its electric field anywhere along that line.
In Sec. 20-5 we develop a method of evaluating electric fields of certain
symmetrical charge distributions which exploits their symmetry in a very
effective way and makes it easy to evaluate these fields. In Chap. 21 an
indirect method is developed that applies to continuous distributions of any
type. But it involves integration of a scalar quantity, so it is not too difficult
to use for asymmetric charge distributions.


EXAMPLE  20-7 

Figure 20-19 shows a circular loop of radius $R$, made of fine copper wire. The
conducting loop is given a macroscopic positive charge $q$. Find the electric field $\boldsymbol{\mathcal{E}}$
at a point $P$ on the axis of the loop at a distance $z$ from its center.

Fig. 20-19 A loop of copper wire having a positive charge $q$. The shading
indicates the continuous distribution of charge.

■ Because a large number of charges have been added to the wire, you can
consider them to be distributed continuously over its surface. Since all regions of the
symmetrical loop are equivalent, the charge distribution is uniform around the
loop. With this distribution, the ratio of the charge $dq$ on an infinitesimal segment of
the loop to the total charge $q$ equals the ratio of the length $ds$ of the segment to the
total length $2\pi R$ of the loop. That is,

$$\frac{dq}{q} = \frac{ds}{2\pi R}$$

Writing $dq$ in terms of the length element $ds$, you have

$$dq = \frac{q}{2\pi R}\,ds$$


This infinitesimal element of charge, taken by itself, has associated with it an
infinitesimal field $d\boldsymbol{\mathcal{E}}$ at the axial point $P$. The direction of $d\boldsymbol{\mathcal{E}}$ is away from its
source $dq$, as shown in the figure. You must add to this element of electric field all
the others associated with infinitesimal elements of charge elsewhere on the loop.
The symmetry of the system can be exploited to make this task easier.

To exploit the symmetry, you first express the vector $d\boldsymbol{\mathcal{E}}$ in terms of its
components $d\mathcal{E}_z$ and $d\mathcal{E}_\perp$. As shown in the figure, the component $d\mathcal{E}_z$ is along the
axis of the loop, and the component $d\mathcal{E}_\perp$ is perpendicular to the axis and lies in the
plane containing $P$, $ds$, and the center of the loop. Next you note that in summing all
the contributions to the electric field of the loop, each component $d\mathcal{E}_\perp$ will be
exactly canceled by an equal but opposite component $d\mathcal{E}'_\perp$ of the field associated
with the charge $dq'$ in the element $ds'$ of the loop directly opposite to the element $ds$.
Thus the total electric field at $P$ will be the sum of contributions of only the $z$
components of the infinitesimal fields, $d\mathcal{E}_z$.

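You therefore need calculate only the value of $d\mathcal{E}_z$. The figure shows that it is
related to the magnitude $d\mathcal{E}$ of the infinitesimal field $d\boldsymbol{\mathcal{E}}$ as follows:

$$d\mathcal{E}_z = d\mathcal{E}\cos\phi = d\mathcal{E}\,\frac{z}{r} = d\mathcal{E}\,\frac{z}{(z^2 + R^2)^{1/2}}$$

Using Eq. (20-27a) to evaluate $d\mathcal{E}$, you have

$$d\mathcal{E} = \frac{1}{4\pi\epsilon_0}\,\frac{q/2\pi R}{r^2}\,ds = \frac{q}{8\pi^2\epsilon_0 R}\,\frac{1}{z^2 + R^2}\,ds$$

Hence

$$d\mathcal{E}_z = \frac{qz}{8\pi^2\epsilon_0 R\,(z^2 + R^2)^{3/2}}\,ds$$

To find $\mathcal{E}_z$, you must integrate $d\mathcal{E}_z$ around the loop. Since the entire factor
multiplying $ds$ on the right side of the equation is a constant as far as this integration
is concerned, you have

$$\mathcal{E}_z = \int_{\text{loop}} d\mathcal{E}_z = \frac{qz}{8\pi^2\epsilon_0 R\,(z^2 + R^2)^{3/2}}\int_{\text{loop}} ds$$

But integrating the length element $ds$ around the loop just gives you the
circumference $2\pi R$ of the loop. So you obtain

$$\boldsymbol{\mathcal{E}} = \mathcal{E}_z\,\hat{\mathbf{z}} = \frac{qz\,2\pi R}{8\pi^2\epsilon_0 R\,(z^2 + R^2)^{3/2}}\,\hat{\mathbf{z}}$$

or

$$\boldsymbol{\mathcal{E}} = \frac{1}{4\pi\epsilon_0}\,\frac{qz}{(z^2 + R^2)^{3/2}}\,\hat{\mathbf{z}} \tag{20-28}$$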


Here $\hat{\mathbf{z}}$ is a unit vector in the direction along the axis from the loop to the point
where the field is evaluated, as shown in the figure.

To check this result, consider the extreme cases $z = 0$ and $z \gg R$. In the first
case you get $\boldsymbol{\mathcal{E}} = 0$. That is, Eq. (20-28) predicts that the electric field at the center
of the loop will be zero. This is certainly what you would expect on the basis of
symmetry. For $z = 0$ all the electric field elements $d\boldsymbol{\mathcal{E}}$ lie entirely in the plane of the
loop, and they cancel exactly in pairs, as you have seen above. To put it even more
simply, a positive test charge at the center of the loop feels no net electric force
because it is repelled equally by all the elements of positive charge distributed
symmetrically around it.

In the case $z \gg R$, the term $R^2$ in the denominator of Eq. (20-28) is negligible
compared with $z^2$, and the equation predicts that

$$\boldsymbol{\mathcal{E}} \simeq \frac{1}{4\pi\epsilon_0}\,\frac{q}{z^2}\,\hat{\mathbf{z}}$$

This is the same result as that which would be obtained if all the charge in the loop
were located at its center. This makes sense since the electric force acting on a test
charge at a very great distance from the loop should be the same whether the source
charge is distributed around the loop or concentrated at its center.
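The result of Example 20-7 and its two limiting cases can also be checked numerically by replacing the integral of Eq. (20-27b) with a sum over many small segments of the loop. The sketch below is illustrative only: the function names, the number of segments, and the values chosen for the charge, the loop radius, and the distances are all assumptions, not values from the example.

```python
import math

EPS0 = 8.854e-12                      # permittivity of free space, C^2/(N*m^2)
K = 1.0 / (4.0 * math.pi * EPS0)

def ring_field_numeric(q, radius, z, n=2000):
    """Approximate the field at (0, 0, z) of a uniformly charged ring in the plane z = 0
    by summing the point-charge fields of n equal segments, each carrying dq = q/n."""
    dq = q / n
    ex = ey = ez = 0.0
    for i in range(n):
        phi = 2.0 * math.pi * (i + 0.5) / n
        sx, sy = radius * math.cos(phi), radius * math.sin(phi)
        dx, dy, dz = -sx, -sy, z                  # vector from the segment to the field point
        r3 = (dx * dx + dy * dy + dz * dz) ** 1.5
        ex += K * dq * dx / r3
        ey += K * dq * dy / r3
        ez += K * dq * dz / r3
    return ex, ey, ez

def ring_field_axis(q, radius, z):
    """Axial field magnitude from the closed-form result, Eq. (20-28)."""
    return K * q * z / (z * z + radius * radius) ** 1.5

q, radius = 2.0e-9, 0.10              # hypothetical charge (C) and loop radius (m)

ex, ey, ez = ring_field_numeric(q, radius, 0.05)
ez_exact = ring_field_axis(q, radius, 0.05)
print("perpendicular components:", ex, ey)        # these cancel in pairs
print("axial component, sum vs Eq. (20-28):", ez, ez_exact)

ez_center = ring_field_numeric(q, radius, 0.0)[2]
print("field at the center:", ez_center)

z_far = 100.0 * radius
far_ratio = ring_field_axis(q, radius, z_far) / (K * q / z_far**2)
print("ratio to the point-charge field at z >> R:", far_ratio)
```

The perpendicular components of the sum vanish to within rounding error, which mirrors the pairwise cancellation argument, and the ratio printed last approaches 1 as the distance grows compared with the loop radius.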

The electric field of one or more source charges is a vector field. That
is, at each point in the space surrounding the source of the field, the field is
described by a vector $\boldsymbol{\mathcal{E}}$ of a certain magnitude and a certain direction. A
direct pictorial representation of an electric field is difficult to create
because it involves constructing a vector at each of a large number of
representative points in the field. And in such a representation there are so many
different vectors that it is difficult to interpret their significance. The
British investigator of electrical phenomena Michael Faraday (1791-1867)
introduced a way to picture an electric field very conveniently in terms of
what he called lines of force but are called more accurately electric field
lines.

The properties of electric field lines are as follows:

1. An electric field line emanates from a point charge that is a source
of the electric field $\boldsymbol{\mathcal{E}}$ which is represented in part by the line.

2. An electric field line is a directed line having at every position along
the line the same direction as $\boldsymbol{\mathcal{E}}$ at that position.

3. Electric field lines emanate from a point charge symmetrically in all
directions.

4. The total number of electric field lines emanating from a point
charge is proportional to the magnitude $|q|$ of that charge. (The value of
the proportionality constant is chosen arbitrarily so as to provide the
clearest pictorial representation of the electric field.)

5. The number of electric field lines per unit area crossing an
imaginary surface normal to the direction of the lines at any location is
proportional to the magnitude of $\boldsymbol{\mathcal{E}}$ at that location. (The value of the
proportionality constant is arbitrary because the total number of lines emanating from
the charge is arbitrary.)

Fig. 20-20 Electric field lines for a positive point charge $q$. The spherical
surface of radius $r$ is used in the text to show that the number of lines per unit
area crossing an imaginary surface normal to the direction of the lines is
proportional to the magnitude of the electric field.

Fig. 20-21 A two-dimensional representation of the electric field lines for
(a) a positive point charge, (b) a negative point charge.

Fig. 20-22 A two-dimensional representation of the electric field lines for a
set of two separated point charges. Their signs are opposite, but their
magnitudes are the same. In the absence of other charges, none of the field lines
terminate. Instead they extend continuously from the positive charge to the
negative charge.

Figure 20-20 shows the electric field lines that represent an electric
field $\boldsymbol{\mathcal{E}}$ whose source is a positive point charge $q$. In agreement with
properties 1 and 2, each line is everywhere directed away from the positive charge
because $\boldsymbol{\mathcal{E}}$ everywhere is in that direction. The lines emanate uniformly in
all directions from the point charge. This is required by property 3, whose
justification is simply that space is symmetrically disposed about a point
charge in all directions. As is allowed by property 4, the total number of
lines used in the figure was chosen so that there are enough to make what
goes on clear, but not so many as to make the figure difficult to construct or
to interpret. To see that property 5 is satisfied, imagine a sphere of radius $r$
centered on the source charge, as in the figure. The surface area of the
sphere is $4\pi r^2$. Its surface is everywhere normal to the lines crossing the
surface. So at any location on the sphere the number of lines crossing the
surface per unit area is just the total number $N$ of lines divided by the total
surface area $4\pi r^2$. Property 5 says that $\mathcal{E} \propto N/4\pi r^2$. Since $N$ is a constant,
this is equivalent to the proportionality $\mathcal{E} \propto 1/r^2$. Is this proportionality
correct? You can see immediately that it is from the form of Coulomb’s
law given in Eq. (20-20), $\mathcal{E} = q/4\pi\epsilon_0 r^2$. Since $q/4\pi\epsilon_0$ is a constant, it is
evident that property 5 of electric field lines correctly describes the way
that the magnitude of the electric field of a point-source charge varies
in proportion to the inverse of the square of the distance from the charge.

Because the pattern of field lines for a point charge is the same in any
plane passing through the charge, it really is not necessary to use three
dimensions to show the pattern. Thus Fig. 20-21a depicts the pattern in Fig.
20-20 by showing its cross section in the plane of the page.

The electric field lines in Fig. 20-21b are those of a point charge whose
magnitude is the same as that of the one in Fig. 20-21a but whose sign is
negative. The pattern is the same except for the fact that the field lines are
everywhere directed toward the negative charge since $\boldsymbol{\mathcal{E}}$ is everywhere in that
direction. The situation depicted in the figure is frequently described by
saying that electric field lines begin on positive charges and end on negative
charges.

Figure 20-22 shows a cross section in the plane of the page of the field
line pattern for a set of two separated point charges. Charge 1 has a certain
positive charge, and charge 2 has an equal negative charge. Very near
charge 1 the pattern is indistinguishable from the single positive point
charge pattern in Fig. 20-21a. And very near charge 2 it cannot be
distinguished from the single negative point charge pattern in Fig. 20-21b. The
reason is that at a location very near one of the charges, the inverse-square
distance dependence of the electric field of each single charge results in the
complete domination of the contribution of the near charge to the electric
field $\boldsymbol{\mathcal{E}}$ of the pair of charges over that of the far charge. Thus each line
begins on the positive charge and initially goes almost straight out because
it follows $\boldsymbol{\mathcal{E}}_1$, which is directed out from charge 1. But as the line continues
away from charge 1, the contribution $\boldsymbol{\mathcal{E}}_2$ that charge 2 makes to $\boldsymbol{\mathcal{E}}$ becomes
relatively more and more important. Since $\boldsymbol{\mathcal{E}}_2$ is directed into charge 2, this
makes the line begin to bend toward charge 2. As it continues, the line
bends more and more toward charge 2, until finally it comes straight in to
end on that negative charge. Why is it that at the location on each line
equidistant from the two source charges the line is parallel to the symmetry
axis? Why are the lines most closely spaced in the region halfway between
the two charges?
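One way to explore the two questions just posed is to evaluate the field of the pair of charges at points equidistant from both. The sketch below does this by superposition; the function name `dipole_field`, the charge magnitude, and the separation are hypothetical choices made for illustration, with the charges placed on the $x$ axis so that the symmetry axis is the $x$ axis.

```python
import math

K = 8.99e9  # 1/(4*pi*eps0) in N*m^2/C^2

def dipole_field(x, y, q=1.0e-6, d=0.10):
    """Field (Ex, Ey) at (x, y) of charge +q at (-d/2, 0) and charge -q at (+d/2, 0)."""
    ex = ey = 0.0
    for charge, sx in ((+q, -d / 2.0), (-q, +d / 2.0)):
        dx, dy = x - sx, y                 # vector from the source charge to the field point
        r3 = math.hypot(dx, dy) ** 3
        ex += K * charge * dx / r3
        ey += K * charge * dy / r3
    return ex, ey

# Points equidistant from the two charges all lie on the line x = 0.
heights = (0.0, 0.02, 0.05, 0.10)
midplane = [dipole_field(0.0, y) for y in heights]
magnitudes = [math.hypot(ex, ey) for ex, ey in midplane]

for y, (ex, ey), mag in zip(heights, midplane, magnitudes):
    print(f"y = {y:.2f} m: Ex = {ex:.3e} N/C, Ey = {ey:.3e} N/C, |E| = {mag:.3e} N/C")
```

At every equidistant point the components perpendicular to the symmetry axis cancel, so the field (and therefore the field line) is parallel to the axis there. The magnitude is largest at the point halfway between the charges, which by property 5 is where the lines are most closely spaced.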


20-5 ELECTRIC FLUX AND GAUSS’ LAW

By introducing a quantity known as electric flux we can obtain what is called
Gauss’ law. Gauss’ law is equivalent to Coulomb’s law. But it has two
significant advantages over Coulomb’s law in certain circumstances. One is that
Gauss’ law makes it particularly easy to evaluate the electric fields of certain
symmetrical charge distributions. The other advantage is that Gauss’ law
provides a particularly clear insight into certain basic properties of the
electric field of any charge distribution. Gauss’ law is named after its originator,
the mathematician Karl Friedrich Gauss (1777-1855). In this section we
develop Gauss’ law (using an argument involving electric field lines that is
quite different from the argument used by Gauss himself). Section 20-6 is
devoted to applying Gauss’ law to a variety of important cases.

If some quantity is flowing through a three-dimensional space, the rate
at which it crosses a fixed surface is called a flux. For example, when
discussing fluid flow in Sec. 16-7, we defined the mass flux to be the rate at
which the flow carries mass across a fixed marker surface. But the flowing
quantity does not have to be material. In fact, our first use of flux in a
three-dimensional situation was in Sec. 12-6, where we defined the energy
flux in a wave as the rate at which the wave carries energy across a fixed
marker surface.

In an electric field there is nothing material. Nor is anything flowing.
Nevertheless, the idea of an electric flux is strongly suggested by the
similarities between electric field lines and the streamlines used to describe
fluid flow. (See Fig. 16-16 for an example of streamlines.) The electric field
lines drawn in Fig. 20-21a look like streamlines that would be drawn to
represent fluid flowing uniformly in all directions out of a source of fluid into
the surrounding space. In Fig. 20-21b they look like streamlines
representing fluid flowing uniformly from all directions into a sink of the fluid (that
is, a point where the fluid is withdrawn from the surrounding space). And
in Fig. 20-22 the electric field lines have the same appearance as
streamlines representing fluid flowing out of a source and then into a nearby sink.

A quantitative expression of the electric flux can be based on the
property of electric field lines that requires the number of lines beginning on a
positive point charge to be proportional to the value of the charge. When
the proportionality constant is specified, the relation between the value of
the charge and the number of field lines beginning on it becomes specific.
This makes it possible either to determine the number of field lines by
measuring the value of the charge or to determine the value of the charge by
counting the number of field lines. That number is the electric flux
originating on the charge.

The value of the electric flux, and therefore the value of the charge
giving rise to it, can be found by counting the number of field lines
beginning on the charge. A methodical way to make the count is to take any
region of space that contains only the charge and then count the number of
penetrations of a field line outward through the surface enclosing the
region. Since the region can have any shape, providing it contains only the
charge, the closed surface surrounding the charge can have any shape. This
simple idea is illustrated in Fig. 20-23. It is the basic one involved in Gauss’
law: What passes out through a closed surface is just what originates in the
region enclosed by the surface. We now proceed to give an explicit
formulation of electric flux and Gauss’ law.

Fig. 20-23 To aid in counting the number of field lines beginning on the positive
charge $q$, we imagine a closed surface of arbitrary shape surrounding the charge.
Next we mark each passage of a field line through the surface from the region
inside to the region outside. Then we count this number. You will see in Fig. 20-27
that when properly interpreted, this procedure is valid even if the surface (or the
field lines) is so convoluted that some field lines pass through the surface more
than once.

Fig. 20-24 A closed surface and an infinitesimal element of that surface. The
surface element vector $d\mathbf{a}$ has a direction out of the enclosed region and normal
to the plane of the surface element. Its magnitude $da$ equals the area of the
surface element. For the particular surface element illustrated, $d\mathbf{a}$ appears to be
pointing directly away from the charge, as a field line does. But this is not true
in general. Find a region where the direction of $d\mathbf{a}$ is significantly different
from that of a nearby field line; in other words, significantly different from
the direction of the electric field $\boldsymbol{\mathcal{E}}$.

Shown in Fig. 20-24 are a closed surface and an infinitesimal element
of the surface. Both the area of this surface element and its orientation in
space are specified by the surface element vector $d\mathbf{a}$, illustrated in the
figure. By definition, the magnitude $da$ of the vector equals the area of the
surface element. The direction of the vector is defined to be that normal to the
surface element and outward from the region enclosed by the surface.
Succinctly put, $d\mathbf{a}$ is in the outward normal direction.

The electric field at the surface element is specified by the vector $\boldsymbol{\mathcal{E}}$. We
define the electric flux element $d\Phi_e$ passing out of the enclosed region
through the surface element to be

$$d\Phi_e = \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} \tag{20-29}$$

The total electric flux $\Phi_e$ passing out of the entire enclosed region is the
integral of $d\Phi_e$ over the closed surface:

$$\Phi_e = \int_{\text{closed surface}} d\Phi_e$$

By using the definition of $d\Phi_e$, the expression for $\Phi_e$ becomes

$$\Phi_e = \int_{\text{closed surface}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} \tag{20-30}$$

The reason for defining the electric flux element in this way can be
seen by evaluating the dot product in Eq. (20-29). The equation then
becomes

$$d\Phi_e = \mathcal{E}\cos\theta\,da$$


Fig. 20-25 The solid square represents an infinitesimal surface element of area
$da$. It is in a plane normal to the surface element vector $d\mathbf{a}$. The dashed
rectangle is the projection of the surface element onto a plane normal to the
electric field $\boldsymbol{\mathcal{E}}$. For convenience, the square has been oriented so that one side is
parallel to the plane containing $\boldsymbol{\mathcal{E}}$ and $d\mathbf{a}$. If $\theta$ is the angle between $\boldsymbol{\mathcal{E}}$ and $d\mathbf{a}$, the
projection reduces one dimension of the surface element by the factor $\cos\theta$. The
other dimension is unchanged. Hence the area of the projected surface element is
smaller than that of the unprojected surface element by the factor $\cos\theta$. In other
words, the projected area is $\cos\theta\,da$.

Fig. 20-26 A spherical surface of radius $r$ centered on a positive point
charge $q$. The electric field $\boldsymbol{\mathcal{E}}$ is shown at the location of an element $d\mathbf{a}$ of the
surface.


where $\theta$ is the angle between the directions of $\boldsymbol{\mathcal{E}}$ and $d\mathbf{a}$. Figure 20-25 and
its caption show that $\cos\theta\,da$ is just the area covered by the projection of
the surface element onto a plane normal to the direction of $\boldsymbol{\mathcal{E}}$. Every field
line which passes out of the enclosed region through the surface element
must cross the projection of the surface element on this plane. This is
because the field lines are parallel to the direction of $\boldsymbol{\mathcal{E}}$ and therefore parallel
to the lines connecting the corners of the surface element represented by
$d\mathbf{a}$ with the corners of the projection of that surface element on the plane
normal to $\boldsymbol{\mathcal{E}}$. Now one property of field lines requires that the number per
unit area crossing the plane be proportional to $\mathcal{E}$, the magnitude of $\boldsymbol{\mathcal{E}}$, since
the plane is normal to the direction of $\boldsymbol{\mathcal{E}}$. Hence $\mathcal{E}\cos\theta\,da$ must be
proportional to the number of lines per unit area crossing the plane multiplied by
an area on this plane crossed by every line passing through the surface
element. In other words, $\mathcal{E}\cos\theta\,da$ is proportional to the number of lines
passing out through the surface element. As far as the property of field
lines is concerned, the value of the proportionality constant is arbitrary. But
to have a specific measure of the number of lines, we must specify the value
of this proportionality constant. We do so by giving it the simplest value, 1.
Thus we say that the number of field lines passing out of the enclosed
region through the surface element is numerically equal to $\mathcal{E}\cos\theta\,da$.
Introducing the symbol $d\Phi_e$ to represent this number, we then have Eq.
(20-29).
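The equivalence between the dot product in Eq. (20-29) and the projected-area form $\mathcal{E}\cos\theta\,da$ can be illustrated with a few lines of Python. This is only a sketch: the helper name `flux_element`, the uniform field along $+z$, and the patch area are hypothetical values chosen for the illustration.

```python
import math

def flux_element(e_vec, unit_normal, area):
    """Return d(Phi_e) = E . da, in N*m^2/C, for a small flat patch whose
    outward unit normal is `unit_normal` and whose area is `area`."""
    e_dot_n = sum(e * n for e, n in zip(e_vec, unit_normal))
    return e_dot_n * area

e_mag = 500.0                  # N/C, hypothetical field magnitude
da = 1.0e-4                    # m^2, hypothetical area of the surface element
e_vec = (0.0, 0.0, e_mag)      # field taken along +z

results = {}
for theta_deg in (0.0, 60.0, 90.0, 120.0):
    theta = math.radians(theta_deg)
    # Outward normal tilted by theta away from the direction of the field.
    normal = (math.sin(theta), 0.0, math.cos(theta))
    dphi = flux_element(e_vec, normal, da)
    projected = e_mag * math.cos(theta) * da      # E cos(theta) da
    results[theta_deg] = (dphi, projected)
    print(f"theta = {theta_deg:5.1f} deg: E.da = {dphi:+.4e}, E cos(theta) da = {projected:+.4e}")
```

The two forms agree at every angle. The flux element vanishes when the field is parallel to the surface element and becomes negative when the angle exceeds 90°, which corresponds to field lines passing inward through the surface.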

Let us use Eq. (20-30) to evaluate the electric flux $\Phi_e$ passing out
through an imaginary closed surface surrounding a point charge $q$. This is
particularly easy to do if we take the surface to be a sphere centered on the
charge, as in Fig. 20-26. The figure shows a representative surface element
vector $d\mathbf{a}$. It points directly away from the center of the sphere. The electric
field vector $\boldsymbol{\mathcal{E}}$ at that surface element is shown also, with its direction based
on the assumption that the charge $q$ has a positive value. This vector, too,
points directly away from the center of the sphere where the source charge
is located. Thus $\boldsymbol{\mathcal{E}}$ is parallel to $d\mathbf{a}$, and so

$$\boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = \mathcal{E}\cos 0\,da = \mathcal{E}\,da \tag{20-31}$$

As a consequence, Eq. (20-30),

$$\Phi_e = \int_{\text{closed surface}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a}$$

simplifies to

$$\Phi_e = \int_{\text{closed surface}} \mathcal{E}\,da$$

But $\mathcal{E}$ has the same value everywhere over the surface of the sphere
centered on the point charge that is its source. Hence we can pass $\mathcal{E}$ through
the integral sign and obtain

$$\Phi_e = \mathcal{E}\int_{\text{spherical surface}} da$$

The value of the integral is just $4\pi r^2$, the total surface area of the sphere.
Thus we have


$$\Phi_e = \mathcal{E}\,4\pi r^2 \tag{20-32}$$

Now we evaluate $\mathcal{E}$ at locations on the surface of the sphere, all of
which are at the same distance $r$ from the charge $q$, by using Coulomb’s law in
the form of Eq. (20-20):

$$\mathcal{E} = \frac{1}{4\pi\epsilon_0}\,\frac{q}{r^2}$$

Inserting this value in Eq. (20-32), we obtain the result

$$\Phi_e = \frac{1}{4\pi\epsilon_0}\,\frac{q}{r^2}\,4\pi r^2$$

or

$$\Phi_e = \frac{q}{\epsilon_0} \tag{20-33}$$

Note that the radius of the sphere has canceled and does not appear in the
expression relating the total flux $\Phi_e$ to the value of the source charge $q$.
This certainly is as it should be. The total number of electric field lines
passing out through the sphere that we imagine to be centered on the point
charge $q$ is just the total number that begin on $q$. That number, $\Phi_e$, cannot
depend on the radius of the sphere. Rather it depends only on the fact that
the sphere is a closed surface containing the charge $q$.

Since we have taken the point charge q to have a positive value, Eq.
(20-33) shows that the electric flux $\Phi_e$ has a positive value. This means that
the electric field lines are passing outward through the sphere surrounding
the positive charge, as we defined them to do. If we take q to have a negative
value instead, then $\vec{\mathcal{E}}$ will point inward toward the center of the sphere
where the charge is located, while $d\vec{a}$, being always in the outward normal
direction, will not change. This causes Eq. (20-31) to yield
$\vec{\mathcal{E}} \cdot d\vec{a} = -\mathcal{E} \, da$.
Also, Eq. (20-20) now yields $\mathcal{E} = -q/4\pi\epsilon_0 r^2$. The minus sign is
required when the quantity q has a negative value so that the quantity $-q$ will
have a positive value. This is necessary since $\mathcal{E}$, being the magnitude of a
vector, always has a positive value. As minus signs are introduced into two
factors that produce the final result, their net effect is to make no change in
that result. In other words, Eq. (20-33) applies to both positive and negative
charges, with $\Phi_e$ being a signed scalar just as q is. When q is positive, then $\Phi_e$
is positive and the field lines penetrate the sphere surrounding the charge
in the outward direction. When q is negative, then $\Phi_e$ is negative and the
field lines penetrate the sphere in the inward direction. The magnitude
of $\Phi_e$ is, in both cases, the number of field lines. For a positive q this
number is the number of lines that begin on the charge. For a negative q it
is the number that end on the charge.
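
The cancellation of r in Eq. (20-33), and the sign behavior just described, can
be checked with a few lines of arithmetic. The following Python sketch is an
illustration only and is not part of the original development. The function
names `field_magnitude` and `flux_through_sphere` are hypothetical, and the
value of $\epsilon_0$ is the one quoted in Eq. (20-38a).

```python
import math

EPSILON_0 = 8.854187818e-12  # C^2/(N*m^2), value quoted in Eq. (20-38a)

def field_magnitude(q, r):
    """Magnitude of the field of a point charge q at distance r, Eq. (20-20)."""
    return abs(q) / (4.0 * math.pi * EPSILON_0 * r**2)

def flux_through_sphere(q, r):
    """Flux through a sphere of radius r centered on q.

    The field is parallel to the outward surface element for positive q
    and antiparallel for negative q, which supplies the sign.
    """
    sign = 1.0 if q >= 0 else -1.0
    return sign * field_magnitude(q, r) * 4.0 * math.pi * r**2

for q in (2.0e-9, -2.0e-9):
    for r in (0.01, 1.0, 250.0):
        # The last two printed columns should agree for every radius.
        print(q, r, flux_through_sphere(q, r), q / EPSILON_0)
```

Whatever radius is chosen, the computed flux agrees with $q/\epsilon_0$, and it
changes sign with q.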

We can make a powerful extension of the result in Eq. (20-33). First we
note that the number of lines in the field of a single positive point charge
that cross out through any surface surrounding the charge is the same as
the number that cross out through a spherical surface centered on the
charge. This fact is illustrated in Fig. 20-27. It is a consequence of the fact
that the electric flux $\Phi_e$ is a property of the charge, not of the surface with
which we surround the charge in our imagination to help us evaluate $\Phi_e$.
The same holds for the number of lines crossing inward through any surface
surrounding a negative point charge. Hence, Eq. (20-33) applies no
matter what closed surface surrounding the point charge is used to evaluate
the electric flux resulting from the presence of the charge.

Fig. 20-27 A positive point charge surrounded by two closed surfaces.
One is a sphere centered on the charge, and the other has an arbitrary
shape. Both are represented by their cross sections in the plane of the
page. The electric field lines are represented in the same way. The passage
of a field line through the spherical surface outward from the
region which it encloses occurs in 14 places. For the surface of arbitrary
shape there are 15 outward crossings and 1 inward crossing. So the net
number of outward crossings is 14 here also. This result is due to the fact
that field lines beginning on the positive charge cannot end in either of
the regions enclosed by the surfaces because there are no other charges
in these regions.

Next, we consider a set of n point charges $q_1, q_2, \ldots, q_n$, all within
the same closed surface. Equation (20-33) will apply to each of these point
charges, since it makes no difference where the charge is located within the
closed surface. Thus the values of the electric fluxes
$\Phi_{e1}, \Phi_{e2}, \ldots, \Phi_{en}$
arising from the presence of the charges are given by the equations

$$\Phi_{e1} = \frac{q_1}{\epsilon_0}, \qquad \Phi_{e2} = \frac{q_2}{\epsilon_0}, \qquad \ldots, \qquad \Phi_{en} = \frac{q_n}{\epsilon_0}$$


Adding, we obtain

$$\Phi_{e1} + \Phi_{e2} + \cdots + \Phi_{en} = \frac{1}{\epsilon_0} (q_1 + q_2 + \cdots + q_n) \tag{20-34}$$

The sum on the left side of Eq. (20-34) is the total electric flux $\Phi_e$
penetrating through the closed surface:

$$\Phi_{e1} + \Phi_{e2} + \cdots + \Phi_{en} = \Phi_e \tag{20-35a}$$

Furthermore, the sum on the right side of Eq. (20-34) is the total electric
charge q contained within the surface:

$$q_1 + q_2 + \cdots + q_n = q \tag{20-35b}$$

Hence Eq. (20-34) can be written

$$\Phi_e = \frac{q}{\epsilon_0} \tag{20-36}$$

This can be expressed in terms of the total electric field $\vec{\mathcal{E}}$ at the closed
surface by using Eq. (20-30) again. Doing so, we obtain

$$\int_{\text{closed surface}} \vec{\mathcal{E}} \cdot d\vec{a} = \frac{q}{\epsilon_0} \tag{20-37}$$




Equation (20-37) is Gauss’ law: The integral over any closed surface of the dot
product of the electric field $\vec{\mathcal{E}}$ at the surface and the outwardly directed surface
element vector $d\vec{a}$ equals the charge q contained within the surface divided by the
constant $\epsilon_0$.
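
Gauss’ law can also be tested numerically on a surface that is not a sphere.
The Python sketch below is an illustration under stated assumptions, not the
authors’ method. It takes a cube as the closed surface, places three positive
and one negative unit charges inside it (loosely patterned on Fig. 20-28, with
arbitrarily chosen positions), and sums $\vec{\mathcal{E}} \cdot d\vec{a}$ over small
cells of the six faces using the Coulomb field of each charge. The function
name `enclosed_charge_estimate` is hypothetical. To keep the numbers simple it
returns $\epsilon_0$ times the flux, which Gauss’ law says should equal the
enclosed charge.

```python
import math

def enclosed_charge_estimate(charges, half_side=1.0, n=80):
    """Return epsilon_0 times the flux through a cube centered on the origin.

    charges: list of (q, (x, y, z)) point charges.
    Each face is split into n*n square cells, and E . da is summed with the
    field evaluated at the center of each cell (midpoint rule).
    """
    h = 2.0 * half_side / n           # edge length of one cell
    da = h * h                        # area of one cell
    total = 0.0
    for axis in range(3):             # faces normal to x, y, z in turn
        for sign in (-1.0, 1.0):      # outward normal is sign * unit vector
            for i in range(n):
                for j in range(n):
                    u = -half_side + (i + 0.5) * h
                    v = -half_side + (j + 0.5) * h
                    point = [u, v]
                    point.insert(axis, sign * half_side)
                    for q, pos in charges:
                        d = [point[k] - pos[k] for k in range(3)]
                        r3 = (d[0]**2 + d[1]**2 + d[2]**2) ** 1.5
                        # epsilon_0 * (outward normal component of E) * da
                        total += sign * q * d[axis] / (4.0 * math.pi * r3) * da
    return total

inside = [(1.0, (0.3, 0.2, 0.0)), (1.0, (-0.4, 0.1, 0.3)),
          (1.0, (0.0, -0.5, -0.2)), (-1.0, (-0.1, 0.3, -0.4))]
outside = [(1.0, (3.0, 0.0, 0.0))]

print("net charge inside:", sum(q for q, _ in inside))
print("epsilon_0 * flux :", enclosed_charge_estimate(inside))
print("with an extra charge outside:", enclosed_charge_estimate(inside + outside))
```

The charge placed outside the cube changes the field at every point of the
surface, but within the accuracy of the sum it does not change the total flux.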


Fig. 20-28 A schematic representation
of the electric field lines for a set of four
charges of equal magnitude, three being
positive and one being negative. The
magnitude is such that four field lines
begin on each positive charge and four
end on the negative charge. A closed
surface surrounds the charges. The
electric field at the surface is in the
generally outward direction, except
near the negative charge. Note that
there are nine places where a field line
penetrates the surface outward and one
where a field line penetrates inward.
The net number of outward penetrations
is eight. This agrees with the fact
that there are three positive charges and
one negative charge within the surface,
for a net charge of two positive charges,
and with the fact that four lines should
emerge from the surface for each positive
charge. Note also that field lines
never cross one another. Why not?


You should note that a positive value of q in Eq. (20-37) does not necessarily
mean that all the charges within the closed surface are positive, but
only that the total positive charge exceeds the total negative charge in
magnitude. If q has a negative value, then the opposite is true. Also you
should note that a positive value of the integral of
$\vec{\mathcal{E}} \cdot d\vec{a}$ over the surface does not necessarily
mean that $\vec{\mathcal{E}}$ at the surface is everywhere in the generally outward direction
so that the sign of $\vec{\mathcal{E}} \cdot d\vec{a}$ is everywhere positive. Rather, it means only that
the positive contributions of $\vec{\mathcal{E}} \cdot d\vec{a}$ to the integral outweigh the negative
contributions. The opposite is true if the value of the integral is negative.
Figure 20-28 illustrates these points schematically in a case where the
charge and therefore also the integral have a positive value.

Several  comments  are  in  order: 


1. Gauss’ law is not dependent on the concept of counting the number
of electric field lines that cross a surface surrounding a set of source
charges, even though we used this concept in our development of the law.
Just as electric field lines provide a way of visualizing properties of the electric
field, so the number of field lines crossing a surface provides a way of
visualizing properties of the electric flux through that surface. But Coulomb’s
law and the properties of the electric field can be presented without
making reference to electric field lines. (This is the way we presented them
earlier in this chapter.) And arguments can be given which lead to Gauss’
law and the properties of the electric flux but have nothing to do with
counting electric field lines. (This is true of Gauss’ own arguments.)

2. We derived Gauss’ law by using Coulomb’s law. But it is very easy to
derive Coulomb’s law from Gauss’ law. (An exercise at the end of this chapter
suggests how you do this.) Hence Gauss’ law can be considered to be as basic
as Coulomb’s law. From a theoretical point of view, the two laws are on an
equal footing.

3. Gauss’ law and Coulomb’s law are on an equal footing from an
experimental point of view, too. The first experiments (those described in
Sec. 20-2) provide direct evidence for Coulomb’s law. But the most accurate
experiments (described in Sec. 20-6) provide direct evidence for Gauss’
law.

4. As is shown in Sec. 20-6, Gauss’ law makes it easy to evaluate the
electric fields of certain symmetrical charge distributions, much easier
than it would be to do this by applying Coulomb’s law to the charge
distributions. Other applications of Gauss’ law are given in subsequent chapters.
If we consider all aspects of the study of the electromagnetic force, Gauss’
law is more useful, and therefore more frequently used, than Coulomb’s law.


A hint at the utility of Gauss’ law is given by the presence of the factor
$4\pi$ in the proportionality constant $1/4\pi\epsilon_0$ of Coulomb’s law and its
absence in the proportionality constant $1/\epsilon_0$ of Gauss’ law. The two
proportionality constants must differ by a factor of $4\pi$ because that factor relates
the area of a sphere to the square of its radius, and Gauss’ law is obtained
from Coulomb’s law by integrating over the surface of a sphere. In the
systems of units used before the introduction of the present system by the
engineer Giovanni Giorgi, the $4\pi$ appeared in Gauss’ law, not Coulomb’s
law. In the earliest version of what has evolved into the now universally
accepted SI system, Giorgi switched the $4\pi$ to Coulomb’s law. One reason is
the convenience that results if it does not appear in the more frequently
used of the two laws.

Because the constant appearing in Gauss’ law is just the reciprocal of
$\epsilon_0$, the value of $\epsilon_0$ is important. Using the value of $1/4\pi\epsilon_0$ given in Eq.
(20-46) and a calculator, we find

$$\epsilon_0 = 8.854187818 \times 10^{-12}\ \mathrm{C^2/(N \cdot m^2)} \tag{20-38a}$$

Often it is adequate to use the approximate value

$$\epsilon_0 = 8.85 \times 10^{-12}\ \mathrm{C^2/(N \cdot m^2)} \tag{20-38b}$$

The constant $\epsilon_0$ is called the permittivity of free space. The name was
introduced at an early stage, for reasons that are not pertinent to our present
understanding of the electromagnetic force.
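
The calculator step can be reproduced in Python. This short sketch assumes the
value $1/4\pi\epsilon_0 = 8.98755179 \times 10^9\ \mathrm{N \cdot m^2/C^2}$ for the
Coulomb constant; that number is supplied here for illustration and is not
quoted in this section. The name `COULOMB_CONSTANT` is hypothetical.

```python
import math

# Assumed value of 1/(4*pi*epsilon_0) in N*m^2/C^2 (not quoted in this section)
COULOMB_CONSTANT = 8.98755179e9

# Invert the relation k = 1/(4*pi*epsilon_0) to obtain epsilon_0
epsilon_0 = 1.0 / (4.0 * math.pi * COULOMB_CONSTANT)
print(f"epsilon_0 = {epsilon_0:.9e} C^2/(N*m^2)")
```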


20-6 APPLICATIONS OF GAUSS’ LAW

Gauss’ law,

$$\int_{\text{closed surface}} \vec{\mathcal{E}} \cdot d\vec{a} = \frac{q}{\epsilon_0}$$

relates an integral of $\vec{\mathcal{E}} \cdot d\vec{a}$ over all the locations on an imaginary closed
surface, of arbitrary shape, to the total charge q enclosed by the surface. It
says nothing directly about the value of the electric field $\vec{\mathcal{E}}$ itself at any
particular location on the surface. Nevertheless, for certain charge
distributions Gauss’ law is very useful for finding the values of $\vec{\mathcal{E}}$ at all locations on
the surface. Most of these charge distributions are ones with enough
symmetry that the electric field has a high degree of symmetry. If this is the
case and if the closed surface is chosen to have a symmetry appropriate to
that of the electric field, then the integral can be very easy to evaluate in
terms of the values of $\vec{\mathcal{E}}$ at various locations on the surface. By equating the
integral to $q/\epsilon_0$, these values of $\vec{\mathcal{E}}$ are then determined. A closed surface,
carefully chosen to facilitate the evaluation of the integral in Gauss’ law, is
called a gaussian surface.

To be more specific, a gaussian surface usually is chosen so that it
consists of a surface, or of several joined surfaces, that encloses some region
and has a simple shape. Each part of a gaussian surface is usually chosen to
have one or the other of two features: (1) It is so oriented that $\vec{\mathcal{E}}$ is
everywhere perpendicular to the surface element vectors $d\vec{a}$, so that
$\vec{\mathcal{E}} \cdot d\vec{a} = 0$ everywhere on the surface; (2) it is so oriented that
$\vec{\mathcal{E}}$ is everywhere parallel to $d\vec{a}$ and has a uniform magnitude $\mathcal{E}$,
so that everywhere on the surface
$\vec{\mathcal{E}} \cdot d\vec{a} = \mathcal{E} \, da$, with $\mathcal{E}$ constant.

The way gaussian surfaces are chosen and used is best demonstrated
by employing them in specific cases. Examples 20-8 through 20-11 serve this
purpose.




EXAMPLE  20-8 


Positive charge is distributed uniformly on a plane of infinite extent. The charge
per unit area has the value $\sigma$. Calculate the electric field $\vec{\mathcal{E}}$ at a distance r from the
plane. (A charge distribution of infinite extent is unrealistic, but easily treated. And
as is explained after this example, the results of the treatment provide a good
approximation to those found for a real charge distribution that occurs frequently.)

■ First you sketch the uniform positive charge distribution and the point P at
which the electric field $\vec{\mathcal{E}}$ is to be evaluated, as in Fig. 20-29a. Then you use
symmetry arguments to determine as much as possible about the characteristics of $\vec{\mathcal{E}}$.
Since charge is distributed uniformly over the plane, symmetry dictates that $\vec{\mathcal{E}}$ must
have a direction along the normal to the plane which passes through P. Otherwise,
$\vec{\mathcal{E}}$ would have a component along a particular direction parallel to the plane. This
cannot be since all directions parallel to the plane are completely equivalent. Of the
two directions along the normal, $\vec{\mathcal{E}}$ is in the direction away from the plane because
the charge on the plane is positive. Thus if P is to the right of the plane, then $\vec{\mathcal{E}}$ is
directed to the right; and if P is to the left of the plane, then $\vec{\mathcal{E}}$ is directed to the left.
Symmetry also requires $\vec{\mathcal{E}}$ to have the same magnitude at all points P whose distance
from the plane has the same value r. If this were not true, then there would be some
particular direction parallel to the plane in which the magnitude of $\vec{\mathcal{E}}$ increases, in
violation of the equivalence of all such directions.

Taking your cue from the geometry of the situation, you next specify a gaussian
surface as in Fig. 20-29b. It is in the form of a cylinder with an axis normal to
the charged plane that intercepts an area a of the plane, extends to the left and
right of the plane a distance r, and is closed at its ends by flat surfaces parallel to the
plane. With this choice for a gaussian surface, the integral in Gauss’ law splits into
the sum of three integrals:

$$\int_{\text{closed surface}} \vec{\mathcal{E}} \cdot d\vec{a} = \int_{\text{left surface}} \vec{\mathcal{E}} \cdot d\vec{a} + \int_{\text{right surface}} \vec{\mathcal{E}} \cdot d\vec{a} + \int_{\text{cylindrical surface}} \vec{\mathcal{E}} \cdot d\vec{a} \tag{20-39}$$


For the integral over the left surface, $\vec{\mathcal{E}}$ is directed to the left and so is the
surface element vector $d\vec{a}$. So you have $\vec{\mathcal{E}} \cdot d\vec{a} = \mathcal{E} \, da$.
Furthermore, $\mathcal{E}$ has a constant
value over the left surface since all points on it are equidistant from the charged
plane. Hence you have

$$\int_{\text{left surface}} \vec{\mathcal{E}} \cdot d\vec{a} = \int_{\text{left surface}} \mathcal{E} \, da = \mathcal{E} \int_{\text{left surface}} da$$


Fig. 20-29 A figure used in Example 20-8
to evaluate the electric field of positive electric
charge distributed uniformly over an
infinite plane. (a) The shading indicates the
continuous, uniform charge distribution on
the part of the plane that is depicted. (b) The
charge contained within the gaussian surface
is emphasized with darker shading.



Since the integral of the magnitude da of the surface element vector over the left
surface is just equal to the area a of that surface, you obtain

$$\int_{\text{left surface}} \vec{\mathcal{E}} \cdot d\vec{a} = \mathcal{E} a \tag{20-40}$$

For the integral over the right surface, both $\vec{\mathcal{E}}$ and $d\vec{a}$ are directed to the right,
instead of to the left. But what counts is that they are parallel to each other, as
before. So the same result is obtained, to wit:

$$\int_{\text{right surface}} \vec{\mathcal{E}} \cdot d\vec{a} = \mathcal{E} a \tag{20-41}$$

It is not necessary to make a distinction between the values of $\mathcal{E}$ in Eqs. (20-40) and
(20-41). Since the left and right surfaces are at the same distance r from the charged
plane, symmetry dictates that the values of $\mathcal{E}$ at these surfaces must be the same.

For the cylindrical surface you have $\vec{\mathcal{E}} \cdot d\vec{a} = 0$. The reason is that at all
locations on this surface $\vec{\mathcal{E}}$ is parallel to the surface. But the vector $d\vec{a}$ representing a
surface element is always normal to the surface of which the element is a part.
Therefore $\vec{\mathcal{E}}$ is everywhere perpendicular to $d\vec{a}$ on the cylindrical surface, and so
the dot product is zero. Thus you have

$$\int_{\text{cylindrical surface}} \vec{\mathcal{E}} \cdot d\vec{a} = 0 \tag{20-42}$$

Using Eqs. (20-40) through (20-42) in Eq. (20-39), you obtain the following
evaluation of the integral in Gauss’ law:

$$\int_{\text{closed surface}} \vec{\mathcal{E}} \cdot d\vec{a} = 2\mathcal{E} a \tag{20-43}$$

The next step is to calculate q, the total charge contained in the closed gaussian
surface. The charge is located on the charged plane. Its quantity is the charge per
unit area on the plane, $\sigma$, multiplied by the area of the plane that lies within the
cylinder, a. Thus

$$q = \sigma a \tag{20-44}$$

According to Gauss’ law,

$$\int_{\text{closed surface}} \vec{\mathcal{E}} \cdot d\vec{a} = \frac{q}{\epsilon_0}$$

Substituting Eqs. (20-43) and (20-44) into this equation gives you the result

$$2\mathcal{E} a = \frac{\sigma a}{\epsilon_0}$$

or

$$\mathcal{E} = \frac{\sigma}{2\epsilon_0} \tag{20-45}$$

This is the magnitude of the electric field $\vec{\mathcal{E}}$ at point P. Its direction is away from the
plane carrying positive charge per unit area $\sigma$. Note that $\mathcal{E}$ does not depend on the
distance r from the plane to P.


You may have expected that as the distance r from the uniformly
charged, infinite plane to the point P in its electric field increases, the
magnitude of the electric field should decrease. But Eq. (20-45) shows that the
electric field actually maintains a constant magnitude $\mathcal{E}$ as this distance
increases. To get an intuitive understanding of the situation, think of the
electric field lines. Since the number of these lines crossing a unit area
normal to their direction is proportional to $\mathcal{E}$, a decreasing $\mathcal{E}$ would mean
that the field lines are spreading apart as they get farther from the charged
plane. But if the lines emanating uniformly from the uniformly charged
plane spread apart in some regions of the field, they necessarily bunch
together in others because they never begin or end in charge-free space.
The symmetry of the situation does not allow either spreading or bunching
to occur. The field lines must maintain the same spacing as they get farther
from the charged plane.

If you still have doubts, you can verify this qualitative statement, and also
the quantitative values of $\mathcal{E}$ found in Eq. (20-45), by a calculation that has
nothing to do with electric field lines or with the closely related Gauss’ law. To do
so, consider a loop-shaped segment of the charge distribution, centered on its
intersection with the perpendicular from the point P in the field. The inner and
outer radii of the segment are k and k + dk. The electric field at P due to the
segment can be obtained immediately by setting the charge in Eq. (20-28) equal to the
charge density times the area of the loop. Then integrate from k = 0 to k = ∞.
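
The loop-by-loop calculation just outlined can be sketched numerically in
Python. Two assumptions are made here and should be checked against the text:
that Eq. (20-28) is the on-axis field of a ring of charge q and radius k,
$\mathcal{E} = q r / 4\pi\epsilon_0 (r^2 + k^2)^{3/2}$, and that cutting the sum off
at a large but finite radius `k_max` is an adequate stand-in for the upper
limit of infinity. The function name `plane_field_from_rings` is hypothetical.

```python
import math

EPSILON_0 = 8.85e-12  # C^2/(N*m^2), Eq. (20-38b)

def plane_field_from_rings(sigma, r, k_max, n=200_000):
    """Sum the on-axis fields of n thin rings covering radii 0..k_max.

    sigma: charge per unit area, r: distance from the plane to P.
    Assumes the ring formula E = q*r / (4*pi*eps0*(r^2 + k^2)^(3/2)).
    """
    h = k_max / n                                # radial width dk of each ring
    total = 0.0
    for i in range(n):
        k = (i + 0.5) * h                        # mean radius of this ring
        dq = sigma * 2.0 * math.pi * k * h       # charge = density * ring area
        total += dq * r / (4.0 * math.pi * EPSILON_0 * (r * r + k * k) ** 1.5)
    return total

sigma = 1.0e-6  # C/m^2, an arbitrary illustrative value
for r in (0.1, 1.0):
    print(r, plane_field_from_rings(sigma, r, k_max=1000.0), sigma / (2.0 * EPSILON_0))
```

For both distances the sum comes out close to $\sigma/2\epsilon_0$, with a small
shortfall that shrinks as `k_max` is increased.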


Fig.  20-30  A cross  section  in  the  plane 
of  the  page  representing  qualitatively 
the  electric  field  lines  emanating  from  a 
positively  charged  conducting  plate  of 
finite  extent. 


No real charge distribution extends uniformly over an infinite plane,
as was assumed in the calculation leading to Eq. (20-45). Still, the equation
is quite useful. Visualize a large, flat metal plate to which positive charges
have been added. The tendency to maximize spacings between near-neighbor
charges imposed by the mutual repulsions will make the charges
spread over the surface of the plate in a distribution which is uniform,
except very near the edges. (There is more charge per unit area very near the
edges than elsewhere because there are no charges beyond the edges to
repel the charges very near the edges.) At a point P whose distance r from
the plate is small compared to its distance from any edge, there is no
significant distinction to be made between the actual situation and the one
assumed in obtaining Eq. (20-45) because the edges are relatively far away
and have little effect on the electric field. So the expression
$\mathcal{E} = \sigma/2\epsilon_0$ may
be used in this case to find a very good approximation to the actual
magnitude of the electric field. This is done by setting the charge per unit area $\sigma$
equal to the total charge q on the plate divided by its total area a. We use
this approximation on several occasions in Chap. 21.

At the other extreme, consider a point P so far from the plate that the
distance r from any part of the plate to P has essentially the same value. In
this case there is no significant distinction to be made between the actual
situation and one in which all the charge on the plate is concentrated at a
point. Thus $\mathcal{E}$ will decrease with increasing r in proportion to $r^{-2}$. For
intermediate cases $\mathcal{E}$ will have a dependence on r which is intermediate between
being proportional to $r^0$ and proportional to $r^{-2}$. See Fig. 20-30.
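
The two limits can be made concrete with a simplified model. The Python sketch
below assumes a circular plate of radius `plate_radius` carrying a strictly
uniform charge density, which ignores the extra charge that collects near the
edges of a real conductor. For that idealized disk, summing ring contributions
out to the rim gives the on-axis magnitude
$\mathcal{E} = (\sigma/2\epsilon_0)\,[1 - r/\sqrt{r^2 + R_p^2}]$, where $R_p$ is the
plate radius. The function name `disk_axis_field` and the symbol $R_p$ are
introduced here for illustration only.

```python
import math

EPSILON_0 = 8.85e-12  # C^2/(N*m^2), Eq. (20-38b)

def disk_axis_field(sigma, plate_radius, r):
    """On-axis field magnitude of a uniformly charged disk (idealized model)."""
    return sigma / (2.0 * EPSILON_0) * (1.0 - r / math.sqrt(r * r + plate_radius ** 2))

sigma, plate_radius = 1.0e-6, 1.0           # arbitrary illustrative values
q = sigma * math.pi * plate_radius ** 2     # total charge on the disk
for r in (1e-3, 1e-1, 1.0, 10.0, 100.0):
    e = disk_axis_field(sigma, plate_radius, r)
    e_plane = sigma / (2.0 * EPSILON_0)               # infinite-plane value
    e_point = q / (4.0 * math.pi * EPSILON_0 * r * r) # point-charge value
    print(f"r = {r:g}: E/E_plane = {e / e_plane:.4f}, E/E_point = {e / e_point:.4f}")
```

Close to the plate the first ratio approaches 1, and far from it the second
ratio does, in line with the two extremes described above.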

How  are  Example  20-8  and  the  related  discussion  modified  if  the 
charge  in  the  distribution  is  negative? 


EXAMPLE  20-9 


A thin, spherical metal shell of radius R is charged uniformly, the total charge on
the shell being $q = -|q|$. Find the electric field $\vec{\mathcal{E}}$ outside the shell.

■ The spherical symmetry of the charge distribution should suggest taking for a
gaussian surface a sphere concentric with the charged shell. Since you want to
determine $\vec{\mathcal{E}}$ outside the shell of radius R, the spherical gaussian surface must have a
larger radius r, as in Fig. 20-31.

Fig. 20-31 A figure used in Example
20-9 to evaluate the electric field due to
a negatively charged, thin metal shell of
radius R at a point outside the shell.
Darker shading is used to represent the
continuous charge distribution.

Everywhere on the gaussian surface $\vec{\mathcal{E}}$ is directed inward toward the negative
charge distribution. But the surface element vector $d\vec{a}$ always is directed outward.
So you have $\vec{\mathcal{E}} \cdot d\vec{a} = -\mathcal{E} \, da$. Also, $\mathcal{E}$ has the same value everywhere on the sphere
because of the symmetry of the situation. Thus the integral in Gauss’ law can be
written

$$\int_{\text{closed surface}} \vec{\mathcal{E}} \cdot d\vec{a} = \int_{\text{spherical surface}} -\mathcal{E} \, da = -\mathcal{E} \int_{\text{spherical surface}} da$$

The last integral is just the area of the sphere, $4\pi r^2$. Hence you have

$$\int_{\text{closed surface}} \vec{\mathcal{E}} \cdot d\vec{a} = -\mathcal{E} \, 4\pi r^2$$

Using Gauss’ law to equate this to $-|q|/\epsilon_0$, the charge within the closed
surface divided by $\epsilon_0$, you obtain

$$-\mathcal{E} \, 4\pi r^2 = -\frac{|q|}{\epsilon_0}$$

This leads directly to the result

$$\mathcal{E} = \frac{1}{4\pi\epsilon_0} \frac{|q|}{r^2} \qquad \text{for } r > R$$

Employing the outward-directed unit vector $\hat{r}$ and remembering that the value
of q is negative, you can write this in the vector form

$$\vec{\mathcal{E}} = \frac{1}{4\pi\epsilon_0} \frac{q}{r^2} \, \hat{r} \qquad \text{for } r > R \tag{20-46}$$

The negative value of q makes $\vec{\mathcal{E}}$ be directed inward. But you can easily modify the
calculation to show that Eq. (20-46) applies just as well when q has a positive value.


Except for the restriction that it applies only outside the spherical shell
on which charge q is distributed uniformly, Eq. (20-46) tells us that the electric
field $\vec{\mathcal{E}}$ depends on the distance r from the center of the shell in exactly
the way that Eq. (20-20) tells us that $\vec{\mathcal{E}}$ depends on the distance r from a
point charge q. That is, the electric field of charge q distributed uniformly over a
spherical shell is identical at all locations outside the shell to the electric field of the
same charge q concentrated at a single point at the center of the shell. It is not
surprising that this result holds for locations whose distance from the shell is
very large compared to the radius of the shell. From such a location, an
observer cannot distinguish the spherical shell from a geometric point. But
Gauss’ law shows the italicized statement is true even at locations just
outside the shell! This result provides the quantitative proof of the property
needed to justify measuring center-to-center spacings of the two uniformly
charged, spherical metal shells in Coulomb’s experiment.

The result can be extended to find the electric field outside a solid,
spherically symmetrical charge distribution. In Fig. 20-32 we dissect the
solid charge distribution into nested, concentric spherical shells of inner
radius r and outer radius r + dr, like the layers of an onion. Each shell has
a uniform charge distribution, but the density of charge can vary from shell
to shell. In other words, the charge density can depend on r as long as it is
the same everywhere at a particular value of r. Considering only locations
outside the radius R of the outermost shell, we can say that each shell
makes the same contribution to the total electric field $\vec{\mathcal{E}}$ as would be made
by a point charge at the common center, whose charge equals the charge of
the shell. The total charge on all these overlapping point charges is just the
total charge of the spherically symmetrical charge distribution. Thus the
electric field of charge q distributed with spherical symmetry is identical at all
locations outside the distribution to the electric field of the same charge q concentrated at a
single point at the center of the distribution.

Fig. 20-32 A cutaway view of a sphere
of radius R, divided into nested shells of
thickness dr.

Since Gauss’ law is equivalent to Coulomb’s law, and since the
mathematical form of Coulomb’s law for the electric force is identical to that of
Newton’s law for the gravitational force, there is a gravitational analogue of
Gauss’ law. It can be used in arguments almost identical to those above to
reach the following conclusion. Outside a body with a spherically
symmetrical mass distribution, the gravitational force which it exerts on some other
body is identical with the force that would be exerted on that body by a
particle located at the center of the spherically symmetrical body and having
the same mass as that body. This verifies the guess we made in Sec. 11-1.
Newton needed to prove it in order to show that the moon and an apple
were both attracted to the earth by gravitational interaction. The proof
using Gauss’ law requires almost no calculation. But Newton’s proof, which
involves a three-dimensional integration, is quite laborious and gave him
great difficulty.

Gauss’ law can also be used to determine the electric field inside a
uniformly charged spherical shell, as you will see in Example 20-10.


EXAMPLE 20-10


Find the electric field $\vec{\mathcal{E}}$ inside the thin, uniformly charged spherical shell of
radius R considered in Example 20-9, having charge $q = -|q|$.

■ Figure 20-33 shows a concentric sphere of radius r < R that serves as a gaussian
surface. In this case Gauss’ law,

$$\int_{\text{closed surface}} \vec{\mathcal{E}} \cdot d\vec{a} = \frac{q}{\epsilon_0}$$

tells you that

$$\int_{\text{spherical surface}} \vec{\mathcal{E}} \cdot d\vec{a} = 0 \qquad \text{for } r < R$$

The reason is that the q in Gauss’ law is always the total charge inside the closed
surface, which in this case is zero.

Fig. 20-33 A figure used in Example
20-10 to evaluate the electric field of a
negatively charged, thin metal shell of
radius R at a point inside the shell.
Lighter shading is used to represent the
continuous charge distribution.

Now Gauss’ law requires only that the integral of $\vec{\mathcal{E}} \cdot d\vec{a}$ over the surface of the
gaussian sphere be zero. But the complete spherical symmetry of the situation
makes it evident that the vectors $\vec{\mathcal{E}}$ and $d\vec{a}$ will both have spherical symmetry. Thus
the integral cannot come out zero through cancellation, with $\vec{\mathcal{E}} \cdot d\vec{a}$ having
positive values on some parts of the sphere and compensating negative values on
others. The only possibility is that $\vec{\mathcal{E}} \cdot d\vec{a}$ be zero everywhere on the sphere.
Since $d\vec{a}$ is not zero and since
$\vec{\mathcal{E}}$ is not perpendicular to $d\vec{a}$, you can conclude that

$$\vec{\mathcal{E}} = 0 \qquad \text{for } r < R \tag{20-47}$$




Fig.  20-34  A cross  section  in  the  plane 
of  the  page  representing  a negatively 
charged,  spherical,  metal  shell  and  a 
unit  positive  test  charge  at  a point  P 
inside  the  shell.  The  other  symbols  are 
explained  in  the  text. 


EXAMPLE  20-11 


It  can  be  seen  intuitively  that  the  electric  field  is  zero  at  the  center  of  a 
uniformly  charged,  spherical  shell.  There  is  the  same  amount  of  charge  of 
the  same  sign  in  any  pair  of  equal  area  elements  lying  at  opposite  ends  of  a 
diameter  of  the  shell.  These  equal  charges  exert  forces  of  equal  strength  on 
a positive  test  charge  at  the  center  since  they  are  at  equal  distances  from  it. 
But  the  two  forces  are  in  opposite  directions  and  so  cancel.  Since  the  same 
cancellation  is  obtained  for  every  other  such  pair  of  area  elements,  there  is 
zero  total  electric  force  acting  on  the  test  charge. 

The fact that there is zero electric field at any location inside a uniformly
charged spherical shell, as proved in Example 20-10 by using Gauss’
law, is not intuitively evident. But this fact can be shown to be a consequence
of Coulomb’s law, as it must be in light of the equivalence of Coulomb’s
law and Gauss’ law. Figure 20-34 is supposed to represent cones C₁
and C₁′, with the same small apex angles, extending to the left and right
from an off-center positive test charge at P. All the negative charge on the
shell within the region intercepted by the left cone C₁ exerts a force F₁ on
the test charge directed to the left. All the charge within the region
intercepted by the right cone C₁′ exerts a force F₁′ directed to the right. The
magnitude F₁ of the force to the left is proportional to the amount of charge
intercepted to the left, and that is proportional to r₁², the square of the axial
length of the left cone. But Coulomb’s law states that F₁ is also inversely
proportional to r₁², the square of the distance from the intercepted charge to
the test charge. So F₁ is independent of r₁. The same is true of F₁′, the
magnitude of the force to the right. Hence the equality F₁ = F₁′ holds just as
well for the off-center test charge as for an on-center one, as far as these
particular cones are concerned, and the left and right forces exerted on the
test charge cancel. If you first prove that the angles θ₂ and θ₂′ in the figure
are equal, so that the ratio of the areas cut off by the general cones C₂ and
C₂′ is equal to the ratio of the squares of their axial dimensions r₂
and r₂′, you can apply the same argument to this pair. Thus a complete
cancellation is obtained, and there is no electric force acting on the test charge.
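The cancellation argued above can also be checked numerically. The following is a minimal sketch, not part of the original text: it cuts a uniformly charged spherical shell into equal-area patches, treats each patch as a point charge, and adds up the Coulomb's-law fields at a chosen point. The function names, the patch counts, and the values of the shell radius and charge are illustrative assumptions.

```python
import math

K = 8.9875517923e9  # Coulomb constant 1/(4*pi*eps0), in N*m^2/C^2


def shell_field(point, radius=1.0, charge=-1.0e-9, n_mu=200, n_phi=400):
    """Coulomb-law field at `point` due to a uniformly charged spherical shell.

    The shell is divided into n_mu * n_phi patches of equal area (equal steps
    in cos(theta) and in phi); each patch is replaced by a point charge at its
    midpoint. All default values are illustrative assumptions.
    """
    px, py, pz = point
    dq = charge / (n_mu * n_phi)  # equal areas carry equal charge
    ex = ey = ez = 0.0
    for i in range(n_mu):
        mu = -1.0 + (i + 0.5) * 2.0 / n_mu  # cos(theta) at the patch midpoint
        s = math.sqrt(1.0 - mu * mu)        # sin(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * 2.0 * math.pi / n_phi
            # vector from the patch to the field point
            dx = px - radius * s * math.cos(phi)
            dy = py - radius * s * math.sin(phi)
            dz = pz - radius * mu
            d3 = (dx * dx + dy * dy + dz * dz) ** 1.5
            ex += K * dq * dx / d3
            ey += K * dq * dy / d3
            ez += K * dq * dz / d3
    return ex, ey, ez


def magnitude(v):
    """Length of a 3-component vector."""
    return math.sqrt(sum(c * c for c in v))


if __name__ == "__main__":
    scale = K * 1.0e-9 / 1.0 ** 2  # field magnitude just outside the shell
    inside = magnitude(shell_field((0.3, 0.2, 0.4)))   # off-center, r < R
    outside = magnitude(shell_field((2.0, 0.0, 0.0)))  # r = 2R
    print("inside / scale :", inside / scale)
    print("outside / scale:", outside / scale)
```

For the off-center interior point the summed field should come out very small compared with the field just outside the shell, while at r = 2R it should be close to one-quarter of that value, as expected for a point charge at the center. The residual inside is discretization error, which shrinks as the number of patches is increased.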


Most  applications  of  Gauss’  law  are  to  systems  in  which  source  charges 
are  distributed  with  a high  degree  of  symmetry,  such  as  the  uniformly 
charged  spherical  shell,  or  the  uniformly  charged  plane  of  infinite  extent, 
analyzed  in  examples  of  this  section.  Another  highly  symmetrical  system, 
whose  analysis  is  reserved  for  you  to  do  in  an  exercise  at  the  end  of  this 
chapter, is a uniformly charged straight line of infinite length. In that exercise
you will prove that if λ is the charge per unit length distributed along
the infinite straight line, then the electric field ℰ at a point at distance r
from the line is given by

$$\boldsymbol{\mathcal{E}} = \frac{1}{2\pi\epsilon_0}\,\frac{\lambda}{r}\,\hat{\mathbf{r}} \qquad (20\text{-}48)$$

The unit vector r̂ is directed perpendicularly from the line to the point.
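As a plausibility check on Eq. (20-48), and not a substitute for the derivation asked for in the exercises, the sketch below sums the Coulomb's-law fields of many short segments of a long but finite straight line charge and compares the result with λ/2πε₀r. The function names, the segment count, and the numerical values of λ, r, and the line length are assumptions chosen only for illustration.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, C^2/(N*m^2)


def line_field_coulomb(r, lam, half_length, n=200_000):
    """Radial field at distance r from the midpoint of a straight line charge.

    The line runs from -half_length to +half_length and is split into n short
    segments, each treated as a point charge (midpoint rule). By symmetry the
    components parallel to the line cancel, so only the radial part is summed.
    """
    dz = 2.0 * half_length / n
    total = 0.0
    for k in range(n):
        z = -half_length + (k + 0.5) * dz
        # radial component of the field of the charge lam*dz located at z
        total += lam * dz * r / (r * r + z * z) ** 1.5
    return total / (4.0 * math.pi * EPS0)


def line_field_gauss(r, lam):
    """Magnitude given by Eq. (20-48) for an infinitely long line charge."""
    return lam / (2.0 * math.pi * EPS0 * r)


if __name__ == "__main__":
    r, lam = 0.05, 2.0e-9  # 5 cm from a line carrying 2 nC/m (assumed values)
    numeric = line_field_coulomb(r, lam, half_length=10.0)
    exact = line_field_gauss(r, lam)
    print("Coulomb sum :", numeric, "N/C")
    print("Eq. (20-48) :", exact, "N/C")
```

With the line 400 times longer than the distance r, the two values should agree closely. The small remaining difference comes from the finite length of the line.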

However,  there  are  some  important  applications  of  Gauss’  law  to 
systems  completely  devoid  of  symmetry.  One  is  given  in  Example  20-11. 


A solid  metal  body,  of  arbitrary  shape,  is  shown  in  Fig.  20-35.  The  body  is  given  a 
certain  charge.  Use  Gauss’  law  to  show  that  this  charge  must  distribute  itself  in  such 
a way  that  when  the  distribution  process  ceases,  all  the  charge  is  on  the  surface  of 
the  metal  body. 


938  The  Electric  Force  and  the  Electric  Field 


Fig.  20-35  A cross  section  in  the  plane 
of  the  page  of  a solid  metal  body  of 
arbitrary  shape.  The  circle  represents 
the  cross  section  of  a gaussian  surface 
within  the  body  whose  size  and  location 
are  arbitrary. 


Fig.  20-36  A cutaway  view  of  a nega- 
tively charged  metal  shell,  of  arbitrary 
shape.  Just  inside  the  shell  is  a gaussian 
surface. 


■ First you imagine a gaussian surface to be constructed around any region inside
the metal body. Such a closed surface is indicated in Fig. 20-35, along with one of its
typical surface elements da. When there is no longer a general motion of charge
through the metal body because the charge has attained its equilibrium distribution,
there can no longer be an electric field anywhere in the interior of the body. This is
so because if there were such an electric field, then it would drive charges through
the conductor, in violation of the statement that the charges are at rest. Hence, you
can say that everywhere on the gaussian surface ℰ = 0. And so ℰ · da = 0
everywhere on that surface. Then you apply Gauss’ law:

$$\oint_{\text{closed surface}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = \frac{q}{\epsilon_0}$$

Since  the  integrand  is  everywhere  zero,  the  integral  itself  is  zero.  Thus  Gauss’  law 
requires  that  q = 0,  where  q is  the  net  charge  contained  within  the  gaussian  surface. 
But  this  result  is  obtained  no  matter  what  region  inside  the  metal  body  is  enclosed 
by  the  gaussian  surface.  Therefore  you  can  conclude  that  there  is  no  net  charge 
anywhere  inside  the  metal  body.  From  this  you  can  say  that  all  the  charge  given  to 
the  body  must  reside  on  its  surface,  as  soon  as  the  charge  reaches  its  equilibrium  dis- 
tribution. The  same  statement  was  justified  on  different  grounds  in  Sec.  20-2. 


Figure 20-36 depicts a charged metal shell, of arbitrary shape. Since
charge moves freely through metal, it will distribute itself along the
surfaces of the shell. This was proved in Example 20-11. But here there are
two surfaces, an inner one and an outer one. Will there be charge on both?
No. All the charge added to the metal shell will reside on its outer surface,
in equilibrium. An intuitive argument leading to this conclusion is based on
the fact that like charges repel. You should be able to give such an
argument. A formal argument leading to the same conclusion is based on Gauss’
law and is much like the one in Example 20-11. Can you give this argument?

In contrast to the highly symmetrical distribution of source charges in
the case of the uniformly charged, spherical shell considered earlier, for
the situation in Fig. 20-36 there is no symmetry in the distribution of source
charges. Nevertheless, the electric field is zero at any location in the interior of a
charged metal shell, no matter what its shape, just as ℰ is zero anywhere inside a
uniformly charged, spherical shell. It is not feasible to prove this statement
by a direct application of Coulomb’s law, as can be done for the
corresponding statement about a uniformly charged, spherical shell. The reason
is that the charge placed on an arbitrary conducting shell distributes itself
in a complicated way that depends on the exact shape of the shell and is
very difficult to calculate. Until the distribution of source charges is known,
there is no way to make direct use of Coulomb’s law. (In Chap. 21 you will
see that this charge tends to concentrate in regions where the curvature of
the surface of the shell is highest. And you will also see that there is a
method that could determine the charge distribution. But it is not easy.)

Gauss’ law can be used in a quite simple argument to prove the statement
above that the electric field is zero everywhere inside a charged metal shell
of any shape. The gaussian surface indicated in Fig. 20-36 is constructed
just inside the charged metal shell. Since there is zero charge
enclosed by the gaussian surface, Gauss’ law requires that

$$\oint_{\text{closed surface}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = 0 \qquad (20\text{-}49)$$




At any location on the closed gaussian surface da is directed outward.
And if ℰ is not zero at any location, then it certainly must have a direction
which is essentially the same as that of da. The reason is that da points
toward the nearest of the negative charges on the outer surface of the
metal shell. If there are electric field lines, then they must be heading in the
direction of those charges. Hence if ℰ is not zero, then the quantity ℰ · da
must have a positive value. The same is true for every region of the closed
surface. This means that the integral in Eq. (20-49) cannot be equal to zero
on account of cancellation between regions where ℰ · da is positive and
ones where it is negative. It follows that ℰ · da must be zero everywhere on
the closed surface. Now da is not zero, and we have just shown that the zero
value of ℰ · da certainly cannot be due to ℰ and da being perpendicular.
We must conclude that ℰ is zero at all locations just inside the charged
metal shell.

Since there is no electric field at any location just inside the metal shell,
there are no electric field lines at any of these locations. But electric field
lines begin and end only on charges, and there are no charges anywhere
inside the metal shell. Hence, there being no electric field lines just inside the
metal shell, there can be none anywhere inside it. In other words, the
electric field ℰ is zero everywhere inside the charged metal shell, as we set out to
prove.

The prediction that there is no electric field inside a charged metal
shell was tested experimentally in a qualitative way by Faraday and in a
quantitative manner by Henry Cavendish, another of the early
investigators of the electric force. Improved techniques were used by a
succession of other workers, culminating in an experiment performed in 1971 by
E. R. Williams, J. E. Faller, and H. A. Hill. Since the experiment tests
Gauss’ law, it also tests Coulomb’s law. The specific feature of Coulomb’s
law tested is the variation with the distance r between two point charges of
the magnitude F of the force they exert on each other. If F ∝ 1/rⁿ with
n = 2 precisely, then there will be precisely zero electric field inside the
shell. The experiment showed that the electric field is zero to within such
extremely narrow limits that Williams, Faller, and Hill were able to conclude
that the value of n is 2 to within about 1 part in 10¹⁶. The inverse-square
dependence of Coulomb’s law is the most accurately known property of
nature!
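The link between the exponent n and the field inside a shell can be illustrated numerically. The sketch below is not a model of the Williams, Faller, and Hill apparatus. It assumes a hypothetical force law F ∝ 1/rⁿ with unit force constant, a uniformly charged spherical shell of unit radius and unit total charge, and a field point on an axis through the center; the function name and the step count are likewise assumptions.

```python
def axial_field(a, n, steps=20_000):
    """Field at distance a from the center of a uniformly charged shell.

    Dimensionless sketch: unit shell radius, unit total charge, unit force
    constant, and a hypothetical point-charge law F proportional to 1/r**n.
    The shell is sliced into thin rings labeled by mu = cos(theta); each ring
    carries the fraction dmu/2 of the total charge.
    """
    dmu = 2.0 / steps
    total = 0.0
    for i in range(steps):
        mu = -1.0 + (i + 0.5) * dmu         # ring position (midpoint rule)
        dist2 = 1.0 + a * a - 2.0 * a * mu  # squared distance, ring to point
        # axial component of the ring's field: (a - mu) / dist**(n + 1)
        total += 0.5 * dmu * (a - mu) / dist2 ** ((n + 1) / 2.0)
    return total


if __name__ == "__main__":
    for n in (2.0, 2.001, 2.01):
        print("n =", n, " field at a = 0.5:", axial_field(0.5, n))
```

For n = 2 the interior field vanishes to within the numerical error of the sum, while for n slightly different from 2 a small field appears that grows roughly in proportion to n − 2. An upper limit on the field measured inside a charged shell therefore translates into an upper limit on the departure of n from 2.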


EXERCISES 
Group  A 

20-1.  Repulsive  spheres,  I.  Two  small  charged  spheres 
repel  each  other  with  a force  of  0.090  N when  they  are 
10  cm  apart.  What  will  be  the  repulsive  force  if  they  are 
moved  until  they  are  30  cm  apart? 

20-2. Testing! Testing! A charge of +3.0 × 10⁻⁸ C is
10 cm from one of −3.0 × 10⁻⁸ C. A test charge of +1.0 ×
10⁻⁸ C is placed midway between them. What is the
magnitude and direction of the force on the test charge?

20-3.  Comparing  electric  and  gravitational  force. 

a.  Calculate  the  electric  force  of  repulsion  between 
two  protons  1.0  cm  apart. 


b.  Calculate  the  gravitational  force  of  attraction 
between  the  two  protons  of  part  a. 

c.  What  is  the  ratio  of  the  electric  to  the  gravitational 
force? 

20-4. Accelerated alpha particle. To a first approximation,
a uranium nucleus may be considered a uniformly
charged sphere with a total charge of +92e. Similarly,
an alpha particle can be treated as a smaller sphere
with a charge of +2e and a mass of 6.7 × 10⁻²⁷ kg. What
is the magnitude of the acceleration of the alpha particle
when its center is at a distance of 2.0 × 10⁻¹⁴ m from the
center of the uranium nucleus? Assume that the alpha
particle and uranium nucleus do not overlap, and that the
latter is so massive compared to the former that the
uranium nucleus may be considered to be at rest in the
inertial frame used to evaluate the acceleration of the alpha
particle.


Fig. 20E-12

20-5.  Nuclear  decay.  A lead-210  nucleus  decays  by 
emitting  an  electron,  forming  a product  nucleus 
bismuth-210.  What  does  the  principle  of  charge  conserva- 
tion enable  you  to  conclude  about  the  product  nucleus? 

20-6.  Repulsive  spheres,  II.  Two  small  identical  metal 
spheres  have  equal  like  charges.  They  are  suspended  by 
insulating  threads  and  repel  each  other  with  a force  of 
0.10  N when  10  cm  apart,  measured  center-to-center. 
They  are  separated,  holding  them  by  the  threads.  One  is 
touched  to  an  identical  uncharged  insulated  metal  sphere. 
The  other  is  treated  in  the  same  manner  using  a different 
identical  uncharged  insulated  metal  sphere.  They  are 
then  brought  together  until  they  are  again  10  cm  apart. 
With  what  force  do  they  repel  each  other? 

20-7. Electron in uniform field, I. What is the magnitude
of the acceleration of an electron in a uniform electric
field of 2.0 × 10⁴ N/C? What is the direction of the
acceleration?

20-8. Electron in uniform field, II. How long does it
take an electron in a uniform field of 1.0 × 10⁴ N/C to
acquire 2.0 percent of the speed of light, if it starts from
rest?

20-9.  Crossing  forbidden.  Explain  why  it  is  impossible 
for  electric  field  lines  to  cross. 

20-10. Zap!

a. When the electric field in air reaches a value of
about 3 × 10⁶ N/C, the air becomes conducting and
charge begins to leak off the charged object producing the
electric field. What is the maximum charge that can reside
on a metal sphere of radius 1.0 cm in air?

b. If this maximum charge were positive, how many
electrons were removed from the sphere while charging
it?

c. If the 1.0-cm sphere were copper, how many
electrons would it contain? Each copper atom has 29
electrons. The atomic weight of copper is 63.5, and the
density of copper is 8.9 × 10³ kg/m³.

d. What fraction of the electrons were removed from
the copper sphere?

20-11.  Coulomb’s  law  from  Gauss’  law.  By  applying 
Gauss’  law  to  a point  charge,  derive  Coulomb’s  law. 

20-12. Induced charge.

a. A ball carrying charge +|q| is introduced into the
hollow of an insulated conductor through an opening of
negligible size. See Fig. 20E-12. The ball is suspended by
an insulated thread. Explain how Gauss’ law leads to the
conclusion that there is a charge −|q| on the inner surface
of the conductor.

b. The conductor was originally neutral. Explain how
there can be a charge −|q| on one part of it.


Group  B 

20-13. Suspended spheres, I.

a. Two small identical metal spheres, each of mass m,
are given identical charges q and suspended by insulating
threads of length l. Prove that in equilibrium the angle θ
which each thread makes with the vertical satisfies the
relation sin³ θ / cos θ = q²/(16πε₀mgl²).

b. If m = 1.0 × 10⁻⁴ kg and l = 1.0 m, what is
q if they come to rest at a center-to-center separation of
0.080 m?

20-14. Suspended spheres, II. Two small identical metal
spheres are charged, one with +8.0 × 10⁻⁸ C, the other
with −2.0 × 10⁻⁸ C. The spheres are suspended by
insulating threads and moved 1.0 m apart. The force of
attraction between them is measured. By means of the threads
they are brought into contact and once again moved 1.0 m
apart. What is the ratio of the new force between them to
the original force?

20-15.  Large-angle  scattering.  Following  the  procedure 
suggested  immediately  below  Example  20-5,  show  that 
approximately  0.01  percent  of  the  alpha  particles  in  the 
experiment  considered  there  will  be  scattered  through 
angles  greater  than  90°. 

20-16. Parabolic electron trajectory. Two oppositely and
equally charged parallel metal plates furnish a uniform
electric field of strength ℰ at right angles to their surfaces.
An electron is injected into the field with its initial velocity
v₀ parallel to the plates. Take the direction of v₀ as the
direction of the x axis. The y axis is in the direction of the
field. The origin is at the point of entry into the field.
Show that the trajectory is a parabola whose equation is
y = −½(ℰe/m_e v₀²)x², where −e and m_e are the charge and
mass of the electron.

20-17. Equilibrium point. Charges of +4.0 × 10⁻⁸ C
and −1.0 × 10⁻⁸ C are placed 1.0 m apart.

a.  At  what  point  is  the  electric  field  zero? 

b.  Considering  only  motion  along  the  line  connecting 
the  source  charges,  show  quantitatively  that  a positive  test 
charge  at  this  point  is  in  unstable  equilibrium. 

c.  Give  a brief  qualitative  discussion  showing  that  the 
point  is  one  of  stable  equilibrium  for  the  test  charge  with 
respect  to  motion  along  a line  perpendicular  to  the  line 
connecting  the  source  charges. 

20-18. Charged ring. The electric field due to a uniformly
charged ring is given on the axis of the ring by Eq.
(20-28). Show that its magnitude is a maximum at z = ±R/√2.


Exercises  941 


20-19. Square array of charges. Four equal charges
+|q| are placed at the corners of a square of side a.

a. What is the magnitude and direction of the electric
field at each corner?

b. What is the electric field at the center of the
square?

c. What is the sign and magnitude of the charge
which would produce zero electric field at each of the
corners if it were placed at the center of the square?

20-20. Long charged wire, I. Use Coulomb’s law to find
the electric field ℰ at a distance r from an infinitely long
metal wire, of very small diameter, which has a positive
charge λ per unit length. Compare your results with Eq.
(20-48), obtained from Gauss’ law.

20-21. Long charged wire, II. An infinitely long
straight wire has a uniform charge λ per unit length. Use
Gauss’ law to show that the electric field ℰ it produces at a
point whose distance from the wire is r is given by Eq.
(20-48). Explain why Eq. (20-48) is more general than the
result obtained in Exercise 20-20.

20-22. Sphere and plate. A small metal sphere of mass
m is hanging from an insulating thread attached to the top
edge of a very large metal plate which is fixed in a vertical
plane. The plate is given a charge whose surface density
(counting the charge on both sides) is σ. The sphere,
initially touching the plate, acquires a charge q from the
plate. Show that tan θ = qσ/2ε₀mg, where θ is the
equilibrium angle between the thread and the plate.

20-23. Field near charged conductor. All excess charges
on any conductor reside on its surface. These charges
produce an electric field outside the surface. Immediately
outside the surface, the direction of ℰ at every location is
normal to the surface at that location, since otherwise
charge would be moving along the surface in response to
ℰ. Use Gauss’ law to prove that the magnitude ℰ of the
electric field at any location immediately outside the
surface is given by ℰ = σ/ε₀, where σ is the charge per unit
area on the surface at that location.

20-24.  No  charge  inside.  Use  Gauss’  law  to  prove  that 
all  charge  given  to  a metal  shell  ends  up  distributed  over 
the  outer  surface  of  the  shell. 

20-25. No field inside. Prove that the angles θ₂ and θ₂′
in Fig. 20-34 are equal, thereby completing the
Coulomb’s-law proof that there is no electric field inside a
uniformly charged spherical shell.

Group  C 

20-26. Balancing the force. Two equal charges +|Q|
are placed at opposite corners of a square of side a. Two
other equal charges −|q| are placed at the remaining two
corners.

a. What is the value |Q|/|q| if the force on either
charge +|Q| is to be zero?

b. What is the magnitude and direction of the force
on either charge −|q| with this value of |Q|/|q|?


20-27. Triangular array of charges, I. Three equal
charges q are placed at the apexes of an equilateral
triangle of side a.

a. What is the electric field at the center of the
triangle (the intersection of the medians)?

b.  What  is  the  magnitude  and  direction  of  the  force 
experienced  by  each  charge  due  to  the  presence  of  the 
other  two? 

20-28. Three suspended spheres. A charge q is given to
each of three small identical spheres, each of mass m,
which are suspended from a single point by insulating
threads, each of length a. The spheres repel each other
until they are at the corners of an equilateral triangle of
side a. Show that q² = 4πε₀mga²/√6. Hint: The point of
suspension is the vertex of the regular tetrahedron in Fig.
20E-28.


20-29. Maximum deflection angle, I. Evaluate the order
of magnitude of the maximum possible deflection angle,
θ_max, when an alpha particle of mass m_α collides with a free
and initially stationary electron of mass m_e, as follows.
First justify the statement that in the collision of a very
massive body, moving initially at speed v, with a free and
initially stationary body of small mass, the speed of the
latter after the collision cannot be greater than 2v. Do this
by considering the collision in a reference frame moving
with the massive body, and then transforming to the
frame in which the other body initially is stationary. Then
show that Δp_α, the magnitude of the maximum change in
the momentum of the alpha particle in the collision, is
2m_e v. Let the momentum change vector be perpendicular
to the initial alpha particle momentum vector, whose
magnitude is p_α = m_α v, and show that θ_max ≃ Δp_α/p_α. Then put
it all together and obtain a numerical estimate for θ_max.

20-30. Maximum deflection angle, II. Evaluate the
order of magnitude of the maximum possible deflection
angle, θ_max, when an alpha particle of mass m_α passes
through the positive charge +Ze uniformly distributed
over a sphere of atomic radius R in Thomson’s
“raisin-cake” model of the atom, as follows. First explain why the
order of magnitude of the maximum force acting on the
alpha particle is F ≃ 2Ze²/4πε₀R². Then explain why you
can estimate the maximum magnitude Δp_α of the
momentum transferred to the alpha particle by taking the
product of F and the time Δt required for it to pass
through the atom. Find Δp_α and also evaluate p_α, the
magnitude of the alpha particle’s initial momentum. Then use
these quantities as in Exercise 20-29 to obtain a numerical
estimate for θ_max.

20-31. Triangular array of charges, II. Three equal
charges +|q| are placed at the corners of the base of a
regular tetrahedron with sides of length a. See Fig. 20E-28.
Prove that the electric field at the fourth vertex has a
magnitude equal to √6|q|/4πε₀a². What is the direction of the
electric field? Hint: The vertex lies vertically above the
center of the base.

20-32. Circular disk. A circular disk is given a uniform
charge per unit area of σ. Find the electric field at a point
P on the axis of the disk by dividing it into circular strips
and applying Eq. (20-28). Specifically:

a. Show that at P, the magnitude of the electric field
is ℰ = (σ/2ε₀)(1 − cos θ), where θ is the angle subtended
by the radius of the disk at P. See Fig. 20E-32.


b. By extending the radius of the disk to infinity,
show that the result gives the electric field due to an
infinite charged flat plate quoted in Eq. (20-45).

20-33. “Raisin-cake” atom. In the Thomson “raisin-cake”
model of the hydrogen atom, the positive charge +e
is uniformly distributed over a sphere of radius R.
Embedded in the center of the sphere is a much smaller
particle of mass m_e, the electron, with charge −e. This is the
normal state of the atom, since the electric field is zero at
the center of the positive sphere.

a. Consider a spherical surface of radius r concentric
with the sphere of positive charge. Show that the
magnitude of the electric field is ℰ = er/4πε₀R³ at a distance r
from the center of the positive charge. Explain why its
direction is away from the center.

b. If the electron at the center is displaced to r, it will
experience a force of magnitude e²r/4πε₀R³ tending to
restore it to the center. Show that the electron will oscillate
about the center with a frequency

$$\nu = \frac{e}{2\pi R}\sqrt{\frac{1}{4\pi\epsilon_0 R m_e}}$$

c. For R = 5.3 × 10⁻¹¹ m, the radius of a hydrogen
atom, calculate ν.

20-34. Evaluate the charge distribution. The charge on a
nonconducting sphere is distributed in a spherically
symmetric fashion so that the charge density ρ varies only with
r, the distance from the center. If ℰ, the magnitude of the
electric field, is constant throughout the sphere, show by
means of Gauss’ law that ρ = 2ℰε₀/r.


20-35.  Please  confirm.  Confirm  the  Gauss’-law  results 
of  Example  20-8  by  performing  the  Coulomb’s-law  calcu- 
lation outlined  in  small  print  below  that  example. 

20-36. Coulomb’s law and the dimensionality of space.
Equations (20-18) and (20-45), and the results of Exercises
20-20 or 20-21, can be summarized by saying that the
dependence of the electric field on the distance r from its
source obeys the proportionality ℰ ∝ r⁻² for a point
source, ℰ ∝ r⁻¹ for an infinite uniform line source, and
ℰ ∝ r⁰ for an infinite uniform plane source. It can also be
said that the electric field of a point source is
three-dimensional, the electric field of an infinite uniform line
source is two-dimensional, and the electric field of an
infinite uniform plane source is one-dimensional. Write
several paragraphs justifying the second statement and its
relation to the first. Then comment on the relation between
Coulomb’s law and the fact that space is
three-dimensional.


Numerical 

20-37.  Alpha-particle  scattering,  I.  Run  the  central- 
force  program  for  alpha-particle  scattering  with  the  same 
initial  values  and  parameters  as  in  Example  20-4,  except 
let  the  impact  parameter  be  zero  to  generate  a “head-on” 
approach, followed by a scattering through 180°.
Determine the distance of closest approach of the alpha particle
to the nucleus. Compare this distance of closest
approach with the value 42 × 10⁻¹⁵ m quoted in the text,
obtained from analytical calculations by Rutherford.

20-38. Alpha-particle scattering, II. Run the central-force
program for alpha-particle scattering with the same
initial values and parameters as in Example 20-4, except
use the value of the parameter α appropriate to the
aluminum nucleus, with Z_Al = 13 and m_Al = 4.48 × 10⁻²⁶ kg.
Compare your trajectory with the one plotted in Fig.
20-12, and explain their differences.

20-39. Alpha-particle scattering, III. When alpha
particles of kinetic energy 6.4 × 10⁻¹² J = 40 MeV are
scattered from uranium nuclei, as the scattering angle
becomes larger than about 60° departures become apparent
between the observed behavior and the behavior
predicted by assuming that the only force acting between the
two bodies is the one given by Coulomb’s law for point
charges. This is interpreted as due to the onset of the
strong nuclear force. Use these experimental observations
and the central-force motion program to estimate the
radius of the uranium nucleus in the following way.
Evaluate the parameter α, using the fact that Z_U = 92 and
m_U = 3.95 × 10⁻²⁵ kg. Next evaluate (dx/dt)₀. Then make
several runs, using different values of y₀ until you get a
scattering angle of about 60°. The nuclear radius is
approximately equal to the distance of closest approach
for the 60° trajectory, less about 2 × 10⁻¹⁵ m for the
alpha-particle radius and about 2 × 10⁻¹⁵ m for the distance
over which the strong nuclear force acts.




21

The Electric Potential


21-1 ELECTRIC POTENTIAL ENERGY AND ELECTRIC POTENTIAL


The  techniques  of  Chap.  20  enable  us  to  evaluate  the  electric  force  exerted 
on  a particle  having  a given  charge  by  other  charged  particles.  Knowing 
this  force,  and  given  the  mass  of  the  particle  on  which  it  is  exerted,  we  can 
determine  the  acceleration  of  the  particle  by  using  Newton’s  second  law. 
From  the  acceleration  we  can  find  the  particle’s  motion.  Similar  analyses 
will  lead  to  a determination  of  the  motions  of  the  other  particles  in  a system 
of  particles  interacting  by  means  of  electric  forces.  But  our  experience  with 
mechanics  has  taught  us  that  it  is  often  much  more  efficient  to  analyze  the 
behavior  of  a system  by  applying  energy  relations  than  by  applying  Newton’s 
laws  of  motion  directly.  Basic  to  any  such  analysis  is  the  concept  of  potential 
energy.  In  this  chapter  we  consider  properties  and  applications  of  the  po- 
tential energy  associated  with  the  electric  force,  called  the  electric  potential 
energy.  We  also  consider  the  electric  potential  energy  per  unit  charge  on  a 
test  charge,  called  the  electric  potential. 


The electric force exerted on a charged particle by another charged
particle is given by an expression with the same mathematical form as the
one giving the gravitational force exerted on a particle having mass by
another particle having mass. Consequently, an expression for the electric
potential energy of a system of two particles can be developed in a way
completely parallel to the way we developed the gravitational potential energy
in Sec. 11-6. Figure 21-1, which is analogous to Fig. 11-18, shows a source
charge q and a test charge q_t. The position of the source charge is fixed in
the reference frame used in the figure. But the test charge moves along the
path indicated from a position with respect to the source charge given by
the vector r_i to one given by the vector r_f. The figure is drawn under the
assumption that the source and test charges are of the same sign, so that the

944 


sf 


Fig.  21-1  Diagram  for  evaluating  the  change  in  electric  potential  energy 
of  a system  consisting  of  a source  charge  and  a test  charge  as  the  test 
charge  moves  along  a segment  of  its  path  from  initial  position  st  to  final 
position  sf.  The  coordinates  s{  and  sf  are  measured  along  the  path  from 
some  fixed  origin  lying  on  the  path.  At  any  point  along  the  path,  the  di- 
rection of  the  electric  force  F,  exerted  on  the  test  charge  is  F,  = r,  where  r 
is  the  position  vector  from  the  source  charge  to  the  test  charge  and  where 
it  is  assumed  that  both  charges  are  of  the  same  sign.  As  the  test  charge 
moves  through  the  infinitesimal  displacement  ds,  the  electric  force  does 
work  dW  = F,  • ds.  The  diagram  shows  this  is  equal  to  dW  = F(  • 
dr  = F dr,  as  discussed  in  the  text. 


force  F,  exerted  on  the  test  charge  has  the  same  direction  as  that  of  the  unit 
vector  r directed  from  the  source  charge  to  the  test  charge.  But  the  result  to 
be  obtained  does  not  depend  on  this  assumption. 

The first step in finding an expression for a potential energy associated
with the force $\mathbf{F}_t$ is to calculate the work W done by this force acting on the
test charge when it moves from its initial position to its final position.
According to the definition of work, Eq. (7-35), W has the value

\[ W = \int_{s_i}^{s_f} \mathbf{F}_t \cdot d\mathbf{s} \tag{21-1} \]

Here $d\mathbf{s}$ is a displacement vector along the path of the test charge, $s_i$ is a
coordinate measured along that path which specifies the initial position of
the test charge, and $s_f$ is a similar coordinate specifying its final position.
Coulomb's law gives the force $\mathbf{F}_t$ as

\[ \mathbf{F}_t = \frac{1}{4\pi\epsilon_0}\,\frac{q q_t}{r^2}\,\hat{\mathbf{r}} \]

Using  this  in  Eq.  (21-1),  we  have 


\[ W = \frac{1}{4\pi\epsilon_0} \int_{s_i}^{s_f} \frac{q q_t}{r^2}\,\hat{\mathbf{r}} \cdot d\mathbf{s} \tag{21-2} \]


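It can be seen from Fig. 21-1 that $\hat{\mathbf{r}} \cdot d\mathbf{s} = 1 \cos\theta\, ds = dr$. That is,
$\hat{\mathbf{r}} \cdot d\mathbf{s}$ has the value dr of the radial component of $d\mathbf{s}$. Using this fact and the
fact that $(1/4\pi\epsilon_0)\,q q_t$ is a constant, we get

\[ W = \frac{1}{4\pi\epsilon_0}\, q q_t \int_{r_i}^{r_f} \frac{dr}{r^2} \]

Evaluating the integral, we obtain

\[ W = -\frac{1}{4\pi\epsilon_0}\, q q_t \left( \frac{1}{r_f} - \frac{1}{r_i} \right) \tag{21-3} \]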

In this expression, $r_i$ and $r_f$ are the magnitudes of the vectors $\mathbf{r}_i$ and $\mathbf{r}_f$
describing the initial and final positions of the test charge with respect to
the source charge. That the work W done by the electric force $\mathbf{F}_t$ acting on
the test charge depends only on its initial and final distances from the


21-1  Electric  Potential  Energy  and  Electric  Potential  945 


source charge, and not on the path followed by the test charge in going
between its initial and final positions, tells us that the electric force is a
conservative force. (See Sec. 7-5.)
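
The path independence just described can be checked numerically. The following is a minimal sketch, not part of the original text: the charge values, the two paths, and the helper names (`coulomb_force`, `work_along_path`, `straight_path`) are illustrative assumptions. It approximates the line integral of Eq. (21-1) along a direct path and along a detour between the same end points, and compares both with the closed-form result of Eq. (21-3).

```python
import math

K = 8.99e9  # 1/(4*pi*eps0) in N*m^2/C^2 (rounded value used in this chapter)

def coulomb_force(q, qt, x, y):
    """Force (Fx, Fy) on test charge qt at (x, y); source charge q sits at the origin."""
    r = math.hypot(x, y)
    f = K * q * qt / r**2          # signed magnitude along the unit vector r-hat
    return f * x / r, f * y / r

def work_along_path(q, qt, points):
    """Midpoint-rule approximation of W = integral of F . ds along a polyline."""
    w = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        fx, fy = coulomb_force(q, qt, 0.5 * (x0 + x1), 0.5 * (y0 + y1))
        w += fx * (x1 - x0) + fy * (y1 - y0)
    return w

def straight_path(start, end, n):
    """n equal segments from start to end (hypothetical helper)."""
    return [(start[0] + (end[0] - start[0]) * k / n,
             start[1] + (end[1] - start[1]) * k / n) for k in range(n + 1)]

# Assumed illustrative values: r_i = 1 m, r_f = 3 m, charges of the same sign.
q, qt = 1.0e-6, 2.0e-6
start, end, via = (1.0, 0.0), (0.0, 3.0), (2.5, 2.5)
n = 20000

direct = straight_path(start, end, n)
detour = straight_path(start, via, n) + straight_path(via, end, n)[1:]

w_direct = work_along_path(q, qt, direct)
w_detour = work_along_path(q, qt, detour)
w_exact = -K * q * qt * (1 / 3.0 - 1 / 1.0)   # Eq. (21-3)

print(w_direct, w_detour, w_exact)
```

Within the accuracy of the numerical integration, the three values should agree, which is what the conservative character of the force requires.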

Since the electric force is a conservative force, an electric potential
energy can be defined. (See Sec. 7-6.) We do this by using the work-potential
energy relation of Eq. (7-46), which in our present notation is

\[ \Delta U = -W \]

This relation expresses the change $\Delta U$ in the potential energy of the source
charge-test charge system, as the test charge passes from its initial to its
final position, in terms of the negative of the work W done by the force
acting on the test charge. Using Eq. (21-3) to evaluate the work W, we have
for the change $\Delta U$ in the electric potential energy U associated with the
force


\[ \Delta U = \frac{1}{4\pi\epsilon_0}\, q q_t \left( \frac{1}{r_f} - \frac{1}{r_i} \right) \tag{21-4} \]


Equation (21-4) states that if the source and test charges are of the same
sign, so that $q q_t$ has a positive value, then the electric potential energy U
decreases as the test charge moves away from the source charge. To see this,
note that for such motion $r_f > r_i$ and hence $1/r_f < 1/r_i$, leading to a
negative value for $\Delta U$. Conversely, U increases as the test charge moves toward
the source charge. If the source and test charge are of the opposite sign,
then U increases as their separation increases and decreases as the
separation decreases.
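
The sign rules just stated follow directly from Eq. (21-4) and can be tabulated with a few lines of code. This is a small sketch under assumed values; the function name `delta_u` and the charges and distances are illustrative only.

```python
K = 8.99e9  # 1/(4*pi*eps0) in N*m^2/C^2 (rounded)

def delta_u(q, qt, r_i, r_f):
    """Change in electric potential energy, Eq. (21-4), in joules."""
    return K * q * qt * (1.0 / r_f - 1.0 / r_i)

# Assumed illustrative charges (1 microcoulomb) and distances (metres).
same_sign_moving_apart = delta_u(1e-6, 1e-6, r_i=1.0, r_f=2.0)
same_sign_moving_closer = delta_u(1e-6, 1e-6, r_i=2.0, r_f=1.0)
opposite_sign_moving_apart = delta_u(1e-6, -1e-6, r_i=1.0, r_f=2.0)

print(same_sign_moving_apart)       # negative: U decreases
print(same_sign_moving_closer)      # positive: U increases
print(opposite_sign_moving_apart)   # positive: U increases
```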

We can deal with specific values of the electric potential energy U of
the source charge-test charge system, instead of just the changes $\Delta U$ in
these values, if we agree on a reference position at which U is assigned the
value 0. Just as in the very analogous case of the gravitational potential
energy for a system of two particles with mass, most often it is convenient to
choose this reference position as one where $r = \infty$. That is, we agree that
U = 0 for $r = \infty$ so that the electric potential energy is zero when the two charges
are separated by an infinite distance. If we then suppose that the test charge is
initially at a position infinitely distant from the source charge (at $r_i = \infty$)
and moves to a final position whose distance from the source charge has the
arbitrary value r (at $r_f = r$), we have from Eq. (21-4)


\[ \Delta U = \frac{1}{4\pi\epsilon_0}\,\frac{q q_t}{r} \]


But U = 0 at the initial position since it is the reference position.
Consequently, the value of U at the final position is equal to the change $\Delta U$. Thus
we have $U = \Delta U$. Putting it all together, we obtain the following expression
for the electric potential energy U of a system of two point charges q and $q_t$ when they
are separated by a distance r:

\[ U = \frac{1}{4\pi\epsilon_0}\,\frac{q q_t}{r} \qquad \text{taking } U = 0 \text{ for } r = \infty \tag{21-5} \]


Although derived by considering point charges, Eq. (21-5) can be
applied to calculate the electric potential energy U for a system of two extended
charge distributions, one with total charge q and the other with total charge
946  The  Electric  Potential 


$q_t$, provided the three following conditions are satisfied: (1) Each charge
distribution has spherical symmetry; (2) the charge distributions do not
penetrate each other; (3) their separation r is measured center to center.
The considerations of Sec. 20-7 show this to be true because when these
conditions are satisfied, the electric force exerted by one extended charge
distribution on the other is exactly the same as if all the charge of each were
concentrated at a point at its center. Since the electric potential energy U of
the system of two extended charge distributions is calculated from the work
done by this force, it will have the value given by making such use of Eq.
(21-5). Example 21-1 applies this conclusion to a very important case.


EXAMPLE  21-1 

Figure 21-2 is a reproduction of Fig. 15-15. It shows the principal features of the
experimentally determined potential energy U for the fission of a uranium-235
nucleus versus the center-to-center separation r of its two constituent parts. As we
explained when the figure was presented in Chap. 15, the smoothly descending part
of the curve corresponds to the electric potential energy of the system of two
fragments into which the nucleus splits by fission. For the purpose of obtaining an
estimate, assume that each of these fission fragments carries a charge of +46e (half the
charge of the uranium nucleus) and that this charge is distributed through each
fragment with spherical symmetry. Then use Eq. (21-5) to calculate the electric
potential energy of the system of two fission fragments when their center-to-center
separation r has the smallest value found in the smooth part of the curve,
$1.2 \times 10^{-14}$ m. (For smaller values of r the adjacent surfaces of the fission fragments are so
close that the strong nuclear force begins to act between them, causing U to dip
below the values predicted by considering only the electric force.) Compare your
prediction with the value of U read from the figure at $r = 1.2 \times 10^{-14}$ m.

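■ Treating one fission fragment as the “source charge,” q = +46e, and the other
as the “test charge,” $q_t$ = +46e, and setting $r = 1.2 \times 10^{-14}$ m, Eq. (21-5) gives you

\[ U = \frac{1}{4\pi\epsilon_0}\,\frac{q q_t}{r} = 9.0 \times 10^{9}\ \mathrm{N\cdot m^2/C^2} \times \frac{(46 \times 1.6 \times 10^{-19}\ \mathrm{C})^2}{1.2 \times 10^{-14}\ \mathrm{m}} = 4.0 \times 10^{-11}\ \mathrm{J} \]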


This value of the electric potential energy agrees with the value of U obtained
from the figure to within about 25 percent. You cannot expect the calculated value
to be in better agreement with the experimental value. One reason is that the
charge of the uranium nucleus is not evenly shared by the fission fragments, as
assumed in the calculation. Another is that when the fission fragments are very close,
they are not perfectly spherical, as was assumed. Nevertheless, you have used Eq.
(21-5) to obtain an approximate prediction of the value of the quantity U, which
was explained in Sec. 15-5 to be essentially equal to the energy released in the
fission of a uranium-235 nucleus!
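
The arithmetic of Example 21-1 can be reproduced with a short script. This is only a sketch: the function name `potential_energy` is an illustrative choice, and the constants are the rounded values used in the example, so the last digit may differ slightly from the figure quoted in the text.

```python
K = 9.0e9          # 1/(4*pi*eps0) in N*m^2/C^2, rounded as in Example 21-1
E_CHARGE = 1.6e-19  # magnitude of the electron charge in C, rounded as in the example

def potential_energy(q, qt, r):
    """Electric potential energy of two point charges, Eq. (21-5), with U = 0 at r = infinity."""
    return K * q * qt / r

q = qt = 46 * E_CHARGE   # each fragment is assumed to carry +46e
r = 1.2e-14              # center-to-center separation in metres

U = potential_energy(q, qt, r)
print(f"U = {U:.1e} J")
```

The result is positive, as it must be for two charges of the same sign, and is of the order of $4 \times 10^{-11}$ J.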


Fig. 21-2 The potential energy U as a function of the center-to-
center separation r of the fission fragments in the nuclear reaction
leading to the fission of uranium. Except at the smallest values of r
shown, the potential energy is due entirely to the electric force
which each fragment exerts on the other.




In Sec. 15-5 we discussed the process in which energy is released when a
positive sodium ion Na⁺ comes together with a negative chlorine ion Cl⁻ to form the
molecule NaCl. See Fig. 15-13. The value of the electric potential energy change in
this process can also be predicted (more accurately than the value just
obtained) by applying Eq. (21-5). (To do this, you must be careful in handling the
signs of the charges and also of the way the value of U is chosen at $r = \infty$.) Thus Eq.
(21-5) also can be used to estimate the value of the quantity U, which in Sec. 15-5
was shown to determine the energy release in a typical chemical reaction, a
quantity as important as the one estimated in this example.


When dealing with the electric force $\mathbf{F}_t$ exerted on a test charge $q_t$ by a
source charge q, we have found it very useful to define the electric force per
unit charge of the test charge, $\mathbf{F}_t/q_t$, which is called the electric field $\boldsymbol{\mathcal{E}}$. It is
just as useful to do the analogous thing when we deal with the electric
potential energy U associated with the electric force exerted on the test
charge. We define the electric potential energy per unit charge of the test
charge to be the electric potential V. That is,

\[ V \equiv \frac{U}{q_t} \tag{21-6} \]

(Take care to avoid confusing the quite different quantities having almost
identical names: the electric potential energy U and the electric
potential V.)

Equation (21-5) gives an expression for the electric potential energy U
for a system containing a point-source charge q and a point-test charge $q_t$
with separation r, taking U = 0 for $r = \infty$. To find the corresponding
expression for the electric potential V, we use Eq. (21-5) in the definition of
Eq. (21-6), and we have

\[ V = \frac{U}{q_t} = \frac{(1/4\pi\epsilon_0)\, q q_t / r}{q_t} \]

Canceling $q_t$, we obtain the expression for the electric potential V of a point-
source charge q at a distance r from the charge:

\[ V = \frac{1}{4\pi\epsilon_0}\,\frac{q}{r} \qquad \text{taking } V = 0 \text{ for } r = \infty \tag{21-7} \]

The qualification stated in Eq. (21-7) is necessary because just as we must
agree that U = 0 for $r = \infty$ to obtain the specific value of U in Eq. (21-5), so,
too, we must agree that V = 0 at $r = \infty$ to obtain the value of V specified in
Eq. (21-7). In other words, we can write V as in Eq. (21-7) if we have agreed
that the electric potential is zero at an infinite distance from a source
charge.


Equation  (21-7)  shows  that  the  electric  potential  V is  a property  of  the 
source  charge  q only,  even  though  a test  charge  qt  is  involved  in  defining  V. 
This  is  because  the  value  of  qt  has  been  removed  from  V since  V is  found  by 
dividing  U by  qt.  The  sign  of  the  electric  potential  V in  Eq.  (21-7)  is  the  same 
as  that  of  the  source  charge  q. 

The  sign  of  the  electric  potential  energy  U of  a system  containing  the 
source  charge  q and  the  test  charge  qt  will  be  the  same  as  or  opposite  to  that 
of  V depending  on  whether  the  test  charge  is  positive  or  negative.  To  be 
specific,  if  qt  has  a positive  value,  then  the  values  of  U and  V will  have  the 




same sign; if $q_t$ has a negative value, then the values of U and V will have
opposite signs.

The SI unit for electric potential is joules per coulomb (J/C). This is so
because $V = U/q_t$, because the quantity U is measured in joules, and
because the quantity $q_t$ is measured in coulombs. The electric potential unit is
given the name volt (V). Thus

1 V = 1 J/C 

The  volt  is  named  after  the  Italian  physicist  Alessandro  Volta  (1745-1827). 
Volta  invented  the  voltaic  pile  (the  ancestor  of  the  modern  electric  battery). 
In  this  work  he  found  it  necessary  to  apply  an  instrument  (a  precursor  of 
the  modern  voltmeter)  capable  of  measuring  with  reasonable  sensitivity  the 
difference  between  the  electric  potentials  at  two  points  when  a voltaic  pile  is 
connected  between  them. 

Example  21-2  evaluates  V and  U in  a simple  case. 


EXAMPLE  21-2 

The electron in a hydrogen atom is most probably at a distance $r = 5.29 \times 10^{-11}$ m
from  the  proton,  which  is  the  nucleus  of  the  atom.  Considering  the  proton  to  be  a 
source  charge,  evaluate  its  electric  potential  V at  a distance  from  it  equal  to  the 
quoted  value  of  r.  Then  evaluate  the  electric  potential  energy  U of  the  atom. 

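■ The charge q of a proton is q = +e. Hence, using Eq. (21-7), you have

\[ V = \frac{1}{4\pi\epsilon_0}\,\frac{q}{r} = 8.99 \times 10^{9}\ \mathrm{N\cdot m^2/C^2} \times \frac{1.60 \times 10^{-19}\ \mathrm{C}}{5.29 \times 10^{-11}\ \mathrm{m}} = 27.2\ \mathrm{V} \]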


Now you consider the electron to be a test charge $q_t = -e$ located at the
position where V has been evaluated. Then you find the value of U by writing Eq. (21-6),
the definition of V in terms of U and $q_t$, as

\[ U = q_t V = -eV = -1.60 \times 10^{-19}\ \mathrm{C} \times 27.2\ \mathrm{V} = -4.36 \times 10^{-18}\ \mathrm{J} \]


Note that the procedure used in Example 21-2 to calculate the electric
potential energy U of the system involves first considering only the
presence of the charge q and finding the value of its electric potential V at the
position of interest. Then the presence of the charge $q_t$ at that position is
taken into account, and the value of U for the system of two charges is
found from the equation $U = q_t V$. This two-step procedure is analogous
to, and has the same advantages as, the procedure used to calculate the
electric force $\mathbf{F}$ exerted on a charge $q_t$ by a charge q by first finding the
electric field $\boldsymbol{\mathcal{E}}$ of charge q at the position of charge $q_t$, and then finding $\mathbf{F}$
from the equation $\mathbf{F} = q_t \boldsymbol{\mathcal{E}}$.

Note also that Example 21-2 illustrates a case where the negative sign
of the test charge (the electron) leads to an electric potential V and an
electric potential energy U of opposite signs. The source charge (the proton)
is positive, so V has a positive value. But U has a negative value.

You  can  see  the  origin  of  the  negative  value  of  U from  the  following 
basic  considerations.  Since  an  attractive  force  is  exerted  on  the  test  charge 
by  the  source  charge,  positive  work  will  be  done  by  this  force  as  the  test 
charge moves in from an infinite distance, where the electric potential
energy has the agreed-upon value zero, to a position nearer the source
charge. The work-potential energy relation, $W = -\Delta U$, shows that when



the work W done is positive, the change $\Delta U$ in electric potential energy is
negative.  Thus  the  electric  potential  energy  U of  the  system  is  negative 
when  the  separation  between  the  two  charges  is  finite. 

The joule is an inconveniently large unit for expressing energies
typical of atomic systems, such as the electric potential energy U found in
Example 21-2. A unit of convenient size can be obtained by writing Eq. (21-6)
as

U = qtV  (21-8) 

Then  this  equation  is  used  to  evaluate  the  electric  potential  energy  of  a 
system  containing  a particle  with  a positive  charge  whose  magnitude  is  that 
of  one  electron  charge  at  a position  where  the  electric  potential  has  a positive 
value  of  magnitude  one  volt.  This  energy  is  taken  as  a unit  of  energy.  It  is 
called  the  electron-volt  and  is  written  eV.  Its  value  is  obtained  by  setting 
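$q_t = +e$ and $V = +1$ V in Eq. (21-8), to give

\[ 1\ \mathrm{eV} = e \times 1\ \mathrm{V} = 1.60 \times 10^{-19}\ \mathrm{C} \times 1\ \mathrm{V} \]

or

\[ 1\ \mathrm{eV} = 1.60 \times 10^{-19}\ \mathrm{J} \tag{21-9} \]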

Expressed in terms of electron-volts, the electric potential energy U found
in Example 21-2 for a hydrogen atom has the numerical value
$U = -27.2$ eV. This is so because the electron in the atom is a particle carrying
the charge $q_t = -e$ and is at a position where the electric potential has the
value V = 27.2 V.
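
The two-step procedure of Example 21-2 and the conversion to electron-volts can be sketched in code. The names `potential` and `E_CHARGE` are illustrative assumptions; the constants are the rounded values quoted in the text.

```python
K = 8.99e9            # 1/(4*pi*eps0) in N*m^2/C^2 (rounded)
E_CHARGE = 1.60e-19   # magnitude of the electron charge in C; also joules per electron-volt

def potential(q, r):
    """Electric potential of a point charge, Eq. (21-7), with V = 0 at r = infinity."""
    return K * q / r

r = 5.29e-11                     # most probable electron-proton distance in metres
V = potential(+E_CHARGE, r)      # step 1: potential of the proton alone
U_joule = (-E_CHARGE) * V        # step 2: U = q_t * V with q_t = -e, Eq. (21-8)
U_ev = U_joule / E_CHARGE        # convert joules to electron-volts, Eq. (21-9)

print(f"V = {V:.1f} V, U = {U_joule:.2e} J = {U_ev:.1f} eV")
```

Because the test charge is a single electron charge, the energy in electron-volts is numerically the negative of the potential in volts.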


Fig. 21-3 A charge distribution and a
test charge at position P.


Now we extend our consideration of electric potential to cases where
the source charge consists of a set of n point charges $q_1, q_2, \ldots, q_j, \ldots, q_n$,
instead of just a single point charge (or a spherically symmetrical charge
distribution that can be treated as a single point charge). The situation is
illustrated in Fig. 21-3. At the position P of the test charge, the electric
potential of the typical source charge $q_j$ has the value $V_j$. As is justified in the
next paragraph, the total electric potential at P has the value V given by

\[ V = V_1 + V_2 + \cdots + V_j + \cdots + V_n \tag{21-10} \]


In words, this equation states that electric potentials are additive. Applying to
it expressions obtained from Eq. (21-7) for the electric potential $V_j$ of the
point-source charge $q_j$, we have

\[ V = \frac{1}{4\pi\epsilon_0} \left( \frac{q_1}{r_1} + \frac{q_2}{r_2} + \cdots + \frac{q_j}{r_j} + \cdots + \frac{q_n}{r_n} \right) \tag{21-11} \]

Here $r_j$ is the distance from source charge $q_j$ to the position P. In summation
notation this important result assumes the form

\[ V = \frac{1}{4\pi\epsilon_0} \sum_{j=1}^{n} \frac{q_j}{r_j} \tag{21-12} \]


The result in Eq. (21-12) is based on the additivity of electric
potentials. An intuitive justification of this property is found by considering the
fact that potential energies are additive. Since electric potential is just
electric potential energy per unit charge of the test charge, electric potentials
also are additive. A more formal justification can be found by starting with
the fact that the total electric force $\mathbf{F}_t$ acting on the test charge is given by
$\mathbf{F}_t = \mathbf{F}_{t1} + \mathbf{F}_{t2} + \cdots + \mathbf{F}_{tj} + \cdots + \mathbf{F}_{tn}$, where $\mathbf{F}_{tj}$ is the force exerted on
the test charge by source charge $q_j$. By substituting this into Eq. (21-1) and
then repeating the calculations leading to Eq. (21-7), the result obtained is
precisely that given in Eq. (21-11).

Fig. 21-4 Illustration for Example 21-3.

An  application  of  Eq.  (21-11)  is  found  in  Example  21-3. 


EXAMPLE  21-3 

Figure  21-4  shows  the  same  triangular  array  of  three  point  charges  depicted  in  Fig. 
20-17.  Find  the  electric  potential  V of  these  source  charges  at  the  midpoint  P of  the 
base  of  the  triangle. 

■ According  to  Eq.  (21-10), 


V = Vi  + V2  + V3 


where $V_1$, $V_2$, and $V_3$ are the electric potentials of the individual source charges $q_1$,
$q_2$, and $q_3$. More particularly, Eq. (21-11) shows that

\[ V = \frac{1}{4\pi\epsilon_0} \left( \frac{q_1}{r_1} + \frac{q_2}{r_2} + \frac{q_3}{r_3} \right) \]

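where $r_1$, $r_2$, and $r_3$ are the distances to point P from $q_1$, $q_2$, and $q_3$. Inserting the
numerical values taken from the figure, you obtain

\[ V = 8.99 \times 10^{9}\ \mathrm{N\cdot m^2/C^2} \times \left( \frac{+1.00 \times 10^{-6}\ \mathrm{C}}{0.200\ \mathrm{m}} + \frac{-2.00 \times 10^{-6}\ \mathrm{C}}{0.200\ \mathrm{m}} + \frac{+3.00 \times 10^{-6}\ \mathrm{C}}{0.300\ \mathrm{m}} \right) \]

or

\[ V = 4.50 \times 10^{4}\ \mathrm{V} \]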
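
The scalar sum of Eq. (21-12) translates into a one-line summation. The sketch below uses the charges and distances quoted in Example 21-3; the function name `potential_of_point_charges` is an illustrative assumption.

```python
K = 8.99e9  # 1/(4*pi*eps0) in N*m^2/C^2 (rounded)

def potential_of_point_charges(charges_and_distances):
    """Total potential at a point from (q_j, r_j) pairs, Eq. (21-12)."""
    return K * sum(q / r for q, r in charges_and_distances)

# (charge in C, distance to point P in m) for q1, q2, q3 of Example 21-3
sources = [(+1.00e-6, 0.200), (-2.00e-6, 0.200), (+3.00e-6, 0.300)]

V = potential_of_point_charges(sources)
print(f"V = {V:.3g} V")
```

No directions or components enter the calculation, only signed charges and distances, which is why the sum is so much simpler than the corresponding field calculation.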


You should compare the calculation of the electric potential V at point
P in Example 21-3 with the calculation of the electric field $\boldsymbol{\mathcal{E}}$ at point P in
Example 20-6o. The comparison will convince you that it is much easier to
find the electric potential of a set of charges than to find their electric field.
The reason is that finding V involves the scalar addition

\[ V = \frac{1}{4\pi\epsilon_0} \sum_{j=1}^{n} \frac{q_j}{r_j} \]

whereas finding $\boldsymbol{\mathcal{E}}$ involves the vector addition

\[ \boldsymbol{\mathcal{E}} = \frac{1}{4\pi\epsilon_0} \sum_{j=1}^{n} \frac{q_j}{r_j^2}\,\hat{\mathbf{r}}_j \]

Scalar addition is usually much less laborious than vector addition. We will
be able to take advantage of this fact after we develop a way of calculating
electric fields from electric potentials in Sec. 21-2.

Fig. 21-5 A continuous distribution of charge.


But first we conclude this section by considering the electric potential
of a continuously distributed set of source charges. Figure 21-5 shows
source charge q distributed continuously over a certain region of space. We
would like to find the electric potential V at a point P. To do this we divide
the source charge into infinitesimal elements dq and then consider each
element to be a point charge. This allows us to use Eq. (21-7) to write

\[ dV = \frac{1}{4\pi\epsilon_0}\,\frac{dq}{r} \tag{21-13} \]

Here dq is the charge in an element of the source charge, r is the distance
from it to P, and dV is its contribution to the electric potential at P. Next we
integrate Eq. (21-13) over the charge distribution:


\[ \int_{\text{charge distribution}} dV = \frac{1}{4\pi\epsilon_0} \int_{\text{charge distribution}} \frac{dq}{r} \]

The integral on the left side of this equality gives the electric potential V at
P because it sums the contributions to V from all the charge in the
distribution. Thus we have

\[ V = \frac{1}{4\pi\epsilon_0} \int_{\text{charge distribution}} \frac{dq}{r} \tag{21-14} \]


Example  21-4  shows  an  application  of  this  equation. 


EXAMPLE 21-4

Figure  21-6  depicts  a circular  disk  of  radius  b having  a uniformly  distributed  charge 
q.  Find  the  electric  potential  V at  a point  P along  the  axis  of  the  disk  at  a distance  z 
from  its  center. 

■ According to Eq. (21-14),

\[ V = \frac{1}{4\pi\epsilon_0} \int_{\text{disk}} \frac{dq}{r} \]

To evaluate the integral, you draw a ring of infinitesimal width dR, as shown in the
figure. Every part of this ring is at the same distance r from P. This being the case,
the ring can be taken as a single element containing charge dq. The value of dq can
be found by considering the fact that if the radius of the ring is R, it has an area
$2\pi R\,dR$, whereas the area of the entire disk is $\pi b^2$. Since the charge is
uniformly distributed, the ring contains a fraction dq/q of the entire charge on the disk
equal to $2\pi R\,dR/\pi b^2$, its share of the entire area of the disk. Thus you have

\[ \frac{dq}{q} = \frac{2\pi R\,dR}{\pi b^2} \]

or

\[ dq = \frac{2 q R\,dR}{b^2} \]

Fig. 21-6 A uniformly charged, circular disk of radius b and a point P located
on its axis at distance z.
The distance r from any point on the ring to P is given by the Pythagorean
theorem:

\[ r = (R^2 + z^2)^{1/2} \]

As shown in the figure, the coordinate z is measured along the axis of the disk from
an origin at the disk.

Now that you have expressions for dq and r, you can write the expression for V
as

\[ V = \frac{1}{4\pi\epsilon_0} \int_0^b \frac{2 q R\,dR}{b^2 (R^2 + z^2)^{1/2}} = \frac{q}{2\pi\epsilon_0 b^2} \int_0^b \frac{R\,dR}{(R^2 + z^2)^{1/2}} \]

The limits on the integral are as written since R ranges from 0 to b over the disk.
Keeping in mind that z is a constant when the integration is performed and
consulting a table of integrals, you find

\[ \int_0^b \frac{R\,dR}{(R^2 + z^2)^{1/2}} = \left[ (R^2 + z^2)^{1/2} \right]_{R=b} - \left[ (R^2 + z^2)^{1/2} \right]_{R=0} = (b^2 + z^2)^{1/2} - z \]

So you get

\[ V = \frac{q}{2\pi\epsilon_0 b^2} \left[ (b^2 + z^2)^{1/2} - z \right] \tag{21-15} \]


To check this result, consider the case where the distance from the disk to P is
quite large compared to the radius of the disk itself; that is, $z \gg b$. In order to
make the behavior of V as a function of z more apparent, you take advantage of the
condition $z \gg b$ to expand the term $(b^2 + z^2)^{1/2}$ in Eq. (21-15). This is done by
writing it in the form

\[ (b^2 + z^2)^{1/2} = z \left( 1 + \frac{b^2}{z^2} \right)^{1/2} \]

and then using the binomial expansion approximation

\[ \left( 1 + \frac{b^2}{z^2} \right)^{1/2} \simeq 1 + \frac{1}{2}\,\frac{b^2}{z^2} \qquad \text{for } z \gg b \]

Hence to a high degree of accuracy you have

\[ (b^2 + z^2)^{1/2} \simeq z + \frac{1}{2}\,\frac{b^2}{z} \qquad \text{for } z \gg b \]

The quantity in brackets in Eq. (21-15) thus can be written

\[ \left[ (b^2 + z^2)^{1/2} - z \right] \simeq \frac{1}{2}\,\frac{b^2}{z} \qquad \text{for } z \gg b \]

and the equation itself reduces to

\[ V \simeq \frac{q}{4\pi\epsilon_0 z} \qquad \text{for } z \gg b \tag{21-16a} \]




This is the same as the expression for the electric potential of a point charge q
located at the center of the disk. It is what you would expect because when $z \gg b$,
the distance from any point on the disk to the point P is equal to z within a very small
fractional error. That is, from a great distance the disk looks like a point.

The opposite extreme case, $z \ll b$, is the one in which the distance from the
disk to P is quite small compared to the radius of the disk. When this is true, the
quantity $z^2$ in the term $(b^2 + z^2)^{1/2}$ of Eq. (21-15) can be neglected in comparison to
the quantity $b^2$. Thus to a high degree of accuracy the bracket in the equation
reduces to $[(b^2)^{1/2} - z] = b - z$, and the equation itself reduces to

\[ V \simeq \frac{q}{2\pi\epsilon_0 b^2}\,(b - z) \qquad \text{for } z \ll b \tag{21-16b} \]

Example 21-5 will show you how to interpret this limiting behavior of V so as to
conclude that it is in agreement with a previously obtained result.
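
The ring-by-ring integration of Example 21-4 and its two limiting forms can be checked numerically. The following is a sketch under assumed values: the charge, radius, and test distances are illustrative, and the function names are not from the text. It sums the ring contributions of Eq. (21-13) with $dq = 2qR\,dR/b^2$ and compares the result with Eqs. (21-15), (21-16a), and (21-16b).

```python
import math

EPS0 = 8.854e-12  # permittivity of free space in C^2/(N*m^2)

def disk_potential_exact(q, b, z):
    """On-axis potential of a uniformly charged disk, Eq. (21-15)."""
    return q / (2 * math.pi * EPS0 * b**2) * (math.sqrt(b**2 + z**2) - z)

def disk_potential_numeric(q, b, z, n=10000):
    """Midpoint-rule sum over rings of radius R and width dR."""
    dR = b / n
    total = 0.0
    for k in range(n):
        R = (k + 0.5) * dR
        dq = 2 * q * R * dR / b**2                       # charge on the ring
        total += dq / (4 * math.pi * EPS0 * math.sqrt(R**2 + z**2))
    return total

def far_limit(q, z):
    """Point-charge form, Eq. (21-16a), valid for z >> b."""
    return q / (4 * math.pi * EPS0 * z)

def near_limit(q, b, z):
    """Form valid for z << b, Eq. (21-16b)."""
    return q / (2 * math.pi * EPS0 * b**2) * (b - z)

q, b = 1.0e-9, 1.0   # assumed: 1 nC spread over a disk of radius 1 m

print(disk_potential_numeric(q, b, 0.5), disk_potential_exact(q, b, 0.5))
print(disk_potential_exact(q, b, 100.0), far_limit(q, 100.0))
print(disk_potential_exact(q, b, 0.001), near_limit(q, b, 0.001))
```

Each printed pair should agree closely: the first confirms the integration, and the second and third confirm the two limiting expressions.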


21-2 EVALUATION OF ELECTRIC FIELD FROM ELECTRIC POTENTIAL


As has been noted, the task of calculating the electric potential V of a
certain set of charges is much less difficult than calculating the electric field $\boldsymbol{\mathcal{E}}$
of these source charges. The reason is that calculating V involves a scalar
summation (or integration), while calculating $\boldsymbol{\mathcal{E}}$ involves a vector
summation (or integration). But sometimes you really need to know $\boldsymbol{\mathcal{E}}$, not V. In
such circumstances calculating $\boldsymbol{\mathcal{E}}$ directly from the charge distribution by
means of Coulomb's law often is not the easiest method. Less effort is
involved when the following two-step method is employed:


1. The electric potential V is calculated from the charge distribution.

2. The procedure developed immediately below is used to evaluate $\boldsymbol{\mathcal{E}}$
from V.


The development is brief since it is just an application of the
procedure developed in Sec. 7-7 for evaluating force from potential energy.
We begin with Eqs. (7-59):

\[ F_x = -\frac{\partial U(x, y, z)}{\partial x} \]

\[ F_y = -\frac{\partial U(x, y, z)}{\partial y} \]

\[ F_z = -\frac{\partial U(x, y, z)}{\partial z} \]

The quantity U(x, y, z) is a potential energy that depends on the three
coordinates x, y, z, and the quantities $F_x$, $F_y$, $F_z$ are the components along the
three coordinate axes of the force associated with the potential energy.
These relations apply to any potential energy and its associated force,
including an electric potential energy and the electric force giving rise to it.
This being the case, we consider a test charge $q_t$ at a position whose
coordinates are x, y, z. The electric force exerted on it by a set of source charges
is $\mathbf{F}_t$, whose components are $F_{tx}$, $F_{ty}$, and $F_{tz}$. The electric potential energy
of the system arising from this force is U(x, y, z). The connection between



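the electric force and the electric potential energy is

\[ F_{tx} = -\frac{\partial U(x, y, z)}{\partial x} \]

\[ F_{ty} = -\frac{\partial U(x, y, z)}{\partial y} \]

\[ F_{tz} = -\frac{\partial U(x, y, z)}{\partial z} \]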


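Let us divide both sides of each of these three equations by $q_t$, the
charge on the test charge, and then use the fact that $q_t$ is a constant to make
it a part of the quantity whose partial derivative is to be evaluated. We have

\[ \frac{F_{tx}}{q_t} = -\frac{1}{q_t}\,\frac{\partial U(x, y, z)}{\partial x} = -\frac{\partial [U(x, y, z)/q_t]}{\partial x} \tag{21-17a} \]

\[ \frac{F_{ty}}{q_t} = -\frac{1}{q_t}\,\frac{\partial U(x, y, z)}{\partial y} = -\frac{\partial [U(x, y, z)/q_t]}{\partial y} \tag{21-17b} \]

\[ \frac{F_{tz}}{q_t} = -\frac{1}{q_t}\,\frac{\partial U(x, y, z)}{\partial z} = -\frac{\partial [U(x, y, z)/q_t]}{\partial z} \tag{21-17c} \]

Now the electric field $\boldsymbol{\mathcal{E}}$ is defined to be

\[ \boldsymbol{\mathcal{E}} = \frac{\mathbf{F}_t}{q_t} \]

so that $\mathcal{E}_x = F_{tx}/q_t$, $\mathcal{E}_y = F_{ty}/q_t$, and $\mathcal{E}_z = F_{tz}/q_t$. And the electric potential V
is defined to be

\[ V(x, y, z) = \frac{U(x, y, z)}{q_t} \]

Using these definitions in Eqs. (21-17), we obtain the desired relations
between the electric field and the electric potential: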


\[ \mathcal{E}_x = -\frac{\partial V(x, y, z)}{\partial x} \tag{21-18a} \]

\[ \mathcal{E}_y = -\frac{\partial V(x, y, z)}{\partial y} \tag{21-18b} \]

\[ \mathcal{E}_z = -\frac{\partial V(x, y, z)}{\partial z} \tag{21-18c} \]

The component along a coordinate axis of the electric field of a set of source charges is
given by the negative of the partial derivative with respect to that coordinate of the
electric potential of the source charges.

Fig. 21-7 A uniform electric field
directed along the x axis, represented by
electric field lines and by the electric
field vector $\boldsymbol{\mathcal{E}}$. Also indicated are planes
parallel to the yz plane. The electric
potential V has a constant value on each
plane because V depends on only x. The
value of V decreases as x increases. The
displacement vector $d\mathbf{s}$ shown is used
when this figure is reconsidered later.


The simplest interpretation of Eqs. (21-18) is found in a case where the
electric field is uniform (or can be considered essentially uniform) in some
region. If we let the x axis extend along the direction of $\boldsymbol{\mathcal{E}}$ in this region, as
in Fig. 21-7, then $\mathcal{E}_y = 0$ and $\mathcal{E}_z = 0$. Hence Eqs. (21-18b) and (21-18c)
read

\[ \frac{\partial V(x, y, z)}{\partial y} = 0 \qquad \text{and} \qquad \frac{\partial V(x, y, z)}{\partial z} = 0 \]



These two relations tell us that the electric potential V has no y or z
dependence in the region, and so it can be written V(x). Equation (21-18a) thus
reads

\[ \mathcal{E}_x = -\frac{\partial V(x)}{\partial x} \]

But $\mathcal{E}_x = \mathcal{E}$. Also, there is no distinction between the partial derivative and
the ordinary derivative in this case, so that $\partial V(x)/\partial x = dV/dx$. Therefore
we have

\[ \mathcal{E} = -\frac{dV}{dx} \]


This  tells  us  that  the  magnitude  of  the  electric  field  equals  the  magnitude  of  the  rate 
of  change  of  electric  potential  with  respect  to  position,  and  the  direction  of  the  electric 
field  is  the  direction  in  which  the  electric  potential  decreases  most  rapidly. 

Although the italicized statement just made was obtained by considering
a uniform electric field, it is valid even when the electric field varies
from point to point. The reason is that the statement concerns a relation
between the electric field and the electric potential in the immediate
neighborhood of some point. So it is unaffected by what these quantities do at
some other point.
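Because Eqs. (21-18) are local relations, they can be checked numerically at any single point. The following is a minimal sketch, not taken from the text: it assumes a point charge at the origin as the source, uses the standard point-charge potential, and approximates each partial derivative by a central difference. The names `potential`, `field_from_potential`, the step `h`, and the charge value are illustrative assumptions.

```python
import math

K = 8.9875517923e9  # 1/(4*pi*eps0), in N*m^2/C^2

def potential(x, y, z, q=1.0e-9):
    """Potential of a point charge q (assumed value) located at the origin."""
    return K * q / math.sqrt(x * x + y * y + z * z)

def field_from_potential(V, x, y, z, h=1.0e-5):
    """Approximate (E_x, E_y, E_z) = -(dV/dx, dV/dy, dV/dz) by central differences."""
    ex = -(V(x + h, y, z) - V(x - h, y, z)) / (2 * h)
    ey = -(V(x, y + h, z) - V(x, y - h, z)) / (2 * h)
    ez = -(V(x, y, z + h) - V(x, y, z - h)) / (2 * h)
    return ex, ey, ez

# Field point 0.5 m from the charge, in the xz plane.
ex, ey, ez = field_from_potential(potential, 0.3, 0.0, 0.4)
magnitude = math.sqrt(ex * ex + ey * ey + ez * ez)
coulomb = K * 1.0e-9 / 0.5**2  # magnitude expected from Coulomb's law
print(ex, ey, ez, magnitude, coulomb)
```

The components found this way should point radially away from the positive charge, and their resultant should agree closely with the Coulomb's-law magnitude.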


Equations (21-18) can be used to evaluate the electric field of a charge
distribution from the electric potential of the distribution. Numerical
results of such an evaluation can be stated by expressing the magnitude of the
electric field in units of volts per meter (V/m), as suggested by an equation
such as $\mathcal{E}_x = -\partial V(x, y, z)/\partial x$, instead of newtons per coulomb, as suggested
by an equation such as $\mathcal{E}_x = F_{tx}/q_t$. That the two units are equivalent is
shown as follows:

\[ 1\,\frac{\text{V}}{\text{m}} = 1\,\frac{\text{J}}{\text{C}\cdot\text{m}} = 1\,\frac{\text{N}\cdot\text{m}}{\text{C}\cdot\text{m}} = 1\,\frac{\text{N}}{\text{C}} \]

Example 21-5 employs Eqs. (21-18) in evaluating the electric field at
points along the axis of a uniformly charged circular disk from the electric
potential at these points found in Example 21-4.


EXAMPLE  21-5 

Evaluate the electric field at a point P along the axis of the circular disk of Example
21-4, having radius b and uniformly distributed charge q, as illustrated in Fig. 21-6.
Do this by using in Eqs. (21-18) the expression given by Eq. (21-15) for the
dependence of the electric potential on the axial coordinate z of point P.

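■ Writing the electric potential in Eq. (21-15) as V(z), you have

\[ V(z) = \frac{q}{2\pi\epsilon_0 b^2}\left[(b^2 + z^2)^{1/2} - z\right] \]

Applying Eqs. (21-18a) and (21-18b) to this electric potential gives you

\[ \mathcal{E}_x = -\frac{\partial V(z)}{\partial x} = 0 \qquad \text{and} \qquad \mathcal{E}_y = -\frac{\partial V(z)}{\partial y} = 0 \]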


These results tell you that the electric field $\boldsymbol{\mathcal{E}}$ at points on the axis of the disk has no
components in directions perpendicular to the axis. This is in agreement with
conclusions that can be drawn from a simple symmetry argument. What is the
argument?

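Applying Eq. (21-18c), you obtain

\[ \mathcal{E}_z = -\frac{\partial V(z)}{\partial z} = -\frac{d}{dz}\left\{\frac{q}{2\pi\epsilon_0 b^2}\left[(b^2 + z^2)^{1/2} - z\right]\right\} = -\frac{q}{2\pi\epsilon_0 b^2}\left[\frac{1}{2}(b^2 + z^2)^{-1/2}(2z) - 1\right] \]

or

\[ \mathcal{E}_z = \frac{q}{2\pi\epsilon_0 b^2}\left[1 - z(b^2 + z^2)^{-1/2}\right] \qquad \text{(21-19)} \]

This is the required expression for the electric field.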

Just as you did for Eq. (21-15) in Example 21-4, you can check Eq. (21-19) by
considering its limiting behavior. In the limit $z \gg b$ (that is, when the distance
from the disk to P is large compared to the radius of the disk), you make use of the
binomial expansion approximation by writing

\[ (b^2 + z^2)^{-1/2} = z^{-1}\left(1 + \frac{b^2}{z^2}\right)^{-1/2} \simeq z^{-1}\left(1 - \frac{1}{2}\frac{b^2}{z^2}\right) \]

Thus

\[ z(b^2 + z^2)^{-1/2} \simeq 1 - \frac{1}{2}\frac{b^2}{z^2} \]

and the quantity in brackets in Eq. (21-19) becomes

\[ 1 - z(b^2 + z^2)^{-1/2} \simeq \frac{b^2}{2z^2} \]

So to a very good approximation you can write Eq. (21-19) as

\[ \mathcal{E}_z \simeq \frac{1}{4\pi\epsilon_0}\frac{q}{z^2} \qquad \text{for } z \gg b \qquad \text{(21-20a)} \]

You also can obtain the same result by applying $\mathcal{E}_z = -dV(z)/dz$ directly to Eq.
(21-16a). The result certainly makes sense because it says that very far away from
the disk its electric field is the same as that of a charge q located at its center.

In the limit $z \ll b$ (in other words, when the distance from the disk to P is
small compared to its radius), the term $z(b^2 + z^2)^{-1/2}$ in the brackets of Eq. (21-19) will
be small compared to the first term, 1. Hence the value of the quantity in brackets
will be nearly equal to 1, and you can write with good accuracy

\[ \mathcal{E}_z \simeq \frac{q}{2\pi\epsilon_0 b^2} \qquad \text{for } z \ll b \qquad \text{(21-20b)} \]

Another way you can obtain the same result is to use Eq. (21-16b) in the relation
$\mathcal{E}_z = -dV(z)/dz$. The result is in agreement with the one obtained in Example 20-8,
where Gauss' law was applied to evaluate the electric field produced by a plane of infinite
extent, carrying the uniform charge per unit area $\sigma$. According to Eq. (20-45), its
value is $\mathcal{E} = \sigma/2\epsilon_0$. For $z \ll b$ the edges of the uniformly charged disk are so far
away from the point P, compared to the distance from the disk to P, that they might
as well be infinitely far away. That is, the fact that the disk is of finite extent should
make no difference to the value of the electric field. The charge per unit area on the
disk carrying charge q is $\sigma = q/\pi b^2$, since $\pi b^2$ is its area. Thus if we drop the
subscript z in Eq. (21-20b) to conform to the notation used in Eq. (20-45), the former can
be written $\mathcal{E} = \sigma/2\epsilon_0$, in agreement with the latter.

Equations (21-20a) and (21-20b) show that very near the charged disk the
electric field along its axis has a magnitude independent of the distance from the disk,
whereas far from the disk this field decreases in inverse proportion to the square of
the distance. At intermediate distances the magnitude of the electric field has the
intermediate dependence given by Eq. (21-19) for z neither small nor large
compared to b. These equations provide you with quantitative justification of the
qualitative arguments given in the material following Example 20-8.
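The limiting behavior found in Example 21-5 can also be checked with a few lines of arithmetic. The sketch below is an illustration under assumed values (a charge of 1 nC on a disk of radius 0.10 m, neither of which comes from the example); it evaluates Eq. (21-15) and Eq. (21-19), differentiates the potential numerically, and compares the field with the point-charge and infinite-plane expressions of Eqs. (21-20a) and (21-20b). The function names are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, in C^2/(N*m^2)

def disk_potential(z, q, b):
    """Eq. (21-15): potential on the axis of a uniformly charged disk, for z > 0."""
    return q / (2 * math.pi * EPS0 * b**2) * (math.sqrt(b**2 + z**2) - z)

def disk_field(z, q, b):
    """Eq. (21-19): axial component of the electric field, for z > 0."""
    return q / (2 * math.pi * EPS0 * b**2) * (1 - z / math.sqrt(b**2 + z**2))

def numerical_field(z, q, b, h=1.0e-6):
    """E_z = -dV/dz estimated from the potential by a central difference."""
    return -(disk_potential(z + h, q, b) - disk_potential(z - h, q, b)) / (2 * h)

q, b = 1.0e-9, 0.10  # assumed illustrative values

# Far limit, z >> b: compare with a point charge q at the disk's center.
far = disk_field(100 * b, q, b)
point_charge = q / (4 * math.pi * EPS0 * (100 * b)**2)

# Near limit, z << b: compare with an infinite plane, sigma/(2*eps0).
near = disk_field(b / 1000, q, b)
infinite_plane = q / (2 * math.pi * EPS0 * b**2)

print(far / point_charge, near / infinite_plane)
print(numerical_field(0.05, q, b), disk_field(0.05, q, b))
```

Both ratios printed on the first line should be close to 1, and the two values on the second line should agree to many significant figures.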


The relation between electric field and electric potential can be put
into a form different from, but closely connected with, the form given in
Eqs. (21-18). To develop this useful reexpression of the relation, we
consider a test charge $q_t$ in electric field $\boldsymbol{\mathcal{E}}$. The electric force exerted on the test
charge is $\mathbf{F}_t = q_t\boldsymbol{\mathcal{E}}$. When the test charge experiences a displacement $d\mathbf{s}$, the
work $dW$ done by this force is $dW = \mathbf{F}_t \cdot d\mathbf{s} = q_t\boldsymbol{\mathcal{E}} \cdot d\mathbf{s}$. The work-potential
energy relation says that the change $dU$ in the electric potential energy
associated with the electric force is $dU = -dW = -q_t\boldsymbol{\mathcal{E}} \cdot d\mathbf{s}$. We write this as
$dU/q_t = -\boldsymbol{\mathcal{E}} \cdot d\mathbf{s}$. And since $q_t$ is a constant, we can also write it as
$d(U/q_t) = -\boldsymbol{\mathcal{E}} \cdot d\mathbf{s}$. But $U/q_t = V$, the electric potential. Thus we have the
desired relation

\[ dV = -\boldsymbol{\mathcal{E}} \cdot d\mathbf{s} \qquad \text{(21-21)} \]

The infinitesimal change in the electric potential occurring in an infinitesimal
displacement is the negative of the dot product of the electric field and the displacement.

You can see the connection between Eqs. (21-18) and Eq. (21-21) by
considering again the situation illustrated in Fig. 21-7. In a certain region
the electric field $\boldsymbol{\mathcal{E}}$ is uniform, and the x axis is directed along $\boldsymbol{\mathcal{E}}$. As before,
in this situation Eqs. (21-18) reduce to

\[ \mathcal{E} = -\frac{dV}{dx} \]

The relation can be written

\[ dV = -\mathcal{E}\,dx \]

Taking the displacement $d\mathbf{s}$ in Eq. (21-21) to be in the direction of $\boldsymbol{\mathcal{E}}$, and
therefore in the x direction, gives $\boldsymbol{\mathcal{E}} \cdot d\mathbf{s} = \mathcal{E}\,ds = \mathcal{E}\,dx$. Thus Eq. (21-21)
reduces to the relation

\[ dV = -\mathcal{E}\,dx \]

in complete agreement with the one obtained from Eqs. (21-18). Frequent
use will be made of Eq. (21-21).
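Equation (21-21) can be tested numerically by adding up $-\boldsymbol{\mathcal{E}} \cdot d\mathbf{s}$ over many short steps along a path and comparing the sum with the difference in potential between the end points. The sketch below is only an illustration: it assumes a point charge at the origin as the source, a straight path in the xz plane that is not along a field line, and a midpoint sum with an arbitrarily chosen number of steps. All names and values are assumptions.

```python
import math

K = 8.9875517923e9  # 1/(4*pi*eps0), in N*m^2/C^2
Q = 2.0e-9          # assumed source charge at the origin, in C

def field(x, z):
    """Components (E_x, E_z) of the point-charge field in the xz plane."""
    r = math.hypot(x, z)
    e = K * Q / r**2
    return e * x / r, e * z / r

def potential(x, z):
    """Point-charge potential, with V = 0 at infinity."""
    return K * Q / math.hypot(x, z)

def potential_change(start, end, steps=10000):
    """Sum dV = -E . ds over short steps of a straight path (midpoint rule)."""
    (x0, z0), (x1, z1) = start, end
    dx = (x1 - x0) / steps
    dz = (z1 - z0) / steps
    total = 0.0
    for i in range(steps):
        ex, ez = field(x0 + (i + 0.5) * dx, z0 + (i + 0.5) * dz)
        total += -(ex * dx + ez * dz)
    return total

a, b = (0.2, 0.0), (0.1, 0.5)
summed = potential_change(a, b)
direct = potential(*b) - potential(*a)
print(summed, direct)
```

The two printed values should agree closely; the small remaining difference comes from using finite steps in place of infinitesimal ones.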


21-3  EQUIPOTENTIAL SURFACES AND ELECTRIC FIELD LINES


At any point P in an electric field $\boldsymbol{\mathcal{E}}$ it is possible to construct an
infinitesimal surface centered on the point and oriented normal to the direction of
the electric field at the point. The construction is indicated in Fig. 21-8a.
Also shown in the figure is an infinitesimal displacement vector $d\mathbf{s}$ from P
to any other point P' lying in the surface. According to Eq. (21-21), the
change $dV$ in the electric potential in going from P to P' is $dV = -\boldsymbol{\mathcal{E}} \cdot d\mathbf{s}$.
But since $d\mathbf{s}$ is perpendicular to $\boldsymbol{\mathcal{E}}$, we have $\boldsymbol{\mathcal{E}} \cdot d\mathbf{s} = 0$ and so $dV = 0$. From
this it follows that the electric potential V is constant over the infinitesimal
surface normal to the electric field $\boldsymbol{\mathcal{E}}$.

At a boundary of the infinitesimal surface, another similar surface can
be constructed normal to the electric field at its center. The two adjoining
surfaces are shown in Fig. 21-8b. The surfaces have a common value of
electric potential V at their common boundary. So V has the same value
over both surfaces. This process can be repeated at the other boundaries
and then continued to the boundaries of the adjoining surfaces, with each
infinitesimal surface being constructed normal to the local direction of the
electric field, until a complete surface results. If the source of the electric
field is a single point charge, the resulting surface will be a sphere centered
on the source, as shown in Fig. 21-9. Also shown are electric field lines
emanating from the point charge. They all are normal to the surface where
they cross it because the field lines everywhere are in the direction of $\boldsymbol{\mathcal{E}}$
while the surface everywhere is normal to that direction. (Remember that
the surface of a sphere is everywhere normal to its radii.) If the source of
the electric field is more complex than a single point charge, then both the
surface and the field lines will be correspondingly more complex. But still
the field lines will always be normal to the surface where they cross it.

Fig. 21-8 (a) An infinitesimal surface centered on point P and oriented such
that the electric field $\boldsymbol{\mathcal{E}}$ at P is normal to the surface. Point P' also lies in the
surface, so the displacement vector $d\mathbf{s}$ from P to P' is perpendicular to the
vector $\boldsymbol{\mathcal{E}}$. (b) Adjacent infinitesimal surfaces with a common boundary. Each is
oriented so that $\boldsymbol{\mathcal{E}}$ is normal to it at its center.

The electric potential V has a constant value over the entire surface.
The reason is that any finite displacement in the surface is a sum of
infinitesimal displacements $d\mathbf{s}$, all of which lead to $dV = 0$ since for all of them
$\boldsymbol{\mathcal{E}} \cdot d\mathbf{s} = 0$ because $d\mathbf{s}$ is perpendicular to $\boldsymbol{\mathcal{E}}$. We therefore can characterize
the surface by writing

V = constant 

The  surface,  on  which  the  electric  potential  V everywhere  has  an  equal  value,  is 

called  an  equipotential  surface,  or  an  equipotential  for  short.  Wherever  an 
electric  field  line  crosses  an  equipotential  surface,  it  is  normal  to  the 
surface. 


Any source charge, or set of source charges, is surrounded by a nest of
equipotential surfaces. The electric potential has a constant value on each
surface, and the value differs from one surface to another. Figure 21-10
indicates the equipotential surfaces, and also the electric field lines, for the
simplest case: a single positive point charge. Because both the equipotential
surfaces and the pattern of field lines are symmetric about any axis
passing through the charge, the figure can give an adequate representation
of both by showing their cross sections in the plane of the page. As has
already been concluded, each equipotential surface is a sphere centered on
the source charge.

Fig. 21-9 For a single point-source charge, a spherical surface centered on
the charge is one on which the electric potential has a constant value. The
electric field lines radiate uniformly from the charge.

Fig. 21-10 A schematic representation in two dimensions of the actual
three-dimensional equipotential surfaces and electric field lines for a positive point
charge. The field lines have arrowheads; the curves representing equipotentials
do not. Although the value of the electric potential V decreases in going from
one equipotential to the next in the direction away from the charge, particular
values of V for each equipotential are not specified. Thus the information
provided by the equipotentials is qualitative in that they give the shapes of the
surfaces of constant V but not specific values of V on each surface. The same is
true of the electric field lines, which give the direction of the electric field $\boldsymbol{\mathcal{E}}$
but not its magnitude. The equipotentials can be made quantitative by using
techniques developed earlier in this chapter to evaluate V at some point on each
of them. But there is no way to make the electric field lines completely
quantitative by using the techniques developed in Chap. 20. This is because the
magnitude of $\boldsymbol{\mathcal{E}}$ determines the number of field lines per unit of normal area,
and there is no way to represent this essentially three-dimensional feature
quantitatively in a two-dimensional figure.

A different way to reach the same conclusion is through Eq. (21-7),

\[ V = \frac{1}{4\pi\epsilon_0}\frac{q}{r} \]


This evaluates the electric potential V of a source charge q at a position
whose distance from q is r. All positions at which V has the same value are
on one equipotential surface. Such a surface is a sphere centered on the
source charge, because all positions on it correspond to the same value of r
and therefore to the same value of V. Equation (21-7) makes it apparent
that the closer an equipotential surface is to the positive source charge (that
is, the smaller the value of r), the larger is the positive value of V on the
surface. The equipotential surfaces shown in the figure are not labeled with
numerical values of V. But this could be done if the value of q were
specified, the value of r for each surface measured, and then Eq. (21-7) used to
evaluate V.
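The labeling just described amounts to one evaluation of Eq. (21-7) per surface. As a brief sketch, assuming a source charge of 1 nC and three measured radii (values chosen here only for illustration, with hypothetical function names), the labels and the inverse relation between V and r can be computed as follows.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, in C^2/(N*m^2)

def point_charge_potential(q, r):
    """Eq. (21-7): potential at distance r from a point charge q."""
    return q / (4 * math.pi * EPS0 * r)

def equipotential_radius(q, V):
    """Radius of the spherical equipotential on which the potential equals V."""
    return q / (4 * math.pi * EPS0 * V)

q = 1.0e-9                 # assumed source charge, in C
radii = [0.05, 0.10, 0.20]  # assumed measured radii, in m
labels = [point_charge_potential(q, r) for r in radii]

for r, V in zip(radii, labels):
    print(f"r = {r:.2f} m  ->  V = {V:.1f} V")
```

Halving the radius doubles the label, which is why equipotentials drawn at equal steps in V crowd together near the charge.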

Figure 21-11 represents equipotential surfaces and electric field lines
in the same way as Fig. 21-10. But in this figure their source consists of two
separated point charges, both charges being positive and having the same
magnitude. What changes would be required to make Figs. 21-10 and
21-11 represent equipotential surfaces and electric field lines if in both the
source charges were negative?


Fig. 21-11 A schematic, two-dimensional representation of the
equipotential surfaces and electric field lines for a set of two separated
point charges of the same sign and magnitude. The field lines have arrowheads
denoting their direction. As in Fig. 21-10, the information provided by the
figure is qualitative because the values of V and $\mathcal{E}$ are not specified.


Although they exist in three rather than in two dimensions, a nest of
equipotential surfaces is analogous to a set of contour curves on a map. If
you follow a contour curve along a hillside, you remain at the same height
regardless of the complexities of the shape of the hill. If you follow an
equipotential surface, you remain at the same electric potential no matter how
complicated the distribution of its source charges. Thus in Fig. 21-11 the
locations of the nearby positive source charges of equal magnitude are
analogous to the locations of nearby "peaks" of equal height in a mountain
range. And the location halfway between the charges is analogous to the
location of the "saddle" halfway between the peaks.

The electric field lines are analogous to the paths of steepest descent
from one contour curve of a map to its neighbor. Since electric field lines
are everywhere normal to equipotential surfaces, a field line is the shortest
path from a given point on one equipotential to any point on a neighboring
equipotential. All other paths will be "gentler"; that is, they will take you
through a greater distance in passing through the same difference in
electric potential.

The construction of Fig. 21-10 is a matter of very simple geometry.
But the construction of Fig. 21-11 is not simple. Such a figure can be
obtained from numerical calculations which trace the electric field lines, and
also the equipotentials, produced by a set of two equal point charges. The
calculations employ the field lines and equipotentials program given in the
Numerical Calculation Supplement.

The operations carried out by the programmable calculator, or small
computer, used in the calculations are as follows. To produce the field
lines, the value of each charge and the x and z coordinates of its location are
stored in the programmed computing device. For simplicity, it is assumed
in the program that charge $q_1$ has a value $q_1 = +1$ and location x = 0, z = 0.
It also is assumed that charge $q_2$ has the location x = 0, z = 5. Hence the
program defines charge $q_1$ to be the unit of charge and its distance from $q_2$
to be 5 units of distance. Also stored in the device is a value of a distance $\Delta s$
(which typically is chosen to be a small fraction of the distance between the
charges), and the value of an integer n (typically 5). Then the x and z
coordinates of some initial point in the field are stored, and execution of the
program is started. By using the equation $\boldsymbol{\mathcal{E}} = (1/4\pi\epsilon_0)(q/r^2)\,\hat{\mathbf{r}}$, in effect
the device evaluates $\mathcal{E}_x$ and $\mathcal{E}_z$, the components of the electric field vectors
produced at the initial field point by each charge. It then adds the
corresponding components and uses the rule $\theta = \tan^{-1}(\mathcal{E}_z/\mathcal{E}_x)$ to determine the
direction of the total electric field vector at the initial field point. Next, it
evaluates the x and z components of a displacement $\Delta\mathbf{s}$ whose magnitude is
$\Delta s$ and whose direction $\theta$ is that of the total electric field just determined. It
then adds these components to the x and z coordinates of the field point. If
the change in these coordinates is small, this produces a good
approximation to the coordinates of the point at a distance $\Delta s$ from the initial point
and on the same field line as the initial point. This concludes the first cycle
of calculation. The device then repeats the cycle, in effect taking another
step of length $\Delta s$ whose direction is approximately along the electric field
line. Every n steps it stops, displaying the current values of the x and z
coordinates of the point on the field line so that they can be plotted on graph
paper. (It plots them for you if the device is a computer with a graphic
display.) To plot another field line, the entire procedure is repeated, but
starting with a different initial point.
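The cycle of operations just described can be sketched in a few lines of a modern language. What follows is not the program from the Numerical Calculation Supplement; it is a minimal illustration written under several assumptions: the constant $1/4\pi\epsilon_0$ is set to 1 (only the direction of the field matters here), the two-argument inverse tangent `math.atan2` is used in place of $\tan^{-1}(\mathcal{E}_z/\mathcal{E}_x)$ so that the direction lands in the correct quadrant, and tracing stops when the point comes within an arbitrarily chosen `stop_radius` of a negative charge. The function and parameter names are hypothetical.

```python
import math

def total_field(x, z, charges):
    """Add the field components of every charge; charges is a list of (q, xq, zq)."""
    ex = ez = 0.0
    for q, xq, zq in charges:
        dx, dz = x - xq, z - zq
        r = math.hypot(dx, dz)
        ex += q * dx / r**3  # (q / r^2) times the x component of the unit vector
        ez += q * dz / r**3
    return ex, ez

def trace_field_line(x, z, charges, ds=0.1, n=5, steps=200, stop_radius=0.2):
    """Step along the field direction, recording every nth point for plotting."""
    points = [(x, z)]
    for step in range(1, steps + 1):
        ex, ez = total_field(x, z, charges)
        theta = math.atan2(ez, ex)  # direction of the total field
        x += ds * math.cos(theta)
        z += ds * math.sin(theta)
        if step % n == 0:
            points.append((x, z))
        # Assumed stopping rule: quit once the line arrives at a negative charge.
        if any(q < 0 and math.hypot(x - xq, z - zq) < stop_radius
               for q, xq, zq in charges):
            break
    return points

# The two equal positive charges of the text: q1 at (0, 0) and q2 at (0, 5).
charges = [(+1.0, 0.0, 0.0), (+1.0, 0.0, 5.0)]
line = trace_field_line(0.3, 0.4, charges)
for px, pz in line[:5]:
    print(f"x = {px:.3f}, z = {pz:.3f}")
```

Calling `trace_field_line` again with a different initial point gives another field line, exactly as in the procedure above. A useful check is to start on the plane midway between two equal charges, where symmetry requires the line to stay in that plane.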



The calculations which trace equipotentials are the same as those just
described for the field lines, except that the program is modified so that
after the direction of the total electric field vector at a point is determined,
the calculating device finds a direction perpendicular to it before continuing
with the calculation. Equipotentials are everywhere oriented so that a
direction along them is perpendicular to the local direction of the electric field
line. Hence this procedure produces, in good approximation, the
coordinates of the points on a curve representing an equipotential passing
through the initial field point. (The calculations do not determine the
numerical values of the electric potential for each of the equipotential curves.
If the values are needed, they must be obtained from separate calculations.
Such calculations are readily performed for each equipotential curve by
choosing the point at which it intersects the axis passing through the two
charges.)
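The modification for equipotentials is a single change: turn the step direction through 90 degrees. The sketch below illustrates this under the same assumptions as before ($1/4\pi\epsilon_0$ set to 1, hypothetical names, and `math.atan2` for the field direction). It also evaluates the potential at the first and last points as a separate calculation, to show how nearly constant V stays along the traced curve.

```python
import math

def total_field(x, z, charges):
    """Total field components from a list of (q, xq, zq), with 1/(4*pi*eps0) = 1."""
    ex = ez = 0.0
    for q, xq, zq in charges:
        dx, dz = x - xq, z - zq
        r = math.hypot(dx, dz)
        ex += q * dx / r**3
        ez += q * dz / r**3
    return ex, ez

def total_potential(x, z, charges):
    """Sum of the point-charge potentials q/r, in the same units."""
    return sum(q / math.hypot(x - xq, z - zq) for q, xq, zq in charges)

def trace_equipotential(x, z, charges, ds=0.01, n=10, steps=300):
    """Step perpendicular to the field, recording every nth point."""
    points = [(x, z)]
    for step in range(1, steps + 1):
        ex, ez = total_field(x, z, charges)
        theta = math.atan2(ez, ex) + math.pi / 2  # perpendicular to the field
        x += ds * math.cos(theta)
        z += ds * math.sin(theta)
        if step % n == 0:
            points.append((x, z))
    return points

charges = [(+1.0, 0.0, 0.0), (+1.0, 0.0, 5.0)]
curve = trace_equipotential(1.0, 1.0, charges)
v_start = total_potential(*curve[0], charges)
v_end = total_potential(*curve[-1], charges)
print(v_start, v_end)
```

Because each straight step leaves the curved equipotential slightly, the potential drifts a little as the tracing proceeds. Reducing `ds` reduces the drift, which is the reason a shorter step is advisable where the curvature is high.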

This numerical procedure was used to produce the field lines and
equipotentials, displayed in Fig. 21-11, for two separated charges of the same
magnitude and sign. It was also used to produce the field lines for two
charges of the same magnitude but different sign, shown in Fig. 20-22.
Example 21-6 uses the procedure to yield plots of the field lines and
equipotentials for two charges of different magnitude and sign.


EXAMPLE  21-6  

Use the field lines and equipotentials program to trace a representative set of
electric field lines and equipotential curves for two charges whose values and locations
are:

$q_1 = +1$ (in C) at x = 0 and z = 0; $q_2 = -4$ (in C) at x = 0 and z = 5 (in cm).

■ Generally, you can obtain good accuracy by using a step length of $\Delta s$ = 0.1 cm,
and you can obtain enough information by taking n = 10 so that only every tenth
calculated point is plotted. But very near the charges the equipotentials have quite
high curvature, so that for accuracy you should use $\Delta s$ = 0.025 cm with n = 10. Far
from the charges the field lines have quite low curvature, and it suffices to take $\Delta s$ =
0.2 cm and n = 5. Since both the electric field lines and the equipotential curves will
be symmetrical about the z axis passing through the two charges, you need perform
the calculations for only the part of the xz plane lying on one side of that axis. The
field lines and equipotentials can be extended to cover the entire plane if you take
advantage of their symmetry. Typical results are displayed in Fig. 21-12. The
plotted points are shown, as well as smooth curves you draw through them and
then extend by symmetry.

You  should  label  each  equipotential  curve  in  Fig.  21-12  with  the  numerical 
value  of  V on  that  curve.  What  must  you  do  to  obtain  the  information  you  need  to 
do  this? 


The equipotential surfaces and electric field lines of sources
comprising three or more point charges can be found by extending the
numerical calculations just considered to such cases. In principle, the
extension is just a matter of finding and adding components of the electric field
more than two at a time. But in practice the necessary increase in
number-handling capacity soon requires a full-sized computer as the
number of point charges increases. In some such cases, analytical methods
can be used in place of numerical ones. But these analytical methods also
become complicated when the number of charges is not small, unless the
charge distribution is highly symmetrical. In Sec. 21-5 we develop a
procedure that is very useful for finding equipotential surfaces of a source
containing many charges, provided the charges are contained on one or more
bodies made of some conducting material. Electric field lines can always be
constructed, once the equipotential surfaces have been found, by requiring
each line to be normal to every surface through which it passes.

Fig. 21-12 Electric field lines and equipotential curves in any plane passing through the axis
joining the charges, obtained in Example 21-6 for a system of two separated point charges of
differing sign and magnitude. Note that since the system has a net charge which is negative,
field lines come into the system from all directions to end on the negative charge. The field
lines very near the negative charge are not uniformly distributed as, strictly speaking, they
should be. The reason is that extra lines have been included to illustrate the transition between
lines beginning on the positive charge and lines coming in from outside the system. A related
departure of the diagram from strict accuracy is that it does not show 4 times as many lines
ending on the charge q = -4 C as there are beginning on the charge q = +1 C.


21-4  ELECTRIC DIPOLES

When a molecule is in its normal state, there are as many negatively
charged electrons surrounding the nuclei of its atoms as there are
positively charged protons in these nuclei. Since the charges on an electron and
a proton are of the same magnitude, there is no net charge on the
molecule. But although electrically neutral, most molecules have charge
distributions which are not spherically symmetrical (noble-gas molecules
are exceptions). In the simplest cases, the most important attribute
of these charge distributions is that the average position of the negative
charge in the molecule does not coincide with the average position of its
positive charge.

A very important example is the water molecule, shown schematically
in Fig. 21-13a. Also indicated in that figure are the average positions of the
negative  and  positive  charges  in  the  molecule.  On  the  average,  the  negative 
charge  is  quite  near  the  oxygen  nucleus  because  the  electrons  tend  to  spend 
more  time  near  it  than  near  the  hydrogen  nuclei.  The  average  position 
of  the  positive  charge  is  a little  farther  from  the  oxygen  nucleus. 

For many purposes the actual charge distribution in a water molecule
can be replaced by the charge distribution shown in Fig. 21-13b. All the
negative charge is replaced by an equal amount of charge located at the
average position of the actual charge, and the same is done for the positive
charge. Thus there are two separated point charges of equal magnitude
but opposite sign. These two charges form what is called an electric dipole.

Fig. 21-13 (a) A water molecule. It is formed when two hydrogen atoms bond
to an oxygen atom. Each of these three atoms is indicated by a sphere centered on
its nucleus. The sphere represents the electron distribution surrounding the
nucleus. But the representation is very schematic since the spherical symmetry that
the electron distributions have when the atoms are free and independent is
modified when they join to form the molecule. That is, the overall electron
distribution in the molecule is not accurately described by the superposition of
three spherical distributions. Shown to scale are the positions of each of the
nuclei in the molecule, the average position of the positive charge in the
molecule (the charges on the nuclei), and the average position of the negative
charge in the molecule (the charges on its electrons). The two average positions
do not coincide because the electron distribution is not actually a
superposition of three spherical distributions. (b) A water molecule represented
by a point charge equal to its total positive charge at the average location of this
charge, and another point charge equal to its total negative charge at the
location of this charge. The two point charges constitute an electric dipole.
[Labels in Fig. 21-13a: scale bar, 1 × 10^-10 m; two hydrogen nuclei; oxygen nucleus;
average position of positive charge; average position of negative charge.]

Numerous substances other than water contain such permanent electric
dipoles. And electric dipoles are induced when an external electric field is
applied to any substance which is not a conductor. What happens is that
the applied electric field "stretches" the molecules of the substance,
displacing the average position of its negative charge from that of its positive
charge. Although a molecule may not normally constitute an electric dipole
because the average positions of its negative and positive charge coincide,
when an electric field is applied to the molecule, the negative charge moves
in the direction opposite to that of the field and the positive charge moves
in the same direction as that of the field. The result is that an electric dipole
is induced in the molecule.

In this section we concentrate our attention on permanent electric
dipoles, leaving a treatment of induced electric dipoles for Sec. 21-8. First we
investigate the electric potential and electric field of a permanent electric
dipole. Later we investigate the behavior of a permanent electric dipole
when an external electric field is applied to it.


EXAMPLE  21-7 


An electric dipole consists of charges $q_+ = +|q|$ and $q_- = -|q|$, separated by
distance 2d. Find the electric potential V of the dipole at any point P whose distance R
from the center of the dipole is large compared with 2d.

Fig. 21-14 A sketch used in Example 21-7 to evaluate the electric potential due to an
electric dipole.

■ You begin by making a sketch like Fig. 21-14 of the geometric situation.
Specifically, you choose the coordinate axes so that the two charges of the dipole are
situated on the z axis indicated in the figure, whose origin is at the center of the dipole
and whose positive direction is from the negative charge to the positive one. You
represent the distances to the point P from $q_+$ and $q_-$, respectively, as $R_+$ and $R_-$.
Then you can use Eq. (21-11) to write the electric potential V at P as


V = 


1 

+ ^ \ 

w / 1 1 \ 

47760 

\R+ 

+ rJ 

47760  ^R+  P-  ) 

or 


1 

47Te0 


R - - R+ 


R+R _ 


(21-22) 


Since the point P is very distant from the dipole (that is, $R \gg 2d$),
simplifying approximations become possible. Call the angles between the positive z
axis and the lines to P from $q_+$, $q_-$, and the center of the dipole, respectively, $\theta_+$, $\theta_-$,
and $\theta$, as in Fig. 21-14. These angles are almost the same because the three lines are
very nearly parallel when P is very distant. But you cannot ignore the difference in
the lengths of these lines since Eq. (21-22) shows that V depends on this difference.
To evaluate $R_- - R_+$, you can use the following approximation: Drop a perpendicular
from $q_+$ to the line joining $q_-$ to P, as shown by the dashed line in Fig. 21-14.
The difference $R_- - R_+$ is then approximately the distance from the perpendicular
to $q_-$. Also shown in Fig. 21-14 is that the cosine of the angle $\theta_-$ is given by the ratio
of $R_- - R_+$ to $2d$, the distance between the two charges. Using the fact that $\theta_- \simeq \theta$,
you have

$$R_- - R_+ \simeq 2d\cos\theta_- \simeq 2d\cos\theta \tag{21-23a}$$

Furthermore, since the difference in the two lengths is small compared to the
lengths themselves, you can make the approximation

$$R_+R_- \simeq R^2 \tag{21-23b}$$

Inserting the approximations of Eqs. (21-23a) and (21-23b) in Eq. (21-22), you
obtain the result


$$V = \frac{1}{4\pi\epsilon_0}\,\frac{|q|\,2d\cos\theta}{R^2} \qquad \text{for } R \gg 2d \tag{21-24}$$


Equation (21-24) shows that at large distances R from an electric
dipole, its electric potential falls off as $R^{-2}$. This contrasts with the fact that
the electric potential of a single point charge falls off as $R^{-1}$, as you can see
by writing Eq. (21-7) as $V = (q/4\pi\epsilon_0)R^{-1}$. The more rapid decrease in V
with increasing R for an electric dipole is a result of the tendency of its two
oppositely signed charges to cancel each other in their effects. With
increasing R this cancellation becomes more effective since the distances
from the two charges to P become more similar.

The factor $\cos\theta$ in Eq. (21-24) is also plausible. Every point on the x
axis shown in Fig. 21-14 ($\theta = 90°$ and $\cos\theta = 0$) is equidistant from two
equal and opposite charges. Thus everywhere along the x axis the sum of
the individual electric potentials of the two charges is zero, even though
the sum of the individual electric fields of these charges is not
zero. To look at it another way, first consider that although the net electric
field at any point along the x axis is not zero, it is always directed perpendicular
to that axis. This is so because there is a cancellation of the x components
of the electric fields of the equidistant, oppositely signed charges.
Hence if a test charge comes in from infinity along the x axis, its displacement
$d\mathbf{s}$ is everywhere perpendicular to the electric field $\boldsymbol{\mathcal{E}}$. Thus
$dV = -\boldsymbol{\mathcal{E}} \cdot d\mathbf{s} = 0$ always, and so at all positions along the x axis V has the same
value V = 0 that it has at infinity.

On the other hand, at points along the z axis ($\theta = 0°$ or $180°$ and
$\cos\theta = 1$ or $-1$) the difference in the distances to point P from the two
charges is as great as possible, and the cancellation effect is minimized.
Thus for a fixed distance R from the center of the dipole, the electric
potential has its greatest magnitude at $\theta = 0°$ or $180°$.

Another feature of the $\cos\theta$ dependence is that $V > 0$ in the range
$0° \le \theta < 90°$ and $V < 0$ in the range $90° < \theta \le 180°$. This simply reflects
the fact that in the first range P is closer to the positive charge while in the
second it is closer to the negative charge.

Equation (21-24) shows that for a given large value of R, and for a
given value of $\theta$, the value of the electric potential V of a dipole depends
not on the particular value of $|q|$, the magnitude of either of its charges, nor
on the particular value of $2d$, their separation, but only on the charge
magnitude-separation product $|q|\,2d$. This product, which characterizes an
electric dipole, is called its electric dipole moment magnitude p. That is, by
definition

$$p = |q|\,2d \tag{21-25}$$

Expressed in terms of the magnitude p of the electric dipole moment, Eq.
(21-24) becomes

$$V = \frac{1}{4\pi\epsilon_0}\,\frac{p\cos\theta}{R^2} \qquad \text{for } R \gg 2d \tag{21-26}$$
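The quality of the far-field approximation can be checked numerically. The following is a minimal sketch, not part of the original derivation: it compares the sum of the two point-charge potentials with the far-field dipole formula at several distances. The function names, the value used for $1/4\pi\epsilon_0$, and the sample charge and spacing are illustrative assumptions.

```python
import math

K = 8.9875e9  # approximate value of 1/(4*pi*epsilon_0) in N*m^2/C^2 (assumed constant)

def dipole_potential_exact(q, d, R, theta):
    """Sum of the potentials of +|q| at z = +d and -|q| at z = -d."""
    x, z = R * math.sin(theta), R * math.cos(theta)
    r_plus = math.hypot(x, z - d)    # distance from the positive charge to P
    r_minus = math.hypot(x, z + d)   # distance from the negative charge to P
    return K * q * (1.0 / r_plus - 1.0 / r_minus)

def dipole_potential_far(p, R, theta):
    """Far-field form: V = p cos(theta) / (4 pi eps0 R^2), valid for R >> 2d."""
    return K * p * math.cos(theta) / R**2

# Illustrative values (hypothetical): |q| in coulombs, half-separation d in meters.
q, d = 1.0e-10, 2.5e-2
p = q * 2.0 * d                      # dipole moment magnitude p = |q| 2d
theta = math.radians(30.0)

for R in (0.1, 1.0, 10.0):
    exact = dipole_potential_exact(q, d, R, theta)
    far = dipole_potential_far(p, R, theta)
    rel = abs(far - exact) / abs(exact)
    print(f"R = {R:5.1f} m  exact = {exact:.6e} V  far = {far:.6e} V  rel. diff = {rel:.1e}")
```

The relative difference shrinks roughly as $(d/R)^2$, which is why the restriction $R \gg 2d$ is attached to the far-field result.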


EXAMPLE 21-8

Using Eq. (21-26), find $\mathcal{E}_x$ and $\mathcal{E}_z$, the x and z components of the electric field at a
distant point P of an electric dipole whose electric dipole moment has magnitude p.
The positive direction of the z axis is from the negative to the positive charge of the
dipole, as specified in deriving Eq. (21-26). The positive x axis is chosen in an
arbitrary direction perpendicular to that of the z axis. The coordinate origin is at the
center of the dipole. Using the values of $\mathcal{E}_x$ and $\mathcal{E}_z$ thus determined, find the
magnitude and direction of $\boldsymbol{\mathcal{E}}$.

■ You calculate the electric field components from the electric potential by using
the relations $\mathcal{E}_x = -\partial V/\partial x$ and $\mathcal{E}_z = -\partial V/\partial z$ of Eqs. (21-18a) and (21-18c). But
before doing this, you must express V as a function of x and z. Making a sketch as in
Fig. 21-15, you see that

$$\cos\theta = \frac{z}{(x^2 + z^2)^{1/2}}$$

and that

$$R^2 = x^2 + z^2$$

where x and z are the coordinates of P. Substituting these values into Eq. (21-26),
you obtain

$$V(x, z) = \frac{1}{4\pi\epsilon_0}\,p\,\frac{z}{(x^2 + z^2)^{3/2}} \qquad \text{for } R \gg 2d \tag{21-27}$$


You can now find the components of the electric field by partial differentiation
of Eq. (21-27). First you compute

$$\mathcal{E}_x = -\frac{\partial V(x, z)}{\partial x} = \frac{1}{4\pi\epsilon_0}\,p\,\frac{3xz}{(x^2 + z^2)^{5/2}} \qquad \text{for } R \gg 2d \tag{21-28a}$$




Fig. 21-15 A sketch used in Example 21-8 to evaluate the electric field
due to an electric dipole.


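Then you compute

$$\mathcal{E}_z = -\frac{\partial V(x, z)}{\partial z} = -\frac{1}{4\pi\epsilon_0}\,p\left[\frac{1}{(x^2 + z^2)^{3/2}} - \frac{3z^2}{(x^2 + z^2)^{5/2}}\right] = -\frac{1}{4\pi\epsilon_0}\,p\left[\frac{x^2 + z^2 - 3z^2}{(x^2 + z^2)^{5/2}}\right]$$

so that

$$\mathcal{E}_z = \frac{1}{4\pi\epsilon_0}\,p\,\frac{2z^2 - x^2}{(x^2 + z^2)^{5/2}} \qquad \text{for } R \gg 2d \tag{21-28b}$$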


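You can use these results to find the magnitude of the electric field $\boldsymbol{\mathcal{E}}$ from the
relation

$$\mathcal{E} = (\mathcal{E}_x^2 + \mathcal{E}_z^2)^{1/2} = \frac{1}{4\pi\epsilon_0}\,p\left[\frac{9z^2x^2 + 4z^4 + x^4 - 4z^2x^2}{(x^2 + z^2)^5}\right]^{1/2} = \frac{1}{4\pi\epsilon_0}\,p\left[\frac{(4z^2 + x^2)(z^2 + x^2)}{(x^2 + z^2)^5}\right]^{1/2}$$

$$\mathcal{E} = \frac{1}{4\pi\epsilon_0}\,p\,\frac{(4z^2 + x^2)^{1/2}}{(x^2 + z^2)^2} = \frac{1}{4\pi\epsilon_0}\,p\,\frac{(R^2 + 3z^2)^{1/2}}{R^4} \qquad \text{for } R \gg 2d \tag{21-29a}$$

For the direction of $\boldsymbol{\mathcal{E}}$, you have

$$\phi = \tan^{-1}\left(\frac{\mathcal{E}_x}{\mathcal{E}_z}\right) = \tan^{-1}\left(\frac{3xz}{2z^2 - x^2}\right) \qquad \text{for } R \gg 2d \tag{21-29b}$$

where $\phi$ is the angle between the positive z axis and $\boldsymbol{\mathcal{E}}$.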




Inspecting Eq. (21-29a), you can see that far from an electric dipole
the magnitude of its electric field decreases with increasing distance R as
$R^{-3}$. The contrast is with the $R^{-2}$ dependence for the electric field
magnitude of a single point charge. How is this related to the contrast between
the R dependences of the electric potentials of an electric dipole and a
point charge discussed below Example 21-7? What is the direction of the
electric field at a point far from the dipole which is on the positive z axis, on
the negative z axis, on the plane perpendicular to the z axis and bisecting
the dipole?

The electric field of an electric dipole can also be calculated directly by
taking the vector sum of the electric fields of its two point charges. To do
so, however, requires much more effort than the method used in Examples
21-7 and 21-8. And even the less laborious method becomes very laborious
when used to calculate the electric field near the dipole, where the
approximations made possible by the restriction $R \gg 2d$ cannot be applied.
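Although the direct vector sum is laborious by hand, it is easy to carry out numerically. The sketch below is illustrative only: it adds the Coulomb fields of the two charges in the x-z plane and compares the result with the far-field components $\mathcal{E}_x \propto 3xz/(x^2+z^2)^{5/2}$ and $\mathcal{E}_z \propto (2z^2 - x^2)/(x^2+z^2)^{5/2}$ found in Example 21-8. The function names, the constant `K`, and the sample numbers are assumptions.

```python
import math

K = 8.9875e9  # approximate value of 1/(4*pi*epsilon_0) in N*m^2/C^2 (assumed constant)

def field_exact(q, d, x, z):
    """Vector sum of the fields of +q at z = +d and -q at z = -d (x-z plane)."""
    ex = ez = 0.0
    for charge, z_charge in ((q, d), (-q, -d)):
        dx, dz = x, z - z_charge           # vector from the charge to point P
        r3 = math.hypot(dx, dz) ** 3
        ex += K * charge * dx / r3
        ez += K * charge * dz / r3
    return ex, ez

def field_far(p, x, z):
    """Far-field components of the dipole field, valid for R >> 2d."""
    r5 = (x * x + z * z) ** 2.5
    return K * p * 3.0 * x * z / r5, K * p * (2.0 * z * z - x * x) / r5

# Illustrative values (hypothetical): |q| in coulombs, half-separation d in meters.
q, d = 1.0e-10, 2.5e-2
p = q * 2.0 * d

for x, z in ((0.6, 0.8), (0.0, 1.0), (1.0, 0.0)):
    print((x, z), "exact:", field_exact(q, d, x, z), "far:", field_far(p, x, z))
```

On the positive z axis both versions give a field along +z, and on the x axis a field along −z, consistent with the field lines running from the positive charge to the negative one.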

But in any region the numerical method used in Example 21-6 can be
employed in a straightforward way to trace the electric field lines and the
equipotential curves of an electric dipole. Results obtained by doing this are
shown in Fig. 21-16. The specific values of the electric potential V used to
label the equipotentials were computed by taking $|q| = 1 \times 10^{-10}$ C and
$2d = 5 \times 10^{-2}$ m. The computations were made at the points where the
equipotentials intersect the line passing between the charges, by using Eq.
(21-22).


Fig. 21-16 A cross section in the plane of the page of the set of
electric field lines and equipotential surfaces of an electric dipole.
Taking $|q| = 1 \times 10^{-10}$ C and $2d = 5 \times 10^{-2}$ m, values of the
electric potential V have been computed and then used to label the
equipotentials. Note that the electric field has its maximum
magnitude in the region midway between the two charges. This is
visualized by means of the density of electric field lines, which is largest in
that region, so that the number of field lines per unit of perpendicular
area is largest there. It is also visualized by means of the
spacing between the equipotentials, which is smallest in that region.
Thus the rate of change of electric potential with respect to
position is largest there.


Fig. 21-17 An electric dipole of dipole moment $\mathbf{p} = |q|\,2\mathbf{d}_+$
in a uniform applied electric field $\boldsymbol{\mathcal{E}}$, whose field lines are
directed parallel to the x axis.


Having considered the properties of the electric field of an electric
dipole (as well as those of the electric potential associated with this field), we
now turn to a quite separate consideration. What is the effect on an electric
dipole of an external electric field in which it is situated? Figure 21-17
illustrates an electric dipole placed in an external electric field. This electric
field $\boldsymbol{\mathcal{E}}$ is that of a set of source charges not shown. It is constant in
magnitude and direction throughout the region where the dipole is located. The
figure shows that there are electric forces exerted on the positive and
negative charges of the dipole whose values are, respectively, $\mathbf{F}_+ = q_+\boldsymbol{\mathcal{E}} = |q|\boldsymbol{\mathcal{E}}$
and $\mathbf{F}_- = q_-\boldsymbol{\mathcal{E}} = -|q|\boldsymbol{\mathcal{E}}$. These two forces are of equal magnitude but
opposite direction. Thus no net force is exerted on an electric dipole by a uniform,
applied electric field.

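However, the external electric field does cause a torque to be exerted
on the dipole. The torque tends to rotate the dipole about its center so as to
align the direction from its negative to positive charge with the direction of
the electric field. We obtain an expression for this torque by choosing an
origin O at the point midway between the two charges of the dipole and
then specifying the positions of its positive and negative charges relative to
O by the vectors $\mathbf{d}_+$ and $\mathbf{d}_-$, shown in the figure. The torques $\mathbf{T}_+$ and $\mathbf{T}_-$
about O that act on the two charges are, by the definition of torque,

$$\mathbf{T}_+ = \mathbf{d}_+ \times \mathbf{F}_+ \qquad \text{and} \qquad \mathbf{T}_- = \mathbf{d}_- \times \mathbf{F}_-$$

The total torque on the dipole is their sum, $\mathbf{T} = \mathbf{T}_+ + \mathbf{T}_-$. But

$$\mathbf{d}_- = -\mathbf{d}_+ \qquad \text{and} \qquad \mathbf{F}_- = -\mathbf{F}_+$$

So

$$\mathbf{d}_- \times \mathbf{F}_- = (-\mathbf{d}_+) \times (-\mathbf{F}_+) = \mathbf{d}_+ \times \mathbf{F}_+$$

Therefore

$$\mathbf{T} = 2\mathbf{d}_+ \times \mathbf{F}_+ = 2\mathbf{d}_+ \times |q|\boldsymbol{\mathcal{E}}$$

or

$$\mathbf{T} = |q|\,2\mathbf{d}_+ \times \boldsymbol{\mathcal{E}} \tag{21-30}$$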




The quantity $|q|\,2\mathbf{d}_+$ appearing in this expression is a vector version of
the scalar quantity $p = |q|\,2d$ defined in Eq. (21-25) as the electric dipole
moment magnitude. We now define the vector quantity to be the electric
dipole moment $\mathbf{p}$ of the dipole. That is,

$$\mathbf{p} = |q|\,2\mathbf{d}_+ \tag{21-31}$$

Here $\mathbf{d}_+$ is the vector extending from the midpoint of the dipole to its
positive charge, and $|q|$ is the magnitude of either charge. In terms of $\mathbf{p}$, Eq.
(21-30) for the total torque on the electric dipole assumes the form

$$\mathbf{T} = \mathbf{p} \times \boldsymbol{\mathcal{E}} \tag{21-32}$$

The torque exerted on an electric dipole by a uniform, applied electric field equals the
cross product of its electric dipole moment vector and the applied electric field vector.
Note that the statement does not specify about which origin the torque is
measured. The reason is that it has the same value regardless of which
origin is used. You can verify this by repeating the calculation with a different
choice of origin; try it with the origin at one charge.
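The independence of the torque from the choice of origin can also be checked numerically. The following sketch is illustrative: it computes the total torque on the two charges about several origins and compares each result with $\mathbf{p} \times \boldsymbol{\mathcal{E}}$. The helper names and the numerical values are assumptions chosen only for the demonstration.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(ai + bi for ai, bi in zip(a, b))

def sub(a, b):
    return tuple(ai - bi for ai, bi in zip(a, b))

def scale(s, a):
    return tuple(s * ai for ai in a)

def total_torque(q, d_plus, E, origin):
    """Total torque about `origin`; positions are measured from the dipole midpoint."""
    d_minus = scale(-1.0, d_plus)        # the negative charge sits at -d_plus
    F_plus = scale(q, E)                 # force on +|q| in the uniform field
    F_minus = scale(-q, E)               # force on -|q|
    return add(cross(sub(d_plus, origin), F_plus),
               cross(sub(d_minus, origin), F_minus))

# Illustrative values (hypothetical): charge in C, positions in m, field in N/C.
q = 1.0e-10
d_plus = (0.0, 0.015, 0.02)
E = (1.0e5, 0.0, 0.0)
p = scale(2.0 * q, d_plus)               # p = |q| 2 d_plus

print("p x E              :", cross(p, E))
print("about the midpoint :", total_torque(q, d_plus, E, (0.0, 0.0, 0.0)))
print("about the + charge :", total_torque(q, d_plus, E, d_plus))
print("about another point:", total_torque(q, d_plus, E, (0.3, -0.1, 0.7)))
```

The result does not depend on the origin because the two forces sum to zero, so shifting the origin adds a term proportional to the net force, which vanishes.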

If an electric dipole in an applied electric field changes its orientation
in the sense specified by the torque acting on it, then this torque does
positive work. The energy involved is supplied from energy stored in the
system comprising the applied electric field and the electric dipole. In other
words, there is a potential energy of orientation in this system, which
changes by an amount equal to the negative of the work done by the
associated torque. The situation is very much like the one involving a force that
acts on a test charge in an applied electric field. When this single charge
changes its position so that the force does positive work, the potential
energy stored in the applied electric field-test charge system changes by an
amount equal to the negative of the work done.

We can evaluate a dipole’s orientational potential energy U in a
uniform, applied electric field by summing the electric potential energies $U_+$
and $U_-$ of its positive and negative charges. These are

$$U_+ = q_+V_+ \qquad \text{and} \qquad U_- = q_-V_-$$

where $V_+$ and $V_-$ are the values of the electric potential at the positions of
the two charges. Thus

$$U = U_+ + U_- = q_+V_+ + q_-V_- = |q|V_+ - |q|V_-$$

or

$$U = |q|(V_+ - V_-) \tag{21-33}$$


To evaluate $V_+ - V_-$, we take an x axis extending in the direction of
the applied electric field $\boldsymbol{\mathcal{E}}$, as indicated in Fig. 21-17. Then we write

$$V_+ - V_- = \int_{x_-}^{x_+} dV \tag{21-34}$$

Here $x_+$ and $x_-$ are the x coordinates of the positive and negative charges, as
is also indicated in the figure. Next we write Eq. (21-21), $dV = -\boldsymbol{\mathcal{E}} \cdot d\mathbf{s}$, for
the case where $d\mathbf{s}$ is in the direction of $\boldsymbol{\mathcal{E}}$ and has magnitude $dx$. We obtain

$$dV = -\mathcal{E}\,dx$$




Using this in Eq. (21-34), we have

$$V_+ - V_- = -\int_{x_-}^{x_+} \mathcal{E}\,dx = -\mathcal{E}\int_{x_-}^{x_+} dx$$

The integral immediately gives us

$$V_+ - V_- = -\mathcal{E}(x_+ - x_-)$$

With this substituted in Eq. (21-33), we find

$$U = -|q|\,\mathcal{E}(x_+ - x_-)$$

Now the figure shows that

$$x_+ - x_- = 2d\cos\theta$$

where $\theta$ is the angle between the direction of the electric dipole moment
vector $\mathbf{p}$ and the electric field vector $\boldsymbol{\mathcal{E}}$. Therefore we have

$$U = -|q|\,2d\,\mathcal{E}\cos\theta = -p\mathcal{E}\cos\theta$$

In terms of the vectors, this result can be expressed as

$$U = -\mathbf{p} \cdot \boldsymbol{\mathcal{E}} \tag{21-35}$$


The orientational potential energy of an electric dipole in a uniform, applied electric
field equals the negative of the dot product of its electric dipole moment vector and the
applied electric field vector. Note that the reference value U = 0 corresponds
to the reference orientation in which the electric dipole moment $\mathbf{p}$ is
perpendicular to the applied electric field $\boldsymbol{\mathcal{E}}$ ($\theta = 90°$). When the dipole is
oriented with $\mathbf{p}$ parallel to $\boldsymbol{\mathcal{E}}$ ($\theta = 0°$), the orientational potential energy has
the minimum value $U_{\min} = -p\mathcal{E}$. When it is oriented with $\mathbf{p}$ antiparallel to
$\boldsymbol{\mathcal{E}}$ ($\theta = 180°$), this energy has the maximum value $U_{\max} = p\mathcal{E}$.
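As a check on the orientational energy, the sketch below sums the potential energies of the two charges directly, taking the potential of the uniform field to be $V(x) = -\mathcal{E}x$ with the reference V = 0 at the dipole center, and compares the sum with $-p\mathcal{E}\cos\theta$. The function names and numerical values are illustrative assumptions.

```python
import math

def energy_from_charges(q, d, E, theta):
    """Sum q_+ V_+ + q_- V_- for a uniform field E along +x, with V(x) = -E x."""
    x_plus = d * math.cos(theta)       # x coordinate of the positive charge
    x_minus = -d * math.cos(theta)     # x coordinate of the negative charge
    V_plus = -E * x_plus
    V_minus = -E * x_minus
    return q * V_plus + (-q) * V_minus

def energy_from_dipole_moment(p, E, theta):
    """Orientational potential energy U = -p E cos(theta)."""
    return -p * E * math.cos(theta)

# Illustrative values (hypothetical): |q| in C, half-separation d in m, field in N/C.
q, d, E = 1.0e-10, 2.5e-2, 1.0e5
p = q * 2.0 * d

for degrees in (0, 60, 90, 120, 180):
    theta = math.radians(degrees)
    print(degrees, energy_from_charges(q, d, E, theta),
          energy_from_dipole_moment(p, E, theta))
```

The two columns agree at every angle, with the minimum $-p\mathcal{E}$ at 0° and the maximum $+p\mathcal{E}$ at 180°.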

Although they have been derived by assuming that the applied electric
field $\boldsymbol{\mathcal{E}}$ is uniform, the equations $\mathbf{T} = \mathbf{p} \times \boldsymbol{\mathcal{E}}$ and $U = -\mathbf{p} \cdot \boldsymbol{\mathcal{E}}$ can be used
in most circumstances when $\boldsymbol{\mathcal{E}}$ varies from one location to the next. The
variation in $\boldsymbol{\mathcal{E}}$ over the region occupied by a small electric dipole (such as a
molecule) usually will be small, and so the equations can be employed if $\boldsymbol{\mathcal{E}}$
is interpreted to represent the value at the center of the dipole. In
contrast, the conclusion that the net force on an electric dipole has the value
$\mathbf{F} = 0$ is not true if $\boldsymbol{\mathcal{E}}$ varies. Can you explain why?

Example  21-9  makes  use  of  Eq.  (21-35). 


EXAMPLE  21-9 


An inventor comes to you for advice on a design for an electric household
refrigerator. He claims great reliability and economy of manufacture, on the ground that
the only moving part is a small, cheap, low-pressure water pump. The idea, which is
quite analogous to that of the magnetic refrigerator described in Sec. 19-6, is
sketched schematically in Fig. 21-18. Pure water in the left branch of the loop is
subjected to the strongest electric field which will not lead to an electric discharge
through the water; the magnitude is about $\mathcal{E} = 1 \times 10^5$ N/C. The water then
flows into the cooling coils inside the refrigerator box, represented by the lower
branch of the loop, where there is no electric field. The water molecules, whose
permanent electric dipole moments have been aligned by the applied electric field, now
randomize their orientations. This disordering means there is an increase $\Delta S$ in the
entropy of the water. Thus since the temperature T is positive, the quantity $T\,\Delta S$ is
positive. But Eq. (18-61) states that $T\,\Delta S = \Delta H$. The positive value of the heat $\Delta H$
means that heat must flow into the water from the contents of the refrigerator box
through the walls of the cooling coils comprising the lower branch. This reduces the
temperature inside the box. The water, containing heat energy transferred to it
from the contents of the refrigerator, now passes out of the box and through a
small pump in the right branch of the loop. The pump serves solely to maintain the
flow of water around the loop. Next the water reenters the electric field in the top
branch of the loop. As a result, the electric dipole moments of the water molecules
are realigned. This ordering leads to a decrease in the entropy of the water. Hence
the quantity $T\,\Delta S = \Delta H$ is negative. This means that heat flows out of the water
through the radiator coils of the upper branch and into the surrounding
atmosphere. The water then continues around the loop. How should you advise the
inventor about the feasibility of his design?

Fig. 21-18 A schematic diagram of a proposed household refrigerator. Its
operation involves the interaction of electric dipoles with an applied electric
field, as explained in Example 21-9.

■ Tell him that his invention will work, in principle. Then raise the practical
question: At what rate must water flow through the loop in order to remove a
significant amount of heat from the refrigerator box? You can get a useful answer to
this question by making the most optimistic predictions possible. For this purpose
you assume that there is no interaction among individual water molecules. That is,
you assume that each molecule will “see” only the external electric field applied to
the water, and not the electric fields produced by the electric dipoles in neighboring
water molecules (which tend to oppose the external field as the molecules align in
that field). You refer to a physics handbook and find that the electric dipole
moment of water has the magnitude $p = 6 \times 10^{-30}$ C·m. This is one of the largest
values for any substance. So the choice of working fluid seems at first glance to be a
good one because the external electric field will have a relatively large effect on the
alignment of the water molecules.

When the molecular electric dipole moments $\mathbf{p}$ are lined up in the external
electric field $\boldsymbol{\mathcal{E}}$, the angle $\theta$ between $\mathbf{p}$ and $\boldsymbol{\mathcal{E}}$ has the value $\theta = 0°$. Thus according to
Eq. (21-35), the orientational potential energy per molecule is $U_{\text{initial}} = -p\mathcal{E}$. After
they have passed through the cooling coils inside the refrigerator box, their
orientational potential energy has the value $U_{\text{final}} = 0$ because they are in a region where
$\mathcal{E} = 0$. Thus the energy change per molecule has the value

$$\Delta U = U_{\text{final}} - U_{\text{initial}} = 0 + p\mathcal{E} = 6 \times 10^{-30}\ \text{C·m} \times 1 \times 10^{5}\ \text{N/C}$$

or

$$\Delta U = 6 \times 10^{-25}\ \text{J/molecule}$$

This is equal to the heat energy $\Delta H$ per molecule absorbed by the water from the
contents of the refrigerator box.

Now you look at the specifications of a typical household refrigerator, and you
find that it is expected to be able to remove heat energy from the contents of the
box at a rate of about 100 W = 100 J/s. So the water flow through the loop must be
at least

$$\frac{100\ \text{J/s}}{6 \times 10^{-25}\ \text{J/molecule}} = 2 \times 10^{26}\ \text{molecules/s}$$

Since Avogadro’s number is $6 \times 10^{26}$ molecules per kilomole, this is about 1/3 kmol/s.
And since the molecular weight of water is 18, it amounts to about

$$18\ \text{kg/kmol} \times \tfrac{1}{3}\ \text{kmol/s} = 6\ \text{kg/s}$$

that is, 6 kg of water passing through the loop each second, or approximately 90
gallons per minute. At the very least, you can tell the inventor that his pump is not
going to be small! In an ordinary, compressor-type refrigerator the heat energy per
molecule carried by the working fluid is about five orders of magnitude larger than
the value of $\Delta U$ calculated above.
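The chain of arithmetic in this example can be reproduced in a few lines. The sketch below uses the numbers quoted in the example; the variable names are illustrative, and the conversion to gallons per minute assumes a water density of 1 kg per liter and 3.785 liters per U.S. gallon, which are not stated in the example.

```python
p = 6e-30                   # C*m, dipole moment of water quoted in the example
E = 1e5                     # N/C, applied field magnitude quoted in the example
heat_rate = 100.0           # J/s, heat removal rate of a typical refrigerator
avogadro_per_kmol = 6e26    # molecules per kilomole, as rounded in the example
molar_mass = 18.0           # kg/kmol for water

delta_U = p * E                                   # energy absorbed per molecule, J
molecules_per_s = heat_rate / delta_U             # required molecular flow rate
kmol_per_s = molecules_per_s / avogadro_per_kmol
kg_per_s = molar_mass * kmol_per_s

# Assumed conversions (not from the example): 1 kg of water = 1 L, 3.785 L = 1 U.S. gallon.
gal_per_min = kg_per_s * 60.0 / 3.785

print(f"delta_U     = {delta_U:.1e} J/molecule")
print(f"flow        = {molecules_per_s:.2e} molecules/s")
print(f"            = {kmol_per_s:.2f} kmol/s")
print(f"            = {kg_per_s:.1f} kg/s")
print(f"            = {gal_per_min:.0f} gallons/min (approx.)")
```

Carrying the unrounded intermediate values gives about 5 kg/s; the example rounds each step to one significant figure and so quotes 6 kg/s. Either way the conclusion is the same: the required flow is far too large for a small pump.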




21-5 LAPLACE’S EQUATION

If you know how all the source charges $q_j$ are distributed in a certain
region, you can determine the total electric field $\boldsymbol{\mathcal{E}}$ of these charges by
applying Coulomb’s law to each charge. This requires evaluating the vector
sum of their individual fields, as in Sec. 20-4. When the distribution of
source charges is sufficiently symmetrical, much less effort is needed to use
Gauss’ law to determine the electric flux and then to find $\boldsymbol{\mathcal{E}}$ from the flux, as
in Sec. 20-5. For a distribution that does not have enough symmetry to
permit this, it is advantageous to deal with scalar electric potentials rather
than vector electric fields. You find the electric potential of each source
charge, evaluate the sum of these potentials to get the total electric
potential V, and then obtain $\boldsymbol{\mathcal{E}}$ by differentiating V with respect to position, as in
Sec. 21-4.

But in many cases you would like to know the electric potential V and
the corresponding electric field $\boldsymbol{\mathcal{E}}$ in a certain region when you do not know
the location of the source charges. Instead, you know V along the
boundaries of the region. For an example consider Fig. 21-19. Four long metal
plates are fastened together by insulating material and enclose a region of
square cross section. The terminals of two electric batteries are connected
to the plates by metal wires, as shown in the figure. The operation of
batteries is described in some detail in Chap. 22. Here it suffices to say that when
the connections are made, the batteries cause charge to flow through the
wires and onto the plates. In a very short time the motion of charge ceases.
When this occurs, the electric potential at any point
connected by conducting material with one terminal of a battery differs
from the electric potential at any point similarly connected to the other
terminal by an amount whose magnitude depends on the specifications of
the battery. In the case illustrated, this amount is 1 V for both batteries.
The sign of the difference in electric potential is determined by the signs
used to label the two battery terminals. Specifically, the points connected to
a terminal labeled as positive are at an electric potential V which is more
positive than those connected to a terminal labeled as negative. If you
inspect the figure, you will see that the result of connecting the two
batteries to the plates as shown is that relative to the value of V on the near plate,
the value on the left and right plates is made more positive by 1 V and the
value on the far plate more positive by 2 V. For the situation illustrated in
the figure, it is most convenient to take the reference position that must be
specified to define specific values of V as a position on the near plate (and
not a position at infinity, as usually is done with a system containing one or
more known source charges and no batteries). With this definition of
reference position, the values of the electric potential on the near, left, right,
and far plates are, respectively, V = 0, V = +1 V, V = +1 V, and V =
+2 V.

Fig. 21-19 A system of four long electrodes on which the
values of the electric potential are V = 0, V = +1 V,
V = +1 V, and V = +2 V. What are the values of V at points
within the region surrounded by the electrodes, and not near
their ends?

Implicit in the explanation just given is the fact that the electric potential
V has the same value everywhere in a conducting body when there is no charge
moving in the body. This is true since in such circumstances there must be
zero electric field everywhere in the interior of the body; otherwise
charge would be moving through the body because it is conducting. Hence
$dV = -\boldsymbol{\mathcal{E}} \cdot d\mathbf{s} = 0$ everywhere within the body, and so V has the same value
at every interior point. At the very surface of the body there can be a
nonzero electric field. But its direction must be normal to the surface so that
$\boldsymbol{\mathcal{E}} \cdot d\mathbf{s} = 0$, where $d\mathbf{s}$ is any direction parallel to the surface. There will be
such an electric field if the surface is charged, $\boldsymbol{\mathcal{E}}$ being in the outward
normal direction if the charge is positive and in the inward normal
direction if the charge is negative. Since the charge cannot move out of the
surface of the conducting body into the insulating material that surrounds it,
this electric field will produce no motion of charge. (If $\boldsymbol{\mathcal{E}}$ had the opposite
direction, charge would move into the interior of the conducting body.)


Metal plates across which differences of electric potential are
maintained by batteries, or by other means, are called electrodes. The batteries
in Fig. 21-19 cause charges to flow onto the electrodes when they are
connected. Thus there are charges on the boundaries of the region
surrounded by the electrodes. On account of the presence of these source
charges there is an electric field within the region, which you would like to
determine. But the only thing you know about the locations of these source
charges is that they have a complicated, nonuniform distribution on the
electrodes. In principle it is possible to make a laboratory measurement of
the source charge distribution on the boundaries of the region and then to
apply Coulomb’s law. But this would be very difficult in practice. A much
better way is to make use of Laplace’s equation. This equation allows you to
determine the electric potential V at any point within a charge-free region,
provided that you know V along the boundaries of the region. The source
charges do not enter into the calculation. Once you know V within the
region, you can use the relation between V and $\boldsymbol{\mathcal{E}}$ to determine $\boldsymbol{\mathcal{E}}$ by
differentiating.
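To give a feeling for how boundary values alone can fix V inside a charge-free region, here is a rough numerical sketch for the four-plate system of Fig. 21-19. It is not necessarily the numerical method used in this book. It assumes the standard finite-difference form of Laplace’s equation in two dimensions, in which V at each interior grid point equals the average of V at its four nearest neighbors, and it repeats that averaging until the values settle. The function name, grid size, and number of sweeps are illustrative assumptions.

```python
def relax_square(n=21, v_near=0.0, v_left=1.0, v_right=1.0, v_far=2.0, sweeps=2000):
    """Relax V on an n-by-n grid with fixed plate potentials on the four edges.

    Row 0 is the near plate, row n-1 the far plate; columns 0 and n-1 are
    the left and right plates. Assumes the neighbor-averaging form of
    Laplace's equation (an assumption of this sketch).
    """
    V = [[1.0] * n for _ in range(n)]          # arbitrary initial guess
    for j in range(n):
        V[0][j] = v_near
        V[n - 1][j] = v_far
    for i in range(1, n - 1):
        V[i][0] = v_left
        V[i][n - 1] = v_right
    for _ in range(sweeps):
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                # Replace each interior value by the average of its four neighbors.
                V[i][j] = 0.25 * (V[i - 1][j] + V[i + 1][j] + V[i][j - 1] + V[i][j + 1])
    return V

V = relax_square()
center = len(V) // 2
print("V at the center of the region:", round(V[center][center], 6))
for i in (5, 10, 15):
    print("row", i, [round(V[i][j], 3) for j in (5, 10, 15)])
```

Only the plate potentials appear in the calculation; the distribution of source charge on the plates is never needed. By symmetry the value at the center comes out equal to the average of the four plate potentials.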

Laplace’s equation is not an independent law. It is derived from Gauss’
law. But Gauss’ law can be derived from Coulomb’s law, to which it is
logically equivalent, as we saw in Sec. 20-5. So Laplace’s equation is really a
reformulation of Coulomb’s law, just as Gauss’ law is. Much like Gauss’ law,
Laplace’s equation not only facilitates particular calculations but also
provides important physical insights into general properties of electric
potentials and fields. In this section we derive Laplace’s equation from Gauss’
law. Then we make a qualitative investigation of the behavior of its
solutions. The behavior is reasonably simple, and it will suggest an equally
simple mechanical system that has a quite analogous behavior. This
analogy is exploited to develop the physical insights just mentioned. The
section closes with a calculation using Laplace’s equation to determine the
electric potential V in the interior region of the system illustrated in Fig. 21-19.
The calculation employs a very straightforward numerical method.

Fig. 21-20 A gaussian surface enclosing an infinitesimal charge-free region.

Figure 21-20 shows a charge-free region in the shape of an infinitesimal
cube whose edge lengths measured along the x, y, and z axes are dx, dy, and
dz, respectively. In this region there is a nonuniform electric field
$\boldsymbol{\mathcal{E}} = \mathcal{E}_x\hat{\mathbf{x}} + \mathcal{E}_y\hat{\mathbf{y}} + \mathcal{E}_z\hat{\mathbf{z}}$, where $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$ are unit vectors in the positive x, y, and
z directions. We will apply Gauss’ law to this charge-free region to derive a
relation that must be satisfied by $\boldsymbol{\mathcal{E}}$ in the region. Laplace’s equation for the
corresponding electric potential V will then follow immediately. The
derivation will be simplified, without the generality of its results being
restricted, if we assume that the components $\mathcal{E}_x$, $\mathcal{E}_y$, and $\mathcal{E}_z$ of the electric
field all have positive values in the region under consideration.

Since there is no charge within the region, Gauss’ law, Eq. (20-37),
requires that

$$\oint_{\substack{\text{closed}\\ \text{surface}}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = 0 \tag{21-36}$$

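The integral is taken over the six faces of the cube that are the boundaries
of the region. Consider first the “right-hand” face of the cube. This is the
face of area dy dz on which the x coordinate has the value x + dx. For
this face the outward surface-element vector $d\mathbf{a}$ points in the positive x
direction, and therefore it can be written $d\mathbf{a} = da\,\hat{\mathbf{x}}$. Thus we have
$\boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = (\mathcal{E}_x\hat{\mathbf{x}} + \mathcal{E}_y\hat{\mathbf{y}} + \mathcal{E}_z\hat{\mathbf{z}}) \cdot da\,\hat{\mathbf{x}} = \mathcal{E}_x\,da$.
Hence the integral over the surface of the face is

$$\int_{\substack{\text{face } dy\,dz\\ \text{at } x+dx}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = \int_{\substack{\text{face } dy\,dz\\ \text{at } x+dx}} \mathcal{E}_x\,da$$

The integral on the right side of this equality is exactly equal to the average
value of $\mathcal{E}_x$ over the face at x + dx times the area dy dz of the face. Since the
face is infinitesimal, this average value of $\mathcal{E}_x$ must be extremely close to the
value of $\mathcal{E}_x$ at the center of the face. Writing the value of $\mathcal{E}_x$ at the center of
the face at x + dx as $(\mathcal{E}_x)_{x+dx}$, we therefore have

$$\int_{\substack{\text{face } dy\,dz\\ \text{at } x+dx}} \mathcal{E}_x\,da = (\mathcal{E}_x)_{x+dx}\,dy\,dz$$

Consequently,

$$\int_{\substack{\text{face } dy\,dz\\ \text{at } x+dx}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = (\mathcal{E}_x)_{x+dx}\,dy\,dz$$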

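Next consider the “left-hand” face of area dy dz on which the x
coordinate has the value x. For this face the outward surface-element
vector $d\mathbf{a}$ points in the negative x direction, and so it can be written
$d\mathbf{a} = -da\,\hat{\mathbf{x}}$. Thus here we have
$\boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = (\mathcal{E}_x\hat{\mathbf{x}} + \mathcal{E}_y\hat{\mathbf{y}} + \mathcal{E}_z\hat{\mathbf{z}}) \cdot (-da\,\hat{\mathbf{x}}) = -\mathcal{E}_x\,da$.
The integral is

$$\int_{\substack{\text{face } dy\,dz\\ \text{at } x}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = -\int_{\substack{\text{face } dy\,dz\\ \text{at } x}} \mathcal{E}_x\,da$$

Thus we have

$$\int_{\substack{\text{face } dy\,dz\\ \text{at } x}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = -(\mathcal{E}_x)_x\,dy\,dz$$

where $(\mathcal{E}_x)_x$ is the value of $\mathcal{E}_x$ at the center of the face at coordinate x.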
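The sum of the integrals of $\boldsymbol{\mathcal{E}} \cdot d\mathbf{a}$ over the pair of faces of area dy dz is

$$\int_{\substack{\text{face } dy\,dz\\ \text{at } x+dx}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} + \int_{\substack{\text{face } dy\,dz\\ \text{at } x}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = (\mathcal{E}_x)_{x+dx}\,dy\,dz - (\mathcal{E}_x)_x\,dy\,dz = \left[(\mathcal{E}_x)_{x+dx} - (\mathcal{E}_x)_x\right] dy\,dz$$

The quantity in brackets is the difference between the values of $\mathcal{E}_x$ at the
centers of the pair of faces of the infinitesimal cube. Since the two faces are
only an infinitesimal distance dx apart, the quantity can be expressed as the
rate of change of $\mathcal{E}_x$ with respect to x at the center of the cube times the
distance dx between the faces. That is,

$$(\mathcal{E}_x)_{x+dx} - (\mathcal{E}_x)_x = \frac{\partial \mathcal{E}_x}{\partial x}\,dx$$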

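We employ partial derivative notation because the x component of the
electric field, $\mathcal{E}_x$, may depend on any or all of the independent variables x, y, z,
but we want the rate of change with respect to x only. Using this relation,
we obtain

$$\int_{\substack{\text{face } dy\,dz\\ \text{at } x+dx}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} + \int_{\substack{\text{face } dy\,dz\\ \text{at } x}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = \frac{\partial \mathcal{E}_x}{\partial x}\,dx\,dy\,dz \tag{21-37a}$$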

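In completely similar ways, we can obtain

$$\int_{\substack{\text{face } dx\,dz\\ \text{at } y+dy}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} + \int_{\substack{\text{face } dx\,dz\\ \text{at } y}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = \frac{\partial \mathcal{E}_y}{\partial y}\,dy\,dx\,dz \tag{21-37b}$$

and

$$\int_{\substack{\text{face } dx\,dy\\ \text{at } z+dz}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} + \int_{\substack{\text{face } dx\,dy\\ \text{at } z}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = \frac{\partial \mathcal{E}_z}{\partial z}\,dz\,dx\,dy \tag{21-37c}$$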

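If we now add Eqs. (21-37a), (21-37b), and (21-37c), the left side of the
equality that is produced is just the integral of $\boldsymbol{\mathcal{E}} \cdot d\mathbf{a}$ over the closed
surface formed by all six faces of the cube. So the addition yields

$$\oint_{\substack{\text{closed}\\ \text{surface}}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = \frac{\partial \mathcal{E}_x}{\partial x}\,dx\,dy\,dz + \frac{\partial \mathcal{E}_y}{\partial y}\,dy\,dx\,dz + \frac{\partial \mathcal{E}_z}{\partial z}\,dz\,dx\,dy$$

or

$$\oint_{\substack{\text{closed}\\ \text{surface}}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = \left(\frac{\partial \mathcal{E}_x}{\partial x} + \frac{\partial \mathcal{E}_y}{\partial y} + \frac{\partial \mathcal{E}_z}{\partial z}\right) dx\,dy\,dz$$

But Gauss’ law, Eq. (21-36), tells us that the integral on the left side of this
equation equals zero, since no charge is within the closed surface.
Therefore the quantity on the right side must also be equal to zero. This can be
true only if

$$\frac{\partial \mathcal{E}_x}{\partial x} + \frac{\partial \mathcal{E}_y}{\partial y} + \frac{\partial \mathcal{E}_z}{\partial z} = 0$$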


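To introduce the electric potential V, we use Eqs. (21-18):

$$\mathcal{E}_x = -\frac{\partial V}{\partial x} \qquad \mathcal{E}_y = -\frac{\partial V}{\partial y} \qquad \mathcal{E}_z = -\frac{\partial V}{\partial z}$$

These allow us to write our result as

$$\frac{\partial}{\partial x}\left(-\frac{\partial V}{\partial x}\right) + \frac{\partial}{\partial y}\left(-\frac{\partial V}{\partial y}\right) + \frac{\partial}{\partial z}\left(-\frac{\partial V}{\partial z}\right) = 0$$

Finally, we multiply through by −1 and use second partial derivative
notation to obtain Laplace’s equation:

$$\frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} = 0 \tag{21-38}$$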


This equation plays an important role in many fields of science and
engineering, since it governs the behavior of not only electric potentials but also
such things as gravitational potentials, velocity “potentials” in fluid flow,
and temperatures in heat flow.
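As a quick numerical sanity check of Eq. (21-38), the sketch below estimates the sum of the three second derivatives for the potential of a single point charge at a point away from the charge, where the region is charge-free. It is only an illustration under stated assumptions: the potential is written in units where q/(4πε₀) = 1, the second derivatives are estimated with the finite-difference form developed later in this section, and the names `potential`, `laplacian`, and the sample point are hypothetical choices made here.

```python
import math

def potential(x, y, z):
    """Point-charge potential 1/r, in assumed units where q/(4*pi*eps0) = 1."""
    return 1.0 / math.sqrt(x * x + y * y + z * z)

def sphere_squared(x, y, z):
    """A comparison function that does NOT satisfy Laplace's equation."""
    return x * x + y * y + z * z

def laplacian(f, x, y, z, h=1e-3):
    """Finite-difference estimate of d2f/dx2 + d2f/dy2 + d2f/dz2 at (x, y, z)."""
    neighbors = (f(x + h, y, z) + f(x - h, y, z)
                 + f(x, y + h, z) + f(x, y - h, z)
                 + f(x, y, z + h) + f(x, y, z - h))
    return (neighbors - 6.0 * f(x, y, z)) / (h * h)

# Sample point chosen arbitrarily, away from the charge at the origin.
point_charge_laplacian = laplacian(potential, 1.0, 2.0, -0.5)
quadratic_laplacian = laplacian(sphere_squared, 1.0, 2.0, -0.5)

print("1/r potential:  ", point_charge_laplacian)  # very close to zero
print("x^2 + y^2 + z^2:", quadratic_laplacian)     # close to 6, not zero
```

The comparison function is included only to show that the estimate is not trivially zero: its three curvatures are each 2, so they add to 6 rather than to zero.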

Laplace’s equation describes the possible behavior of the electric
potential V in a charge-free region of space. The description has a
geometrical interpretation based on the fact that the second derivative of any
function with respect to one of its independent variables determines the
curvature of a plot of the function versus that variable. For instance, take a case
where V is a function of x alone, that is, V = V(x). It is a familiar fact that in
such a case the second derivative $d^2V(x)/dx^2$ at any point x measures the
curvature of a plot of V(x) versus x at that point. The magnitude of the
curvature increases with the magnitude of $d^2V(x)/dx^2$, and the plot is concave
upward or concave downward according to whether $d^2V(x)/dx^2$ is positive or
negative.

When V is a function of three variables, such as V = V(x, y, z), there are
three basic curvatures at any point. Holding any two of the variables fixed
and permitting the third one to change allows an investigation of the
curvature of a plot of V(x, y, z) along the direction of the changing variable. The
curvature is determined by the second partial derivative with respect to
that variable in a way completely similar to that just described. For
example, the magnitude of $\partial^2 V(x, y, z)/\partial y^2$ specifies the magnitude of the
curvature at x, y, z of a plot of V(x, y, z) with x and z held constant and y allowed
to vary. The plot is concave upward if $\partial^2 V(x, y, z)/\partial y^2 > 0$ and concave
downward if $\partial^2 V(x, y, z)/\partial y^2 < 0$.

With these geometrical properties in mind, Laplace’s equation,
$\partial^2 V/\partial x^2 + \partial^2 V/\partial y^2 + \partial^2 V/\partial z^2 = 0$, can be interpreted as saying that any
function V(x, y, z), which describes the way an electric potential varies from point to
point in a charge-free region, must everywhere be such that its curvatures along the
mutually perpendicular x, y, and z directions add to zero.

Fig. 21-21 At the central point the surface V(x, y) has no curvature in either
the x or the y direction.

It is easy to illustrate this interpretation in a case where the potential is
a function of only two variables, say V = V(x, y). Except near the ends of the
electrodes, the electric potential in the region inside the system of long
metal plates of Fig. 21-19 has this property if we orient the coordinate axes
as in the figure so that the z axis is vertical. In the inside region we have
$\partial^2 V/\partial z^2 = 0$, and Laplace’s equation reduces to

$$\frac{\partial^2 V(x, y)}{\partial x^2} + \frac{\partial^2 V(x, y)}{\partial y^2} = 0 \tag{21-39}$$

Consider any point x, y in the region where Eq. (21-39) is valid. There are
only two ways for V(x, y) to satisfy the equation at that point. The first way is
for both derivatives to be equal to zero. The second is for one of them to
have a positive value and the other to have a negative value of exactly the
same magnitude.

Figure 21-21 illustrates the first type of solution in a region near the
point. It is a surface obtained by plotting V(x, y) along an axis perpendicular
to an xy plane. In a small locality surrounding a point where both
$\partial^2 V(x, y)/\partial x^2 = 0$ and $\partial^2 V(x, y)/\partial y^2 = 0$, the surface V(x, y) is a plane. It can
have any inclination, but it must be locally planar since there is no
curvature in either the x or the y direction. Figure 21-22a and b pictures the local
behavior of the surface specified by V(x, y) near a point where
$\partial^2 V(x, y)/\partial x^2 = -\partial^2 V(x, y)/\partial y^2 \neq 0$. For this second type of solution to
Laplace’s equation at the point, the V(x, y) surface is locally saddlelike. It can
be concave upward along the x direction and concave downward along the
y direction, or vice versa. The tangent to the surface at the point can have
any inclination.
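The two types of local behavior can be made concrete with simple functions. The worked example below is an illustration only: the functions $V_1$, $V_2$, $V_3$ and the constants a, b, c, k are chosen here for convenience and are not taken from the figures.

```latex
% Type 1: a tilted plane. Both curvatures vanish, so Eq. (21-39) holds.
V_1(x, y) = a x + b y + c:
\qquad \frac{\partial^2 V_1}{\partial x^2} = 0, \quad
       \frac{\partial^2 V_1}{\partial y^2} = 0

% Type 2: a saddle. The curvatures are equal and opposite, so Eq. (21-39) holds.
V_2(x, y) = k\,(x^2 - y^2):
\qquad \frac{\partial^2 V_2}{\partial x^2} = 2k, \quad
       \frac{\partial^2 V_2}{\partial y^2} = -2k, \quad
       \frac{\partial^2 V_2}{\partial x^2} + \frac{\partial^2 V_2}{\partial y^2} = 0

% Contrast: a bowl. The curvatures have the same sign, so Eq. (21-39) fails.
V_3(x, y) = k\,(x^2 + y^2):
\qquad \frac{\partial^2 V_3}{\partial x^2} + \frac{\partial^2 V_3}{\partial y^2} = 4k \neq 0
```

For k > 0 the function $V_3$ has a minimum at the origin, which is exactly the kind of behavior that Eq. (21-39) rules out in a charge-free region.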

The complete surface specified by the behavior of V(x, y) at all points
in the charge-free region is a “patchwork” of adjoining surface elements,
each of which looks like one of those shown in Fig. 21-21 or 21-22. At no
point within the region can the surface have a maximum or a minimum,
since the former requires that both $\partial^2 V(x, y)/\partial x^2 < 0$ and $\partial^2 V(x, y)/\partial y^2 < 0$,
while the latter requires that both be greater than zero. Can you see that a
maximum or a minimum would imply the presence of charge?

If a rubber sheet is stretched between supports of different heights at
the boundaries of a region, then within the region it will form a surface
which has properties analogous to those just described for the complete
V(x, y) surface. At all points within the region, the stretched sheet will be
locally planar or locally saddlelike. It cannot have a maximum or a minimum
at any point inside the boundaries where it is supported. The analogy is
more than qualitative. At all points where the sheet is not touching a
support, the function H(x, y) specifying the height above an xy plane of a
uniformly stretched sheet satisfies an equation mathematically identical to
Laplace’s equation for the electric potential, providing the slope angle
between the H(x, y) surface and the xy plane is everywhere small enough
that its sine or tangent is essentially equal to the angle itself. You can see
that this is so by studying Fig. 21-23 and its caption.

Fig. 21-22 (a) At the central point the surface V(x, y) has a positive
curvature in the x direction and an equal negative curvature in the y
direction. (b) At the central point the surface V(x, y) has a negative
curvature in the x direction and an equal positive curvature in the y
direction.

Fig. 21-23 An infinitesimal piece of a uniformly stretched rubber sheet and
the tension forces of equal magnitude exerted on it by the adjacent parts of
the sheet. Each pair of forces acting on opposite edges is like the single pair
of forces acting on opposite ends of the piece of stretched string in Fig. 12-8.
The pair acting on the left and right edges gives a net upward force because
the piece is concave upward along the x direction. The force is proportional
to $\partial^2 H(x, y)/\partial x^2$, where H(x, y) specifies the height of the sheet as a function
of the coordinates x, y, providing the angle between the H(x, y) surface and
the xy plane is small. This is comparable to Eq. (12-24) for the net upward
force acting on the piece of string. The total upward force on the piece of
sheet due to both pairs of forces acting on its edges is proportional to
$\partial^2 H(x, y)/\partial x^2 + \partial^2 H(x, y)/\partial y^2$; the second term comes from the curvature in
the y direction. Since the piece is stationary, this force must be zero, and
therefore $\partial^2 H(x, y)/\partial x^2 + \partial^2 H(x, y)/\partial y^2 = 0$. This equation is satisfied by
the piece illustrated in the figure since it has as much downward curvature
along the y direction as it has upward curvature along the x direction, so
$\partial^2 H(x, y)/\partial y^2 = -\partial^2 H(x, y)/\partial x^2$.

Because  the  height  function  H(x,  y)  for  a stretched  rubber  sheet  in  an 
unsupported  region  satisfies  the  same  equation  as  the  potential  function 
V(x,  y)  in  an  uncharged  region,  a rubber  sheet  provides  a mechanical 
system  that  accurately  models  (that  is,  behaves  in  a manner  analogous  to)  the 
electrical  system.  To  use  the  model,  first  you  make  a scale  drawing  on  an  xy 
plane  of  the  intersections  with  that  plane  of  each  metal  electrode  at  the 
boundaries  of  the  electrical  system.  Then  you  construct  supports  along 
these  outlines  whose  heights  above  (or  below)  the  plane  are  proportional  to 
the positive (or negative) electric potentials on the electrodes and stretch
the sheet between the supports. It automatically forms a surface whose
height  everywhere  in  the  region  within  the  boundaries  is  proportional  to 
the  electric  potential  for  the  system  in  that  region.  The  model  acts  like 
an  “analogue  computer”  that  produces  the  solution  of  Laplace’s  equation 
for  the  specified  boundary  values  of  the  electric  potential  V(x,  y).  Prior  to 
the advent of digital computers, this procedure was used frequently
by  scientists  and  engineers  to  solve  Laplace’s  equation  for  two- 
dimensional  boundary  value  problems  that  were  difficult  or  impossible  to 
treat  by  analytical  means. 

Figures 21-24 through 21-27 illustrate the stretched rubber-sheet
analogy applied to some simple electrode configurations. An outer frame must
be used to keep the sheet stretched; this frame acts like an outer boundary.
The unsupported regions, or uncharged regions, lie between it and the
boundaries determined by the supports at various heights, or electrodes at
various electric potentials. If the distance from any part of the frame to any
part of a support representing an electrode is large compared to the widths
of the supports, and if the height of the frame defines the height of the xy
plane, then to a good approximation the effect of the frame is the same as
the effect of choosing the electric potential to have the value zero infinitely
far from the electrodes.




Fig. 21-24 Stretched rubber-sheet analogue of the
electric potential V(x, y) surrounding a long, straight
metal wire of finite diameter. The wire has been
given a certain amount of positive charge, so that at
its surface V has a certain positive value. Near the
frame holding the rubber sheet in tension the
analogy is not accurate since V(x, y) really reaches zero
only at infinity. And accuracy is limited near the
post representing the wire since its height has been
exaggerated, for the sake of clarity, causing the
angle between the sheet and the plane of the frame
to become large. What is the electric field like? What
happens to the electric potential, and the
corresponding electric field, if the value of V at the wire
is held constant while the diameter of the wire is
reduced?


Fig. 21-25 The electric potential V(x, y) in the
vicinity of two long, straight parallel wires on which
the values of V are made equal but opposite. How
does the electric field compare to the one in the
vicinity of an electric dipole made from two small
charged spheres?


Fig.  21-26  A long,  straight  wire  on  which  the  elec- 
tric potential  has  a certain  positive  value  V is  placed 
near  a parallel,  metal  plane  on  which  V = 0.  The 
electric  potential  V(x,  y)  to  the  left  of  the  plane  is 
just  like  that  in  the  left  half  of  Fig.  21-25.  Why? 




Fig.  21-27  The  electric  potential  V(x,  y)  inside  and 
outside  a long,  straight  metal  tube  of  noncircular 
cross  section,  on  the  surface  of  which  it  has  a posi- 
tive value  V.  Since  the  electric  potential  has  the  same 
value  V everywhere  inside  the  tube,  there  can  be  no 
electric  field  in  this  region.  The  decrease  in  the 
values  of  V(x,  y)  in  the  region  outside  the  tube  is 
most  rapid  near  the  part  of  the  tube  where  its  sur- 
face bends  most  abruptly — that  is,  near  the  pointed 
end  of  the  tube.  Why  is  this  so?  The  significance  of 
this  behavior  of  V(x,  y)  is  discussed  in  the  text. 


Particularly interesting is the surface formed by the sheet inside the
cylinder in Fig. 21-27. The cylinder of noncircular cross section is of
constant height since the electric potential is constant at all points on the
surface of the cylindrical metal electrode of noncircular cross section which it
represents. The sheet will maintain that height everywhere within the
cylinder. Thus the electric potential will be V(x, y) = constant within the metal
electrode. As a consequence, the electric field components will be
$\mathcal{E}_x = -\partial V(x, y)/\partial x = 0$ and $\mathcal{E}_y = -\partial V(x, y)/\partial y = 0$ in this region. There will be
no electric field in the region within the electrode, even though the region
is surrounded by the charges that must be on the electrode because its
electric potential is not zero.

This  analogy  gives  a very  clear  insight  into  the  reason  why  there  is  no  electric 
field  within  a charged  metal  cylinder,  no  matter  what  the  shape  of  its  cross  section 
is.  How  does  it  carry  over  to  the  three-dimensional  case  of  a charged  metal  shell  of 
arbitrary  shape? 

Note that the height of the stretched sheet in the region just outside
the cylinder decreases most rapidly where its surface bends most rapidly.
Thus for the analogous system the electric potential decreases most rapidly
just outside the “most pointed” part of the electrode. This means that the
magnitude $\mathcal{E}$ of the electric field will be greatest just outside this part of the
electrode. You can see this by writing Eq. (21-21), $dV = -\boldsymbol{\mathcal{E}} \cdot d\mathbf{s}$, with the
displacement $d\mathbf{s}$ taken parallel to the direction of $\boldsymbol{\mathcal{E}}$ so that $dV = -\mathcal{E}\,ds$ or
$\mathcal{E} = -dV/ds$. The last equation makes it clear that $\mathcal{E}$ will be largest where dV/ds
has the most negative value.

A related consequence is that the charge on the surface of the
electrode will be most concentrated where the electrode is most pointed. This is
because the value of $\mathcal{E}$ just outside its surface is proportional to the value of
the charge per unit area, σ, at the surface. You can find qualitative
justification for a proportionality between $\mathcal{E}$ and σ in the fact that the electric field
lines emanate from the charges on the surface and are normal to the
surface just outside it. Thus the number of electric field lines per unit of
surface area is proportional to the charge per unit of surface area. Since the
number of lines per unit area is proportional to $\mathcal{E}$, it follows that $\mathcal{E}$ is
proportional to σ. (Alternatively, you can follow an exercise in Chap. 20 and
apply Gauss’ law to prove the quantitative relation $\mathcal{E} = \sigma/\epsilon_0$.)




A lightning  rod  is  a three-dimensional  example  taking  advantage  of  the  prop- 
erties discussed  in  the  two  preceding  paragraphs.  Frictional  effects  in  a turbulent 
atmosphere  often  lead  to  a large  accumulation  of  charge  on  a cloud.  In  “favorable” 
circumstances  the  resulting  electric  field  in  the  region  between  the  cloud  and  the 
earth  can  produce  the  electric  discharge  that  is  called  lightning.  A lightning  rod  is 
a metal  rod  with  a sharply  pointed  end  connected  by  a thick  conducting  wire  to 
the  earth.  When  a charged  cloud  passes  overhead,  charge  of  the  opposite  sign  is  at- 
tracted from  the  earth  through  the  wire  to  the  electrode,  where  it  is  concentrated  at 
the  pointed  end.  Just  outside  this  region  the  electric  field  has  a relatively  large 
magnitude.  The  consequence  is  that  if  it  is  at  all  possible  for  lightning  to  strike,  it 
will  strike  the  lightning  rod  and  then  pass  harmlessly  through  the  conducting 
wire  to  the  earth,  instead  of  striking  a building  in  the  vicinity. 

Many other insights into qualitative properties of electric potentials
and electric fields can be found in the stretched rubber-sheet analogy. And
usually  you  do  not  need  to  employ  an  actual  rubber  sheet.  Your  intuition 
will  tell  you  well  enough  what  surface  such  a sheet  would  form.  Try  it  on 
the  electrode  configuration  shown  in  Fig.  21-19. 

Obtaining  a quantitative  solution  to  Laplace’s  equation  for  a certain  set 
of  boundary  values  by  using  a stretched  rubber  sheet  as  an  analogue  com- 
puter is  a cumbersome  procedure  with  limited  accuracy.  Furthermore,  the 
analogy  is  restricted  to  potential  functions  involving  only  two  coordinates, 
while  Laplace’s  equation  applies  just  as  well  to  those  involving  three  coordi- 
nates. So  Laplace’s  equation  usually  is  solved  by  mathematical  methods. 
But  analytical  solutions  to  this  second-order  partial  differential  equation 
can  be  found  for  systems  of  electrodes  only  if  the  boundaries  of  the  elec- 
trodes lie  along  constant  coordinate  surfaces  of  certain  sets  of  coordinates. 
And the analytical solutions are quite complicated.

For  many  systems  of  great  practical  interest  there  is  no  analytical  solu- 
tion to  Laplace’s  equation  because  the  electrodes  do  not  have  the  required 
symmetry.  However,  there  is  an  extremely  simple  method  for  obtaining  a 
numerical  solution  to  the  equation.  As  is  characteristic  of  numerical  proce- 
dures, the  method  can  be  applied  successfully  to  all  cases  involving  La- 
place’s equation.  And  it  provides  a typical  example  of  how  partial  differen- 
tial equations  are  solved  numerically.  We  explain  the  method  and  then 
demonstrate  its  application. 

You can see the basis of the numerical method by inspecting Figs.
21-21 and 21-22. The plane shown in Fig. 21-21 is a part of a V(x, y) surface
for which the two coordinates lie within small ranges of the same size.
Because it is a plane, it is evident that the value of V(x, y) at the point at its
center equals the average of the values of V(x, y) at the four points labeled a,
b, c, d. This is true no matter how the plane is inclined. Furthermore, the
same property applies to either of the saddlelike V(x, y) surfaces shown in
Fig. 21-22. The central value of V(x, y) for the surface in Fig. 21-22a is
larger than the average of its values at points a and c because the surface is
concave downward between these points. But it is also smaller, by exactly
the same amount, than the average of the values of V(x, y) at points b and d.
This is so because according to Laplace’s equation the surface is concave
upward with a curvature between those points that is precisely equal in
magnitude, though opposite in sign, to the curvature of the surface
between points a and c. As a consequence, the value of V(x, y) at the center
of this surface is also equal to the average of its values at points a, b, c, d.
The same is true for the surface in Fig. 21-22b. And this statement remains
true independent of the inclination of the tangent to either surface at its
central point. Thus Laplace’s equation says, in effect, that the value of V(x, y)
at any point in a charge-free region must be the average of its values at four
symmetrically disposed, nearby surrounding points. Example 21-10 makes use of this
property of the electric potential.
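The averaging property can be checked numerically for simple functions. The sketch below is an illustration under stated assumptions: the functions `saddle`, `exp_cos`, and `bowl`, the sample point, and the spacing `h` are choices made here, not quantities taken from the figures. The first two functions satisfy Laplace’s equation in two coordinates; the third does not and is included only for contrast.

```python
import math

def four_point_average(v, x, y, h):
    """Average of v at the four points a distance h from (x, y) along x and y."""
    return (v(x + h, y) + v(x - h, y) + v(x, y + h) + v(x, y - h)) / 4.0

def saddle(x, y):
    # Satisfies Laplace's equation: the x curvature is 2, the y curvature is -2.
    return x * x - y * y

def exp_cos(x, y):
    # Also satisfies Laplace's equation: the two curvatures are V and -V.
    return math.exp(x) * math.cos(y)

def bowl(x, y):
    # Does NOT satisfy Laplace's equation: both curvatures are +2.
    return x * x + y * y

x0, y0, h = 0.3, 0.7, 0.1  # arbitrary sample point and spacing

saddle_gap = four_point_average(saddle, x0, y0, h) - saddle(x0, y0)
exp_cos_gap = four_point_average(exp_cos, x0, y0, h) - exp_cos(x0, y0)
bowl_gap = four_point_average(bowl, x0, y0, h) - bowl(x0, y0)

print("saddle gap: ", saddle_gap)   # zero apart from rounding
print("exp_cos gap:", exp_cos_gap)  # small; shrinks as h is reduced
print("bowl gap:   ", bowl_gap)     # equals h*h, not zero
```

For the saddle the average equals the central value for any spacing, because the function has no terms beyond second order. For `exp_cos` the agreement is approximate at finite spacing and improves as `h` shrinks. For the bowl the average exceeds the central value, as expected for a surface that is concave upward in both directions.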


This basic property can also be obtained directly from Laplace’s equation, Eq.
(21-39),

$$\frac{\partial^2 V(x, y)}{\partial x^2} + \frac{\partial^2 V(x, y)}{\partial y^2} = 0$$

by applying to it the definition of a partial derivative. The first partial derivative
with respect to x of V(x, y) at the point x, y can be defined in terms of the
approximation

$$\frac{\partial V(x, y)}{\partial x} \simeq \frac{V(x + \Delta x/2,\, y) - V(x - \Delta x/2,\, y)}{\Delta x} \tag{21-40}$$


with the understanding that the approximation is better as the separation $\Delta x$
between the two symmetrically disposed points at which V(x, y) is evaluated
becomes smaller. In the limit $\Delta x \to 0$, this is completely equivalent to a definition
involving the more familiar expression $[V(x + \Delta x, y) - V(x, y)]/\Delta x$. But Eq.
(21-40) has the advantage of providing a more accurate approximation to
$\partial V(x, y)/\partial x$ for any value of the finite quantity $\Delta x$. The second partial derivative
with respect to x can be expressed similarly by using Eq. (21-40) to compute $\partial/\partial x$
of $\partial V(x, y)/\partial x$. That is,

$$\frac{\partial}{\partial x}\left[\frac{\partial V(x, y)}{\partial x}\right] \simeq \frac{1}{\Delta x}\left\{\frac{V((x + \Delta x/2) + \Delta x/2,\, y) - V((x - \Delta x/2) + \Delta x/2,\, y)}{\Delta x} - \frac{V((x + \Delta x/2) - \Delta x/2,\, y) - V((x - \Delta x/2) - \Delta x/2,\, y)}{\Delta x}\right\}$$

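or

$$\frac{\partial^2 V(x, y)}{\partial x^2} \simeq \frac{V(x + \Delta x,\, y) - 2V(x, y) + V(x - \Delta x,\, y)}{(\Delta x)^2} \tag{21-41}$$

Similarly, we have for the second partial derivative with respect to y

$$\frac{\partial^2 V(x, y)}{\partial y^2} \simeq \frac{V(x,\, y + \Delta y) - 2V(x, y) + V(x,\, y - \Delta y)}{(\Delta y)^2} \tag{21-42}$$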
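The accuracy of the symmetric difference of Eq. (21-40) and of the second-difference form of Eqs. (21-41) and (21-42) can be tried out on a function of one variable whose derivatives are known. The sketch below is illustrative only: the test function sin x, the point `x0`, the spacing `dx`, and the function names are assumptions made here.

```python
import math

def forward_difference(f, x, dx):
    """The familiar one-sided estimate [f(x + dx) - f(x)] / dx."""
    return (f(x + dx) - f(x)) / dx

def central_difference(f, x, dx):
    """Symmetric estimate in the form of Eq. (21-40)."""
    return (f(x + dx / 2.0) - f(x - dx / 2.0)) / dx

def second_difference(f, x, dx):
    """Second-derivative estimate in the form of Eq. (21-41)."""
    return (f(x + dx) - 2.0 * f(x) + f(x - dx)) / (dx * dx)

x0, dx = 1.0, 0.1  # arbitrary sample point and finite spacing

# Exact values for sin(x): first derivative cos(x), second derivative -sin(x).
forward_error = abs(forward_difference(math.sin, x0, dx) - math.cos(x0))
central_error = abs(central_difference(math.sin, x0, dx) - math.cos(x0))
second_error = abs(second_difference(math.sin, x0, dx) + math.sin(x0))

print("forward-difference error:", forward_error)
print("central-difference error:", central_error)
print("second-difference error: ", second_error)
```

With the same finite spacing, the symmetric form gives a much smaller error than the one-sided form, which is the advantage claimed for Eq. (21-40).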

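Setting $\Delta x = \Delta y$ in Eqs. (21-41) and (21-42) and then substituting into Eq. (21-39),
we find

$$\frac{V(x + \Delta x,\, y) + V(x - \Delta x,\, y) + V(x,\, y + \Delta y) + V(x,\, y - \Delta y) - 4V(x, y)}{(\Delta x)^2} \simeq 0 \qquad \text{for } \Delta x = \Delta y$$

Then multiplying through by $(\Delta x)^2$ and transposing, we obtain

$$V(x, y) \simeq \frac{V(x + \Delta x,\, y) + V(x - \Delta x,\, y) + V(x,\, y + \Delta y) + V(x,\, y - \Delta y)}{4} \qquad \text{for } \Delta x = \Delta y \tag{21-43}$$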


This approximate equality becomes exact when $\Delta x$ and $\Delta y$ go to zero. The right
side of this equation is precisely the average of the values of the function V at the
four symmetrically disposed points surrounding the point x, y. The left side of the
equation is the value of V at x, y. So the equation makes the same statement as the
italicized sentence immediately preceding this small-print material, which was
based on the geometry of Figs. 21-21 and 21-22. If V is a function of the three
coordinates x, y, z, then the three terms of Eq. (21-38) are all present. Consequently, the
three-dimensional equivalent of Eq. (21-43) is an equation involving one-sixth of
the sum of six terms. The first four are similar to those in the numerator of Eq.
(21-43); the last two are $V(x, y, z + \Delta z)$ and $V(x, y, z - \Delta z)$.
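Written out explicitly, the three-dimensional equivalent just described takes the following form. It is stated here under the same kind of equal-spacing condition used for Eq. (21-43), extended to the third coordinate; the equation is unnumbered because it is a restatement for illustration.

```latex
V(x, y, z) \simeq \frac{1}{6}\Bigl[
      V(x + \Delta x,\, y,\, z) + V(x - \Delta x,\, y,\, z)
    + V(x,\, y + \Delta y,\, z) + V(x,\, y - \Delta y,\, z)
    + V(x,\, y,\, z + \Delta z) + V(x,\, y,\, z - \Delta z)
\Bigr]
\qquad \text{for } \Delta x = \Delta y = \Delta z
```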

The  result  expressed  in  Eq.  (21-43)  is  Laplace’s  equation  for  the  two  coordi- 
nates x and  y,  written  in  the  form  of  a difference  equation.  It  is  equivalent  to  the 
differential  equation  form  of  Eq.  (21-39).  Since  numerical  solution  of  equations 
always involves finite differences like $\Delta x$, rather than infinitesimal differences
like dx, numerical solutions to Laplace’s equation are found by applying the
difference equation. The method for doing this involves performing a set of simple
calculations  repeatedly  in  a manner  explained  in  Example  21-10.  But  although  the 
calculations  involve  only  the  simplest  arithmetic,  in  a typical  three-coordinate 
problem  the  numerical  method  requires  a much  larger  set  of  memory  registers 
than  is  available  in  any  but  the  most  sophisticated  programmable  calculators.  So 
the  work  usually  is  done  on  a computer.  However,  it  is  possible  to  obtain  an 
approximate  numerical  solution  to  Laplace’s  equation  for  the  particularly  simple 
two-coordinate  problem  depicted  in  Fig.  21-19  by  using  only  a pencil  and  paper 
for a memory and a manual calculator to carry out the arithmetic. Following
Example 21-10 through will give you a good impression of what goes on in a
computer solution of a more complicated problem.


EXAMPLE  21-10 


Fig.  21-28  A uniform  grid  of  points 
inside  the  system  of  electrodes  shown  in 
Fig.  21-19.  The  electric  potential  at 
these  points  is  evaluated  in  Example  21- 
10. 


Find approximate values of the electric potential V at 16 points arranged in a
uniform 4 × 4 grid on a cross-sectional plane through the region within the system of
long electrodes in Fig. 21-19, and not near their ends.

■ To do this, you construct a uniform 6 × 6 grid of points arranged as shown in
Fig.  21-28.  The  edges  of  this  grid  are  at  the  intersections  of  the  electrodes  with  the 
cross-sectional  plane.  So  the  electric  potential  V at  almost  all  the  edge  points  has  the 
known  values  given  in  the  figure  in  units  of  volts.  The  values  of  V at  the  four  corner 
points  are  ambiguous,  but  they  are  not  needed  in  the  calculation. 

To  start  the  calculation,  you  assign  values  of  V to  all  the  interior  grid  points. 
Since  solving  differential  equations  often  involves  some  form  of  guessing,  it  is  not 
surprising  that  you  must  guess  these  values.  The  final  values  to  be  obtained  do  not 
depend  on  the  values  guessed  at  the  start.  But  the  more  reasonable  the  assumed 
starting  values,  the  more  rapid  will  be  the  process  of  obtaining  the  final  values.  The 
set of values labeled “start” in Table 21-1 shows a crude, but not unreasonable,
initial assignment that you can make for V at the interior grid points.

The  first  step  in  the  calculation  is  to  compute  new  values  of  V at  each  interior 
grid  point  from  the  starting  values  in  the  interior  of  the  grid  and  the  fixed  values  on 
the  boundary  of  the  grid.  Each  new  value  is  obtained  by  taking  the  average  of  the 
starting values at the four surrounding points. This procedure agrees with the
italicized statement preceding the last material in small print and with Eq. (21-43)
derived in that material. For instance, the new value at the upper left interior point
is (1 + 1.00 + 1.00 + 2)/4 = 1.25. You enter this in the upper left interior location
of the set labeled “first iteration” in the table. Then you compute and enter the rest
of  the  new  interior  values  in  the  same  way,  using  always  the  starting  values  in  the 
interior  of  the  grid  and  the  fixed  values  on  its  boundary. 

Next  you  repeat  the  process,  using  the  values  obtained  in  the  first  iteration  to 
produce  those  labeled  “second  iteration.”  The  “iterative”  process  is  continued,  with 
all  values  used  in  computing  averages  at  each  step  of  the  process  being  values  obtained  in  the 
preceding  step.  You  stop  the  process  when  the  change  in  values  obtained  from  one 
iteration  to  the  next  becomes  negligible.  In  this  example  the  values  are  seen  to  be 
adequately  close  to  convergence  at  the  end  of  the  sixth  iteration.  There  is  no  change 
in  most  of  them  from  the  fifth  iteration.  And  the  changes  that  do  occur  are  no 
greater  than  one  digit  in  the  last  decimal  place  retained.  Since  the  numbers  are 
rounded  off  to  that  digit,  none  of  their  accuracies  is  better  than  that,  and  so  there  is 
no  reason  to  continue.  Also,  there  is  no  reason  to  keep  more  decimal  places  in  the 
calculations unless a finer grid is used.
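The iteration just described can be sketched in a few lines of Python. This is a minimal illustration, not part of the original text. The function names `make_grid`, `jacobi_step`, and `relax` are invented for the sketch (the scheme is commonly called Jacobi relaxation), the corner values are set arbitrarily because they are never used, and full floating-point precision is kept instead of rounding to two decimals at each step, so converged values may differ from Table 21-1 in the last digit.

```python
def make_grid(guess=1.0, n=6):
    """Build an n x n grid: fixed boundary values, guessed interior values."""
    grid = [[guess] * n for _ in range(n)]
    for j in range(n):
        grid[0][j] = 2.0       # top edge at 2 V (corners included, never used)
        grid[n - 1][j] = 0.0   # bottom edge at 0 V
    for i in range(1, n - 1):
        grid[i][0] = 1.0       # left edge at 1 V
        grid[i][n - 1] = 1.0   # right edge at 1 V
    return grid


def jacobi_step(grid):
    """Return a new grid whose interior values are the averages of the four
    surrounding values taken from the previous grid."""
    n = len(grid)
    new = [row[:] for row in grid]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            new[i][j] = (grid[i - 1][j] + grid[i + 1][j]
                         + grid[i][j - 1] + grid[i][j + 1]) / 4.0
    return new


def relax(grid, tol=1e-9, max_iter=10_000):
    """Repeat the averaging step until no interior value changes by more than tol."""
    n = len(grid)
    for iteration in range(1, max_iter + 1):
        new = jacobi_step(grid)
        change = max(abs(new[i][j] - grid[i][j])
                     for i in range(1, n - 1) for j in range(1, n - 1))
        grid = new
        if change < tol:
            break
    return grid, iteration


first = jacobi_step(make_grid(1.0))
print(first[1][1])  # upper left interior point after the first iteration: 1.25

final, count = relax(make_grid(1.0))
print(count, [round(v, 2) for v in final[1][1:5]])
```

Calling `relax(make_grid(5.0))` with a different starting guess is one way to explore how the choice of starting values affects the number of iterations and the final values.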




Table 21-1  An Iterative Solution of Laplace’s Equation

Start
   2     2     2     2
1  1.00  1.00  1.00  1.00  1
1  1.00  1.00  1.00  1.00  1
1  1.00  1.00  1.00  1.00  1
1  1.00  1.00  1.00  1.00  1
   0     0     0     0

First iteration
   2     2     2     2
1  1.25  1.25  1.25  1.25  1
1  1.00  1.00  1.00  1.00  1
1  1.00  1.00  1.00  1.00  1
1  0.75  0.75  0.75  0.75  1
   0     0     0     0

Second iteration
   2     2     2     2
1  1.31  1.38  1.38  1.31  1
1  1.06  1.06  1.06  1.06  1
1  0.94  0.94  0.94  0.94  1
1  0.69  0.63  0.63  0.69  1
   0     0     0     0

Third iteration
   2     2     2     2
1  1.36  1.44  1.44  1.36  1
1  1.08  1.11  1.11  1.08  1
1  0.92  0.89  0.89  0.92  1
1  0.64  0.57  0.57  0.64  1
   0     0     0     0

Fourth iteration
   2     2     2     2
1  1.38  1.48  1.48  1.38  1
1  1.10  1.13  1.13  1.10  1
1  0.90  0.87  0.87  0.90  1
1  0.62  0.53  0.53  0.62  1
   0     0     0     0

Fifth iteration
   2     2     2     2
1  1.40  1.50  1.50  1.40  1
1  1.10  1.15  1.15  1.10  1
1  0.90  0.86  0.86  0.90  1
1  0.61  0.51  0.51  0.61  1
   0     0     0     0

Sixth iteration
   2     2     2     2
1  1.40  1.51  1.51  1.40  1
1  1.11  1.15  1.15  1.11  1
1  0.89  0.86  0.86  0.89  1
1  0.60  0.50  0.50  0.60  1
   0     0     0     0


The results obtained are plotted in Fig. 21-29. Does the V(x, y) surface they
define agree with the qualitative one you were asked earlier to obtain for the electrode
system of Fig. 21-19 by using the stretched rubber-sheet analogy and your
intuition?

Are  the  final  results  really  independent  of  those  guessed  at  the  start?  You  can 
answer  this  question  by  repeating  the  calculation,  using  a different  starting  guess. 
You will find that the answer is yes. The reason why the process converges, and
converges to values that do not depend on the starting values for the interior grid
points, is that at each step the same values are always used for the boundary points.
Because of the way that the averages must be computed in accordance with
Laplace’s equation, the boundary values will always ultimately “impose their will” on
the interior values.

What would you do to obtain the electric field ℰ from the electric potential V?
How could the calculation be extended to determine the values of V, and ℰ, at a
cross-sectional plane passing through the ends of the electrodes?
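One possible approach to the first question can be sketched in Python. Since the electric field is the negative gradient of the electric potential, its components at an interior grid point can be estimated from the differences between neighboring values of V. The function name `field_at`, the grid spacing `h`, and the sample potential are assumptions made for illustration only; the spacing of the grid in Fig. 21-28 is not given in the text.

```python
def field_at(V, i, j, h):
    """Estimate the field components at interior grid point (i, j).

    V is a list of rows of potential values in volts, and h is the spacing
    between neighboring grid points in meters (an assumed input).
    Row index i increases downward; column index j increases to the right.
    Returns (E_right, E_down) in V/m, using central differences.
    """
    e_right = -(V[i][j + 1] - V[i][j - 1]) / (2.0 * h)
    e_down = -(V[i + 1][j] - V[i - 1][j]) / (2.0 * h)
    return e_right, e_down


# Hypothetical test potential: V drops by 0.5 V per row and is uniform along each row.
h = 0.01  # assumed grid spacing of 1 cm
V = [[2.0 - 0.5 * i for _ in range(4)] for i in range(4)]
print(field_at(V, 1, 1, h))
```

For this test potential the estimated field has no sideways component and points downward, from higher to lower potential.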


986  The  Electric  Potential 


Fig. 21-29 A plot of the electric potential V(x, y) inside the system of electrodes
shown in Fig. 21-19.


21-6 CAPACITORS AND CAPACITANCE

In its simplest form, a capacitor consists of two closely spaced metal plates,
with charge +|q| on one and charge −|q| on the other. Because of the presence
of the charges, the electric potential has different values at the two
plates. If we take the value of the electric potential at the negatively
charged plate to be zero, its value V at the positively charged plate is the
electric potential difference between the plates. The values of |q| and V
prove to be proportional for a particular capacitor. Hence they are related
by the equation |q| = CV. The proportionality constant C, called the
capacitance, depends on only the geometry of the capacitor and the nature of the
insulating material between the plates. The name is appropriate because
the equation |q| = CV shows that the greater the value of C, the larger the
capacity to hold charge |q| for a given electric potential difference V. In fact,
one of the principal uses of capacitors is for the (temporary) storage of
electric charge.

In  this  section  and  the  next  we  investigate  the  basic  properties  of 
capacitors  with  simple  geometries  and  vacuum  (or  its  practical  equivalent, 
air)  between  the  plates.  In  Sec.  21-8  we  learn  how,  given  a capacitor  of  a 
particular  size  and  shape,  to  increase  its  capacitance  considerably  by  filling 
the  region  between  the  plates  with  certain  types  of  insulating  materials. 
Many  applications  of  capacitors  are  considered  in  subsequent  chapters. 

We now proceed to a development of the equation |q| = CV. Consider
an electrically neutral piece of metal of any shape, which we call electrode
1. We give it a charge −|q| by adding electrons. These electrons must come
from somewhere. Assume that they came from a single other initially
neutral piece of metal, electrode 2, there being nothing else in the vicinity.
Then electrode 2 must have been given a charge +|q| at the same time that
electrode 1 was given the charge −|q|. The situation is illustrated in Fig.
21-30. When the electrodes are charged, there is an electric field ℰ in their
vicinity, which has similarities to the electric field of an electric dipole.
Because of this electric field the electric potential has different values at the
two electrodes.

Fig. 21-30 Two electrodes of arbitrary shape and carrying charge
of the same magnitude but opposite sign. The curve between the
two is an arbitrary path of integration used to compute the
difference in the electric potential at the two electrodes.

We are interested in the difference between the values of the electric
potential at the two electrodes. As in Sec. 21-5, we can avoid the need of
introducing a symbol such as ΔV to represent this quantity by choosing the
reference location required to specify values of the electric potential to be
at the negatively charged electrode 1. Then the electric potential has the
value zero there and some other value V at the positively charged electrode
2. That value V is also the electric potential difference between the two
electrodes. To be specific, the electric potential difference V between
electrodes 1 and 2 is


$$V = V_2 - V_1 \tag{21-44a}$$

Using Eq. (21-21), we can write this as

$$V = -\int_1^2 \boldsymbol{\mathcal{E}} \cdot d\mathbf{s} \tag{21-44b}$$


The integral is over the elements ds of any path from any point on the
surface of electrode 1 to any point on the surface of electrode 2. As is
illustrated in Fig. 21-30, the general direction of ℰ is from the positively
charged electrode 2 to the negatively charged electrode 1, whereas ds is
generally in the opposite direction. Thus ℰ · ds most often has a negative
value, and so the integral in Eq. (21-44b) has a negative value. Because of
the minus sign in the equation, the value of V is positive.

The value of the electric potential difference V does not depend on the
path used in Eq. (21-44b) to integrate ℰ · ds between the two electrodes.
But it does depend on the electric field ℰ. In turn, ℰ depends on two
factors. First, ℰ depends on the geometry of the system, that is, on the
shapes of the electrodes and on their separation. Except for a few highly
symmetrical cases, the dependence is complicated. Second, ℰ depends on
the magnitude |q| of the charges on the electrodes. But this dependence is
very simple, no matter what the geometry of the system. If we imagine
doubling the value of |q|, this will lead to a distribution of individual charges
over the two electrodes that has the same pattern as before, but with twice
the original charge density everywhere. The new charge distribution will
lead to an electric field with field lines of the same shape as the original one,
but with twice the field-line density everywhere because there is twice as
much charge on which lines begin and end. In other words, increasing |q|
by some factor will cause an increase in the magnitude of ℰ by the same
factor at all locations. Equation (21-44b) shows that this, in turn, will cause
V to increase by that factor. So V and |q| are proportional to each other.
Note that this is true for two electrodes of any geometry, and not just for
electrodes in the form of a pair of closely spaced plates.

The most useful way to express this universal proportionality is by
writing the equality

$$|q| = CV \tag{21-45a}$$

Here C is a proportionality constant, whose value depends on only the
geometry of the system (assuming, as we have, that the two electrodes are
in vacuum). For the reasons indicated earlier, C is called the capacitance of
the two-electrode system, and the system itself is known as a capacitor.
Thus Eq. (21-45a) says that the charge on either electrode of a capacitor is given
by the product of its capacitance and the electric potential difference between the
electrodes.




An explicit expression for the quantity C in Eq. (21-45a) serves as a
definition of capacitance. Solving that equation for C, we have

$$C = \frac{|q|}{V} \tag{21-45b}$$

The capacitance of a capacitor is defined as the charge on either electrode divided by
the electric potential difference between the electrodes.

The SI unit of capacitance is called the farad (F), in honor of Michael
Faraday. If a capacitor has a capacitance of 1 F, then the positive and
negative charges on its two electrodes have magnitudes of 1 C when their
potential difference is 1 V. The values of C ordinarily encountered are much
smaller, so submultiples frequently are used. One of these is the
microfarad (μF). Another is the picofarad (pF), which is called the
micro-microfarad (μμF) in older literature. The relations are

$$1\ \text{F} = 1\ \text{C/V} \tag{21-46a}$$

$$1\ \mu\text{F} = 10^{-6}\ \text{F} \tag{21-46b}$$

$$1\ \text{pF} = 10^{-12}\ \text{F} \tag{21-46c}$$

Fig. 21-31 A plane-parallel capacitor.
Relative to practical capacitors, the
separation between the plates has been
exaggerated for clarity, and the dimensions
of the plates have been reduced for
convenience.

In all cases the capacitance C of a capacitor can be determined
experimentally by charging a capacitor so that the difference between the electric
potential of its two electrodes has a value measured by an appropriate
meter to be V, by completely discharging it through a meter capable of
measuring the charge |q| flowing from one electrode to the other as it
discharges, and then by evaluating C = |q|/V. In certain cases the geometry is
symmetrical enough for C to be obtained by calculation.


By far the most frequently encountered of these symmetrical cases is
the plane-parallel capacitor indicated in Fig. 21-31. The vertical lines
represent the intersections with the page of two metal plates in parallel planes
that are separated by a distance d, with the dimensions of the plates being
very large compared to d. Each of these plates has an area a. One carries
charge +|q| and the other charge −|q|. The horizontal lines connected to
the plates represent wires used to conduct the charges to them.

We now determine the total electric field due to the charges of the
system by evaluating the separate fields due to the charges on each plate
and then adding these two fields. On a particular plate, each charge is
repelled by all the other charges of like sign. So the charges tend to distribute
themselves uniformly over the plate. In fact, the charge distribution is
uniform on each plate, except near its edges. But since the plate is very large,
all but a very small part of the charge is located well inside the edges.
Therefore, to a good approximation we can ignore the edges and treat
each plate as if it had the same uniform charge distribution as the “infinite”
sheet of charge depicted in Fig. 20-29. The results obtained by considering
that figure tell us that the positively charged plate produces a uniform
electric field ℰ₊, which is everywhere directed perpendicularly away from the
plate. According to Eq. (20-45), its magnitude is



$$\mathcal{E}_+ = \frac{|\sigma|}{2\epsilon_0}$$

where |σ| is the magnitude of the charge per unit area on either plate. That
is,

$$|\sigma| = \frac{|q|}{a}$$

The negatively charged plate gives rise to exactly the same type of
electric field, except that it is everywhere directed perpendicularly toward that
plate. So its magnitude is

$$\mathcal{E}_- = \frac{|\sigma|}{2\epsilon_0}$$

The total electric field ℰ produced by the charges on the two plates is

$$\boldsymbol{\mathcal{E}} = \boldsymbol{\mathcal{E}}_+ + \boldsymbol{\mathcal{E}}_-$$

The electric fields ℰ₊, ℰ₋, and ℰ are indicated in Fig. 21-31. In the regions
outside the capacitor plates ℰ₊ is always directed oppositely to ℰ₋, and so ℰ
is everywhere zero. In the region between the plates ℰ₊ has the same
direction as ℰ₋. Thus the total field ℰ in this inner region has the magnitude

$$\mathcal{E} = \mathcal{E}_+ + \mathcal{E}_- = 2\mathcal{E}_+$$

Evaluating in terms of |σ| and ε₀, we have

$$\mathcal{E} = \frac{|\sigma|}{\epsilon_0} \tag{21-47}$$

The direction of ℰ is from the positively charged plate to the negatively
charged one since it is the direction of the force on a positive test charge
placed between the plates. (Note that with this electric field the charges
reside on the inner surface of each capacitor plate.)


The simplicity of the electric field ℰ between the plates of a
plane-parallel capacitor makes it easy to evaluate the electric potential difference
V between the plates. In Eq. (21-44b),

$$V = -\int_1^2 \boldsymbol{\mathcal{E}} \cdot d\mathbf{s}$$

we choose a straight integration path from some point on the negatively
charged plate 1 to a point which is directly opposite it on the positively
charged plate 2. Figure 21-31 shows that ℰ and ds are everywhere
antiparallel on this path, and we have ℰ · ds = −ℰ ds, with ℰ on the right
denoting the magnitude of the field. Furthermore ℰ has the
constant value given by Eq. (21-47). So we have

$$V = \int_1^2 \mathcal{E}\, ds = \mathcal{E} \int_1^2 ds$$

The last integral is just equal to d, the length of the path. Thus we
obtain

$$V = \mathcal{E} d \tag{21-48a}$$




Fig.  21-32  A stretched  rubber-sheet  analogue  showing  the  electric 
potential  in  the  region  near  the  edges  of  the  plates  of  a plane-parallel 
capacitor,  and  also  in  the  region  well  inside  the  edges.  All  parts  of  the 
frame  holding  the  rubber  sheet  in  tension  are  very  far  away. 


Using Eq. (21-47) with |σ| = |q|/a in Eq. (21-48a), we have

$$V = \frac{|q| d}{\epsilon_0 a} \tag{21-48b}$$

where d is the separation between the plates.

Now we can compute the capacitance C of the plane-parallel capacitor
from its defining relation, Eq. (21-45b),

$$C = \frac{|q|}{V}$$

Evaluating V from Eq. (21-48b), we find

$$C = \frac{|q|}{|q| d / \epsilon_0 a}$$

or

$$C = \frac{\epsilon_0 a}{d} \tag{21-49}$$

If we neglect the effects of its edges, the capacitance of a plane-parallel
capacitor is proportional to the area a of its plates and inversely proportional to their
separation d. The permittivity constant ε₀ is the proportionality constant. Note
that the capacitance depends essentially on the geometrical properties of
the system of two electrodes.

Figure 21-32 illustrates a stretched rubber-sheet analogue to a
plane-parallel capacitor. It shows a uniformly varying electric potential V
everywhere in the region between the plates at a distance from the edges
somewhat larger than the separation between the plates. This corresponds
to a uniform electric field ℰ in the region. Near the edges the behavior of ℰ
is more complicated. The pattern of electric field lines produced by a
plane-parallel capacitor is illustrated in Fig. 21-33a.

The  results  that  we  have  obtained  concerning  capacitors  are  applied  to 
a specific  case  in  Example  21-11. 


EXAMPLE  21-11 


A plane-parallel capacitor has circular plates of radius r = 10.0 cm, separated by a
distance  d = 1.00  mm.  How  much  charge  is  stored  on  each  plate  when  their  electric 
potential  difference  has  the  value  V = 100  V? 




Fig. 21-33 (a) The electric field lines of a standard plane-parallel
capacitor. Well inside the edges of the plates, the field lines are very
uniform. But near the edges they bow out to form the “fringing field”
outside the capacitor plates. (b) In a guarded capacitor, ring-shaped
electrodes surround the capacitor plates. Each guard ring is insulated
from the adjacent plate, but is kept at the same electric potential by an
independent circuit not shown in the figure. The electric field lines
between the guard rings are nonuniform. The electric field lines between
the capacitor plates are very uniform, even up to the edges.


■ To calculate C, you first evaluate the plate area

$$a = \pi r^2 = \pi (1.00 \times 10^{-1}\ \text{m})^2 = 3.14 \times 10^{-2}\ \text{m}^2$$

From Eq. (21-49), you have for the capacitance

$$C = \frac{\epsilon_0 a}{d} = \frac{8.85 \times 10^{-12}\ \text{C}^2/(\text{N} \cdot \text{m}^2) \times 3.14 \times 10^{-2}\ \text{m}^2}{1.00 \times 10^{-3}\ \text{m}} = 2.8 \times 10^{-10}\ \text{F} = 2.8 \times 10^{2}\ \text{pF}$$

It is not appropriate to quote the capacitance to more than two significant figures
because you evaluated it from Eq. (21-49), which ignores the effects of the edges of
the capacitor plates. Figures 21-32 and 21-33a show that these effects occur in a
region whose radial extent Δr is comparable to the separation d of the plates. Hence
the ratio Δr/r ≈ d/r = 10⁻³ m/10⁻¹ m = 1 percent gives a measure of the accuracy
to be expected from the equation.

To determine the magnitude |q| of the charge stored on either plate of the
capacitor, you evaluate from Eq. (21-45a)

$$|q| = CV = 2.8 \times 10^{-10}\ \text{F} \times 1.00 \times 10^{2}\ \text{V} = 2.8 \times 10^{-8}\ \text{C}$$

By using the arrangement shown in Fig. 21-33b it is possible to
eliminate the effects of the edges of a plane-parallel capacitor. The guard rings
surrounding the capacitor are separately maintained at the same electric
potentials as the capacitor plates. The “edge effects” are thus removed
from the capacitor itself to the guard rings. Consequently, the electric field
between the capacitor plates is uniform, up to their very edges, and the
field in the capacitor accurately satisfies the assumption made in deriving
Eq. (21-49), C = ε₀a/d. So this equation can be applied with accuracy to
evaluate C. Guarded capacitors are used to provide standard values of
capacitance for laboratory calibration purposes. They can also be used to
determine the value of the permittivity constant ε₀. Accurate measurements
are made of the area a and separation d of the plates of the capacitor within
the guard rings. Next its capacitance C is determined from Eq. (21-45b),
C = |q|/V, by measuring the magnitude |q| of the charge on the plates when
there is a measured electric potential difference V between them. Then Eq.
(21-49), in the form ε₀ = Cd/a, is used to determine ε₀.

In 1909 R. A. Millikan exploited the fact that the electric field well
inside a plane-parallel capacitor has a uniform and easily determined
magnitude in the first experiment which demonstrated that all electrons have the
same charge, and which measured its magnitude e. Taken together with J. J.
Thomson’s 1897 experiment (described in Chap. 23) showing that all
electrons have the same charge-to-mass ratio e/m, Millikan’s work established
the existence of the electron as a particle of specific charge and mass. We
give a simplified account of Millikan’s experimental procedure.

The procedure is based on the observation that when a liquid is
sprayed into fine drops by an atomizer (like a perfume sprayer), the
friction results in the presence of a very small amount of charge on most of the
drops. Drops of oil, charged in this way, are sprayed into the central region
of a plane-parallel capacitor with horizontal plates. While a microscope is
used to watch a particular drop, the electric potential difference between
the capacitor plates is adjusted in sign and magnitude so that an upward
electric force of magnitude |q|ℰ acting on the drop just supports it against
the downward gravitational force of magnitude Mg. In this equilibrium
condition

$$|q| \mathcal{E} = Mg$$

where |q| is the magnitude of the charge on the oil drop, M is its mass, g is
the gravitational acceleration, and ℰ is the magnitude of the electric field.

In the central region of a plane-parallel capacitor the electric field is
uniform and has a magnitude given by Eq. (21-48a) to be

$$\mathcal{E} = \frac{V}{d}$$

Here V is the electric potential difference between the capacitor plates, and
d is their separation. Thus the equilibrium condition is

$$|q| \frac{V}{d} = Mg$$

or

$$|q| = \frac{Mgd}{V}$$

To determine the oil drop mass M, the capacitor is discharged by
connecting a conducting wire between its plates, so that no electric force acts
on the drop. It then falls under the influence of gravity, rapidly reaching a
constant terminal speed v. The value of v is measured with the aid of a
graduated distance scale in the microscope and an accurate clock. A
terminal speed is reached because of the effect of fluid friction between the drop
and the air through which it falls. The fluid friction obeys Stokes’ law
because the drops are very small and they are moving very slowly. Hence Eq.
(4-26) applies and requires that at terminal speed

$$Mg = 6\pi \eta r v$$

Here η is the coefficient of viscosity of air, the value of which is determined
in a separate measurement, and r is the radius of the drop. The radius also
is related to the mass of the drop and the known density ρ of the oil by the
equation

$$M = \tfrac{4}{3} \pi r^3 \rho$$

By eliminating r between the last two equations, an expression is obtained
which makes it possible to determine M. Then by using its value in the
expression for |q|, the magnitude of the charge on the oil drop, this charge can
be evaluated.
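The elimination of r can be illustrated with a short Python sketch. Setting (4/3)πr³ρg equal to 6πηrv gives r² = 9ηv/(2ρg), from which M and then |q| = Mgd/V follow. The function names and all numerical inputs below are hypothetical values chosen for illustration, not Millikan's data, and the sketch follows the simplified account in neglecting the buoyant force of the air.

```python
import math


def drop_radius(v, eta, rho, g=9.80):
    """Drop radius in m from terminal speed v, using (4/3) pi r^3 rho g = 6 pi eta r v."""
    return math.sqrt(9.0 * eta * v / (2.0 * rho * g))


def drop_mass(v, eta, rho, g=9.80):
    """Drop mass in kg from M = (4/3) pi r^3 rho, with r found from the terminal speed."""
    r = drop_radius(v, eta, rho, g)
    return (4.0 / 3.0) * math.pi * r**3 * rho


def drop_charge(v, V, d, eta, rho, g=9.80):
    """Magnitude of the drop charge in C from the equilibrium condition |q| = M g d / V."""
    return drop_mass(v, eta, rho, g) * g * d / V


# Hypothetical inputs, for illustration only.
eta = 1.8e-5   # viscosity of air in Pa s (approximate)
rho = 900.0    # assumed oil density in kg/m^3
v = 1.0e-4     # assumed measured terminal speed in m/s
d = 5.0e-3     # assumed plate separation in m
V = 100.0      # assumed balancing potential difference in V

q = drop_charge(v, V, d, eta, rho)
print(f"|q| = {q:.3g} C, about {q / 1.602e-19:.1f} electron charges")
```

With real data, repeating the calculation for many drops would yield the set of |q| values whose spacing reveals the size of the elementary charge.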

After studying many different oil drops, Millikan found that within
the accuracy of his measurements all the values obtained for |q| could be
fitted by the formula

$$|q| = ne$$

where n is a small integer and the charge e is a constant. He correctly
interpreted this result to mean that the charge on any oil drop consisted of an
integral number of electron charges, e being the magnitude of that charge.
In work done in 1913 he found the value of e to be 1.592 × 10⁻¹⁹ C, not too
far from the best modern value 1.602 × 10⁻¹⁹ C.

In electric circuits, capacitors are often connected in various ways. We
now consider the two simplest and most important of these ways. A parallel
connection of two capacitors is shown in Fig. 21-34a, and a series
connection is shown in Fig. 21-35a. These figures use the standard electrical
symbols for capacitors and the conducting wires leading from their plates.
The symbol is reminiscent of an actual drawing of a plane-parallel
capacitor with two parallel lines representing its plates and two more lines
representing the wires through which charge flows onto the plates. But the
symbol is used to represent a capacitor of any geometry.

Can the two capacitors C₁ and C₂ which are connected in parallel be
replaced with a single capacitor of capacitance C whose electrical properties
are identical to those of the pair? If so, what must the value of C be? And
can the capacitors C₁ and C₂ which are connected in series be replaced with
a single capacitor of capacitance C whose electrical properties are identical
to those of the pair? If so, what must the value of C be in this case?

In both the parallel and series cases, such a replacement is indeed
possible. In the parallel case, the replacement capacitor must have the
capacitance

$$C = C_1 + C_2 \qquad \text{for parallel connection} \tag{21-50}$$


Fig. 21-34 (a) Two capacitors of capacitance C₁ and C₂ connected
in parallel. (b) A single equivalent capacitor of capacitance C.

Fig. 21-35 (a) Two capacitors of capacitance C₁ and C₂ connected
in series. (b) A single equivalent capacitor of capacitance C.




In the series case, the replacement capacitor must have a value of C which
satisfies the equation

$$\frac{1}{C} = \frac{1}{C_1} + \frac{1}{C_2} \qquad \text{for series connection} \tag{21-51}$$

The proof of these statements follows.


To prove Eq. (21-50), note that the electric potential difference V is the
same across the plates of both of the parallel-connected capacitors in Fig.
21-34a. This is because the wire connecting the plates on the left makes
them into a single, continuous conducting surface and thus constrains
them to be at the same electric potential. This is also true of the plates on
the right. Since the electric potential difference between the plates of both
capacitors has the same value V, the magnitudes of the charges on their
plates are |q|₁ = C₁V and |q|₂ = C₂V. The magnitude of the total charge on
the plates connected to a, or to b, is

$$|q| = |q|_1 + |q|_2 = C_1 V + C_2 V$$

or

$$|q| = (C_1 + C_2) V$$

Now consider the single capacitor shown in Fig. 21-34b, and let the
electric potential difference V between its plates be equal to the value of V
for the parallel-connected capacitors in Fig. 21-34a. Then the charge |q| on
its plates is given by the relation

$$|q| = CV$$

If the single capacitor is to have electrical properties identical to those of
the parallel-connected capacitors, then when the values of V are equal, the
values of |q| must also be equal. Comparison of the two equations displayed
above shows that this will be true, providing that C = C₁ + C₂, as stated in
Eq. (21-50).

There is a very straightforward interpretation of Eq. (21-50) for the
case of two adjacent plane-parallel capacitors having the same plate
separation d and connected in parallel. Can you explain what it is?


To prove Eq. (21-51), imagine that the series-connected capacitors are
charged by connecting point a to the positive terminal of a battery and
point b to its negative terminal. Electrons will flow out of one battery
terminal and into the other, so that a charge +|q| is placed on the left plate of C₁
and a charge −|q| is placed on the right plate of C₂. These charges are
responsible for an electric field along the wire connecting the other two
plates. The electric field makes electrons in the wire flow until the right
plate of C₁ has charge −|q| and the left plate of C₂ has charge +|q|. When
this equilibrium situation is achieved, there is no more electric field along
the wire to make electrons flow. You can see this by noting that when the
right plate of C₁ has charge −|q|, then along the wire the electric field of the
charge on that plate will just cancel the electric field of the charge +|q| on
the left plate of C₁. And in the equilibrium situation the same kind of
cancellation occurs for C₂. Thus the charges on both plates of both capacitors
end up with the same magnitude |q|. Then the electric potential difference
across one capacitor is V₁ = |q|/C₁, and that across the other is V₂ = |q|/C₂.



So the total electric potential difference from a to b is

$$V = V_1 + V_2 = \frac{|q|}{C_1} + \frac{|q|}{C_2}$$

or

$$V = |q| \left( \frac{1}{C_1} + \frac{1}{C_2} \right)$$

For the single capacitor shown in Fig. 21-35b, when the electric
potential difference V between its plates is equal to the value of V for the
series-connected capacitors in Fig. 21-35a, the charge |q| on its plates
satisfies the relation

$$V = |q| \, \frac{1}{C}$$

The single capacitor will be identical in its electrical properties to the
series-connected capacitors if this value of |q| equals the value given by the
equation displayed at the end of the last paragraph. Comparison shows
that this will be true if 1/C = 1/C₁ + 1/C₂, as stated in Eq. (21-51).

Can  you  come  up  with  a physical  interpretation  of  Eq.  (21-51)  for  the 
case  of  two  series-connected,  plane-parallel  capacitors  with  plates  having 
identical  dimensions?  Imagine  that  the  wire  connecting  the  capacitors 
shrinks  to  zero  length,  so  that  the  adjacent  plates  merge  into  a single  metal 
plate.  Then  ask  yourself  what  role,  if  any,  this  plate  plays  in  the  system. 

Example  21-12  involves  the  rules  for  calculating  the  capacitance  of  a 
set  of  capacitors  connected  in  series  or  in  parallel. 


EXAMPLE 21-12

If you need a capacitor with $C = 0.25\ \mu\text{F}$, but the only ones in the storeroom have
$C = 1.00\ \mu\text{F}$, must you delay finishing your experiment?

■ No. You can connect four of the available capacitors in series. The first two,
taken as a system, have a capacitance given by

$$\frac{1}{C} = \frac{1}{1.00\ \mu\text{F}} + \frac{1}{1.00\ \mu\text{F}} = \frac{2}{1.00\ \mu\text{F}}$$

or

$$C = 0.50\ \mu\text{F}$$

The system composed of the second two has the same capacitance. So the
capacitance of the series-connected combination of all four is $C'$, where

$$\frac{1}{C'} = \frac{1}{0.50\ \mu\text{F}} + \frac{1}{0.50\ \mu\text{F}} = \frac{2}{0.50\ \mu\text{F}}$$

or

$$C' = 0.25\ \mu\text{F}$$

You should prove that any number of capacitors with $C_1, C_2, C_3, \ldots$
connected in series is equivalent to a single one whose capacitance is given by

$$\frac{1}{C} = \frac{1}{C_1} + \frac{1}{C_2} + \frac{1}{C_3} + \cdots \qquad \text{for series connection} \qquad \text{(21-52)}$$

Also prove that when they are connected in parallel, the equivalent capacitance is

$$C = C_1 + C_2 + C_3 + \cdots \qquad \text{for parallel connection} \qquad \text{(21-53)}$$
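
The series and parallel rules of Eqs. (21-52) and (21-53) are easy to check numerically. The following Python sketch is only an illustration: the function names `series_capacitance` and `parallel_capacitance` are hypothetical helpers, not anything defined in the text. It reproduces the four-capacitor result of Example 21-12 in two ways.

```python
def series_capacitance(capacitances):
    """Equivalent capacitance for a series connection, Eq. (21-52)."""
    return 1.0 / sum(1.0 / c for c in capacitances)


def parallel_capacitance(capacitances):
    """Equivalent capacitance for a parallel connection, Eq. (21-53)."""
    return sum(capacitances)


# Four 1.00-microfarad capacitors in series, as in Example 21-12.
c_four_in_series = series_capacitance([1.00e-6] * 4)

# The same result built up pairwise, following the worked solution.
c_pair = series_capacitance([1.00e-6, 1.00e-6])
c_two_pairs = series_capacitance([c_pair, c_pair])

print(c_four_in_series, c_pair, c_two_pairs)
```

Both routes give the same capacitance, as they must, since the reciprocal sum in Eq. (21-52) does not depend on how the terms are grouped.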


Then describe two different connections of the 1.00-μF capacitors that could be
used to produce a capacitance of 0.67 μF.


21-7  ENERGY IN CAPACITORS AND ELECTRIC FIELDS

Fig. 21-36  An infinitesimal amount of positive charge, $d|q|$, transferred from
the negatively charged plate of a capacitor to the positively charged plate.



A capacitor  in  an  electrical  system  has  properties  very  much  like  those  of  a 
spring  in  a mechanical  system.  Suppose  that  you  have  partly  charged  a 
capacitor  and  are  continuing  the  process.  You  must  add  electrons  to  the 
electrode  that  already  has  a surplus  of  electrons.  In  so  doing,  you  must 
overcome  the  repulsion  of  like  charges.  Also,  you  must  remove  electrons 
from  the  electrode  that  already  has  a deficiency  of  electrons  and  thus  has  a 
net  positive  charge.  This  operation  is  opposed  by  the  attraction  of  unlike 
charges.  So  an  ever-increasing  force  must  be  applied  to  the  charges  you 
move successively when you continue charging the electrodes. By Coulomb’s
law, the strength of the force is proportional to the amount of
charge the electrodes already hold. Compare this process to stretching a
spring.  If  it  is  already  extended,  it  takes  force  to  extend  it  further.  And  by 
Hooke’s  law,  the  force  is  proportional  to  the  amount  it  is  already  extended. 

Just  as  work  is  done  by  the  force  required  to  change  the  length  of  a 
spring from its relaxed value, so is work done by the force required to
change  the  charge  on  capacitor  electrodes  from  their  uncharged  state.  In 
both  cases  the  work  appears  as  potential  energy  stored  in  the  system.  We 
will  evaluate  this  potential  energy. 

Assume the electrodes in Fig. 21-36 already have charges of the opposite
sign and the same magnitude $|q|$. Then the electric potential at the
positively charged electrode will differ from that at the negatively charged
electrode by the positive quantity $V$. Let us apply the definition of electric
potential in terms of electric potential energy and amount of test charge,
to evaluate the increase in the electric potential energy of the system when
additional positive charge is transferred from the negative electrode of the
capacitor to its positive electrode. If a test charge $q_t$ is moved in this manner
through the electric potential difference $V$, there will be a difference $U$
between the value of the electric potential energy of the system after it is
moved and the value of this quantity before it is moved. According to Eq.
(21-6), the relation among these quantities is $U = Vq_t$. But we can apply this
relation only if there is no appreciable increase in $V$ during the charge
transfer as a result of the transfer itself increasing the charge on the
capacitor by an appreciable amount. To ensure the applicability of the relation,
we let $q_t$ be infinitesimal. We write it as $d|q|$, because the transfer leads to a
change in the charge $|q|$ on the electrodes. Since the transferred charge is
infinitesimal, the electric potential energy difference arising from the
transfer also will be infinitesimal. Its value is

$$dU = V\,d|q|$$


Now $|q| = CV$, and $C$ is a constant. Thus we have $d|q| = C\,dV$, and we can
express the infinitesimal potential energy change as

$$dU = VC\,dV$$


The total change in this potential energy when the capacitor is brought
to a final charged state with $V = V_f$ from an initial uncharged state with
$V = 0$ is obtained by integrating. During the process the potential energy
changes to $U = U_f$ from the initial value $U = 0$, so we have

$$\int_0^{U_f} dU = \int_0^{V_f} VC\,dV$$




Since $C$ is a constant, we can write this as

$$\int_0^{U_f} dU = C\int_0^{V_f} V\,dV$$

Evaluating the integrals, we find

$$U_f = \frac{CV_f^2}{2}$$

Now the unneeded subscript $f$ can be dropped, to give the result

$$U = \frac{CV^2}{2} \qquad \text{(21-54)}$$

The  quantity  U is  the  potential  energy  stored  in  a capacitor  of  capacitance 
C when  it  is  charged  so  that  the  electric  potential  difference  between  its 
electrodes  is  V. 

Using $|q| = CV$, we can express the potential energy in the capacitor in
the alternative form

$$U = \frac{q^2}{2C} \qquad \text{(21-55)}$$

In this form we can see again the analogy between the distortion $x$ of a
spring and the charge $q$ on capacitor plates. For a spring the potential
energy is given by the familiar expression

$$U = \frac{kx^2}{2}$$

where  k is  the  Hooke’s-law  constant  that  specifies  the  stiffness  of  the  spring. 
Comparison  shows  that  q is  analogous  to  x and  that  1/C  is  analogous  to  k. 
Can  you  explain  why  the  reciprocal  of  its  capacitance  is  a measure  of  the 
“stiffness”  of  a capacitor? 
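
The integration leading to Eqs. (21-54) and (21-55) can be mimicked numerically by transferring the charge in many small steps and adding up $dU = V\,d|q|$ for each one. The sketch below is only an illustration: the function name `charging_energy` and the values chosen for $C$ and $V$ are assumptions, not taken from the text.

```python
def charging_energy(capacitance, final_voltage, steps=100_000):
    """Add up dU = V d|q| while charging from V = 0 to final_voltage."""
    final_charge = capacitance * final_voltage
    dq = final_charge / steps
    energy = 0.0
    for n in range(steps):
        # Charge already on the plates at the midpoint of this small transfer.
        q_mid = (n + 0.5) * dq
        # Potential difference at that moment is |q|/C.
        energy += (q_mid / capacitance) * dq
    return energy


C = 1.00e-6   # capacitance in farads (assumed value)
V = 100.0     # final potential difference in volts (assumed value)

U_numeric = charging_energy(C, V)
U_from_V = 0.5 * C * V**2               # Eq. (21-54)
U_from_q = (C * V) ** 2 / (2.0 * C)     # Eq. (21-55)

print(U_numeric, U_from_V, U_from_q)
```

Because the potential difference grows in direct proportion to the charge already transferred, the step-by-step sum agrees with the closed-form results, just as the work done in stretching a Hooke's-law spring agrees with $kx^2/2$.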


We  associate  the  energy  stored  in  a charged  capacitor  with  the  electric 
field  of  the  capacitor.  After  all,  the  work  expended  in  charging  it  is  done  by 
the  force  exerted  to  overcome  the  effect  of  the  electric  field  on  the  charge 
moved  from  one  electrode  to  the  other.  Since  the  electric  field  is  distributed 
throughout  the  region  near  the  capacitor  electrodes,  it  is  reasonable  to  say 
that  the  energy  in  the  field  is  similarly  distributed. 

The relation between the strength of the electric field in a certain
element of volume and the energy contained in the volume element is
particularly easy to obtain by considering a guard-ring plane-parallel capacitor. In
such a capacitor, the electric field of the charge on the plates is accurately
the same everywhere in the region between the plates and is zero
elsewhere. Its magnitude $\mathcal{E}$ in the region between the plates can be evaluated in
terms of the electric potential difference $V$ across the plates and their
spacing $d$ by using Eq. (21-48a):

$$\mathcal{E} = \frac{V}{d}$$

Later it will prove very convenient to express the energy stored in the
electric field in terms of its energy per unit volume. This quantity is called the
electric field energy density $\rho_e$. Since it is intimately associated with the
electric field, it must be constant everywhere in the region between the
capacitor plates (where $\mathcal{E}$ is constant) and zero everywhere outside that
region (where $\mathcal{E}$ is zero). The volume of the region is $ad$, where $a$ is the area
of either plate. So we have

$$\rho_e = \frac{U}{ad} = \frac{CV^2}{2ad}$$

where we have used Eq. (21-54) to evaluate the total energy $U$ in the electric
field. Since $V = \mathcal{E}d$, this can be written

$$\rho_e = \frac{C\mathcal{E}^2 d}{2a}$$

Then we evaluate $C$ for the guard-ring plane-parallel capacitor from Eq.
(21-49),

$$C = \frac{\epsilon_0 a}{d}$$

and obtain immediately

$$\rho_e = \frac{\epsilon_0\mathcal{E}^2}{2} \qquad \text{(21-56)}$$

We have shown that the electric field energy density is one-half the
permittivity constant times the square of the field strength. Although we
have obtained this important result by treating a special case where $\mathcal{E}$ is
constant everywhere that it is nonzero, the result is valid no matter how $\mathcal{E}$
varies. The reason is that Eq. (21-56) relates the value of $\rho_e$ in the
immediate vicinity of a point in the field to the value of $\mathcal{E}$ at that point, and this
relation is unaffected by what these quantities do at some other point. The
full significance of Eq. (21-56) will not become apparent until we make an
in-depth study of the properties of electric and magnetic fields in the
chapter on electromagnetic waves. At an earlier stage we will see the
significance of Eqs. (21-54) and (21-55). One or the other of these is usually the
most convenient equation to employ in dealing with the energy content $U$
of a specific capacitor because they relate it by direct proportion to the
readily measured quantities $V^2$, the square of the electric potential
difference between its plates, or to $q^2$, the square of the charge on them.

Examples  21-13  and  21-14  employ  relations  developed  in  this  section. 


EXAMPLE  21-13 

Calculate the electric field, the electric field energy density, and the energy
stored in the plane-parallel capacitor of Example 21-11. It has circular plates of
radius 10.0 cm separated by 1.00 mm, and the electric potential difference between
them is 100 V.

■ You find the electric field by using Eq. (21-48a) to evaluate

$$\mathcal{E} = \frac{V}{d} = \frac{100\ \text{V}}{1.00 \times 10^{-3}\ \text{m}} = 1.00 \times 10^{5}\ \text{V/m}$$


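This is a fairly large electric field, as judged by the values commonly encountered.
Then you can obtain the electric field energy density by evaluating Eq. (21-56):

$$\rho_e = \frac{\epsilon_0\mathcal{E}^2}{2} = \frac{8.85 \times 10^{-12}\ \text{C}^2/(\text{N}\cdot\text{m}^2) \times (1.00 \times 10^{5}\ \text{V/m})^2}{2} = 4.4 \times 10^{-2}\ \text{J/m}^3$$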




Because of the small value of $\epsilon_0$, the energy density is not large in comparison to the
energy densities of other systems we deal with in the everyday world, such as the
density of energy in a charged storage battery.

To obtain the energy stored in the capacitor, knowing the energy per unit
volume $\rho_e$ in its electric field, you can multiply $\rho_e$ by the volume $\pi r^2 d$ of the field:

$$U = \rho_e \pi r^2 d = 4.4 \times 10^{-2}\ \text{J/m}^3 \times \pi \times (10^{-1}\ \text{m})^2 \times 10^{-3}\ \text{m} = 1.4 \times 10^{-6}\ \text{J}$$

But a more direct way to get this result is to use the capacitance $C = 2.8 \times 10^{-10}$ F
obtained in Example 21-11 and to evaluate Eq. (21-54):

$$U = \frac{CV^2}{2} = \frac{2.8 \times 10^{-10}\ \text{F} \times (1.00 \times 10^{2}\ \text{V})^2}{2} = 1.4 \times 10^{-6}\ \text{J}$$

The first calculation of $U$ does provide a worthwhile insight. It shows that the total
energy stored in the electric field is very small because there is a small energy density
extending over a small volume.

Many  practical  capacitors  have  capacitances  much  larger  than  the  one  dealt 
with  in  this  example.  If  the  electric  potential  difference  across  their  plates  is  large, 
such  capacitors  can  store  very  significant  amounts  of  energy.  A capacitor  small 
enough  to  pick  up  easily  can  contain  enough  energy  to  be  lethal  if  you  discharge  it 
through  your  body  by  touching  both  of  the  wires  connected  to  the  plates.  Be 
warned! 
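
The arithmetic of Example 21-13 can be repeated in a few lines of Python. This is only a sketch: the variable names are chosen here for illustration, and the capacitance is recomputed from the plate dimensions using the form $C = \epsilon_0 a/d$ of Eq. (21-49) quoted in this section rather than taken from Example 21-11.

```python
import math

EPSILON_0 = 8.85e-12  # permittivity constant, C^2/(N m^2)

radius = 0.100        # plate radius, m
spacing = 1.00e-3     # plate separation, m
voltage = 100.0       # electric potential difference, V

field = voltage / spacing                        # Eq. (21-48a)
energy_density = 0.5 * EPSILON_0 * field**2      # Eq. (21-56)

# First route: energy density times the volume of the field region.
area = math.pi * radius**2
energy_from_field = energy_density * area * spacing

# Second route: capacitance, then Eq. (21-54).
capacitance = EPSILON_0 * area / spacing         # Eq. (21-49)
energy_from_capacitance = 0.5 * capacitance * voltage**2

print(f"field          = {field:.3g} V/m")
print(f"energy density = {energy_density:.2g} J/m^3")
print(f"capacitance    = {capacitance:.2g} F")
print(f"energy (field route)       = {energy_from_field:.2g} J")
print(f"energy (capacitance route) = {energy_from_capacitance:.2g} J")
```

The two routes agree because both reduce algebraically to $\epsilon_0 a V^2/2d$.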


EXAMPLE 21-14

a.  Obtain expressions for the electric field energy density $\rho_e$ and energy
content $U$ for the spherical capacitor shown in Fig. 21-37, when the inner and outer
spheres hold charges $+|q|$ and $-|q|$, respectively.

■ Gauss’ law tells you that there is no electric field inside the spherical
electrode of radius $r_1$ and none outside the spherical electrode of radius $r_2$. And it tells
you that in between the field is associated with the charge on the inner electrode
only, and has magnitude

$$\mathcal{E} = \frac{|q|}{4\pi\epsilon_0 r^2}$$


Fig. 21-37  A spherical capacitor consisting of spherical metal shells of radii $r_1$ and $r_2$.




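So $\rho_e$ in this region is

$$\rho_e = \frac{\epsilon_0\mathcal{E}^2}{2} = \frac{\epsilon_0 q^2}{2(4\pi)^2\epsilon_0^2 r^4}$$

or

$$\rho_e = \frac{q^2}{32\pi^2\epsilon_0 r^4} \qquad \text{for } r_1 \le r \le r_2 \qquad \text{(21-57)}$$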


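You can find $U$ by integrating $\rho_e$ over volume elements $dv$ that are spherical
shells, concentric with the electrodes, of radius $r$ and thickness $dr$. Since
$dv = 4\pi r^2\,dr$, you have

$$U = \int \rho_e\,dv = 4\pi\int_{r_1}^{r_2} \rho_e r^2\,dr = \frac{4\pi q^2}{32\pi^2\epsilon_0}\int_{r_1}^{r_2} \frac{r^2\,dr}{r^4} = \frac{q^2}{8\pi\epsilon_0}\int_{r_1}^{r_2} \frac{dr}{r^2}$$

Evaluating the integral gives

$$U = \frac{q^2}{8\pi\epsilon_0}\left[-\frac{1}{r}\right]_{r_1}^{r_2}$$

or

$$U = \frac{q^2}{8\pi\epsilon_0}\left(\frac{1}{r_1} - \frac{1}{r_2}\right) \qquad \text{(21-58)}$$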

Note that by letting $r_2$ go to infinity in Eqs. (21-57) and (21-58) you immediately
obtain expressions for the energy density and total energy in the electric field of a
single charged spherical electrode of radius $r_1$, with no other electrode close enough
to it to have a significant influence on its field. Even though the volume
surrounding the single charged sphere is infinite, the total energy $U$ stored in its
electric field is finite. Can you explain why? ■

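b.  Use the expression for $U$ to evaluate the capacitance $C$ of the spherical
capacitor.

■ All you have to do is solve

$$U = \frac{q^2}{2C}$$

for $C$, obtaining

$$C = \frac{q^2}{2U}$$

Then substitute the expression for $U$. You get

$$C = \frac{q^2}{(2q^2/8\pi\epsilon_0)(1/r_1 - 1/r_2)}$$

or

$$C = \frac{4\pi\epsilon_0}{1/r_1 - 1/r_2} \qquad \text{(21-59)}$$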


You should check this result by modifying the direct calculation which led to the
capacitance evaluated in Eq. (21-49) so that it can be used to calculate the capacitance
of the spherical capacitor. Note that Eq. (21-59), like Eq. (21-49), expresses $C$ in
terms of $\epsilon_0$ times a geometric factor.

By letting $r_2$ go to infinity, Eq. (21-59) becomes an expression for what can be
called the capacitance of an isolated spherical electrode of radius $r_1$. Of course,
there really must be a second electrode for the term “capacitance” to be meaningful.
There is, but this electrode is infinitely far away. This being the case, its shape is of
no consequence.
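
The results of Example 21-14 can be checked by adding up the field energy shell by shell. The Python sketch below is illustrative only: the function names, the charge, and the radii are assumed values chosen for the check, not quantities given in the text.

```python
import math

EPSILON_0 = 8.85e-12  # permittivity constant, C^2/(N m^2)


def energy_density(q, r):
    """Eq. (21-57): field energy density at radius r between the shells."""
    return q**2 / (32.0 * math.pi**2 * EPSILON_0 * r**4)


def energy_by_integration(q, r1, r2, shells=200_000):
    """Sum rho_e * 4 pi r^2 dr over thin concentric shells (midpoint rule)."""
    dr = (r2 - r1) / shells
    total = 0.0
    for n in range(shells):
        r = r1 + (n + 0.5) * dr
        total += energy_density(q, r) * 4.0 * math.pi * r**2 * dr
    return total


def energy_closed_form(q, r1, r2):
    """Eq. (21-58)."""
    return q**2 / (8.0 * math.pi * EPSILON_0) * (1.0 / r1 - 1.0 / r2)


def spherical_capacitance(r1, r2):
    """Eq. (21-59)."""
    return 4.0 * math.pi * EPSILON_0 / (1.0 / r1 - 1.0 / r2)


q, r1, r2 = 1.0e-8, 0.010, 0.020   # assumed charge (C) and radii (m)

U_numeric = energy_by_integration(q, r1, r2)
U_exact = energy_closed_form(q, r1, r2)
C = spherical_capacitance(r1, r2)

# Limit of Eq. (21-59) as r2 goes to infinity: an isolated sphere.
C_isolated = 4.0 * math.pi * EPSILON_0 * r1

print(U_numeric, U_exact, C, C_isolated)
```

The shell-by-shell sum matches Eq. (21-58), and evaluating `spherical_capacitance` with a very large outer radius approaches the isolated-sphere value.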


21-8  DIELECTRICS

An insulating substance is called a dielectric. Nearly all practical
capacitors are constructed with dielectrics between their electrodes rather than
vacuum (or its near equivalent, air), as we have assumed up to this point. A
principal reason is that the presence of the dielectric increases the
capacitance. To understand why, we must understand what happens when an
external electric field is applied to a dielectric.

When a dielectric substance is subjected to an external electric field $\boldsymbol{\mathcal{E}}_0$,
electric forces are exerted on the positively and negatively charged
particles which comprise the substance. The particles of opposite charge tend
to move in opposite directions because the forces exerted on them are
oppositely directed. But a dielectric, being an insulator, is a substance in
which charged particles are not free to move indefinitely. In describing the
motion that does occur, two important cases are to be distinguished.

In the first case there are no permanent electric dipoles in the
dielectric substance. That is, there are no dipoles in the absence of the external
electric field. This means that for each molecule of the dielectric the
average location of the negative charge (on the electrons) coincides with the
average location of the positive charge (on the nuclei) when there is no
applied electric field. When an electric field $\boldsymbol{\mathcal{E}}_0$ is applied to the dielectric from
the outside, it induces electric dipoles, called induced electric dipoles,
inside the dielectric. The applied electric field does this by “stretching”
each of the molecules so that the average location of its negative charge is
displaced from the average location of its positive charge. Figure 21-38
illustrates the process, picturing each molecule as a positive and negative
charge of equal magnitude whose centers are joined by a spring that
represents the attractive forces they exert on each other. When $\boldsymbol{\mathcal{E}}_0 = 0$, each
spring has zero length and the charges overlap completely. The larger the
magnitude of $\boldsymbol{\mathcal{E}}_0$, the greater is the extension of each spring. Since each
induced electric dipole has an electric dipole moment $\mathbf{p}$ of magnitude
proportional to the separation of its two charges, the electric dipole moment
magnitude increases with increasing magnitude of the applied electric
field. In fact, experiment shows that the magnitude $p$ increases in direct
proportion to the magnitude $\mathcal{E}_0$, provided the applied electric field is not
too large. As for direction, each induced electric dipole moment vector $\mathbf{p}$,
being directed from the negative to the positive charge, is in the direction
of the applied electric field vector $\boldsymbol{\mathcal{E}}_0$.

Fig. 21-38  (a) Schematic representation of a dielectric material
whose molecules do not have permanent electric dipole moments.
In the absence of an applied electric field $\boldsymbol{\mathcal{E}}_0$, the average positions
of each molecule's positive and negative charge coincide. (b) When
an electric field $\boldsymbol{\mathcal{E}}_0$ is applied to the material, the positive charge in
each molecule is displaced in the direction of $\boldsymbol{\mathcal{E}}_0$ and the negative
charge is displaced in the opposite direction. Thus each molecule
becomes an electric dipole. The springs represent the fact that
these are induced dipoles. In other words, if the applied electric
field is removed, the two charges in each dipole will “snap back”
into coincidence, and it will no longer be an electric dipole.

In the second case, there are permanent electric dipoles in the
dielectric substance. For most such substances, in the absence of an applied
electric field $\boldsymbol{\mathcal{E}}_0$ these electric dipoles are randomly oriented because of thermal
agitation. When the electric field is applied, the equal and opposite forces it
exerts on the two charges of each dipole produce a torque on the dipole. As
is indicated schematically in Fig. 21-39, these torques cause the dipoles to
rotate so that their electric dipole moments $\mathbf{p}$ come into partial alignment
with the applied electric field $\boldsymbol{\mathcal{E}}_0$. (See also Fig. 21-17.) Experiment
demonstrates that if the applied electric field $\boldsymbol{\mathcal{E}}_0$ is not too large, the average value
of the components of $\mathbf{p}$ along the direction of $\boldsymbol{\mathcal{E}}_0$ (a measure of the degree
of alignment) is proportional to the magnitude of $\boldsymbol{\mathcal{E}}_0$. (In addition to
aligning the directions of the permanent molecular electric dipole
moments, the applied electric field may increase their magnitudes by the
stretching effect described in the preceding paragraph.)


Fig. 21-39  (a) Schematic representation of a dielectric material
whose molecules have permanent electric dipole moments. The
permanence of each dipole is indicated by connecting its two
charges with a rigid rod, instead of with a spring. In the absence of
an applied electric field $\boldsymbol{\mathcal{E}}_0$, thermal agitation randomizes the
orientation of the dipoles. (b) When an electric field $\boldsymbol{\mathcal{E}}_0$ is applied to the
material, the electric dipole in each molecule rotates in such a way
that the positive charge is displaced in the direction of $\boldsymbol{\mathcal{E}}_0$ and the
negative charge is displaced in the opposite direction.

Fig. 21-40  The depolarization field $\boldsymbol{\mathcal{E}}_d$, the applied electric field
$\boldsymbol{\mathcal{E}}_0$, and their vector sum, the internal electric field $\boldsymbol{\mathcal{E}}_{\text{int}}$, in a slab
of dielectric material. The depolarization field is the electric field
produced by the electric dipoles that the molecules of the
material constitute. Each electric dipole is indicated by its electric
dipole moment $\mathbf{p}$. The electric dipole moments, being directed
from the negative to the positive charge of the dipoles, are
aligned in the direction of the applied electric field $\boldsymbol{\mathcal{E}}_0$. This is
demonstrated in Figs. 21-38 and 21-39. The depolarization field
$\boldsymbol{\mathcal{E}}_d$ is in the direction of the strongest part of the electric field
produced by the individual dipoles. It is shown in Fig. 21-16 that this
direction is opposite to the direction of the electric dipole moment
$\mathbf{p}$. The direction of $\mathbf{p}$ being also the direction of $\boldsymbol{\mathcal{E}}_0$, it follows that
$\boldsymbol{\mathcal{E}}_d$ is in the direction opposite to that of $\boldsymbol{\mathcal{E}}_0$, as depicted here.


Figure 21-40 shows schematically the macroscopic effect of these
microscopic processes. For simplicity, a rectangular slab of dielectric is
pictured, whose faces are perpendicular to the direction of a uniform
electric field $\boldsymbol{\mathcal{E}}_0$ applied from outside the dielectric. Within the body of the
dielectric, electric dipole moments develop (and/or rotate if they are already
present) in such a way that on the average the electric dipole moment
vectors are parallel to the applied electric field vector. Now Fig. 21-16
shows that the electric field $\boldsymbol{\mathcal{E}}_d$ of an electric dipole is strongest in the region
between the two charges, and in this region $\boldsymbol{\mathcal{E}}_d$ is antiparallel to the electric
dipole moment $\mathbf{p}$. But $\mathbf{p}$ is parallel to the electric field $\boldsymbol{\mathcal{E}}_0$ acting on the
electric dipole. Hence in the region where $\boldsymbol{\mathcal{E}}_d$ is most important because it has
the largest magnitude, on the average it has a direction opposite to that of
$\boldsymbol{\mathcal{E}}_0$. This means that as the magnitudes of the electric dipole moments
increase (and/or their directions increase the degree of alignment)
under the influence of an applied electric field $\boldsymbol{\mathcal{E}}_0$ of increasing magnitude,
there comes into existence a macroscopic, oppositely directed electric
field $\boldsymbol{\mathcal{E}}_d$ of increasing magnitude resulting from the increasing alignment of
the microscopic dipoles. The electric field $\boldsymbol{\mathcal{E}}_d$ is called the depolarization
field. Since the macroscopic electric dipole moment along the direction of
the applied electric field has an average value proportional to the
magnitude of the applied field, the magnitude of the depolarization field is also
proportional to that of the applied field.

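The actual electric field in the interior of the dielectric, $\boldsymbol{\mathcal{E}}_{\text{int}}$, is the
vector sum of the applied and depolarization fields:

$$\boldsymbol{\mathcal{E}}_{\text{int}} = \boldsymbol{\mathcal{E}}_0 + \boldsymbol{\mathcal{E}}_d \qquad \text{(21-60)}$$

Since $\boldsymbol{\mathcal{E}}_0$ and $\boldsymbol{\mathcal{E}}_d$ are oppositely directed, the terms in this sum tend to
cancel. Thus $\boldsymbol{\mathcal{E}}_{\text{int}}$ always has a smaller magnitude than $\boldsymbol{\mathcal{E}}_0$. Also, its
magnitude $\mathcal{E}_{\text{int}}$ is proportional to the magnitude $\mathcal{E}_0$ of the applied electric field
since both terms on the right side of Eq. (21-60) have magnitudes
proportional to $\mathcal{E}_0$.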

This proportional relation between the magnitude of the internal
electric field in a dielectric material and the magnitude of the external electric
field applied to it is most conveniently stated in terms of the equation

$$K_e = \frac{\mathcal{E}_0}{\mathcal{E}_{\text{int}}} \qquad \text{(21-61)}$$




Table 21-2  Room-Temperature Dielectric Constants and Dielectric Strengths
for Various Materials

Material        Dielectric constant $K_e$    Dielectric strength (in $10^6$ V/m)
Vacuum          1
Air (dry)       1.0006                     Approx. 1
Water           80
Glycerine       56
Glass           5 to 10                    30 to 150
Polyethylene    2.25                       50
Mica            3 to 7                     30 to 220


The quantity $K_e$ defined in this equation is a constant called the dielectric
constant. The value of $K_e$ is always greater than 1 for any dielectric
substance, since $\mathcal{E}_{\text{int}}$ is always less than $\mathcal{E}_0$. For vacuum, its value is exactly 1.
For a conducting substance the value is $K_e = \infty$ because the equilibrium
value of $\mathcal{E}_{\text{int}}$ is always zero in a conductor. Table 21-2 gives the values of the
dielectric constant $K_e$ for several different materials.
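
To get a feel for the sizes involved, Eqs. (21-60) and (21-61) can be evaluated for a few of the dielectric constants in Table 21-2. The following Python sketch is illustrative only: the function names and the applied field magnitude are assumptions made here, and the depolarization field magnitude is obtained from Eq. (21-60) using the fact that the two fields are oppositely directed.

```python
def internal_field(applied_field, dielectric_constant):
    """Magnitude of the internal field, from Eq. (21-61)."""
    return applied_field / dielectric_constant


def depolarization_field(applied_field, dielectric_constant):
    """Magnitude of the depolarization field.

    Eq. (21-60) with oppositely directed fields gives E_int = E_0 - E_d,
    so E_d = E_0 - E_int.
    """
    return applied_field - internal_field(applied_field, dielectric_constant)


# Dielectric constants taken from Table 21-2.
dielectric_constants = {
    "vacuum": 1.0,
    "air (dry)": 1.0006,
    "polyethylene": 2.25,
    "water": 80.0,
}

applied = 1.0e4  # applied field magnitude in V/m (assumed value)

for material, k_e in dielectric_constants.items():
    e_int = internal_field(applied, k_e)
    e_d = depolarization_field(applied, k_e)
    print(f"{material:12s}  E_int = {e_int:10.4g} V/m   E_d = {e_d:10.4g} V/m")
```

For vacuum the depolarization field vanishes, while for a material with a large dielectric constant it cancels nearly all of the applied field.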


Except for very special applications, capacitors are always made with a
dielectric between the electrodes. There are three reasons for this. The
first reason is structural. When two electrodes are closely spaced to maximize
the capacitance, there is no better way of keeping them together while
keeping them from making direct contact than by sandwiching them with a
thin sheet of dielectric. The second reason is that the dielectric
strength (the maximum electric field a material can sustain without
suffering destructive electrical breakdown as it becomes a conductor) is
greater and more stable for many insulating materials than it is for air. This
can be seen from the last column of Table 21-2. The third reason is that the
dielectric increases the capacitance of the capacitor, as we show in the next
paragraph.

Consider a plane-parallel capacitor with vacuum (or for most practical
purposes, air) between its plates. The plates carry charge per unit area of
magnitude $|\sigma|$ and are separated by distance $d$. Thus, according to Eq.
(21-47), the electric field between the plates has the magnitude

$$\mathcal{E}_0 = \frac{|\sigma|}{\epsilon_0}$$

The difference in electric potential of the two plates is given by Eq.
(21-48a) as

$$V = \mathcal{E}_0 d = \frac{|\sigma| d}{\epsilon_0}$$

Now isolate the plates so that no charge can flow on or off them. Then fill
the space between the plates with a dielectric of dielectric constant $K_e$. This
reduces the magnitude of the electric field between the plates to the value
$\mathcal{E}_{\text{int}}$, which Eq. (21-61) shows to be

$$\mathcal{E}_{\text{int}} = \frac{\mathcal{E}_0}{K_e} = \frac{|\sigma|}{K_e\epsilon_0}$$




The new value of the electric potential difference of the plates is calculated
from Eq. (21-48a) to be

$$V = \mathcal{E}_{\text{int}} d = \frac{|\sigma| d}{K_e\epsilon_0}$$

This is reduced from the original value by the factor $1/K_e$. But there has
been no change in the charge on the capacitor plates $|q|$. So the definition of
capacitance $C$ in Eq. (21-45b),

$$C = \frac{|q|}{V}$$

shows that since $|q|$ is unchanged, the decrease in $V$ by the factor $1/K_e$ is
accompanied by an increase in $C$ by the factor $K_e$. The same conclusion is
obtained, no matter what the geometry of the capacitor: The capacitance is
increased by the factor $K_e$ when an insulating material with that dielectric constant is
inserted between the plates of a capacitor.
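
The argument above can be followed with numbers. In the Python sketch below, the plate area, spacing, and charge are assumed values chosen for illustration, and `plate_capacitance` is a hypothetical helper that multiplies the vacuum plane-parallel capacitance $\epsilon_0 a/d$ by $K_e$.

```python
EPSILON_0 = 8.85e-12  # permittivity constant, C^2/(N m^2)


def plate_capacitance(area, spacing, dielectric_constant=1.0):
    """Plane-parallel capacitance: K_e times the vacuum value eps0 * a / d."""
    return dielectric_constant * EPSILON_0 * area / spacing


area = 1.0e-2      # plate area, m^2 (assumed value)
spacing = 1.0e-4   # plate spacing, m (assumed value)
charge = 1.0e-8    # magnitude of charge on each isolated plate, C (assumed value)
k_e = 2.25         # polyethylene, from Table 21-2

c_vacuum = plate_capacitance(area, spacing)
c_filled = plate_capacitance(area, spacing, k_e)

# The plates are isolated, so the charge is the same before and after filling.
v_vacuum = charge / c_vacuum
v_filled = charge / c_filled

print(f"C: {c_vacuum:.3g} F -> {c_filled:.3g} F")
print(f"V: {v_vacuum:.3g} V -> {v_filled:.3g} V")
```

With the charge held fixed, the potential difference falls by the factor $1/K_e$ while the capacitance rises by the factor $K_e$.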


EXERCISES 

Group  A 

21-1.  Making  salt.  Use  the  procedure  suggested  at  the 
end  of  Example  21-1  to  estimate  the  electric  potential  en- 
ergy change  during  the  formation  of  the  molecule  NaCl. 

21-2.  Near a sphere. A solid metal sphere whose radius
is 1.0 cm has a charge of $+1.0 \times 10^{-8}$ C.

a.  What  is  the  electric  potential  just  outside  the  sur- 
face of  the  sphere?  Take  it  to  be  zero  infinitely  far  from 
the  sphere. 

b.  At  what  distance  from  the  surface  does  the  electric 
potential  fall  to  one-half  its  surface  value? 

21-3.  Speedy approach? An immobile conducting
sphere of radius $R$ and charge $|Q|$ attracts a body of mass
$m$ whose charge is $-|q|$. What is the speed of the body
when it is a distance $r$ from the center of the sphere if it
starts from rest at a great distance?

21-4.  Triangular array.

a.  Calculate the electric potential at a point A midway
between charges $q_1$ and $q_3$ in the triangular array of
charges in Fig. 21-4, taking it to be zero at infinity.

b.  Do the same for the point B midway between
charges $q_2$ and $q_3$.

c.  Evaluate the work that must be done to move an
electron from A to B.

21-5.  Assembling  charges.  Calculate  the  work  done  to 
bring  three  widely  separated  equal  point  charges  q to  the 
apexes  of  an  equilateral  triangle  of  side  a. 

21-6.  At the midway point.

a.  Midway between two point charges, $q_1 = +|q|$ and
$q_2 = |q|$, whose separation is $d$, what is the value of the
electric potential if the value is zero infinitely far from the
charges?

b.  What is the rate of change of the electric potential
with respect to position along the line between the charges
at the midway point?

c.  What is the value and direction of the electric field
at the midway point?

21-7.  Breakdown potential. Charge can be added to a
conductor in air until $\mathcal{E}$ at the surface reaches the value of
about $3 \times 10^6$ V/m. Higher values of $\mathcal{E}$ cause the
surrounding air to become conducting so that the excess
charge is carried away.

a.  Estimate the maximum charge $Q$ that a sphere of
radius 1 m can acquire.

b.  What is the corresponding electric potential $V$,
taking $V = 0$ for $r = \infty$?

21-8.  Label the equipotentials. Calculate the numerical
values of $V$ for each equipotential in Fig. 21-12, taking
$V = 0$ at infinity. Use these values to label the
equipotentials in the figure.

21-9.  Electric potential due to nearest neighbors in steam.
The average distance between water molecules in a steam
chamber at 100°C and 1.0 atm of pressure is
approximately $4.0 \times 10^{-9}$ m. Given that the electric dipole
moment magnitude of a water molecule is $6.2 \times 10^{-30}$ C·m,
use Eq. (21-24) with $\theta = 0$ to determine a typical value for
the electric potential experienced by a water molecule due
to a single neighboring molecule $4.0 \times 10^{-9}$ m away.

21-10. Electric potential of a dipole. An object with electric
dipole moment p = pi is placed at the origin of the x,
y, z coordinate axes. What is the electric potential V at the
point x = L, y = L, z = L? Employ the convention that
V = 0 infinitely far from the origin.




21-11. Equivalent capacitance, I. In Fig. 21E-11, each
capacitor has a capacitance of 1.00 μF. What is the equivalent
capacitance of the arrangement in part a? What is it in
part b?


Fig. 21E-11 (capacitor arrangements for parts a and b)


21-12. Equivalent capacitance, II. Given three capacitors
of 1.00 μF each, what are four ways of connecting
them and what is the equivalent capacitance in each case?

21-13. Equivalent capacitance, III. Find two different
ways of connecting a set of 1.00-μF capacitors which will
give an equivalent capacitance of 0.67 μF.
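
Exercises 21-11 through 21-13 all rest on the series and parallel combination rules. A minimal sketch of those rules follows, assuming the usual forms 1/C = Σ 1/Cᵢ for series and C = Σ Cᵢ for parallel. The helper names `series` and `parallel` are hypothetical, and the combinations printed are only illustrations.

```python
def series(*caps):
    """Equivalent capacitance of capacitors connected in series."""
    return 1.0 / sum(1.0 / c for c in caps)


def parallel(*caps):
    """Equivalent capacitance of capacitors connected in parallel."""
    return float(sum(caps))


C = 1.00  # each capacitor, in microfarads

# Combinations can be nested to describe mixed networks.
print(series(C, C, C))            # three in series
print(parallel(C, C, C))          # three in parallel
print(series(C, parallel(C, C)))  # one in series with a parallel pair
print(parallel(C, series(C, C)))  # one in parallel with a series pair
```

Nesting the two helpers lets a trial arrangement be checked against a target value before it is drawn.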

21-14.  Plane-parallel  capacitor,  I. 

a. Show that the magnitude of the force with which
one plate of a plane-parallel capacitor (in vacuum or air)
attracts the other is equal to |q|²/2ε₀a, where a is the area of
either plate and |q| is the magnitude of the charge on
either plate.

b. What is the work done by a force applied to slowly
separate the plates when the spacing between them is increased
from d₁ to d₂?

c.  Show  that  the  increase  in  the  potential  energy 
stored  in  the  capacitor  is  equal  to  the  work  done. 

21-15. Plane-parallel capacitor, II. The dielectric in a
plane-parallel capacitor has a dielectric strength of 1.0 ×
10⁷ V/m and a dielectric constant of 4.0. Its thickness is
0.10 mm. The area of the plates is 500 cm².

a.  What  is  the  maximum  electric  potential  difference 
between  the  plates? 

b.  What  is  the  capacitance? 

c.  What  is  the  maximum  energy  that  can  be  stored? 
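
The three parts of Exercise 21-15 can be organized as in the sketch below. It assumes the standard plane-parallel relations V_max = ℰ_max d, C = κₑε₀a/d, and U = ½CV². The function name and the rounded value of ε₀ are illustrative.

```python
EPSILON_0 = 8.854e-12  # permittivity of free space in C^2/(N m^2), rounded


def parallel_plate_limits(kappa, area_m2, gap_m, dielectric_strength):
    """Return (V_max, C, U_max) for a dielectric-filled plane-parallel capacitor."""
    v_max = dielectric_strength * gap_m               # uniform field: V = E d
    capacitance = kappa * EPSILON_0 * area_m2 / gap_m
    u_max = 0.5 * capacitance * v_max**2              # stored energy at V_max
    return v_max, capacitance, u_max


# Values from the exercise, converted to SI units:
# 500 cm^2 = 500e-4 m^2 and 0.10 mm = 0.10e-3 m.
v_max, c, u_max = parallel_plate_limits(4.0, 500e-4, 0.10e-3, 1.0e7)
print(f"V_max = {v_max:.3g} V, C = {c:.3g} F, U_max = {u_max:.3g} J")
```

The unit conversions are the step most easily mishandled, so they are written out explicitly in the call.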


21-16. Plane-parallel capacitor, III. A plane-parallel
capacitor has plates of area a and separation d. Over one
half of this area the region between the plates is evacuated;
over the other half it is filled with material of dielectric
constant κₑ. What is the capacitance?

21-17. Spherical capacitor, I. A spherical capacitor is
charged with a battery so that the electric potential difference
between its electrodes is 400 V. The battery is
then disconnected, leaving the capacitor charged. Kerosene
of dielectric constant 2.5 is poured into the space
between the spherical electrodes, completely filling the
space.

a. What is the electric potential difference when the
capacitor contains kerosene?

b. How is the charge on the capacitor affected?

c. What is the ratio of the final capacitance to the initial
capacitance?

Group B

21-18. Potential differences of charged planes. The three
large parallel insulating planes in Fig. 21E-18 are 1.0 cm
apart. They are uniformly charged, with σ_A = +2.0 ×
10⁻⁷ C/m², σ_B = +4.0 × 10⁻⁷ C/m², and σ_C = +6.0 ×
10⁻⁷ C/m², where A, B, and C refer to the left, middle, and
right planes. Calculate the following differences in their
electric potentials:


Fig. 21E-18


a. V_B − V_A

b. V_C − V_B

c. V_C − V_A

21-19. Charged nonconducting sphere. A nonconducting
sphere of radius R has a total charge Q uniformly
distributed throughout its volume. The electric field at a
point inside the charged sphere is radial and has magnitude
equal to Qr/4πε₀R³, where r is the distance from the
center. Show that the electric potential at this point is
given by

$$V = \frac{Q}{8\pi\epsilon_0}\left[\frac{3R^2 - r^2}{R^3}\right]$$

taking its value to be zero infinitely far from the sphere.

21-20. Sphere of zero potential. A charge +|q| is at a distance
3R/2 from a charge −|q|/2. See Fig. 21E-20. The
figure shows an imaginary sphere of radius R whose
center is on the line joining the two charges and is at distance
R/2 from the negative charge, in the direction away
from both charges. Prove that the electric potential over
this sphere is zero, if it is defined to be zero infinitely far
from the charges.


Fig. 21E-20


21-21. Shell game. A conducting sphere of radius
1 cm has a charge of 1 × 10⁻⁸ C. It is at the center of a
conducting shell which has a net charge of 2 × 10⁻⁸ C.
See Fig. 21E-21. The inner and outer radii of the shell
are 2 cm and 3 cm. Take all these numbers to be exact.


Fig.  21E-21 


a.  What  is  the  charge  on  (i)  the  inner  surface  of  the 
shell?  (ii)  the  outer  surface  of  the  shell? 

b.  What  is  the  electric  potential  (letting  it  be  zero  at 
infinity)  (i)  just  outside  the  shell?  (ii)  inside  the  metal  of  the 
shell?  (iii)  at  the  surface  of  the  sphere? 

c.  What  is  the  difference  in  electric  potential  between 
the  sphere  and  the  shell? 

21-22. Far field of a charged disk. Derive Eq. (21-20a)
for the far electric field of a charged disk by applying ℰz =
−dV/dz directly to Eq. (21-16a).

21-23. Near field of a charged disk. Derive Eq. (21-20b)
for the near electric field of a charged disk by applying
ℰz = −dV/dz directly to Eq. (21-16b).

21-24. Torque on a dipole. Prove that Eq. (21-32) for
the torque exerted on an electric dipole is valid when the
origin about which the torque is measured is chosen at one
of the charges of the dipole, instead of at its center. What
property of the two forces producing the torque is responsible
for the fact that the torque does not depend on the
choice of origin?

21-25. Energy of a dipole. Employ Eq. (21-32), T =
p × ℰ, in an integration of Eq. (9-58), T = −dU/dθ, to
derive Eq. (21-35), U = −p · ℰ. Take particular care in
handling signs.


21-26. Electric field in polar coordinates. If V is expressed
in polar coordinates, Eqs. (21-18) become ℰ_R =
−∂V/∂R, where ℰ_R is the radial component of ℰ, and
ℰ_θ = −∂V/R∂θ, where ℰ_θ is the transverse component.

a. Evaluate ℰ_R and ℰ_θ for the dipole electric potential
given by Eq. (21-26).

b. Show that ℰ² = ℰ_R² + ℰ_θ² agrees with the expression
obtained in cartesian coordinates in Eq. (21-29a).

21-27. Spherical capacitor, II. Modify the calculation
leading to Eq. (21-49) so that it can be used to calculate the
capacitance of the spherical capacitor in Fig. 21-37. Compare
your results with Eq. (21-59).

21-28.  Switching  capacitors,  I. 

a. In Fig. 21E-28, the four capacitors are identical.
Switch B is kept open. Switch A is closed and then opened.
Switch B is now closed. What is the electric potential difference
across each capacitor? The symbol on the right
represents a battery.

b. Starting with uncharged capacitors, switch B is
closed. Then switch A is closed. What is the electric potential
difference across each capacitor?

21-29.  Equivalent  capacitance,  IV. 

a.  Derive  Eq.  (21-52)  for  the  equivalent  capacitance 
of  an  arbitrary  number  of  capacitors  in  series. 

b.  Derive  Eq.  (21-53)  for  the  equivalent  capacitance 
of  an  arbitrary  number  of  capacitors  in  parallel. 

21-30. Switching capacitors, II. In Fig. 21E-30, switch A
is closed, then opened. After this, switch C is closed, then
opened. The symbols at the ends represent batteries.


Fig. 21E-30 (circuit: 100-V and 200-V batteries at the ends; two 1.00-μF capacitors)


a. If switch B is now closed, what is the electric potential
difference across either capacitor?

b.  Calculate  the  energy  loss  when  switch  B is  closed. 
Account  for  the  energy  loss. 

21-31. Electric field energy of a spherical conductor. Prove
that half the energy of the electric field of a charged isolated
spherical conductor of radius R is in the region
between the sphere and an imaginary concentric sphere
of radius 2R.

21-32.  Coalescing  drops.  Two  very  widely  separated 
identical  spherical  drops  of  water  of  radius  R carry  equal 
charges q. 

a.  What  is  the  total  electric  potential  energy  of  the 
system? 

b. The two drops coalesce to form a large one with
charge 2q. What is the ratio of the electric potential energy
of the large drop to the sum of the electric potential energies
of the two small ones?

c.  From  elementary  considerations,  show  that  the 
ratio  should  be  greater  than  1. 

21-33.  Dielectric  slab.  A slab  of  dielectric  is  inserted 
between  plates  of  a plane-parallel  air  capacitor.  The 
thickness  of  the  dielectric  is  exactly  one-half  the  distance 
between  the  plates.  If  the  dielectric  constant  of  the  slab  is 
exactly  2,  what  is  the  ratio  of  the  capacitance  C with  the 
slab  to  the  capacitance  C without  it? 

21-34. Crabby capacitor. In Fig. 21E-34 a slab of dielectric
of constant κₑ is inserted a small distance into the
space between the plates of a charged plane-parallel air
capacitor. When the slab is released, it is drawn in all the
way.

Fig. 21E-34


a.  Why  does  this  happen? 

b. The potential energy of a charged capacitor is
given by |q|²/2C. In part a, |q| does not change but C increases
by a factor of κₑ. The potential energy, with the
dielectric filling the gap, is therefore less than the original
potential energy. Account for the missing energy.

c. By equating the work done on the slab to the decrease
in potential energy, calculate the average force with
which the dielectric is pulled in terms of the original potential
energy U.

Group  C 

21-35.  Electron  model.  A nonconducting  sphere  of 
radius  R has  a charge  which  is  uniformly  distributed 
over  its  volume. 

a. Show that the electric potential energy of the
charged sphere is (3/5)Q²/4πε₀R, where Q is the total charge.
Hint: Build up the charged sphere by putting together
uniformly charged spherical shells.

b. Assuming that an electron is such a sphere, calculate
its radius by equating the electric potential energy to
its rest mass energy m₀c². What is the numerical value of
the radius?

21-36. An inverse fifth-power attraction. A point charge
q attracts an uncharged object by inducing charges in the
latter. If the uncharged object is a small conducting
sphere, the force of attraction varies inversely with the
fifth power of the distance r between the charge and the
center of the sphere. This can be proved in steps as
follows.

a.  Find  the  energy  density  in  the  electric  field  of  the 
point  charge  in  terms  of  q and  r. 

b. When the conducting sphere is placed at any location
in the field, the interior of the sphere is field-free.
Thus the energy of the electric field is decreased from the
value it had in the absence of the sphere. Calculate the decrease
for a sphere of radius a.

c. The attraction of the point charge will pull the
sphere into a region where the field and therefore the energy
density is greater. Calculate the rate at which the
electric field energy decreases with the distance.

d. Let dU be the magnitude of the change in the electric
field energy due to the change dr in the distance of the
conducting sphere from the point charge. Then energy
conservation requires that dU = F dr, where F is the magnitude
of the force on the sphere. Calculate F.

No allowance has been made in this calculation for
the rearrangement of the electric field lines so as to become
everywhere normal to the surface of the conducting
sphere. Doing so would increase the value of F by a factor
of 3, but would have no effect on the inverse fifth-power
law that is obtained.

21-37. The method of images. A charge +|q| is at a distance
d from an infinite conducting plane at zero electric
potential. This charge will induce a negative charge on the
plane. What is the force with which the plane attracts the
charge +|q|? What is the induced charge per unit area on
the plane? The following steps, sometimes called the
method of images, will lead you to the answers to these
questions.

Place a charge −|q| on the other side of the plane
along the normal from +|q| to the plane and also at a distance
d from the plane. Then remove the conducting
plane as in Fig. 21E-37.


Fig.  21E-37 




Fig.  21E-40 


a. What is the electric potential due to these two
charges at an imaginary plane coinciding with the one removed?

b. What is the direction of the electric field due to the
two charges at this imaginary plane?

c. The results obtained for a and b show that the field
at the imaginary plane due to the two charges is identical
to the field when the conducting plane takes the place of
the charge −|q|. Hence the attraction of the conducting
plane on +|q| is the same as the attraction of −|q| on +|q|.
Evaluate this attraction.

d. Gauss' law applied to a conductor gives σ = ε₀ℰ,
where σ is the charge per unit area on the surface of the
conductor and ℰ is the magnitude of the electric field
immediately outside the conductor. The value of ℰ can be
found by using the electric field due to the point charges.
Show that ℰ = 2|q|d/4πε₀r³, where r is the distance from
either point charge to the point where ℰ is being evaluated.
From this obtain the expression for σ.

e. By integrating the charge on circular rings centered
about the line connecting the two charges, evaluate
the total induced charge on the plane and show that it is
equal to −|q|.

21-38. Find the field from superposed potentials. A stationary
positive point charge q₁ is located at the point
(x₁, y₁, 0), and a negative point charge q₂ = −2q₁ is located
at the point (x₁, −y₁, 0).

a. Find the electric potential V. Employ the customary
reference V = 0 for r → ∞.

b. Find the locus of points in the xy plane for which
V = 0. Express your result in the form y = f(x).

c. Find the point(s) on the x axis for which V = 0.

d. Determine all three components of the electric
field for a general point (x, y, z).

e. Find all points along the x axis where (i) ℰx = 0; (ii)
ℰy = 0.

f. Find the point along the x axis for which the magnitude
of the electric field is a maximum, and express this
maximum value in terms of q₁ and y₁. Hint: Careful consideration
of the electric field superposition equation ℰ =
ℰ₁ + ℰ₂ can help you avoid much algebraic manipulation.

21-39. A dipole and a point charge. An electric dipole
whose moment is of magnitude p is aligned along an electric
field line due to a point charge q. The dipole's distance
r from q is much greater than 2d, the distance between the
dipole charges. Show that the magnitude of the force
which the electric field of q exerts on the dipole is equal to
(q/4πε₀)(2p/r³). Is the force directed toward q, or away
from q?


21-40. Two dipoles. An electric dipole whose moment
is of magnitude p is aligned along the axis of a similar dipole.
See Fig. 21E-40. The distance r between the centers
is much greater than 2d, the distance between the opposite
charges of either dipole.

a. Show that the magnitude of the force which either
dipole exerts on the other is given by (1/4πε₀)(6p²/r⁴).
Hint: Use Eq. (21-28b) with x = 0.

b. Is the force attractive or repulsive?

21-41. Electric dipole in a uniform electric field. An object
whose electric dipole moment is p = p x̂ is held fixed in a
uniform external electric field ℰ = ℰ ŷ.

a. Find the vector torque T on the object.

b. Suppose the object is a homogeneous solid sphere
of mass m and radius a. The sphere is released from a state
of translational and rotational rest at t = 0, and its dipole
moment is fixed with respect to the sphere, so that the dipole
moment at time t is given by p(t) = p[sin θ(t) x̂ −
cos θ(t) ŷ]. Use the equation T = dL/dt to obtain an equation
for the angular acceleration d²θ/dt².

c. The initial conditions for the equation obtained in
part b are θ = π/2 and dθ/dt = 0. Show that the following
equation is satisfied by the dipole: ½(⅖ma²)(dθ/dt)² +
pℰ cos θ = 0.

d. How would you modify the equation presented in
part c to allow for arbitrary initial values of θ and dθ/dt
(but with the initial values of p and dp/dt still confined to
the xy plane)?

21-42. Poisson's equation. In Sec. 21-5, Gauss' law and
the relationship between the electric field ℰ and the electric
potential V were used to show that in a charge-free
region of space the potential satisfies Laplace's equation:

$$\frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} = 0$$

The same procedure can easily be generalized to show
that the electric potential always satisfies Poisson's equation:

$$\frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} = -\frac{\rho}{\epsilon_0}$$

where ρ is the local charge density.

a. Make the generalization, thereby deriving Poisson's
equation.




b. Determine the electric field ℰ, and charge density
ρ, associated with the electric potential

$$V(x, y, z) = a e^{-bz^2}$$

where a and b are constants.

21-43. The consequences of symmetry. A cubical box has
aluminum faces which are mechanically joined by narrow
insulating spacers which serve as the edges of the cube.
The interior of the box is charge-free. The six faces of the
cube are connected to six different batteries which maintain
them at the following electric potentials: 5, 15, −10,
30, 45, and −25 V.

a. Without using any complicated numerical procedure,
determine the electric potential at the center of the
cube. Justify your method.

b.  For  which  object(s)  among  the  following  could  a 
similarly  simple  means  be  used  to  determine  the  electrical 
potential  at  the  geometrical  center?  (In  each  case,  there 
are  no  interior  charges,  and  various  parts  of  the  surface 
are  held  at  various  given  electric  potentials.) 

(i)  a hollow,  regular  tetrahedron 

(ii)  a hollow,  noncubical  rectangular  parallelepiped 

(iii)  a hollow  spherical  shell 

21-44. A charged sheet, I. An infinite sheet containing a
position-dependent charge per unit area lies in the xy
plane; its electric potential is given by V(x, y, 0) = V₀ cos(kx).

a. Find the electric potential V(x, y, z) in the charge-free
region on both sides of the infinite sheet. Hint: Assume
that V(x, y, z) = f(z) cos(kx) and that V → 0 for z →
∞. Why should there be no y-dependence in V(x, y, z) for
z ≠ 0?

b. Evaluate the electric field ℰ(x, y, z) for z ≠ 0.

c. Show that the magnitude of the electric field depends
only upon z.

21-45. A charged sheet, II. Apply Gauss' law to find the
charge per unit area σ(x, y) of the sheet described in Exercise
21-44.

Numerical 

21-46. Field lines and equipotentials, I. Run the field
lines and equipotentials program to trace a representative
set of electric field lines and equipotential curves for two
charges whose values and locations are: q₁ = +1 (in C) at
x = 0 and z = 0; q₂ = +2 (in C) at x = 0 and z = 5 (in
cm). This work is quite time-consuming if carried out on a
programmable calculator. (But there is no tedium if you
use a computer with a graphic display.) So use values of Δs
and n which are, respectively, twice as large and twice as
small as those used in Example 21-6, in order to reduce
the time required (unless you use a graphic display computer).
Also, evaluate V for each equipotential. Compare
your results with those displayed in Fig. 21-11, and comment
on the differences.


21-47. Field lines and equipotentials, II. Run the field
lines and equipotentials program to trace a representative
set of electric field lines and equipotential curves for two
charges whose values and locations are: q₁ = +1 (in C) at
x = 0 and z = 0; q₂ = −2 (in C) at x = 0 and z = 5 (in
cm). Also evaluate V for each equipotential. Compare
your results with those displayed in Fig. 21-12, and comment
on the differences. The remarks made in Exercise
21-46 about the time required for the calculations apply
here.

21-48.  Laplace’s  equation,  I. 

a. This exercise requires the use of a small computer.
Write a program for carrying out an iterative solution of
Laplace's equation. Follow the lines established by Example
21-10, but allow for a more flexible choice of the
number of grid points and for the specification of the values
of V at the boundaries. Test the program by repeating
the calculation in the example, recording the time required
to obtain convergence to two decimal places.

b.  Choose  a different  set  of  initial  values  of  V at  the 
interior  grid  points.  Specifically,  choose  V = 0 at  each  of 
these  points.  Repeat  the  calculation,  and  show  that  the 
same  results  are  obtained.  Also  record  the  time  required 
to  obtain  the  results.  Compare  the  time  required  for  the 
two  calculations,  and  explain  what  the  comparison  shows. 
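
A program of the kind requested in Exercise 21-48 might be organized as in the following sketch. It is not the program of Example 21-10, which is not reproduced here: it assumes a square grid, replaces each interior value by the average of its four nearest neighbors, and stops when the largest change in one sweep falls below a tolerance. The function name, the grid size, and the boundary values in the usage lines are all illustrative assumptions.

```python
def solve_laplace(boundary, n, initial=0.0, tol=1e-4, max_sweeps=10_000):
    """Relax an n-by-n grid toward a solution of Laplace's equation.

    boundary(i, j) returns the fixed value of V at an edge point;
    interior points start at `initial`. Returns (grid, sweeps_used).
    """
    v = [[initial] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i in (0, n - 1) or j in (0, n - 1):
                v[i][j] = boundary(i, j)

    for sweep in range(1, max_sweeps + 1):
        largest_change = 0.0
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                # Each interior value becomes the mean of its four neighbors.
                new = 0.25 * (v[i - 1][j] + v[i + 1][j]
                              + v[i][j - 1] + v[i][j + 1])
                largest_change = max(largest_change, abs(new - v[i][j]))
                v[i][j] = new  # updated in place, so later points use it
        if largest_change < tol:
            return v, sweep
    raise RuntimeError("no convergence within max_sweeps")


# Hypothetical test case: one edge held at 100 V, the other three at 0 V.
grid, sweeps = solve_laplace(lambda i, j: 100.0 if i == 0 else 0.0, n=5, tol=1e-6)
print(f"center value {grid[2][2]:.3f} V after {sweeps} sweeps")
```

The `initial` argument makes it easy to carry out the comparison asked for in part b, since the same boundary values can be relaxed from different starting guesses and the sweep counts compared.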

21-49. Laplace's equation, II. Run the Laplace's equation
computer program as in Exercise 21-48 until you obtain
convergence to three decimal places. Record your results.
Then repeat the calculation with a grid in which the
spacing of the grid points is halved along both the x and y
axes. Record your results, but only at the positions of the
initial set of grid points. Continue this process, repeatedly
halving the grid point spacing, until the calculation converges
with respect to grid point spacing to three-decimal-place
accuracy. When the convergence is
achieved, the results of the numerical calculations will be
identical to this accuracy with the results of analytical calculations.

21-50.  Laplace’s  equation,  III. 

a. Apply the Laplace's equation computer program
written in Exercise 21-48 to find values of V at points inside
a set of three adjacent long electrodes extending in
the z direction, whose intersections with the xy plane form
a triangle with two perpendicular sides of equal length.
The values of V on the two perpendicular electrodes are
V = 0 and V = +1 V. The value on the third electrode is
V = +2 V.

b. Explain briefly the difficulty that arises if the triangle
does not have two perpendicular sides of equal length,
and what you would do to handle the difficulty. Laplace's
equation does not have analytical solutions for the electrode
systems considered in either part of this exercise.




22

Steady Electric Currents

22-1 ELECTROMOTIVE FORCE AND ITS SOURCES

Until 1800, the only way to produce the charge transfer necessary to observe
electrical phenomena involved the use of friction. But the electric
force is quite strong. Thus the amount of charge which can be put on an
electrode is limited by the rapid buildup of an electric potential (or potential
for short) on the electrode which prevents further accumulation of like
charge.

Consider a specific example. Because of the strength of the electric
force, the capacitance of practical capacitors is rather small. That is, the
plates of a typical capacitor having a large electric potential difference (or
potential difference for short) carry rather small amounts of charge.
Therefore, when a capacitor is discharged by connecting its electrodes, the
flow of charge is either brief or weak, or both.

The invention in 1800 of the voltaic cell changed all this. The familiar
dry cell and mercury cell are forms of a voltaic cell. A battery is a series of
connected voltaic cells. Such devices make it possible to deal with very large
quantities of charge flowing steadily over quite small potential differences.
Today there are many other ways of driving such a steady flow, but the voltaic
cell still has great practical importance. In addition to its widespread
usefulness, moreover, the voltaic cell provides one of the essential links
between the sciences of physics and chemistry, a point to which we return
briefly later.

Fig. 22-1 Schematic drawing of the van de Graaff generator, used to
produce very high potential differences. A hollow spherical electrode
is supported and insulated from the ground by ceramic insulators.
Near the ground, the endless belt passes between a set of wires on one
side and a metal plate on the other side. An electronic device separates
charge and this produces a potential difference of several thousand
volts between the wires and the plate. As a result, a continuous electric
discharge (rather like a steady but very weak lightning discharge) takes
place, which passes through the belt. Electric charge is thus “sprayed”
onto the nonconducting belt, which carries it upward. The motion of
the belt then transfers this charge to the hollow electrode. Until the
charge is inside the electrode, it experiences a strong repulsive force
due to the charge already on the electrode. Work must therefore be
done to move the charge. As is discussed in the text, the energy required
to perform this work is supplied by the mechanical device
(usually an electric motor) which turns the belt. Once the charge on
the belt passes inside the electrode, the electrode exerts no further
force on it, for reasons explained at the end of Chap. 20. As the
charges on the belt come into contact with the metal wires brushing
over the belt inside the hollow electrode, the repulsion among them
causes some of them to leave the belt and flow through the wires to the
outer surface of the electrode. The longer the process continues, the
more charge is accumulated on the electrode. The potential difference
between the hollow electrode and the earth thus increases until a lightninglike
discharge takes place through the air between them, or along
the surface of one or more of the insulating supports. When the magnitude
of the charge transferred from the earth to the electrode is |q|,
the potential difference V between the two is given by V = |q|/C, where
C is the capacitance of the system. If the hollow electrode is well insulated
from the earth, |q| can be made large enough (before discharge
occurs) to result in a value of V equal to several million volts.


The establishment of net electric charge in any region of space requires
that the charge be separated from charges of opposite sign. (This
happens, for example, in the charge-transfer processes discussed in Chaps.
20 and 21.) A device separating electric charges must do work on those
charges in some way, in order to overcome the electric forces which oppose
the separation. That is, the electric potential energy of a system comprising
equal amounts of positive and negative charge can be increased
only by doing work on the charges. How this work is done on the charges
depends on the particular device employed. The general features of the
process are especially apparent in the case of the van de Graaff generator,
depicted in Fig. 22-1. As is explained in the caption, the separated electric
charge is transferred from one electrode (the earth) to another (the large,
hollow metal sphere) on a moving belt made of a nonconducting material.
This process produces a large potential difference between the electrodes
and thereby gives the system an appreciable amount of electric potential
energy. Because the charges on the belt move against a repulsive electric
force, work must be done to make the belt move. The source of the mechanical
energy expended in doing this work is the macroscopic device
driving the belt. In principle, it could be a steam engine, in which case
the electric potential energy of the van de Graaff generator is created at the
expense of thermal energy. In practice, the source of the mechanical energy
is an electric motor, so that the mechanical energy required to drive
the belt, and thus create electric potential energy, is itself produced at the
expense of electromagnetic energy.
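
To get a feeling for the magnitudes involved, the relation V = |q|/C quoted in the caption to Fig. 22-1 can be combined with a rough model of the electrode. The sketch below assumes, purely for illustration, that the hollow electrode behaves as an isolated sphere with C = 4πε₀R. The radius and potential in the usage lines are hypothetical values, not data from the text.

```python
import math

EPSILON_0 = 8.854e-12  # permittivity of free space in C^2/(N m^2), rounded


def electrode_charge_and_energy(radius_m, potential_v):
    """Estimate C, |q|, and stored energy for a spherical electrode.

    Assumes the electrode acts as an isolated sphere, C = 4 pi eps0 R,
    which ignores the belt, the supports, and the nearby ground.
    """
    capacitance = 4 * math.pi * EPSILON_0 * radius_m
    charge = capacitance * potential_v       # from V = |q| / C
    energy = 0.5 * charge * potential_v      # work done in charging the electrode
    return capacitance, charge, energy


# Hypothetical values: a 0.5-m electrode raised to 2 million volts.
c, q, u = electrode_charge_and_energy(0.5, 2.0e6)
print(f"C = {c:.2e} F, |q| = {q:.2e} C, U = {u:.0f} J")
```

Even at millions of volts the charge is a small fraction of a coulomb, which is consistent with the earlier remark that practical capacitances are rather small.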

In the case of the voltaic cell, the source of the energy required to separate
positive and negative charges, and to place a net positive charge on one
electrode and a net negative charge on the other, is less evident on casual
inspection because it is microscopic. The energy arises from the breaking
and making of chemical bonds in the course of the chemical reactions that
take place between the electrodes and the fluid (called the electrolyte) in
which they are immersed. In other systems, the energy converted into electric
potential energy initially may be in still other forms.


22-1  Electromotive  Force  and  Its  Sources  1013 




Fig.  22-2  General  representation  of  a 
source  of  electromotive  force.  By  means 
of  some  unspecified  mechanism,  the  de- 
vice represented  by  the  shaded  rect- 
angle separates  positive  and  negative 
charges.  As  a result,  there  is  a potential 
difference  V between  terminals  A and  C. 
Terminal  A,  called  the  anode,  or  positive 
terminal,  is  at  the  higher  potential,  and 
terminal  C,  called  the  cathode,  or  nega- 
tive terminal,  is  at  the  lower  potential. 


Figure  22-2  illustrates  in  a very  general  way  a device  which  converts 
some  other  form  of  energy  into  electric  potential  energy  by  driving  apart 
positive  and  negative  charges.  The  device,  whose  details  do  not  concern  us 
here,  is  represented  as  a rectangle.  It  has  two  terminals,  shown  at  opposite 
sides  of  the  rectangle.  There  is  a potential  difference  V between  the  terminals 
resulting  from  the  (unspecified)  process  taking  place  inside  the  device.  The 
terminal  having  the  higher  potential  of  the  two,  labeled  A,  is  called  the 
anode  and  is  usually  denoted  by  the  symbol  +.  The  terminal  having  the 
lower  potential,  labeled  C,  is  called  the  cathode  and  is  usually  denoted  by 
the symbol −.

If  a small  positive  test  charge  q is  transported  through  the  device  from 
the  cathode  to  the  anode,  the  electric  potential  energy  of  the  system  must 
be  increased  by  the  amount  qV.  The  work  per  unit  charge  required  to  do  this 
must  be  performed  by  the  device  and  is  called  the  electromotive  force. 
The  value  of  the  electromotive  force  is  thus  given  by  the  equation 


$$\text{electromotive force} = \frac{qV}{q} = V \tag{22-1}$$


The  device  itself  is  called  a source  of  electromotive  force.  It  maintains  an 
electric  potential  difference  between  its  terminals  in  a manner  analogous  to 
that  in  which  a water  pump  maintains  a pressure  difference  between  its 
inlet  and  outlet  pipes. 

There  are  two  important  points  to  be  remembered  about  the  electro- 
motive force.  The  first  has  to  do  with  the  name  itself,  whereas  the  second 
has  to  do  with  the  connection  between  the  electromotive  force  and  the  elec- 
tric potential  difference  associated  with  it: 


1. The word “force” is not used in the precise sense to which it is
restricted in modern scientific terminology. Rather, it is used in the
everyday sense meaning “driving influence.” When the term “electromotive
force” was first coined, about 150 years ago, physical terms were not always
used as precisely as today. Both because of the inaccuracy of the terminology
and because “electromotive force” is a long and awkward term, the
abbreviation emf, derived from the first letters of the main components of
the term, is used almost universally in the English-speaking world. (The
letters are pronounced separately, as e-m-f.) The significance of the word
“electromotive,” or “producing motion of electricity,” arises from the fact
that sources of emf are used most often to drive electric currents.

2.  While  the  emf  is  numerically  equal  to  an  electric  potential  dif- 
ference and  is  measured  in  units  of  volts,  just  as  a potential  difference,  the 
emf  is  not  itself  a potential  difference.  The  emf  produces  a potential  dif- 
ference, but  arises  from  physical  phenomena  which  are  not  necessarily 
electrical  in  nature.  A more  important  distinction  between  an  emf  and  a 
potential  difference  arises  from  the  fact  that  emf  represents  not  potential 
difference,  but  work  done  per  unit  charge.  This  work  need  not  be  done  by 
a conservative  force,  whereas  a potential  difference  can  be  defined  only  for 
a conservative  force. 


The  voltaic  cell  is  an  important  example  of  a source  of  emf.  A detailed 
phenomenological  description  of  the  operation  of  the  voltaic  cell  will  be 
found  in  any  elementary  chemistry  text.  By  means  of  the  process  sketched 
in  Sec.  15-5,  chemical  reactions  provide  the  “chemical  energy”  required  to 


1014  Steady  Electric  Currents 




Fig. 22-3 (a) A voltaic cell. Two electrodes made of different electrically conducting
substances (here shown to be copper and zinc) are immersed in a conducting solution called the
electrolyte (here shown to be a mixture of copper sulfate and zinc sulfate in water). Some of
the metal atoms from each electrode ionize, going into solution as positively charged ions,
each of which leaves one or more electrons behind on the electrode. As a result, each electrode
acquires a negative potential relative to the electrolyte. For each electrode, the magnitude of
the potential is determined by the energy available per atom for the specific ionization
reaction involved. Since this energy is different for each of the two electrodes, a potential
difference exists between the electrodes, as suggested by the diagram of part (b). If the electrodes
are linked by an external conductor, electrons will flow from low potential to high potential. In
the  cell  shown,  the  electrons  will  combine  with  copper  ions  at  the  anode-electrolyte  interface, 
and  copper  metal  will  plate  out  on  the  anode.  At  the  same  time,  removal  of  the  electrons  from 
the  zinc  cathode  enables  an  “equal  amount”  of  zinc  metal  to  ionize  and  go  into  solution  in  the 
electrolyte.  (Can  you  give  a precise  meaning  to  the  term  “equal  amount”?)  The  process  will 
continue  until  either  all  the  zinc  is  dissolved  or  all  the  copper  ions  in  the  electrolyte  have  been 
plated  out. 


separate  electrons  from  atoms.  This  process,  called  ionization,  produces  an 
electric  potential  difference  V between  the  terminals.  A schematic  diagram 
of  a voltaic  cell  is  shown  in  Fig.  22-3,  and  a brief  explanation  of  its  opera- 
tion is  given  in  the  caption.  For  the  purposes  of  this  discussion,  it  suffices  to 
note  the  following  basic  principles,  which  are  founded  on  chemical  obser- 
vations: 

1. Every chemical reaction involving 1 kmol of a substance requires
the transfer of $|v|$ kmol of electrons between that substance and some other
substance. The quantity $v$ is a small positive or negative integer called the
valence of the substance, and most often it lies between -4 and +4. The
overall chemical reaction which drives the voltaic cell, shown in Fig. 22-3a,
for instance, can be carried out simply by dropping powdered zinc into a
solution of copper sulfate. Zinc-metal (Zn) atoms lose two electrons ($|v| = 2$)
and become zinc ions (Zn$^{2+}$) in solution. At the same time, copper ions
(Cu$^{2+}$) in solution acquire two electrons ($|v| = 2$) and come out of solution
as copper-metal (Cu) atoms. This exchange of electrons can be written as

$$\mathrm{Zn} + \mathrm{Cu}^{2+} \rightarrow \mathrm{Zn}^{2+} + \mathrm{Cu}$$

2.  The  structure  of  the  voltaic  cell  is  such  that  the  necessary  transfer  of 
electrons  takes  place  through  an  external  connection.  The  electrons  are 
propelled  by  the  electric  force  arising  from  the  potential  difference  V 
between  the  terminals  of  the  cell.  The  magnitude  of  the  charge  transferred 
from cathode to anode with 1 kmol of electrons is $Ae$, where $A = 6.022 \times 10^{26}$
is Avogadro’s number, defined in Sec. 17-4, and $e$ is the magnitude of
the electron charge. The quantity $Ae$ is called Faraday’s constant $\mathcal{F}$. That
is, the quantity

$$\mathcal{F} = Ae \tag{22-2}$$

is the magnitude of the total electric charge on 1 kmol of electrons. To four
significant figures, $\mathcal{F}$ has the value

$$\mathcal{F} = 6.022 \times 10^{26}\ \text{electrons/kmol} \times 1.602 \times 10^{-19}\ \text{C/electron}$$

or

$$\mathcal{F} = 9.649 \times 10^{7}\ \text{C/kmol} \tag{22-3}$$

Faraday’s  constant  can  be  measured  by  carrying  out  an  electrochemical 
reaction  involving  n kmol  of  a substance  of  known  valence  v.  If  the  magni- 
tude of  the  total  charge  transferred  through  the  external  connection  is 
measured to be $|q|$, we have

$$|q| = n|v|\mathcal{F} \tag{22-4a}$$

or

$$\mathcal{F} = \frac{|q|}{n|v|} \tag{22-4b}$$


If any two of the quantities $\mathcal{F}$, $A$, and $e$ are known, the third is
determined by Eq. (22-2), $\mathcal{F} = Ae$. As mentioned in Sec. 20-2, the value of $e$ was
first estimated in 1874 by G. Johnstone Stoney from the values of Faraday’s
constant and Avogadro’s number. His value, $e = 1 \times 10^{-20}$ C, was flawed
by the very poor estimates of Avogadro’s number then available. The first
accurate evaluation of $A$ was made possible by Millikan’s determination of
$e$. Today there are much more precise methods of measuring the electron
charge $-e$, but Eq. (22-2) remains an important means of determining
Avogadro’s number.

3.  A system  in  which  a chemical  reaction  takes  place  often  may  be 
regarded  as  an  isolated  system.  In  the  course  of  the  chemical  reaction,  the 
potential  energy  of  the  system  always  changes,  for  the  reasons  discussed  in 
Sec.  15-5.  If  the  potential  energy  decreases,  the  reaction  is  called  exothermic 
(that  is,  heat-releasing)  because  the  energy  is  most  commonly  transferred  to 
the  outside  world  in  the  form  of  heat  energy.  The  chemical  reactions  used 
in  voltaic  cells  are  exothermic.  But  in  a voltaic  cell,  most  of  the  energy  is  not 
released  directly  in  the  form  of  heat.  Rather,  it  appears  largely  in  the  form 
of  the  electric  potential  energy  possessed  by  the  system  because  it  contains 
separated  positive  and  negative  charges.  This  energy  is  converted  to  still 
another  form  of  energy  as  the  charge  is  propelled  through  the  external 
connection  from  one  terminal  to  the  other  by  the  electric  force  arising  from 
the  potential  difference  between  the  terminals.  If  the  potential  difference 
has magnitude $|V|$ and a charge of magnitude $|q|$ is transferred via the
external connection, the transfer involves a decrease of magnitude $|qV|$ in the
electric  potential  energy  of  the  system.  It  is  this  energy,  sometimes  called 
electric  energy,  which  is  used  in  the  enormous  number  of  different  ways 
which  give  electricity  so  much  of  its  practical  importance. 
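
The numerical value of Faraday's constant in Eq. (22-3) can be checked with a few lines of Python. The sketch below is only an illustration: the function name `faraday_constant` and the constant names are hypothetical, and the "exact" values are the SI defining constants (adopted in 2019, so they are not taken from this text), expressed per kilomole. Multiplying the four-figure values quoted in the text gives about 9.647 × 10^7 C/kmol, whereas the more precise inputs reproduce the quoted 9.649 × 10^7 C/kmol when rounded to four figures.

```python
# Values rounded to four significant figures, as quoted in the text.
A_ROUNDED = 6.022e26        # Avogadro's number, particles per kmol
E_ROUNDED = 1.602e-19       # magnitude of the electron charge, C

# Assumption: SI defining constants (2019), rescaled to kilomoles.
A_EXACT = 6.02214076e26     # particles per kmol
E_EXACT = 1.602176634e-19   # C


def faraday_constant(avogadro_per_kmol: float, electron_charge: float) -> float:
    """Return F = A * e in C/kmol, following Eq. (22-2)."""
    return avogadro_per_kmol * electron_charge


if __name__ == "__main__":
    print(f"rounded inputs: {faraday_constant(A_ROUNDED, E_ROUNDED):.4e} C/kmol")
    print(f"precise inputs: {faraday_constant(A_EXACT, E_EXACT):.4e} C/kmol")
```

The small difference between the two results is a rounding effect in the inputs, not a disagreement about the physics.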


The connection among the amount of matter undergoing chemical
reaction, the amount of electric charge transferred, and the amount of
energy released by the system to the outside world is considered in Example
22-1.




EXAMPLE  22-1 




You have a voltaic cell like that shown in Fig. 22-3a, made with electrodes of zinc
and copper. You remove the zinc electrode from the cell, weigh it, and replace it in
the cell. You measure the potential difference between the terminals attached to the
electrodes and find its magnitude to be $|V| = 1.1$ V. You place the entire system in a
calorimeter. Next, you “short-circuit” the cell by connecting a thick copper wire
between the terminals. Using the calorimeter, you measure the heat evolved by the
system. (In particular, you note that the wire becomes quite hot, but the cell
becomes warm as well.) After some time, you remove the cell from the calorimeter,
disconnect the wire, and weigh the zinc electrode again. You find that its mass has
decreased by 1.3 g. This zinc goes into solution in the electrolyte in the course of the
chemical reaction. Assume that the emf of the cell did not change appreciably
during the process and that the valence of the zinc in the chemical reaction taking
place in the cell is $v = 2$.

a.  How  much  charge  q has  been  transferred  from  one  electrode  to  the  other? 
The  molecular  weight  of  zinc  is  65. 

■ In  order  to  use  Eq.  (22-4a)  in  finding  q,  you  must  first  find  the  number  of  ki- 
lomoles  n of  zinc  which  have  gone  into  solution.  By  definition,  the  molecular  weight 
of  zinc  is  the  mass,  in  units  of  kilograms,  of  1 kmol  of  zinc.  Thus  1 kmol  of  zinc  has  a 
mass  of  65  kg,  and  you  have 


$$n = \frac{1.3 \times 10^{-3}\ \text{kg}}{65\ \text{kg/kmol}} = 2.0 \times 10^{-5}\ \text{kmol}$$


Since the valence of zinc in the reaction is $v = 2$, Eq. (22-4a) gives the amount of
electric charge transferred from one electrode to the other as

$$|q| = n|v|\mathcal{F} = 2.0 \times 10^{-5}\ \text{kmol} \times 2 \times 9.6 \times 10^{7}\ \text{C/kmol}$$

or

$$|q| = 3.8 \times 10^{3}\ \text{C}$$

This  is  millions  of  times  greater  than  the  amount  of  charge  transferred  from  one 
electrode  to  the  other  in  the  discharge  of  a typical  capacitor.  ■ 

b. Using the electrochemical information you have just acquired, predict the
heat energy output $-\Delta H$ you expect to find when you make the calorimetric
measurements.

■ The charge $q$ is propelled through the wire across a potential difference $V =
1.1$ V. In this process, which is exothermic, the electric potential energy lost has
magnitude

$$|qV| = 3.8 \times 10^{3}\ \text{C} \times 1.1\ \text{V} = 4.2 \times 10^{3}\ \text{J}$$

According to the principle of energy conservation, $|qV|$ must be equal to the heat
output $-\Delta H$, since there is no other form of energy into which the electric potential
energy can have been converted. So you have

$$-\Delta H = 4.2 \times 10^{3}\ \text{J}$$

This  energy  is  equivalent  to  approximately  1 kcal.  Thus  the  chemical  energy  re- 
leased in  the  reaction  of  a little  more  than  1 g of  zinc  can  heat  about  1 kg  of  water 
through  1°C.  This  is  the  case  regardless  of  whether  the  chemical  energy  is  con- 
verted directly  to  heat  energy  in  the  same  chemical  reaction  carried  out  directly  or 
by  carrying  out  the  reaction  in  a voltaic  cell  where  you  first  convert  the  chemical  en- 
ergy to  electric  energy  and  then  to  heat  energy.  How  long  would  a 1-W  flashlight 
bulb  run  on  the  electric  energy  produced? 
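
The arithmetic of Example 22-1 can be retraced in a short Python sketch. It is a minimal illustration under the example's own assumptions (constant emf, valence 2, and the two-figure value of Faraday's constant); the function names `charge_transferred` and `heat_released` are hypothetical. The last line also evaluates the closing question by dividing the energy by a power of 1 W, which assumes all of the electric energy is delivered to the bulb.

```python
FARADAY = 9.6e7           # C/kmol, two-figure value used in the example
ZINC_KG_PER_KMOL = 65.0   # mass of 1 kmol of zinc, as stated in the example


def charge_transferred(mass_kg: float, kg_per_kmol: float, valence: int) -> float:
    """Eq. (22-4a): |q| = n |v| F, with n = mass / (mass per kmol)."""
    n_kmol = mass_kg / kg_per_kmol
    return n_kmol * abs(valence) * FARADAY


def heat_released(charge_c: float, potential_v: float) -> float:
    """Heat output equal to |qV|, the electric potential energy lost."""
    return abs(charge_c * potential_v)


if __name__ == "__main__":
    q = charge_transferred(1.3e-3, ZINC_KG_PER_KMOL, 2)   # 1.3 g of zinc dissolved
    heat = heat_released(q, 1.1)                           # cell emf of 1.1 V
    print(f"|q| = {q:.2g} C")
    print(f"heat output = {heat:.2g} J")
    print(f"running time of a 1-W bulb = {heat / 1.0:.2g} s")
```

Carrying unrounded intermediate values gives a charge of about 3.8 × 10^3 C and a heat output of about 4.2 × 10^3 J, in agreement with the two-figure results of the example.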


If a source of emf did nothing more than move charge from one terminal
to the other, the two terminals would have net charges of opposite
sign and equal magnitude. This is the reason for the standard plus and
minus symbols used for the anode and the cathode, respectively. It is not
necessarily true, however, that the terminals have net charges of opposite
sign. (In the voltaic cell, for instance, the terminals themselves are
electrically neutral, or nearly so. And both terminals are electrically connected to
electrodes which have net negative surface charges where they are in
contact with the electrolyte.) The significant point is that the charge distribution
through the source of emf is nonuniform in such a way that there is a potential
difference between the terminals.


22-2 FLOW OF ELECTRIC CHARGE AND ELECTRIC CURRENT


Fig. 22-4 An electric circuit, consisting
of a source of emf whose terminals are
connected externally by a conductor.
The direction $\hat{\mathcal{E}}$ of the local electric
field is shown at several locations inside
and outside the source of emf. The
arrowheads describing the electric
potential difference V between the
terminals of the source of emf denote the
directions of pathways along which the
electric potential increases.


In  the  presence  of  an  externally  imposed  electric  field,  the  charges  within  a 
conductor  experience  electric  forces.  And  since  some  of  the  charges  within 
a conductor  are  mobile,  there  is  a flow  of  charge.  This  flow  will  persist  until 
the  buildup  of  excess  charge  on  some  parts  of  the  surface  of  the  conductor, 
and  the  corresponding  deficiency  of  charge  on  other  parts,  leads  to  the 
buildup  of  an  internal  depolarizing  field  within  the  conductor,  due  to  the 
separation  of  charge,  which  exactly  cancels  the  externally  imposed  field.  If 
an  isolated  conductor  is  placed  in  an  externally  imposed  electric  field,  the 
charge  redistribution  is  very  rapid,  and  thus  the  charge  flow  persists  only 
briefly. 

But  an  electric  field  can  be  maintained  across  a conductor  indefinitely 
by  connecting  two  points  on  it  (say,  at  its  ends)  to  the  terminals  of  a source 
of  emf  such  as  a battery,  as  shown  in  Fig.  22-4.  When  charge  flows  through 
the  conductor  toward  one  of  the  terminals  to  which  it  is  connected,  it  does 
not build up on the surface of the conductor so as to result in a
depolarizing field which brings an end to the charge flow. Rather, the source of
emf  “pumps”  the  charge  through  itself,  performing  work  on  the  charge  as 
it  does  so.  In  the  absence  of  the  “pumping,”  the  potential  difference  V 
between  its  terminals  would  not  remain  constant,  but  would  diminish  as  the 
flowing  charge  built  up  and  thus  imposed  a depolarizing  field  across  the 
source  of  emf  as  well  as  across  the  conductor  to  whose  ends  it  is  connected. 

To  put  it  another  way,  the  electric  potential  difference  across  a source 
of  emf  implies  the  existence  of  an  electric  field  within  the  source.  Since  the 
ends  of  the  conductor  are  in  contact  with  the  terminals  of  the  source  of 
emf,  the  same  electric  potential  difference  V must  exist  across  both  con- 
ductor and  source.  (Remember  that  the  potential  difference  between  two 
points  must  be  independent  of  the  path  taken  between  them.)  The  electric 
potential  difference  across  the  conductor  is  therefore  not  zero,  and  there 
must  be  an  electric  field  within  the  conductor.  As  long  as  this  electric  field 
persists,  charge  flow  through  the  conductor  will  continue.  The  net  result  is 
a continual  flow  of  electric  charge  around  a closed  pathway.  Such  a system, 
consisting  of  a source  of  emf  and  an  external  conducting  path  between  its 
terminals,  is  called  an  electric  circuit,  or  simply  a circuit. 


In Sec. 16-7 we described the flow of ordinary fluids in terms of the
mass flux $\Phi_m$. A very similar mathematical description can be used for the
flow of electric charge. However, it is developed independently here. Figure
22-5  is  almost  the  same  as  Fig.  16-18.  It  depicts  a tube  of  flow — the  bundle 
of  paths,  or  streamlines,  along  which  fluid  passes.  The  cross-sectional  area 
of a tube of flow may vary. This may happen in the case of water flow
because of a variation in the size of the pipe. In the case of flow of electric
charge, the same effect may result from a variation in the size of the wire
carrying the flowing charge.


Fig. 22-5 A tube of flow of electric charge. The streamlines, called
current lines, are denoted by dashed curves. Their sense is
conventionally from higher to lower electric potential, and they are
everywhere tangent to the local electric field vector $\boldsymbol{\mathcal{E}}$. However, the
charge motion has the same sense as the current only if the charge is
positive. Negative charge moves in the opposite sense, from lower to
higher electric potential. By definition of a tube of flow, no charge
passes through the walls of the tube. Since the tube contains no
sources or sinks of charge, the current passing through the surface
M must be equal in the steady state to the current passing through
the surface N.


For  water  flow,  we  can  draw  an  imaginary  surface  across  the  tube  of 
flow  at  any  location  and  use  Eq.  (16-35)  to  define  the  mass  flux 


$$\Phi_m = \frac{dm}{dt} \tag{22-5}$$


The mass flux is thus the mass $m$ of fluid which passes across the surface
per unit time $t$. We do a completely analogous thing for the flow of electric
charge in a conductor. The electric charge flux is almost always called
electric current, defined to be the amount of electric charge $q$ passing per unit
time through an imaginary surface (such as M in Fig. 22-5). The symbol $i$ is
universally used for electric current. Thus we have, by definition,


$$i = \frac{dq}{dt} \tag{22-6}$$


The unit of electric current must be coulombs per second. This very
important unit is given the name ampere (A) after the French physicist André
Marie Ampère (1775-1836), who was one of the founders of the theory of
electromagnetism. The ampere is thus related to the coulomb by the
expression

1 A = 1 C/s  (22-7) 
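
As a rough illustration of Eqs. (22-6) and (22-7), the Python sketch below computes an average current from a charge and a time interval, and the number of electron charges per second that corresponds to a given current. The function names are hypothetical. The charge of 3.8 × 10^3 C is taken from Example 22-1, but the one-hour duration is an assumed value, since the example says only that the cell was short-circuited for "some time".

```python
ELECTRON_CHARGE = 1.602e-19   # C, magnitude of the electron charge


def average_current(charge_c: float, time_s: float) -> float:
    """Average current i = q / t in amperes, since 1 A = 1 C/s."""
    if time_s <= 0:
        raise ValueError("time interval must be positive")
    return charge_c / time_s


def electrons_per_second(current_a: float) -> float:
    """Number of electron charges crossing a surface each second."""
    return abs(current_a) / ELECTRON_CHARGE


if __name__ == "__main__":
    # Assumed duration of one hour (3600 s); only the charge comes from the text.
    i = average_current(3.8e3, 3600.0)
    print(f"average current = {i:.2f} A")
    print(f"electron charges per second at 1 A = {electrons_per_second(1.0):.3e}")
```

A current of 1 A thus corresponds to roughly 6 × 10^18 electron charges passing through the surface every second.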

The  current  i is  a signed  scalar.  We  often  have  used  the  term  “signed 
scalar”  to  denote  a vector  in  one  dimension.  But  the  current  is  not  a vector, 
even  when  it  describes  the  flow  of  charge  in  a long,  thin  wire.  This  is  be- 
cause of  an  essential  aspect  of  any  flow  of  charge:  it  must  always  flow  in  an 
electric  circuit,  or  closed  pathway.  Such  a closed  pathway  cannot  exist  in 
one  dimension;  at  best,  the  long,  thin  wire  is  only  a part  of  a complete  cir- 
cuit. Nevertheless,  a current  must  always  have  a sense,  and  it  is  this  sense 
that is denoted by the sign of $i$. Even in the simplest loop, consisting of a
single source of emf whose terminals are joined by an external wire, charge
must flow one way or the other. For currents whose sense does not change with
time, called direct currents, it is conventional to take the positive sense of current as
that of a pathway around the circuit which would be followed by a positive test charge
that is free to move through the circuit. At any point in the circuit not located
inside a source of emf, such a test charge would move in the direction
specified by the unit vector $\hat{\mathcal{E}}$ having the direction of the local electric field
vector $\boldsymbol{\mathcal{E}}$. Within a source of emf, the physical mechanism which produces
the  emf  drives  the  positive  test  charge  in  the  sense  opposite  to  that  of  the 
local  electric  field.  If  you  refer  to  Fig.  22-4,  you  will  see  that  this  leads  to  a 
sense  for  the  current  which  is  consistent  throughout  the  circuit.  (For  cur- 
rents whose  sense  does  change  with  time,  called  alternating  currents,  it  is  con- 
venient to  use  a different  convention  in  defining  the  positive  sense  of  the 
current.  This  convention  is  introduced  in  Chap.  26,  where  alternating  cur- 
rents are  treated  for  the  first  time.) 

The  tube  of  flow  in  Fig.  22-5  may  be  regarded  as  a segment  of  an  elec- 
tric circuit.  Some  of  the  many  possible  pathways  which  a positive  test 
charge  might  take  through  the  tube  of  flow,  in  its  journey  around  the  cir- 
cuit, are  shown  as  dashed  lines.  These  pathways  are  entirely  analogous  to  the 
streamlines  used  in  describing  fluid  flow,  and  they  all  have  the  same  sense. 
When  streamlines  are  used  to  describe  electric  currents,  they  are  called 
current  lines.  In  the  absence  of  places  where  additional  electric  charge  can 
enter  or  leave  the  region  MN,  all  the  current  lines  which  pass  through  M in 
the  figure  must  also  pass  through  N in  the  steady  state.  The  current  is  the 
same  at  M as  at  N,  and  by  our  convention  the  sense  of  the  current  is  the 
same at M and N as well. Thus for steady currents we can equate the
current $i_M$ passing through M with the current $i_N$ passing through N. This gives
us the equation

$$i_M = i_N \tag{22-8}$$

which  is  the  electrical  form  of  the  continuity  equation  discussed  in  Chap. 
16. 


The analogous equation for fluid flow, Eq. (16-37), is $-\Phi_M = \Phi_N$. The
difference in sign between the two equations is due to the fact that different
conventions are used to define the sense of flow in the two cases.


In analogy to the case of ordinary fluid flow, we can reexpress Eq.
(22-6) in terms of the mobile charge density $\rho_q$, which is the total mobile
electric charge $q$ per unit volume. We define

$$\rho_q \equiv \frac{q}{\text{volume occupied by charge } q} \tag{22-9}$$


Both the magnitude and the sign of the mobile charge density depend on
the particular conducting material under consideration. In the interest of
simplicity, we begin by restricting our attention to a material containing
only one type of mobile charge, whose sign is positive. (Many of the
so-called p-type semiconductors are of this sort.) Later in this section, we
consider materials (such as most common metals) in which the mobile charges
are electrons, whose sign of charge is negative. In Chap. 23, we
generalize to the situation where both positive and negative mobile charges are
present (as is the case in many solutions, such as salt in water). For the
moment, then, we require that $\rho_q$ be positive. Nevertheless, the material
displays zero net charge because of the presence of an equal density of
immobile charges of the opposite sign.

We  also  assume  that  the  density  of  the  mobile  charges  is  so  great  that 
they  may  be  considered  a continuous  fluid  from  the  macroscopic  point  of 
view.  This  is  analogous  to  the  way  in  which  we  have  considered  a large 
number  of  molecules  as  comprising  the  special  kind  of  fluid  called  a gas.  In 
this  case,  however,  the  mobile  electric  fluid  is  nearly  incompressible,  in  con- 
tradistinction to  gases  which  are  highly  compressible.  The  reason  for  this 
lies  in  a slight  extension  of  the  argument  made  in  Sec.  20-2  concerning  the 
way  in  which  mobile  charges  distribute  themselves  within  an  array  of  im- 
mobile charges  of  opposite  sign.  Precisely  because  the  immobile  charges  are 
immobile,  their  charge  density  is  fixed.  If  a conductor  is  in  an  uncharged 
state, the density $\rho_q$ of the mobile charges must everywhere be equal in
magnitude  and  opposite  in  sign  to  that  of  the  immobile  charges.  Any  local 
fluctuation  from  this  condition  will  bring  strong  local  electric  fields  into 
play.  As  a consequence,  mobile  charge  will  flow  into  or  out  of  the  region  of 
imbalance  in  such  a way  as  to  restore  overall  local  electrical  neutrality. 

This state of affairs can be disturbed by the imposition of an external
electric field. In all ordinary conductors, however, the externally imposed
electric field required to drive a substantial current is so small as to be
negligible compared to the internal electric fields produced by even very slight
local deviations of the mobile charge density from its equilibrium value $\rho_q$.
Thus,  in  general,  a conductor  is  everywhere  electrically  neutral  even  when  it 
carries  a substantial  electric  current.  This  is  equivalent  to  the  statement  that  the 
electric  fluid  is  incompressible.  (Note  that  the  argument  just  completed  is 
valid  regardless  of  the  sign  of  the  mobile  charges.)  Excess  charge  can  and 
does  build  up  in  conductors  under  certain  circumstances — on  the  plate  of 
a capacitor,  for  example.  But  we  have  already  seen  at  the  beginning  of  Sec. 
22-1  that  the  process  is  severely  limited.  Typical  capacitances  are  small;  it 
takes  very  little  excess  charge  on  a capacitor  plate  before  the  opposing  po- 
tential brings  to  a stop  the  steady  current  flowing  through  a wire  attached 
to  the  capacitor. 


If the ends of a sample of a material having positive mobile charge are
connected to the terminals of a source of emf, there will be a potential
difference across the sample. Consequently, there will be at every location
inside the sample an electric field $\boldsymbol{\mathcal{E}}$. We show soon that under the combined
influence of this electric field, which may vary from place to place, and of
effects which are frictional in nature, the mobile electric fluid will move
with a velocity $\mathbf{v}$. This terminal velocity is called the drift velocity. It is
proportional to the electric field, and the acceleration of the electric fluid is
zero. (In studying the electric current, as in studying the flow of an
ordinary fluid, the ordered velocity of the fluid taken as a whole, and not the
random velocities of the individual particles of which it is composed, is of
interest.) Since the mobile charge is positive, the drift velocity is parallel to
the electric field at every point in the wire. (In this book we do not consider
the case of anisotropic materials, in which the directional relation between
$\mathbf{v}$ and $\boldsymbol{\mathcal{E}}$ is more complicated.)

Now consider the special case in which the conducting sample is a wire
of uniform cross-sectional area $a$. In this case, symmetry suggests that the
electric field $\boldsymbol{\mathcal{E}}$ is everywhere the same in magnitude and is directed along
the wire. We show below that the electric current (or charge flux) can be
written in the form

$$i = \rho_q a\,\mathbf{v}\cdot\hat{\mathcal{E}} \tag{22-10}$$

where $\hat{\mathcal{E}}$ is the unit vector in the direction of the electric field at any point in
the wire and $\mathbf{v}$ is the drift velocity at the same point. This equation is very
much like Eq. (16-39), which expresses the mass flux $\Phi_m$ in terms of the
mass density $\rho_m$ and the fluid flow speed $v$ in the form $\Phi_m = \rho_m a v$. The
slightly more complicated form of Eq. (22-10) allows for the later treatment
of charge flow in materials where the mobile charge is negative, so that the
current and the motion of charge are in opposite senses. For positive mobile
charge, however, $\mathbf{v}\cdot\hat{\mathcal{E}} = v$ because $\mathbf{v}$ is proportional to $\boldsymbol{\mathcal{E}}$ and the
proportionality constant has a positive value, so that $\mathbf{v}$ has the same direction as $\hat{\mathcal{E}}$.
Thus Eq. (22-10) becomes

$$i = \rho_q a v \quad \text{for } \rho_q > 0 \tag{22-11}$$

The derivation of Eq. (22-10) for positive mobile charge is as follows.
Consider an imaginary surface moving with the electric fluid. At a certain
moment, the moving surface passes through the stationary surface M in
Fig. 22-6. A very short time $dt$ later, it has moved through a displacement
$d\mathbf{s}$, whose magnitude is $d\mathbf{s}\cdot\hat{\boldsymbol{\mathcal{E}}}$. If $a$ is the area of the surface M, the volume
of fluid crossing M during the time interval $dt$ is $a\,d\mathbf{s}\cdot\hat{\boldsymbol{\mathcal{E}}}$. According to Eq.
(22-9), its total charge $dq$ is the product of the charge density $\rho_q$ and this
volume, and thus

$$dq = \rho_q a\,d\mathbf{s}\cdot\hat{\boldsymbol{\mathcal{E}}} \tag{22-12}$$

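Using the definition of electric current $i$ given by Eq. (22-6), we obtain

$$i = \frac{dq}{dt} = \rho_q a\,\frac{d\mathbf{s}}{dt}\cdot\hat{\boldsymbol{\mathcal{E}}} \tag{22-13}$$

Now note that the quantity $d\mathbf{s}/dt$ is the drift velocity $\mathbf{v}$ in the immediate
vicinity of M, so that $\mathbf{v} = d\mathbf{s}/dt$. The current $i$ can thus be written

$$i = \rho_q a\,\mathbf{v}\cdot\hat{\boldsymbol{\mathcal{E}}}$$

in agreement with Eq. (22-10).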

Fig. 22-6 Mobile electric charge flows steadily through a conductor of
arbitrary shape. The direction of the electric field is shown by the unit
vector $\hat{\boldsymbol{\mathcal{E}}}$. The cross-sectional area of the conductor is $a$. In an infinitesimal
time $dt$, an imaginary surface moving with the charge, and originally
coinciding with the fixed marker surface at M or N, moves through a
displacement $d\mathbf{s}$ to the position shown by the solid line. The drift velocity of
the mobile charge is $\mathbf{v} = d\mathbf{s}/dt$. Both the electric current $i$ and the density
of mobile charge $\rho_q$ are the same at M and N. Since $i = \rho_q a\,\mathbf{v}\cdot\hat{\boldsymbol{\mathcal{E}}}$, the
magnitude of $\mathbf{v}$ must be larger at N where the value of $a$ is smaller.

We now show that even though Eq. (22-10) was derived for the special
case of mobile charge having positive sign, it is valid also for a material in
which the mobile charges have negative sign and the positive charges are
immobile. Consider a material which differs from the material we have just
studied only in that the signs of all charges, both mobile and immobile, are
reversed. For this new material, the mobile charge density $\rho'_q$ is given by the
equation $\rho'_q = -\rho_q$. Since nothing is changed but the sign of the charges, the
same electric field imposed on a sample of the new material will result in
motion of the mobile electric fluid with the same drift speed as before, but
in the opposite direction. That is, the drift velocity $\mathbf{v}'$ is given by the
equation $\mathbf{v}' = -\mathbf{v}$. Thus for the new material, Eq. (22-10) can be written as

$$i = \rho'_q a\,\mathbf{v}'\cdot\hat{\boldsymbol{\mathcal{E}}}$$

or

$$i = -\rho_q a\,(-\mathbf{v}\cdot\hat{\boldsymbol{\mathcal{E}}}) = \rho_q a\,\mathbf{v}\cdot\hat{\boldsymbol{\mathcal{E}}} \tag{22-14}$$

The current in this material having negative mobile charge is the same,
both in magnitude and in sense, as the current in the similar material
having positive mobile charge. Reversing the sign of the mobile charge
results in a reversal of the sense of motion of the charge, but does not affect
the current. The usefulness of this fact becomes increasingly evident in this
and the following four chapters.
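
The sign argument leading to Eq. (22-14) can be checked numerically. The following is a minimal sketch, not part of the original text; the function name `current` and all numerical values are illustrative assumptions.

```python
# Sketch of Eq. (22-10): i = rho_q * a * (v . E_hat).
# The function name and every numerical value here are assumptions
# chosen only to illustrate the sign argument of Eq. (22-14).

def current(rho_q, area, v, e_hat):
    """Return i = rho_q * a * (v . e_hat); v and e_hat are 3-component tuples."""
    v_dot_e = sum(vc * ec for vc, ec in zip(v, e_hat))
    return rho_q * area * v_dot_e

e_hat = (1.0, 0.0, 0.0)      # unit vector along the electric field
area = 2.0e-6                # cross-sectional area in m^2 (assumed)
rho_q = 1.0e10               # magnitude of mobile-charge density in C/m^3 (assumed)
v = (5.0e-4, 0.0, 0.0)       # drift velocity in m/s for positive mobile charge

# Positive mobile charge drifting along the field.
i_positive = current(rho_q, area, v, e_hat)

# Same material with all signs reversed: rho_q -> -rho_q and v -> -v.
v_reversed = tuple(-c for c in v)
i_negative = current(-rho_q, area, v_reversed, e_hat)

print(i_positive, i_negative)
```

Both calls give the same current, in magnitude and in sense, because the two sign reversals cancel in the product.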

It is to be understood that in using Eq. (22-10) the quantities $a$, $\mathbf{v}$, and $\hat{\boldsymbol{\mathcal{E}}}$
must be evaluated at the same location. In the important particular case
where the conductor is homogeneous and has uniform cross-sectional area
(as is often the case for common electrical wiring), $a$ and $\mathbf{v}$ have constant
magnitudes throughout the conductor. Example 22-2 demonstrates the
application of Eq. (22-10) to an electric current flowing through a copper
wire. The mobile charge is negative in copper, as is demonstrated in Sec.
22-4.


EXAMPLE  22-2 

A no. 14 copper wire (a size in common use in households) is specified to have
cross-sectional area $a = 2.082 \times 10^{-6}\ \text{m}^2$. Find the magnitude and direction of the
drift velocity $\mathbf{v}$ when the wire is carrying its maximum rated current $i = 15$ A. The
mobile-charge density in copper is approximately $\rho_q = -1.3 \times 10^{10}\ \text{C/m}^3$. (You will
see in Example 22-6 how it is possible to estimate this charge density in metals and
in Sec. 23-3 how a quite accurate value can be obtained.)

■ Since $\rho_q$ has a negative value, the mobile charges are negative and move in a
direction opposite to that of the electric field. Thus the scalar product $\mathbf{v}\cdot\hat{\boldsymbol{\mathcal{E}}}$ in Eq.
(22-10) gives you

$$\mathbf{v}\cdot\hat{\boldsymbol{\mathcal{E}}} = -v$$

where the drift speed $v$ is the magnitude of the drift velocity $\mathbf{v}$. Hence you can write
Eq. (22-10) in the simplified form $i = -\rho_q a v$. Solving for $v$ and inserting the
numerical values given (with the area rounded to two significant figures), you have

$$v = -\frac{i}{\rho_q a} = -\frac{15\ \text{A}}{-1.3 \times 10^{10}\ \text{C/m}^3 \times 2.1 \times 10^{-6}\ \text{m}^2} = 5.5 \times 10^{-4}\ \text{m/s}$$

or a little more than 0.5 mm/s. The direction of motion is opposite to that of the
electric field. According to the definition of electric potential difference,
$V = -\int_{\text{path}} \boldsymbol{\mathcal{E}}\cdot d\mathbf{s}$, the overall sense of motion of negative charge through the
conductor is from lower to higher potential or, as it is often put loosely, “from
minus to plus.” The sense of the current $i$, however, is from higher to lower
potential, or “from plus to minus.”

The very small drift velocity calculated for a commonplace case in
Example 22-2 is characteristic of the drift velocities associated with electric
currents in metals. The mobile-charge density in metals has so great a
magnitude that the electric fluid need not move very fast in order to transport
electric charge at a significant rate. In other words, the electric fluid need
not move very fast in order for there to be a significant electric current.
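
The arithmetic of Example 22-2 can be reproduced in a few lines. This is only a sketch: the helper name `drift_speed` is an assumption, and the inputs are the values quoted in the example.

```python
# Drift speed from Eq. (22-10) for a uniform wire, where v is either
# parallel or antiparallel to the field, so that v = i / (|rho_q| * a).
# The helper name is hypothetical; the numbers come from Example 22-2.

def drift_speed(i, rho_q, area):
    """Magnitude of the drift velocity in m/s."""
    return i / (abs(rho_q) * area)

i = 15.0            # current in A
rho_q = -1.3e10     # mobile-charge density of copper in C/m^3
area = 2.082e-6     # cross-sectional area of no. 14 wire in m^2

v = drift_speed(i, rho_q, area)
minutes_per_meter = (1.0 / v) / 60.0   # time for the electric fluid to drift 1 m

print(f"drift speed = {v:.2e} m/s")
print(f"time to drift 1 m = {minutes_per_meter:.0f} min")
```

At this speed the electric fluid needs roughly half an hour to drift one meter along the wire, which illustrates how slowly the charge itself moves even at the maximum rated current.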


22-3  OHM’S  LAW 


Fig. 22-7 Schematic drawing of an
apparatus for determining the relation
between the (adjustable) potential
difference $V$ applied across a sample wire
by means of an adjustable source of
emf, and the current $i$ flowing through
the wire. The current is measured by a
device called an ammeter, denoted by the
circle labeled A. An ammeter measures
the current flowing through it. But since
there is no place in the circuit where
charge can accumulate, this current
must be equal to the current $i$ flowing at
any point in the circuit.


Even before voltaic cells became available in 1800, it was known that some
materials appeared to be better conductors of electric current than others;
that is, some materials would carry more current than others when samples
of the same shape and size were connected across the same potential
difference. Cavendish, a brave man, had even made rough (and painful)
comparisons. He attached sample wires to charged capacitors and
discharged the capacitors through the wires and his body by touching the free
ends of the wires. He then compared the intensities of the shocks that he
felt! (Nowadays we have better methods.)

There is no general rule for the experimentally observed dependence
of electric current flow through a sample (say, a wire) on the potential
difference imposed by a source of emf connected across the sample. A vast
variety of possibilities exist, since current can pass through homogeneous
substances or mixtures; through solids, liquids, or gases; or along or
through surfaces or interfaces between substances. The current can
depend on the magnitude and the sense of the imposed potential difference,
as well as on such other factors as temperature. Indeed, the exploitation of
the possibilities is one of the foundations of the field of electronics.

Figure 22-7 shows schematically an apparatus for measuring the
relation between the potential difference $V$ imposed across a sample wire by a
source of emf and the current $i$ driven through the sample. For samples
made of a very large class of substances under a wide range of conditions,
the observed relation is a simple one. For a given sample of this sort, the
relation is one of direct proportionality and can be expressed in the form

$$i \propto V \tag{22-15a}$$

We introduce a proportionality constant $S$ and rewrite this relation as an
equation:

$$i = SV \tag{22-15b}$$

The quantity $S$ is given the name conductance. The larger the
conductance, the more current flows for a given potential difference.

From Eq. (22-15b), it follows that the conductance $S$ is expressed in
units of amperes per volt. This unit is given the name siemens (S). Thus we
have by definition

$$1\ \text{S} = 1\ \text{A/V} \tag{22-16}$$


The siemens is named after the distinguished German-British inventor,
electrical engineer, and entrepreneur Sir William (Karl Wilhelm von) Siemens
(1823-1883). With his elder brother, Ernst Werner von Siemens (1816-1892), he
pioneered the electric telegraph systems of Germany, Russia, and Brazil. In 1874
he directed the laying of a transatlantic telegraph cable, inventing much of
the equipment required for the task himself. Just before his death, he completed in
Ireland one of the first electric street railways in the world.


In dealing with practical electric circuits, it is common to call the
potential difference across any part of the circuit the voltage across it, denoted
by the same symbol $V$. It is also convenient to regard the voltage as a
function of the current $i$, rather than vice versa. With this in mind, we rewrite
Eq. (22-15b) in the form $V = (1/S)\,i$. We then define the electric resistance
$R$ of a conductor to be the reciprocal of its conductance, so that

$$R \equiv \frac{1}{S} \tag{22-17}$$

In terms of the resistance, Eq. (22-15b) can be rewritten in the form

$$V = iR \tag{22-18}$$

While conductance is a measure of the ease with which current flows
through a conductor, resistance is a measure of the degree to which the
conductor resists the flow. The unit of electric resistance is siemens$^{-1}$, or
volts per ampere, which is called the ohm ($\Omega$; the capital Greek omega is
always spoken “ohm” in this context). Thus we have

$$1\ \Omega = 1\ \text{S}^{-1} = 1\ \text{V/A} \tag{22-19}$$

Equation (22-15b), $i = SV$, and its equivalent Eq. (22-18), $V = iR$, are both
called Ohm’s law. In the latter form, Ohm’s law states that the potential
difference $V$ across a conductor is equal to the electric current $i$ flowing through it
multiplied by its resistance $R$. Any system in which Ohm’s law is a satisfactory
description of the observed dependence of current on the potential
difference across the system is called an ohmic system.
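
As an illustration of how Eqs. (22-15b) and (22-17) might be used with data from an apparatus like that of Fig. 22-7, the sketch below estimates the conductance of a sample from a set of voltage and current readings. The readings are invented for illustration, and the least-squares fit through the origin is one convenient choice, not a method taken from the text.

```python
# Estimate the conductance S in i = S V from paired readings, then
# form the resistance R = 1/S. The data are hypothetical, and the
# fitting method (least squares through the origin) is an assumption.

voltages = [0.5, 1.0, 1.5, 2.0]       # applied potential differences in V
currents = [0.10, 0.20, 0.30, 0.40]   # measured currents in A

# For the model i = S V, least squares gives S = sum(i V) / sum(V^2).
S = sum(i * v for i, v in zip(currents, voltages)) / sum(v * v for v in voltages)
R = 1.0 / S

print(f"S = {S:.3f} S, R = {R:.2f} ohm")

# An ohmic sample should reproduce every reading from V = i R.
residuals = [v - i * R for i, v in zip(currents, voltages)]
print(residuals)
```

If the residuals are not small compared with the readings, the sample is not well described by Ohm's law over the range tested.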


The German physicist Georg Simon Ohm (1787-1854) was the first to make a
systematic investigation of the relation between the voltage across a conductor
and the current flowing through it. The voltaic cells of the time (1826) were not
stable enough to be suitable for such an investigation. Their instability was an
important reason for the long delay between the beginning of serious study of
current electricity in 1800 and the satisfactory experimental measurement of the
“simple” relation $V = iR$. Ohm used not an electric battery but a bismuth-copper
thermocouple as the source of emf. A thermocouple is a loop consisting of wires of
two different conducting substances, as shown in Fig. 22-8a. When the two
junctions between the two substances are kept at different temperatures, a potential
difference appears between them. Its magnitude is roughly proportional to the
temperature difference. Thus Ohm could adjust the potential difference between
points b and c in the figure by keeping one junction in ice water and heating the
other in a water bath of variable temperature.

Into the gap b to c Ohm inserted wires of different lengths, thicknesses, and
materials. He measured the current (in a relative way) by using a torsion balance to
measure the torque exerted on a magnet held a fixed distance from the wire, as
described in the caption to Fig. 22-8b. (We see why this works in Chap. 23.) With
his relative measurements of voltage and current, Ohm could thus test the
relationship which we have written as the proportionality $i \propto V$.


Fig. 22-8 (a) The use of a bismuth-copper thermocouple as a source of emf.
The potential difference $V$ is roughly proportional to the temperature
difference between the two water baths. (b) Drawing of Ohm's apparatus from his
original paper. In use, the two thermocouple junctions ab and a'b' are
immersed in water baths as in part (a). The wire whose resistance is to be
measured is connected between points b and c inside the case of the torsion balance.
For reasons to be discussed in Chap. 23, the magnet tt experiences a torque
proportional to the current flowing through the wire. As in Coulomb’s
apparatus (shown in Fig. 20-5), this torque is measured by determining the angle
through which the knob on top of the apparatus must be twisted to restore the
magnet to its undisturbed position.


Ohm never described his experimental results in the compact and simple
form $i \propto V$, or $V = iR$. This was not done until 1849, when Gustav Kirchhoff
(1824-1887) “saw through” the experimental complications and understood the
macroscopic phenomenon of electric conduction in essentially modern terms.
Nevertheless, any equation which relates current to voltage in a linear fashion is
called Ohm’s law.


The proportionality constants $S$ and $R$, which appear in the Ohm’s-law
equations $i = SV$ and $V = iR$, depend on the size and shape of the
conductor. For this reason, it is useful to define a quantity called the conductivity
$\sigma$, which depends only on the material of the conductor. Experiment shows
that the conductance $S$ of a series of wires of a given material is directly
proportional to their cross-sectional areas $a$, and inversely proportional to
their lengths $l$. We can thus write

$$S \propto a \tag{22-20a}$$

and

$$S \propto \frac{1}{l} \tag{22-20b}$$

We can combine these two proportionalities as $S \propto a/l$ and express the
result as an equation by defining the proportionality constant as the electrical
conductivity $\sigma$. We then have

$$S = \sigma\,\frac{a}{l} \tag{22-21}$$

Substituting this expression for $S$ into Ohm’s law in the form $i = SV$ given
by Eq. (22-15b), we obtain another useful expression of Ohm's law:

$$i = \sigma\,\frac{a}{l}\,V \tag{22-22}$$


Fig. 22-9 A current-carrying conductor
of uniform cross section is divided
in imagination into a bundle of
many identical smaller conductors. It is
argued in the text that the small
conductors carry equal currents.


The fact, represented in this equation, that the current is directly
proportional to the cross-sectional area of a conductor of uniform cross section
suggests strongly that the charge passes through the wire in an evenly
distributed way. That is, if you imagine the wire to be made up of a bundle of
many identical smaller wires, as in Fig. 22-9, each of the smaller wires will
carry the same current as every other one. With this idea in mind, we
define the current density $j$ to be the electric current per unit of
cross-sectional area of the conductor:

$$j \equiv \frac{i}{a} \tag{22-23}$$

The quantity $j$ must be expressed in units of amperes per square meter.
(Note that current density $j$ is current per unit of cross-sectional area, in
contrast to the charge density $\rho_q$, which is charge per unit volume.)
Combining this definition with Eq. (22-22), we obtain

$$j = \sigma\,\frac{V}{l}$$

for a wire of uniform cross-sectional area.

In the equation immediately above, the quantity $V/l$ depends on the
length $l$ of the particular wire chosen. In order to express the current
density $j$ in a way which does not depend on so specific a quantity, we make use
of Eq. (21-48a). In the present notation, this equation is

$$\mathcal{E} = \frac{V}{l}$$

This is the magnitude of the electric field anywhere in the uniform wire.
Substituting $\mathcal{E}$ for $V/l$ in the equation $j = \sigma V/l$ yields the current density in
the form

$$j = \sigma\mathcal{E} \tag{22-24}$$

Again we have Ohm’s law. But it is expressed in a way which is independent
of the particular size and shape of the conductor and depends on only the
properties of the material of which it is made.


Moreover, the current density $j$, the conductivity $\sigma$, and the electric field
magnitude $\mathcal{E}$ are local quantities whose values have meaning at any particular point in
the conductor, in contrast to the current $i$, the resistance $R$, and the voltage $V$
across the conductor, which have meaning only with respect to the conductor as a
whole. Thus, even though Eq. (22-24) was derived for the case of a uniform wire
having the same conductivity throughout, it is valid even when a conductor is
nonuniform and has a variable conductivity provided the conductor is ohmic, a
point which must be verified experimentally. We take advantage of the generality
of Eq. (22-24) in Example 22-5.


From Eq. (22-24), it follows that the conductivity $\sigma$ is expressed in
units of (A/m$^2$)/(V/m), or (A/V)/m. This is siemens per meter (S/m).
The reciprocal of the conductivity is called the resistivity $\rho$:

$$\rho \equiv \frac{1}{\sigma} \tag{22-25a}$$

(The resistivity $\rho$ should not be confused with the charge density $\rho_q$.) It
follows from Eq. (22-25a) that the unit of resistivity is the reciprocal of the
unit of conductivity, or (S/m)$^{-1}$. If we use the definition of the ohm given
by Eq. (22-19), the unit of resistivity is the ohm-meter ($\Omega\cdot$m). The resistivity
$\rho$, like the conductivity $\sigma$, is a specific property of materials and is tabulated
in many reference manuals.


The electric resistance $R$ of a conductor of specified length and
uniform cross-sectional area can be predicted if the resistivity $\rho$ of the material
of which it is made is known. Such predictions are frequently of practical
importance. To derive the relation between $R$ and $\rho$, we begin with Eq.
(22-22), $i = \sigma a V/l$. Substituting Eq. (22-25a) into this equation, we have
$i = aV/\rho l$. Solving for $V$ gives us

$$V = i\,\frac{\rho l}{a}$$

We now compare this expression with Eq. (22-18), $V = iR$. It is
immediately evident that the resistance $R$ is equal to the combination of constants
$\rho l/a$. That is, we have

$$R = \frac{\rho l}{a} \tag{22-25b}$$

The resistance of a conductor is directly proportional to the resistivity of the material
of which it is made, and to its length $l$, and is inversely proportional to its
cross-sectional area $a$.
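
A prediction of the kind described above can be sketched as follows. The function name `wire_resistance` and the 10-m length are assumptions made for illustration; the resistivity is the 20°C value for hard-drawn copper listed in Table 22-1, and the area is that of the no. 14 wire of Example 22-2.

```python
# Resistance of a uniform wire from Eq. (22-25b), R = rho * l / a.
# The function name and the chosen length are hypothetical.

def wire_resistance(rho, length, area):
    """Resistance in ohms of a wire of resistivity rho (ohm-m),
    length (m), and uniform cross-sectional area (m^2)."""
    return rho * length / area

rho_copper = 1.771e-8   # resistivity of hard-drawn copper at 20 C, in ohm-m
area = 2.082e-6         # cross-sectional area of no. 14 wire, in m^2
length = 10.0           # assumed length of the run, in m

R = wire_resistance(rho_copper, length, area)
print(f"R = {R:.4f} ohm")

# Doubling the length doubles R; doubling the area halves it.
R_double_length = wire_resistance(rho_copper, 2 * length, area)
R_double_area = wire_resistance(rho_copper, length, 2 * area)
```

The result, somewhat less than a tenth of an ohm, suggests why the resistance of household wiring can often be neglected in comparison with that of the appliances it feeds.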


You have now seen Ohm’s law expressed in several different ways. Still
other ways are possible, by using various combinations of the quantities
potential difference $V$, electric field magnitude $\mathcal{E}$, current $i$, current density $j$,
conductivity $\sigma$, and resistivity $\rho$. Which form is most convenient to use
depends on the application at hand. But all forms of Ohm’s law make one of
the following two pairs of related statements, conformity to which is the
hallmark of the class of ohmic conductors:

1a. The current $i$ passing through an ohmic conductor is directly
proportional to the potential difference $V$ across it, the proportionality
constant being the conductance $S$; that is, $i = SV$.

1b. The potential difference across an ohmic conductor is directly
proportional to the current passing through it, the proportionality
constant being the resistance $R$; that is, $V = iR$.

2a. The current density $j$ at any location within an ohmic region in a
conductor is directly proportional to the magnitude $\mathcal{E}$ of the electric field at
that location, the proportionality constant being the conductivity $\sigma$; that is,
$j = \sigma\mathcal{E}$.

2b. The magnitude of the electric field at any location within an ohmic
region in a conductor is directly proportional to the current density at that
location, the proportionality constant being the resistivity $\rho$; that is, $\mathcal{E} = j\rho$.

In practical laboratory work, the most commonly applied form of Ohm’s
law is Eq. (22-18), $V = iR$.




The resistivity and conductivity of metals are not fixed constants. In
the temperature range around room temperature, the resistivity increases
slowly with increasing temperature, and the conductivity decreases
correspondingly. It is conventional to express the resistivity as a polynomial
series in the Celsius temperature $t$. However, it is usually not necessary to
consider terms beyond the term linear in temperature. Given the resistivity
$\rho_0$ at some reference temperature $t_0$, it is usually sufficiently accurate to
express the resistivity $\rho$ at some other temperature $t$ in the form

$$\rho = \rho_0[1 + \alpha(t - t_0)] \tag{22-26}$$

The empirical constant $\alpha$ is called the temperature coefficient of
resistivity. The reference temperature $t_0$, at which the resistivity has the value $\rho_0$, is
usually (but not always) taken to be $t_0 = 20°$C. Table 22-1 gives values of the
conductivity $\sigma_0$ and the resistivity $\rho_0$ at $t = t_0$ and the temperature
coefficient $\alpha$ for various metals and alloys.


Table 22-1

Electrical Conductivity, Resistivity, and Temperature Coefficient for
Various Metals and Alloys

(Reference temperature $t_0 = 20°$C)

Metal                  $\sigma_0$ (in $10^6$ S/m)   $\rho_0$ (in $10^{-8}$ $\Omega\cdot$m)   $\alpha$ (in °C$^{-1}$)
Silver                 62.9                         1.59                                     0.0058
Copper (hard drawn)    56.47                        1.771                                    0.0038
Gold                   41.0                         2.44                                     0.0034
Aluminum               35.41                        2.824                                    0.0039
Tungsten               18                           5.6                                      0.0045
Iron                   10                           10                                       0.005
Lead                   4.5                          22                                       0.0039
Bismuth                0.83                         120                                      0.004
Mercury                1.0440                       95.783                                   0.00089
Brass                  14                           7                                        0.002
Manganin               2.3                          44                                       0.00001
Constantan             2.0                          49                                       0.00001
Nichrome               1.0                          100                                      0.0004
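
The linear temperature law of Eq. (22-26) is easy to evaluate. The sketch below is illustrative only: the function name `resistivity` is an assumption, and the inputs are the Table 22-1 entries for hard-drawn copper.

```python
# Resistivity at temperature t from Eq. (22-26):
#   rho = rho0 * [1 + alpha * (t - t0)]
# The function name is hypothetical; rho0 and alpha are the
# Table 22-1 values for hard-drawn copper at t0 = 20 C.

def resistivity(rho0, alpha, t, t0=20.0):
    """Resistivity in ohm-m at Celsius temperature t."""
    return rho0 * (1.0 + alpha * (t - t0))

rho0_copper = 1.771e-8    # ohm-m at 20 C
alpha_copper = 0.0038     # per degree C

rho_50 = resistivity(rho0_copper, alpha_copper, 50.0)
fractional_change = rho_50 / rho0_copper - 1.0

print(f"rho(50 C) = {rho_50:.3e} ohm-m")
print(f"increase over 20 C value = {100 * fractional_change:.1f} percent")
```

A rise of 30°C thus increases the resistivity of copper by about 11 percent, which is small but not always negligible.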

In general, “good” metals have high conductivities. By “good” metals is
meant those which have most of or all the classical metallic properties, such as
ductility and luster, and clearly metallic chemical properties, such as exclusively
positive valences. Roughly speaking, metals which possess these properties in
lesser degree are not such good conductors as those which possess them in greater
degree. Consider the extreme cases in Table 22-1. Silver is shiny and highly
ductile and always has a chemical valence of +1 or +2. It has the highest
conductivity of all metals at room temperature. Bismuth is dull in appearance and rather
brittle and combines chemically in a variety of complicated ways; it is a poor
conductor of electricity, as metals go.

At temperatures far below room temperature, the behavior of metals is both
more complicated and more diverse. Most dramatic among the metals are those
called superconductors. While superconducting metals usually have relatively
low room-temperature conductivities, there is a critical temperature $T_c$,
characteristic of each, below which its conductivity abruptly becomes infinite. That is,
once started in a loop of a superconducting metal, a current will continue
flowing indefinitely, provided the temperature is kept below $T_c$. The highest
critical temperature for a pure metal is that for niobium, for which $T_c = 8.9$ K.
Superconductors have found important application in the manufacture of high-field
electromagnets, and they may be used in the near future in long-distance electric
power transmission and in computer memories.

Alloys have much lower conductivities than pure metals, and their
temperature coefficients of conductivity are generally much smaller than those of pure
metals. We discuss some reasons for this in Sec. 22-5.

In Examples 22-3 and 22-4, Eq. (22-27) and Ohm's law are applied to
situations of practical significance.


EXAMPLE 22-3

William Siemens proposed in 1860 that the standard of resistance be a column of
pure mercury exactly 1 m long and 1 mm$^2$ in cross-sectional area, held at a
temperature of exactly 0°C. What is the resistance of this proposed standard in ohms? How
long should the mercury column be if its resistance is to be 1.0000 $\Omega$?

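■ Inserting the values of the resistivity $\rho_0$ and the temperature coefficient $\alpha$ for
mercury from Table 22-1 into Eq. (22-26), you have

$$\rho(0°\text{C}) = 95.783 \times 10^{-8}\ \Omega\cdot\text{m} \times [1 + 0.00089\ °\text{C}^{-1} \times (0°\text{C} - 20°\text{C})] = 94.078 \times 10^{-8}\ \Omega\cdot\text{m}$$

And using Eq. (22-25b), $R = \rho l/a$, you obtain

$$R = \frac{94.078 \times 10^{-8}\ \Omega\cdot\text{m} \times 1\ \text{m}}{1 \times 10^{-6}\ \text{m}^2} = 0.94078\ \Omega$$

for a column 1 m long. If the resistance is to be 1.0000 $\Omega$, the length of the column
should be

$$l = \frac{Ra}{\rho} = \frac{1.0000\ \Omega \times 1 \times 10^{-6}\ \text{m}^2}{94.078 \times 10^{-8}\ \Omega\cdot\text{m}} = 1.0630\ \text{m}$$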


For many years a mercury column 1.06300 m long, containing a mass of
mercury equal to 14.4521 g and a constant cross-sectional area (which, given this
mass, turns out to be very close to 1 mm$^2$), was the primary standard of resistance,
the international ohm. At a time when means of measuring mass and length were
more accurate than those for voltage and current, this definition was useful for its
high precision and reproducibility. However, it lost its usefulness when this
situation ceased to exist. Today there is no primary standard ohm. Rather, the ohm is
defined to be 1 V/A, as we have done in Eq. (22-19). However, excellent secondary
standards are available, which are stable and convenient to use.
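
The two results of Example 22-3 can be verified with a short calculation. This is a sketch only; the variable names are assumptions, and the inputs are the mercury entries of Table 22-1 and the dimensions given in the example.

```python
# Check of Example 22-3 using Eq. (22-26) and Eq. (22-25b).
# Variable names are hypothetical; data are from Table 22-1.

rho0 = 95.783e-8     # resistivity of mercury at 20 C, in ohm-m
alpha = 0.00089      # temperature coefficient of mercury, per degree C
t0 = 20.0            # reference temperature, in C
t = 0.0              # temperature of the proposed standard, in C
area = 1.0e-6        # cross-sectional area of 1 mm^2, in m^2

rho = rho0 * (1.0 + alpha * (t - t0))   # Eq. (22-26)
R_one_meter = rho * 1.0 / area          # Eq. (22-25b) with l = 1 m
length_for_one_ohm = 1.0 * area / rho   # Eq. (22-25b) solved for l with R = 1 ohm

print(f"rho(0 C) = {rho:.5e} ohm-m")
print(f"R for 1 m = {R_one_meter:.5f} ohm")
print(f"l for 1 ohm = {length_for_one_ohm:.4f} m")
```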


EXAMPLE 22-4

The copper wire of Example 22-2 carries its maximum rated current $i = 15.0$ A.
Find the magnitude of the internal electric field which drives the current. The
temperature of the wire is 50°C.

■ Using Eq. (22-24), $j = \sigma\mathcal{E}$, you can express the electric-field magnitude in the
form $\mathcal{E} = j/\sigma$. You can evaluate the current density $j$ by using Eq. (22-23), $j = i/a$.
Using the value of $a$ from Example 22-2, you have

$$j = \frac{i}{a} = \frac{15.0\ \text{A}}{2.08 \times 10^{-6}\ \text{m}^2} = 7.21 \times 10^{6}\ \text{A/m}^2$$

To find the conductivity $\sigma$, you can use Eq. (22-26),

$$\rho = \rho_0[1 + \alpha(t - t_0)]$$

to obtain the value of the resistivity at 50°C and then write Eq. (22-25a) in the form
$\sigma = 1/\rho$ to evaluate the conductivity at the same temperature. Or else you can use
Eq. (22-26) to derive a general expression for the temperature dependence of $\sigma$.
You have

$$\sigma = \rho^{-1} = \rho_0^{-1}[1 + \alpha(t - t_0)]^{-1} = \sigma_0[1 + \alpha(t - t_0)]^{-1}$$

Since a rough calculation using the value of $\alpha$ from Table 22-1 shows you that
$\alpha(t - t_0) \ll 1$, you can use the mathematical approximation $(1 + x)^n \approx 1 + nx$,
which is valid for any exponent $n$ if $|x| \ll 1$. So you obtain the general result

$$\sigma = \sigma_0[1 - \alpha(t - t_0)] \tag{22-27}$$

Inserting the numerical values, you find the result

$$\sigma = 56.47 \times 10^{6}\ \text{S/m} \times [1 - 0.0038\ °\text{C}^{-1} \times (50°\text{C} - 20°\text{C})] = 50.0 \times 10^{6}\ \text{S/m}$$

Finally, you use the values of $j$ and $\sigma$ to calculate

$$\mathcal{E} = \frac{j}{\sigma} = \frac{7.21 \times 10^{6}\ \text{A/m}^2}{50.0 \times 10^{6}\ \text{S/m}} = 0.144\ \text{V/m}$$

The result of Example 22-4, with a potential difference of only 0.144 V
between two points 1 m apart, is typical of metal wires carrying
ordinary currents.
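
The calculation of Example 22-4 can be repeated numerically, and at the same time the linear approximation of Eq. (22-27) can be compared with the unapproximated form $\sigma = \sigma_0[1 + \alpha(t - t_0)]^{-1}$. The following is a sketch under the data of the example; the variable names are assumptions.

```python
# Check of Example 22-4: field magnitude E = j / sigma in a copper wire
# at 50 C, using both the linearized conductivity of Eq. (22-27) and the
# unapproximated reciprocal of Eq. (22-26). Names are hypothetical.

i = 15.0            # current in A
area = 2.08e-6      # cross-sectional area in m^2
sigma0 = 56.47e6    # conductivity of hard-drawn copper at 20 C, in S/m
alpha = 0.0038      # temperature coefficient, per degree C
t, t0 = 50.0, 20.0  # wire temperature and reference temperature, in C

j = i / area                                    # Eq. (22-23)

sigma_linear = sigma0 * (1.0 - alpha * (t - t0))   # Eq. (22-27)
sigma_exact = sigma0 / (1.0 + alpha * (t - t0))    # reciprocal of Eq. (22-26)

E_linear = j / sigma_linear                     # Eq. (22-24) solved for E
E_exact = j / sigma_exact

print(f"j = {j:.3e} A/m^2")
print(f"E (linearized sigma) = {E_linear:.4f} V/m")
print(f"E (exact reciprocal) = {E_exact:.4f} V/m")
```

The two field values differ by a little over 1 percent, which gives a sense of the error introduced by the approximation $(1 + x)^n \approx 1 + nx$ when $x = \alpha(t - t_0)$ is about 0.11.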

Up to this point we have discussed only uniform current densities in
cylindrical wires. In this important special case, the current $i$, the current
density $j$, and the cross-sectional area $a$ may all be dealt with in terms of
magnitudes only. The direction of the current lines is everywhere along the
wire, and a plane of fixed cross-sectional area can always be drawn normal
to the current lines. Since the current density is uniform, it can be defined
simply as $j = i/a$. What is more, the electric field has the same magnitude
$\mathcal{E}$ everywhere along the wire.

But this is not always the case. Consider, for example, the disk-shaped
conductor shown in Fig. 22-10. A wire of a material having a much higher
conductivity than that of the disk supplies current to the small
high-conductivity core at its center. Around the edge of the disk is wrapped a
band of the same high-conductivity material. Thus the core and the edge
may be considered as equipotential regions; the potential difference
between them is distributed in some way through the disk. How can we
describe the current through the disk? Since charge cannot accumulate
anywhere in the disk, the current flowing out through band B must be equal to
that flowing in through core A. And the same current must pass through
the disk. Specifically, all the current must pass through any closed “hoop”
which cuts through the disk and which contains the center core, but not the
outside edge. Such hoops are the irregular hoop C, as well as the circular
hoop D of radius $r_D$ and the circular hoop E of radius $r_E$.

Fig. 22-10 A disk-shaped conductor. Current flows through the
disk from A to B. In doing so, it must pass through the cylindrical
closed surfaces D and E, and also through the irregular closed
surface C. Some current lines are denoted by arrows. By symmetry,
they must be straight, radial lines.

The approach we develop to deal with current (which is a flux of electric
charge) is completely analogous to the approaches used to deal with fluxes in
Chaps. 12, 16, and 20. As before, we choose an infinitesimal area element da
on any one of the hoops and associate with it a vector da, whose magnitude is
numerically equal to the area and whose direction is outward, normal to the
area element. Since the current is not uniform throughout the disk (note how
the current lines shown in Fig. 22-10 spread out with increasing distance from
the center), the current density cannot be uniform throughout the disk in
either magnitude or direction. In order to specify the current density
completely, we must specify a direction as well as a magnitude at every
location. Thus the current density is a vector j. The current di passing
through the area element da is given by the expression


$$ di = \mathbf{j} \cdot d\mathbf{a} \tag{22-28} $$

Compare this expression with Eq. (20-29),

$$ d\Phi_e = \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} \tag{22-29} $$

to which it is analogous. Equation (22-29) gives the magnitude of an element of
electric flux dΦ_e in terms of the local electric field magnitude (which is
defined to be the electric flux density) and its direction relative to the
orientation of the area vector da. Equation (22-28), which is mathematically
identical but physically much less abstract, gives the magnitude of an element
of electric current in terms of the local electric current density and its
direction relative to the orientation of the area vector da.

The total current flowing through any “hoop” is found by integrating the
current density over the hoop which encloses the current source A, using
Eq. (22-28). This gives

$$ i = \oint_{\text{closed surface}} di = \oint_{\text{closed surface}} \mathbf{j} \cdot d\mathbf{a} \tag{22-30} $$

While  we  have  derived  it  for  the  special  case  of  a disk,  this  equation  is  valid 
for  a current-carrying  body  of  any  shape,  provided  that  a closed  surface  is 
drawn  around  the  current  source. 

The  equation  is,  in  fact,  closely  analogous  to  Gauss’  law,  Eqs.  (20-36)  and 
(20-37),  which  relate  the  electric  flux  passing  through  a closed  surface  to  the 
source  charge  it  contains: 


$$ \Phi_e = \frac{q}{\epsilon_0} = \oint_{\text{closed surface}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} $$

By  using  the  general  vectorial  definition  of  the  current  density  j, 
Ohm's  law  can  be  written  in  the  vectorial  form 

$$ \mathbf{j} = \sigma \boldsymbol{\mathcal{E}} \tag{22-31} $$

Provided  that  the  material  carrying  the  current  is  ohmic,  Eq.  (22-31)  can  be 
applied,  no  matter  how  complicated  the  geometry  of  the  current  flow.  In 




fact, Eq. (22-31) is valid even when the conductivity σ varies from place to
place.

Current  flow  through  a disk  is  analyzed  in  Example  22-5. 


EXAMPLE  22-5 

A disk like that shown in Fig. 22-10 is made of nichrome. Its radius is 1.0 m,
and its thickness δ is 0.20 mm. The central core is a copper plug having a
radius r_1 = 2.0 cm. The disk is surrounded by a copper hoop having an inner
radius r_2 = 1.0 m, which fits it tightly.

a. Find the current density at a point 35 cm from the center of the disk when
the current flowing through the disk is i = 100 A.

b.  Find  the  resistance  of  the  disk  at  room  temperature. 

■ a. Note that the calculation of the current density is analogous to the use
of Gauss’ law to find the cylindrically symmetrical electric field associated
with a static charge on a long wire. Once that is done, you can calculate the
resistance of the disk in a manner analogous to that used in finding the
capacitance of a cylindrical capacitor.

You have from Eq. (22-30)

$$ i = \oint_{\text{hoop of radius 35 cm}} \mathbf{j} \cdot d\mathbf{a} $$

By symmetry, the current lines are everywhere perpendicular to the circular
hoop of radius 35 cm. And by symmetry, j is the same at all points for which
r = 35 cm. Thus you have

$$ i = \oint_{\text{hoop}} j \, da = j(2\pi r \delta) $$

and the current density is

$$ j = \frac{i}{2\pi r \delta} = \frac{100\ \text{A}}{2\pi \times 0.35\ \text{m} \times 2.0 \times 10^{-4}\ \text{m}} = 2.3 \times 10^{5}\ \text{A/m}^2 $$


b. Since you know i, you can find the resistance R of the disk if you know the
voltage V across it. You can express V in terms of the electric field ℰ by
using Eq. (21-44b), V = −∫ ℰ · ds. In this case it takes the form

$$ V = -\int_{r_2}^{r_1} \boldsymbol{\mathcal{E}} \cdot d\mathbf{r} $$

You again invoke the symmetry of the situation and argue that the electric
field ℰ is everywhere parallel to the radius vector r. Thus you have
ℰ · dr = ℰ dr, and therefore

$$ V = -\int_{r_2}^{r_1} \mathcal{E} \, dr $$

Interchanging the limits of the integral and reversing its sign give you

$$ V = \int_{r_1}^{r_2} \mathcal{E} \, dr $$

From Eq. (22-31), j = σℰ, you substitute the magnitudes ℰ = j/σ = i/2πrδσ, so
that you have

$$ V = \int_{r_1}^{r_2} \frac{i}{2\pi r \delta \sigma} \, dr = \frac{i}{2\pi \delta \sigma} \int_{r_1}^{r_2} \frac{dr}{r} = \frac{i}{2\pi \delta \sigma} \ln\left(\frac{r_2}{r_1}\right) $$

And since R = V/i, you obtain

$$ R = \frac{1}{2\pi \delta \sigma} \ln\left(\frac{r_2}{r_1}\right) $$

Using the value of σ given for Nichrome in Table 22-1, you find

$$ R = \frac{1}{2\pi \times 2.0 \times 10^{-4}\ \text{m} \times 1.0 \times 10^{6}\ \text{S/m}} \ln\left(\frac{100\ \text{cm}}{2.0\ \text{cm}}\right) = 3.1 \times 10^{-3}\ \Omega $$


22-4 THE ELECTRON GAS


Fig. 22-11 Wire-wound rotating cylinder used in the Tolman-Stewart experiment
in order to determine the charge/mass ratio of the charge carriers in metals.
The cylinder is rotated at a large initial angular speed ω_i, and is then
braked rapidly to a stop.


Ohm's law is only one of many important macroscopic descriptions of electrical
properties of metals which cry out for an understanding based on microscopic
considerations. Matter is made up of atoms, which in turn are made up of
negatively charged electrons and positively charged nuclei. Electrochemical
experiments suggest that some of the electrons are bound relatively loosely to
their atoms, since the measured energy typically required to liberate an
electron from an atom is a few electron volts. Is the mobile electric charge
which carries current in metals made up of more or less free electrons, while
the compensating immobile positive charge is made up of the ionized atoms they
leave behind, as was asserted without proof in the qualitative description of
conductors in Sec. 20-1?

There is direct evidence in support of this view in the experiment of Tolman
and Stewart. This experiment, performed in 1917, is depicted schematically in
Fig. 22-11. A cylinder is mounted on a shaft and can be rotated very rapidly.
The surface of the cylinder, whose radius is r, is wound with many turns of
wire in a single layer.

After the cylinder has been set spinning at a large angular velocity of
magnitude ω_i, it is braked to a stop as quickly as possible. While coming to
a stop, it experiences a large angular acceleration of magnitude α. Thus every
part of the wire experiences an acceleration of magnitude αr. The inertia of
any free particles within the wire will make them tend to crowd toward the
“front” end A of the wire as it comes to a stop, like standing passengers in a
crowded bus. But now suppose the particles have charge q. The charge will pile
up toward the front end of the wire, as it comes to a stop, only until the
electric field of magnitude ℰ in the wire, produced by the crowding, just
suffices to oppose the acceleration of the free charged particles relative to
the wire with an electric force of magnitude F = |q|ℰ. In this situation, we
can write Newton’s second law, for a charged particle, in the form
acceleration = force/mass, or

$$ \alpha r = \frac{|q| \mathcal{E}}{m} $$

where m is the mass of a particle. As a result of the crowding of the free
particles toward end A, there is a potential difference of magnitude |V|
between ends A and B of the wire. Since the free charged particles, called
charge carriers, will pile up toward end A, that end will have the higher
potential if the charge carriers are positive. But if the charge carriers
which pile up toward end A are negative, end B will have the higher potential.
If the electric field has uniform magnitude ℰ over the length l of the wire,
we have ℰ = |V|/l, as shown in Eq. (21-48a). The magnitude of the acceleration
can therefore be written

$$ \alpha r = \frac{|q|}{m} \frac{|V|}{l} \tag{22-32} $$




In principle, then, we could find |q|/m, the magnitude of the charge-to-mass
ratio of the free charged particles which carry electric current in the wire,
if we could measure the angular acceleration of the cylinder and the potential
difference between the ends of the wire. In addition, the sense of the
potential difference, as expressed in the polarity of the ends, would tell us
whether the free particle charge has a positive or a negative value.


In practice, however, the acceleration is not constant and is very difficult
to measure. What is done instead is the following. The ends of the wire are
connected to the outside world through sliding contacts. These, in turn, are
connected to the terminals of a device called a ballistic galvanometer, which
measures the total charge Q flowing through it during the time interval from
t_i to t_f during which the cylinder is being braked to a stop. That is, it
measures the quantity

$$ Q = \int_{t_i}^{t_f} dQ $$

But the electric current is defined in Eq. (22-6) to be i = dQ/dt. Thus we
have dQ = i dt, and we can write this integral in the form

$$ Q = \int_{t_i}^{t_f} i \, dt \tag{22-33} $$

What is the current i? Because of its deceleration, the wire becomes a source
of emf, and there is a potential difference V between its ends, A and B. This
potential difference drives electric charge through the ballistic
galvanometer. Thus there is a current i through the ballistic galvanometer,
and the sense of i is from whichever end of the wire, A or B, is at the higher
potential to the end at the lower potential. The wire itself completes the
circuit. Since the acceleration of the cylinder is not constant, the potential
difference V is not constant, and i is not constant either. But at any moment
i is related to V by Ohm’s law, i = V/R, where R is the resistance of the
entire circuit, including the wire wound on the cylinder and the external
leads and ballistic galvanometer. Thus we can write Eq. (22-33) in the form

$$ Q = \int_{t_i}^{t_f} \frac{V}{R} \, dt = \frac{1}{R} \int_{t_i}^{t_f} V \, dt $$

In order to evaluate this integral, we solve Eq. (22-32) for the magnitude of
the potential difference and obtain

$$ |V| = \frac{m}{|q|} \, r l \alpha $$

We can use this for V if we make sure the integral yields a positive Q. Thus

$$ Q = \frac{m}{|q|} \frac{rl}{R} \int_{t_i}^{t_f} \alpha \, dt = \frac{m}{|q|} \frac{rl}{R} (\omega_i - \omega_f) $$

where ω_f and ω_i are the final and initial angular speeds of the cylinder.
And since the cylinder is brought to rest, we have ω_f = 0, so that

$$ Q = \frac{m}{|q|} \frac{rl}{R} \, \omega_i $$



Table 22-2  Tolman-Stewart Results for the Charge-to-Mass Ratio q/m of the
Free Charges in Selected Metals and −e/m_e for Free Electrons

Charge carriers in          q/m (in C/kg)
Copper                      −1.60 × 10^11
Silver                      −1.49 × 10^11
Aluminum                    −1.54 × 10^11
Free electrons in vacuum    −1.76 × 10^11


This can be solved immediately to yield the magnitude of the charge-to-mass
ratio of the charge carriers in the metal wire. We have

$$ \frac{|q|}{m} = \frac{\omega_i r l}{Q R} \tag{22-34} $$
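The claim that only the initial angular speed matters, and not the details of
the braking, can be illustrated with a short numerical sketch. Everything
about the apparatus below is hypothetical (the braking profile, the values of
ω_i, r, l, and R, and the function names); the carrier charge-to-mass ratio is
simply set to the free-electron magnitude 1.76 × 10^11 C/kg. The code
integrates i = |V|/R over a deliberately nonuniform deceleration, using
Eq. (22-32) for |V|, and then recovers |q|/m from Eq. (22-34).

```python
import math

def charge_to_mass(omega_i, r, l, Q, R):
    """Eq. (22-34): |q|/m = omega_i * r * l / (Q * R), in C/kg."""
    return omega_i * r * l / (Q * R)

def braking_charge(q_over_m, omega_i, r, l, R, t_stop=0.05, steps=20_000):
    """Total charge Q through the galvanometer during a nonuniform stop.

    Hypothetical braking profile: omega(t) = omega_i * cos^2(pi*t / (2*t_stop)),
    which falls smoothly from omega_i to zero with a varying deceleration.
    """
    dt = t_stop / steps
    Q = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        # alpha = |d(omega)/dt| for the profile above
        alpha = omega_i * (math.pi / (2.0 * t_stop)) * math.sin(math.pi * t / t_stop)
        V = alpha * r * l / q_over_m   # Eq. (22-32) solved for |V|
        Q += (V / R) * dt              # Eq. (22-33) with i = |V|/R
    return Q

# Hypothetical apparatus values, chosen only for illustration.
q_over_m_true = 1.76e11  # C/kg, free-electron magnitude
omega_i = 500.0          # initial angular speed, rad/s
r = 0.10                 # cylinder radius, m
l = 500.0                # length of wire, m
R = 40.0                 # total circuit resistance, ohm

Q_fast = braking_charge(q_over_m_true, omega_i, r, l, R, t_stop=0.05)
Q_slow = braking_charge(q_over_m_true, omega_i, r, l, R, t_stop=0.20)
recovered = charge_to_mass(omega_i, r, l, Q_fast, R)

print(f"Q (fast stop) = {Q_fast:.3e} C")
print(f"Q (slow stop) = {Q_slow:.3e} C")
print(f"recovered |q|/m = {recovered:.3e} C/kg")
```

Stopping four times more slowly changes the current at every instant but
leaves the integrated charge essentially unchanged, which is why a ballistic
measurement of Q sidesteps the need to know α.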


The ballistic galvanometer gives the sense of the charge flow through it (that
is, the sense of the current) as well as the total charge Q. As already noted,
this sense is from A to B through the galvanometer if the charge carriers are
positive, and from B to A if they are negative. The quantity q/m is thus
measured in terms of the magnitude and sense of flow of the total charge Q
flowing through the ballistic galvanometer, the initial angular speed ω_i of
the cylinder, and the other directly measurable quantities r, l, and the
circuit resistance R. The results of Tolman and Stewart are given in Table
22-2, together with the charge-to-mass ratio −e/m_e measured for free
electrons in a vacuum. The results strongly suggest that the charge
carriers in these metals behave something like free electrons.
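How close is “something like”? A minimal sketch, using only the values listed
in Table 22-2 (the variable names are illustrative), compares each measured
q/m with the free-electron value:

```python
# Values of q/m in C/kg, as listed in Table 22-2.
measured = {
    "Copper": -1.60e11,
    "Silver": -1.49e11,
    "Aluminum": -1.54e11,
}
free_electron = -1.76e11

# Ratio of each measured value to the free-electron value.
ratios = {metal: qm / free_electron for metal, qm in measured.items()}

for metal, ratio in ratios.items():
    print(f"{metal:9s} q/m = {measured[metal]:.2e} C/kg, "
          f"ratio to free electron = {ratio:.2f}")
```

All three metals give a negative q/m whose magnitude is within roughly 10 to
15 percent of the free-electron value.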

It is tempting to go one step farther and argue that the electrons in a metal
comprise a gas, that is, a collection of independent particles confined by
only the “walls” which are the surfaces of the metal. If that were so, we
could apply our knowledge of the properties of an ideal gas wholesale. In
Sec. 22-5, we pursue this line of argument in order to develop a microscopic
basis for Ohm’s law.


22-5 THE MICROSCOPIC BASIS OF ELECTRIC RESISTANCE


What is it about the way that electric charge flows through metals that leads
to the validity of Ohm’s law, V = iR or j = σℰ? Electric resistance falls into
the category of phenomena we call frictional. In the absence of any other
forces acting on it, a free electron having charge −e and mass m_e, subjected
to a constant electric field ℰ, will accelerate under the action of an
electric force F = −eℰ. According to Newton’s second law, we have a constant
acceleration whose value is given by

$$ \mathbf{a} = \frac{\mathbf{F}}{m_e} = \frac{-e \boldsymbol{\mathcal{E}}}{m_e} \tag{22-35} $$




But, in fact, the current in a metal wire, driven by an electric field ℰ
because of a potential difference applied across the wire, can be expressed in
terms of the charge density ρ_q, the cross-sectional area a of the wire, and
the drift velocity v of the mobile electric fluid. For negative charge
carriers, v is antiparallel to ℰ. The relation given by Eq. (22-10) can
therefore be written (here ℰ̂ denotes the unit vector in the direction of ℰ)

$$ i = \rho_q a \, \mathbf{v} \cdot \hat{\boldsymbol{\mathcal{E}}} = -\rho_q a v \tag{22-36} $$

The current is thus directly proportional to the drift speed v of the
current-carrying electric fluid, which we are assuming to be a gas of free
electrons. If the current is constant when the voltage is constant, the drift
speed (not the acceleration) must be constant. Thus, the electric force
F = −eℰ in Eq. (22-35) cannot be the only force on the electrons.

It is possible, however, to account for the constant drift speed in terms of a
frictional force. Typically, a macroscopic fluid frictional force increases in
magnitude with increasing speed of the macroscopic body on which it acts,
until it is equal in magnitude to the force driving the body through the
fluid. Since the direction of the frictional force is opposite to that of the
driving force, the acceleration of the body is then zero, and it moves at
constant velocity. In Sec. 5-6, for example, this approach was developed to
deal with the problem of the falling skydiver.

We cannot simply say that there is a frictional force acting on the individual
electrons which collectively carry the current in a metal. An electron is a
microscopic object. But a frictional force is, by its nature, macroscopic; it
represents an average effect of many microscopic events, each of which
individually is conservative rather than dissipative. Thus we must look into
the averaging process in order to understand the phenomenon of electric
resistance.


We begin by making a distinction between the collective drift velocity of all
the electrons carrying current and the individual random velocity of any one
of them. If there is an electric current in a wire, the electron gas as a
whole must be flowing down the wire. Since we are supposing that electrons act
more or less like the molecules of a gas, the collective flow (that is, the
flow of the electron gas as a whole) must be superimposed on the random
individual motions of the electrons. The very fact that the individual motions
are random means that motion is equally likely in all directions, and no net
charge is conveyed from one point to another as a result of them. The drift
velocity v, however, is superimposed on all the random velocities just as the
velocity of the wind is superimposed on the much greater velocities of all the
molecules in the air.

As a general rule, the drift velocity is very much smaller in magnitude than
the random velocities of the electrons. You have seen in Example 22-2 that a
typical drift velocity magnitude is of order 1 mm/s. In contrast, if you
assume that an electron in a metal behaves like an ideal-gas molecule of mass
m_e, equal to that of a free electron, its random thermal velocity has the
root-mean-square magnitude v_rms given by Eq. (18-53),

$$ v_{\text{rms}} = \left( \frac{3kT}{m_e} \right)^{1/2} $$

At room temperature (T = 300 K), this yields v_rms = 1.2 × 10^5 m/s, a
magnitude 10^8 times greater than that of the drift velocity.
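The figure quoted for v_rms is easy to reproduce. The following sketch (the
function name is illustrative, and the constants are standard values for
Boltzmann’s constant and the electron mass, not values taken from this text)
evaluates the ideal-gas expression at 300 K and compares it with a drift speed
of 1 mm/s:

```python
import math

K_B = 1.380649e-23    # Boltzmann's constant, J/K
M_E = 9.1093837e-31   # electron mass, kg

def v_rms(T, m=M_E):
    """Root-mean-square thermal speed (3kT/m)^(1/2) of an ideal-gas particle."""
    return math.sqrt(3.0 * K_B * T / m)

v_thermal = v_rms(300.0)   # m/s, at room temperature
v_drift = 1.0e-3           # m/s, the typical drift speed of order 1 mm/s
ratio = v_thermal / v_drift

print(f"v_rms at 300 K = {v_thermal:.2e} m/s")
print(f"v_rms / v_drift = {ratio:.1e}")
```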




We are looking for some kind of “frictional force” that tends to reduce the
drift velocity by opposing the electric force which the externally applied
electric field imposes on the electrons, but which does not affect the random
velocities of the individual electrons in a significant way, on the average.

At first glance, this may seem to be a self-contradictory task. When an
electron is moving, how does it “know” what part of its motion is random
thermal motion and what part is collective drift motion? How can a frictional
force act on only the latter and not the former? The key to the matter lies
precisely in the distinction between the orderliness of the drift and the
randomness of the thermal motion. Random motion results in the net transport
of no charge. So a mechanism that acts to convert drift motion (which does
result in the transport of charge) into random motion will serve the purpose
of imposing “friction” on the system, by tending to make the drift motion
“coast to a stop.”

In  order  to  introduce  such  a randomization  mechanism,  we  argue  that 
the  electrons  are  not  completely  free  to  move  about  in  the  wire.  The  metal 
of  which  the  wire  is  made  contains  many  other  things  besides  electrons,  and 
these  act  as  obstructions  with  which  the  electrons  make  collisions. 


It seems reasonable to assume that the metal ions themselves act as the
obstructions. This is the basis on which the theory we are now developing was
originally conceived, and we follow this assumption for the time being. It
will furnish a check on the theory, since the electrical conductivity turns
out to be related to the distance between collisions. According to our
assumption, this should be approximately equal to the known distance between
ions in a metal.

A more  precise  theory  based  on  the  laws  of  quantum  mechanics  predicts  that 
electrons  do  not  collide  with  the  regularly  arrayed  ions  in  the  crystal  lattice  which 
typifies  the  structure  of  pure  metals,  but  only  with  irregularities  of  various  kinds 
in  the  crystal  lattice.  We  return  to  this  point  toward  the  end  of  this  section. 

We argue further that the collisions experienced by the electrons are
completely randomizing. That is, we assert that there is no way of predicting
the direction in which an electron will bounce off an obstruction, on the
basis of the direction in which it was going before it struck the obstruction.
If we think of the electrons as billiard balls and the ions (or other
obstructions) in the metal as bowling balls, the collisions will range from
head-on to barely glancing. As a result, the light “billiard ball” electrons
will bounce off the heavy “bowling ball” ions in all possible directions.
Thus, electrons have no “memory” of their previous motion. To achieve this
randomization, we must assume that the electrons are scattered with equal
probability in all directions. The scattering process is called isotropic
scattering.

Between  collisions,  the  electrons  are  accelerated  in  an  orderly  fashion 
by  the  externally  applied  electric  field,  thus  slightly  increasing  their  speeds. 
Each  collision  converts  the  resulting  orderly  motion  into  random  motion. 
Consequently,  the  electric  field  slightly  increases  the  kinetic  energy  of 
random  motion  of  the  electrons — their  thermal  energy.  Through  further 
collisions,  this  leads  to  a corresponding  gradual  increase  in  the  energy  of 
thermal  vibration  of  the  fixed  ions  as  well.  The  result  is  a slight  warming  of 
the  conductor  as  a whole.  It  is  a matter  of  general  experience  that  this,  in 
fact,  happens  when  an  electric  current  passes  through  a conductor — that  is 
why  electric  heaters  and  light  bulbs  work.  The  process  is  called  Joule 
heating;  we  discuss  it  from  a macroscopic  point  of  view  in  Sec.  22-6. 




Let us now consider the part of the velocity acquired by an electron between
collisions, as a result of its acceleration by an externally applied electric
field. According to Eq. (22-35), the acceleration is (−e/m_e)ℰ. If a time Δt
has elapsed since the last collision, the velocity v_Δt resulting from this
acceleration is

$$ \mathbf{v}_{\Delta t} = -\frac{e}{m_e} \boldsymbol{\mathcal{E}} \, \Delta t \tag{22-37} $$


In order to concentrate our attention on the collective drift velocity, we
imagine all the electrons in the electron gas to be replaced by an equal
number of “average” electrons. These average electrons have the convenient
property that they have no random motion at all! (This is because the random
velocity, averaged over all electrons, is zero.) Thus we can consider their
drift motion only. An average electron begins moving from rest just after a
collision, when its drift velocity has just been completely converted into
random velocity by the collision. It then accelerates in a uniform way,
reaching some maximum velocity v_Δt = v_max just before the next collision,
and then comes to rest and repeats the process.

If we knew the time between collisions, we could find the average drift
velocity v_max/2. But the actual collision process is random. Sometimes an
electron goes only a very short distance between collisions; at other times it
may go quite a long way without hitting anything. While the time between
collisions varies from case to case, there must be an average time between
collisions. We call this the mean scattering time τ (lowercase Greek tau). We
may imagine our average electrons as making collisions at instants evenly
separated by time intervals τ. At the end of such an interval, an average
electron has velocity v_τ = v_max. The average electrons therefore have an
average drift velocity

$$ \langle \mathbf{v} \rangle = -\frac{1}{2} \frac{e\tau}{m_e} \boldsymbol{\mathcal{E}} $$

We drop the factor 1/2 in recognition of the approximate nature of this
calculation and write

$$ \langle \mathbf{v} \rangle = -\frac{e\tau}{m_e} \boldsymbol{\mathcal{E}} \tag{22-38} $$

Equation (22-38) gives a drift velocity which is directly proportional to the
externally applied electric field ℰ. It is therefore also directly
proportional to the electric force which drives the electric current. That is,
the picture we have developed is a typical “frictional” one, even though we
have not introduced an explicit frictional force. Rather, the randomization
mechanism takes the place of a continuous frictional force whose magnitude is
proportional to that of the drift velocity and whose direction is opposite to
that of the applied force. The microscopic model which we have developed of
charge flow through a conductor under the influence of an externally applied
electric field thus predicts a drift velocity which is constant when the
externally applied electric field is constant. Next we show that this result
is essential if Ohm’s law is to be satisfied by the conductor on the
macroscopic scale.
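The “average electron” picture can be tried out numerically. In the sketch
below, an electron starts each free flight from rest and accelerates at eℰ/m_e
until the next collision; its time-averaged speed is total distance divided by
total time. The field strength, the value of τ, and the function name are all
hypothetical. Two cases are compared: collisions evenly spaced by τ, as in the
text, and flight times drawn from an exponential distribution with mean τ. The
exponential distribution is an assumption made here for illustration only; it
is not part of the argument above.

```python
import random

E_CHARGE = 1.602176634e-19  # magnitude of the electron charge, C
M_E = 9.1093837e-31         # electron mass, kg

def mean_drift_speed(flight_times, accel):
    """Time-averaged speed of an electron that starts each flight from rest.

    A flight of duration t covers a distance accel * t**2 / 2, so the average
    speed is (total distance) / (total time).
    """
    distance = sum(0.5 * accel * t * t for t in flight_times)
    return distance / sum(flight_times)

field = 0.1      # V/m, hypothetical field magnitude
tau = 2.5e-14    # s, hypothetical mean scattering time
accel = E_CHARGE * field / M_E

# Case 1: collisions evenly separated by tau, as in the text.
even_times = [tau] * 1000
v_even = mean_drift_speed(even_times, accel)

# Case 2 (assumption): exponentially distributed flight times with mean tau.
rng = random.Random(0)
random_times = [rng.expovariate(1.0 / tau) for _ in range(200_000)]
v_random = mean_drift_speed(random_times, accel)

print(f"e*tau*E/m        = {accel * tau:.3e} m/s")
print(f"evenly spaced    = {v_even:.3e} m/s")
print(f"random intervals = {v_random:.3e} m/s")
```

With evenly spaced collisions the average is half of eτℰ/m_e, as derived
above. Under the assumed exponential distribution the occasional long flights
carry extra weight and the average comes out close to eτℰ/m_e itself, which
suggests that dropping the factor 1/2 is a reasonable simplification rather
than a serious error.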

Since the average velocity ⟨v⟩ is the drift velocity we need in order to
relate the microscopic motion of the electrons to the macroscopic electric
current, we can now develop the equation which relates them. Written in terms
of the average speed ⟨v⟩, Eq. (22-36) becomes

$$ i = -\rho_q a \langle v \rangle $$

The mobile charge density ρ_q must also be expressed in terms of microscopic
quantities. This is done by noting that ρ_q is the product of the charge −e on
each electron and the number N of electrons per unit volume. Thus we have

$$ \rho_q = N(-e) = -Ne \tag{22-39} $$

We can substitute this value of ρ_q into the equation displayed immediately
above it to obtain

$$ i = Nea \langle v \rangle \tag{22-40} $$
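Equation (22-40) can be turned around to estimate the drift speed for a given
current. The numbers in this sketch are assumptions chosen for illustration: a
current of 10 A in a wire of cross-sectional area 1 mm², with N ≈ 8.5 × 10^28
electrons per cubic meter, a commonly quoted figure for copper that is not
taken from this text.

```python
E_CHARGE = 1.602176634e-19  # magnitude of the electron charge, C

def drift_speed(i, N, a):
    """Drift speed <v> = i / (N e a), obtained by solving Eq. (22-40) for <v>."""
    return i / (N * E_CHARGE * a)

i = 10.0      # current, A (hypothetical)
N = 8.5e28    # free electrons per m^3 (assumed value for copper)
a = 1.0e-6    # cross-sectional area, m^2 (1 mm^2, hypothetical)

v = drift_speed(i, N, a)
print(f"drift speed = {v:.2e} m/s")
```

The result is a fraction of a millimeter per second, in line with the order of
magnitude of 1 mm/s cited from Example 22-2.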

In relating the microscopic properties of a metallic conductor to the
macroscopic current passing through it, we do not wish to be concerned with
the shape or size of the sample. Such macroscopic details are not relevant to
the microscopic-macroscopic relation in a fundamental way. Therefore we recast
Eq. (22-40) in terms of the current density j. If we assume that j is uniform,
we can use the definition of its magnitude given by Eq. (22-23), j = i/a, and
rewrite Eq. (22-40) in the form j = Ne⟨v⟩. Now j is always parallel to ℰ. But
ℰ is antiparallel to ⟨v⟩ for negative charge carriers. Thus j is antiparallel
to ⟨v⟩, and we have

$$ \mathbf{j} = -Ne \langle \mathbf{v} \rangle \tag{22-41} $$

We now substitute into this equation the value of ⟨v⟩ given by Eq. (22-38),
to obtain

$$ \mathbf{j} = -Ne \left( -\frac{e\tau}{m_e} \boldsymbol{\mathcal{E}} \right) = \frac{Ne^2\tau}{m_e} \boldsymbol{\mathcal{E}} \tag{22-42} $$


Let us compare this equation with the macroscopic Ohm’s law in the form of
Eq. (22-31),

$$ \mathbf{j} = \sigma \boldsymbol{\mathcal{E}} $$

The comparison shows that the quantity Ne²τ/m_e, whose factors all have
microscopic significance, is equal to the experimentally measurable
macroscopic conductivity σ. That is,

$$ \sigma = \frac{Ne^2\tau}{m_e} \tag{22-43a} $$


Now Ohm’s law requires that the current density j be proportional to the
electric field magnitude ℰ. This can be true only if the conductivity σ, which
appears in Ohm’s law in the form j = σℰ, is a constant independent of ℰ.
Consequently, the quantity on the right side of Eq. (22-43a) must be a
constant independent of ℰ. It is clear that e², the square of the electron
charge, and m_e, the electron mass, satisfy this condition. The number N of
free electrons per unit volume will also satisfy the condition as long as the
electric field is not so large that electrons can be given enough kinetic
energy between collisions to allow them to free still more electrons from the
ions. This is certainly the case in ordinary metals carrying ordinary
currents, as you can see from the result of Example 22-4.

We now show that our model predicts that the mean scattering time τ is also
independent of the magnitude ℰ of the externally applied electric field. The
reason is that the field has a direct effect on the drift velocity only. As
long as the drift velocity is very small compared to the random velocities of
the individual electrons (and you have seen that it is typically only 10^-8 as
large), the distance they cover per unit time is governed essentially by the
latter. And since the collisions made by the electrons are with objects having
some average separation, a change in the electric field has negligible
influence on the time required, on the average, for an electron to pass from
one collision to the next. It follows that the mean scattering time τ is
indeed independent of ℰ, and therefore the entire quantity Ne²τ/m_e on the
right side of Eq. (22-43a) is a constant independent of ℰ. Thus our model
satisfies Ohm’s law.

Note that the charge −e on the charge carriers (which we assumed from the
beginning to be electrons) enters Eq. (22-43a) only as a square, e². Thus the
validity of Eq. (22-43a) does not depend on the sign of the charge carriers.
If Ohm’s law is obeyed by some substance other than a metal, where the charge
carriers are not electrons but other particles with a charge-to-mass ratio
q/m, the conductivity is given by the expression

$$ \sigma = \frac{Nq^2\tau}{m} \tag{22-43b} $$


We can explore how well this rather crude picture of electrons in a metal
conforms to reality by making comparisons with experimental results. Solving
Eq. (22-43a) for the mean scattering time gives

$$ \tau = \frac{\sigma m_e}{Ne^2} \tag{22-44} $$


We  have  no  direct  way  of  measuring  r and  thus  checking  the  theory  against 
experiment.  However,  an  electron  moving  with  the  random  speed  urms 
makes  a collision,  on  the  average,  every  time  it  travels  a distance  A given  by 


A = urmsT  (22-45) 

This  distance  is  called  the  mean  free  path.  [As  was  pointed  out  in  the  first 
small-print  section  following  Eq.  (22-36),  the  drift  speed  is  very  small  com- 
pared to  the  random  speed.  So  it  does  not  significantly  affect  the  relation 
between  collision  time  and  collision  distance.]  For  an  ideal  gas,  Eq.  (18-53) 
gives  for  the  root-mean-square  random  speed 

$$v_{\mathrm{rms}} = \left(\frac{3kT}{m_e}\right)^{1/2} \tag{22-46}$$

where $k$ is Boltzmann's constant and $T$ is the absolute temperature. Substituting Eqs. (22-44) and (22-46) into Eq. (22-45), we obtain

$$\lambda = \left(\frac{3kT}{m_e}\right)^{1/2}\frac{\sigma m_e}{Ne^2}$$

or

$$\lambda = \frac{\sigma}{Ne^2}(3kTm_e)^{1/2} \tag{22-47}$$

If the "free-electron" model we have developed is to be physically meaningful, this value of the mean free path must be comparable to the distance between ions in the metal, since we have tentatively assumed that the free electrons are colliding with the ions which make up the crystalline "skeleton" of the metal. This check of the free-electron model against experimental data is carried out in Example 22-6.


EXAMPLE  22-6 

Calculate the mean free path for brass at room temperature ($T = 300$ K) and compare it to the interionic distance. Brass is an alloy consisting of about two-thirds copper and one-third zinc. Since the atomic weights of copper and zinc are 63.57 and 65.38, respectively, you will be accurate enough if you use an average atomic weight of 64 in finding the number of electrons per unit volume $N$ and the interionic distance $d$. Assume that each atom contributes approximately one free electron, so that the number of electrons is equal to the number of ions.

■ You have for $N$, the number of electrons (or ions) per unit volume,

$$N = \frac{\text{number of atoms in 1 kmol}}{\text{volume of 1 kmol}} = \frac{\text{Avogadro's number}}{\text{mass of 1 kmol}/\text{density}}$$

For brass, the density is $8.4 \times 10^3\ \mathrm{kg/m^3}$. Using this value, together with Avogadro's number $A = 6.0 \times 10^{26}$, and setting the mass of 1 kmol equal to 64 kg, you have

$$N = \frac{6.0 \times 10^{26} \times 8.4 \times 10^3\ \mathrm{kg/m^3}}{64\ \mathrm{kg}} = 7.9 \times 10^{28}\ \mathrm{m^{-3}}$$


This is both the number of ions per cubic meter and the number of electrons per cubic meter. If you imagine $N$ ions spaced a distance $d$ apart in a cubic array, they will fill a 1-m cube if the value of $d$ is given by $d = 1/N^{1/3}$. Thus you have

$$d = \frac{1}{(7.9 \times 10^{28}\ \mathrm{m^{-3}})^{1/3}} = 2.3 \times 10^{-10}\ \mathrm{m}$$


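Inserting the value of $N$ into Eq. (22-47), and obtaining $\sigma$ from Table 22-1, you find

$$\lambda = \frac{14 \times 10^6\ \mathrm{S/m}}{(1.60 \times 10^{-19}\ \mathrm{C})^2 \times 7.9 \times 10^{28}\ \mathrm{m^{-3}}} \times (3 \times 1.38 \times 10^{-23}\ \mathrm{J/K} \times 300\ \mathrm{K} \times 9.1 \times 10^{-31}\ \mathrm{kg})^{1/2} = 7.4 \times 10^{-10}\ \mathrm{m}$$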


So  the  mean  free  path  is  about  three  times  the  interionic  distance. 
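The arithmetic of Example 22-6 can be checked with a short Python sketch. The function and variable names are illustrative assumptions, not notation from the text. The constants are the rounded values quoted in the example, and each atom is assumed to contribute one free electron.

```python
import math

# Rounded constants as quoted in Example 22-6
E_CHARGE = 1.60e-19      # magnitude of the electron charge, C
M_ELECTRON = 9.1e-31     # electron mass, kg
K_BOLTZMANN = 1.38e-23   # Boltzmann's constant, J/K
N_AVOGADRO = 6.0e26      # atoms per kmol


def number_density(density, kmol_mass):
    """Atoms per cubic meter (equal to free electrons if one per atom)."""
    return N_AVOGADRO * density / kmol_mass


def interionic_distance(n):
    """Spacing d = 1/N^(1/3) for n ions per cubic meter in a cubic array."""
    return n ** (-1.0 / 3.0)


def mean_free_path(sigma, n, temperature):
    """Eq. (22-47): lambda = sigma * (3 k T m_e)^(1/2) / (N e^2)."""
    thermal = math.sqrt(3.0 * K_BOLTZMANN * temperature * M_ELECTRON)
    return sigma * thermal / (n * E_CHARGE ** 2)


# Brass: density 8.4e3 kg/m^3, 64 kg per kmol, sigma = 14e6 S/m, T = 300 K
n_brass = number_density(density=8.4e3, kmol_mass=64.0)
d_brass = interionic_distance(n_brass)
lam_brass = mean_free_path(sigma=14e6, n=n_brass, temperature=300.0)

print(f"N      = {n_brass:.2g} m^-3")
print(f"d      = {d_brass:.2g} m")
print(f"lambda = {lam_brass:.2g} m")
print(f"lambda/d = {lam_brass / d_brass:.1f}")
```

Because Eq. (22-47) is linear in the conductivity, calling `mean_free_path` with a conductivity four times larger returns a mean free path four times longer, which is the comparison with pure copper made in the text that follows.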


The result of Example 22-6 makes the free-electron picture look quite plausible. Indeed, the mean free path and the interionic distance are quite comparable for most alloys, where the ions of the constituent elements are mixed together in more or less random fashion. But for pure metals, the situation is quite different. Pure copper, for instance, has an interionic distance about the same as that of brass. But its conductivity is about 4 times greater at room temperature. According to Eq. (22-47), the mean free path is directly proportional to the conductivity, so that the mean free path for pure copper would appear to be something like 12 times the interionic distance. Even with the crude calculations we have used, it is questionable whether the electrons are really colliding with the ions in billiard-ball fashion. It is significant in this connection that the conductivity of alloys does not change dramatically when the temperature is reduced, but the conductivity of all pure metals increases quite rapidly. At liquid helium temperatures ($T \approx 4$ K) the conductivity of pure single crystals of metals can be $10^6$ times greater than the room-temperature value, and is usually at least $10^3$ times greater.




Fig. 22-12 Schematic plot of the electrical resistivity $\rho$ of a metal as a function of absolute temperature $T$.


The  microscopic  picture  of  ohmic  conduction  in  terms  of  a gas  of  colliding 
free electrons was originally put forward in 1900 by the German physicist P. Drude (1863-1906), in essentially the way we have developed it. Shortly afterward, H. A. Lorentz elaborated the theory by averaging explicitly over the
Maxwell-Boltzmann  distribution,  instead  of  using  the  simple  but  crude  averaging 
process  of  assigning  the  root-mean-square  speed  to  every  electron.  For  this  reason, 
the  theory  of  conduction  by  free  electrons  acting  like  ideal-gas  molecules  is  often 
called  the  Drude-Lorentz  theory.  It  is  not  worth  repeating  Lorentz’  calculations,  be- 
cause electrons  in  metals  do  not  conform  to  the  Maxwell-Boltzmann  distribution. 
Rather,  their  explicitly  quantum-mechanical  behavior  makes  them  follow  a quite 
different  distribution,  the  so-called  Fermi-Dirac  distribution,  which  we  do  not 
discuss  in  this  book.  Furthermore,  even  aside  from  collisions,  the  electrons  are  not 
free  to  move  within  the  metal  under  the  influence  of  the  externally  applied  electric 
field  only.  They  experience  a periodic  force  as  they  move  along.  This  periodic 
force  arises  from  the  orderly  crystalline  array  of  the  positively  charged  ions  in  the 
metal.  A fairly  complete  general  account  of  the  behavior  of  electrons  in  metals  was 
achieved  in  the  1930s,  and  it  is  one  of  the  major  accomplishments  of  modern 
solid-state  physics. 


The failure of the free-electron theory to account properly for the observed temperature dependence of conductivity is one of its greatest weaknesses. To see this quantitatively, we solve Eq. (22-47) for the conductivity $\sigma$ and obtain

$$\sigma = \frac{Ne^2\lambda}{(3kTm_e)^{1/2}}$$

If electrons collide with ions, the mean free path $\lambda$ should not change much with changing temperature, since the interionic distance changes only slowly as the metal expands or contracts. Thus the equation immediately above predicts that the conductivity should be inversely proportional to the square root of the absolute temperature:

$$\sigma \propto T^{-1/2}$$

Experiment contradicts this prediction; the conductivity of pure metals is roughly proportional to $T^{-1}$ if the temperature is not too low. Thus the resistivity $\rho$, which is the reciprocal of the conductivity $\sigma$, is directly proportional to $T$.

Figure 22-12 is a plot of resistivity versus temperature for a typical pure metal. At very low temperatures (typically below 20 K) the resistivity is nearly constant. This constant value is called the residual resistivity $\rho_{\mathrm{imp}}$. (The subscript "imp" stands for impurity. Its significance will become apparent soon.) At higher temperatures the resistivity increases at a constant rate. Thus, at sufficiently high temperatures the resistivity curve is fairly well approximated by the grey line in the figure, whose equation is

$$\rho = \rho_{\mathrm{imp}} + A(T - T_{\mathrm{imp}}) \tag{22-48}$$

The empirical constants $A$ and $T_{\mathrm{imp}}$ are determined by measurement.

The  explanation  for  this  observation  is  essentially  quantum-mechanical.  In 
certain  circumstances,  electrons  in  a metal  behave  in  a wavelike  fashion,  as  is  dis- 
cussed in  Chaps.  30  and  31.  It  turns  out  that  electrons  in  a metal  having  the  proper 
wavelength  would  not  collide  at  all  with  the  ions  in  a hypothetical  perfectly  regu- 
lar crystal.  (For  an  analogy,  imagine  a standing  wave  in  a string.  If  you  touch  the 
string  at  the  nodes,  nothing  much  happens.)  It  is  irregularities  in  the  crystalline 
structure of the metal which lead to collisions. In a carefully prepared crystal, these irregularities are of two main kinds. The first, which dominates in alloys, is
the  irregular  ordering  of  the  various  types  of  ions  in  the  mixture.  In  a more  or  less 
pure  metal,  there  are  still  always  some  ions  of  impurities  present  in  random  loca- 
tions through  the  crystal.  The  second  source  of  irregularity  arises  from  the  thermal 
vibration  of  the  ions  in  the  crystal  about  their  equilibrium  positions.  The  higher 
the  temperature,  the  greater  the  amplitude  of  this  vibration  on  the  average,  and  the 
greater  the  chance  that  the  ion  will  be  “out  of  its  regular  place”  when  the  electron 
passes  it. 

The electrical resistivity is directly proportional to the probability of collision for the electrons. But the probability of collision per unit time is just the reciprocal of the mean collision time. Thus we can write

$$\rho \propto \frac{1}{\tau}$$

Both sources of irregularity in the crystal have mean collision times associated with them. There is an impurity collision time $\tau_{\mathrm{imp}}$ and a thermal collision time $\tau_{\mathrm{th}}$. The two collision mechanisms are independent of each other, and each presents resistance to the passage of electrons. Thus we can define the independent resistivities $\rho_{\mathrm{imp}} \propto 1/\tau_{\mathrm{imp}}$ associated with the impurity collision mechanism and $\rho_{\mathrm{th}} \propto 1/\tau_{\mathrm{th}}$ associated with the thermal collision mechanism. The overall resistivity $\rho$ is the sum of the two:

$$\rho = \rho_{\mathrm{imp}} + \rho_{\mathrm{th}} \propto \frac{1}{\tau_{\mathrm{imp}}} + \frac{1}{\tau_{\mathrm{th}}} \tag{22-49}$$

The term $\rho_{\mathrm{imp}}$ in this equation is the residual resistivity. It is determined by the impurity concentration in the metal, and it does not depend on temperature. The term $\rho_{\mathrm{th}}$ is thermally dependent, and detailed analysis shows that it is proportional to $T$, so that

$$\rho_{\mathrm{th}} = AT$$

where $A$ is some constant. Thus we have for the overall resistivity $\rho$

$$\rho = \rho_{\mathrm{imp}} + \rho_{\mathrm{th}} = \rho_{\mathrm{imp}} + AT$$

This accounts for the experimental behavior at temperatures well above $T_{\mathrm{imp}}$ shown in Fig. 22-12 and summarized in the empirical equation (22-48). The fact that the resistivity of metals can be separated into a temperature-independent term and a temperature-dependent term is called Matthiessen's rule.
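As a rough illustration of Matthiessen's rule, the following Python sketch evaluates $\rho = \rho_{\mathrm{imp}} + AT$ at a few temperatures. The numerical values of `RHO_IMP` and `SLOPE_A` are hypothetical, chosen only to show how the two terms trade off. They are not measured data for any metal, and the linear form is assumed to hold only well above $T_{\mathrm{imp}}$.

```python
def resistivity(temperature, rho_imp, slope):
    """Matthiessen's rule in the linear regime: rho = rho_imp + A*T."""
    return rho_imp + slope * temperature


# Hypothetical parameters for illustration only (not measured values)
RHO_IMP = 1.0e-10   # residual (impurity) resistivity, ohm*m
SLOPE_A = 6.0e-11   # thermal coefficient A, ohm*m/K

for t in (50.0, 100.0, 200.0, 300.0):
    rho = resistivity(t, RHO_IMP, SLOPE_A)
    thermal_share = SLOPE_A * t / rho   # fraction of rho due to vibrations
    print(f"T = {t:5.0f} K  rho = {rho:.3e} ohm*m  thermal share = {thermal_share:.2f}")
```

With parameters like these the thermal term dominates at room temperature, while the temperature-independent impurity term sets the floor that remains as the temperature falls.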


22-6 JOULE'S LAW

When an electron is a part of a current flowing through a conductor, it starts at a location where its potential energy is high and moves toward a location where its potential energy is lower. Nevertheless, on the average the electron arrives at its new location with substantially the same kinetic energy it had to begin with, because on the average its speed does not change. As we noted in Sec. 22-5, the "missing" energy has been converted in form to the thermal energy of the random motion of the elementary constituents of the conductor.

Even  without  considering  microscopic  details,  the  amount  of  energy 
converted  into  heat  energy  can  be  expressed  in  macroscopic  terms.  The 
calculation  is  similar  to,  but  simpler  than,  the  one  of  Sec.  21-7,  which  led  to 
an  expression  for  the  energy  stored  in  a capacitor.  However,  the  energy 
stored  in  a capacitor  is  recoverable  because  the  system  is  conservative.  In  the 
case  of  electrical  work  done  against  the  “friction”  of  resistance,  the  system  is 
dissipative.  As  is  usual  when  heat  energy  is  involved  in  a process,  the  second 
law  of  thermodynamics  forbids  the  reconversion  of  all  the  random  thermal 
energy  into  ordered  energy  of  macroscopic  motion. 




We now calculate the amount of electric energy converted to heat energy when a current flows through a conductor. When an amount of electric charge $dq$ is driven through a conductor by a potential difference $V$ applied to the ends of the conductor by a source of emf, the work done by the electric force acting on the charge can be found by assuming that the conductor is uniform. This simplifies the calculation without affecting the generality of its results. Then we evaluate the magnitude $dF$ of the electric force exerted on the charge by the uniform electric field of magnitude $\mathcal{E}$. It is

$$dF = \mathcal{E}\,dq$$

Next we evaluate the work $dW$ done by this force in displacing the charge through the length $l$ of the conductor. Since the force always acts in the direction of the displacement, the work done is

$$dW = dF\,l = \mathcal{E}\,dq\,l$$

Since $\mathcal{E}$ has the same value throughout the conductor, Eq. (21-48a) allows us to write

$$\mathcal{E}l = V$$

Hence we have

$$dW = V\,dq$$


The rate $dW/dt$ at which work is being done is the electric power input $P$ to the conductor. It is found by dividing the equation immediately above by $dt$ to obtain

$$P = \frac{dW}{dt} = V\frac{dq}{dt}$$

But $dq/dt$ is just the electric current $i$, so we have

$$P = Vi \tag{22-50}$$

That is, the electric power input to the conductor is given by the product of the potential difference between its ends and the current through it. Equation (22-50) is called Joule's law.


Joule’s  law  is  an  extension  of  the  principle  of  energy  conservation  to  the  par- 
ticular case  of  converting  electric  energy  to  heat  energy.  Historically,  it  was  of 
great  importance  in  leading  to  a general  understanding  of  energy  conservation. 
Joule  made  a series  of  experiments  during  the  early  1840s  in  which  he  immersed  a 
variety  of  electric  conductors  in  a calorimeter  containing  a variety  of  substances. 
He  showed  that  a given  voltage  and  a given  current  always  result  in  the  same  rate 
of  evolution  of  heat,  regardless  of  both  the  material  through  which  the  electric  cur- 
rent is  passing  and  the  material  being  heated. 

Today  our  confidence  in  the  general  principle  of  energy  conservation  is  so 
well  founded  that  Joule’s  experiment  is  but  one  of  many  which  corroborate  the 
principle.  But  its  expression  in  Eq.  (22-50)  is  very  important  in  determining  the 
power  consumption  of  electric  circuits  or  parts  of  circuits. 


Joule's law can be combined with Ohm's law to yield two very useful expressions for the power consumption of any part of an electric circuit that obeys Ohm's law. A part of an electric circuit having specific electrical properties is called a circuit element. An ohmic circuit element, in particular, is called a resistor. We first substitute Ohm's law in the form $V = iR$ into Eq. (22-50), $P = Vi$, to obtain

$$P = i^2R \tag{22-51a}$$

Thus the power consumption for a given resistor is proportional to the square of the current passing through it.

Alternatively, we can eliminate the current $i$ from the equation $P = Vi$ by using Ohm's law in the form $i = V/R$. We then have

$$P = \frac{V^2}{R} \tag{22-51b}$$

So the power consumption of a given resistor is proportional to the square of the voltage across it.

Equations (22-50), (22-51a), and (22-51b) have been derived on the basis of the assumption that the voltage across the conductor remains constant. But electric power, $P = dW/dt$, is defined as a derivative and therefore has an instantaneous value even if the work being done on the circuit element varies with time because the voltage varies with time. Consequently, Joule's law, $P = Vi$, and its ohmic forms, $P = i^2R$ and $P = V^2/R$, hold true at any instant, even if the voltage across the conductor does not remain constant, provided that the proper instantaneous values of $V$ and $i$ are used.
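The equivalence of the three forms of Joule's law for an ohmic element can be checked numerically. The Python sketch below uses hypothetical values (a 4-ohm resistor and a 12-V peak sinusoidal voltage at 60 Hz) and illustrative function names. It evaluates the power at several instants from the instantaneous $V$ and $i$.

```python
import math


def power_vi(voltage, current):
    """Joule's law, Eq. (22-50): P = V i."""
    return voltage * current


def power_i2r(current, resistance):
    """Ohmic form, Eq. (22-51a): P = i^2 R."""
    return current ** 2 * resistance


def power_v2r(voltage, resistance):
    """Ohmic form, Eq. (22-51b): P = V^2 / R."""
    return voltage ** 2 / resistance


# Hypothetical ohmic resistor and sinusoidal source (illustration only)
RESISTANCE = 4.0   # ohm
V_PEAK = 12.0      # V
FREQUENCY = 60.0   # Hz

instantaneous = []
for step in range(5):
    t = step / (8.0 * FREQUENCY)            # samples across half a cycle
    v = V_PEAK * math.sin(2.0 * math.pi * FREQUENCY * t)
    i = v / RESISTANCE                      # Ohm's law at this instant
    p = power_vi(v, i)
    instantaneous.append((t, v, i, p))
    print(f"t = {t:.5f} s  V = {v:7.3f} V  i = {i:6.3f} A  P = {p:6.3f} W")
```

At every sampled instant the three functions return the same power, as long as the instantaneous voltage and current are used together.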

In ordinary commercial power transmission, the voltage between the contacts of the wall socket varies with time in a sinusoidal fashion, with a frequency of 60 Hz or 50 Hz (depending on the national standard). Nevertheless it is possible to specify an average voltage by using a procedure described in Sec. 26-9. This is the "line voltage" quoted by the power company. If a resistor of resistance $R$ is plugged into the socket, it is possible to use the line voltage in Eqs. (22-51b) and (22-50) to calculate an average power consumption and an average current. These average values are satisfactory for many purposes, as Examples 22-7 and 22-8 show.


EXAMPLE  22-7 

Find the current passing through a 110-V, 100-W light bulb when it is operating under normal conditions. If, as a result of a power shortage, the electric company reduces the line voltage by 5 percent, how much power does the nominal 100-W bulb consume? Assume that the light bulb is ohmic.

■ If $P_0$ and $V_0$ are the rated power and voltage of the light bulb, you can immediately use Joule's law to find the normal current $i_0$. You have from Eq. (22-50)

$$i_0 = \frac{P_0}{V_0} = \frac{100\ \mathrm{W}}{110\ \mathrm{V}} = 0.909\ \mathrm{A}$$


When the line voltage is reduced to a new value $V_1$, the power consumption is reduced to a new value $P_1$. You have from Eq. (22-51b)

$$\frac{P_1}{P_0} = \frac{V_1^2/R}{V_0^2/R} = \left(\frac{V_1}{V_0}\right)^2$$

Since the new voltage is 95 percent of the rated line voltage, you obtain $V_1/V_0 = 0.95$, and

$$P_1 = (0.95)^2P_0 = 0.90 \times 100\ \mathrm{W} = 90\ \mathrm{W}$$

Thus the power consumption (and the light output) decreases by 10 percent when the line voltage decreases by 5 percent.

The  calculation  just  made  can  be  only  approximate.  The  filament  of  a light 
bulb operates at a very high temperature (about 4000 K; the exact value depends
on the design of the bulb, which represents a compromise between efficient light
emission  and  long  bulb  life).  A small  reduction  in  power  consumption  results  in  a 
significant  decrease  in  filament  temperature  and  thus  in  a significant  reduction  in 
resistance.  That  is,  a light  bulb  is  not  ohmic  in  its  operating  range.  Because  of  the 
decrease  in  resistance,  the  power  consumption  does  not  fall  quite  as  much  as  the  cal- 
culation of  this  example  predicts.  However,  the  efficiency  of  visible-light  output 
falls  dramatically  with  falling  operating  temperature,  and  the  light  output  falls  by 
more  than  10  percent. 
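The idealized calculation of Example 22-7 can be reproduced with a few lines of Python. This is a sketch under the example's own assumption that the bulb is ohmic, so that its resistance stays fixed at the rated value. The function names are illustrative, not part of the text.

```python
def rated_current(p_rated, v_rated):
    """Normal operating current from Joule's law, i0 = P0 / V0."""
    return p_rated / v_rated


def power_at_voltage(p_rated, v_rated, v_new):
    """Power of an ohmic bulb at a new voltage: P1 = P0 * (V1/V0)^2."""
    resistance = v_rated ** 2 / p_rated     # fixed R, from P0 = V0^2 / R
    return v_new ** 2 / resistance


P_RATED = 100.0   # W
V_RATED = 110.0   # V

i0 = rated_current(P_RATED, V_RATED)
p1 = power_at_voltage(P_RATED, V_RATED, 0.95 * V_RATED)

print(f"i0 = {i0:.3f} A")
print(f"P1 = {p1:.1f} W  ({100.0 * (1.0 - p1 / P_RATED):.1f} percent reduction)")
```

As the note above explains, a real filament's resistance falls as it cools, so the actual power drop is somewhat smaller than this fixed-resistance estimate.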


EXAMPLE  22-8 

The manufacturer of a 110-V portable electric immersion heater claims that if the heater is dipped into a cup full of water, the water will be boiling and ready to make tea in 1 min. Estimate the power output of the heater. What current flows through it? What is its resistance?

■ Assume that a cup holds 200 cm$^3$, or 0.200 kg, of water. If the temperature of the water as it comes from the tap is a typical 10°C, it must be heated to boiling through a temperature difference $\Delta T = 90$ K. The required heat energy input to the water is thus given by Eq. (17-23) to be

$$\Delta H = cm\,\Delta T = 4186\ \mathrm{J/(kg \cdot K)} \times 0.200\ \mathrm{kg} \times 90\ \mathrm{K} = 7.5 \times 10^4\ \mathrm{J}$$


where $c$ is the specific heat capacity of the water, expressed in joules instead of kilocalories by the use of Eq. (17-27), and where $m$ is the mass of the water. Since this heat energy is transferred to the water at a steady rate in time $\Delta t$, the power output of the heater is

$$P = \frac{\Delta H}{\Delta t} = \frac{7.5 \times 10^4\ \mathrm{J}}{60\ \mathrm{s}} = 1.3 \times 10^3\ \mathrm{W}$$


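The current flow through the heater can be found from Eq. (22-50), $P = Vi$. You have

$$i = \frac{P}{V} = \frac{1.3 \times 10^3\ \mathrm{W}}{110\ \mathrm{V}} = 12\ \mathrm{A}$$

Ohm's law gives the resistance to be

$$R = \frac{V}{i} = \frac{110\ \mathrm{V}}{12\ \mathrm{A}} = 9\ \Omega$$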


Why  do  the  directions  on  such  heaters  always  specify  that  they  must  not  be  operated 
unless  immersed  in  water? 
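The chain of estimates in Example 22-8 can be sketched in Python as follows. The function name and argument names are illustrative assumptions, and the inputs are the same rough values used in the example (a 0.200-kg cup of water heated through 90 K in 60 s).

```python
C_WATER = 4186.0   # specific heat capacity of water, J/(kg*K)


def heater_estimate(mass, delta_t, heating_time, voltage):
    """Return (heat, power, current, resistance) for an immersion heater."""
    heat = C_WATER * mass * delta_t     # Delta H = c m Delta T
    power = heat / heating_time         # P = Delta H / Delta t
    current = power / voltage           # Joule's law, P = V i
    resistance = voltage / current      # Ohm's law, R = V / i
    return heat, power, current, resistance


heat, power, current, resistance = heater_estimate(
    mass=0.200, delta_t=90.0, heating_time=60.0, voltage=110.0
)

print(f"heat       = {heat:.3g} J")
print(f"power      = {power:.3g} W")
print(f"current    = {current:.3g} A")
print(f"resistance = {resistance:.3g} ohm")
```

Carrying unrounded intermediate values gives a current near 11.4 A and a resistance near 9.6 ohms. The example's 12 A and 9 ohms differ only because each step there is rounded to two significant figures before the next, which is adequate for an estimate.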


22-7 DIRECT-CURRENT CIRCUITS

All practical electric circuits are made up of lengths of conductors of negligible resistance which connect circuit elements. Every such circuit must contain at least one source of emf. If the source of emf produces a voltage whose sense is always the same, it is called a direct-current source, or a dc source for short. If, in addition, the magnitude of the voltage remains constant in time, the source of emf is called a steady source of direct current, or a steady source for short. Of course, no current will flow at all unless the source is connected to something else, in such a way as to provide a closed path, or circuit, for the current. In this section we deal with circuits containing steady sources only.




Fig. 22-13 A simple circuit diagram. The source of emf (here shown to be a battery) produces a steady potential difference $V$, whose sense or polarity is indicated by the signs + and -. Leads of negligible resistance connected to a resistor of resistance $R$ complete the circuit.


Figure 22-13 uses standard symbolism to depict the simplest possible circuit. Such a picture is called a circuit diagram. It contains a battery, represented by the standard symbol of two parallel lines of unequal length. The longer line conventionally represents the positive terminal, and the shorter one represents the negative terminal. The battery is connected by metal wires, conventionally represented by lines, in a closed loop to a resistor, represented by the standard zigzag symbol.


Any  real  battery  or  other  source  of  emf  has  an  internal  resistance.  That  is, 
some  electric  potential  energy  is  converted  to  heat  energy  within  the  source  as 
charge  flows  through  it.  But  the  symbol  in  the  circuit  diagram  is  intended  to  repre- 
sent an  ideal  battery  which  has  no  other  property  than  the  potential  difference 
between  its  terminals.  If  it  is  necessary  to  take  into  account  the  internal  resistance 
of  the  battery,  it  is  shown  as  an  extra  resistor  adjacent  to  the  battery. 

Likewise,  a real  resistor  may  have  other  electrical  properties  besides  its  resis- 
tance. (It  may,  for  example,  possess  a small  capacitance.)  But  the  symbol  depicts 
an  ideal  resistor  which  obeys  Ohm’s  law  and  has  no  other  properties. 

Similarly,  the  connecting  wires  have  a nonzero  resistance,  though  it  is 
usually  small  compared  to  that  of  the  resistor.  But  the  symbolic  lines  in  the  circuit 
diagram  are  intended  to  represent  ideal  conductors  having  no  resistance  at  all.  If  it 
is  necessary  to  take  into  account  the  resistance  of  the  wires,  it  is  shown  as  an  extra 
resistor  added  to  the  circuit. 

Instead  of  a battery,  the  circuit  could  contain  a thermocouple,  or  an  electric 
generator,  or  some  combination  of  sources  of  emf  which  produce  a steady  poten- 
tial difference  between  their  terminals. 


Fig. 22-15 Two resistors connected in parallel.

Sources  of  emf  and  ohmic  resistors  can  be  connected  in  all  sorts  of 
combinations.  The  fundamental  task  in  what  is  loosely  but  universally 
called  dc  circuit  analysis  is  to  find  the  current  flowing  through  any  part  of 
a circuit,  or  the  voltage  between  any  two  points  in  the  circuit,  given  the  nec- 
essary information  concerning  the  emf's  and  resistances  of  the  elements 
making  up  the  circuit. 

To  begin,  consider  the  network  of  resistors  shown  in  Fig.  22-14.  When 
a steady  potential  difference  Vab  exists  between  points  a and  b (presumably 
because  there  is  a source  of  emf  in  some  other  part  of  the  circuit  not 
shown),  a certain  steady  current  i will  flow  into  the  network  at  a.  And 
charge  cannot  accumulate  within  the  network,  because  if  it  did,  the  current 
could  not  be  steady.  Hence  the  same  current  i will  flow  out  at  b.  The  cur- 
rent will  split  up  and  merge  in  a complicated  way  at  the  various  branching 
points  within  the  network.  Nevertheless,  we  can  divide  Vab  by  i and  define 
the  equivalent  resistance  R of  the  network  to  be 


$$R = \frac{V_{ab}}{i}$$

Fig. 22-16 Two resistors connected in series.

We assert that if each individual resistor $R_j$ within the network obeys Ohm's law, the network as a whole will do so. If this is true (and we will soon show that it is), the entire network can be replaced with an equivalent resistor $R$ between points $a$ and $b$. The question is, How can we determine the value of $R$ in terms of the component resistances $R_1$, $R_2$, and so on?

The  simplest  possible  combinations  of  resistors  are  represented  in  Figs. 
22-15  and  22-16.  The  first  is  called  a parallel  connection,  and  the  second  is 
a series  connection.  (The  terms  “parallel”  and  “series”  are  used  in  the  same 
way  as  they  are  for  combinations  of  capacitors  in  Sec.  21-6.) 




To find the equivalent resistance for resistors in parallel, we note that the entire potential difference $V_{ab}$ across the network from $a$ to $b$ is imposed simultaneously on both resistors $R_1$ and $R_2$. Thus each resistor carries the same current as it would carry if the other were not there, and the total current $i$ flowing from $a$ to $b$ is

$$i = i_1 + i_2 \tag{22-52}$$

where $i_1$ is the current flowing through $R_1$ and $i_2$ is the current flowing through $R_2$. Each resistor obeys Ohm's law, so we have

$$i_1 = \frac{V_{ab}}{R_1} \qquad \text{and} \qquad i_2 = \frac{V_{ab}}{R_2}$$

Thus the total current can be written as

$$i = \frac{V_{ab}}{R_1} + \frac{V_{ab}}{R_2} = V_{ab}\left(\frac{1}{R_1} + \frac{1}{R_2}\right) = \frac{V_{ab}}{R}$$

Fig. 22-17 A network of $N$ resistors in parallel.

These equations show that the total current $i$ is also directly proportional to $V_{ab}$, and Ohm's law is thus obeyed by the network as a whole. We write the proportionality constant for the entire network as $1/R$, where we define

$$\frac{1}{R} = \frac{1}{R_1} + \frac{1}{R_2} \qquad \text{for parallel connection} \tag{22-53}$$

The  quantity  R is  what  we  have  called  the  equivalent  resistance  of  the  net- 
work. The  reciprocal  of  the  equivalent  resistance  of  two  resistors  in  parallel  is  equal 
to  the  sum  of  the  reciprocals  of  the  individual  resistances. 

The derivation of Eq. (22-53) can be extended directly to apply to any number $N$ of resistors in parallel, as in Fig. 22-17. The result is

$$\frac{1}{R} = \frac{1}{R_1} + \frac{1}{R_2} + \cdots + \frac{1}{R_j} + \cdots + \frac{1}{R_N} \qquad \text{for parallel connection} \tag{22-54a}$$

or, in the summation notation,

$$\frac{1}{R} = \sum_{j=1}^{N}\frac{1}{R_j} \qquad \text{for parallel connection} \tag{22-54b}$$

For two resistors, it is sometimes handy to write Eq. (22-53) in the form

$$R = \frac{R_1R_2}{R_1 + R_2} \qquad \text{for parallel connection} \tag{22-55}$$

Deriving  this  equation  involves  only  algebraic  manipulation  of  Eq.  (22-53). 


In order to find the equivalent resistance for the two resistors shown in series in Fig. 22-16, we take advantage of the facts that in the steady state charge cannot accumulate in the network and there is only one pathway along which charge can flow. Consequently, the current $i$ must be the same everywhere along the pathway. Thus we have

$$i = i_1 = i_2 \tag{22-56}$$

where $i_1$ and $i_2$ are again the currents flowing through resistors $R_1$ and $R_2$, respectively. It is not true here, as it was for the parallel connection, that the potential differences $V_1$ and $V_2$ across the resistors equal the total potential difference $V_{ab}$ between $a$ and $b$. Rather, $V_{ab}$ divides itself between the resistors in some way, subject to the condition that the potential differences across the two resistors must add up to the total potential difference between $a$ and $b$. Thus $V_{ab}$ is given by the sum

$$V_{ab} = V_1 + V_2$$

Each individual resistor obeys Ohm's law, and since $i = i_1 = i_2$, we have

$$V_1 = iR_1 \qquad \text{and} \qquad V_2 = iR_2$$

Adding these two potential differences gives

$$V_{ab} = iR_1 + iR_2 = i(R_1 + R_2) = iR$$

Here again, as in the parallel case, the voltage $V_{ab}$ between $a$ and $b$ is directly proportional to the current $i$, and the network as a whole obeys Ohm's law. The proportionality constant $R$ is defined as the sum

$$R = R_1 + R_2 \qquad \text{for series connection} \tag{22-57}$$

and $R$ is the equivalent resistance of the network. Thus the equivalent resistance of two resistors in series is the sum of the individual resistances.

The  derivation  of  the  equivalent  resistance  for  resistors  in  series  can  be 
extended  directly  to  apply  to  any  number  N of  resistors  in  series.  The  result 
is 


$$R = R_1 + R_2 + \cdots + R_j + \cdots + R_N \qquad \text{for series connection} \tag{22-58a}$$

or, in summation notation,

$$R = \sum_{j=1}^{N} R_j \qquad \text{for series connection} \tag{22-58b}$$
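The summation rules (22-54b) and (22-58b) translate directly into code. The following Python sketch uses the illustrative function names `series` and `parallel`, which are assumptions rather than notation from the text, and it assumes that every resistance is a positive number.

```python
def series(*resistances):
    """Eq. (22-58b): equivalent resistance is the sum of the R_j."""
    return sum(resistances)


def parallel(*resistances):
    """Eq. (22-54b): reciprocal of R is the sum of the reciprocals 1/R_j."""
    return 1.0 / sum(1.0 / r for r in resistances)


# Two-resistor check against the product-over-sum form, Eq. (22-55)
r1, r2 = 3.0, 6.0
print(parallel(r1, r2), r1 * r2 / (r1 + r2))

# Any number of resistors can be passed
print(series(10.0, 20.0, 30.0))
print(parallel(10.0, 20.0, 30.0))
```

A parallel combination is always smaller than its smallest member, and a series combination is always larger than its largest member. Both make a quick sanity check on hand calculations.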

It  is  often  remarked  that  the  rules  for  finding  the  equivalent  resistance  of  re- 
sistors in  parallel  [Eq.  (22-53)]  and  in  series  [Eq.  (22-57)]  are  the  “opposite”  of 
those  for  finding  the  equivalent  capacitance  for  capacitors  in  parallel  [Eq.  (21-50)] 
and  in  series  [Eq.  (21-51)].  We  write  the  four  equations  together  for  comparison: 

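$$\frac{1}{R} = \frac{1}{R_1} + \frac{1}{R_2} \qquad \text{and} \qquad C = C_1 + C_2 \qquad \text{for parallel connection}$$

$$R = R_1 + R_2 \qquad \text{and} \qquad \frac{1}{C} = \frac{1}{C_1} + \frac{1}{C_2} \qquad \text{for series connection}$$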

Can  you  account  for  this  “oppositeness”  qualitatively?  Suppose  that  instead  of 
deriving  the  equations  for  equivalent  resistance  R,  you  derive  the  equations  for 
equivalent  conductance  S,  where  S = l/R.  How  would  they  compare  with  the 
corresponding  expressions  for  equivalent  capacitance? 

The  equations  for  parallel  and  series  connections  are  applied  to  a 
more  complicated  network  in  Example  22-9. 


EXAMPLE  22-9 

Find  the  equivalent  resistance  of  the  network  shown  in  Fig.  22-18. 

■ First note that the network between points $a$ and $b$ is a parallel network with two branches. However, the lower branch is itself a subnetwork having two resistors in series. But if you find the equivalent resistance $R_s$ for the subnetwork, you can replace $R_2$ and $R_3$ with the single resistor $R_s$. Then you can find the equivalent resistance $R$ for the entire network consisting of $R_1$ and $R_s$ in parallel.

Fig. 22-18 Resistance network analyzed in Example 22-9, with $R_1 = 100\ \Omega$, $R_2 = 50\ \Omega$, and $R_3 = 30\ \Omega$.

So you begin by applying Eq. (22-57) to find

$R_s = R_2 + R_3 = 50\ \Omega + 30\ \Omega = 80\ \Omega$

Then you use Eq. (22-53) to write $1/R = 1/R_1 + 1/R_s$, or

$R = \frac{1}{1/R_1 + 1/R_s} = \frac{1}{1/100\ \Omega + 1/80\ \Omega} = 44\ \Omega$
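The arithmetic of Example 22-9 can be checked with a few lines of Python. This is only a sketch of the two-step reduction described in the example; the variable names are illustrative and do not come from the text.

```python
# Resistances from Fig. 22-18, in ohms.
R1, R2, R3 = 100.0, 50.0, 30.0

# Step 1: R2 and R3 are in series in the lower branch.
Rs = R2 + R3

# Step 2: R1 and the series subnetwork Rs are in parallel.
R = 1.0 / (1.0 / R1 + 1.0 / Rs)

print(f"Rs = {Rs:.0f} ohm, R = {R:.1f} ohm")  # Rs = 80 ohm, R = 44.4 ohm
```

Rounded to two significant figures, the result agrees with the 44 Ω quoted in the example.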


Fig. 22-19  A network in which the resistors are neither in series nor in parallel.

Fig. 22-20  A circuit containing several sources of emf and several resistors.


As  long  as  a complex  network  of  resistors  can  be  broken  down  into 
subnetworks  consisting  of  resistors  which  are  all  in  series  or  in  parallel,  the 
method  used  in  Example  22-9  can  be  extended  through  as  many  steps  as 
needed  to  find  the  overall  equivalent  resistance.  However,  the  method 
cannot always be applied. To give one example of this, consider the network
of resistors shown in Fig. 22-19. Resistors $R_1$ and $R_2$, for instance, are
not in parallel since points c and d are not in general at the same potential.
Similarly, $R_1$, $R_5$, and $R_4$ do not form a subnetwork in series because they
do not all carry the same current.

More  generally,  the  series  and  parallel  connection  rules  cannot  be  ap- 
plied to  a circuit  which  contains  more  than  one  source  of  emf.  Figure  22-20 
shows  such  a circuit.  We  cannot  simply  find  the  equivalent  resistance  R for 
the  path  between  a and  b and  write  V4  = iR. 

Both  cases,  however,  can  be  analyzed  by  using  a set  of  universal  rules 
called  Kirchhoff’s  rules.  Before  stating  these  rules,  we  must  define  a few 
terms.  First,  any  circuit  whatsoever  can  be  broken  up  into  a set  of  closed 
loops.  In  Fig.  22-20  three  such  loops  are  the  paths  aceba,  acdeba,  and  ecde. 
(There  are  other  possible  closed  paths,  such  as  acedceba,  but  these  involve 
retracings  of  steps  and  are  of  no  interest.)  Second,  a circuit  may  contain  a 
number  of  nodes,  such  as  the  points  c and  e,  where  three  or  more  pathways 
for current come together. Third, a branch is a part of a loop lying between
two nodes and having no nodes within it. For example, ce, cde, and cabe are
branches. 

Kirchhoff’s  rules  state: 

1.  The  algebraic  sum  of  the  currents  entering  a node  is  zero.  For  the  pur- 
poses of  this  rule,  we  call  the  sense  of  a current  positive  if  it  flows  into  a 
node  and  negative  if  it  flows  out  of  the  node.  This  rule,  called  the  node 
rule,  is  nothing  more  than  a statement  of  the  fact  that  in  the  steady  state 
charge  cannot  build  up  at  a node.  It  is  based  on  the  law  of  the  conservation 
of  charge. 

2.  The  algebraic  sum  of  the  voltages  across  all  the  circuit  elements  in  a loop  is 
zero.  This  rule  is  called  the  loop  rule.  It  is  based  on  the  fact  that  if  we  start 
with  a positive  test  charge  at  any  point  in  a circuit,  and  bring  it  around  any 
loop  back  to  the  starting  point,  its  final  potential  energy  must  be  the  same 
as  its  initial  potential  energy. 


Let us apply Kirchhoff’s rules to the circuit of Fig. 22-20. The first step
is  to  assign  a sense  of  flow  to  the  current  in  each  branch  of  the  circuit,  as  we 
have done in the figure. The sense assigned is perfectly arbitrary; if you
guess wrong, you will simply get a negative value for that current when you
have  completed  the  analysis  of  the  circuit.  In  terms  of  the  directions  chosen 
in  Fig.  22-20,  the  node  rule  at  node  c can  be  written 

$i_1 - i_2 - i_3 = 0$    (22-59)

At node e the node rule gives

$i_2 + i_3 - i_1 = 0$

Comparison  of  these  two  equations  shows  that  the  second  one  furnishes  no 
further  information.  As  a general  rule,  the  number  of  independent  node  equa- 
tions is  one  less  than  the  number  of  nodes  in  the  circuit.  Can  you  explain  why? 

In  order  to  write  equations  using  the  loop  rule,  it  is  important  to  adopt 
a sign  convention  and  to  stick  to  it.  There  are  several  possible  ones.  We  use 
one  which  is  consistent  with  the  sign  of  electric  potential  energy  changes 
experienced  by  a positive  test  charge  as  it  is  taken  around  a loop.  The  con- 
vention is  as  follows: 

a.  When  a positive  test  charge  passes  through  a source  of  emf  from 
the  negative  to  the  positive  terminal,  its  electric  potential  energy  increases. 
Therefore, in Fig. 22-20 the potential change in going through the battery
of voltage $V_1$ from the negative to the positive terminal (that is, from left to
right) is $+V_1$. Passing through battery $V_2$ from the positive to the negative
terminal (that is, from right to left) results in a potential change $-V_2$.

b.  When a positive test charge passes through a resistor in the sense of
positive current, its potential energy decreases, since it is going from a
higher potential to a lower one. Thus, the potential change in going through
resistor $R_1$ from left to right is $-i_1 R_1$, because this passage is in the positive
sense of the current. Such a potential change is often called an iR drop, or a
voltage drop. Passing through resistor $R_2$ from left to right results in a
potential change $-(-i_2 R_2) = +i_2 R_2$, because the sense is opposite to that
of the current.

We  are  now  ready  to  write  equations  using  the  loop  rule.  Going  clock- 
wise around  the  loop  aceba,  we  have 

$-i_1 R_1 - i_2 R_2 + V_1 = 0$    (22-60)

Going clockwise around the loop cdec gives

$-i_3 R_3 - V_2 + i_2 R_2 = 0$    (22-61)

An  equation  could  also  be  written  for  the  loop  acdeba,  but  it  would  not  be 
independent  of  the  other  two.  Here  again,  the  general  rule  is  that  the 
number  of  independent  loop  equations  is  one  less  than  the  number  of  loops.  (Some- 
times, however,  the  extra  dependent  equation  can  be  helpful  in  the  task  of 
solving  a set  of  simultaneous  equations.) 

Equations (22-59), (22-60), and (22-61) comprise a set of three independent
simultaneous equations describing the currents in the circuit of
Fig. 22-20. If the emfs $V_1$ and $V_2$ and the resistances $R_1$, $R_2$, and $R_3$ are
known, the three currents $i_1$, $i_2$, and $i_3$ are the unknowns. The three equations
contain just sufficient information to solve for these three unknowns.
In general, however, any five of the emfs, resistances, and currents could
be given and the other three quantities solved for. In any case the result is a
complete characterization of the electrical properties of the circuit. This is
illustrated in Example 22-10.
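When both emfs and all three resistances are known, Eqs. (22-59), (22-60), and (22-61) form a linear system in the three currents. The sketch below solves that system in plain Python with a small Gauss-Jordan elimination. The helper names `solve_linear` and `branch_currents` are hypothetical, and the numerical values are borrowed from Example 22-10 (with $V_2$ = 14.7 V, the value obtained there) purely as a consistency check.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] without modifying the inputs.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot row.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def branch_currents(v1, v2, r1, r2, r3):
    """Return (i1, i2, i3) for the circuit of Fig. 22-20, using the text's sign conventions."""
    a = [
        [1.0, -1.0, -1.0],  # node c, Eq. (22-59):    i1 - i2 - i3 = 0
        [-r1, -r2, 0.0],    # loop aceba, Eq. (22-60): -i1 R1 - i2 R2 = -V1
        [0.0, r2, -r3],     # loop cdec, Eq. (22-61):  i2 R2 - i3 R3 = V2
    ]
    b = [0.0, -v1, v2]
    return tuple(solve_linear(a, b))


# Values from Example 22-10 (volts and ohms).
i1, i2, i3 = branch_currents(v1=9.0, v2=14.7, r1=2.0, r2=3.0, r3=5.0)
print(i1, i2, i3)
```

The currents should come out close to 0.9 A, 2.4 A, and -1.5 A, matching the example; the negative sign of the third current signals that the assumed sense in branch cde was opposite to the actual one.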


EXAMPLE 22-10

Fig. 22-21  Illustration for Example 22-10. ($V_1$ = 9.0 V, $R_1$ = 2.0 Ω, $R_2$ = 3.0 Ω, $R_3$ = 5.0 Ω; the ammeter reads $i_2$ = 2.4 A.)


Figure 22-21 shows the same basic circuit as Fig. 22-20. However, an ammeter (a device
which measures current) has been added in branch ce of the circuit. Given the
values of $V_1$, $R_1$, $R_2$, and $R_3$ shown in Fig. 22-21, find $V_2$, the emf of the battery in
branch cde, if the ammeter reads a current $i_2$ = 2.4 A flowing to the left in branch ce.

■ The three unknowns in the circuit are $i_1$, $i_3$, and $V_2$. If you wish, you can find a
general solution of the set of simultaneous equations (22-59), (22-60), and (22-61)
for these quantities. However, it is usually not worth the trouble to do so, since
every circuit is different and specific. Solving a set of simultaneous equations by
initially inserting the numerical value of each known term is usually less trouble.
Doing this in the three equations gives you

$i_1 - i_3 = 2.4\ \text{A}$    (22-62)

$-i_1 \times 2.0\ \Omega - 2.4\ \text{A} \times 3.0\ \Omega + 9.0\ \text{V} = 0$    (22-63)

$-i_3 \times 5.0\ \Omega - V_2 + 2.4\ \text{A} \times 3.0\ \Omega = 0$    (22-64)

Equation (22-63) can be solved immediately for $i_1$. You have

$i_1 = \frac{-7.2\ \text{V} + 9.0\ \text{V}}{2.0\ \Omega} = 0.9\ \text{A}$

This gives you the information you need to solve Eq. (22-62) for $i_3$. You have

$i_3 = 0.9\ \text{A} - 2.4\ \text{A} = -1.5\ \text{A}$

The negative sign of $i_3$ means that the original arbitrary choice of current sense in
the branch cde was wrong; the actual sense is from left to right.

You can now solve Eq. (22-64) for $V_2$. You have

$V_2 = 1.5\ \text{A} \times 5.0\ \Omega + 2.4\ \text{A} \times 3.0\ \Omega = 14.7\ \text{V}$
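The substitution steps of Example 22-10 translate directly into a short calculation. The Python sketch below simply follows Eqs. (22-63), (22-62), and (22-64) in that order; the variable names are illustrative.

```python
# Known quantities from Fig. 22-21 (volts, ohms, amperes).
V1 = 9.0
R1, R2, R3 = 2.0, 3.0, 5.0
i2 = 2.4  # ammeter reading

# Eq. (22-63): -i1 R1 - i2 R2 + V1 = 0, solved for i1.
i1 = (V1 - i2 * R2) / R1

# Eq. (22-62): i1 - i3 = i2, solved for i3.
i3 = i1 - i2

# Eq. (22-64): -i3 R3 - V2 + i2 R2 = 0, solved for V2.
V2 = i2 * R2 - i3 * R3

print(f"i1 = {i1:.1f} A, i3 = {i3:.1f} A, V2 = {V2:.1f} V")
```

Because each equation contains only one remaining unknown at the moment it is used, no simultaneous solution is needed for this particular set of given quantities.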


The Wheatstone bridge, shown in Fig. 22-22, is a variation of the network
shown in Fig. 22-19. It is a device for measuring the resistance X of an
unknown resistor accurately and relatively quickly. The unknown resistor
is made part of the diamond-shaped network whose other members are the
calibrated resistors having resistances A and B and the calibrated variable
resistor whose resistance is denoted by J. (The arrow through the conventional
sign for a resistor denotes a variable resistor.) The terminals of a battery
are connected to points a and b, so that currents $i_1$, $i_2$, $i_3$, and $i_4$ flow
along the branches of the network as shown in the figure.




Between  points  c and  d is  connected  a galvanometer.  This  is  a sensitive 
instrument  which  can  detect  the  passage  of  a current.  It  is  thus  like  an  am- 
meter, except  that  it  need  not  have  a calibrated  scale,  since  it  is  used  as  a 
null  instrument.  A null  instrument  is  any  device  used  solely  to  determine 
when  the  quantity  to  which  it  is  sensitive  is  zero.  Such  null  measurements 
are  desirable,  if  they  can  be  arranged,  since  errors  or  variations  in  calibra- 
tion will  not  affect  the  measurement. 

The galvanometer is protected by the switch S, which is normally open
(that  is,  the  conducting  contacts  are  not  touching,  and  current  cannot  flow 
through  it).  If  the  switch  is  momentarily  closed,  a current  iG  will  normally 
flow  through  the  galvanometer.  The  entire  circuit  can  be  analyzed  in  this 
general  case  by  using  Kirchhoff's  rules,  but  we  need  not  do  so  here.  The  es- 
sential point  for  our  discussion  is  that  current  will  flow  through  the  galvano- 
meter branch  only  when  the  potentials  at  points  c and  d are  different  (as  they  gener- 
ally will  be). 


But suppose the resistor J is adjusted so that the potentials at c and d
are the same. Then no current will flow through the galvanometer, even
when the switch is closed. And if $i_G = 0$, Kirchhoff’s node rule gives

Node c:  $i_1 - i_3 = 0$
Node d:  $i_2 - i_4 = 0$

or

$i_1 = i_3$  and  $i_2 = i_4$

What  are  the  conditions  under  which  this  will  happen?  If  the  poten- 
tials at  c and  d are  to  be  the  same,  they  must  be  less  than  the  potential  at  a by 
the  same  amount.  We  can  thus  use  Ohm’s  law  to  write 


$-i_1 A = -i_2 J$    (22-65)

The potentials at c and d must also be greater than that at b by the same
amount. So we have

$i_3 B = i_4 X$

But since $i_3 = i_1$ and $i_4 = i_2$, we can rewrite this equation as

$i_1 B = i_2 X$

Dividing Eq. (22-65) by this equation, we have

$\frac{A}{B} = \frac{J}{X}$

or

$X = J\,\frac{B}{A}$    for $i_G = 0$    (22-66)

Thus  the  unknown  resistance  X can  be  found  in  terms  of  the  known  resis- 
tances J,  A,  and  B.  The  result  does  not  depend  on  the  voltage  of  the  battery 
or  on  the  magnitudes  of  the  currents  in  the  branches  of  the  network.  For 
convenience,  the  ratio  B/A  is  usually  chosen  to  be  some  multiple  of  10. 
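The balance condition can be illustrated numerically. In the Python sketch below, the function names, the battery voltage, and the resistance values are all assumptions chosen for illustration. With the galvanometer branch carrying no current, each side of the bridge acts as a simple two-resistor series path, so the potentials at c and d (measured from b) can be computed directly and compared.

```python
def unknown_resistance(j, a, b):
    """Eq. (22-66): X = J * B / A, valid when the galvanometer current is zero."""
    return j * b / a


def midpoint_potentials(v, a, b, j, x):
    """Potentials of points c and d relative to b, assuming no galvanometer current.

    Path a-c-b contains A then B; path a-d-b contains J then X.
    """
    v_c = v * b / (a + b)  # i1 = v / (A + B), and V_c - V_b = i1 * B
    v_d = v * x / (j + x)  # i2 = v / (J + X), and V_d - V_b = i2 * X
    return v_c, v_d


# Assumed illustrative values: ratio B/A = 10, variable resistor set to 47.3 ohms.
A, B, J = 100.0, 1000.0, 47.3
X = unknown_resistance(J, A, B)

v_c, v_d = midpoint_potentials(1.5, A, B, J, X)
print(f"X = {X:.1f} ohm, V_c = {v_c:.4f} V, V_d = {v_d:.4f} V")
```

Changing the assumed battery voltage changes both potentials by the same factor, so they remain equal; this is consistent with the statement that the result does not depend on the voltage of the battery.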




EXERCISES 


Group  A 

22-1.  Van de Graaff generator.  The potential dif-
ference between the electrodes of a van de Graaff genera-
tor is  maintained  at  3.0  x 10s  V.  The  charging  belt  adds 
charge at the rate of $2.0 \times 10^{-4}$ C/s. What power is required
to overcome the electric forces acting on the charging
belt?

22-2.  Original  definition  of  the  ampere.  Calculate  the 
number  of  milligrams  of  silver  deposited  by  the  passage  of 
one  coulomb  of  charge  using  107.88  for  the  molecular 
weight of silver and $9.6487 \times 10^{7}$ for the numerical value
of  Faraday’s  constant.  Pay  due  regard  to  significant  fig- 
ures. (The  deposition  of  this  much  silver  in  one  second 
was  once  the  legal  definition  of  the  ampere.) 

22-3.  Electrolytic cells in series.  In Fig. 22E-3, two electrolytic
cells are in series. In one, copper metal is deposited
on a carbon rod. In the other, silver is similarly deposited.

Fig. 22E-3  (Labels: carbon, silver, silver nitrate solution, copper sulfate solution.)

a.  If  1.00  g of  copper  is  deposited,  what  total  charge 
passed  through  the  circuit?  The  molecular  weight  of 
copper  is  63.6  and  its  valence  is  2. 

b.  What  mass  of  silver  was  also  deposited?  The 
molecular  weight  of  silver  is  108  and  its  valence  is  1. 

22-4.  Current density in a wire.  Two wires of different
diameter, but both having circular cross section, are soldered
end to end. The first wire has a diameter of 1.50
mm, the second a diameter of 2.00 mm. What is the ratio
of the current density j in the first wire to that in the second
wire? Assume the current to be uniformly distributed
across the cross section of each wire.

22-5.  Building up resistance.  The diameter of no. 30
wire is 0.250 mm. What length of no. 30 Nichrome wire is
needed to wind a 100-Ω resistor?


22-6.  Aluminum preferred.  A length of copper wire has
a cross section such that its resistance is exactly 1 Ω.

a.  If an aluminum wire of the same length also has a
resistance of exactly 1 Ω, what is the ratio of the cross section
of the aluminum to that of the copper?

b.  What is the ratio of the weight of the aluminum to
that of the copper? (Density of aluminum = $2.7 \times 10^{3}$
kg/m$^3$; density of copper = $8.9 \times 10^{3}$ kg/m$^3$.)

c.  Why is aluminum preferred to copper for the
cables of an above-ground high voltage transmission line?

22-7.  Temperature coefficient of resistance.  In Fig. 22E-7,
$R_1$ is at the temperature of melting ice. Its resistance at this
temperature is 3.0 Ω. $R_2$ is a variable resistor. It is set so
that its resistance in the circuit is 4.0 Ω. $R_1$ is now placed in
boiling water. To keep the reading of the ammeter A constant,
the sliding contact is moved so that the resistance of
$R_2$ in the circuit is decreased to 3.2 Ω. What is the temperature
coefficient of $R_1$?

Fig. 22E-7

22-8.  Rotating coil.  The coil described in Sec. 22-4
has a radius of 6.0 cm and is wound with wire whose resistance
is 0.30 Ω/m. It is rotating at 60 revolutions/s. What
charge can be expected to flow through the galvanometer
if the charge carriers are free electrons?

22-9.  Drift velocity, I.  Number 14 wire has a diameter
of 0.16 cm. Such insulated copper wire has a safe carrying
capacity of 15 A. Calculate the drift velocity of the carriers
(electrons), assuming that each copper atom supplies one
free electron. The molecular weight of copper is 63.6 and
its density is 8.9 g/cm$^3$.

22-10.  Drift velocity, II.  A 60-W tungsten bulb has a
filament whose diameter is 0.0033 cm. It carries a current
of 0.50 A. What is the drift velocity of the electrons in the
tungsten filament? The density of tungsten is 19.3 g/cm$^3$.
Its molecular weight is 184. Assume that there are two
free electrons for each tungsten atom.

22-11.  Dim light bulbs.

a.  What is the resistance of a 60-W, 120-V electric
light bulb when it is used across a 120-V line?

b.  Repeat for a 100-W bulb.

c.  If these two light bulbs are connected in series to a
120-V line, they will not have their normal brightness.
However, one will be brighter than the other. Which one?
Explain.


22-12.  Figure out watt’s watt.  A 10,000-Ω resistor has a
power rating of 1 W. This means that it must be able to
safely dissipate this much power.

a.  What is the maximum permissible current it may
carry?

b.  What is the maximum allowable voltage across it?

22-13.  Power  loss.  The  power  supplied  to  a house  is 
3300  W at  110  V.  The  feed  wires  from  the  power  lines  to 
the house have 0.10 Ω resistance.

a.  Calculate  the  loss  of  power  in  the  feed  wires. 

b.  If  the  same  power  were  supplied  at  220  V,  what 
would  be  the  power  loss? 

22-14.  Grounded circuit.  The box in Fig. 22E-14 represents
a 10-V battery whose internal resistance is 1.0 Ω. It is
connected to a 9.0-Ω resistor. Point A is connected to the
ground. The grounded point is usually taken as the zero
reference level from which potentials are measured.

Fig. 22E-14  (Labels: points B, C, and D; 10 V; 1.0 Ω; 9.0 Ω.)

a.  What is the current at point C? point B?

b.  What is the potential at points B, C, and D?

c.  What is the potential difference $V_C - V_B$ across the
battery terminals, which the battery supplies to the external
circuit?

d.  Why  does  no  current  flow  into  or  out  of  the 
ground? 

22-15.  Internal  resistance  of  a dry  cell.  The  emf  of  a dry 
cell  is  1.50  V.  When  the  switch  in  Fig.  22E-15  is  closed,  the 
ammeter  A reads  1.50  A.  At  the  same  time,  the  voltmeter 
V connected  to  the  terminals  CD  of  the  cell  reads  1.35  V. 
What  is  the  internal  resistance  of  the  cell?  The  value  of  the 
external  resistance  is  not  known.  The  voltmeter  has  a very 
high  resistance  and  draws  negligible  current. 

Fig.  22E-15 


22-16.  Equivalent  resistance.  Determine  the  equivalent 
resistance  of  the  network  drawn  in  Fig.  22E-16.  That  is, 
determine  the  resistance  between  the  terminals  A and  B in 
the  figure. 

Fig. 22E-16  (Labels: terminal B; 2.00 Ω.)


Group  B 

22-17.  Electrolytic  cells  in  parallel.  The  two  electrolytic 
cells  shown  in  Fig.  22E-3  are  connected  in  parallel.  What 
should  be  the  ratio  of  their  internal  resistances  such  that 
the  same  mass  per  second  is  deposited  in  each  cell? 

22-18.  Drift velocity, III.  A silver wire of cross-sectional
area $3.0 \times 10^{-6}$ m$^2$ carries a current of 2.4 A.
Compute the magnitude of the electron drift velocity. The
valence of silver is 1, its molecular weight is 108, and its
density is $1.05 \times 10^{4}$ kg/m$^3$.

22-19.  Drawing a wire.  A metal rod 1.00 m long and
1.00 cm in diameter is drawn through a series of dies so
that its diameter is 1.00 mm. How long is the resulting
wire, assuming no change in volume? If the conductivity $\sigma$
is unchanged, what is the ratio of the resistance of the wire
to that of the rod?

22-20.  A fat  wire  and  a thin  wire.  Two  resistance  wires 
are  made  of  the  same  material  and  are  of  equal  length. 
Wire  1 has  a cross  section  twice  that  of  wire  2.  They  are 
connected  in  series  across  the  terminals  of  a battery.  Cal- 
culate the  ratio  of  the  current  densities  in  the  two  wires, 
the  ratio  of  the  potential  differences  across  their  ends,  and 
the  ratio  of  the  electric  fields  within  them. 

22-21.  Ohm’s  law  from  a viscous  fluid  model.  Assume 
that  the  charge  carriers  in  a conductor  are  held  back  by  a 
force  similar  to  that  which  acts  on  a particle  moving  slowly 
through  a viscous  fluid.  The  magnitude  of  such  a force  is 
proportional  to  the  first  power  of  the  speed.  Show  that 
this  assumption  leads  to  Ohm’s  law  in  the  form  given  by 
Eq.  (22-31). 

22-22.  End to end.  A cylindrical wire made of a material
with electrical conductivity $\sigma_1$ lies along the negative x
axis. At x = 0, it is joined to another wire of the same cross
section but with conductivity $\sigma_2$. This second wire extends
along the positive x axis. A steady electric current density
$\mathbf{j} = j\hat{\mathbf{x}}$ is flowing in the wires.

a.  Find the electric fields $\mathcal{E}_1$ and $\mathcal{E}_2$ which exist within
wire 1 (x < 0) and wire 2 (x > 0), respectively.

b.  Gauss’ law and the results of part a imply that an
electrically charged layer exists at the interface (at x = 0).
Express the charge per unit area in this layer in terms of
$\epsilon_0$, $j$, $\sigma_1$, and $\sigma_2$.

c.  The interface centered at x = 0 cannot be infinitely
thin. If the charge layer has thickness $\delta$, what is the
average charge density within the layer, in terms of $\epsilon_0$, $j$,
$\sigma_1$, $\sigma_2$, and $\delta$?

d.  Suppose  that  wire  1 is  tungsten  and  wire  2 is 
copper.  If  8 = 1.0  x 10-9  m,j  = 1.0  x 10s  A/m2,  and  the 
entire  system  is  maintained  at  20°C,  what  is  the  average 
charge  density  in  the  layer?  The  conductivity  of  tungsten  is 
$1.8 \times 10^{7}$ S/m, its density is $1.93 \times 10^{4}$ kg/m$^3$, and its
molecular weight is 184. These values for copper are
$5.65 \times 10^{7}$ S/m, $8.94 \times 10^{3}$ kg/m$^3$, and 63.5.

e.  To what number density of electric charges does
your result for part d correspond? Compare this excess
number density of electrons to the number densities of
atoms in the two wires.


22-27.  Circuit analysis, I.  In Fig. 22E-27, if $i_3$ = 1.00 A,
find $i_1$, $i_2$, and V, the voltage supplied by the battery.


22-23.  Density of conduction electrons.  The table
presents conductivities and densities at room temperature,
and molecular weights, of four materials. Notice the
very wide range of conductivity spanned by these materials.
If the conduction in each material were by electrons
(which is not strictly true), and if the mean scattering time
was the same in each case, what would be the relative density
of conduction electrons for each substance? About
how many conduction electrons are there for each atom
or molecule?

            Conductivity, $\sigma$         Density, $\rho$      Molecular
            [in (Ω·cm)$^{-1}$]         (in g/cm$^3$)      weight

Copper      $5.9 \times 10^{5}$           8.9              63.5
Iron        $1.0 \times 10^{5}$           7.9              55.9
Silicon     $1 \times 10^{-4}$            2.4              28.1
Glass       $5 \times 10^{-14}$           2.3              60.1

22-24.  Matthiessen’s rule.  One form of Matthiessen’s
rule, Eq. (22-48), developed in small print below that
equation, is $\rho = \rho_{\text{imp}} + AT$. Estimate the constant A for
each of the metals in the table. How accurate does the
linear relation appear to be, based on these very limited
data?

            Resistivity (in Ω·cm)

            77 K                    273 K                   373 K

Copper      $0.2 \times 10^{-6}$       $1.56 \times 10^{-6}$      $2.2 \times 10^{-6}$
Gold        $0.5 \times 10^{-6}$       $2.04 \times 10^{-6}$      $2.84 \times 10^{-6}$
Iron        $0.66 \times 10^{-6}$      $8.9 \times 10^{-6}$       $14.7 \times 10^{-6}$


22-25.  Charging  batteries  in  series.  Forty  identical 
storage  batteries  are  connected  in  series  (plus  to  minus). 
The  emf  of  each  cell  is  2.20  V,  and  the  internal  resistance 
is 0.0100 Ω. They are to be charged from a steady source
producing  an  emf  of  110  V. 

a.  What  resistance  must  be  placed  in  series  with  the 
batteries  to  limit  the  current  to  10  A? 

b.  What  power  is  dissipated  in  this  current-limiting 
resistor? 

c.  What  is  the  power  dissipated  by  the  internal  resist- 
ance of  the  batteries? 

d.  What  useful  power  is  consumed  to  bring  about  the 
chemical  changes  that  take  place  during  charging? 

22-26.  Tricky!  What  is  the  resistance  from  A to  B of 
the  network  in  Fig.  22E-26?  Hint:  Redraw  the  figure 
along  more  familiar  lines. 


Fig. 22E-26

Fig. 22E-27

(Resistor labels recoverable from these two figures: 3.00 Ω and 10.00 Ω.)

22-28.  Circuit  analysis,  II.  Find  the  current  through 
each  of  the  resistors  in  Fig.  22E-28.  The  batteries  have 
negligible  internal  resistance. 


Fig. 22E-28  (Recoverable resistor labels: 3.00 Ω and 1.00 Ω.)


22-29.  Don’t  burn  out  the  light  bulbs!  Three  electric 
light bulbs designed to operate on 110 V are to be used
with  220  V.  The  ratings  stamped  on  the  bulbs  are  55,  55, 
and  110  W.  Draw  the  circuit,  using  no  circuit  elements 
other  than  the  bulbs,  that  will  enable  them  to  operate 
properly  at  the  higher  voltage. 

22-30.  A modified Wheatstone bridge.  Lord Kelvin devised
a method of using the Wheatstone bridge to measure
the resistance of a meter without the necessity of
using a second meter. Instead of placing the meter in the
branch connecting A and B in Fig. 22E-30, it is placed in
one of the arms. With the switch S open, the meter registers
the passage of current. When S is closed, in general
the meter will change its reading. By varying $R_1$, $R_3$, and
$R_4$, it is possible to reach a condition in which closing S
produces no change in the meter reading. Under this circumstance,
what is

a.  the current in AB?

b.  the difference of potential between A and B?

c.  Derive the relation which enables $R_M$, the meter resistance,
to be computed.

Fig. 22E-30

22-31.  A potentiometer.  Since a battery does have an internal
resistance, a voltmeter connected across its terminals
will not read the full emf because the voltmeter draws
some current. This results in a drop in potential due to the
internal resistance. The full emf can be measured only if
the current in the battery is zero. An arrangement to accomplish
this is shown schematically in Fig. 22E-31 and is
called a potentiometer.

Fig. 22E-31

A steady current supplied by an auxiliary battery F
flows in a uniform wire GH. The battery labeled X, whose
emf is to be determined, is compared with S, a standard
battery of known emf. Switch C is opened. B is closed and D
is slid along GH until the ammeter reads zero. The length
$l_x$ = DH is measured. B is now opened and C is closed. A
new position is found to give a zero ammeter reading, and
the new length $l_s$ is measured.

a.  If the resistance per unit length of GH is K, write
the equation for loop DHBXD when A reads zero.

b.  Do  the  same  for  loop  DHCSD  when  X is  replaced 
by  S. 

c.  Write  the  proportion  which  enables  the  emf  of  X to 
be  calculated. 

d.  Why  is  this  an  emf  rather  than  a terminal  voltage? 

Group  C 

22-32.  A novel source of potential difference.  A solid
metal cylinder of radius R rotates with constant angular
speed $\omega$ about its axis. Show that there is a difference of
potential between the axis and the surface of the cylinder
of magnitude $\frac{1}{2}(m/e)\omega^2 R^2$, where m and e are the mass and
charge of an electron. Is the potential of the axis positive or
negative with respect to the surface?

22-33.  Relation  between  resistance  and  capacitance.  In 
Fig.  22E-33,  the  space  between  the  two  concentric  metallic 
spheres  is  filled  with  water.  The  inner  sphere  is  connected 
to  a battery  by  means  of  an  insulated  wire  which  passes 
through  a small  opening  in  the  top  of  the  outer  sphere. 
The  other  terminal  of  the  battery  is  connected  to  the  outer 
sphere. 


Fig.  22E-33 


a.  Show that the resistance of the water can be calculated
from $R = \epsilon_0/(\sigma C)$, where C is the capacitance of the
spheres in the absence of water and $\sigma$ is the conductivity
of the water.

b.  C for the concentric spheres is given by
$4\pi\epsilon_0 r_2 r_1/(r_2 - r_1)$. Calculate R if $r_2$ = 10.0 cm, $r_1$ = 5.0 cm,
and $\sigma$ for the water is $1.0 \times 10^{-3}$ S/m.

22-34.  Strike it rich.  A knowledge of the electrical conductivity
$\sigma$ of the ground being prospected for useful ores
is of great advantage in locating underground deposits.
The method used is illustrated in Fig. 22E-34. Two metallic
spheres A and B, each of radius r, are connected to
long insulated wires. The spheres are buried deep in the
ground with the wires sticking out, at a separation large
compared to their radii. A battery and an ammeter are
connected between the wires. From the measured current
and the known emf of the battery, the resistance R offered
by the extended ground to the flow of current from A to B
is determined. In Exercise 22-33 it was shown that
$R = \epsilon_0/(\sigma C)$, where C is the capacitance of the system. From
a knowledge of C, $\sigma$ can be evaluated in terms of R
and r. Carry out the analysis as follows:

Fig. 22E-34

a.  Prepare to calculate C by showing that $V_A - V_B = |q|/(2\pi\epsilon_0 r)$
if the charges $+|q|$ and $-|q|$ are on A and B.

b.  Calculate C for the system.

c.  Calculate the resistance R being offered by the
ground.

d.  Express the conductivity $\sigma$ in terms of the resistance
R and the radius r.




22-35.  A cylindrical resistor.  A cylindrical resistive element
is constructed using a solid cylinder of radius $r_1$ as
the inner terminal and a concentric tube of inner radius $r_2$
as the outer terminal. These terminals are made of such
highly conductive material that each may be regarded as
an equipotential. The resistance is provided by filler material
of conductivity $\sigma$ which occupies the region $r_1 \le r \le r_2$.
The length L of the cylindrical assembly is much
greater than $r_2$, so that end effects can be neglected in
completing this exercise.

a.  If a total steady current i is flowing from the inner
terminal to the outer one, find the current density $j(r)$
for $r_1 \le r \le r_2$.

b.  Find the electric field $\mathcal{E}(r)$ for $r_1 \le r \le r_2$.

c.  Find the electrical resistance R between the inner
and outer terminals.

d.  Assuming that the electric field is zero for $r > r_2$,
find the surface charge densities and the total charges at
the interfaces $r = r_1$ and $r = r_2$.

22-36.  Ionization  fire  detector.  In  the  ionization  type  of 
hre  detector  commonly  used  in  homes,  ionized  molecules 
of  organic  vapor  form  an  electric  current  that  triggers  an 
alarm.  For  a crude  model  of  such  a system,  assume  that 
singly  charged  methyl  alcohol  ions  are  in  a cubic  box  with 
10-cm  edges.  The  top  and  bottom  are  metallic,  and  the 
side  walls  are  insulators.  Assume  that  a potential  dif- 
ference of  50  V is  maintained  between  bottom  and  top, 
and that a current of $1.0 \times 10^{-6}$ A is in the circuit.

a.  What  is  the  conductivity  of  the  alcohol  vapor? 

b. For a mean free path of $1.0 \times 10^{-5}$ cm, estimate
the  mean  scattering  time. 

22-37. Two-layer conductor. A cylindrical conductor
consists of a solid inner wire of radius $r_1$ and electrical
conductivity $\sigma_1$, surrounded by a concentric tubular
conductor with inner and outer radii $r_1$ and $r_2$, respectively.
The tubular conductor has conductivity $\sigma_2$.

a.  Find  the  resistance  per  unit  length  of  this  compos- 
ite conductor. 

b. If the conductor is carrying a steady total current i,
find the currents $i_1$ and $i_2$ in the inner wire and the tubular
sleeve, respectively.

c. Show that the ratio of the electrical power dissipated
in the two sections equals the ratio of the currents:
$P_2/P_1 = i_2/i_1$.

22-38.  Light  bulb  filament  design.  The  tungsten  wires  of 
all  electric  light  bulbs  are  designed  to  glow  at  about  the 
same  temperature.  This  requires,  as  a first  approximation, 
that  the  power  per  unit  surface  area  of  the  filament  be 
the  same  for  all. 

a. Show that this leads to the requirement at constant
voltage that $r/l^2$ = constant, where r is the radius and l the
length of the filament.

b. If $P_2/P_1 = n$ is the ratio of the power consumption
of two different light bulbs, show that $r_2/r_1 = n^{2/3}$ and
$l_2/l_1 = n^{1/3}$.


22-39.  Maximizing  delivered  power.  A battery  whose 
emf  is  V and  whose  internal  resistance  is  r is  connected  to 
an  external  resistance  R. 

a.  Prove  that  the  power  delivered  to  the  external  re- 
sistance is  a maximum  when  R = r. 

b.  What  is  the  efficiency  at  maximum  power?  That  is, 
what  is  the  ratio  of  the  power  consumed  by  the  external 
resistance  to  the  total  power  consumed  by  both  resis- 
tances? 

22-40. Put the kettle on. An electric heater has two
windings. One alone across the 110-V line brings a kettle
to a boil in $t_1$ min. The other alone does it in $t_2$ min. How
long would it take if the two windings are connected
across the 110-V line

a.  in  series? 

b.  in  parallel? 

22-41. Rapid transit. In Fig. 22E-41, G is an electrical
generator which supplies a steady voltage of 550 V between
trolley wire and track. The trolley line is 8.0 km long, and
the resistance of wire and track return is 0.10 Ω/km. A
heavy duty storage battery at the end of the line furnishes
an emf of 500 V and has an internal resistance of 0.24 Ω.
The trolley draws 100 A.


Fig.  22E-41 


a.  If  the  trolley  is  3.0  km  from  the  generator  end, 
what  is  the  current  supplied  (i)  by  the  generator?  (ii)  to 
the  battery? 

b.  What  is  the  voltage  across  the  trolley  car? 

c.  If  the  trolley  were  at  the  battery  end  of  the  line, 
what  current  is  supplied  (i)  by  the  generator?  (ii)  by  the 
battery? 

d.  What  is  the  voltage  across  the  trolley  car? 

e. If there were no battery, what would be the voltage
across  the  trolley  car  when  it  is  at  the  end  of  the  line  away 
from  the  generator? 

22-42.  Circuit  analysis,  III.  When  the  switch  S in  Fig. 
22E-42  is  open,  the  power  consumed  in  one  of  the  re- 
sistors R0  is  P.  When  S is  closed,  the  total  power  consumed 
by  the  pair  of  identical  resistors  R0  is  still  P.  Calculate  the 
value  of  R in  terms  of  R0  for  which  this  is  possible.  The  in- 
ternal resistance  of  the  battery  is  negligible. 

Fig. 22E-42




22-43. Kirchhoff to the rescue. What is the resistance
from A to B of the network in Fig. 22E-43? Hint: In applying
Kirchhoff's laws, exploit the symmetry to reduce the
number of unknown currents to two.


Fig.  22E-43 


22-44.  Exploit  the  symmetry.  Twelve  identical  pieces  of 
resistance  wire,  each  of  resistance  r,  are  joined  together  to 
form  the  cubical  frame  shown  in  Fig.  22E-44.  What  is  its 
resistance from A to B? Hint: In applying Kirchhoff's
laws,  exploit  the  symmetry  to  reduce  the  number  of 
unknown  currents  to  one. 


Fig.  22E-44 




Magnetic  Fields,  I 


23-1  MAGNETIC  POLES 
AND  MAGNETIC 
FIELD  LINES 


Everyone  has  played  with  magnets  as  a child  and  has  some  familiarity  with 
the  forces  exerted  on  one  magnet  by  another.  Permanent  magnets  can  be 
made  of  naturally  occurring  lodestones,  but  it  is  usually  better  and  cheaper 
to  make  them  out  of  any  of  a number  of  alloys  of  iron,  nickel,  and  cobalt 
which  have  been  developed  for  the  purpose.  The  simplest  type  of  perma- 
nent magnet  is  the  bar  magnet  shown  in  Fig.  23-1. 

Fig. 23-1 A bar magnet, showing the north and south poles.

A little qualitative experimentation with bar magnets establishes the
following facts:

1. A bar magnet has two poles. These are regions near the ends of the
magnet where the magnetic activity appears to be concentrated. The polar
regions are not sharply defined. Nevertheless, the forces exerted on each
of two bar magnets by the other are strongest when their poles are close to
each other.

2.  Any  two  magnetic  poles  either  attract  or  repel  each  other. 

3.  If  a magnetic  pole  is  repelled  by  the  pole  at  one  end  of  another 
magnet,  it  will  be  attracted  by  the  pole  at  the  other  end,  as  shown  in  Fig. 
23-2  a. 

4.  If  two  magnetic  poles  of  separate  magnets  are  both  either  attracted 
or  repelled  by  a pole  of  a third  magnet,  they  will  repel  each  other,  as  shown 
in  Fig.  23-2 b.  If  one  pole  is  attracted  and  the  other  repelled  by  a third  pole, 
they  will  attract  each  other,  as  shown  in  Fig.  23-2c. 

5.  A third  kind  of  magnetic  pole  does  not  exist. 

6. Objects made of certain materials (notably soft iron) are attracted
indiscriminately by all magnetic poles. An object which exhibits magnetic
properties only in the presence of a permanent magnet (but not in the
presence of another object similar to itself) is said to be unmagnetized
when a magnet is not present. Its magnetic behavior in the presence of a
permanent magnet is called induced magnetism, and the process by which
it acquires this temporary magnetic behavior is called magnetization by
induction.

Fig. 23-2 General rules of interaction among magnetic
poles (shown as the poles at ends of bar magnets). (a) Pole c,
which is repelled by one of the poles of another bar magnet,
is attracted by the other pole of the same magnet. (b) Poles a
and b are both attracted by pole c and repel each other. Poles
a and b are both repelled by pole c and repel each other.
Since poles a and b behave in like manner with respect to any
other pole, they are called like poles. (c) Pole a is attracted by
pole c, while pole b is repelled by pole c. Pole a and pole b
attract each other. Since poles a and b behave in unlike
manner with respect to any other pole, they are called
unlike poles. The two poles of the bar magnet in part a,
marked N and S, are unlike poles.

There  is  a striking  similarity  between  these  observations  and  those 
described  for  objects  having  electric  charge  in  Sec.  20-1.  The  two  kinds  of 
poles,  which  are  called  North  and  South,  behave  analogously  to  positive 
and  negative  electric  charges.  The  attraction  of  unmagnetized  iron  by  mag- 
netic poles  of  either  kind  appears  to  be  very  similar  to  the  attraction  of  un- 
charged bits  of  paper  by  electrically  charged  objects  having  charges  of 
either  sign. 

In  the  absence  of  other  nearby  magnets,  a small  bar  magnet  mounted 
on  a pivot  which  allows  it  to  turn  in  a horizontal  plane  will  orient  itself  in  an 
approximately  north-south  direction.  The  pole  which  faces  north  (always 
the  same  pole)  is  called  the  North  pole,  and  the  other  pole,  which  faces 
south, is called the South pole. This is the basis for the usefulness of the
magnetic compass, whose needle is simply a small bar magnet. The
north-south alignment of a compass needle recalls very strongly the lining up of
an electric dipole in an externally applied electric field, as described in Sec.
21-4. The analogy suggests that the bar magnet is a magnetic dipole and
that it is responding to a torque which it experiences in the externally
applied magnetic field of the earth, as shown schematically in Fig. 23-3. The
magnetic dipole moment is analogous to the electric dipole moment discussed
in Sec. 21-4; note especially the similarity between Fig. 23-3 and Fig. 21-17.
The magnetic dipole moment can be measured in several ways which are
considered in Sec. 24-3. For the present, it suffices to say that the magnetic
dipole moment is a measure of the “strength” of a bar magnet.

Fig. 23-3 The pair of equal and opposite forces $\mathbf{F}_N$ and $\mathbf{F}_S$ exert a
torque on the needle of a magnetic compass. The situation is analogous
to that of an electric dipole in an external electric field, and
the magnetic needle may be regarded as a magnetic dipole in the
magnetic field of the earth.

Fig. 23-4 A bar magnet, having one
North and one South pole, is broken in
two between the poles. Each of the
pieces is found to have one North pole
and one South pole. An “obvious”
method for producing isolated single
magnetic poles thus fails.

While  the  above-described  similarities  between  permanent  magnets 
and  bodies  having  electric  charge  are  very  important,  there  are  equally 
striking  dissimilarities.  Of  these  dissimilarities,  the  central  one  is  that  a 
single  magnetic  pole  has  never  been  observed  to  exist  in  isolation.  If  a bar  magnet  is 
cut  in  half  between  its  North  and  South  poles,  the  result  is  not  two  magnets, 
each  having  one  pole.  Rather,  as  shown  in  Fig.  23-4,  each  piece  is  found  to 
have  a North  pole  and  a South  pole.  The  original  poles  remain  where  they 
were,  but  a new  opposite  pole  comes  into  existence  on  each  piece  as  the  two 
pieces  are  separated.  The  magnetic  dipole  moments  of  the  two  small 
magnets are smaller than the dipole moment of the original magnet. But if
the two pieces are put back together, the two poles brought into contact
disappear, and the original magnet is recreated.

To  see  this  contrast  between  electric  and  magnetic  behavior  in  another 
way,  consider  the  two  parallel  experiments  shown  in  Fig.  23-5.  In  the  first 
experiment  a glass  rod,  originally  electrically  neutral,  is  charged  by  induc- 
tion. To  do  this,  a negatively  charged  body  is  brought  near  one  end  of  the 
glass  rod.  That  end  acquires  a net  positive  charge,  and  a negative  charge  is 
observed  at  the  other  end.  While  the  internal  charge  of  the  glass  rod  is  thus 
partially  separated,  the  rod  is  cut  in  half.  The  negatively  charged  rod  is 
then  withdrawn.  One  half  of  the  glass  rod  now  possesses  a net  positive 
charge,  and  the  other  half  has  an  equal  net  negative  charge. 

In  the  second  experiment,  a rod  of  iron  is  magnetized  by  induction. 
To  do  this,  the  South  pole  of  a magnet  is  brought  near  one  end,  which  is 
then found to exhibit the properties of a North pole. The other end of the
iron rod exhibits the properties of a South pole. While the rod is thus
magnetized by induction, it is cut in half, and then the magnet is withdrawn.
Neither piece of the rod is found to have a magnetic “charge”!

The conclusion is that, unlike electric charge, magnetic “charge”
cannot be drained off a body, leaving an excess of the opposite kind.
Indeed, there is no such thing as a magnetic charge. The only evidence we have to
support this statement is negative: if such things did exist, they should be
separable. We might someday contradict this statement by finding such an
object, called a magnetic monopole, but no one has so far done so. Even if
magnetic monopoles are ultimately found, they are at best exceedingly rare
in our corner of the universe. But magnetic fields are quite common.

Fig. 23-5 (a) A method for producing isolated electric charge by
electric induction, as described in the text.
(b) The analogous method, employing
magnetic induction, does not produce
isolated “magnetic charge.”

If magnetic charge does not exist, what are the poles of a magnet? This
question can be answered best in terms of the photographs of Fig. 23-6.
Every magnet has its own magnetic field, and illustrated in the photograph
of Fig. 23-6a is a simple method of visualizing the magnetic field lines of a
bar magnet. A sheet of thin cardboard is placed on top of a bar magnet,
and iron filings are sprinkled on the card. These filings become magnetized
by induction. With a little agitation of the card they aggregate as shown,
each filing interacting as a small magnet with its neighbors. A small
compass needle located next to one of the pathways formed by the filings
will point along the pathway, with its North pole oriented toward the South
pole of the bar magnet. Thus, as far as the field lines external to the bar
magnet are concerned, the magnetic field resembles very strongly the
electric field of an electric dipole. Figure 23-6b illustrates this similarity. In this
photograph, the ends of two wires having a potential difference of several
thousand volts between them are immersed in a dish containing oil in
which grass seed has been suspended. The fine seeds tend to line up along
the electric field, which we know to be dipolar (see Sec. 21-4). In the electric
case, the direction of the field lines is that in which a positive test charge
would tend to move: from plus to minus, or generally downward in the
photograph. In the magnetic case, the direction is that in which the North
pole of a compass needle would tend to point: from North to South, or
downward in the photograph.

Fig. 23-6 (a) Iron-filing pattern showing magnetic field lines of a bar magnet. The North
pole is above the South pole. (b) Grass-seed pattern showing electric field lines of an electric
dipole. The positive electrode is above the negative electrode. (c) Magnetic field lines of two
separated bar magnets. In each the North pole is above the South pole.

But  now  consider  the  magnetic  field  line  pattern  of  Fig.  23-6c.  Here 
the original bar magnet of Fig. 23-6a has been cut in two, and the parts are
separated  by  a very  small  distance.  The  iron  filings  in  the  gap  between 
them  make  visible  the  field  lines  of  the  strong  field  in  the  gap,  close  to  the 
new  poles,  rather  than  the  relatively  weak  field  seen  in  the  same  region  in 
Fig.  23-6a.  The  field  lines  in  the  gap  are  directed  from  North  pole  to  South 
pole.  But  since  the  North  poles  of  both  pieces  of  the  original  magnet  are  lo- 
cated at  their  upper  ends,  and  the  South  poles  at  their  lower  ends,  the 
direction  of  the  field  in  the  gap  is  from  bottom  to  top,  instead  of  the 
top-to-bottom direction of Fig. 23-6a. This can be verified with a small
compass.  If,  instead  of  cutting  the  original  magnet  in  two,  we  had  exca- 
vated a small  cavity  somewhere  inside  it,  it  is  not  difficult  to  imagine  that 
the  same  field  direction  would  obtain  inside  the  cavity.  Thus  the  direction 
of  the  field  lines  inside  a magnet  is  from  South  pole  to  North  pole — not 
North  pole  to  South  pole — and  there  is  no  net  flux  of  magnetic  field  lines 
from  the  North  pole  to  the  South  pole.  That  is,  the  field  lines  do  not  begin 
at  the  North  pole  and  end  at  the  South  pole.  Rather,  they  circulate  in  closed 
curves,  going  one  way  outside  the  magnet  and  the  other  way  inside.  The 
poles  of  a magnet  are  not  true  source  and  sink  points,  as  the  charges  of  an 
electric  dipole  are.  The  poles  are  merely  the  regions  where  most  of  the 
magnetic  field  lines  emerge  from  or  return  to  the  magnet. 

Contrast  this  behavior  of  the  magnetic  field  lines  of  a bar  magnet  with 
that  of  the  electric  field  lines  of  an  electric  dipole.  The  electric  field  lines  do 
begin  at  the  positive  charge  and  end  at  the  negative  charge.  Thus  there  is  a 
net  electric  flux  from  the  positive  “pole”  to  the  negative  “pole”  of  the  elec- 
tric dipole.  It  follows  from  this  discussion  that  when  we  speak  of  a bar 
magnet  as  a dipole,  we  refer  only  to  its  properties  as  viewed  from  a distance 
large  compared  to  the  dimensions  of  the  magnet. 


23-2 THE MAGNETIC
FORCE AND THE
MAGNETIC FIELD

The origin of magnetic fields is explored in Sec. 23-5 and again in a deeper
way in Sec. 24-2. However, it is best to begin by defining the magnetic field
in a quantitative way and then study some of its properties. To do this, we
need a means of detecting and measuring magnetic fields, as well as a
means of producing magnetic fields. In this section we concentrate on the
former problem. We consider the latter problem only in very broad
outline, leaving the details to Secs. 23-5 and 24-2.




Fig. 23-7 A permanent magnet designed
to produce a region between its
pole faces where the magnetic field is
essentially uniform.

Fig. 23-8 A particle having positive
charge (q > 0) moves with velocity $\mathbf{v}$ in a
direction perpendicular to the direction
$\hat{\mathcal{B}}$ of the magnetic field. (The direction $\hat{\mathcal{B}}$
is that in which the North pole of a
magnetic compass points when it is located
in the field.) The particle experiences a
magnetic force of magnitude $F = qv\mathcal{B}$.
The magnitude $\mathcal{B}$ of the magnetic field
vector $\boldsymbol{\mathcal{B}}$ is defined by the equation
$\mathcal{B} = F/qv$. So the magnetic field vector is
given by $\boldsymbol{\mathcal{B}} = \mathcal{B}\hat{\mathcal{B}}$.


In the interest of simplicity, we begin with a magnetic field that is
uniform. In a region of uniform magnetic field, a magnetic compass will point
everywhere in the same direction. Moreover, the torque on the compass
needle when it is misaligned with the magnetic field by a given angle will be
everywhere the same in the region.

How  can  a uniform  magnetic  field  be  produced?  One  possibility  is  the  follow- 
ing. A permanent  magnet  is  built  which,  unlike  a bar  magnet,  is  bent  into  the 
shape  shown  in  Fig.  23-7.  (This  shape  is  something  like  that  of  the  familiar  horse- 
shoe magnet.)  Just  as  in  a bar  magnet,  the  poles,  where  the  magnetic  activity  is 
concentrated,  are  at  the  ends  of  the  magnet.  However,  the  ends  are  close  together. 
The  pole  faces,  marked  N and  S,  are  flat  and  parallel.  Their  separation  is  small 
compared  to  their  diameters.  The  symmetry  of  the  situation  suggests  that  the  mag- 
netic field  will  be  uniform  between  the  pole  faces,  if  we  stay  far  enough  away  from 
their  edges.  The  field  lines  in  the  gap  between  the  poles  are  normal  to  the  pole 
faces  and  directed  from  the  North  to  the  South  pole.  The  arrangement  is  the  mag- 
netic analogue  of  the  plane-parallel  capacitor,  whose  field  lines  are  similarly 
oriented. 

Magnetism and electricity are so closely linked that we will soon define
the magnetic field in terms of the way in which it affects electric charges.
But  the  connections  between  electricity  and  magnetism  do  not  lie  in  the  sim- 
ilarities between  the  behavior  of  electrically  charged  objects  and  the 
behavior  of  magnets.  Electric  and  magnetic  phenomena  are  distinct.  In 
particular,  Gilbert  showed  in  1600  that  magnetic  compasses  do  not  interact 
with  electrically  charged  rods  in  experiments  where  there  was  no  relative 
motion.  The  links  between  electricity  and  magnetism  are  both  more  subtle 
and  more  profound. 

The  first  of  these  links  is  this:  An  electric  charge  experiences  a force 
when it is moving in a magnetic field. This force is called the magnetic
force. The motion is absolutely essential.

When there is no electric field in a region, no force is exerted on a
stationary test charge in the region. However, experiment shows that a force
is  exerted  on  the  test  charge  if  the  charge  is  moving  in  the  vicinity  of  a 
magnet.  It  is  this  magnetic  force  which  gives  evidence  of  the  presence  of  a magnetic 
field  at  the  location  of  the  test  charge.  Experiment  shows  that,  other  things 
being  equal,  the  magnitude  of  this  force  is  proportional  to  the  speed  of  the 
test  charge. 

More specifically, we use the magnetic force to define a vector called
the magnetic field $\boldsymbol{\mathcal{B}}$. The “official” name of the quantity $\boldsymbol{\mathcal{B}}$ is the magnetic
induction; strictly speaking, the word “field” signifies the entire array of
vectors $\boldsymbol{\mathcal{B}}$ everywhere in space. But it is common practice to call the vector
$\boldsymbol{\mathcal{B}}$ at any particular location the magnetic field, just as it is common practice
to call the vector $\boldsymbol{\mathcal{E}}$ at any particular location the electric field rather than
using its “official” name electric field intensity. We follow common practice in
this book.

As a first approach to the definition of $\boldsymbol{\mathcal{B}}$, consider the special case
shown  in  Fig.  23-8.  A particle  with  positive  electric  charge  q moves  in  a 
magnetic  field  with  velocity  v.  We  choose  the  direction  of  the  motion  to  be 
perpendicular  to  the  uniform  magnetic  field  lines.  We  defer  for  the  time 
being  a discussion  of  the  direction  of  the  magnetic  force  exerted  on  the 
particle.  As  for  the  magnitude  F of  the  magnetic  force,  under  these 
circumstances  it  is  given  by  the  equation 

$$F = qv\mathcal{B} \quad \text{for } \mathbf{v} \perp \boldsymbol{\mathcal{B}} \tag{23-1}$$




Table 23-1

Orders of Magnitude Typical of Magnetic Fields

Source                                            $\mathcal{B}$ (in T)

Interstellar space                                $< 10^{-9}$
Magnetic field of the earth (at surface)          $5 \times 10^{-5}$
Surfaces of stars                                 $10^{-2}$ to 5
Permanent magnets                                 $10^{-2}$ to 1
Iron-core electromagnets                          up to 3
Superconducting magnets                           up to 20
High-current coils                                up to 20
Pulsed coils (duration ~ $10^{-3}$ s)               10 to 30 for indefinite cycling;
                                                  30 to 50 for short-term use
Imploding coil (high-current coil surrounded      upward of 100
by explosives) (duration ~ $10^{-6}$ s; single use)


This equation defines the magnitude $\mathcal{B}$ of the vector $\boldsymbol{\mathcal{B}}$ in terms of the
measurable quantities F, q, and v. Compare Eq. (23-1) with the corresponding
definition of the electric field magnitude,

$$F = q\mathcal{E}$$

To see the similarity between these two equations more clearly, we rewrite
them in the forms

$$\mathcal{E} = \frac{F}{q} \qquad \text{and} \qquad \mathcal{B} = \frac{F}{qv} \tag{23-2}$$

As far as magnitudes are concerned, $\mathcal{E}$ is a force per unit q, and $\mathcal{B}$ is a force
per unit of qv. While the equation $F = qv\mathcal{B}$ is more complicated than
$F = q\mathcal{E}$, it is one of the simplest possibilities for a speed-dependent force.

According to the second of Eqs. (23-2), the unit of magnetic field
$\mathcal{B}$ must be N/(C·m/s). This unit is called the tesla (T). That is,

$$1\ \text{T} = 1\ \text{N}\cdot\text{s}/(\text{C}\cdot\text{m}) \tag{23-3}$$

In older literature, a different unit of magnetic field is often used, called
the gauss (G); 1 G = $10^{-4}$ T. Table 23-1 gives typical magnitudes for the
magnetic field $\mathcal{B}$.
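The defining relation of Eq. (23-1) and the unit conversion just quoted can be tried out numerically. The following is a minimal Python sketch, not part of the original text; the function names and the sample measurement (force, charge, and speed) are illustrative assumptions chosen only to give a field of about the size of the earth's surface field.

```python
# Sketch: magnitude of the magnetic field from Eq. (23-1), B = F / (q v).
# Valid only when the velocity is perpendicular to the field.

TESLA_PER_GAUSS = 1.0e-4  # 1 G = 10^-4 T, as stated in the text


def field_magnitude(force_n, charge_c, speed_m_per_s):
    """Return the field magnitude in tesla, i.e. in N*s/(C*m)."""
    if charge_c == 0 or speed_m_per_s == 0:
        raise ValueError("a stationary or uncharged particle feels no magnetic force")
    return abs(force_n) / (abs(charge_c) * abs(speed_m_per_s))


def tesla_to_gauss(b_tesla):
    """Convert a field magnitude from tesla to gauss."""
    return b_tesla / TESLA_PER_GAUSS


# Assumed, illustrative measurement: a singly charged particle moving at
# 1.0e6 m/s perpendicular to the field experiences a force of 8.01e-18 N.
ELEMENTARY_CHARGE = 1.602e-19  # C, rounded value
b_tesla = field_magnitude(8.01e-18, ELEMENTARY_CHARGE, 1.0e6)
b_gauss = tesla_to_gauss(b_tesla)
print(f"B = {b_tesla:.2e} T = {b_gauss:.2f} G")
```

With these assumed numbers the field comes out near $5 \times 10^{-5}$ T, or about half a gauss, which is the order of magnitude listed in Table 23-1 for the magnetic field of the earth at its surface.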

The tesla, the unit of magnetic field $\mathcal{B}$, is named after the Croatian-American
electrical  engineer  Nikola  Tesla  (1856-1943).  Tesla  came  to  the  United  States  in 
1884  and  pioneered  in  the  development  of  the  machinery  required  to  generate, 
transmit,  and  utilize  alternating-current  electricity.  Later  on,  he  experimented 
extensively  with  electric  currents  of  very  high  frequencies. 

We now consider the direction of the magnetic force $\mathbf{F}$. In the special
case we are discussing, where the velocity $\mathbf{v}$ of the charged particle is
perpendicular to the direction of the magnetic field $\boldsymbol{\mathcal{B}}$, it is found
experimentally that the force $\mathbf{F}$ is perpendicular to both $\mathbf{v}$ and $\boldsymbol{\mathcal{B}}$, as shown in Fig.
23-8. Both the magnitude and the direction of the magnetic force are
given by the vector equation

$$\mathbf{F} = q\mathbf{v} \times \boldsymbol{\mathcal{B}} \tag{23-4}$$


Fig. 23-9 Application of the right-hand rule for
cross products to Eq. (23-4), $\mathbf{F} = q\mathbf{v} \times \boldsymbol{\mathcal{B}}$. The
case shown is that in which the charge has a positive
value; that is, q > 0. (Compare with Fig. 9-8a,
which illustrates the right-hand rule for the cross
product of a pair of arbitrary vectors.) In the case
of charge having a negative value, q' < 0, the
direction of the magnetic force $\mathbf{F}'$ is found to be
opposite to the direction of $\mathbf{F}$ shown here.
Equation (23-4) is consistent with this observation. To
see this, substitute q' = −q for q in the equation to
obtain $\mathbf{F}' = q'\mathbf{v} \times \boldsymbol{\mathcal{B}} = -\mathbf{F}$.


This equation defines the direction of the vector $\boldsymbol{\mathcal{B}}$ in terms of the directions
of the measurable quantities $\mathbf{F}$ and $\mathbf{v}$ and the sign of the measurable
quantity q. Specifically, if q > 0, then the direction of $\mathbf{F}$ is the direction of the
vector $\mathbf{v} \times \boldsymbol{\mathcal{B}}$. What is the direction of $\mathbf{F}$ if q < 0? Figure 23-9 shows the
application of the right-hand rule for cross products to Eq. (23-4).

Figure 23-10a is a photo of the tracks of a negatively charged electron
and  a positively  charged  positron  in  a device  called  a cloud  chamber,  in 
which  the  passage  of  energetic  charged  particles  results  in  the  condensa- 
tion of  visible  droplets  of  water  vapor  from  the  atmosphere  of  the 
chamber. In this photo the magnetic field vector is directed into the page.
From  the  roughly  circular  shape  of  the  tracks,  you  can  deduce  that  the 
acceleration  of  each  particle  is  everywhere  perpendicular  to  its  velocity. 
That  is,  the  accelerations  are  centripetal.  However,  they  are  oppositely 
directed  for  the  two  oppositely  charged  particles.  This  is  evidenced  by  the 
fact  that  the  particles  rotate  in  opposite  senses,  an  observation  used  in  Ex- 
ample 23-1  to  determine  which  particle  is  which. 


EXAMPLE  23-1 


In Fig. 23-10, the electron and the positron originate at point O. Using the right-
hand  rule  for  cross  products  as  it  applies  to  Eq.  (23-4),  determine  which  track  was 
left  by  the  positron  and  which  by  the  electron. 

■ You  begin  by  drawing  a vector  to  represent  v.  It  is  most  convenient  to  do  this  at 
the  point  of  origin  O,  since  both  particles  are  moving  in  approximately  the  same 
direction at that point. The vector $\boldsymbol{\mathcal{B}}$ is directed into the page. Using your right
hand, you curl your fingers in the sense which would rotate $\mathbf{v}$ into $\boldsymbol{\mathcal{B}}$ through the
smaller angle, as shown in Fig. 23-10b. Your thumb then points to the right. This is
the direction of the force $\mathbf{F}_+$ on the positron (or any positive charge). It must
therefore be the positron which produced the counterclockwise track on the right
because the particle following that track is accelerating toward the right at point O.
The  other  track,  whose  sense  is  clockwise,  must  be  that  of  the  electron.  The  reason 
is  that  at  point  O the  force  F_,  directed  toward  the  left,  is  the  force  on  a negative 
charge.  And  the  particle  following  the  clockwise  track  is  accelerating  toward  the  left 
at  that  point. 
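The right-hand-rule reasoning of this example can be checked by evaluating the cross product of Eq. (23-4) directly. The short Python sketch below is an illustration added here, not part of the original example. It assumes a coordinate system in which x points to the right, y points up the page, and z points out of the page, and it uses unit magnitudes for the charge, velocity, and field because only directions matter. The helper names `cross` and `magnetic_force` are hypothetical.

```python
# Sketch: direction of the magnetic force F = q v x B at point O of Fig. 23-10.
# Assumed axes: x to the right, y up the page, z out of the page.


def cross(a, b):
    """Cross product of two 3-component vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def magnetic_force(q, v, b):
    """Return F = q (v x B) as a tuple."""
    return tuple(q * component for component in cross(v, b))


v_initial = (0.0, -1.0, 0.0)  # both particles start out moving down the page
b_field = (0.0, 0.0, -1.0)    # field directed into the page

f_positron = magnetic_force(+1.0, v_initial, b_field)
f_electron = magnetic_force(-1.0, v_initial, b_field)

# A positive x component means a force toward the right of the page.
print("positron pushed right:", f_positron[0] > 0)
print("electron pushed left: ", f_electron[0] < 0)
```

Under these assumptions the force on the positive charge points along +x (to the right) and the force on the negative charge along −x (to the left), in agreement with the track assignment made in the example.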


1068  Magnetic  Fields,  I 


We  are  now  ready  to  consider  the  general  case,  in  which  the  angle 
between  the  velocity  v of  the  charged  particle  and  the  magnetic  held  (B  is 
arbitrary.  No  matter  what  this  angle  may  be  (unless  it  is  0°  or  1 80°,  so  that  v 


Fig.  23-10  (a)  Cloud  chamber  photo  of  an  electron-positron  pair  produc- 

tion event,  (b)  The  essential  information  in  the  photo  is  extracted  for  clari- 
ty. The  initial  velocity  v of  both  particles  at  O is  downward.  The  magnetic 
field  (Bis  directed  into  the  page.  The  thumb  of  the  right  hand  points  in  the 
direction  of  the  initial  force  F+  on  the  positron.  The  force  F_  on  the  elec- 
tron is  in  the  opposite  direction. 


is  parallel  or  antiparallel  to  ffi),  the  vectors  v and  © define  a plane.  The 
magnetic  force  vector  F is  always  observed  to  be  perpendicular  to  this 
plane,  its  direction  that  of  v x (B,  as  shown  in  Fig.  23- 11a.  Other  things 
being  equal,  it  is  found  that  the  magnitude  F of  the  magnetic  force  has  its 
maximum  value  when  v and  <B  are  mutually  perpendicular  and  decreases 
with  decreasing  or  increasing  angle  between  v and  ffi.  In  particular,  the 
magnetic  force  vanishes  when  v is  parallel  or  antiparallel  to  ffi.  That  is,  a 
charged  particle  moving  along  a field  line  in  a uniform  magnetic  field  experiences  no 
magnetic  force. 

Fig. 23-11 Magnetic force F on a positively charged particle whose velocity v makes an arbitrary angle with the magnetic field ℬ. (a) The arbitrary initial velocity vector v of the particle and the vectors v⊥ and v∥ with which it can be replaced. (b) The path of the particle is a helix of pitch z and radius r, whose axis lies along the magnetic field direction ℬ.

Experiment shows that Eq. (23-4),

$$\mathbf{F} = q\,\mathbf{v} \times \boldsymbol{\mathcal{B}}$$

23-2  The  Magnetic  Force  and  the  Magnetic  Field  1069 


is valid as to both magnitude and direction for any angle between v and ℬ. Thus the magnitude of the magnetic force is given by the equation

$$F = |q|\, v \mathcal{B} \sin\theta$$

where θ is the smaller angle between v and ℬ. In the special case where v and ℬ are parallel or antiparallel, the vectors do not define a plane. But in this case, v × ℬ = 0, and there is no need to define a direction for the vector F = 0.
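As a numerical check on the two forms of the force law, the following Python sketch evaluates F = qv × ℬ component by component and compares its magnitude with |q|vℬ sin θ. The helper names (`cross`, `dot`, `norm`, `magnetic_force`) and the sample values of q, v, and ℬ are illustrative assumptions, not values taken from the text.

```python
import math

def cross(a, b):
    """Cross product of two 3-component vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Magnitude of a vector."""
    return math.sqrt(dot(a, a))

def magnetic_force(q, v, B):
    """Magnetic force F = q v x B, Eq. (23-4), in SI units."""
    return tuple(q * c for c in cross(v, B))

# Assumed sample values: an electron moving obliquely to a 0.5-T field.
q = -1.60e-19              # charge, C
v = (3.0e5, 0.0, 4.0e5)    # velocity, m/s
B = (0.0, 0.0, 0.5)        # magnetic field, T

F = magnetic_force(q, v, B)

# Magnitude from F = |q| v B sin(theta), theta being the angle between v and B.
cos_theta = dot(v, B) / (norm(v) * norm(B))
sin_theta = math.sqrt(1.0 - cos_theta ** 2)
F_formula = abs(q) * norm(v) * norm(B) * sin_theta

print("F =", F)
print("|F| from the cross product:", norm(F))
print("|F| from |q| v B sin(theta):", F_formula)
print("F . v =", dot(F, v), "  F . B =", dot(F, B))
```

The two magnitudes agree, and both dot products vanish, which is the statement that F is perpendicular to the plane defined by v and ℬ.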

Many persons who learn the relation F = qv × ℬ for the first time feel that it is peculiar or even “artificial.” Most people have not had direct, everyday experience with the operation of the magnetic force, and it is therefore not “physiologically” familiar. Nevertheless, the cloud chamber picture of Fig. 23-10 is only one of myriad demonstrations of the validity of the force law. But if the force (which is a vector) is to depend on both the magnitude and the direction of the velocity, and also on the magnitude and direction of the magnetic field, there is no mathematical expression simpler than F = qv × ℬ which could describe it. (Compare with the analogous small-print discussion following Fig. 9-8.)

The electron and positron tracks of Fig. 23-10 show that a charged particle tends to follow a circular path in a uniform magnetic field if its initial velocity v is perpendicular to the field ℬ. If this is the case, the force F, which is perpendicular to both v and ℬ, has magnitude F = qvℬ. By using Newton’s second law, a = F/m, we see that the acceleration a of the particle has magnitude

$$a = \frac{q v \mathcal{B}}{m} \tag{23-5}$$

where m is the mass of the particle. The direction of a is always perpendicular to v, because F is always perpendicular to v. Therefore, the acceleration changes the direction, but not the magnitude, of v. Consequently, the magnitude of the acceleration itself remains constant, at the value given by Eq. (23-5). The acceleration vector is directed always toward the same center, although there is nothing at that center. The path of the charged particle is circular. The magnitude of the centripetal acceleration determines the radius r of the circular orbit through the relation a = v²/r. We thus have from Eq. (23-5)

$$\frac{v^2}{r} = \frac{q v \mathcal{B}}{m}$$

or

$$r = \frac{m v}{q \mathcal{B}} \tag{23-6}$$

An application of this relation is illustrated in Example 23-2.


EXAMPLE 23-2

A deuteron (the nucleus of a deuterium, or heavy hydrogen, atom) is injected into a vacuum chamber between the poles of a large magnet constructed as in Fig. 23-7. Its kinetic energy is K = 25.0 MeV = 25.0 × 10⁶ eV, and its direction of motion is parallel to the pole faces of the magnet. If the magnetic field is ℬ = 2.00 T, find the radius r of the circular orbit. The mass of a deuteron is m_D = 3.34 × 10⁻²⁷ kg, and its charge is +e.

■ The kinetic energy of the deuteron is small compared to its rest mass energy, which is m₀c² ≈ 3 × 10⁻²⁷ kg × (3 × 10⁸ m/s)² = 3 × 10⁻¹⁰ J = 2 × 10³ MeV. You can therefore use the nonrelativistic relation K = m_D v²/2, or

$$v = \left(\frac{2K}{m_D}\right)^{1/2}$$

Thus Eq. (23-6) becomes

$$r = \frac{m_D}{q\mathcal{B}}\left(\frac{2K}{m_D}\right)^{1/2} = \frac{1}{q\mathcal{B}}\,(2 m_D K)^{1/2} \tag{23-7}$$

Inserting the numerical values given, you have

$$r = \frac{(2 \times 3.34\times10^{-27}\ \text{kg} \times 25.0\times10^{6}\ \text{eV} \times 1.60\times10^{-19}\ \text{J/eV})^{1/2}}{1.60\times10^{-19}\ \text{C} \times 2.00\ \text{T}} = 0.511\ \text{m}$$
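The arithmetic of Example 23-2 can be reproduced with a few lines of Python. This is only a sketch: the function name `orbit_radius` and the constant name `EV_TO_J` are assumptions introduced here, while the numerical inputs are those given in the example.

```python
import math

EV_TO_J = 1.60e-19  # joules per electron volt, the value used in the example

def orbit_radius(mass_kg, kinetic_energy_ev, charge_c, field_t):
    """Nonrelativistic orbit radius r = (2 m K)^(1/2) / (q B), Eq. (23-7)."""
    kinetic_energy_j = kinetic_energy_ev * EV_TO_J
    return math.sqrt(2.0 * mass_kg * kinetic_energy_j) / (abs(charge_c) * field_t)

# Values from Example 23-2: a 25.0-MeV deuteron in a 2.00-T field.
r = orbit_radius(mass_kg=3.34e-27, kinetic_energy_ev=25.0e6,
                 charge_c=1.60e-19, field_t=2.00)
print(f"r = {r:.3f} m")  # prints r = 0.511 m
```

Because r varies as the square root of K, quadrupling the kinetic energy only doubles the orbit radius, provided the motion stays nonrelativistic.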


Example 23-3 deals with the motion in a uniform magnetic field of a charged particle whose velocity has a completely arbitrary direction.


EXAMPLE 23-3

A particle having mass m and charge q is propelled into a region of uniform magnetic field ℬ. The particle has a velocity v whose direction with respect to ℬ is arbitrary. The region is evacuated, so that the particle can move without making any collisions. Show that the path of the particle is a helix whose axis is parallel to ℬ.

■ The situation is that shown in Fig. 23-11a, with the velocity vector v making an arbitrary angle with ℬ. You already know that an initial velocity perpendicular to the magnetic field leads to a circular orbit, while a velocity parallel to the magnetic field results in no force at all on the charged particle. In order to reduce the arbitrary velocity vector to these known special cases, replace the vector v by the two vectors v⊥ and v∥, whose magnitudes are found by taking the components of v perpendicular and parallel to the magnetic field ℬ. As shown in the figure, you have

$$\mathbf{v} = \mathbf{v}_\perp + \mathbf{v}_\parallel$$

and you can write the force on the charged particle in the form

$$\mathbf{F} = q(\mathbf{v}_\perp + \mathbf{v}_\parallel) \times \boldsymbol{\mathcal{B}} = q\mathbf{v}_\perp \times \boldsymbol{\mathcal{B}} + q\mathbf{v}_\parallel \times \boldsymbol{\mathcal{B}}$$

But the cross product of two parallel vectors is zero, so that v∥ × ℬ = 0. The acceleration thus reduces to

$$\mathbf{a} = \frac{\mathbf{F}}{m} = \frac{q}{m}\,\mathbf{v}_\perp \times \boldsymbol{\mathcal{B}}$$

This acceleration is always perpendicular to v⊥ and ℬ, and its magnitude is

$$a = \frac{q v_\perp \mathcal{B}}{m} \tag{23-8}$$

The form of this equation is that of Eq. (23-5), but with v⊥ substituted for v. Thus as far as v⊥ is concerned, the magnetic field makes the charged particle move as if it were in a circular orbit in the plane perpendicular to the magnetic field, just as in the special case where the initial velocity is perpendicular to ℬ. However, the particle also has a constant velocity v∥ along the direction of ℬ. This velocity component is unaffected by the magnetic field, since no force is exerted in that direction. The combination of circular motion in the plane perpendicular to ℬ and uniform motion along ℬ results in a helical (“corkscrew”) motion, as shown in Fig. 23-11b. The axis of the helix is in the direction of ℬ. You can find the radius of the helix by generalizing the analysis that led to Eq. (23-6). Beginning with Eq. (23-8), you have centripetal acceleration v⊥²/r = (q/m)v⊥ℬ. Solving for r gives you

$$r = \frac{m v_\perp}{q\mathcal{B}} \tag{23-9}$$

Whether the helix is “left-handed” like the one sketched in Fig. 23-11b or “right-handed” depends on two things: the charge on the particle (assumed positive in the figure) and whether v∥ is parallel to ℬ (as in the figure) or antiparallel to ℬ.

The pitch z of the helix is the distance between adjacent turns. This distance can be written

$$z = v_\parallel T$$

where T is the time required for the particle to make one turn. The particle may be regarded as moving in a circle of radius r at speed v⊥, while the circle itself moves in the direction of ℬ at speed v∥. Since the circumference of the circle is 2πr, we have

$$T = \frac{2\pi r}{v_\perp}$$

Using the value of r given by Eq. (23-9) yields the period

$$T = \frac{2\pi}{v_\perp}\,\frac{m v_\perp}{q\mathcal{B}} = \frac{2\pi m}{q\mathcal{B}} \tag{23-10}$$

and the pitch is

$$z = v_\parallel T = \frac{2\pi m v_\parallel}{q\mathcal{B}} \tag{23-11}$$


Equation (23-10) expresses the remarkable fact that the period of revolution of a (nonrelativistic) charged particle in a magnetic field is independent of its speed and hence of its kinetic energy. The time T is called the cyclotron period. The corresponding angular frequency ω_c is

$$\omega_c = \frac{2\pi}{T} = \frac{q}{m}\,\mathcal{B} \tag{23-12}$$

It is called the cyclotron frequency. The reason for these names becomes clear in Sec. 23-3.
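The helix relations, Eqs. (23-9) through (23-12), can be collected into one short Python function. This is a minimal sketch: the function name `helix_parameters`, the choice of a proton, the 1.0-T field, and the 60° angle are all assumptions made for illustration. Running it for two different speeds shows the radius and pitch changing while the period stays fixed.

```python
import math

def helix_parameters(mass, charge, field, speed, angle_rad):
    """Return (radius, period, pitch, omega_c) for a nonrelativistic particle.

    angle_rad is the angle between the velocity and the magnetic field.
    """
    v_perp = speed * math.sin(angle_rad)   # component perpendicular to the field
    v_par = speed * math.cos(angle_rad)    # component along the field
    radius = mass * v_perp / (abs(charge) * field)          # Eq. (23-9)
    period = 2.0 * math.pi * mass / (abs(charge) * field)   # Eq. (23-10)
    pitch = v_par * period                                  # Eq. (23-11)
    omega_c = 2.0 * math.pi / period                        # Eq. (23-12)
    return radius, period, pitch, omega_c

# Assumed illustrative values: a proton in a 1.0-T field, velocity at 60 degrees.
proton_mass = 1.67e-27     # kg
proton_charge = 1.60e-19   # C
angle = math.radians(60.0)

for speed in (1.0e6, 2.0e6):   # m/s
    r, T, z, omega_c = helix_parameters(proton_mass, proton_charge, 1.0, speed, angle)
    print(f"v = {speed:.1e} m/s: r = {r:.3e} m, T = {T:.3e} s, "
          f"z = {z:.3e} m, omega_c = {omega_c:.3e} rad/s")
```

Doubling the speed doubles both r and z, but T and ω_c are unchanged, as Eq. (23-10) requires.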


The fact that the magnetic force on a particle is always perpendicular to its direction of motion has a very important consequence for the energy of the particle. The work done on the particle by the magnetic field as the particle moves through an infinitesimal displacement ds is

$$dW = \mathbf{F}\cdot d\mathbf{s} = \mathbf{F}\cdot\mathbf{v}\,dt = q(\mathbf{v}\times\boldsymbol{\mathcal{B}})\cdot\mathbf{v}\,dt$$

In this equation, ℬ is the magnetic field at the location of the particle. The field need not be uniform, as long as it does not change with time. The cross product v × ℬ must always be perpendicular to the vector v. And the dot product of two perpendicular vectors is zero. Thus we have

$$dW = q(0)\,dt = 0 \tag{23-13a}$$


A magnetic field which does not change in time cannot do work on a charged particle. Or, to state it in an equivalent manner, the energy of a charged particle acted on by only a steady magnetic field is constant. (The reason for restricting this statement to steady magnetic fields becomes evident in Chap. 24.)

Since Eq. (23-13a) is true for any infinitesimal segment of the path of a charged particle through a magnetic field, it must also be true for the entire path. Thus the total work done on the particle by the magnetic field is

$$W = \int_{\text{path}} dW = 0 \quad \text{for time-independent } \boldsymbol{\mathcal{B}} \tag{23-13b}$$

This does not mean that the momentum of the particle remains constant. Whenever the direction of the motion of the particle changes, the momentum p = mv must change as well.

23-3 CYCLOTRON RESONANCE AND CYCLOTRONS

Fig. 23-12 Cyclotron resonance. A uniform, steady magnetic field ℬ is applied normal to the surface of a sample of semiconducting material. The orbit of an electron in the material is shown, neglecting collisions. An oscillating electric field ℰ is applied with its direction always parallel to the surface of the sample.

In Sec. 23-2, it was shown that the angular frequency of a nonrelativistic charged particle in a uniform magnetic field, the cyclotron frequency, is independent of the speed of the particle and hence of its kinetic energy. The cyclotron frequency is given by Eq. (23-12),

$$\omega_c = \frac{q}{m}\,\mathcal{B}$$

This energy independence is exploited in a variety of physical situations.

The most important present-day application of the constancy of the cyclotron frequency is to the measurement of cyclotron resonance in semiconductors and metals. As an introduction to this very important technique of solid-state physics, assume for the moment that the electrons in a conducting substance are completely free to move about, colliding only with the surfaces of the sample. In Fig. 23-12, a uniform magnetic field ℬ is applied externally in a direction normal to the surface of a sample of such a substance. Under the influence of this field, the electrons follow clockwise helical orbits (as seen by an observer looking in the direction of ℬ) at the cyclotron frequency ω_c = eℬ/m_e. In addition to the magnetic field, an electric field ℰ is applied parallel to the surface of the sample. This field oscillates in time, its value at time t being ℰ = ℰ₀ cos(ωt). (In practice, this is usually done by placing the sample inside a “microwave cavity.”) If the angular frequency ω of the applied electric field is arbitrary, an orbiting electron will sometimes gain energy as its acceleration by the instantaneous electric field increases the magnitude of its instantaneous velocity and will sometimes lose energy as this acceleration decreases the magnitude of its instantaneous velocity. Energy gained by the electrons is extracted from the energy of the electric field itself. (The energy density of an electric field is discussed in Sec. 21-6.) Conversely, energy lost by the electrons results in an increase in the energy of the electric field. Under these circumstances, the electrons neither gain nor lose energy on the average, and the average energy of the electric field remains constant.

But suppose that the electric field’s angular frequency of oscillation equals the cyclotron frequency ω_c. Then an electron whose direction of motion at some instant is such that it is gaining energy from the electric field will continue to gain energy indefinitely. This is because the component of v along the direction of ℰ₀ will reverse direction at the same frequency as the electric field ℰ reverses direction. We can express this point by saying that the motion of the electron in its helical orbit and the externally applied oscillating electric field are in phase.




If the sample contains a significant number of electrons which can circulate in phase with the oscillation of the electric field, the electrons will, on the average, absorb energy from the electric field. (The process is analogous to that by which the charge carriers in a current-carrying wire absorb energy from the externally applied electric field in the wire, as described in Sec. 22-4.) As a result, the energy of the electric field will decrease, and this can be detected quite sensitively. In practice, the usual technique is to keep the electric field angular frequency ω constant and to vary the magnetic field slowly, thus varying the cyclotron frequency ω_c. When the cyclotron resonance condition, ω = ω_c = eℬ/m_e, is attained, the magnitude of the magnetic field is

$$\mathcal{B} = \frac{\omega m_e}{e} \tag{23-14}$$

We can solve Eq. (23-14) to obtain the mass m_e of the electrons in terms of measurable quantities. This yields

$$m_e = \frac{e\mathcal{B}}{\omega} \tag{23-15}$$


This idealized discussion can now be extended to real situations. Electrons do not behave like the molecules of an ideal gas. Nevertheless, one of the happy conclusions of a detailed quantum-mechanical treatment of electrons in a metal or a semiconductor is that they can be treated in many ways as if they were newtonian particles, provided that they are assigned an effective mass m*. The precise value of m* cannot be predicted, however, without making detailed assumptions about the structure of the material which are difficult to verify or, at best, difficult to calculate. A knowledge of the effective mass thus becomes vital information in the theoretical calculations of the structure of the material. Cyclotron resonance is useful because it makes a direct measurement of m*. In terms of the effective mass, Eq. (23-15) becomes

$$m^* = \frac{e\mathcal{B}}{\omega} \tag{23-16}$$


For most metals and semiconductors, the effective mass is not isotropic. That is, electrons behave as though they have a mass which varies depending on the direction in which they are going relative to the crystal structure. Although there are important exceptions, usually m* has a value in the range 0.1 m_e to m_e, where m_e is the free-electron mass. This is in concord with the results of the Tolman-Stewart experiment, which yields average q/m values for metals given in Table 22-2. The cyclotron resonance technique, however, is a far better way of determining m*.

It is also possible to determine the sign of the charge on the charge carriers in metals and semiconductors by using cyclotron resonance. This is done by applying a rotating electric field, rather than an oscillating electric field, in the plane parallel to the sample surface. In Fig. 23-12, the right-hand rule for cross products tells us that negatively charged electrons must move in a clockwise sense (as seen by an observer looking in the direction of ℬ). Thus they will be accelerated continuously by an electric field which rotates clockwise at the cyclotron frequency. But cyclotron resonance is observed in some samples for clockwise rotating fields, in some for counterclockwise rotating fields, and in some for both. This indicates that positive as well as negative charge carriers are present in some semiconductors and metals. The positive carriers are called holes. A hole may be thought of as a small bubble in an otherwise full container of electrons. If you invert a closed bottle which is nearly full of water, but contains a small bubble, what happens in a fundamental sense is that the water descends slightly under the influence of gravity. But this motion is much easier to observe and describe in terms of the rise of the bubble, which responds to the downward gravitational field as though it experiences an upward force (that is, as though it had negative mass). Now, a bubble contains no water, so what is rising is actually “no water.” Nevertheless, it is usually better to describe what is happening when the bottle is inverted in terms of the motion of the bubble, rather than the motion of the water. The two descriptions are completely equivalent. Similarly, a hole, which is “no electron,” acts as though it had the negative of a negative charge, that is, a positive charge.

A hole in a semiconductor or a metal acts just as though it had a charge +e and some effective mass m*. The values of m* for holes tend to lie in the same general range as those for electrons.

Example  23-4  is  concerned  with  a typical  application  of  the  cyclotron 
resonance  technique  to  measurement  of  the  properties  of  semiconducting 
substances. 


EXAMPLE 23-4

In an experiment to determine the effective mass m* of the electrons in indium antimonide (InSb), you insert a sample of the substance into a microwave cavity. The cavity is “tuned” to oscillate in a standing-wave mode at a frequency ν = 30.0 GHz = 3.00 × 10¹⁰ Hz. In this mode, the microwave electric field rotates in the proper sense to allow cyclotron resonance energy transfer to electrons, but not to holes. You place the cavity between the poles of a small electromagnet and slowly vary the magnitude ℬ of the uniform magnetic field it produces. As you do so, you note that there is a relatively strong absorption of microwave energy in the cavity when ℬ = 1.66 × 10⁻² T. What is the “cyclotron effective mass” of the electrons in InSb?

■ The frequency ν is related to the angular frequency ω_c by the equation ω_c = 2πν. You can therefore write Eq. (23-16) in the form

$$m^* = \frac{\mathcal{B} e}{2\pi\nu}$$

The measured values give you

$$m^* = \frac{1.66\times10^{-2}\ \text{T} \times 1.60\times10^{-19}\ \text{C}}{2\pi \times 3.00\times10^{10}\ \text{Hz}} = 1.41\times10^{-32}\ \text{kg}$$

It is often convenient to express this value as a fraction of the mass m_e of a free electron. Since m_e = 9.11 × 10⁻³¹ kg, you can write

$$\frac{m^*}{m_e} = \frac{1.41\times10^{-32}\ \text{kg}}{9.11\times10^{-31}\ \text{kg}} = 1.55\times10^{-2}$$

or about 1.5 percent of the free-electron mass. This is relatively small as the effective masses of electrons and holes in semiconductors and metals go.
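The calculation in Example 23-4 is easily scripted, which is convenient when a resonance is recorded at several fields or frequencies. The sketch below uses the numbers of the example; the function name `effective_mass` and the constant names are assumptions introduced here.

```python
import math

E_CHARGE = 1.60e-19        # magnitude of the electron charge, C
M_FREE_ELECTRON = 9.11e-31 # free-electron mass, kg

def effective_mass(field_t, freq_hz, charge_c=E_CHARGE):
    """Cyclotron effective mass m* = B e / (2 pi nu), from Eq. (23-16)."""
    return field_t * charge_c / (2.0 * math.pi * freq_hz)

# Values from Example 23-4: resonance at 1.66e-2 T in a 30.0-GHz cavity.
m_star = effective_mass(field_t=1.66e-2, freq_hz=3.00e10)
ratio = m_star / M_FREE_ELECTRON

print(f"m* = {m_star:.2e} kg")
print(f"m*/m_e = {ratio:.4f}")
```

The unrounded ratio comes out slightly below the 1.55 × 10⁻² quoted in the example, because the example divides the already rounded value 1.41 × 10⁻³² kg by m_e.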


So far, we have described the cyclotron resonance effect as though electrons (and holes) never made collisions. But in reality they do, and if the mean free path λ discussed in Sec. 22-4 is too short, cyclotron resonance cannot be observed. Every time an electron (or a hole) makes a collision, its direction is randomized. If its motion before the collision was in phase with the oscillation of the electric field, it will no longer be so after the collision. Even if the frequency of the electric field has the proper value ω = ω_c, the electron (or hole) cannot continue in a circular path long enough to absorb significant energy from the field. For the experiment to be carried out successfully, it is important that the mean collision time τ be greater than or roughly comparable to the cyclotron period T. If this is so, the electron (or hole) can absorb energy through at least one full orbit before its motion is randomized by a collision. Thus the rough condition for successfully carrying out the experiment can be written

$$\tau \gtrsim T \tag{23-17a}$$

or, since T = 2π/ω_c,

$$\omega_c \tau \gtrsim 2\pi \tag{23-17b}$$

There are several ways of satisfying this condition. The most commonly employed one is to use extremely pure samples at very low temperatures (around 4 K). This maximizes both the impurity collision time τ_imp and the thermal collision time τ_th and thus the overall collision time τ, as discussed in Sec. 22-5. It is usually possible to increase the collision time by five orders of magnitude or more over the room-temperature value. Typical conditions for cyclotron resonance in a semiconductor are discussed in Example 23-5.


EXAMPLE 23-5

A sample of germanium, cooled to a temperature of 4.2 K, has a collision time τ ≈ 10⁻⁹ s. You have a microwave apparatus which can produce an oscillating electric field at angular frequency ω = 150 × 10⁹ Hz. Will you be able to observe cyclotron resonance? If so, what order of magnitude of magnetic field will you need?

■ First you test the condition of Eq. (23-17b), ω_c τ ≳ 2π. You have

$$\omega_c \tau = 150\times10^{9}\ \text{Hz} \times 10^{-9}\ \text{s} = 150 \gg 2\pi$$

So you should have no trouble obtaining a clear energy absorption at the cyclotron resonance frequency. Next, however, you must ascertain whether the required magnetic field is a practical one. You have

$$\mathcal{B} = \frac{m^* \omega_c}{e}$$

If the effective masses you are trying to measure lie in a range of values not too far from m* = m_e, you need a magnetic field whose magnitude is roughly

$$\mathcal{B} = \frac{9.1\times10^{-31}\ \text{kg} \times 150\times10^{9}\ \text{Hz}}{1.6\times10^{-19}\ \text{C}} = 0.85\ \text{T}$$

This field can be obtained quite conveniently with a standard laboratory electromagnet. Smaller effective masses will require smaller fields yet. So the experiment is quite feasible.
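The two tests made in Example 23-5 can be sketched as a pair of small Python functions. The function names are assumptions introduced here, and the rough condition ω_c τ ≳ 2π of Eq. (23-17b) is coded as a plain `>=` comparison, which is a simplification of what is only an order-of-magnitude criterion.

```python
import math

E_CHARGE = 1.6e-19   # magnitude of the electron charge, C
M_E = 9.1e-31        # free-electron mass, kg

def resonance_observable(omega_c, tau):
    """Rough test of Eq. (23-17b): omega_c * tau must be at least about 2 pi."""
    return omega_c * tau >= 2.0 * math.pi

def resonance_field(m_eff, omega_c, charge=E_CHARGE):
    """Field needed for resonance, B = m* omega_c / e."""
    return m_eff * omega_c / charge

# Values from Example 23-5.
omega = 150e9   # angular frequency of the microwave field, as quoted in the example
tau = 1e-9      # collision time at 4.2 K, s

print("omega_c * tau =", omega * tau)
print("observable:", resonance_observable(omega, tau))
print(f"B for m* = m_e: {resonance_field(M_E, omega):.2f} T")
print(f"B for m* = 0.1 m_e: {resonance_field(0.1 * M_E, omega):.3f} T")
```

The last line illustrates the closing remark of the example: a carrier with one-tenth the free-electron mass resonates at one-tenth the field.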


The possibility of cyclotron resonance in semiconductors was first suggested by G. F. Dresselhaus while he was a graduate student. The first successful results were announced by him in collaboration with Arthur Kip and Charles Kittel in 1956. The necessary electronic techniques were originally developed for radar purposes during and just after World War II. The materials technology, needed to prepare the very pure samples in which the effect could readily be observed, was developed at about the same time, in conjunction with the invention of the transistor. In turn, the information derived from cyclotron resonance was indispensable to the improved understanding of the structure of semiconductors on which the further development of solid-state electronic technology was based. Thus, cyclotron resonance is an excellent example of the intertwining of scientific and technological progress.

The application of cyclotron resonance to the study of the properties of charge carriers in solids was inspired by the historically very important particle accelerator called the cyclotron. The cyclotron was invented in 1931 by the American physicists E. O. Lawrence and M. Stanley Livingston. From shortly after its invention until the 1950s, it remained the most important single tool of nuclear and fundamental-particle physics. Like all particle accelerators, it accomplishes the task of producing a reasonably intense stream of charged particles at high energy. This is a prerequisite to almost all experiments in those fields of physics.

In the cyclotron, the particles accelerated are not the charge carriers in solids, but nuclear particles in a vacuum chamber placed between the poles of a very large electromagnet. The particles most frequently used are protons, deuterons, and alpha particles. Since the particles are accelerated in a vacuum chamber, they do not lose energy in collisions. They continue to acquire energy from an electric field oscillating at angular frequency ω_c as they rotate at that frequency in the magnetic field. The maximum particle energies attainable in a cyclotron are approximately 100 MeV. (This is roughly a billion times greater than the energies of interest in cyclotron resonance experiments in solids.)


23-4 THE LORENTZ FORCE

Up to this point, we have considered the behavior of charged particles under the influence of the electric force

$$\mathbf{F}_e = q\boldsymbol{\mathcal{E}} \tag{23-18a}$$

or under the influence of the magnetic force

$$\mathbf{F}_m = q\,\mathbf{v} \times \boldsymbol{\mathcal{B}} \tag{23-18b}$$

It often happens that a charged particle is simultaneously under the influence of both kinds of force. The total force is given by the vector sum of the electric and magnetic forces, and it is thus

$$\mathbf{F} = \mathbf{F}_e + \mathbf{F}_m = q(\boldsymbol{\mathcal{E}} + \mathbf{v} \times \boldsymbol{\mathcal{B}}) \tag{23-19}$$

This force represents the totality of the forces that can act upon a body by virtue of the fact that it possesses an electric charge. That is, a body may experience other kinds of forces, but they have no direct connection with the charge on the body. The total force described by Eq. (23-19) is called the Lorentz force.
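Equation (23-19) translates directly into code. The following Python sketch computes the Lorentz force on a point charge from the components of ℰ, v, and ℬ, and confirms that it is the sum of the separate electric and magnetic forces. The function names and the sample field and velocity values are assumptions chosen only for illustration.

```python
def cross(a, b):
    """Cross product of two 3-component vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def lorentz_force(q, E, v, B):
    """Lorentz force F = q (E + v x B), Eq. (23-19); all quantities in SI units."""
    v_cross_B = cross(v, B)
    return tuple(q * (e + m) for e, m in zip(E, v_cross_B))

ZERO = (0.0, 0.0, 0.0)

# Assumed sample values: an electron moving along +x through crossed fields.
q = -1.60e-19              # C
E = (0.0, -1.0e4, 0.0)     # electric field, V/m (pointing in the -y direction)
v = (1.0e7, 0.0, 0.0)      # velocity, m/s
B = (0.0, 0.0, -5.0e-4)    # magnetic field, T (pointing in the -z direction)

F_e = lorentz_force(q, E, ZERO, B)   # electric force alone (particle at rest)
F_m = lorentz_force(q, ZERO, v, B)   # magnetic force alone (no electric field)
F = lorentz_force(q, E, v, B)        # total Lorentz force

print("F_e =", F_e)
print("F_m =", F_m)
print("F   =", F)
```

With these values the electric force on the electron points in the +y direction and the magnetic force in the −y direction, so the two partly cancel.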

A classical application of the Lorentz force is the experiment performed in 1897 by J. J. Thomson to determine the charge-to-mass ratio e/m_e for free electrons. Thomson’s apparatus is shown in Fig. 23-13. It is a modification of a family of devices called cathode-ray tubes, which had been used to study the conduction of electricity through gases at low pressures for some decades before 1897. (If you are familiar with the appearance of modern television tubes, you will see a close resemblance.) In earlier cathode-ray experiments, the residual gas present in the tube had played a dominant or at least an interfering role in the behavior of the electron stream, or “cathode ray,” as it was called. However, Thomson was able to attain a sufficiently good vacuum that the gas remaining in the tube could be neglected. (In those days, that was the hardest part of the experiment!)

Fig. 23-13 Diagram of the cathode-ray tube used in the Thomson experiment to measure e/m_e.

In Thomson’s experiment, a large potential difference is applied between the negative electrode C (called the cathode) and the positive electrode A1 (called the anode). Typically, the potential difference is about 50,000 V, but its precise value is not important to the experiment. The large electric field at the surface of the cathode results in the extraction of electrons from it by means of one of several possible processes, which we do not need to consider here. These electrons are accelerated through the potential difference between C and A1. Most of them strike A1. However, there is a small hole in A1, and some of the electrons pass through it into the space between A1 and A2. Since A1 and A2 are electrically connected, they are at the same potential and the space between them is field-free. The electrons coast through the space. Those that are not headed straight down the tube strike A2. The properly directed electrons, comprising a narrow beam, pass through a small hole in A2 and into the main part of the tube. There they pass between a pair of parallel metal plates P+ and P−. If the potential difference V_d between these plates is zero, the electrons continue in a straight line and strike the center of the fluorescent zinc sulfide screen at the end of the tube. There they come to rest, producing a bright spot on the screen at the point S0.

If the plate P+ has a positive potential V_d relative to the plate P−, there is a downward-directed electric field ℰ in the region between them, and the electrons, whose charge is −e, are deflected upward by an electric force F_e = −eℰ. If the distance between the plates is d, the magnitude of the deflecting electric field is

$$\mathcal{E} = \frac{V_d}{d} \tag{23-20}$$

If we neglect the fringing field at the edges of the plates, the electric field is uniform between the plates and is zero elsewhere. For the time t it takes for an electron to pass between the plates, the electron experiences a constant upward force of magnitude F_e = eℰ. Consequently, there is a constant upward acceleration of magnitude a, given by the expression

$$a = \frac{F_e}{m_e} = \frac{e}{m_e}\,\mathcal{E} \tag{23-21}$$

The electron therefore follows a parabolic trajectory between the plates, which is an inverted version of that followed by a ball thrown in a horizontal direction and acted on by gravity. When the electron passes the end of the plates, it has been deflected through a vertical distance y, which is given by

$$y = \tfrac{1}{2}\, a t^2$$

If the horizontal speed of the electron is v and the length of the plates is l, the time spent in passing between the plates is t = l/v. Thus when the electron is just leaving the plates, its vertical deflection is

$$y = \frac{1}{2}\left(\frac{e}{m_e}\,\mathcal{E}\right)\frac{l^2}{v^2} \tag{23-22}$$

Afterward the electron continues in a straight-line trajectory whose slope is that of the tangent to the end of the parabolic part of the path lying between the plates. It strikes the fluorescent screen at the point S1. The vertical deflection y in Eq. (23-22) can be calculated from a measurement of the distance between S1 and S0, together with a knowledge of the dimensions of the apparatus. Thus Eq. (23-22) can be used to express the ratio e/m_e in terms of quantities which are all known except the electron speed v.

The electron speed $v$ is determined by using a magnetic field. The vacuum
tube is placed between the poles of a magnet. The magnet is oriented
with its North pole in front of the plane of the page in Fig. 23-13 and its
South pole behind the plane of the page. It produces a magnetic field $\boldsymbol{\mathcal{B}}$
which penetrates the apparatus in the negative $z$ direction, perpendicular to
both the direction of motion of the electrons and the electric field $\boldsymbol{\mathcal{E}}$
between the plates $P$. Electric and magnetic fields oriented in this way are
called crossed fields. Assume for simplicity that the magnetic field is
restricted to exactly the same region as the electric field. (This is not necessary
and is not how Thomson did the experiment. But the principle is essentially
the same.) Applying the right-hand rule for cross products to the
expression for the magnetic force, $\mathbf{F}_m = -e\mathbf{v}\times\boldsymbol{\mathcal{B}}$, we see that the magnetic
field in the negative $z$ direction tends to deflect the electrons downward,
toward the point $S_2$ on the face of the tube. The magnetic force on
the electrons thus opposes the electric force. If the magnetic field is adjusted
in magnitude until the deflection of the electron beam is zero, the
total force acting on each electron (that is, the Lorentz force) must be
zero. Under these conditions we therefore have


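$$\mathbf{F} = \mathbf{F}_e + \mathbf{F}_m = -e(\boldsymbol{\mathcal{E}} + \mathbf{v}\times\boldsymbol{\mathcal{B}}) = 0$$

It follows that

$$\boldsymbol{\mathcal{E}} = -\mathbf{v}\times\boldsymbol{\mathcal{B}}$$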


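Since the magnetic field $\boldsymbol{\mathcal{B}}$ is perpendicular to the electron velocity $\mathbf{v}$, the
magnitude of $-\mathbf{v}\times\boldsymbol{\mathcal{B}}$ is given by $|-\mathbf{v}\times\boldsymbol{\mathcal{B}}| = v\mathcal{B}$. Hence, taking the magnitudes
of both sides of the equation displayed above, we have

$$\mathcal{E} = v\mathcal{B}$$

or

$$v = \frac{\mathcal{E}}{\mathcal{B}} \tag{23-23}$$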
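The balance condition can be checked numerically. The following is a minimal sketch, assuming the axes used in the text (x along the beam, y upward, z out of the page) and illustrative field strengths that are not Thomson's values. The helper names `cross` and `lorentz_force` are hypothetical.

```python
# Sketch of the crossed-field balance F = q(E + v x B) = 0, Eq. (23-23).
# Axes as in the text: x along the beam, y upward, z out of the page.
# The field strengths below are illustrative assumptions, not Thomson's values.

E_CHARGE = 1.60e-19  # C, magnitude of the electron charge


def cross(a, b):
    """Cross product of two 3-component vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def lorentz_force(q, E, v, B):
    """Lorentz force q(E + v x B) on a charge q, component by component."""
    vxB = cross(v, B)
    return tuple(q * (E[k] + vxB[k]) for k in range(3))


E_mag = 2.0e4   # V/m, assumed
B_mag = 1.0e-3  # T, assumed

E_field = (0.0, -E_mag, 0.0)   # electric field directed downward
B_field = (0.0, 0.0, -B_mag)   # magnetic field in the negative z direction

v_balance = E_mag / B_mag      # speed at which the beam is undeflected

F_balanced = lorentz_force(-E_CHARGE, E_field, (v_balance, 0.0, 0.0), B_field)
F_slow = lorentz_force(-E_CHARGE, E_field, (0.5 * v_balance, 0.0, 0.0), B_field)

print(f"balance speed v = {v_balance:.3e} m/s")
print("F_y at the balance speed:", F_balanced[1])
print("F_y at half that speed:  ", F_slow[1])
```

At the balance speed the vertical force vanishes to within rounding error. An electron moving more slowly feels a net upward force, because the electric force then exceeds the magnetic force.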


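Substituting this value of the electron speed into Eq. (23-22) gives

$$y = \frac{1}{2}\,\frac{e}{m_e}\,\mathcal{E}\,\frac{l^2\mathcal{B}^2}{\mathcal{E}^2} = \frac{1}{2}\,\frac{e}{m_e}\,\frac{l^2\mathcal{B}^2}{\mathcal{E}} \tag{23-24}$$

Using Eq. (23-20), $\mathcal{E} = V_d/d$, and solving for $e/m_e$, we obtain

$$\frac{e}{m_e} = \frac{2yV_d}{d\,l^2\mathcal{B}^2}$$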
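As a consistency check on this result, the sketch below computes the deflection from Eq. (23-24) for an assumed charge-to-mass ratio and then recovers that ratio from the final formula. The apparatus dimensions and field strength are hypothetical values chosen only for illustration, and the function names are not from the text.

```python
# Round-trip check of the Thomson formulas, under assumed apparatus values.

def deflection(e_over_m, V_d, d, l, B):
    """Vertical deflection y at the plate exit, Eq. (23-24), with E = V_d/d."""
    E = V_d / d                      # Eq. (23-20)
    return 0.5 * e_over_m * l**2 * B**2 / E


def charge_to_mass(y, V_d, d, l, B):
    """Charge-to-mass ratio e/m_e = 2 y V_d / (d l^2 B^2)."""
    return 2.0 * y * V_d / (d * l**2 * B**2)


# Hypothetical apparatus (SI units); not Thomson's actual dimensions.
V_d = 200.0      # V, potential difference between the plates
d = 0.015        # m, plate separation
l = 0.050        # m, plate length
B = 5.0e-4       # T, field strength that nulls the deflection

E_OVER_M_ASSUMED = 1.7588e11   # C/kg, used only to generate a test deflection

y = deflection(E_OVER_M_ASSUMED, V_d, d, l, B)
recovered = charge_to_mass(y, V_d, d, l, B)

print(f"deflection y = {y * 1e3:.2f} mm")
print(f"recovered e/m_e = {recovered:.4e} C/kg")
```

With these assumed values the deflection is a few millimeters, comfortably smaller than half the plate separation, so the electron clears the plates.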

Since  we  have  today  ample  evidence  of  the  existence  of  the  electron  as  a par- 
ticle, Thomson’s  experiment  may  appear  at  first  glance  to  be  nothing  more  than  a 
way  of  obtaining  the  numerical  value  of  the  important  constant  e/me.  However, 
the  existence  of  the  electron  was  by  no  means  established  before  Thomson’s  work. 
Historically,  the  most  important  consequence  of  the  experiment  was  not  the  fairly 
accurate establishment of the numerical value of e/me, but the demonstration that
such  a unique  value  exists.  If  something  exists  which  has  a well-defined  ratio  of 
electric  charge  to  mass,  it  becomes  difficult  to  understand  what  it  might  be  if  it  is 
not  a particle!  As  we  have  already  noted,  the  name  “electron”  was  used  before 
1897  to  refer  not  to  a particle,  but  to  a unit  of  charge  which  appeared  to  have  a fun- 
damental role  in  electrochemical  processes.  It  was  Lorentz  who  extended  the  use 
of  the  word  to  mean  the  particle  itself.  Thomson  did  not  adopt  this  new  usage 
for  years,  preferring  to  call  the  particle  a “corpuscle.” 

In his 1897 paper, Thomson reported that the value of $e/m_e$ thus obtained
was in the neighborhood of $10^{11}$ C/kg. This is thousands of times
greater than the corresponding value for ions. What is more, Thomson
showed that the value of $e/m_e$ was independent of the material used for the
cathode and of the chemical nature of the residual gas in the tube. Thus
the particles involved in the experiment were not atomic ions.

Thomson later refined his methods in order to obtain a more precise
value for $e/m_e$. His best value was

$$\frac{e}{m_e} = 1.7\times10^{11}\ \text{C/kg}$$

The best modern value, based on a weighted average of a variety of interrelated
determinations, is

$$\frac{e}{m_e} = 1.75880\times10^{11}\ \text{C/kg} \tag{23-25}$$

The  technique  devised  by  Thomson,  greatly  improved  and  extended,  is 
widely  applied  to  ions.  The  technique  is  called  mass  spectroscopy,  because  it  can 
be  used  to  separate  a beam  of  mixed  ions  into  a “spectrum,”  or  distribution,  each 
part  of  which  contains  one  kind  of  ion.  There  are  several  types  of  apparatus  in  use, 
but  all  depend  on  the  fact  that  the  trajectory  followed  by  a charged  particle  in 
an  electric  field,  a magnetic  field,  or  both,  depends  on  its  charge-to-mass  ratio 
q/m.  It  was  through  the  use  of  early  mass  spectrometers  that  Thomson  and  his  as- 
sociate F.  W.  Aston  confirmed  the  existence  of  isotopes.  Isotopes  are  forms  of  an 
atom  of  a chemical  substance  which  differ  only  in  the  number  of  neutrons  in  their 
nuclei,  and  therefore  in  their  masses.  But  this  difference  does  not  affect  the  elec- 
tric charge  of  the  nucleus.  Consequently,  it  does  not  affect  the  number  or  distribu- 
tion of  the  electrons  of  the  isotopes,  and  their  chemical  properties  are  nearly  the 
same.  It  is  therefore  difficult  or  impossible  to  separate  isotopes  of  a single  chemi- 
cal substance  by  chemical  means.  It  is  the  mass  difference  which  makes  separation 
possible  by  means  of  mass  spectrometry. 

Atomic  masses  can  be  measured  very  accurately  with  a mass  spectrometer. 
Aston’s  demonstration  that  the  atoms  of  pure  isotopes  do  not  have  masses  pre- 
cisely equal  to  integral  multiples  of  the  mass  of  the  hydrogen  atom  was  of  great 
importance  in  the  development  of  nuclear  physics.  It  led  ultimately  to  the  discov- 
ery of  the  relativistic  relation  between  the  masses  of  nuclei  and  their  binding  en- 
ergies, which  is  discussed  in  Sec.  14-5. 



Fig.  23-14  The  Hall  effect  for  positive  charge  carriers.  The  symbols  are  explained  in  the 
text. 

Today,  mass  spectrometers  of  great  sophistication  are  used  for  very  delicate 
chemical  analyses,  often  in  conjunction  with  the  chemical  analyzer  called  the  gas 
chromatograph.  The  combination — the  so-called  GC-MS — is  perhaps  the  most  so- 
phisticated analytical  tool  ever  produced  on  a commercial  basis. 

When an electric conductor carries a current, the charge carriers
within it are in motion. If the conductor lies in a magnetic field, the moving
charges are deflected. In 1879, the U.S. experimental physicist Henry Rowland
(1848-1901) suggested to his graduate student Edwin H. Hall
(1855-1938) that he look into the matter. The effect which the latter discovered
is called the Hall effect.

Figure 23-14 shows the experimental arrangement. A rectangular slab
of a conducting material is connected to a battery, so that a current $i$ flows
along the positive $x$ direction due to a longitudinal electric field $\boldsymbol{\mathcal{E}}_x$. A
uniform magnetic field $\boldsymbol{\mathcal{B}}$ is applied perpendicular to the current flow,
along the positive $z$ direction. Suppose for the moment that the charge carriers
are positive and all of one kind. They flow in the same direction as the
current, from the positive to the negative terminal of the slab. These
charges will experience a magnetic force $q\mathbf{v}\times\boldsymbol{\mathcal{B}}$ directed upward, that is,
in the negative $y$ direction. However, they cannot continue to move upward
for long. The upper edge of the slab acts as a wall which prevents them
from moving farther, and as they bunch close to the upper edge, the result
is a positive net charge along that edge. The upward migration of positive
charge also leaves a deficit of positive charge along the lower edge. Thus, in
effect, a net negative charge accumulates along the lower edge. There is a
transverse electric field $\boldsymbol{\mathcal{E}}_y$ associated with this charge distribution. The
direction of $\boldsymbol{\mathcal{E}}_y$ is from top to bottom, in the positive $y$ direction. Since the
positive and negative charges are distributed uniformly on the parallel
upper and lower edges, the electric field $\boldsymbol{\mathcal{E}}_y$ is uniform, like that between
the plates of a plane-parallel capacitor.

Because of the existence of this electric field, a positive charge carrier
anywhere within the conducting slab experiences a downward electric
force $q\boldsymbol{\mathcal{E}}_y$. This force thus opposes the upward magnetic force $q\mathbf{v}\times\boldsymbol{\mathcal{B}}$
which brought the transverse electric field into existence in the first place.
As more and more positive charge piles up along the upper edge of the slab
under the action of the magnetic force, the opposing electric force increases
until the two forces just balance. (In practice, this happens very
quickly.) Under these steady-state circumstances, the total force $\mathbf{F} =
\mathbf{F}_e + \mathbf{F}_m$ on a charge carrier in the $y$ direction is zero. According to
Eq. (23-19) for this Lorentz force $\mathbf{F}$, the condition requires that

$$q(\boldsymbol{\mathcal{E}}_y + \mathbf{v}\times\boldsymbol{\mathcal{B}}) = 0$$

Since $\boldsymbol{\mathcal{E}}_y$, $\mathbf{v}$, and $\boldsymbol{\mathcal{B}}$ are all mutually perpendicular, the magnitudes of these
vectors must be related by the expression

$$\mathcal{E}_y = v\mathcal{B} \tag{23-26}$$


Once the steady state has been attained, the charge carriers drift directly
along the $x$ axis, since the remaining net force on them is due to the longitudinal
electric field $\boldsymbol{\mathcal{E}}_x$ in the $x$ direction, which drives the current $i$.

Since there is a uniform transverse electric field $\boldsymbol{\mathcal{E}}_y$, there must be a potential
difference $V_H$ between points $C$ and $D$ on opposite edges of the conducting
slab. If the width of the slab is $Y$, we have $V_H = \mathcal{E}_yY$, so that

$$\mathcal{E}_y = \frac{V_H}{Y} \tag{23-27}$$

Substituting this value into Eq. (23-26) yields

$$V_H = Yv\mathcal{B} \tag{23-28}$$


The  potential  difference  VH,  which  is  called  the  Hall  voltage,  can  be  mea- 
sured by  connecting  a voltmeter  between  points  C and  D.  Thus  all  the 
quantities  in  Eq.  (23-28)  can  be  measured  directly  except  for  the  drift  speed 
v of  the  charge  carriers.  But  we  know  how  to  express  v in  terms  of  the  cur- 
rent i,  which  can  be  measured  directly.  Let  N be  the  concentration  of  carri- 
ers, that  is,  the  number  of  carriers  per  unit  volume.  If  each  has  charge  q 
and  a is  the  cross-sectional  area  of  the  conducting  slab,  then  Eq.  (22-40)  can 
be  written  in  the  form 

i = Nqav 


Since the slab width is $Y$ and its thickness is $Z$ (see Fig. 23-14), $a$ is given by
$a = YZ$. Thus the drift speed can be written

$$v = \frac{i}{Nqa} = \frac{i}{NqYZ} \tag{23-29}$$

Substituting this value of $v$ into Eq. (23-28) yields

$$V_H = \frac{i\mathcal{B}}{NqZ} \tag{23-30}$$


It is not surprising that this equation shows the Hall voltage $V_H$ to be
directly proportional to the current $i$, since increasing $i$ increases the drift
speed of the charge carriers and therefore the magnitude $F_m = qv\mathcal{B}$ of the
magnetic force. This, in turn, requires that the magnitude $F_{ey}$ of the balancing
electric force be greater, and consequently $\mathcal{E}_y$ and $V_H$ are also increased.
Similarly, increasing the magnitude $\mathcal{B}$ of the magnetic field increases
$F_m$ and therefore $\mathcal{E}_y$ and $V_H$. It is not so easy to guess in advance
that $V_H$ is independent of the width of the slab or that it is inversely proportional
to the thickness of the slab. The width $Y$ cancels out when Eq. (23-29)
is substituted into Eq. (23-28). The inverse proportionality of $V_H$ to $Z$ implies
that if a large Hall voltage is desired, a thin sample should be used.
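The cancellation of the width can be seen by carrying out the two steps numerically. This is a minimal sketch with assumed sample values (roughly those of a thin copper strip); the function names are hypothetical.

```python
import math

# Hall voltage computed two ways: step by step through the drift speed,
# Eqs. (23-29) and (23-28), and directly from Eq. (23-30).

def hall_voltage_stepwise(i, B, N, q, Y, Z):
    v = i / (N * q * Y * Z)   # drift speed, Eq. (23-29)
    return Y * v * B          # Hall voltage, Eq. (23-28)


def hall_voltage(i, B, N, q, Z):
    return i * B / (N * q * Z)   # Eq. (23-30); the width Y does not appear


# Assumed values in SI units, for illustration only.
i = 10.0        # A
B = 0.563       # T
N = 8.45e28     # carriers per cubic meter
q = 1.60e-19    # C, positive carriers assumed here
Z = 1.00e-4     # m, thickness

widths = [0.005, 0.010, 0.020]   # m, three different slab widths
stepwise = [hall_voltage_stepwise(i, B, N, q, Y, Z) for Y in widths]
direct = hall_voltage(i, B, N, q, Z)
thinner = hall_voltage(i, B, N, q, Z / 2)

for Y, V in zip(widths, stepwise):
    print(f"Y = {Y * 1e3:4.1f} mm  ->  V_H = {V:.3e} V")
print(f"direct formula: V_H = {direct:.3e} V; half the thickness: {thinner:.3e} V")
```

Changing the width leaves the Hall voltage unchanged, while halving the thickness doubles it.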

The most useful information in Eq. (23-30), however, is that $Nq$, the
product of the density of charge carriers and their charge $q$, can be determined
from the measurable quantities $V_H$, $i$, $\mathcal{B}$, and $Z$. We have already
noted that in metals and semiconductors the carriers always have a charge
of $-e$ or $+e$. The only remaining unknown in Eq. (23-30), therefore, is the
carrier concentration $N$. Solving explicitly for $N$ gives

$$N = \pm\frac{i\mathcal{B}}{eV_HZ} \tag{23-31}$$


Thus,  if  a conductor  contains  only  one  kind  of  charge  carrier,  it  is  possible 
to  determine  the  carrier  concentration  by  means  of  Hall-effect  measure- 
ments. 

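Equation (23-30) can be rewritten in the form

$$V_H = \frac{1}{Nq}\,\frac{i\mathcal{B}}{Z} = R_H\,\frac{i\mathcal{B}}{Z} \tag{23-32}$$

The quantity

$$R_H \equiv \frac{1}{Nq} \tag{23-33}$$

is called the Hall coefficient. Its units are $1/(\text{m}^{-3}\cdot\text{C}) = \text{m}^3/\text{C}$.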


A very useful property of the Hall effect is its asymmetry with respect
to the sign of the charge carriers. Suppose the charge carriers are electrons,
so that $q = -e$. Figure 23-15 shows the new situation. The externally imposed
conditions are exactly the same as those of Fig. 23-14, where it was
assumed that the charge carriers were positive. The current still flows from
left to right, in the positive $x$ direction, under the influence of the externally
applied longitudinal electric field $\boldsymbol{\mathcal{E}}_x$. The magnetic field $\boldsymbol{\mathcal{B}}$ is still
directed in the positive $z$ direction, into the page. Now, however, the negative
charge carriers move from right to left. The magnetic force $q\mathbf{v}\times\boldsymbol{\mathcal{B}}$
acts on them in an upward direction, since $q$ is negative. This is the same
direction in which the magnetic force acts on positive charge carriers in the
same situation; compare with Fig. 23-14. The reason is that the same magnetic
field acts, in the two cases, on charge carriers of opposite sign which
are moving in opposite directions. These two reversals of sign cancel, and
the magnetic force is upward on both kinds of carriers.


Fig.  23-15  The  Hall  effect  for  negative  charge  carriers. 




But an upward force on negative carriers means that an excess negative
charge will accumulate on the upper edge of the slab. The transverse
electric field $\boldsymbol{\mathcal{E}}_y$ will be directed in the negative $y$ direction. This reversal of
direction of $\boldsymbol{\mathcal{E}}_y$ reverses the sense of the Hall voltage $V_H$. Thus the Hall effect
reveals the sign of the charge carriers as well as their concentration $N$.
It is conventional to call the Hall voltage $V_H$ and the Hall coefficient $R_H$ positive
when the charge carriers are positive and negative when the charge
carriers are negative. For most metals, the Hall coefficient is negative,
which corroborates the results of the Tolman-Stewart experiment
described in Chap. 22. However, there are also many metals which have
positive Hall coefficients. This is particularly true of metals whose chemical
valence is not 1 or 2.
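A Hall measurement therefore yields two pieces of information at once. The sketch below, which assumes a single kind of carrier with charge of magnitude e and the sign convention just stated, extracts both from a measured Hall voltage using Eq. (23-31). The function names and the sample values are hypothetical.

```python
import math

E_CHARGE = 1.60e-19  # C, magnitude of the carrier charge


def hall_voltage(i, B, N, q, Z):
    """Hall voltage from Eq. (23-30); its sign follows the sign of q."""
    return i * B / (N * q * Z)


def carriers_from_hall(i, B, V_H, Z):
    """Return (N, sign) from a measured Hall voltage, using Eq. (23-31).

    sign is +1 for positive carriers and -1 for negative carriers,
    following the convention that V_H takes the sign of the carriers.
    """
    sign = 1 if V_H > 0 else -1
    N = i * B / (E_CHARGE * abs(V_H) * Z)
    return N, sign


# Hypothetical sample: electrons at an assumed concentration.
N_true = 6.0e28                      # carriers per cubic meter (assumed)
V_measured = hall_voltage(i=10.0, B=0.500, N=N_true, q=-E_CHARGE, Z=1.00e-4)

N_found, sign_found = carriers_from_hall(i=10.0, B=0.500, V_H=V_measured, Z=1.00e-4)

print(f"V_H = {V_measured:.3e} V")
print(f"N = {N_found:.3e} per m^3, carrier sign = {sign_found:+d}")
```

The negative Hall voltage identifies the carriers as negative, and the magnitude returns the assumed concentration.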

The  use  of  the  Hall  effect  is  explored  in  Examples  23-6  and  23-7. 


EXAMPLE  23-6  

You are preparing to measure the Hall coefficient of a sample of copper. It is in the
form of a rectangular slab of thickness $Z = 0.100$ mm. You have a battery which
can pass a steady current $i = 10.0$ A through the sample, and you have scrounged a
large permanent magnet which has between its pole faces a uniform magnetic field
$\mathcal{B} = 0.563$ T. You must now go out and borrow a voltmeter of sufficient sensitivity
to measure the Hall voltage. Estimate the Hall coefficient and the Hall voltage you
will need to measure on the assumptions that the free-electron theory developed in
Sec. 22-4 is valid and that each copper atom contributes one free electron. The
atomic weight of copper is 63.6, and its density is $8.93\times10^{3}$ kg/m$^3$.

■ To find the electron concentration for copper, you proceed just as in Example
22-6, where the electron concentration for brass was estimated. You find the
number of copper atoms per cubic meter, which is given by

$$\frac{\text{Avogadro’s number}\times\text{density}}{\text{mass of 1 kmol}} = \frac{6.02\times10^{26}\times8.93\times10^{3}\ \text{kg/m}^3}{63.6\ \text{kg}} = 8.45\times10^{28}\ \text{atoms/m}^3$$

Assuming that each atom contributes one free electron, you have for the
free-electron concentration

$$N = 8.45\times10^{28}\ \text{electrons/m}^3$$

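Using the definition of the Hall coefficient $R_H$ given by Eq. (23-33), you predict its
value to be

$$R_H = \frac{1}{N(-e)} = \frac{1}{8.45\times10^{28}\ \text{electrons/m}^3\times(-1.60\times10^{-19}\ \text{C})} = -7.40\times10^{-11}\ \text{m}^3/\text{C}$$

To calculate the expected Hall voltage $V_H$, you use Eq. (23-32) in the form $V_H =
R_Hi\mathcal{B}/Z$ and obtain

$$V_H = \frac{-7.40\times10^{-11}\ \text{m}^3/\text{C}\times10.0\ \text{A}\times0.563\ \text{T}}{1.00\times10^{-4}\ \text{m}} = -4.16\times10^{-6}\ \text{V}$$

or

$$V_H = -4.16\ \mu\text{V}$$

So you need a fairly sensitive, but not unusual, voltmeter.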
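The arithmetic of this example can be reproduced in a few lines. The sketch below uses only the numbers given in the example; the variable names are chosen here for illustration.

```python
# Numerical check of Example 23-6 (all values from the example, SI units).

AVOGADRO_PER_KMOL = 6.02e26   # atoms per kilomole
E_CHARGE = 1.60e-19           # C

density = 8.93e3              # kg/m^3, copper
atomic_weight = 63.6          # kg per kmol
Z = 0.100e-3                  # m, slab thickness
i = 10.0                      # A
B = 0.563                     # T

# One free electron per atom is the free-electron assumption of the example.
N = AVOGADRO_PER_KMOL * density / atomic_weight   # electrons per m^3
R_H = 1.0 / (N * (-E_CHARGE))                     # Hall coefficient, Eq. (23-33)
V_H = R_H * i * B / Z                             # Hall voltage, Eq. (23-32)

print(f"N   = {N:.3e} electrons/m^3")
print(f"R_H = {R_H:.3e} m^3/C")
print(f"V_H = {V_H * 1e6:.2f} microvolts")
```

The results agree with the values quoted in the example to three significant figures.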




EXAMPLE  23-7 


When you perform the experiment described in Example 23-6, you find that the
Hall voltage is actually $V_H = -5.69\ \mu\text{V}$. What is the value of the Hall coefficient
based on experiment? According to this result, what is the number $n_{\text{exp}}$ of free electrons
per atom in copper? Compare this with the value $n_{\text{fe}} = 1$ electron per atom
based on the free-electron theory.

■ Since the Hall voltage is larger in magnitude than expected, the Hall coefficient
must also be larger by the same factor, because according to Eq. (23-32) the two
quantities are proportional. Using the subscript “fe” to denote quantities predicted
by the free-electron theory and the subscript “exp” to denote quantities derived
from experimental results, you have

$$\frac{R_{H\,\text{exp}}}{R_{H\,\text{fe}}} = \frac{V_{H\,\text{exp}}}{V_{H\,\text{fe}}}$$

or

$$R_{H\,\text{exp}} = \frac{V_{H\,\text{exp}}}{V_{H\,\text{fe}}}\,R_{H\,\text{fe}} = \frac{-5.69\ \mu\text{V}}{-4.16\ \mu\text{V}}\,(-7.40\times10^{-11}\ \text{m}^3/\text{C}) = -10.1\times10^{-11}\ \text{m}^3/\text{C}$$


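The Hall coefficient is inversely proportional to the electron concentration
$N$, as you can see from Eq. (23-33). You thus have

$$\frac{N_{\text{exp}}}{N_{\text{fe}}} = \frac{R_{H\,\text{fe}}}{R_{H\,\text{exp}}} = \frac{-7.40\times10^{-11}\ \text{m}^3/\text{C}}{-10.1\times10^{-11}\ \text{m}^3/\text{C}}$$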


But the ratio of the experimentally derived electron concentration $N_{\text{exp}}$ to the electron
concentration $N_{\text{fe}}$ predicted on the basis of the free-electron theory must be the
same as the ratio of the experimentally derived number of free electrons per
copper atom $n_{\text{exp}}$ to the number per atom predicted on the basis of the
free-electron theory, $n_{\text{fe}}$. That is, $n_{\text{exp}}/n_{\text{fe}} = N_{\text{exp}}/N_{\text{fe}}$. And since the original assumption
was that $n_{\text{fe}} = 1$ electron per atom, you have

$$n_{\text{exp}} = \frac{N_{\text{exp}}}{N_{\text{fe}}}\times1\ \text{electron/atom} = 0.73\ \text{electron/atom}$$
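The chain of ratios in this example is short enough to verify directly. The sketch below uses the values quoted in Examples 23-6 and 23-7; the variable names are chosen here for illustration.

```python
# Numerical check of Example 23-7 (values from Examples 23-6 and 23-7).

V_H_fe = -4.16e-6     # V, Hall voltage predicted by the free-electron theory
V_H_exp = -5.69e-6    # V, Hall voltage found by experiment
R_H_fe = -7.40e-11    # m^3/C, predicted Hall coefficient
n_fe = 1.0            # free electrons per atom assumed by the theory

# R_H is proportional to V_H, Eq. (23-32).
R_H_exp = (V_H_exp / V_H_fe) * R_H_fe

# N, and hence electrons per atom, is inversely proportional to R_H, Eq. (23-33).
n_exp = (R_H_fe / R_H_exp) * n_fe

print(f"R_H_exp = {R_H_exp:.3e} m^3/C")
print(f"n_exp   = {n_exp:.2f} electron/atom")
```

Because the two proportionalities are inverse to each other, the number of free electrons per atom is simply the ratio of the predicted to the measured Hall voltage.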


The  analysis  underlying  the  Hall  effect  involves  few  assumptions 
regarding  the  microscopic  behavior  of  charge  carriers  in  metals.  Thus,  pro- 
vided that  there  is  only  one  kind  of  carrier,  the  value  of  the  carrier  concen- 
tration N derived  from  Hall-effect  measurements  is  more  likely  to  be  reli- 
able than  that  based  on  the  assumption  that  the  number  of  free  electrons 
per  atom  is  equal  to  the  chemical  valence  of  the  metal.  The  result  of  Ex- 
ample 23-7  is  far  enough  from  the  free-electron  result  to  cast  doubt  on  the 
free-electron  theory  taken  literally,  and  yet  it  is  close  enough  to  suggest  that 
the  theory  is  not  totally  wrong,  at  least  in  the  case  of  copper.  There  are  sim- 
ilar correspondences  between  the  free-electron  and  Hall  values  of  the  elec- 
tron concentrations  for  many  monovalent  and  some  divalent  metals.  So  we 
can  guess  that  the  electrons  in  such  metals  are  “nearly”  free.  This  guess  is 
substantiated  by  quantum  theory. 




23-5  THE  BIOT-SAVART  LAW


The  first  four  sections  of  this  chapter  have  been  devoted  to  exploring  the 
consequences  of  one  of  the  two  central  links  between  electric  charge  and 
magnetic  field:  An  electric  charge  experiences  a force  when  it  is  moving  in  a 
magnetic  field.  In  this  and  the  following  sections,  we  consider  the  second, 
equally  important  link  between  electric  charge  and  magnetic  field,  which 
adds  a measure  of  symmetry  to  the  first  one.  The  second  link  is  this:  When 
an  electric  charge  is  in  motion  — that  is,  when  an  electric  current  exists  — there  exists 
a magnetic  field  in  the  vicinity  of  the  current.  To  put  it  another  way,  an  electric 
current  produces  a magnetic  field. 

The  first  observation  of  the  association  of  a magnetic  field  with  an  elec- 
tric current  was  made  in  the  spring  of  1820  by  the  Danish  physicist  Hans 
Christian  Oersted  (1777-1851).  He  found  that  a magnetic  compass  was  de- 
flected when  it  lay  in  the  vicinity  of  a current-carrying  wire.  This  deflection 
implies  the  existence  of  a magnetic  field  in  the  region  around  the  wire.  If 
the  earth’s  magnetic  field  can  be  neglected,  the  compass  needle  will  line  up 
parallel to the magnetic field associated with the current, with its North
pole pointing in the direction of $\boldsymbol{\mathcal{B}}$.

A particular  magnetic  field  line  can  be  traced  out  by  moving  the 
compass,  always  following  the  direction  indicated  by  the  North  pole.  If  the 
wire  is  long  and  straight,  as  shown  in  Fig.  23-16,  and  carries  a constant  cur- 
rent, any  particular  magnetic  field  line  is  a circle  centered  on  the  wire  and 
lying  in  a plane  perpendicular  to  the  wire. 

Magnetic  field  lines  are  not  always  circular,  as  they  are  in  the  highly 
symmetrical  case  shown  in  Fig.  23-16.  But  they  are  always  closed  curves 
surrounding  the  current.  They  do  not  cross  themselves  or  one  another. 

If  the  sense  of  the  electric  current  through  the  wire  is  reversed  in 
Oersted’s  experiment,  so  is  the  sense  of  the  magnetic  field  lines.  The  rela- 
tion between  the  senses  of  the  electric  current  and  of  the  associated  mag- 
netic field  is  established  by  experiment.  As  usual,  we  define  the  sense  of  the 
electric  current  as  the  sense  of  motion  of  a positive  charge  around  the  cir- 
cuit of  which  the  wire  is  a part.  And  we  define  the  sense  of  the  magnetic 
field  line  as  that  in  which  the  North  pole  of  the  compass  needle  points.  The 
experimental  results  can  be  described  by  the  right-hand  rule  for  magnetic 
field lines: Place your right hand so that the thumb points in the sense of the current
flow in a wire. The fingers then curl in the sense of the magnetic field lines encircling
the wire. This is illustrated in Fig. 23-17. The rule can be used to determine
the sense of the magnetic field lines in the vicinity of a particular
part of a wire carrying a current, even if the wire is not straight.

What is the quantitative connection between the magnetic field vector
$\boldsymbol{\mathcal{B}}$ at a certain location and the value of the electric current with which
it is associated? What we are looking for is a law, analogous to the form of
Coulomb’s law that relates the electric field vector $\boldsymbol{\mathcal{E}}$ at a certain location
and the value of the point-source charge with which it is associated. (We
have often used the possessive, calling $\boldsymbol{\mathcal{E}}$ the “electric field of the charge.”
Similarly, we will often call $\boldsymbol{\mathcal{B}}$ the “magnetic field of the current.”)

In the case of Coulomb’s law, we are able to begin experimentally with
a good approximation of the simplest case, in which the source charge $q$ is
concentrated at a single point. The location at which the electric field is to
be evaluated can then be described by a vector $\mathbf{r}$ extending from the source
charge to that location, and the electric field is then given by Eq. (20-20),

$$\boldsymbol{\mathcal{E}} = \frac{1}{4\pi\epsilon_0}\,\frac{q}{r^2}\,\hat{\mathbf{r}}$$


Fig. 23-16 (a) Tracing a magnetic field line which is part of a magnetic field
associated with an electric current flowing
through  a long,  straight  wire.  A small 
magnetic  compass  is  placed  at  point  P,  a 
distance  R from  the  wire.  It  is  then 
moved  slowly  in  the  direction  in  which 
its  North  pole  points.  The  compass  will 
trace  out  a circle  of  radius  R in  a plane 
perpendicular  to  the  wire,  with  the  wire 
at  its  center,  (b)  Visualization  of  the 
magnetic  field  associated  with  an  electric 
current  in  a long,  straight  wire.  The 
wire  passes  through  a small  hole  in  a 
card  held  in  a plane  perpendicular  to 
the  wire.  As  in  the  photo  of  Fig.  23-6, 
iron  filings  have  been  shaken  onto  the 
card  and  have  clumped  along  magnetic 
field  lines. 


Fig.  23-17  The  right-hand  rule  for  re- 
lating the  sense  of  magnetic  field  lines  to 
the  sense  of  the  electric  current  with 
which  they  are  associated. 




Building on this simplest case, we are then able to evaluate the electric field
$\boldsymbol{\mathcal{E}}$ associated with more complicated charge distributions. This is done by
dividing the total charge $q$ into infinitesimal charge elements $dq$, each of
which occupies an infinitesimal region of space. Each of these elements
makes a contribution $d\boldsymbol{\mathcal{E}}$ to the total electric field $\boldsymbol{\mathcal{E}}$ at the location of interest.
Each contribution $d\boldsymbol{\mathcal{E}}$ is given by Coulomb’s law in the form

$$d\boldsymbol{\mathcal{E}} = \frac{1}{4\pi\epsilon_0}\,\frac{dq}{r^2}\,\hat{\mathbf{r}}$$

where $\mathbf{r}$ is the vector from the particular charge element $dq$ to the location
at which $\boldsymbol{\mathcal{E}}$ is to be evaluated. We then assume that the total electric field $\boldsymbol{\mathcal{E}}$ is
the vector sum given by the integral

$$\boldsymbol{\mathcal{E}} = \int_{\text{all contributions}} d\boldsymbol{\mathcal{E}}$$

This assumption is amply verified by experimental measurement.
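The superposition procedure can be illustrated numerically for a long, straight line of charge, for which the exact result for an infinitely long line is $\mathcal{E} = \lambda/(2\pi\epsilon_0R)$ at perpendicular distance $R$. The following is a minimal sketch: the line charge density, the finite length, and the number of elements are assumptions made for illustration, and the function name is hypothetical. Components of $d\boldsymbol{\mathcal{E}}$ along the line cancel in pairs by symmetry, so only the perpendicular components are summed.

```python
import math

EPS0 = 8.854e-12  # F/m, permittivity of free space


def field_of_line_charge(lam, R, half_length=50.0, n_elements=100_000):
    """Sum the Coulomb contributions of the elements of a straight line of charge.

    lam is the charge per unit length (C/m) and R the perpendicular distance (m)
    from the midpoint of the line. Each element dq = lam * ds at position s
    contributes (1/4 pi eps0) dq / r^2, of which the fraction R/r is
    perpendicular to the line.
    """
    ds = 2.0 * half_length / n_elements
    total = 0.0
    for k in range(n_elements):
        s = -half_length + (k + 0.5) * ds      # midpoint of element k
        r2 = R * R + s * s                     # squared distance to the field point
        total += lam * ds * R / (r2 * math.sqrt(r2))
    return total / (4.0 * math.pi * EPS0)


lam = 1.0e-9   # C/m, assumed line charge density
E_near = field_of_line_charge(lam, R=0.05)
E_far = field_of_line_charge(lam, R=0.10)
E_exact_far = lam / (2.0 * math.pi * EPS0 * 0.10)

print(f"summed field at R = 0.10 m: {E_far:.4f} N/C")
print(f"infinite-line formula:      {E_exact_far:.4f} N/C")
print(f"ratio E(0.05 m)/E(0.10 m):  {E_near / E_far:.4f}")
```

Although each element contributes in proportion to $r^{-2}$, the sum over a long line falls off as $R^{-1}$: halving the distance doubles the field.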

We cannot follow an analogous procedure in finding the law which relates
magnetic field to electric current, because there is no such thing as a
“point current” on which the analogue of the Coulomb experiment can be
performed. What we do, instead, is to work backward. We begin with
experimental observations of the magnetic field $\boldsymbol{\mathcal{B}}$ associated with a steady
current $i$ in a long, straight wire. On the basis of these observations, we deduce
mathematically the contribution $d\boldsymbol{\mathcal{B}}$ to the total magnetic field $\boldsymbol{\mathcal{B}}$
from each infinitesimal element of the wire. To do this, we assume that the
total magnetic field is the vector sum given by the integral

$$\boldsymbol{\mathcal{B}} = \int_{\text{all contributions}} d\boldsymbol{\mathcal{B}}$$

For the moment, the justification of this assumption lies in its close analogy
with the electric field integral displayed in the previous paragraph. Its ultimate
justification, however, must lie in the consistency with experimental
results of predictions made on the basis of the assumption.

The first step, then, is to summarize the experimental evidence.
Oersted’s results, which are illustrated in Fig. 23-16, were confirmed and
extended in 1820 by the French physicists Jean-Baptiste Biot (1774-1862)
and Felix Savart (1791-1841), very shortly after the news of Oersted’s discovery
reached Paris. Biot and Savart took advantage of the fact that a
compass needle is a magnetic dipole, which will oscillate in a magnetic field
if it is displaced from an alignment parallel to $\boldsymbol{\mathcal{B}}$. As noted in the discussion
accompanying Fig. 23-3, this behavior is analogous to that of an electric dipole
in an electric field, described in Sec. 21-4. The period of oscillation depends
on the magnitude $\mathcal{B}$ of the magnetic field. Thus Biot and Savart
could measure the relative strength of the magnetic field as a function of
electric current and position.

The first general observation is that at any point in the vicinity of a
wire of arbitrary shape carrying a current $i$, the magnitude of the magnetic
field $\boldsymbol{\mathcal{B}}$ is directly proportional to the current; that is,

$$\mathcal{B} \propto i \tag{23-34}$$

The direction of the magnetic field $\boldsymbol{\mathcal{B}}$ at a particular point does not change
as the current is varied.

The second observation is restricted to the magnetic field of a current
flowing in a long, straight wire; that is, restricted to the magnetic field produced
by such a current. In the vicinity of the wire, and far from its ends, $\mathcal{B}$
decreases as the distance from the wire increases. If the perpendicular distance
from the wire to point $P$ in Fig. 23-18 is called $R$, the relation between
the magnitude of the magnetic field and the distance is

$$\mathcal{B} \propto \frac{1}{R} \tag{23-35}$$

This proportionality is one which by now you may have come to expect in
any situation where the geometry involves cylindrical symmetry. The same
proportionality is found in the case of the electric field of a long, straight
charged wire [Eq. (20-48)] and for the current density in a disk carrying
current in the radial direction (Example 22-5). As far as magnitudes are
concerned, the magnetic field conforms to the same rule.

We now wish to incorporate the proportionalities of Eqs. (23-34) and
(23-35) into a relation among $i$, the vector $\mathbf{r}$ describing the position of
point $P$ with respect to an arbitrary element of the wire, and the contribution
$d\boldsymbol{\mathcal{B}}$ of that element to the magnetic field $\boldsymbol{\mathcal{B}}$ at $P$. It is reasonable to hope,
at least, that this relation will be as similar as possible to Coulomb’s law. But
the directional relationship between current and magnetic field is not as
simple as the corresponding relationship between charge and electric field
in Coulomb’s law. In Coulomb’s law, the direction of the electric field is
simply outward from the (positive) source charge. In the magnetic case,
Oersted’s experiment tells us that the magnetic field direction at a point $P$
in the vicinity of a long, straight, current-carrying wire is perpendicular
both to the general direction of the electric current and to the direction
from the nearest point on the wire to $P$. This is illustrated in Fig. 23-18.

Fig. 23-18 The direction of the magnetic field $\boldsymbol{\mathcal{B}}$ at a point $P$ whose location
with respect to a long, straight wire is specified by the vector $\mathbf{R}$. The sense of
the current $i$ in the wire is shown.

Fig. 23-19 An electric current $i$ flows through a wire of arbitrary shape in the
sense shown. The infinitesimal segment $d\mathbf{s}$ makes a contribution $d\boldsymbol{\mathcal{B}}$ to the total
magnetic field at point $P$, whose location with respect to $d\mathbf{s}$ is specified by the
vector $\mathbf{r}$.

Figure 23-19 shows a wire of arbitrary shape, which carries a steady
current $i$. The infinitesimal segment $d\mathbf{s}$ makes a contribution $d\boldsymbol{\mathcal{B}}$ to the total
magnetic field $\boldsymbol{\mathcal{B}}$ at point $P$. The point $P$ is located at the end of the vector $\mathbf{r}$
originating at $d\mathbf{s}$. The vector $d\mathbf{s}$ is chosen along the wire, its direction
corresponding to the sense of the current. The magnitude $d\mathcal{B}$ of the contribution
must be proportional to the current $i$, since according to Eq. (23-34),
$\mathcal{B}$ is proportional to $i$. Thus we have

$$d\mathcal{B} \propto i \tag{23-36a}$$

Also, it seems reasonable to assume (subject to later verification) that the
magnitude $d\mathcal{B}$ of the contribution to $\mathcal{B}$ of the segment $d\mathbf{s}$ is directly
proportional to its length $ds$, so that

$$d\mathcal{B} \propto ds \tag{23-36b}$$

Now we turn to the observation that for a long, straight wire, $\mathcal{B}$ is
proportional to $r^{-1}$, as indicated by Eq. (23-35). In going from the long, straight
wire to the infinitesimal segment of length $ds$, we go from a geometric
arrangement having cylindrical symmetry to one which approximates a
point. This is analogous to going from the electric field $\mathcal{E} \propto r^{-1}$ of a long,
straight, charged wire to the electric field $\mathcal{E} \propto r^{-2}$ of a point charge. So it
seems fair to argue (again subject to later verification) that the magnitude
$d\mathcal{B}$ of the contribution of the segment $d\mathbf{s}$ to the total field at point $P$ will be
proportional to $r^{-2}$. We therefore write

$$d\mathcal{B} \propto \frac{1}{r^2} \tag{23-36c}$$

So far, what we have done is quite analogous to the argument which
was made concerning Coulomb’s law. But now we must take into consideration
the directionality of the relationship between magnetic field and electric
current, as contrasted to the isotropy of the relationship between electric
field and electric charge. Given the mutual perpendicularity in the
Oersted experiment (Fig. 23-18) of the general direction of current flow,
the vector $\mathbf{R}$ from the long, straight wire to the point $P$, and the direction of
the field $\boldsymbol{\mathcal{B}}$, the simplest possible relation is that of the cross product. This
can be seen in the case of the wire taken as a whole, as follows. Taking
advantage of the fact that the wire is straight, we can assign a direction, and not
merely a sense, to the current in the wire. (We assume that the remainder
of the circuit, where the current must flow in other directions, is so far away
that it makes no significant contribution to the magnetic field at $P$.) If we
call that direction $\hat{\mathbf{x}}$, it is evident from Fig. 23-18 that the directions $\hat{\mathbf{x}}$, $\hat{\mathbf{R}}$,
and $\hat{\boldsymbol{\mathcal{B}}}$ are properly related by the cross-product relation

$$\hat{\boldsymbol{\mathcal{B}}} = \hat{\mathbf{x}} \times \hat{\mathbf{R}}$$

In  the  case  of  a long,  straight  wire,  all  the  elements  ds  of  the  wire  have 
the  same  direction.  We  assume,  however,  that  this  cross-product  relation  for 
direction  is  valid  for  a wire  element,  regardless  of  whether  it  is  part  of  a 
straight wire or a wire of some other shape. (This assumption must be justified
ultimately by experimental verification of the conclusions based on it.)


23-5  The  Biot-Savart  Law  1089 


We denote the unit vector in the direction of a particular element $d\mathbf{s}$ by
means of the symbol $\widehat{d\mathbf{s}}$. (Note that the “hat,” denoting a unit, and therefore
finite, vector goes over the entire symbol $d\mathbf{s}$, and not over the $\mathbf{s}$ only.
This is because we have defined an infinitesimal vector $d\mathbf{s}$, which has a definite
direction. But there is no such thing as an “infinitesimal direction”
“$d\hat{\mathbf{s}}$.”) If the assumption we have just made is correct, we can write the
directional relationship

$$d\boldsymbol{\mathcal{B}} \propto \widehat{d\mathbf{s}} \times \hat{\mathbf{r}} \tag{23-36d}$$

That is, the direction of $d\boldsymbol{\mathcal{B}}$ is the same as the direction specified by the
cross product $\widehat{d\mathbf{s}} \times \hat{\mathbf{r}}$, and its magnitude $d\mathcal{B}$ is proportional to the magnitude
of $\widehat{d\mathbf{s}} \times \hat{\mathbf{r}}$. That magnitude, which is $|\widehat{d\mathbf{s}} \times \hat{\mathbf{r}}| = |\sin\theta| = \sin\theta$, is the
sine of the angle shown in Fig. 23-19.

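Combining the proportionalities of Eqs. (23-36a) through (23-36d)
yields the proportionality

$$d\boldsymbol{\mathcal{B}} \propto \frac{i\,ds}{r^2}\,\widehat{d\mathbf{s}} \times \hat{\mathbf{r}}$$

Using the identity $\widehat{d\mathbf{s}}\,ds = d\mathbf{s}$, we can write this proportionality as

$$d\boldsymbol{\mathcal{B}} \propto \frac{i}{r^2}\,d\mathbf{s} \times \hat{\mathbf{r}} \tag{23-37}$$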


As usual, it is preferable to express this relationship as an equality rather
than a proportionality. In the SI system, the proportionality constant is
chosen so that the equality has the form

$$d\boldsymbol{\mathcal{B}} = \frac{\mu_0}{4\pi}\,\frac{i}{r^2}\,d\mathbf{s} \times \hat{\mathbf{r}} \tag{23-38a}$$

where the constant $\mu_0$ is defined to be exactly

$$\mu_0 = 4\pi \times 10^{-7}\ \text{T}\cdot\text{m/A} \tag{23-38b}$$

Equation (23-38a) is called the Biot-Savart law.
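As an illustration of how Eq. (23-38a) is evaluated for one element, here is a minimal Python sketch. The function names `cross` and `biot_savart_element`, the 1-mm element length, and the choice of field point are assumptions made for this illustration and do not come from the text.

```python
import math

MU_0 = 4e-7 * math.pi  # permeability of free space in T*m/A, Eq. (23-38b)

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def biot_savart_element(i, ds, r):
    """Return dB (tesla) at displacement r (m) from an element ds (m) carrying current i (A)."""
    r_mag = math.sqrt(sum(c * c for c in r))
    r_hat = tuple(c / r_mag for c in r)
    scale = MU_0 / (4 * math.pi) * i / r_mag**2
    return tuple(scale * c for c in cross(ds, r_hat))

# Hypothetical case: a 1-mm element along +x carrying 15 A,
# with the field point 5 cm away along +y.
dB = biot_savart_element(15.0, (1e-3, 0.0, 0.0), (0.0, 0.05, 0.0))
print(dB)  # only the z component is nonzero, as x cross y = z requires
```

The element is treated as a short straight vector, which is reasonable only when its length is small compared with $r$.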


The Biot-Savart law is the magnetic analogue of Coulomb’s law for
electric fields. This is made evident by writing the two laws together in the
following forms:

$$d\boldsymbol{\mathcal{E}} = \left(\frac{1}{4\pi\epsilon_0}\,\frac{dq}{r^2}\right)\hat{\mathbf{r}} \qquad\qquad d\boldsymbol{\mathcal{B}} = \left(\frac{\mu_0}{4\pi}\,\frac{i\,ds}{r^2}\right)\widehat{d\mathbf{s}} \times \hat{\mathbf{r}}$$

The scalar parts of the two equations (the terms in parentheses) are identical
in mathematical form. The vector parts are not. In particular, the electric
field has the same magnitude at all points equidistant from the source
charge $dq$, regardless of direction. In contrast, the segment $d\mathbf{s}$ makes a magnetic
field contribution of maximum magnitude

$$d\mathcal{B} = \frac{\mu_0}{4\pi}\,\frac{i\,ds}{r^2}$$

in the plane perpendicular to the direction $\widehat{d\mathbf{s}}$ of the segment, where $\widehat{d\mathbf{s}} \times \hat{\mathbf{r}}$
has its maximum magnitude.

The term $\widehat{d\mathbf{s}} \times \hat{\mathbf{r}}$ in the Biot-Savart law specifies the direction of the
magnetic field contribution $d\boldsymbol{\mathcal{B}}$. But it also introduces a magnitude factor.
Remember that $|\widehat{d\mathbf{s}} \times \hat{\mathbf{r}}| = \sin\theta$, where $\theta$ is the (smaller) angle between the
directions $\widehat{d\mathbf{s}}$ and $\hat{\mathbf{r}}$. Thus the magnitude of $d\boldsymbol{\mathcal{B}}$ depends on $\theta$ as well as on
the magnitudes $i$, $ds$, and $r$. When this angle is 0° or 180° (that is, at locations
directly “ahead of” or “behind” the segment $d\mathbf{s}$) the current in the
segment makes no contribution at all to the magnetic field.
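The angular factor can be tabulated directly. The sketch below evaluates the magnitude $d\mathcal{B} = (\mu_0/4\pi)(i\,ds/r^2)\sin\theta$ at a few angles; the function name `dB_magnitude` and the numerical values of $i$, $ds$, and $r$ are illustrative assumptions.

```python
import math

MU_0 = 4e-7 * math.pi  # T*m/A, Eq. (23-38b)

def dB_magnitude(i, ds, r, theta_deg):
    """Magnitude of the Biot-Savart contribution for angle theta between ds and r."""
    theta = math.radians(theta_deg)
    return MU_0 / (4 * math.pi) * i * ds / r**2 * math.sin(theta)

# Assumed values: i = 15 A, ds = 1 mm, r = 5 cm.
for theta_deg in (0, 30, 90, 150, 180):
    print(theta_deg, dB_magnitude(15.0, 1e-3, 0.05, theta_deg))
```

The contribution is largest at 90° and the same at 30° and 150°. At 0° it vanishes, and at 180° it vanishes to within floating-point rounding.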


The constant $\mu_0$ is called the permeability of free space. It is the magnetic
analogue of the electric quantity $\epsilon_0$, the permittivity of free space.
However, there is a difference in the way their values are determined. In SI,
the value of the permeability of free space is chosen by an arbitrary definition,
and the value given by Eq. (23-38b) is therefore exact. The value of $\epsilon_0$
is not chosen arbitrarily; it is determined experimentally. One way to do so
is by means of careful measurement of the capacitance of a precision
guard-ring capacitor, as described in Sec. 21-5. However, it is shown in Sec.
27-4 that there is a remarkable relation among the quantities $\mu_0$, $\epsilon_0$, and the
speed of light $c$. That relation is

$$c^2 = \frac{1}{\mu_0\epsilon_0} \tag{23-39}$$

The speed of light can be measured with great precision. Since $\mu_0$ has a
value assigned by definition, the value of $\epsilon_0$ given by Eq. (23-39) written in
the form $\epsilon_0 = 1/\mu_0 c^2$ is determined as precisely as the value of $c^2$.
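The arithmetic in the form $\epsilon_0 = 1/\mu_0 c^2$ is short enough to check directly. In this sketch the numerical value of the speed of light is an assumption supplied here, not a figure quoted in the text.

```python
import math

MU_0 = 4e-7 * math.pi   # T*m/A, Eq. (23-38b)
C = 2.99792458e8        # speed of light in m/s (value assumed here)

# Eq. (23-39) rearranged: epsilon_0 = 1 / (mu_0 * c^2)
EPS_0 = 1.0 / (MU_0 * C**2)
print(EPS_0)  # roughly 8.854e-12 C^2/(N*m^2)
```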


Example  23-8  applies  the  Biot-Savart  law  to  the  case  of  a long,  straight, 
current-carrying  wire.  The  result  is  therefore  directly  useful  in  verifying 
the  assumptions  underlying  the  Biot-Savart  law,  since  it  makes  a prediction 
on  the  basis  of  that  law  which  can  be  compared  with  the  results  of  direct 
experimental  observation  on  a long,  straight  wire  carrying  a current. 


EXAMPLE  23-8 

A very  long,  straight  wire  carries  a steady  current  i = 15  A.  Find  the  magnitude  and 
direction  of  its  magnetic  field  at  a distance  R = 5.0  cm  from  the  wire,  using  the 
Biot-Savart  law,  Eq.  (23-38a). 

■ Make  a sketch  of  the  arrangement  as  in  Fig.  23-20.  The  current  i flows  from 
left  to  right  along  the  wire,  which  lies  along  the  x axis.  The  point  P , at  which  the 
magnetic  field  is  to  be  found,  is  located  a distance  R from  the  wire.  The  x coordinate 


Fig.  23-20  Diagram  for  Example  23-8,  in  which  the  result  of 
Oersted's  experiment  is  shown  to  be  consistent  with  the  assumptions 
underlying  the  Biot-Savart  law. 




of $P$ is chosen to be $x = 0$. Connect the typical infinitesimal segment $d\mathbf{s}$ of the wire to
$P$ with the vector $\mathbf{r}$. Note that $\mathbf{r}$ is drawn from the segment (which is making the
contribution $d\boldsymbol{\mathcal{B}}$) to $P$.

Since the cross product $d\mathbf{s} \times \hat{\mathbf{r}}$ is directed perpendicularly out of the plane of
the page, that will be the direction of $d\boldsymbol{\mathcal{B}}$. The magnitude $d\mathcal{B}$ is given by the
Biot-Savart law, Eq. (23-38a). Since $d\mathbf{s}$ has the direction of the positive $x$ axis, you
have $ds = dx$. Hence you can write $|d\mathbf{s} \times \hat{\mathbf{r}}| = ds \sin\theta = dx \sin\theta$, where $\theta$ is the
angle between $d\mathbf{s}$ and $\mathbf{r}$. So the law gives

$$d\mathcal{B} = \frac{\mu_0}{4\pi}\,\frac{i}{r^2}\,dx\,\sin\theta \tag{23-40}$$

You must now express $d\mathcal{B}$ in terms of a single independent variable. If you choose
the coordinate $x$ for this purpose, you have

$$r^2 = x^2 + R^2$$

and

$$\sin\theta = \frac{R}{r} = \frac{R}{(x^2 + R^2)^{1/2}}$$

Thus you have

$$d\mathcal{B} = \frac{\mu_0 i}{4\pi}\,\frac{R\,dx}{(x^2 + R^2)^{3/2}}$$


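While other segments $d\mathbf{s}$ located elsewhere along the wire will not make
contributions of equal magnitude to the field $\boldsymbol{\mathcal{B}}$ at $P$, their contributions will all be in the
same direction. Therefore the magnitude of the field vector $\boldsymbol{\mathcal{B}}$ can be found by
taking the sum of the magnitudes $d\mathcal{B}$ of the infinitesimal vectors $d\boldsymbol{\mathcal{B}}$, that is, by
integrating. You have

$$\mathcal{B} = \int_{\text{all contributions}} d\mathcal{B} = \int_{-\infty}^{\infty} \frac{\mu_0 i}{4\pi}\,\frac{R\,dx}{(x^2 + R^2)^{3/2}} = \frac{\mu_0 i R}{4\pi} \int_{-\infty}^{\infty} \frac{dx}{(x^2 + R^2)^{3/2}} \tag{23-41}$$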


(No  real  wire  is  infinitely  long.  But  the  approximation  of  infinite  length  is  a good 
one  if  the  length  of  the  straight  wire  is  much  greater  than  R and  the  point  P is  far 
from  either  end.)  Referring  to  a table  of  integrals,  you  evaluate  the  last  integral  in 
Eq.  (23-41)  and  find 


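$$\mathcal{B} = \frac{\mu_0 i R}{4\pi} \left[ \frac{1}{R^2}\,\frac{x}{(x^2 + R^2)^{1/2}} \right]_{-\infty}^{\infty} = \frac{\mu_0 i}{4\pi R}\,[1 - (-1)]$$

or

$$\mathcal{B} = \frac{\mu_0 i}{2\pi R} \tag{23-42a}$$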


The direction of $\boldsymbol{\mathcal{B}}$ is outward from the page, as shown in Fig. 23-20. If you wish to
express the direction explicitly, you can write the result in the vectorial form

$$\boldsymbol{\mathcal{B}} = \frac{\mu_0 i}{2\pi R}\,\widehat{d\mathbf{s}} \times \hat{\mathbf{R}} \tag{23-42b}$$

where $\widehat{d\mathbf{s}}$ is the direction along the wire in which the current flows and $\hat{\mathbf{R}}$ is the
direction from the wire to point $P$ along the perpendicular to the wire.

You  are  now  ready  to  insert  the  numerical  values  into  this  equation.  You  have 


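$$\mathcal{B} = \frac{4\pi \times 10^{-7}\ \text{T}\cdot\text{m/A} \times 15\ \text{A}}{2\pi \times 5.0 \times 10^{-2}\ \text{m}} = 6.0 \times 10^{-5}\ \text{T}$$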




In everyday terms, this is a fairly small magnetic field (having about the same magnitude
as that of the earth), and it is associated with a reasonably large electric current
flowing at a modest distance from $P$. Thus, passing a current through a straight
wire is not an efficient way of generating a magnetic field.
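The integration in Example 23-8 can be checked numerically. The sketch below sums the contributions of Eq. (23-41) over a wire of finite length with a midpoint rule and compares the sum with Eq. (23-42a). The function names, the 100-m wire length, and the step count are assumptions made for the illustration.

```python
import math

MU_0 = 4e-7 * math.pi  # T*m/A, Eq. (23-38b)

def wire_field_numeric(i, R, half_length, n_steps=200_000):
    """Midpoint-rule sum of (mu_0 i / 4 pi) R dx / (x^2 + R^2)^(3/2) over a finite wire."""
    dx = 2 * half_length / n_steps
    total = 0.0
    for n in range(n_steps):
        x = -half_length + (n + 0.5) * dx
        total += R * dx / (x * x + R * R) ** 1.5
    return MU_0 * i / (4 * math.pi) * total

def wire_field_infinite(i, R):
    """Closed-form result for an infinitely long wire, Eq. (23-42a)."""
    return MU_0 * i / (2 * math.pi * R)

# Values from Example 23-8: i = 15 A, R = 5.0 cm; the wire is taken to be 100 m long.
numeric = wire_field_numeric(15.0, 0.05, 50.0)
closed_form = wire_field_infinite(15.0, 0.05)
print(numeric, closed_form)
```

Because the assumed wire is 2000 times longer than $R$, the finite sum agrees with the infinite-wire result to well under one part in a thousand, which illustrates the remark in the example about the infinite-length approximation.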


Equations (23-42a) and (23-42b) give a result based on the Biot-Savart
law. To test the validity of that law, they must now be compared with experimental
results. These experimental results are summarized in the proportionalities
of Eq. (23-34), $\mathcal{B} \propto i$, and of Eq. (23-35), $\mathcal{B} \propto R^{-1}$, which can be
combined to give

$$\mathcal{B} \propto \frac{i}{R}$$

Equation (23-42a) does, indeed, conform to this result. The remaining
factor, $\mu_0/2\pi$, is determined by the geometry of the situation and the value
$\mu_0/4\pi$ assigned to the constant in the Biot-Savart law, Eq. (23-38). Equation
(23-42b) bears a strong resemblance to Eq. (20-48), the expression for
the electric field of a very long straight wire carrying an electric charge $\lambda$
per unit length. This is made evident by writing them side by side:


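$$\boldsymbol{\mathcal{E}} = \left(\frac{1}{2\pi\epsilon_0}\,\frac{\lambda}{R}\right)\hat{\mathbf{R}} \qquad\qquad \boldsymbol{\mathcal{B}} = \left(\frac{\mu_0}{2\pi}\,\frac{i}{R}\right)\widehat{d\mathbf{s}} \times \hat{\mathbf{R}}$$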


The Biot-Savart law can also be applied to situations of lower symmetry
than that of an infinitely long wire. Example 23-9 illustrates such a
case.


EXAMPLE 23-9

Figure 23-21 shows a circular loop of radius $k$, made of fine copper wire. The loop is
broken at one point, and two closely spaced parallel wires are joined to the broken


Fig. 23-21 Diagram for Example
23-9, in which the Biot-Savart law
is used to find the magnetic field $\boldsymbol{\mathcal{B}}$
at a point on the axis of a
current-carrying loop.




ends,  so  that  a current  i can  be  made  to  flow  around  the  loop  in  a nearly  complete 
circle.  Use  the  Biot-Savart  law  to  find  the  magnetic  field  at  a point  along  the  axis  of 
the  loop,  a distance  z from  its  center. 

■ You  note  first  that  the  contribution  to  the  magnetic  field  of  the  pair  of  lead 
wires  can  be  ignored,  at  least  to  a first  approximation.  The  two  wires  carry  equal 
currents  in  opposite  directions,  and  they  are  nearly  coincident  with  each  other.  The 
contributions  of  the  two  wires  to  the  total  magnetic  field  at  any  point  will  thus  nearly 
cancel,  and  only  the  circular  loop  need  be  considered. 

Just  as  was  done  for  the  straight  wire  of  Example  23-8,  you  divide  the  loop 
into  infinitesimal  segments  ds.  One  such  segment  is  shown  in  Fig.  23-21.  The  point 
P on  the  axis  of  the  loop  is  located  at  the  end  of  the  vector  r,  which  originates  at  ds. 
Regardless of the location of $P$, $\mathbf{r}$ is perpendicular to $d\mathbf{s}$. However, it makes an angle
$\phi$ with the $z$ axis, which is perpendicular to the plane of the loop.

According to the Biot-Savart law, the contribution $d\boldsymbol{\mathcal{B}}$ which the segment $d\mathbf{s}$
makes to the field $\boldsymbol{\mathcal{B}}$ at $P$ must be directed perpendicular to both $d\mathbf{s}$ and $\mathbf{r}$. Thus the
vector $d\boldsymbol{\mathcal{B}}$ is tilted above the plane which passes through $P$ and is parallel to the
plane of the loop. The tilt angle is $\phi$. You can see this by comparing the direction of
$d\boldsymbol{\mathcal{B}}$ with that of the dashed line $xx'$, which is parallel to the radius $k$.

You can describe the vector $d\boldsymbol{\mathcal{B}}$ by its two components. One of them, $d\mathcal{B}_z$, is the $z$
component. The other, $d\mathcal{B}_\perp$, is the component in the direction $xx'$, perpendicular to
the $z$ axis. You now use a symmetry argument to show that the contributions $d\mathcal{B}_\perp$ of
all the segments comprising the loop add to zero. The argument is the same as that
used in Example 20-7 to find the electric field of an electrically charged loop.
Consider the segment $d\mathbf{s}'$ which lies opposite $d\mathbf{s}$ on the loop. Its contribution $d\boldsymbol{\mathcal{B}}'$ at $P$
will also be directed at an angle $\phi$ above the line $xx'$. It is evident from the diagram
that the component $d\mathcal{B}'_\perp$ is a signed scalar equal in magnitude but opposite in sign to
the component $d\mathcal{B}_\perp$. Thus the two components will cancel. However, the $z$ components
$d\mathcal{B}'_z$ and $d\mathcal{B}_z$ will be equal in magnitude and will have the same direction along
the $z$ axis. Thus they will add as signed scalars of the same sign. The same argument
applies to every pair of opposite segments comprising the loop. Therefore the total
field vector $\boldsymbol{\mathcal{B}}$ lies along the $z$ axis, and its magnitude is the algebraic sum of all the
contributions $d\mathcal{B}_z$.

The figure shows that each of the contributions $d\mathcal{B}_z$ can be written

$$d\mathcal{B}_z = d\mathcal{B}\,\sin\phi$$

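And you can use the identity $d\mathbf{s} = \widehat{d\mathbf{s}}\,ds$ to write the Biot-Savart law in the form

$$d\boldsymbol{\mathcal{B}} = \frac{\mu_0}{4\pi}\,\frac{i}{r^2}\,d\mathbf{s} \times \hat{\mathbf{r}} = \frac{\mu_0}{4\pi}\,\frac{i\,ds}{r^2}\,\widehat{d\mathbf{s}} \times \hat{\mathbf{r}}$$

You thus have

$$d\mathcal{B}_z = \frac{\mu_0}{4\pi}\,\frac{i\,ds}{r^2}\,\sin\phi\,|\widehat{d\mathbf{s}} \times \hat{\mathbf{r}}| = \frac{\mu_0}{4\pi}\,\frac{i\,ds}{r^2}\,\sin\phi\,(\sin 90^\circ) = \frac{\mu_0 i\,ds\,\sin\phi}{4\pi r^2}$$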


To find $\mathcal{B}$, the magnitude of the total magnetic field vector, you integrate around the
loop and obtain

$$\mathcal{B} = \int_{\text{loop}} \frac{\mu_0 i \sin\phi}{4\pi r^2}\,ds = \frac{\mu_0 i \sin\phi}{4\pi r^2} \int_{\text{loop}} ds$$


But integrating the length element $ds$ around the loop just gives you the circumference
$2\pi k$ of the loop, so you have

$$\mathcal{B} = \frac{\mu_0 i \sin\phi}{4\pi r^2}\,2\pi k = \frac{\mu_0 i k \sin\phi}{2 r^2} \tag{23-43}$$


You can evaluate $\sin\phi$ and $r^2$ in terms of $z$ and $k$. You have

$$r^2 = z^2 + k^2 \qquad \text{and} \qquad \sin\phi = \frac{k}{r} = \frac{k}{(z^2 + k^2)^{1/2}}$$




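Thus you can rewrite Eq. (23-43) in the form

$$\mathcal{B} = \frac{\mu_0 i k^2}{2}\,\frac{1}{(z^2 + k^2)^{3/2}} \tag{23-44a}$$

Taking the direction of $\boldsymbol{\mathcal{B}}$ explicitly into account, you have

$$\boldsymbol{\mathcal{B}} = \frac{\mu_0 i k^2}{2}\,\frac{1}{(z^2 + k^2)^{3/2}}\,\hat{\mathbf{z}} \tag{23-44b}$$

where $\hat{\mathbf{z}}$ is a unit vector in the direction of the positive $z$ axis.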
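The result of Example 23-9 can be checked by summing the Biot-Savart contributions of many short segments around the loop. This is a minimal sketch: the function names, the segment count, and the numerical values of $i$, $k$, and $z$ are assumptions chosen for illustration, since the example itself is worked symbolically.

```python
import math

MU_0 = 4e-7 * math.pi  # T*m/A, Eq. (23-38b)

def loop_axis_field_numeric(i, k, z, n_segments=1000):
    """Sum the z components of dB from short tangent segments of a loop of radius k."""
    d_alpha = 2 * math.pi / n_segments
    bz = 0.0
    for n in range(n_segments):
        alpha = (n + 0.5) * d_alpha
        # Segment position on the loop and its tangent vector ds
        # (current counterclockwise as seen from +z).
        px, py = k * math.cos(alpha), k * math.sin(alpha)
        dsx, dsy = -k * math.sin(alpha) * d_alpha, k * math.cos(alpha) * d_alpha
        # Vector r from the segment to the axis point P = (0, 0, z).
        rx, ry, rz = -px, -py, z
        r_mag = math.sqrt(rx * rx + ry * ry + rz * rz)
        # z component of ds x r_hat, scaled as in Eq. (23-38a).
        bz += MU_0 / (4 * math.pi) * i / r_mag**2 * (dsx * ry - dsy * rx) / r_mag
    return bz

def loop_axis_field_exact(i, k, z):
    """Closed form of Eq. (23-44a)."""
    return MU_0 * i * k**2 / (2 * (z**2 + k**2) ** 1.5)

# Assumed values: i = 15 A, k = 5 cm, z = 10 cm.
numeric = loop_axis_field_numeric(15.0, 0.05, 0.10)
exact = loop_axis_field_exact(15.0, 0.05, 0.10)
print(numeric, exact)
```

Only the $z$ components are summed, since the symmetry argument of the example shows that the perpendicular components cancel in pairs.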


It is instructive to consider the result obtained in Example 23-9 for the
case of axial distances from the loop which are large compared to the loop
radius, so that $z \gg k$. In such a case, $(z^2 + k^2)^{3/2} \simeq z^3$, and Eq. (23-44b)
simplifies to the form

$$\boldsymbol{\mathcal{B}} = \frac{\mu_0 i k^2}{2 z^3}\,\hat{\mathbf{z}} \tag{23-45}$$


The dependence of $\boldsymbol{\mathcal{B}}$ on $z^{-3}$ is reminiscent of the behavior of the electric
field $\boldsymbol{\mathcal{E}}$ of an electric dipole at sufficiently great distances from the dipole.
(It can be shown that the inverse-cube behavior of the magnetic field holds
at points not on the axis of the loop as well, but we do not do so here.) In
order to make a direct comparison between the electric and magnetic
fields, we can use Eqs. (21-29) to express $\boldsymbol{\mathcal{E}}$ at a distant point on the $z$ axis.
The result is

$$\boldsymbol{\mathcal{E}} = \mathcal{E}_z\,\hat{\mathbf{z}} = \frac{1}{2\pi\epsilon_0}\,\frac{p}{z^3}\,\hat{\mathbf{z}} = \frac{1}{2\pi\epsilon_0}\,\frac{\mathbf{p}}{z^3} \tag{23-46a}$$


Here the electric dipole moment $\mathbf{p}$ is defined by the equation

$$\mathbf{p} = |q|\,2d\,\hat{\mathbf{z}} \tag{23-46b}$$

where $|q|$ is the magnitude of each of the two opposite electric charges comprising
the dipole and $d\,\hat{\mathbf{z}} = \mathbf{d}_+$ is the vector in Eq. (21-31) extending from
the center of the dipole to the positive charge. The comparison between
the electric and magnetic cases is most striking if we define a magnetic
quantity analogous to the electric dipole moment. To do so, let us deliberately
rewrite Eq. (23-45) so that it looks just like Eq. (23-46a). Instead of the
electric dipole moment $\mathbf{p}$, we invent an analogous quantity $\mathbf{m}$, which we call
the magnetic dipole moment. Copying Eq. (23-46a), but substituting $\mathbf{m}$ for
$\mathbf{p}$, $\boldsymbol{\mathcal{B}}$ for $\boldsymbol{\mathcal{E}}$, and $\mu_0/2\pi$ for $1/2\pi\epsilon_0$, we have

$$\boldsymbol{\mathcal{B}} = \frac{\mu_0}{2\pi}\,\frac{\mathbf{m}}{z^3} \tag{23-47a}$$

What must the quantity $\mathbf{m}$ be? Comparing Eq. (23-47a) with Eq. (23-45)
gives

$$\mathbf{m} = i(\pi k^2)\,\hat{\mathbf{z}}$$

And since $\pi k^2$ is equal to the area $a$ enclosed by the current-carrying circular
loop, the magnetic dipole moment can be written in the form

$$\mathbf{m} = i a\,\hat{\mathbf{z}} \tag{23-47b}$$




In Sec. 24-3, it is shown that this definition is valid regardless of the shape
of the current loop. There is thus a close analogy between the fields of current
loop and electric dipole for the case $z \gg k$. This analogy holds as well
for off-axis points whose distance from the current loop is large compared
to the characteristic dimension $k$.
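How large $z/k$ must be before the dipole form becomes a good approximation can be seen by comparing Eq. (23-47a) with the exact on-axis result, Eq. (23-44a). The sketch below does so for several ratios; the function names and the values of $i$ and $k$ are assumptions for illustration.

```python
import math

MU_0 = 4e-7 * math.pi  # T*m/A, Eq. (23-38b)

def loop_axis_field(i, k, z):
    """Exact on-axis field magnitude of a circular loop, Eq. (23-44a)."""
    return MU_0 * i * k**2 / (2 * (z**2 + k**2) ** 1.5)

def dipole_axis_field(m, z):
    """Dipole form of the on-axis field magnitude, Eq. (23-47a)."""
    return MU_0 / (2 * math.pi) * m / z**3

i, k = 15.0, 0.05           # assumed current (A) and loop radius (m)
m = i * math.pi * k**2      # magnetic dipole moment magnitude, Eq. (23-47b)

for ratio in (1, 3, 10, 30, 100):
    z = ratio * k
    print(ratio, dipole_axis_field(m, z) / loop_axis_field(i, k, z))
```

The printed ratio equals $(1 + k^2/z^2)^{3/2}$, so the dipole form overestimates the field by a factor of about 2.8 at $z = k$ and by about 1.5 percent at $z = 10k$.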

It is important to remember that this close analogy between electric and magnetic
systems is valid only at large distances. At the origin, to the contrary, the
fields of the electric and magnetic systems are not only different in magnitude, but
also opposite in direction! To show this, write Eq. (23-44b) for the case $z = 0$. You
have

$$\boldsymbol{\mathcal{B}} = \frac{\mu_0 i}{2k}\,\hat{\mathbf{z}} = \frac{\mu_0}{2\pi}\,\frac{\mathbf{m}}{k^3}$$

The electric field found by applying Coulomb’s law at the point midway between
the two charges of an electric dipole is

$$\boldsymbol{\mathcal{E}} = -\frac{|q|}{2\pi\epsilon_0 d^2}\,\hat{\mathbf{z}} = -\frac{1}{2\pi\epsilon_0}\,\frac{\mathbf{p}}{2d^3}$$

The field magnitudes differ by a factor of 2, relative to the fields at distant points.
The opposite directions of the two fields are consistent with the observations we
made in Sec. 23-1 concerning the magnetic field inside a bar magnet.

In  Sec.  24-3,  you  will  see  that  a current-carrying  loop  behaves  exactly  like  a 
bar  magnet  having  the  same  magnetic  dipole  moment  and  oriented  along  the  axis 
of  the  loop. 

23-6 AMPERE’S LAW

The Biot-Savart law is the magnetic analogue of the electric Coulomb’s law.
Like Coulomb’s law, it is perfectly general. It can be used to find the magnetic
field vector $\boldsymbol{\mathcal{B}}$ at any point in the vicinity of any steady electric-current
configuration, no matter how twisted the path of the current may be and
even if there are several independent currents. But just as in the case of
Coulomb’s law, the calculations can become quite difficult. Indeed, the
mathematical difficulties are usually greater than those arising in
Coulomb’s-law calculations, owing to the presence of the vector product in
the Biot-Savart law. In complicated cases, numerical integration becomes
the preferable (or the only possible) approach.

To  continue  the  path  of  reasoning  by  analogy  with  the  electric  case,  it 
appears  desirable  to  develop  a magnetic  analogue  of  Gauss’  law.  While 
Gauss’  law  is  logically  equivalent  to  Coulomb’s  law,  it  makes  the  calculation 
of  electric  fields  remarkably  easy  in  cases  of  high  enough  symmetry.  Even 
more  importantly,  Gauss’  law  provides  a new  and  valuable  viewpoint  for 
the  understanding  of  electric  fields. 

Our  argument  by  analogy  proceeds  in  two  steps.  The  first  step,  based 
directly  on  the  reasoning  which  led  to  Gauss’  law  for  electric  fields,  leads  to 
an  important  statement  of  the  properties  of  magnetic  fields,  called  Gauss’ 
law  for  magnetic  fields.  This  law,  however,  is  not  useful  for  calculation  and 
does not lead to further insights into the connection between electric currents
and magnetic fields. The second step, based on a somewhat freer
analogy,  leads  to  the  desired  law,  called  Ampere’s  law. 

Gauss’  law  for  electric  fields  is  given  by  Eqs.  (20-36)  and  (20-37),  which 
can  be  rewritten  slightly  in  the  form 

$$\epsilon_0 \int_{\text{closed surface}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = \epsilon_0 \Phi_e = q \tag{23-48}$$


Fig. 23-22 Diagram for the definition
of the magnetic flux $\Phi_m$. Shown is a surface
element of infinitesimal area $da$.
The surface element is represented by a
vector $d\mathbf{a}$ whose magnitude is $da$ and
whose direction $\widehat{d\mathbf{a}}$ is normal to the surface.
(Because the element is infinitesimal,
it is essentially planar and the
normal direction is essentially the same
over the entire element.) The element is
part of a closed surface and the direction
$\widehat{d\mathbf{a}}$ is chosen outward. The surface
element is penetrated by a magnetic
field whose local value is $\boldsymbol{\mathcal{B}}$. (Again because
of the infinitesimal extent of the
element, the magnitude and direction
of $\boldsymbol{\mathcal{B}}$ are constant over the element.)
The angle between $\boldsymbol{\mathcal{B}}$ and $d\mathbf{a}$ is $\theta$. The
magnetic flux penetrating the element
is $d\Phi_m = \boldsymbol{\mathcal{B}} \cdot d\mathbf{a} = \mathcal{B} \cos\theta\,da$.


Here $\Phi_e$ is the total electric flux penetrating outward from the gaussian
surface which encloses a region of space, and $q$ is the total electric charge
contained in that region. In analogy to the electric flux, we will define a
magnetic flux $\Phi_m$. We begin by defining an infinitesimal magnetic flux element
$d\Phi_m$. In Fig. 23-22, the vector $d\mathbf{a}$ represents an infinitesimal surface
element of area $da$. The orientation of the element is specified by the direction
of $d\mathbf{a}$, as explained in the figure caption. The magnetic field vector $\boldsymbol{\mathcal{B}}$ at
the location of $d\mathbf{a}$ is shown. The infinitesimal magnetic flux element $d\Phi_m$ is
defined as the dot product of $\boldsymbol{\mathcal{B}}$ and $d\mathbf{a}$:

$$d\Phi_m = \boldsymbol{\mathcal{B}} \cdot d\mathbf{a} = \mathcal{B} \cos\theta\,da \tag{23-49}$$

where $\theta$ is the smaller angle between the two vectors. This is completely
analogous to the definition, given in Eq. (20-29), of the electric flux element
$d\Phi_e$:

$$d\Phi_e = \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = \mathcal{E} \cos\theta\,da \tag{23-50}$$

The total magnetic flux penetrating a surface is found by adding by integration
the contributions of the magnetic flux elements over the surface.
In particular, for a closed gaussian surface, the magnetic flux is given by

$$\Phi_m = \int_{\text{closed surface}} d\Phi_m = \int_{\text{closed surface}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{a} \tag{23-51a}$$

According to Eq. (23-49), the SI unit of magnetic flux must be teslas times
square meters. This unit is called the weber (Wb), so that

$$1\ \text{Wb} = 1\ \text{T}\cdot\text{m}^2 \tag{23-51b}$$

The weber is named after the German physicist W. E. Weber (1804-1891),
who was the first to define a unit of electric current and one of the first to measure
experimentally the quantity which in modern units is written $\mu_0\epsilon_0$.

Let us apply Eq. (23-51a) to the relatively simple case of the magnetic
field in the vicinity of a very long, straight, current-carrying wire. That is,
let us evaluate the total magnetic flux $\Phi_m$ penetrating a gaussian surface
surrounding the wire. In Fig. 23-23, a cylindrical surface of length $L$ and
radius $R$ has been drawn with its axis along the wire. An electric current $i$
flows in the wire in the sense specified by the unit vector $\widehat{d\mathbf{s}}$. The magnetic
field lines are circles lying in planes perpendicular to the wire and surrounding
it. At a point on the curved surface of the cylinder whose position
with respect to the wire is specified by the perpendicular vector $\mathbf{R}$, Eq.
(23-42b) gives the magnetic field as

$$\boldsymbol{\mathcal{B}} = \frac{\mu_0 i}{2\pi R}\,\widehat{d\mathbf{s}} \times \hat{\mathbf{R}} \tag{23-52}$$

Thus $\boldsymbol{\mathcal{B}}$ is perpendicular to $\mathbf{R}$ and lies on the surface. But the surface element
vector $d\mathbf{a}$ at that point is parallel to $\mathbf{R}$ and therefore perpendicular to
$\boldsymbol{\mathcal{B}}$. It follows that for any surface element $d\mathbf{a}$ on the curved surface

$$\boldsymbol{\mathcal{B}} \cdot d\mathbf{a} = 0 \tag{23-53}$$

A similar argument can be made for a surface element $d\mathbf{a}$ located on one of
the flat ends of the gaussian cylinder. In Fig. 23-23 a circular magnetic field
line is shown at this surface. The magnetic field vector $\boldsymbol{\mathcal{B}}$ at any point on
the flat end lies in the surface, perpendicular to the wire. But the surface
element vector $d\mathbf{a}$, being parallel to the wire, is perpendicular to $\boldsymbol{\mathcal{B}}$. Hence
again $\boldsymbol{\mathcal{B}} \cdot d\mathbf{a} = 0$.

Thus for the entire cylindrical gaussian surface, Eq. (23-51a) gives

$$\Phi_m = \int_{\text{closed surface}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{a} = 0 \tag{23-54}$$

The total magnetic flux penetrating a closed surface is zero. This is Gauss’ law for
magnetic fields. It is equivalent to the statement that there are no magnetic
charges within the cylinder. (To see this, consider how Gauss’ law for electric
fields relates the electric flux penetrating a gaussian surface to the total
electric charge enclosed within it.) Gauss’ law for magnetic fields is consistent
with the more general fact that magnetic monopoles have never been
observed. It is expressed, however, in terms of the fact that unlike electric
field lines, which begin at source charges, magnetic field lines are always
closed curves.

To point this out and to generalize Eq. (23-54) to less symmetric cases,
we recalculate the magnetic flux penetrating a gaussian surface surrounding
the long, straight wire, but with the gaussian surface distorted
into an arbitrary shape. Figure 23-24a is a perspective view of the arrangement,
while Fig. 23-24b is a view of a perpendicular cross section through
the wire, as seen from the end of the wire. The magnetic field lines are still
circular, but the gaussian surface does not have a circular cross section. A
typical field line is shown. It penetrates the surface at four places. At these
locations, $d\Phi_m = \boldsymbol{\mathcal{B}} \cdot d\mathbf{a}$ is not equal to zero. But here we can use the power
of the concept underlying Gauss’ law to avoid a laborious integration.
While the field line penetrates the gaussian surface four times, it does so
twice outward and twice inward. Thus the net number of outward penetrations
is zero. This result of zero net outward penetrations holds for any
closed field line penetrating any closed surface. Since field lines represent
flux, there is no net penetration of magnetic flux into or out of the closed
gaussian surface; all the magnetic flux that “comes out” also “goes back in.”
So the total flux $\Phi_m$ penetrating an arbitrary gaussian surface is zero, and
Gauss’ law of magnetism is a direct consequence of the fact that magnetic
field lines are always closed curves.
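The cancellation of outward and inward flux can be illustrated numerically for one non-cylindrical surface. The sketch below takes a rectangular box, placed off-center around a long wire lying along the $x$ axis, and sums $\boldsymbol{\mathcal{B}} \cdot d\mathbf{a}$ over its four side faces using the field of Eq. (23-52). The two end faces are omitted because the field has no $x$ component and so contributes no flux there. The box dimensions, function names, and step count are assumptions made for this illustration.

```python
import math

MU_0 = 4e-7 * math.pi  # T*m/A, Eq. (23-38b)

def b_field(i, y, z):
    """(By, Bz) at (y, z) for a long wire along the x axis carrying current i in +x."""
    rho2 = y * y + z * z
    c = MU_0 * i / (2 * math.pi * rho2)
    return (-c * z, c * y)

def box_side_fluxes(i, length, y0, y1, z0, z1, n=2000):
    """Outward flux (Wb) through the four side faces of a box enclosing the wire."""
    dy = (y1 - y0) / n
    dz = (z1 - z0) / n
    flux_y1 = flux_y0 = flux_z1 = flux_z0 = 0.0
    for j in range(n):
        z = z0 + (j + 0.5) * dz
        y = y0 + (j + 0.5) * dy
        flux_y1 += b_field(i, y1, z)[0] * dz * length   # outward normal +y
        flux_y0 -= b_field(i, y0, z)[0] * dz * length   # outward normal -y
        flux_z1 += b_field(i, y, z1)[1] * dy * length   # outward normal +z
        flux_z0 -= b_field(i, y, z0)[1] * dy * length   # outward normal -z
    return flux_y1, flux_y0, flux_z1, flux_z0

# Assumed box: 10 cm long, spanning y from -2 cm to 5 cm and z from -3 cm to 4 cm.
fluxes = box_side_fluxes(15.0, 0.10, -0.02, 0.05, -0.03, 0.04)
total = sum(fluxes)
print(fluxes, total)
```

Each face carries a nonzero flux, but the four values sum to zero to within the accuracy of the numerical integration.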


Fig. 23-24 Demonstration of Gauss’ law for magnetic
fields for a more general case. Here the magnetic
flux penetrating the surface is not zero
everywhere, as it is in Fig. 23-23. Nevertheless, the
net flux $\Phi_m$ penetrating the entire surface is zero.
(a) Perspective view of an irregular gaussian surface
enclosing a section of a long, straight wire. (b)
Cross-sectional view of the surface in part a.


Gauss’ law for magnetic fields is far from trivial, since it states a very
important physical observation. But it is of little use in making magnetic-field
calculations. We therefore search for an alternative. We have just seen
that it cannot be completely parallel to Gauss’ law for electric fields. But
such an alternative should have some resemblance to Gauss’ law for electric
fields, in the sense that it should have the general form

$$\int_{\text{closed geometric figure}} \boldsymbol{\mathcal{B}} \cdot d(\text{geometric quantity}) \propto i \tag{23-55}$$


Fig. 23-25 End view of a long, straight
wire carrying current $i$ into the page. A
circular path of radius $R$ is shown encircling
the wire. The sense of the path,
shown by the arrowhead, is chosen in
accordance with the right-hand rule for
magnetic field lines of Fig. 23-17, so that
it will be the same as the sense of the
magnetic field encircling the wire. An
infinitesimal segment of the path, specified
by the vector $d\mathbf{l}$, is shown at a location
specified by the vector $\mathbf{R}$. By symmetry
and because of the way in which
the sense of the path (and hence the
positive direction for $d\mathbf{l}$) have been
chosen, the magnetic field $\boldsymbol{\mathcal{B}}$ is everywhere
parallel to $d\mathbf{l}$.


Compare  this  general  form  with  that  of  Eq.  (23-48),  Gauss’  law  for  electric 
fields. 

A strong  hint  as  to  how  to  make  the  general  mathematical  relation  of 
Eq.  (23-55)  more  specific  can  be  found  in  Eq.  (23-52), 

$$\boldsymbol{\mathcal{B}} = \frac{\mu_0 i}{2\pi R}\,\hat{\mathbf{s}} \times \hat{\mathbf{R}}$$

This  equation  relates  the  magnetic  field  in  the  vicinity  of  a long,  straight 
wire  to  the  current  flowing  through  the  wire.  Considering  magnitudes  only 
and multiplying both sides of the equation by the quantity $2\pi R$, we have

$$\mathcal{B}(2\pi R) = \mu_0 i \tag{23-56}$$

The quantity $2\pi R$ is simply the circumference of the circular path
surrounding the wire at a distance R. As R increases, this circumference
increases in direct proportion. But the magnitude of the magnetic field
decreases, since $\mathcal{B} \propto R^{-1}$. Thus the product on the left side
of Eq. (23-56) remains constant and equal to $\mu_0 i$. Figure 23-25 is an end
view of the wire, showing a particular circular path encircling it at a distance
R. The sense of the path is chosen in the manner described in the caption. The
path is divided into infinitesimal segments $d\mathbf{l}$, one of which is
shown. The magnetic field vector $\boldsymbol{\mathcal{B}}$ at the location of
that segment is also shown. Because of the symmetry of the situation,
$\boldsymbol{\mathcal{B}}$ is parallel to $d\mathbf{l}$ and has the same
magnitude everywhere along the path. That is, we have chosen a particular path
around the wire which coincides with a field line. Consequently,
$\boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \mathcal{B}\,dl$, and since
$\mathcal{B}$ has the same value everywhere on the circular path, we can write

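$$\oint_{\text{circular path of radius } R} \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \oint_{\text{circular path of radius } R} \mathcal{B}\,dl = \mathcal{B} \oint_{\text{circular path of radius } R} dl$$

But the last integral is just the circumference $2\pi R$ of the circular path. So
we have

$$\oint_{\text{circular path of radius } R} \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \mathcal{B}(2\pi R)$$

Hence, for this particular path of high symmetry, Eq. (23-56) can be
written

$$\oint_{\text{circular path of radius } R} \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \mu_0 i \tag{23-57}$$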

Comparing  this  equation  with  the  proportionality  in  Eq.  (23-55)  shows  that 
we are on the right track. Equation (23-57) certainly does relate the magnetic
field vector $\boldsymbol{\mathcal{B}}$ to the current i for this very special case. Can Eq.
(23-57)  be  generalized  so  that  it  applies  to  an  arbitrary  path  surrounding  an 
arbitrary  array  of  current-carrying  wires? 

We  made  just  such  a generalization  in  developing  Gauss’  law  for  electric 
fields.  We  began  with  a sphere  surrounding  a single  point  charge  and  showed 
that  Gauss’  law  was  valid.  Then  we  generalized  to  a gaussian  surface  of  arbitrary 
shape  enclosing  an  arbitrary  distribution  of  charges,  and  we  showed  that  Gauss’ 
law  applied  to  the  general  case  as  well. 

As a step toward generalizing Eq. (23-57), consider the arrangement
shown in Fig. 23-26a. Again, this is an end-on view of a long, straight,
current-carrying wire. This time, however, the closed path chosen around
the wire is an arbitrary one. The path cuts across field lines, some of which
are shown. Figure 23-26b shows a part of the arbitrary path which contains
a particular infinitesimal element $d\mathbf{l}$. The distance of this element
from the wire is r. The angle between $d\mathbf{l}$ and the field
$\boldsymbol{\mathcal{B}}$ at that location is $\theta$. Thus we have


$$\boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \mathcal{B} \cos\theta \, dl$$

Now the component ds of the vector $d\mathbf{l}$ along a direction tangent to the
circle of radius r centered at the wire has the value

$$ds = dl \cos\theta$$

But ds can also be expressed in terms of $d\phi$, the angle subtended by the
element $d\mathbf{l}$ at the wire. From Fig. 23-26b we have $ds = r\,d\phi$. Thus
we can write

$$dl \cos\theta = r\,d\phi$$

so that

$$\boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \mathcal{B}\,r\,d\phi$$


Fig. 23-26 Evaluating the magnetic circulation along a path of arbitrary shape
enclosing a long, straight, current-carrying wire. (a) The electric current is
into the page, and the magnetic field lines are circular, as in Fig. 23-25. The
sense of the magnetic field is again related to the sense of the current by the
right-hand rule for magnetic fields. The sense of the path is chosen to be the
same as that of the magnetic field. (b) A short segment of the path shown above.
The diagram is explained in the text.


Substituting the value $\mathcal{B} = \mu_0 i / 2\pi r$ into this expression yields

$$\boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \frac{\mu_0 i}{2\pi r}\, r\, d\phi = \frac{\mu_0 i}{2\pi}\, d\phi \tag{23-58}$$


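We can now integrate both sides of this equation around the arbitrary
path of Fig. 23-26a. Regardless of the shape of the path, it goes once
around the wire, so that the angle $\phi$ goes from 0 to $2\pi$ rad. Thus
integration of both sides of Eq. (23-58) yields

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \frac{\mu_0 i}{2\pi} \int_0^{2\pi} d\phi = \frac{\mu_0 i}{2\pi}\, 2\pi$$

or

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \mu_0 i \tag{23-59}$$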


This  is  exactly  the  same  result  as  Eq.  (23-57),  which  was  obtained  for  the 
special  case  of  a circular  path.  Equation  (23-59)  is  called  Ampere’s  law.  The 
quantity  on  the  left  side  of  the  equation, 

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{l}$$

is called the magnetic circulation, or the circulation for short where
there is no chance of confusion. Ampere’s law, the fact that the circulation
is equal to $\mu_0 i$, is the essential relation between the magnetic field and the
associated electric current: The magnetic circulation around a closed curve is
equal to $\mu_0$ times the electric current penetrating it. The closed curve around
which the circulation is calculated is often called an amperean curve, in
analogy to a gaussian surface. In order to simplify calculation, the
amperean curve is usually chosen in such a way that $d\mathbf{l}$ is everywhere
either parallel or perpendicular to the magnetic field $\boldsymbol{\mathcal{B}}$.
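Ampere's law for a path of arbitrary shape can be checked numerically. The following is only a sketch under stated assumptions: the wire lies along the z axis with its current toward +z, the path is traversed counterclockwise (the sense given by the right-hand rule), and the path shape `lumpy_path`, the current value, and all function names are hypothetical choices made for illustration.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, T*m/A


def wire_field(x, y, current):
    """Field (Bx, By) at (x, y) of a long straight wire on the z axis.

    The current flows toward +z, so the field lines circle counterclockwise
    with magnitude mu0*i/(2*pi*r).
    """
    r_squared = x * x + y * y
    factor = MU0 * current / (2.0 * math.pi * r_squared)
    return -factor * y, factor * x


def lumpy_path(t, cx, cy):
    """Hypothetical non-circular closed path, counterclockwise for t in [0, 1]."""
    angle = 2.0 * math.pi * t
    radius = 1.0 + 0.4 * math.cos(3.0 * angle)
    return cx + radius * math.cos(angle), cy + radius * math.sin(angle)


def circulation(cx, cy, current, steps=20000):
    """Midpoint-rule estimate of the closed line integral of B . dl."""
    total = 0.0
    x0, y0 = lumpy_path(0.0, cx, cy)
    for n in range(1, steps + 1):
        x1, y1 = lumpy_path(n / steps, cx, cy)
        bx, by = wire_field(0.5 * (x0 + x1), 0.5 * (y0 + y1), current)
        total += bx * (x1 - x0) + by * (y1 - y0)
        x0, y0 = x1, y1
    return total


current = 5.0  # amperes (illustrative value)
enclosing = circulation(0.3, -0.2, current)  # path centered so the wire is inside
outside = circulation(3.0, 0.0, current)     # path centered so the wire is outside
print(enclosing / (MU0 * current))  # close to 1
print(outside / (MU0 * current))    # close to 0
```

Within the accuracy of the numerical integration, the circulation equals $\mu_0 i$ when the path encircles the wire and vanishes when it does not, whatever the shape of the path.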

The  analogous  electric  circulation, 

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{l}$$

is  always  equal  to  zero  for  steady  fields.  This  is  a direct  consequence  of  the  fact 
that  the  electric  force  is  a conservative  force.  If  a test  charge  q is  taken  around  a 
closed  curve  back  to  its  starting  point  in  an  electric  field  which  does  not  change  in 
time,  the  very  fact  that  an  electric  potential  can  be  defined  requires  that  the  total 
work  done  be  zero: 

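$$W = \oint_{\text{closed curve}} dW = \oint_{\text{closed curve}} \mathbf{F} \cdot d\mathbf{l} = q \oint_{\text{closed curve}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{l} = 0$$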

Thus  the  electric  circulation  is  zero.  Since  this  is  not  true  for  the  magnetic  circula- 
tion, a quantity  analogous  to  the  electric  potential  V cannot  be  defined  for  a mag- 
netic field.  To  put  it  another  way,  the  magnetic  force  is  not  a conservative  force. 
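The vanishing of the electric circulation can be illustrated for the simplest steady field, that of a single point charge. This is a minimal sketch, assuming a charge at the origin and an elliptical path in the plane of the charge; the path, the charge value, and the function names are hypothetical.

```python
import math

K = 8.9875517923e9  # Coulomb constant 1/(4*pi*eps0), N*m^2/C^2


def point_charge_field(x, y, q):
    """Electric field (Ex, Ey) at (x, y) of a point charge q at the origin."""
    r_cubed = (x * x + y * y) ** 1.5
    return K * q * x / r_cubed, K * q * y / r_cubed


def electric_circulation(q, cx, cy, steps=20000):
    """Midpoint-rule line integral of E . dl around an ellipse centered at (cx, cy)."""

    def point(n):
        angle = 2.0 * math.pi * n / steps
        return cx + 2.0 * math.cos(angle), cy + 1.0 * math.sin(angle)

    total = 0.0
    x0, y0 = point(0)
    for n in range(1, steps + 1):
        x1, y1 = point(n)
        ex, ey = point_charge_field(0.5 * (x0 + x1), 0.5 * (y0 + y1), q)
        total += ex * (x1 - x0) + ey * (y1 - y0)
        x0, y0 = x1, y1
    return total


q = 1.0e-9  # coulombs (illustrative value)
around_charge = electric_circulation(q, 0.5, 0.2)  # ellipse encircles the charge
away_from_charge = electric_circulation(q, 5.0, 0.0)  # ellipse does not
# Compare with K*q, the potential (in volts) at 1 m from the charge.
print(around_charge / (K * q))     # close to 0
print(away_from_charge / (K * q))  # close to 0
```

In contrast to the magnetic case, the result is zero (to within numerical error) whether or not the path encircles the source.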

The  above  development  of  Ampere’s  law,  Eq.  (23-59),  and  Gauss’  law 
for  magnetism,  Eq.  (23-54),  depends  on  assuming  the  validity  of  the 
Biot-Savart  law,  Eq.  (23-38a).  Taken  together,  Ampere’s  law  and  Gauss’ 
law  for  magnetism  are  logically  equivalent  to  the  Biot-Savart  law.  Thus  they 
are  valid  if  the  Biot-Savart  law  is  valid,  and  vice  versa.  We  have  discussed 
the  underlying  experimental  verification  only  for  the  case  of  a current  in  a 
long,  straight  wire.  As  is  usually  the  case  with  such  fundamental  laws,  how- 
ever, enormous  amounts  of  experimental  evidence,  direct  and  indirect, 
have  built  up  over  the  years,  and  the  validity  of  these  laws  for  steady  cur- 
rents is  established  with  a very  high  degree  of  confidence.  Example  23-10 
illustrates  the  connection  between  the  Biot-Savart  law  and  Ampere’s  law. 


EXAMPLE 23-10

In Example 23-9, you found that the magnetic field along the axis of a
current-carrying circular loop of radius k at a distance z from the center of
the loop is given by Eq. (23-44b),

$$\mathcal{B} = \frac{\mu_0 i}{2} \frac{k^2}{(z^2 + k^2)^{3/2}}$$

Show  that  Ampere’s  law  is  obeyed  for  a closed  path  like  that  in  Fig.  23-27,  which  in- 
cludes the  axis  of  the  loop  from  a point  at  a very  great  distance  below  the  loop,  to  a 
point  at  a very  great  distance  above  it. 

■ In order to close the amperean curve, you must choose a return path outside
the current-carrying loop. According to Eq. (23-45), the magnetic field on the
loop axis decreases in proportion to $z^{-3}$ at large distances from the loop.
The off-axis field also decreases very rapidly with distance. Thus if you choose
the return path so that all parts of it are very far from the current-carrying
loop, the magnitude $\mathcal{B}$ of the magnetic field will be negligible
everywhere along it. So you choose an amperean curve something like that shown
as abcda in Fig. 23-27. You choose the sense of integration around this curve to
be the same as the sense of the magnetic circulation, as explained in the
caption.

Fig. 23-27 Diagram for Example 23-10. Since the sense of the current in the loop
is as shown, the right-hand rule for magnetic field lines gives the sense of the
magnetic circulation to be that shown by the dashed curve. (This is not the
actual shape of a magnetic field line, but only the sense is important.) The
amperean curve abcda is chosen as described in the text. The sense of traversal
around it is taken to be the same as that of the magnetic circulation. The
dashed segments of the amperean curve indicate that each of its sides is of
arbitrarily great length.

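Along the part of the amperean curve from a via b and c to d, the integral
$\int \boldsymbol{\mathcal{B}} \cdot d\mathbf{l}$ will have negligible value. Thus you can write

$$\int_a^b \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} + \int_b^c \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} + \int_c^d \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = 0$$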


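Along the part of the amperean curve from d to a, you have

$$\int_d^a \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \int_{z=-\infty}^{z=\infty} \frac{\mu_0 i}{2} \frac{k^2}{(z^2 + k^2)^{3/2}}\, dz = \frac{\mu_0 i k^2}{2} \int_{-\infty}^{\infty} \frac{dz}{(z^2 + k^2)^{3/2}}$$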


The integral on the far right is the same one evaluated in Example 23-8. Its value is
$2/k^2$, so you have

$$\int_d^a \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \frac{\mu_0 i k^2}{2} \frac{2}{k^2} = \mu_0 i$$


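The magnetic circulation around the entire curve abcda is found by adding the
parts. You obtain

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \int_d^a \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} + \int_{\text{rest of path}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \mu_0 i + 0 = \mu_0 i$$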


which  conforms  to  Ampere’s  law. 
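The axial integral in this example can also be evaluated numerically, which shows how quickly the result approaches $\mu_0 i$ as the path along the axis is lengthened. The sketch below assumes illustrative values for the current and the loop radius; the function names are hypothetical, and the field is taken from Eq. (23-44b).

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, T*m/A


def axial_field(z, current, k):
    """On-axis field of a circular loop of radius k, Eq. (23-44b)."""
    return 0.5 * MU0 * current * k * k / (z * z + k * k) ** 1.5


def axis_integral(current, k, half_length, steps=200000):
    """Midpoint-rule integral of B dz along the axis from -half_length to +half_length."""
    dz = 2.0 * half_length / steps
    total = 0.0
    for n in range(steps):
        z = -half_length + (n + 0.5) * dz
        total += axial_field(z, current, k)
    return total * dz


current = 2.0  # amperes (illustrative value)
k = 0.05       # loop radius in meters (illustrative value)

ratios = {}
for multiple in (1, 10, 1000):
    ratios[multiple] = axis_integral(current, k, multiple * k) / (MU0 * current)
    print(multiple, ratios[multiple])
```

For a path extending a distance L on either side of the loop, the exact value of the ratio is $L/\sqrt{L^2 + k^2}$, so the printed ratios rise toward 1 as the path grows, in agreement with the limit taken in the example.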


Ampere’s  law  is  valid  not  only  for  the  magnetic  circulation  evaluated 
along  a path  enclosing  a single,  straight  current-carrying  wire,  but  also  for 
the  magnetic  circulation  evaluated  along  a path  enclosing  any  number  of 
wires,  straight  or  not.  To  see  this,  note  that  the  magnetic  circulation  given 
by  Ampere’s  law  depends  only  on  the  magnitude  and  sense  of  the  electric 
current  encircled  by  the  amperean  curve,  and  not  on  the  particular  loca- 
tion within  the  curve  at  which  it  penetrates.  Consequently,  what  is  signifi- 
cant is  only  the  net  current  (that  is,  the  net  electric  charge  per  unit  time) 
“threading  through”  the  amperean  curve  (as  a thread  passes  through  the 
eye  of  a needle)  in  the  sense  that  is  consistent,  according  to  the  right-hand 
rule  for  magnetic  fields,  with  the  sense  of  the  circulation  around  the  curve. 
The  net  current  can  be  evaluated  by  adding  all  the  currents  threading 
through  the  curve  in  this  sense  and  subtracting  all  the  currents  threading 
through  in  the  opposite  sense. 

Indeed,  it  need  not  be  a wire  which  is  carrying  the  electric  current 
through  the  amperean  curve.  The  current  may  be  carried  by  positive 
charges  moving  in  the  sense  of  i,  or  by  negative  charges  moving  in  the  oppo- 
site sense,  or  both  at  the  same  time.  The  concentration  of  the  charge  car- 
riers and  their  drift  speeds  are  irrelevant.  The  current  density  need  not  be 
uniform,  nor  need  the  current  lines  be  mutually  parallel.  All  that  counts  is 
the  net  current  encircled  by  the  amperean  curve.  Several  examples  are 
shown  in  Fig.  23-28. 
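The bookkeeping of senses described above amounts to a signed sum. A minimal sketch follows, with hypothetical function names and illustrative current values; each current is tagged +1 if it threads the curve in the sense given by the right-hand rule and -1 if it threads in the opposite sense.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, T*m/A


def net_current(threading):
    """Net current through an amperean curve.

    `threading` is a list of (magnitude_in_amperes, sense) pairs, where sense
    is +1 if the current threads the curve in the sense tied to the traversal
    direction by the right-hand rule and -1 if it threads the opposite way.
    Currents that do not pass through the curve are simply left out.
    """
    return sum(magnitude * sense for magnitude, sense in threading)


def circulation(threading):
    """Magnetic circulation mu0 * (net current) predicted by Ampere's law."""
    return MU0 * net_current(threading)


# Two conductors of opposite sense, with illustrative values 3 A and 5 A.
two_wires = [(3.0, +1), (5.0, -1)]
print(net_current(two_wires))  # -2.0
print(circulation(two_wires))  # negative: the circulation is opposite to the traversal sense
print(circulation([]))         # 0.0, since no current threads the curve
```

A negative result simply means that the actual sense of the circulation is opposite to the sense chosen for traversing the curve.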




Fig. 23-28 The use of Ampere’s law to evaluate the magnetic circulation around
amperean curves penetrated by various electric-current configurations. (a) The
elementary case, involving a single wire carrying current i. In this and
subsequent cases, the right-hand rule relating the sense of the magnetic field
lines to the sense of the current is used to establish the sense of the
circulation. The direction of $d\mathbf{l}$ is chosen to be consistent with the
latter sense. (b) Two conductors carry currents of opposite sense through the
amperean curve. The sense of $i_1$ has been chosen to be positive. The actual
sense of the circulation cannot be known unless the magnitudes of $i_1$ and
$i_2$ are known. The sense shown is correct if $i_1 > i_2$. But if $i_2 > i_1$,
the equation shown in the diagram will have negative terms on both sides. It is
therefore correct regardless of the magnitudes of $i_1$ and $i_2$. (If you
choose the sense of $i_2$ to be positive, how must you write Ampere’s law for
this case?) (c) The circulation is zero because no current is enclosed by the
amperean curve. (The current shown is outside the curve.) (d) A complicated
case. The sense of $i_1$ has been chosen to be positive. The way in which the
various wires cross one another is irrelevant. Only the sense in which they
carry current through the amperean curve is significant.

In order to make explicit the point of the preceding paragraph, we
rephrase Ampere’s law in a more general way to take into account the
possibility that the current threading through an amperean curve need not be
either uniform in density or carried by one or more wires. In doing so, we
will make rigorous the intuitive notion of a current “threading through” an
amperean curve. Figure 23-29 shows an amperean curve encircling a
region through which an arbitrary current is flowing. For the purposes of
this argument, we assume that the current density outside the tube of flow
shown is everywhere zero. However, the current density within the tube of
flow varies in both magnitude and direction from point to point. But since
the current is assumed to be steady, the current density $\mathbf{j}$ at any
given location remains constant.

Fig. 23-29 A tube of flow of electric current i whose sense is indicated.
Typical current lines are shown within the tube, and it is assumed that the
current density outside the tube is everywhere zero. Within the tube, the
current density at any particular location is denoted by the vector
$\mathbf{j}$. An amperean curve lying in a single plane encircles the tube of
flow and encloses the planar surface 1 of area $a_1$ and also an arbitrary
surface 2 of area $a_2$. Infinitesimal surface elements on the respective
surfaces are denoted by the vectors $d\mathbf{a}_1$ and $d\mathbf{a}_2$.

If the amperean curve lies in a single plane, we can draw the plane
surface of area $a_1$ which is enclosed by the curve. As we have done before, we
divide the surface into infinitesimal area elements $da_1$ and associate with
each element a vector $d\mathbf{a}_1$ whose magnitude is numerically equal to the
area and whose direction is normal to the area element and in the sense of the
current. (Since the surface is planar, the direction of $d\mathbf{a}_1$ is the
same for all the area elements comprising the surface. However, the current
density $\mathbf{j}$ is not, in general, the same everywhere on the surface and
is not, in general, parallel to $d\mathbf{a}_1$.)

The current passing through an element $d\mathbf{a}_1$ is defined by Eq. (22-28).
With the slight change in notation necessary to apply that equation to the
present case, it becomes

$$di_1 = \mathbf{j} \cdot d\mathbf{a}_1$$

The total current i which threads through the amperean curve is found by
evaluating the integral that gives the total current penetrating surface 1,
which is enclosed by the amperean curve. That integral is

$$i = \int_{\text{enclosed surface}} di_1 = \int_{\text{enclosed surface}} \mathbf{j} \cdot d\mathbf{a}_1$$

But surface 1 is not the only surface which is enclosed by the amperean
curve. It is possible to draw an infinite number of such surfaces. One of
them is surface 2, also shown in Fig. 23-29. The total current i which
threads through the amperean curve is also identical to the total current
penetrating surface 2. You can see this intuitively by noting that every
current line which penetrates surface 1 must also penetrate surface 2, since
current lines do not terminate in the case of steady, continuous flow.
Another somewhat more rigorous way to see the same thing is to make an
argument similar to that underlying Gauss’ law. Surface 1 and surface 2,
taken together, make up a closed surface. Flow of current into the enclosed
volume takes place through surface 1 only, and flow out of the enclosed
volume takes place through surface 2 only. But current neither appears
nor disappears within the enclosed volume. Thus the total current through
surface 2 must have the same magnitude as that through surface 1. If we
specify a typical surface element of surface 2 by the vector $d\mathbf{a}_2$, the
current $di_2$ flowing through that element must be

$$di_2 = \mathbf{j} \cdot d\mathbf{a}_2$$

where $\mathbf{j}$ is the current density vector evaluated at the location of
$d\mathbf{a}_2$. The total current penetrating surface 2 is then given by

$$i = \int_{\text{enclosed surface}} di_2 = \int_{\text{enclosed surface}} \mathbf{j} \cdot d\mathbf{a}_2$$

And according to the argument we have just made, the total current i
threading through the amperean curve can be found by evaluating the
integral of $\mathbf{j} \cdot d\mathbf{a}$ over any surface enclosed by the
curve. There is no need for the surface to be planar. Consequently, our initial
assumption that the amperean curve itself must be planar is not essential to
the argument and can be dropped. We can therefore write, for the current i
threading along any path through any amperean curve,

$$i = \int_{\text{enclosed surface}} di = \int_{\text{enclosed surface}} \mathbf{j} \cdot d\mathbf{a}$$

where the integral can be taken over any surface enclosed by the amperean
curve.

We  now  apply  this  result  to  writing  Ampere’s  law  in  a more  general 
form.  Substituting  the  value  of  i given  immediately  above  into  Eq.  (23-59), 
we  have 


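$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \mu_0 \int_{\text{enclosed surface}} \mathbf{j} \cdot d\mathbf{a} \tag{23-60}$$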

This  form  of  Ampere’s  law  will  be  of  great  importance  in  a further  gener- 
alization carried  out  in  Chap.  27. 
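The claim that the integral of $\mathbf{j} \cdot d\mathbf{a}$ gives the same current for any surface enclosed by the amperean curve can be tried out on a simple case. The sketch below is built on assumptions that are not in the text: a straight tube of flow of radius `a` with a hypothetical axial current density that falls off parabolically from the axis, a circular amperean curve of radius `big_r`, and two enclosed surfaces, a flat disk and a hemisphere sharing that circle as their rim.

```python
import math


def current_density(rho, j0, a):
    """Hypothetical axial current density j0*(1 - rho^2/a^2) inside the tube, zero outside."""
    return j0 * (1.0 - (rho / a) ** 2) if rho < a else 0.0


def current_through_disk(j0, a, big_r, steps=100000):
    """Integrate j . da over the flat disk of radius big_r (da parallel to the axis)."""
    d_rho = big_r / steps
    total = 0.0
    for n in range(steps):
        rho = (n + 0.5) * d_rho
        total += current_density(rho, j0, a) * 2.0 * math.pi * rho * d_rho
    return total


def current_through_hemisphere(j0, a, big_r, steps=100000):
    """Integrate j . da over the hemisphere of radius big_r with the same rim."""
    d_theta = 0.5 * math.pi / steps
    total = 0.0
    for n in range(steps):
        theta = (n + 0.5) * d_theta          # polar angle measured from the axis
        rho = big_r * math.sin(theta)        # distance of the ring from the axis
        ring_area = 2.0 * math.pi * big_r ** 2 * math.sin(theta) * d_theta
        # j is axial and the surface normal makes the angle theta with the axis.
        total += current_density(rho, j0, a) * math.cos(theta) * ring_area
    return total


j0, a, big_r = 1.0e6, 0.01, 0.03  # A/m^2, m, m (illustrative values)
i_disk = current_through_disk(j0, a, big_r)
i_hemisphere = current_through_hemisphere(j0, a, big_r)
i_exact = 0.5 * math.pi * j0 * a ** 2  # closed-form integral of the assumed density
print(i_disk, i_hemisphere, i_exact)
```

The two surface integrals agree with each other and with the closed-form value, as the argument in the text requires for steady flow.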

In Eq. (23-60), the directions of the vectors $\boldsymbol{\mathcal{B}}$ and
$\mathbf{j}$ are those that concern us fundamentally. But the sense chosen to
traverse the closed amperean curve determines which of the two directions along
the curve is taken to be the direction of a vector $d\mathbf{l}$. This choice
must be consistent with the choice of which of the two directions normal to the
enclosed surface is taken to be the direction of a vector $d\mathbf{a}$. (Here
we cannot say that the direction for $d\mathbf{a}$ is chosen to be the outward
normal direction, as we do in Gauss’ law, because here the surface of which the
$d\mathbf{a}$ are elements does not completely surround a region of space and so
outward has no meaning.) Once the sense of traversal has been chosen, the
following right-hand rule yields a consistent specification of the direction of
each surface element vector. Place your right hand so that the fingers curl in
the sense of traversal of a closed curve. The thumb then points in the sense
specifying which of the two directions normal to a surface enclosed by the curve
is that of each of its surface element vectors $d\mathbf{a}$. This is the
right-hand rule for surface element vectors.
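For a flat surface bounded by a polygon, the right-hand rule for surface element vectors can be expressed as a short calculation. The sketch below uses the standard vector-area formula, half the sum of the cross products of successive vertex position vectors; the function name and the square used as an example are hypothetical.

```python
def vector_area(vertices):
    """Vector area of a closed polygon whose vertices are listed in order of traversal.

    Computes (1/2) * sum over i of r_i x r_(i+1). For a planar polygon the
    result is normal to the plane, has magnitude equal to the enclosed area,
    and points along the thumb of a right hand whose fingers curl in the
    sense of traversal.
    """
    ax = ay = az = 0.0
    count = len(vertices)
    for i in range(count):
        x0, y0, z0 = vertices[i]
        x1, y1, z1 = vertices[(i + 1) % count]
        ax += y0 * z1 - z0 * y1
        ay += z0 * x1 - x0 * z1
        az += x0 * y1 - y0 * x1
    return 0.5 * ax, 0.5 * ay, 0.5 * az


# Unit square in the xy plane, traversed counterclockwise as seen from +z.
square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
normal_ccw = vector_area(square)
normal_cw = vector_area(list(reversed(square)))
print(normal_ccw)  # points along +z
print(normal_cw)   # points along -z
```

Reversing the sense of traversal reverses the direction assigned to every surface element vector, which is why the two choices in Eq. (23-60) must be made consistently.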

It  is  extremely  important  to  note  that  a steady  current  (that  is,  a cur- 
rent which  does  not  vary  with  time)  can  flow  only  in  a closed  circuit. 
Regardless  of  the  variations  from  part  to  part  of  the  circuit  in  the  way  in 
which  the  charge  is  transported — changes  in  carrier  concentration,  charge 
and  sign,  drift  speed,  and  current  density — the  current  i is  the  same  every- 
where in  the  circuit  if  it  is  a steady  current,  without  accumulation  of  charge 
anywhere  in  the  circuit.  Therefore,  the  circulation  evaluated  along  all  closed 
curves  encircling  the  circuit  is  the  same.  This  is  illustrated  in  Fig.  23-30. 

To  conclude  this  section,  we  summarize  the  four  independent  laws 
which  govern  the  behavior  of  steady  electric  and  magnetic  fields.  These 
comprise  Gauss’  laws  and  the  circulation  laws  (one  of  which  is  Ampere’s 
law): 


Gauss’ law for electric fields:

$$\oint_{\text{closed surface}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = \frac{q}{\epsilon_0} \tag{23-61a}$$

Gauss’ law for magnetic fields:

$$\oint_{\text{closed surface}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{a} = 0 \tag{23-61b}$$

Electric fields are conservative in presence of steady magnetic fields:

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{l} = 0 \tag{23-61c}$$

Ampere’s law for currents in presence of steady electric fields:

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \mu_0 i \tag{23-61d}$$

Fig. 23-30 A hypothetical electric circuit in which the mode of charge
transport varies from place to place. Nevertheless, the circulation
$\oint_{\text{closed curve}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{l}$ is the
same around all the curves shown and is equal in each case to $\mu_0 i$.


Taken  together,  these  four  equations  are  called  Maxwell’s  equations  for 
the  restricted  case  of  steady  fields.  They  comprise  everything  funda- 
mental that  can  be  said  about  steady  fields  (fields  which  do  not  vary  with 
time)  as  seen  on  a macroscopic  scale  by  an  observer  who  does  not  move  with 
respect  to  the  apparatus  producing  the  magnetic  field.  In  Sec.  24-2,  we  con- 
sider the  situation  from  the  point  of  view  of  an  observer  who  does  move.  In 
Chaps.  25  through  27  we  study  the  effects  of  electric  and  magnetic  fields 
which  vary  in  time. 


23-7 APPLICATIONS OF AMPERE’S LAW

Just as Gauss’ law is a very powerful tool for calculating the electric fields
produced by symmetric charge distributions, Ampere’s law can be used
to calculate the magnetic fields produced by symmetric current distributions.
The archetypal application of Ampere’s law is to the solenoid.

A solenoid is a closely wound helix of many turns of wire, as shown in
Fig. 23-31. Each turn approximates a circular turn, even though it does not
lie exactly in a single plane. A long solenoid, having many turns, is
something like a large number of circular loops stacked together. At points on
the axis of the solenoid, the magnetic field contribution of each single loop
is directed along the axis. It follows that the direction of the total magnetic
field $\boldsymbol{\mathcal{B}}$ at any point on the axis must be parallel to
the axis. But it is also true that the field vector $\boldsymbol{\mathcal{B}}$
is parallel to the solenoid axis at any interior point, provided the solenoid
is very long, and we restrict our measurements to points not too close to the
ends. The proof, which is essentially a symmetry argument, is as follows.
Consider the arbitrary interior point P shown in Fig. 23-32. A particular loop
of the many that comprise the solenoid, called loop 1, is located a distance z
to the left of P. Loop 1 makes a contribution $d\boldsymbol{\mathcal{B}}_1$ to
the magnetic field at P, as shown. There is a second point P' a distance z to
the left of loop 1 and the same distance from the axis as P. At P', the loop
makes a contribution $d\boldsymbol{\mathcal{B}}_1'$ to the magnetic field. By
symmetry, the magnitudes $d\mathcal{B}_1$ and $d\mathcal{B}_1'$ are equal. Also
by symmetry, the angles $\delta$ and $\delta'$ which
$d\boldsymbol{\mathcal{B}}_1$ and $d\boldsymbol{\mathcal{B}}_1'$, respectively,
make with the axis are equal in magnitude, but have opposite senses.

Fig. 23-31 (a) A solenoid. (b) Cross-sectional view of a solenoid of radius k,
showing the direction of the magnetic field $\boldsymbol{\mathcal{B}}$ on its
axis. The current i is conventionally represented by a dot (•), suggesting the
head of an arrow, where it comes out of the page. Similarly, it is represented
by a cross (×), suggesting the tail feathers of the arrow, where it goes into
the page.

Next,  consider  the  loop  which  lies  a distance  z to  the  right  of  P,  which 
we  call  loop  2.  Since  it  is  identical  to  loop  1 in  every  way  except  location,  its 
field  line  pattern  must  be  identical  to  that  of  loop  1 except  that  it  is  shifted  a 
distance 2z to the right. Thus the magnetic field contribution
$d\boldsymbol{\mathcal{B}}_2$ of loop 2 at P is the same as the contribution
$d\boldsymbol{\mathcal{B}}_1'$ of loop 1 at P'. The net contributions of loops 1
and 2 at P can be found by adding the vectors $d\boldsymbol{\mathcal{B}}_1$ and
$d\boldsymbol{\mathcal{B}}_2$. Their components along a direction parallel to
the axis add, but their components along a direction perpendicular to the axis
cancel.
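The cancellation claimed for the pair of loops can be checked by direct numerical summation of the Biot-Savart law over each loop. This is a sketch under assumed values: the loop radius, the current, the position of P, and the function name `loop_field` are all hypothetical, and each loop is approximated by many short straight segments.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, T*m/A


def loop_field(px, pz, loop_z, current, k, segments=2000):
    """Field components (perpendicular, axial) at the point (px, 0, pz).

    The loop has radius k, is centered on the z axis at z = loop_z, and lies
    in a plane perpendicular to that axis. The Biot-Savart contributions
    dl x r / r^3 of short segments are summed around the loop.
    """
    sum_x = sum_z = 0.0
    d_phi = 2.0 * math.pi / segments
    for n in range(segments):
        phi = (n + 0.5) * d_phi
        # Position of the segment and its length vector dl (dl has no z part).
        sx, sy = k * math.cos(phi), k * math.sin(phi)
        dlx, dly = -k * math.sin(phi) * d_phi, k * math.cos(phi) * d_phi
        # Vector r from the segment to the field point.
        rx, ry, rz = px - sx, -sy, pz - loop_z
        r_cubed = (rx * rx + ry * ry + rz * rz) ** 1.5
        sum_x += dly * rz / r_cubed
        sum_z += (dlx * ry - dly * rx) / r_cubed
    factor = MU0 * current / (4.0 * math.pi)
    return factor * sum_x, factor * sum_z


k = 0.02        # loop radius in meters (illustrative value)
current = 1.5   # amperes (illustrative value)
px = 0.6 * k    # P lies off the axis, inside the solenoid
d = 0.5 * k     # loop 1 is a distance d to one side of P, loop 2 the same distance to the other

perp_1, axial_1 = loop_field(px, 0.0, -d, current, k)
perp_2, axial_2 = loop_field(px, 0.0, +d, current, k)
print(perp_1, perp_2)    # equal in magnitude, opposite in sign
print(axial_1, axial_2)  # equal, so they add
```

Each loop alone produces a nonzero perpendicular component at the off-axis point, but the two perpendicular components cancel while the axial components reinforce.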

For  every  loop  to  the  left  of  P there  is  a loop  an  equal  distance  to  the 
right.  Thus,  if  the  solenoid  is  infinitely  long,  each  loop  can  be  paired  with 
another  one  in  such  a way  that  the  contribution  of  each  pair  to  the  total 
magnetic  field  is  parallel  to  the  solenoid  axis.  In  Fig.  23-32,  the  point  P is 
shown  inside  the  solenoid.  However,  the  same  argument  holds  for  points 
outside  the  solenoid.  For  real  solenoids,  having  finite  length,  the  argument 
fails  near  the  ends.  Can  you  explain  why?  Indeed,  it  must  fail,  since  there 
has  to  be  some  way  for  the  field  lines  inside  the  solenoid  to  return  outside 
the  solenoid  and  form  closed  curves.  The  field  outside  the  solenoid  must  be 
directed  oppositely  to  that  inside,  since  the  field  lines  form  closed  curves.  It 
must  be  weaker  than  the  field  inside,  since  the  magnetic  flux 

$$\Phi_m = \int_{\text{cross-sectional area inside solenoid}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{a}$$

which  passes  through  the  limited  cross-sectional  area  inside  the  solenoid  in 
one  direction  returns  through  the  unlimited  cross-sectional  area  available 
outside.  Iron  filings  are  used  to  visualize  the  field  lines  of  an  actual  solenoid 
in  Fig.  23-33.  (It  was  Ampere  who  invented  the  solenoid.  He  also  gave  it  its 
name,  which  is  derived  from  the  Greek  word  meaning  “channel”;  the  inte- 
rior of  the  solenoid  may  be  thought  of  as  a channel  for  magnetic  field 
lines.) 


Fig. 23-32 Illustration to demonstrate that at any interior point P
far from the ends of a solenoid, the magnetic field $\boldsymbol{\mathcal{B}}$ is parallel to
the solenoid axis. The argument is given in the text.




Fig.  23-33  In  this  photo,  a small  solenoid  has 
been  wound  through  holes  in  a flat  card,  so  that 
the  axis  of  the  solenoid  lies  in  the  plane  of  the 
card.  A current  is  made  to  flow  through  the  so- 
lenoid, and  iron  filings  are  sprinkled  on  the  card 
in  order  to  visualize  the  magnetic  field  lines,  as  in 
Fig.  23-6a  and  c and  Fig.  23-16 b.  Note  the  highly 
concentrated,  parallel  field  lines  inside  the  so- 
lenoid where  the  field  is  relatively  strong  and  par- 
allel to  the  solenoid  axis.  The  field  lines  spread  out 
at  the  ends  of  the  solenoid.  Along  the  external  re- 
turn paths,  the  total  flux  is  spread  out  over  a much 
larger  region,  and  the  field  is  much  weaker.  As  a 
consequence,  the  external  part  of  the  field  can  be 
seen  only  indistinctly  or  not  at  all.  In  spite  of  the 
fact  that  the  solenoid  turns  are  not  tightly  packed, 
the  field  between  adjacent  turns  is  so  weak  that  no 
iron  filings  adhere. 


With  these  qualitative  ideas  in  mind,  let  us  apply  Ampere’s  law  to  the 
system.  We  calculate  magnetic  circulation  along  the  rectangular  amperean 
curve  shown  in  Fig.  23-34.  It  is  given  by 


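$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \int_a^b \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} + \int_b^c \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} + \int_c^d \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} + \int_d^a \boldsymbol{\mathcal{B}} \cdot d\mathbf{l}$$

Fig. 23-34 Diagram for applying Ampere’s law to a long solenoid.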

The path ab lies along a field line, and $\mathcal{B}$ is constant along the path. Thus, if
the distance from a to b is z, the first integral has the value

$$\int_a^b \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \int_a^b \mathcal{B}\,dl = \mathcal{B} \int_a^b dl = \mathcal{B}z$$


The length of paths bc and da is not important. As long as the ends of the
solenoid are far away, the magnetic field both inside and outside the
solenoid will be parallel to the solenoid axis and thus perpendicular to the
paths. Thus we have $\boldsymbol{\mathcal{B}}$ perpendicular to $d\mathbf{l}$, so that

$$\int_b^c \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \int_d^a \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = 0$$


We have already argued that the field outside the solenoid is weak.
Moreover, we can choose to have path cd as far away from the solenoid as we
wish and thus ensure that the magnitude of the field $\mathcal{B}$ is negligible. The
integral along path cd thus gives

$$\int_c^d \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = 0$$


Adding  the  contributions  of  the  four  parts  of  the  closed  rectangular  curve, 
we  obtain  the  circulation 

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \mathcal{B}z$$

Now  we  must  evaluate  the  total  current  flowing  through  the  closed 
curve.  Each  turn  carries  a current  i.  If  there  are  n turns  per  unit  length  of 
the  solenoid,  the  number  of  turns  passing  through  the  closed  curve  is  nz. 
The current sense is the same for all the turns, so that the net current I
enclosed by the curve is


I = nzi 



Using Ampere’s law to equate the circulation to $\mu_0 I$ gives

$$\mathcal{B}z = \mu_0 n z i$$

or

$$\mathcal{B} = \mu_0 n i \tag{23-62}$$

Thus we have used Ampere’s law to evaluate the magnetic field inside
a very long solenoid. We have done so by following essentially the reverse
of the procedure employed in Example 23-10, where we verified Ampere’s
law for the axial magnetic field of a single loop. There we used the expression
for the magnetic field derived by means of the Biot-Savart law. In the
present case of relatively high symmetry, however, the mathematical labor
involved in using Ampere’s law is much less than that in the Biot-Savart
calculation. In this calculational sense, Ampere’s law has much the same virtue
as Gauss’ law for electric fields.
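The comparison between the two approaches can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes the standard on-axis field of a single circular loop, $\mu_0 i a^2 / 2(a^2 + z^2)^{3/2}$, sums that field over evenly spaced turns, and compares the total at the midpoint with the Ampere's-law value $\mu_0 n i$. The function names and the sample values (radius, current, turn density) are illustrative assumptions.

```python
import math

MU0 = 4 * math.pi * 1e-7  # permeability of free space, T*m/A


def loop_axial_field(current, radius, z):
    """On-axis field (T) of one circular loop, a distance z from its center."""
    return MU0 * current * radius**2 / (2 * (radius**2 + z**2) ** 1.5)


def solenoid_center_field(current, radius, length, turns):
    """Field (T) at the midpoint of a solenoid, summed loop by loop."""
    spacing = length / turns
    total = 0.0
    for k in range(turns):
        # Axial position of turn k, measured from the midpoint.
        z = -length / 2 + (k + 0.5) * spacing
        total += loop_axial_field(current, radius, z)
    return total


def ampere_field(current, turns_per_meter):
    """Long-solenoid field (T) from Eq. (23-62)."""
    return MU0 * turns_per_meter * current


# Illustrative values (assumptions): 1 A, 1 cm radius, 1000 turns per meter.
current, radius, n = 1.0, 0.01, 1000
for length in (0.02, 0.1, 1.0):
    turns = round(n * length)
    ratio = solenoid_center_field(current, radius, length, turns) / ampere_field(current, n)
    print(f"length/radius = {length / radius:5.0f}   summed/Ampere = {ratio:.4f}")
```

The ratio approaches 1 as the solenoid becomes long compared with its radius, which is the "very long solenoid" condition under which Eq. (23-62) was derived.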

The most significant fact about the result expressed in Eq. (23-62) is
that it is independent of the location of the part of the amperean curve
inside the solenoid. That is, the magnetic field inside a long solenoid is uniform and
independent of the radius of the solenoid. This uniformity often is
exploited in experiments requiring a uniform magnetic field. A practical
solenoid is discussed in Example 23-11.


EXAMPLE 23-11

You want to wind a solenoid having a length Z = 15.0 cm and a diameter D = 3.00
cm. You plan to use a single layer of no. 36 copper wire (diameter d = 0.127 mm)
wound as closely as possible. At a temperature of 75°C, the resistance per unit
length of no. 36 copper wire is r = 1.66 Ω/m. In order to make sure the temperature
does not rise above 75°C, you plan, as a rough guess, to restrict the power
dissipation of the solenoid, through Joule heating, to 250 W. What is the maximum
magnetic field you can plan on inside the solenoid? For this field, what will be the
current i through the winding, and what voltage will you have to apply across the
terminals?

■ The maximum magnetic field depends on the maximum allowable electric current
i and the number n of turns per meter. But i depends on the maximum allowable
power P and the total resistance R of the wire. And the resistance of the wire
depends on its length L. So you begin by calculating L. The length of wire in each
turn is $\pi D$. If the total number of turns is N, you have

$$L = \pi D N = \pi D n Z$$

Next, you note that since adjacent turns of wire are in contact, the number of turns
per meter is equal to the reciprocal of the wire diameter; that is,

$$n = \frac{1}{d}$$

So the length of the wire is

$$L = \frac{\pi D Z}{d}$$

and its resistance is

$$R = rL = \frac{\pi r D Z}{d}$$




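If you call the maximum allowable power dissipation P, you have, from Eq.
(22-51a), $P = i^2 R$. Solving for i gives

$$i = \left(\frac{P}{R}\right)^{1/2} = \left(\frac{Pd}{\pi r D Z}\right)^{1/2} \tag{23-63}$$

Using Eq. (23-62), you find the magnetic field to be

$$\mathcal{B} = \mu_0 n i = \mu_0 \frac{i}{d} = \mu_0 \left(\frac{P}{\pi r d D Z}\right)^{1/2} \tag{23-64}$$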


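Using the numerical values given, you calculate

$$\mathcal{B} = 4\pi \times 10^{-7}\ \text{T}\cdot\text{m/A} \times \left(\frac{250\ \text{W}}{\pi \times 1.66\ \Omega/\text{m} \times 1.27 \times 10^{-4}\ \text{m} \times 3.00 \times 10^{-2}\ \text{m} \times 0.150\ \text{m}}\right)^{1/2}$$

or

$$\mathcal{B} = 1.15 \times 10^{-2}\ \text{T}$$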

This is a modest magnetic field, but one in which many experiments can be done.
You can use Eq. (23-63) or (23-64) to calculate the current. Using the latter, you
have

$$i = \frac{\mathcal{B}d}{\mu_0} = \frac{1.15 \times 10^{-2}\ \text{T} \times 1.27 \times 10^{-4}\ \text{m}}{4\pi \times 10^{-7}\ \text{T}\cdot\text{m/A}} = 1.16\ \text{A}$$

Since you already know i and P, the voltage is most easily calculated by using Joule’s
law:

$$V = \frac{P}{i} = \frac{250\ \text{W}}{1.16\ \text{A}} = 215\ \text{V}$$
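The chain of steps in this example (wire length, resistance, current, field, voltage) can be retraced numerically. The sketch below is one possible arrangement of that arithmetic under the example's own assumptions (close-wound single layer, all 250 W dissipated in the winding); the function name `design_solenoid` and its argument names are hypothetical and are not from the text.

```python
import math

MU0 = 4 * math.pi * 1e-7  # permeability of free space, T*m/A


def design_solenoid(length, diameter, wire_diameter, ohms_per_meter, max_power):
    """Retrace the steps of the example for a close-wound, single-layer solenoid."""
    turns_per_meter = 1.0 / wire_diameter                 # n = 1/d
    wire_length = math.pi * diameter * turns_per_meter * length   # L = pi*D*n*Z
    resistance = ohms_per_meter * wire_length             # R = r*L
    current = math.sqrt(max_power / resistance)           # from P = i^2 R
    field = MU0 * turns_per_meter * current               # Eq. (23-62)
    voltage = max_power / current                         # from P = V*i
    return {
        "turns_per_meter": turns_per_meter,
        "resistance_ohm": resistance,
        "current_A": current,
        "field_T": field,
        "voltage_V": voltage,
    }


# Values quoted in the example, converted to SI units.
result = design_solenoid(
    length=0.150,
    diameter=3.00e-2,
    wire_diameter=1.27e-4,
    ohms_per_meter=1.66,
    max_power=250.0,
)
for name, value in result.items():
    print(f"{name:16s} {value:.4g}")
```

Working from the resistance rather than from the closed-form Eq. (23-64) keeps each intermediate quantity visible, which makes it easier to see how a change in wire gauge or power limit propagates to the field.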


If in imagination you take a solenoid and bend it into a circle so that
the two ends meet, the resulting structure is called a toroid. Such a toroid is
shown in Fig. 23-35. Ideally, the magnetic field line pattern of a toroid is
very simple. The field lines which emerge from one “end” immediately
reenter the other “end” without the necessity of passing through any
intervening space. Thus the magnetic field of an ideal toroid lies entirely inside
the hollow core. In Example 23-12, you will see another application of
Ampere’s law to the calculation of magnetic field, in this case, the field of a
toroid.

Fig. 23-35 A toroid.

Fig. 23-36 (a) A solenoid of elliptical cross section. (b) A solenoid of
rectangular cross section. (c) A toroid of elliptical cross section. The
equations relating the magnetic field inside a solenoid or a toroid to the
current flowing through the windings are derived in a manner which has
nothing to do with the cross-sectional shape of the solenoid or toroid. The
internal field is independent of the cross-sectional shape, provided only
that the shape is uniform.

EXAMPLE 23-12


A solenoid of length Z having N turns is bent into a toroid. Evaluate the magnetic
field inside the toroid when a current i is flowing through the windings.

■ The first problem in applying Ampere’s law is to decide how to draw the
amperean curve along which the circulation is to be calculated. As usual, you try to
simplify the ensuing calculation by drawing it in such a way that it is everywhere either
parallel or perpendicular to the magnetic field lines. Since the field lines of the
toroid are circular, this is especially easy in this case. You choose for the amperean
curve a circle of radius R which lies inside the core of the toroid, as shown in Fig.
23-35. (This circle need not coincide with the axial circle of the toroid, which is also
shown.) Ampere’s law gives you

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \mathcal{B} \oint_{\text{closed curve}} dl = \mathcal{B}(2\pi R) = \mu_0 I$$

where  I is  the  total  current  enclosed  by  the  curve.  There  are  N turns  of  wire  in  the 
toroid,  and  each  one  is  threaded  by  the  amperean  curve  in  such  a way  that  its  inside 
part,  where  the  current  i is  flowing  from  below  the  plane  of  the  page  to  above  it,  is 
within  the  curve.  The  outside  parts  of  the  turns  are  not  within  the  amperean  curve. 
They therefore make no contribution to the total current I enclosed by the
amperean curve. So you have

I = Ni 

Combining this equation with the previous one yields

$$\mathcal{B}(2\pi R) = \mu_0 N i$$

or

$$\mathcal{B} = \frac{\mu_0}{2\pi} \frac{Ni}{R} \tag{23-65}$$


According to Eq. (23-65), the magnetic field inside the core of a toroid is not
uniform, but varies inversely with the distance from the center of the toroid, labeled
C in Fig. 23-35. Can you show that the value of $\mathcal{B}$ along the axis of the toroid (shown
as a broken line in Fig. 23-35) is equal to that of the solenoid from which the
toroid was formed in imagination?
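One way to explore the question just posed is numerically. The sketch below assumes that the axial circle of the toroid has a circumference equal to the solenoid length Z, so that its radius is $Z/2\pi$ and the turn density is $n = N/Z$; under that assumption Eq. (23-65) reduces to $\mu_0 N i / Z = \mu_0 n i$, which is Eq. (23-62). The function names and the sample values are illustrative assumptions, not values from the text.

```python
import math

MU0 = 4 * math.pi * 1e-7  # permeability of free space, T*m/A


def toroid_field(turns, current, radius):
    """Field (T) inside a toroid at distance `radius` from its center, Eq. (23-65)."""
    return MU0 * turns * current / (2 * math.pi * radius)


def solenoid_field(turns_per_meter, current):
    """Field (T) inside a long solenoid, Eq. (23-62)."""
    return MU0 * turns_per_meter * current


# Illustrative values (assumptions): a 0.30 m solenoid of 600 turns carrying 2.0 A.
length, turns, current = 0.30, 600, 2.0

# Assumed geometry: the axial circle has circumference equal to the solenoid length.
axis_radius = length / (2 * math.pi)

b_axis = toroid_field(turns, current, axis_radius)
b_solenoid = solenoid_field(turns / length, current)
print(f"toroid field on the axial circle: {b_axis:.6e} T")
print(f"field of the original solenoid:   {b_solenoid:.6e} T")

# The 1/R dependence: doubling the distance from the center halves the field.
print(f"field at twice the axial radius:  {toroid_field(turns, current, 2 * axis_radius):.6e} T")
```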


So far, we have discussed solenoids and toroids of circular cross section.
But note that the Ampere’s-law calculations did not depend on this
fact at all. Thus Eqs. (23-62) and (23-65) apply, respectively, to solenoids
and toroids of arbitrary cross-sectional shape, such as those shown in Fig.
23-36, as long as the shape is uniform. (To see the necessity of this restriction,
suppose the shape were not uniform. What complications would this
introduce into the evaluation of the circulation?)

Equation (23-42a), $\mathcal{B} = \mu_0 i / 2\pi R$, gives the magnitude of the magnetic
field around a long, straight wire. If you compare it with Eq. (23-65),
you must conclude that the magnetic field inside a toroid of N turns, carrying
a current i, is the same as that which would exist in the same region if
a wire, like that shown in Fig. 23-37, ran through the center of the toroid
and perpendicular to the plane of the toroid, with current Ni flowing
through the wire and no current flowing through the toroid. Why should a
toroid have the $\mathcal{B} \propto R^{-1}$ dependence which you have seen on many
occasions to be typical of cylindrical symmetry? Example 23-13 suggests an
answer to this question.

Fig. 23-37 A toroid having N turns lies in the plane shown. A long,
straight wire passes through the center of the toroid. The direction of the
wire is normal to the plane. Within the core of the toroid, the magnetic
field of a current i flowing through the toroid winding is the same as the
magnetic field of a current Ni flowing through the wire.


EXAMPLE 23-13

A coaxial cable is shown in Fig. 23-38. Both conductors are hollow cylindrical shells,
having radii $k_1$ and $k_2$, respectively. A battery connected between the two drives a
current I = Ni in the leftward direction in the inner conductor, through a resistor
at the end of the cable, and then back through the outer conductor. Find the magnitude
and direction of the magnetic field $\boldsymbol{\mathcal{B}}$ at radial distances $R < k_1$, $k_1 < R' < k_2$,
and $R'' > k_2$.

■ Draw the three amperean curves shown in the figure. The smallest curve, a
circle of radius R, encloses no current. Ampere’s law thus gives

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \mu_0 I = 0$$


Fig. 23-38 A coaxial cable, discussed in Example 23-13. The two
cylindrical conductors, of radii $k_1$ and $k_2$, respectively, lie on a
common axis. A current Ni flows from the positive terminal of the
battery through the inner conductor, then through a resistor and
via the outer conductor to the negative terminal of the battery,
which completes the circuit.




That is, the magnetic circulation around this amperean curve must be zero. You
must now use a symmetry argument to show that the zero circulation is due to the
fact that the magnetic field $\boldsymbol{\mathcal{B}}$ is zero everywhere on a circular amperean curve of
radius $R < k_1$. To show that such a proof is necessary, consider the following case,
in which the circulation is zero even though $\boldsymbol{\mathcal{B}}$ is not zero. Suppose that the entire
coaxial cable were placed between the poles of a large magnet and oriented so that
the uniform magnetic field was perpendicular to the axis of the cable. What would
be the circulation around the same amperean curve in this case?

But in this example, you are interested in only the magnetic fields associated
with the currents in the two conductors of the cable itself. Both conductors have
cylindrical symmetry, and the magnetic fields associated with the currents in them
must partake of this symmetry as well. To satisfy this condition, the magnetic field
lines must have one of only two possible configurations. The first of these is a radial
configuration centered on the cylinder axis. That is, the field lines would have to
begin or end at the axis and be directed along such vectors as $\mathbf{R}$. But this
configuration is inconsistent with the absence of magnetic charge at the axis;
magnetic field lines are always closed curves.

The second possibility is that the magnetic field lines are circles centered about
the axis. If this is the case, the magnetic field will have the same magnitude
everywhere along the amperean curve consisting of a circle of radius R and will be
everywhere tangent to that circle. Thus the magnetic circulation can be written

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \oint_{\text{closed curve}} \mathcal{B}\, dl = \mathcal{B} \oint_{\text{closed curve}} dl = \mathcal{B}(2\pi R) = \mu_0 I = 0 \qquad \text{for } R < k_1$$

Since in the general case R is not zero, the only possible way to satisfy the condition
$2\pi R \mathcal{B} = 0$ is that

$$\mathcal{B} = 0 \qquad \text{for } R < k_1$$

The intermediate amperean curve, the circle of radius $R'$, encloses the current
I = Ni carried by the inner conductor, but not the current of opposite sense
carried by the outer conductor. So in the intermediate region you have

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \mathcal{B}(2\pi R') = \mu_0 I = \mu_0 N i$$

or

$$\mathcal{B} = \frac{\mu_0 N i}{2\pi R'} \qquad \text{for } k_1 < R' < k_2$$

This value of the magnetic field is the same as would be obtained if the conductor
were a thin wire instead of a cylindrical shell. (This conclusion is entirely analogous
to the electric field case.)

The outermost amperean circle of radius $R''$ encloses two equal currents
flowing in opposite senses. Thus the net current flowing through this amperean
curve is $I - I = 0$, and Ampere’s law gives

$$\mathcal{B} = 0 \qquad \text{for } R'' > k_2$$
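The three results of this example can be collected into a single piecewise function of the distance from the axis. The following is a minimal sketch under the example's idealization of thin-walled conductors; the function name, the argument names, and the sample values are illustrative assumptions. The field is not well defined exactly on either shell, and the sketch arbitrarily returns zero there.

```python
import math

MU0 = 4 * math.pi * 1e-7  # permeability of free space, T*m/A


def coax_field(radius, inner_radius, outer_radius, current):
    """Field magnitude (T) at `radius` from the axis of an ideal thin-walled coaxial cable.

    `current` is the total current carried by the inner conductor and returned
    by the outer one (I = Ni in the notation of the example).
    """
    if radius <= 0:
        raise ValueError("radius must be positive")
    if inner_radius < radius < outer_radius:
        # Only the inner conductor's current is enclosed.
        return MU0 * current / (2 * math.pi * radius)
    # Inside the inner shell no current is enclosed; outside the outer shell
    # the two currents cancel. Points exactly on a shell are assigned zero
    # by convention in this sketch.
    return 0.0


# Illustrative values (assumptions): shells of radius 2 mm and 5 mm, 3.0 A.
k1, k2, current = 2.0e-3, 5.0e-3, 3.0
for r in (1.0e-3, 3.0e-3, 4.0e-3, 10.0e-3):
    print(f"R = {r * 1e3:5.1f} mm   field = {coax_field(r, k1, k2, current):.3e} T")
```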


To interpret the result for the toroid lying in the plane perpendicular
to the current-carrying wire in Fig. 23-37, discussed in Example 23-12,
imagine that the coaxial conductors are replaced by a stack of toroids of
rectangular cross section, having an inner radius $k_1$ and an outer radius $k_2$,
as in Fig. 23-39a. Three of these toroids are shown in cross section in Fig.
23-39b. An amperean curve drawn as shown in the figure encloses equal
and opposite currents, and the circulation is zero. What remains is the
current flow pattern of the coaxial cable of Example 23-13, whose symmetry is
clearly cylindrical. To see the matter in a slightly different way, think of the
coaxial cable as a very thick toroid of rectangular cross section. Its magnetic
field distribution must be both toroidal and cylindrical, since it is both a
pair of coaxial cylinders and a toroid whose ends are so far away that it
cannot matter whether they are actually present. The current Ni flowing
through the inner and outer conductors is just that which would flow
through N turns of a toroidal winding carrying current i.

Fig. 23-39 Diagram to show that a coaxial cable is equivalent to a stack of
toroids of rectangular cross section. (a) Perspective view of the stack;
compare with Fig. 23-38. (b) Section through the stack. Shown are the
current lines (straight arrows) and an amperean curve around which the
magnetic circulation is zero.

Can you use a similar argument to relate the uniform magnetic field
inside a solenoid of square cross section to the magnetic field associated with
a current flowing in a thin, wide, flat plate lying parallel to the axis of the
solenoid, as in Fig. 23-40? Why should you expect a field-versus-distance
dependence $\mathcal{B} \propto R^{0}$ in this case?


Fig. 23-40 Diagram to show that a large, flat, current-carrying
sheet is equivalent to a stack of solenoids of rectangular cross section.
The argument is analogous to that accompanying Fig. 23-39.




EXERCISES 


Group  A 

23-1. Ionic motions. The metal vessel in Fig. 23E-1 has
a metal center post insulated from the bottom. It contains
salt water in which the current carriers are positively
charged sodium ions Na$^+$ and negatively charged chlorine
ions Cl$^-$. The apparatus is in a magnetic field directed
upward. Describe the motion of each kind of ion as viewed
from above.


23-8. Current-carrying wire. What is the value of $\mathcal{B}$ at a
distance of 1.0 cm from a long straight wire carrying a
current of 10 A?

23-9. Two current-carrying wires. Two long straight
wires are 20 cm apart. The current in each is 10 A, but the
currents are in opposite directions. What is the value of $\mathcal{B}$
at a point midway between them?


23-2. Orbiting electrons. An electron whose speed is 1.0
percent of the speed of light enters a uniform magnetic
field moving at right angles to the field.

a. What is the value of $\mathcal{B}$ if the electron travels in a
circular orbit of radius 10 cm?

b. What is the period of the electron in its orbit?

23-3. Cyclotron resonance, I. Cyclotron resonance of
the electrons in a particular solid is observed to occur at a
magnetic field of $4.00 \times 10^{-2}$ T when a radiation of
frequency $2.50 \times 10^{9}$ Hz is applied. What is the ratio of the
effective mass to the free-electron mass?

23-4. Accelerating protons, I. A cyclotron accelerates
protons to nonrelativistic speeds. The cyclotron magnet
has a field strength of 1.00 T. What is the frequency at
which the protons circulate in the accelerator? The mass
of a proton is $1.67 \times 10^{-27}$ kg.

23-5. Accelerating protons, II. A proton accelerator has
a magnetic field strength of $1.20 \times 10^{-1}$ T. As the protons
achieve relativistic speeds, their mass increase affects the
frequency with which the protons circulate in the accelerator.
Calculate the ratio of frequency of circulation for a
proton at v = 0.50c to that for the nonrelativistic proton.

23-6. Thomson’s apparatus. A beam of electrons moves
through crossed electric and magnetic fields as in
Thomson’s apparatus for determining $e/m_e$. If $\mathcal{E} = 1.5 \times 10^{5}$ V/m
and $\mathcal{B} = 3.0 \times 10^{-3}$ T, what is the speed of the
electrons if the beam is undeviated?


23-7.  Hall  voltage.  In  Fig.  23E-7,  a Hall  voltage  is  set 
up  in  beryllium  with  the  front  edge  positive.  What  is  the 
sign  of  the  main  carriers  of  the  current? 


Fig. 23E-7


23-10. Current-carrying coil, I. A circular coil of radius
2.5 cm has 100 turns and carries a current of 1.0 A. What
is the magnetic field at the center of the coil?

23-11. Current-carrying coil, II. What current must a
circular coil of 100 turns and radius 5.0 cm carry to produce
a magnetic field of $1.0 \times 10^{-3}$ T at its center?

23-12. Comparing the field of two coils. A length L of
wire is wound into a circular coil of many turns, all of
radius R. An equal length of wire is made into another circular
coil of radius R/2. What is the ratio of the magnetic
fields at their centers if they carry equal currents?

23-13. Fooling the compass. A horizontal wire is lined
up in the north-south direction. A compass needle is
placed above the wire. When the current is turned on, the
North pole of the compass is deflected toward the west. In
which direction are electrons in the wire moving?

23-14. Coaxial cable. The current flowing in one
direction along the inner conductor of a long coaxial
cable, and in the opposite direction along the outer conductor,
has a magnitude of 25.00 A. The inner and outer
conductors are thin-walled tubes having radii of 4.00 mm
and 10.00 mm. Evaluate the magnitude of the magnetic
field and, if nonzero, describe its direction at points whose
distances from the axis are 2.00 mm, 6.00 mm, 8.00 mm,
12.00 mm.

23-15. Don’t let this get your soul annoyed. Long solenoid
A is wound with a single layer of N turns of wire. Solenoid
B is of the same length but has two layers of the same wire
in its windings so that it has 2N turns. Each solenoid is connected
across the same type of battery, which has negligible
internal resistance.

a. Compare the values of the internal magnetic field
in the two solenoids, taking into account the resistances of
the solenoid windings.

b. Solenoid B has an advantage over solenoid A.
What is it?

Group  B 

23-16. An $e/m_e$ measurement. The value of $e/m_e$ can be
obtained by using a specially designed vacuum tube illustrated
in Fig. 23E-16. It contains a heated filament F and
an anode A which is maintained at a positive potential relative
to the filament by a battery of known voltage V. Electrons
evaporate from the heated filament and are accelerated
to the anode, which has a small hole in the center
through which some of the electrons pass. The tube contains
a drop of mercury. The slight concentration of mercury
vapor makes the electron path in the tube visible in a
darkened room, yet the concentration of the vapor is too
small to substantially interfere with the motion of the electrons.
The tube is placed in a uniform magnetic field of
known magnitude, with $\boldsymbol{\mathcal{B}}$ parallel to the flat surface of
the anode. The anode surface has circles of known
radius scribed on it. $\mathcal{B}$ is varied in magnitude until the visible
semicircular trajectory of the electrons ends on one of
the circles. Show that $e/m_e = 8V/\mathcal{B}^2 r^2$, where r is the
radius of the scribed circle on which the electron trajectory
ends.

Fig. 23E-16

Fig. 23E-18

23-17. Cyclotron resonance, II. In a cyclotron resonance
experiment, the magnetic field is directed upward.
The results indicate that the charged particles are circulating
counterclockwise as viewed from above.

a. What is the sign of the charge on the particles?

b. Assuming that the charged particles have charge e
and mass $m_e$ and that their energy is the thermal energy
$\tfrac{3}{2}kT$ at 300 K, what is the radius of their orbits if the magnitude
of the magnetic field is $\mathcal{B} = 0.10$ T? The velocity of
the particles is perpendicular to the field direction. When
substituting numerical values, include their units and
check to see that the result is indeed given in meters.

c. How long does it take the particles to complete
one orbit?

23-18. The cyclotron, I. A cyclotron is a device for
accelerating nuclear particles. The heart of the apparatus
consists of a split metal pillbox. Figure 23E-18 shows top
and side views of the halves called Dees. A rapidly oscillating
potential difference is applied between the Dees.
This produces an oscillating electric field in the gap
between the Dees, the region inside each Dee being essentially
free of electric field. The Dees are enclosed in an
evacuated container, and the entire unit is placed in a uniform
magnetic field $\boldsymbol{\mathcal{B}}$ whose direction is normal to the
plane of the Dees. A charged particle of mass m and
charge q in the gap between the Dees is accelerated by the
electric field toward one of them. Inside the Dees, it moves
with constant speed in a semicircle. The time to traverse
the semicircle is, according to Eq. (23-10), $t = \pi m/q\mathcal{B}$ and
is independent of the speed. If the half-period of the oscillating
electric field is equal to this time, the charged particle
will again be accelerated when it next crosses the gap,
because of the reversal of the direction of the electric
field. Thus it will gain energy. This makes the next semicircle
it traverses have larger radius, as shown in the figure.
The energy gain can be repeated many times. A cyclotron
is accelerating deuterons which are nuclei of heavy
hydrogen carrying a charge of +e and having mass of
$3.3 \times 10^{-27}$ kg.

a. What is the required frequency of the oscillating
electric field if $\mathcal{B} = 1.5$ T?

b. If the deuterons are to acquire 15 MeV of kinetic
energy and the difference of potential across the gap is 50
kV, how many times does the deuteron undergo acceleration?

c. What is the radius of the semicircle if the deuterons
acquire this energy?

23-19. The cyclotron, II. A cyclotron has been adjusted
to accelerate deuterons. It is now to be adjusted to accelerate
protons instead, which have almost exactly half the
deuteron mass.

a. What change must be made if there is no change
in the frequency of the oscillating potential difference applied
between the Dees?

b. What change must be made if there is no change in
the strength of the magnetic field applied normal to the
Dees?

c. How will each change alter the maximum energy
which the protons can acquire?

23-20. The cyclotron, III. A large cyclotron is designed
to accelerate deuterons to 450 MeV of kinetic energy. This
means that their speed will become a substantial fraction
of c. Hence their mass will become substantially larger
than the rest mass. If the magnetic field is everywhere of
the same value, this requires that the frequency of the oscillating
potential difference applied between the Dees be
decreased during the acceleration of a group of deuterons.
What is the ratio of the final to the initial frequency?
The rest mass of a deuteron is $3.3 \times 10^{-27}$ kg.

23-21. The cyclotron, IV. A proton of charge +e is orbiting
in a cyclotron which has magnetic field $\mathcal{B} = 2.00$ T.

a. What must be the frequency of the oscillating electric
field in the cyclotron?

b. Show that the momentum of a proton at radius r is
given by the expression $mv = \mathcal{B}er$.

c. What is the kinetic energy of the proton if the orbital
radius r is (i) 10 cm? (ii) 100 cm? Express your values
in MeV.

d. What must be the frequency of the cyclotron oscillator
when r = 100 cm? Hint: First find the relativistic
mass of the proton.

23-22. Bainbridge mass spectrometer. Figure 23E-22
represents the device designed by Bainbridge to accurately
measure the masses of isotopes. S is the source of
positively charged ions of the element being investigated.
The ions all have the same charge e, but they have a range
of speeds. Throughout the region a uniform magnetic field
$\boldsymbol{\mathcal{B}}$ is directed into the plane of the page. In addition, an
electric field $\boldsymbol{\mathcal{E}}$, directed parallel to the plane of the page,
is set up between electrodes E and E'.

Fig. 23E-22

a. Show that only ions whose speed v equals $\mathcal{E}/\mathcal{B}$ will
be passing out of the opening at C.

b. Show that the mass m of an ion, whose trajectory is
a semicircle of radius R, is equal to $e\mathcal{B}R/v$.

c. The element tin is being analyzed. Among the isotopes
present are those with mass numbers A = 116, 117,
118, 119, and 120. That is, an ion of a particular isotope
has mass Au, where the atomic mass unit u has the value
$u = 1.66 \times 10^{-27}$ kg. The electric and magnetic fields are
$\mathcal{E} = 2.0 \times 10^{4}$ V/m and $\mathcal{B} = 0.25$ T. What is the spacing
between the marks produced on the photographic plate P
by ions of tin-116 and ions of tin-120?

23-23. Hall angle. The Hall angle $\theta$ is defined as the
angle between the direction of the total electric field acting
in the Hall effect and the direction of the current flowing
in the sample.

a. Prove that $\tan\theta = \sigma\mathcal{B}/Nq$, where $\sigma$ is the conductivity
of the sample, N is the charge carrier concentration
in it, q is the charge on each carrier, and $\mathcal{B}$ is the applied
magnetic field. Show that $\theta$ is independent of the current
density.

b. If $\mathcal{B} = 0.50$ T, $N = 8.0 \times 10^{28}$ m$^{-3}$, and $\sigma = 6.0 \times 10^{7}$ S/m,
calculate $\theta$, assuming that electrons are the carriers.


23-24. Semicircular wire. A current of 10 A flows
through a semicircular wire of radius 5.0 cm, as in Fig.
23E-24. What is the value of the magnetic field at the
center O? What is its direction?


23-25. No magnetic field. An equilateral triangle is
formed from a piece of uniform-resistance wire. Current
is fed into one corner and led out of the other. See Fig.
23E-25. Show that the current flowing through the sides
of the triangle produces no magnetic field at its center O
(the intersection of the medians).


Fig. 23E-25


23-26. “Phonomagnetism.” A phonograph record of
radius R, which carries a uniformly distributed charge Q,
is rotating with constant angular speed $\omega$. Show that the
magnetic field at the center of the disk is given by
$\mathcal{B} = \mu_0 \omega Q / 2\pi R$. The American physicist Rowland performed
such an experiment to demonstrate that any moving
charge, not necessarily a current in a wire, produces a
magnetic field.


23-27.  Coils  subtending  the  same  angle.  Two  circular 
coils  shown  in  Fig.  23E-27  have  the  same  number  of  turns 
and  carry  current  in  the  same  sense.  They  have  different 
diameters  but  subtend  the  same  angle  at  P. 


Fig.  23E-27 




a. Which coil makes the larger contribution to the
magnetic field at P?

b. If the smaller coil is midway between P and the
larger one, what is the ratio of the larger contribution at P
to the smaller?

23-28. Magnetic field inside a wire. A wire of circular
cross section and radius R carries a current I uniformly
distributed across the wire. Calculate the magnetic field
inside the wire as a function of the distance r from the
center of the wire.

23-29. Current-carrying sheet. A sheet of metal of infinite
extent in the y and z directions carries a uniform current
in the +z direction of magnitude $\lambda$ per unit length.

a. By considering symmetrically placed current filaments,
determine the direction of $\boldsymbol{\mathcal{B}}$ at any point P in the
space around the infinite current-carrying conductor.

b. Apply Ampere’s law to show that the magnitude of
$\boldsymbol{\mathcal{B}}$ does not depend on the distance of P from the conductor.

c. Show that Ampere’s law leads to the result
$\mathcal{B} = \mu_0 \lambda / 2$.

23-30. Sudden stops forbidden. In Fig. 23E-30, N and S
are the North and South poles of a magnet. The field of
the left half has been drawn so that the magnetic field
lines are everywhere straight and normal to the pole faces,
which are parallel and opposite each other. The field of
the right half has been drawn with the magnetic field lines
bowing outward. Apply Ampere’s law to the left half to
show that it is incorrect; that is, show that the magnetic
field cannot stop suddenly.

Fig.  23E-30 


Group  C 

23-31. Corbino effect. A disk-shaped sample of a conducting
material has wrapped around its periphery a
band of material of much higher conductivity, which
serves as an electrode. The same high-conductivity material,
made into a small plug at the center of the disk, is the
other electrode. See Fig. 22-10. A steady current i is made
to flow radially outward through the disk. The disk is
placed in a uniform magnetic field $\boldsymbol{\mathcal{B}}$ oriented perpendicular
to the plane of the disk.

a. Show that an electron located at a point in the disk
where the electric field has magnitude $\mathcal{E}$ will have a drift
velocity which makes an angle $\theta$ with respect to the radius
vector given by

$$\theta = \tan^{-1}\left(\frac{j\mathcal{B}}{Ne\mathcal{E}}\right)$$

where j is the current density at that location and N is the
number of electrons per unit volume.

b. Using Ohm's law, together with the definition of
the electrical conductivity $\sigma$ in terms of microscopic quantities
of Eq. (22-43a), show that the angle $\theta$, called the Hall
angle, can be written

$$\theta = \tan^{-1}\left(\frac{\sigma\mathcal{B}}{Ne}\right) = \tan^{-1}\left(\frac{e}{m_e}\,\tau\mathcal{B}\right)$$

where $e/m_e$ is an electron's charge-to-mass ratio and $\tau$ is its
mean scattering time. Thus show that $\theta$ depends on the
magnitude of the magnetic field but is independent of the
current, the current density, and the position of the electron
in the disk.

c. Using numbers given in the text, show that for a
typical metal (say copper) at room temperature, the Hall
angle is of magnitude $10^{-3}\,\mathcal{B}$ rad, with $\mathcal{B}$ measured in T.

d. In the absence of a magnetic field, the voltage
drop across the sample required to maintain a current i is
V, and the sample resistance is R = V/i. If the current is
not to change when the magnetic field is turned on, the
voltage drop must be increased to a new value $V' = V + \Delta V$.
That is, the sample has a new "resistance" $R' = R + \Delta R = (V + \Delta V)/i$.
Using the approximation $\tan\theta \approx \theta$,
which is valid if the Hall angle is small, show that
$1 + \Delta R/R = \sec\theta$ so that, using the small-angle approximation
$\sec\theta \approx 1 + \theta^2/2$, you have

$$\frac{\Delta R}{R} = \frac{1}{2}\left(\frac{e\tau}{m_e}\right)^2 \mathcal{B}^2$$


e. According to the result of part b, the current lines
in the presence of a magnetic field make a constant angle $\theta$
with the radii through which they cut. Show that the polar
equation of the current lines is $r = r_0 e^{\phi\cot\theta}$. This curve is called
a logarithmic spiral.

23-32. Another $e/m_e$ measurement. A long evacuated
cathode ray tube is inside a still longer solenoid. See Fig.
23E-32. Electrons boiled off a glowing filament F are


Fig.  23E-32 


Exercises  1119 


accelerated toward the positively charged anode A by a
difference of potential V between anode and cathode. A
small cone of electrons moving with speed v passes
through a small opening in the anode. The electrons
strike a fluorescent screen S at the right end of the tube,
producing a visible spot. As the solenoid current is turned
on and increased, the spot broadens, then narrows, then
broadens again, then narrows again.

If $\theta$ is the angle the velocity of an electron makes with
the axis of the tube when it leaves the anode, the component
perpendicular to the axis is $v_\perp = v\sin\theta$, while the
component parallel to the axis is $v_\parallel = v\cos\theta$. With small
values of $\theta$, which is the case in this arrangement, then for
all electrons, $\cos\theta \approx 1$ and $v_\parallel \approx v$. The electrons move
down the tube with this speed v. At the same time, they rotate
about the magnetic field.

a. Show that the period of rotation T about the magnetic
field is the same for all electrons and is equal to
$2\pi m_e/\mathcal{B}e$.

The time for the electrons to travel the length L of the
tube while rotating about the magnetic field is L/v. If $\mathcal{B}$ is
adjusted so that the period of rotation is equal to this time,
the electrons will all reach the spot that they struck with
the magnetic field off. The electrons are said to be focused.

b. Show that this condition requires that $v = \mathcal{B}eL/2\pi m_e$.

c. Show that when the electrons are focused, $e/m_e = 8\pi^2 V/L^2\mathcal{B}^2$.

d. In an experiment with this apparatus, the following
data were obtained for the first focusing: turns per
meter on the solenoid = $1.12 \times 10^3$ m$^{-1}$; L = 21.5 cm;
V = 632 V; current in solenoid = 1.81 A. Calculate $e/m_e$
from this.

23-33. Back to basics. In Fig. 23E-33, AB is a finite
length of wire carrying current i. The perpendicular distance
from any point P to the line of the wire is a. Because
of the lack of any symmetry, it is necessary to fall back on
the Biot-Savart law to find the field at P due to the length
AB of the wire. Show that it is equal in magnitude to
$(\mu_0 i/4\pi a)(\cos\alpha_2 - \cos\alpha_1)$, where $\alpha_1$ and $\alpha_2$ are defined in
the figure. Then use this result to obtain the result for an
infinite straight wire.

Fig. 23E-33


23-34. Circle or square?

a. Use the result of Exercise 23-33 to show that the
magnetic field at the center of a square frame of side l
illustrated in Fig. 23E-34, and carrying current i, is
$(\mu_0 i/\pi l)\,2\sqrt{2}$.


Fig.  23E-34 


b. A length L of wire carrying current i is to be bent
into a circle or a square, each of one turn. In which case is
$\mathcal{B}$ at the center of the figure greater? What is the ratio of
$\mathcal{B}_g$ (greater) to $\mathcal{B}_s$ (smaller)?

23-35. Short solenoid.

a. From Eqs. (23-44) show that $\mathcal{B}$ at point P on the
axis of a solenoid of finite length equals $(\mu_0/2)\,ni\,(\cos\alpha_2 - \cos\alpha_1)$,
the angles being defined in Fig. 23E-35 and n
being the number of turns per unit length.

b. What is the value of $\mathcal{B}$ (i) at point A on the axis of
the solenoid just outside the end? (ii) at C on the axis in the
center? (iii) What is the ratio of $\mathcal{B}_C$ to $\mathcal{B}_A$?

c. Show that the result in part a for $\mathcal{B}$ inside a long
solenoid agrees with Eq. (23-62).

23-36. Model hydrogen atom. A simplified model of a
hydrogen atom is an electron revolving in a circular orbit
around a proton at a distance $R = 5.3 \times 10^{-11}$ m.

a. Show that the magnetic field at the proton produced
by the moving electron is given by

$$\mathcal{B} = \frac{\mu_0 e^2}{4\pi R^2}\sqrt{\frac{1}{4\pi\epsilon_0 m_e R}}$$

b. Evaluate this field.

23-37. Helmholtz coils. One method of obtaining a uniform
magnetic field is to employ a pair of identical circular
coils which carry equal currents i in the same sense.
Each coil has N turns. The coils have a common axis and
are separated a distance 2b. See Fig. 23E-37. If this distance
is chosen properly, the field will be fairly uniform
for some distance on either side of the midpoint O.
Coils used in this manner are called Helmholtz coils, after
a German physicist.

The field at P, at distance x from O, is given by Eq.
(23-44a) to be

$$\mathcal{B} = \frac{\mu_0 N i a^2}{2}\left\{\frac{1}{[a^2 + (b + x)^2]^{3/2}} + \frac{1}{[a^2 + (b - x)^2]^{3/2}}\right\}$$

$$\phantom{\mathcal{B}} = \frac{\mu_0 N i a^2}{2}\,\frac{1}{(a^2 + b^2)^{3/2}}\left[\frac{1}{\left(1 + \dfrac{2bx + x^2}{a^2 + b^2}\right)^{3/2}} + \frac{1}{\left(1 + \dfrac{-2bx + x^2}{a^2 + b^2}\right)^{3/2}}\right]$$

a. Expand the fractions inside the large square
brackets using the binomial theorem, and determine the
relation between b and a for which $\mathcal{B}$ is independent of x
for powers of x up to the second.

b. Show that for this choice of b the uniform field at
the midpoint is equal to $8\mu_0 N i/5\sqrt{5}\,a$.

23-38. Thin-walled current-carrying tube. A uniform
current flows along a long copper tube shown in end view
in Fig. 23E-38.

Fig. 23E-38

a. Using $\mathcal{B} = (\mu_0/2\pi)(i/r)$ for a long straight wire and
also using the geometry of the circle, show that the magnetic
field at any point P inside the tube is zero.

b. Obtain the same result from Ampere's law.

23-39. Thick-walled current-carrying tube.

a. A long straight thick-walled metal tube has inner
radius $r_1$ and outer radius $r_2$. A current I flows along it, the
current density being the same everywhere within the
body of the tube. Show that the magnetic field inside the
body of the tube at distance r from its axis has the value

$$\mathcal{B} = \frac{\mu_0 I}{2\pi}\,\frac{r^2 - r_1^2}{(r_2^2 - r_1^2)\,r} \qquad \text{for } r_1 < r < r_2$$

b. Write expressions for $\mathcal{B}$ in the region $0 < r < r_1$
and in the region $r_2 < r$. Give a one- or two-sentence justification
of each. Then sketch $\mathcal{B}$ versus r for $0 < r < 4r_2$.

23-40. Designing a multilayer coil. A multilayer coil is to
be wound on a fixed form with the wires in contact. See
the cross section in Fig. 23E-40. The insulation is thin, and
the space provided is to be filled.

a. How will the strength of the magnetic field produced
depend on the diameter d of the wire chosen if (i)
the power consumed by the coil is fixed? (ii) the potential
difference applied across the coil is fixed?

b. In part a (ii), how does the power consumption depend
on the diameter of the wire?




24 

Magnetic  Fields,  II 


24-1 AMPERE'S EXPERIMENT AND THE AMPERE

Chapter 23 was concerned with the two faces of the relation between magnetic
fields and electric charges. First, when an observer sees electric
charges move, he will always detect a magnetic field produced by the motion.
Second, when an observer sees electric charges moving in a magnetic
field, he finds that they experience forces. The first statement is expressed
quantitatively either in the Biot-Savart law,

$$d\boldsymbol{\mathcal{B}} = \frac{\mu_0}{4\pi}\,\frac{i}{r^2}\,d\mathbf{s} \times \hat{\mathbf{r}} \tag{24-1}$$

or in Ampere's law,

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \mu_0 i \tag{24-2}$$

The second statement is expressed quantitatively in the relation

$$\mathbf{F}_m = q\mathbf{v} \times \boldsymbol{\mathcal{B}} \tag{24-3}$$

which is, in fact, the definition of a magnetic field.

Putting together these two statements leads to the conclusion that an
electric charge should experience a magnetic force if it is observed to be
moving in the vicinity of other electric charges which also are observed to
be moving. Or, what amounts to the same thing, an electric charge moving
in the vicinity of an electric current should experience a magnetic force.
This is illustrated schematically in Fig. 24-1. Thus the existence of the current
implies the existence of the magnetic field, which exerts a force on the
moving charge.




EXAMPLE  24-1 




A current i = 1.00 A flows through a long, straight wire located in an evacuated
tube. A proton is accelerated from rest through a potential difference of 100 V and
thus acquires a certain velocity. It is then injected into the tube in a direction parallel
to that of the current, as shown in Fig. 24-1. If the distance between the wire and
the proton is R = 1.00 cm, find the initial magnetic force exerted on the proton and
the initial acceleration due to this force.

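■ The magnetic field in the vicinity of a long, current-carrying wire is given by
Eq. (23-42b),

$$\boldsymbol{\mathcal{B}} = \frac{\mu_0 i}{2\pi R}\,\widehat{d\mathbf{s}} \times \hat{\mathbf{R}} \tag{24-4}$$

where $\widehat{d\mathbf{s}}$ is a unit vector in the direction of current flow through the wire and $\hat{\mathbf{R}}$ is
the unit vector along the perpendicular from the wire to the proton. If the velocity
of the proton is $\mathbf{v}$, the magnetic force experienced by the proton is

$$\mathbf{F}_m = e\mathbf{v} \times \boldsymbol{\mathcal{B}} = ev\,\hat{\mathbf{v}} \times \boldsymbol{\mathcal{B}} = ev\,\frac{\mu_0 i}{2\pi R}\,[\hat{\mathbf{v}} \times (\widehat{d\mathbf{s}} \times \hat{\mathbf{R}})] \tag{24-5}$$

To determine the direction of $\mathbf{F}_m$, consider the unit vector $\hat{\mathbf{F}}_m = \hat{\mathbf{v}} \times (\widehat{d\mathbf{s}} \times \hat{\mathbf{R}})$.
Then use the fact that $\hat{\mathbf{v}}$ is parallel to $\widehat{d\mathbf{s}}$, that is, $\hat{\mathbf{v}} = \widehat{d\mathbf{s}}$. You can therefore write the
unit-vector relationship

$$\hat{\mathbf{F}}_m = \hat{\mathbf{v}} \times (\widehat{d\mathbf{s}} \times \hat{\mathbf{R}}) = \widehat{d\mathbf{s}} \times (\widehat{d\mathbf{s}} \times \hat{\mathbf{R}})$$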


Fig. 24-1 An observer sees a test
charge moving with velocity $\mathbf{v}$ parallel to
a long, straight wire carrying a current.
As it moves through the magnetic field
$\boldsymbol{\mathcal{B}}$ produced by the current in the wire,
the charge experiences a magnetic force
$\mathbf{F}_m$.


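You now employ the right-hand rule for cross products twice in succession. First,
you find the direction $(\widehat{d\mathbf{s}} \times \hat{\mathbf{R}})$ inside the parentheses. [You must perform this
vector multiplication first! The quantities $(\widehat{d\mathbf{s}} \times \widehat{d\mathbf{s}}) \times \hat{\mathbf{R}}$ and $\widehat{d\mathbf{s}} \times (\widehat{d\mathbf{s}} \times \hat{\mathbf{R}})$ are not
the same.] The resulting direction $(\widehat{d\mathbf{s}} \times \hat{\mathbf{R}})$ is shown in Fig. 24-2a; it is the magnetic
field direction $\hat{\boldsymbol{\mathcal{B}}}$, as stated in Eq. (24-4). Second, you take the cross product of $\widehat{d\mathbf{s}}$
with this vector, as shown in Fig. 24-2b. The result is a unit vector in the $-\hat{\mathbf{R}}$ direction,
so that

$$\hat{\mathbf{F}}_m = -\hat{\mathbf{R}}$$

That is, the magnetic force exerted on the proton will be in the direction toward the
wire.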

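Next, you use Eq. (24-5) to determine the magnitude of the force. To do this,
you must first calculate the speed v of the proton, whose mass is m, from its kinetic
energy $K = mv^2/2$. You have

$$v = \left(\frac{2K}{m}\right)^{1/2} = \left(\frac{2 \times 1.00 \times 10^2\ \text{eV} \times 1.60 \times 10^{-19}\ \text{J/eV}}{1.67 \times 10^{-27}\ \text{kg}}\right)^{1/2} = 1.38 \times 10^5\ \text{m/s}$$

Inserting this value into Eq. (24-5), you have

$$F_m = 1.60 \times 10^{-19}\ \text{C} \times 1.38 \times 10^5\ \text{m/s} \times \frac{4\pi \times 10^{-7}\ \text{T·m/A} \times 1.00\ \text{A}}{2\pi \times 1.00 \times 10^{-2}\ \text{m}} = 4.42 \times 10^{-19}\ \text{N}$$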


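Fig. 24-2 Determination of the direction of the unit vector $\hat{\mathbf{F}}_m = \widehat{d\mathbf{s}} \times (\widehat{d\mathbf{s}} \times \hat{\mathbf{R}})$.
(a) The unit vector $\widehat{d\mathbf{s}} \times \hat{\mathbf{R}}$ determines the local direction
of the magnetic field $\boldsymbol{\mathcal{B}}$ at any location in the direction $\hat{\mathbf{R}}$ from the
wire. (b) The direction $\hat{\mathbf{F}}_m$ is the direction $-\hat{\mathbf{R}}$.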



Fig. 24-3 (a) An infinitesimal segment
of wire, whose length and orientation
are described by the vector $d\mathbf{s}$, lies in an
arbitrary magnetic field $\boldsymbol{\mathcal{B}}$. A current i
flows in the sense shown. The drift
velocity $\mathbf{v}$ of the charge carriers, assumed
to be positive, has the same
direction as $d\mathbf{s}$. (b) Here the charge carriers
are negative, and $\mathbf{v}$ is antiparallel
to $d\mathbf{s}$.


So the force can be expressed vectorially in the form

$$\mathbf{F}_m = -4.42 \times 10^{-19}\ \text{N}\ \hat{\mathbf{R}}$$

To find the acceleration, you use Newton's second law in the form $\mathbf{a} = \mathbf{F}_m/m$.
The direction of the acceleration is the same as that of the force, and its magnitude
is

$$a = \frac{4.42 \times 10^{-19}\ \text{N}}{1.67 \times 10^{-27}\ \text{kg}} = 2.65 \times 10^8\ \text{m/s}^2$$

Thus, even though the magnetic field is modest (see Example 23-8), the acceleration
is enormous. This is due to the very large charge-to-mass ratio characteristic of
elementary charged particles.
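The arithmetic of Example 24-1 can be checked with a short script. This is only a sketch: the function and variable names are illustrative and do not come from the text, and the constants are rounded to the three figures used in the example. Because the speed is not rounded before it is reused, the force and acceleration agree with the values quoted above only to within rounding.

```python
import math

# Constants rounded as in Example 24-1
E_CHARGE = 1.60e-19          # proton charge, C
M_PROTON = 1.67e-27          # proton mass, kg
MU_0 = 4 * math.pi * 1e-7    # permeability of free space, T*m/A


def proton_speed(potential_v):
    """Nonrelativistic speed of a proton accelerated from rest through potential_v volts."""
    kinetic_energy = E_CHARGE * potential_v          # K = eV, in joules
    return math.sqrt(2 * kinetic_energy / M_PROTON)  # from K = m v^2 / 2


def wire_field(current_a, distance_m):
    """Field magnitude mu0 i / (2 pi R) of a long straight wire, as in Eq. (24-4)."""
    return MU_0 * current_a / (2 * math.pi * distance_m)


v = proton_speed(100.0)            # speed after 100 V
b = wire_field(1.00, 1.00e-2)      # field 1.00 cm from a 1.00 A wire
force = E_CHARGE * v * b           # v is perpendicular to the field, so |v x B| = v B
accel = force / M_PROTON           # Newton's second law

print(f"v = {v:.3e} m/s")
print(f"B = {b:.3e} T")
print(f"F = {force:.3e} N (directed toward the wire)")
print(f"a = {accel:.3e} m/s^2")
```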


Since there is a force exerted on an electric charge moving through a
magnetic field, there must be a force exerted on a current-carrying wire in
a magnetic field as well. The reason is that the wire carries a current by
virtue of the fact that it contains a very large number of moving charges, on
each of which a force is exerted by the field. To make the transition from
the force $\mathbf{f}$ exerted by a magnetic field on a single moving charge carrier to
the force $\mathbf{F}$ exerted by the magnetic field on a current-carrying wire, consider
an infinitesimal segment of wire described by the vector $d\mathbf{s}$ whose
direction is along its length in the sense of current flow. In Fig. 24-3a, such
a segment is shown located in an arbitrary magnetic field $\boldsymbol{\mathcal{B}}$. Each of the
charge carriers, all assumed to be the same kind, has charge +e. Averaged
over their random motion, they move with drift velocity $\mathbf{v}$. We can calculate
the average force on a single carrier by using the drift velocity $\mathbf{v}$ to write Eq.
(24-3) for this force $\mathbf{f}$ as

$$\mathbf{f} = e\mathbf{v} \times \boldsymbol{\mathcal{B}}$$

Remembering that $\mathbf{v} = v\hat{\mathbf{v}} = v\,\widehat{d\mathbf{s}}$, we have

$$\mathbf{f} = ev\,\widehat{d\mathbf{s}} \times \boldsymbol{\mathcal{B}} \tag{24-6}$$

If, instead, the charge carriers in the wire are negative, as in Fig. 24-3b,
having charge −e, then their drift velocity will be $\mathbf{v} = v\hat{\mathbf{v}} = v(-\widehat{d\mathbf{s}})$. The
average force on a single carrier is thus

$$\mathbf{f} = -ev(-\widehat{d\mathbf{s}}) \times \boldsymbol{\mathcal{B}} = ev\,\widehat{d\mathbf{s}} \times \boldsymbol{\mathcal{B}}$$

This result is identical to the result obtained for positive charge carriers in
Eq. (24-6). Thus it is the sense of the current, expressed in terms of the direction $\widehat{d\mathbf{s}}$,
and not the direction of the drift velocity $\mathbf{v}$, which determines the direction of the magnetic
force on the charge carriers.

The wire segment $d\mathbf{s}$ contains dN charge carriers, all with the same
drift velocity. Thus the force $d\mathbf{F}$ exerted by the magnetic field on the entire
wire segment is given, regardless of the sign of the charge carriers, by the
equation

$$d\mathbf{F} = dN\,\mathbf{f}$$

If a is the cross-sectional area of the wire segment whose length is ds, the
volume of the wire segment is a ds. If we write the number of charge carriers
per unit volume as n, the number of carriers dN in the segment is dN =
na ds. Since the magnetic force on each carrier, on the average, is $\mathbf{f}$, the total
magnetic force on the wire segment is



$$d\mathbf{F} = na\,ds\,\mathbf{f}$$

Using Eq. (24-6), we obtain

$$d\mathbf{F} = naev\,ds\,\widehat{d\mathbf{s}} \times \boldsymbol{\mathcal{B}}$$

We now relate the drift speed v to the current i through Eq. (22-40),
which can be written in the form

$$i = naev$$

This can be substituted directly into the equation for $d\mathbf{F}$ immediately above
to yield

$$d\mathbf{F} = i\,ds\,\widehat{d\mathbf{s}} \times \boldsymbol{\mathcal{B}} \tag{24-7}$$

Since the wire is of indefinite length, it is better to deal with the force exerted
on it per unit length than with the total force. We therefore divide
Eq. (24-7) by ds to obtain the force per unit length,

$$\frac{d\mathbf{F}}{ds} = i\,\widehat{d\mathbf{s}} \times \boldsymbol{\mathcal{B}} \tag{24-8a}$$

where it is understood that the magnetic field $\boldsymbol{\mathcal{B}}$ may itself be a function of
position and must be evaluated at the location of the wire segment $d\mathbf{s}$.

Equation (24-8a) is perfectly general and expresses the force per unit
length exerted on an element of wire which carries a current i and is located
in a magnetic field $\boldsymbol{\mathcal{B}}$. For a straight wire of length L in a magnetic
field whose value is the same everywhere along the wire, Eq. (24-8a) can be
rewritten to give the force per unit length in the form

$$\frac{\mathbf{F}}{L} = i\,\hat{\mathbf{s}} \times \boldsymbol{\mathcal{B}} \tag{24-8b}$$

Fig. 24-4 Two long, straight parallel
wires. Wire 1 carries current $i_1$ and is located
a distance R from wire 2, which
carries current $i_2$ in the same sense.

In this equation, $\hat{\mathbf{s}}$ is a unit vector directed along the straight wire in the
sense of current flow. It is identical to the unit vector we have called $\widehat{d\mathbf{s}}$,
which expresses the direction of an element of the wire; that is, $\hat{\mathbf{s}} = \widehat{d\mathbf{s}}$.
We call the unit vector $\hat{\mathbf{s}}$ instead of $\widehat{d\mathbf{s}}$ in order to emphasize that it denotes
the same direction for all parts of the straight wire.

Either Eq. (24-8a) or Eq. (24-7) is essentially the fundamental magnetic
force law of Eq. (24-3), $\mathbf{F} = q\mathbf{v} \times \boldsymbol{\mathcal{B}}$, rewritten in terms of a current i =
dq/dt rather than in terms of a charge q moving at velocity $\mathbf{v}$.
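The step from the single-carrier force to the force on the wire can be illustrated numerically. The sketch below adds up the average force on every carrier in a straight wire and compares the total with the macroscopic form of Eq. (24-8b), for both signs of carrier charge. All numbers are assumptions chosen for illustration (a copper-like carrier density, a 1 mm² cross section, a uniform 0.5 T field), and the function names are hypothetical.

```python
import math

E_CHARGE = 1.60e-19   # magnitude of the carrier charge, C
N_DENSITY = 8.5e28    # assumed carriers per cubic meter (copper-like value)
AREA = 1.0e-6         # assumed cross-sectional area a, m^2
LENGTH = 0.10         # assumed wire length L, m
CURRENT = 1.0         # assumed current i, A
S_HAT = (1.0, 0.0, 0.0)    # sense of the current
B_FIELD = (0.0, 0.0, 0.5)  # assumed uniform field, T


def cross(a, b):
    """Cross product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def force_from_carriers(charge_sign):
    """Total force as (number of carriers) x (average force on one carrier)."""
    drift_speed = CURRENT / (N_DENSITY * AREA * E_CHARGE)   # from i = n a e v
    # Positive carriers drift along S_HAT, negative carriers against it.
    velocity = tuple(charge_sign * drift_speed * c for c in S_HAT)
    charge = charge_sign * E_CHARGE
    f_single = tuple(charge * c for c in cross(velocity, B_FIELD))  # f = q v x B
    n_carriers = N_DENSITY * AREA * LENGTH
    return tuple(n_carriers * c for c in f_single)


def force_from_current():
    """Macroscopic form, F = i L s_hat x B, from Eq. (24-8b)."""
    return tuple(CURRENT * LENGTH * c for c in cross(S_HAT, B_FIELD))


print("positive carriers:", force_from_carriers(+1))
print("negative carriers:", force_from_carriers(-1))
print("i L s x B        :", force_from_current())
```

Both carrier signs give the same total, which is the point made in the text: the force depends on the sense of the current and not on the sign of the carriers.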

Like the Biot-Savart law, Eq. (24-8a) is not directly verifiable through
experiment, since it is never possible to isolate a single infinitesimal segment
of a current-carrying wire in actuality, as is done in imagination in
deriving the equation. Also like the Biot-Savart law, however, Eq. (24-8a)
can be integrated either analytically or numerically over any actual, finite
current configuration located in any magnetic field whose value $\boldsymbol{\mathcal{B}}$ is known
as a function of position. An important (and relatively simple) special case
is that shown in Fig. 24-4. A long, straight wire 1, carrying current $i_1$, is situated
in the field $\boldsymbol{\mathcal{B}}_2$ of the current $i_2$ which is flowing in the same sense
through a parallel, long, straight wire 2. The perpendicular distance
between the two wires is R. We define the vector $\mathbf{R}$ as having magnitude R
equal to this distance and direction from wire 2 to wire 1. In this special case,
the unit vector $\hat{\mathbf{s}}_1$, which specifies the orientation of wire 1 and the sense of
$i_1$, is identical to the unit vector $\hat{\mathbf{s}}_2$, which specifies the orientation of the
parallel wire 2 and the sense of $i_2$. We can thus define a single unit vector $\hat{\mathbf{s}}$
which specifies both orientations; that is,

$$\hat{\mathbf{s}} = \hat{\mathbf{s}}_1 = \hat{\mathbf{s}}_2$$



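Moreover, the magnetic field $\boldsymbol{\mathcal{B}}_2$ of the current $i_2$ is the same all along wire
1 and has the value

$$\boldsymbol{\mathcal{B}}_2 = \frac{\mu_0 i_2}{2\pi R}\,\hat{\mathbf{s}} \times \hat{\mathbf{R}}$$

Substituting this value of $\boldsymbol{\mathcal{B}}_2$ into Eq. (24-8a), with $\widehat{d\mathbf{s}} = \hat{\mathbf{s}}$, leads to the
following expression for the force per unit length exerted on an element $d\mathbf{s}$
of wire 1:

$$\frac{d\mathbf{F}}{ds} = \frac{\mu_0 i_1 i_2}{2\pi R}\,\hat{\mathbf{s}} \times (\hat{\mathbf{s}} \times \hat{\mathbf{R}})$$

And since a construction like the one in Fig. 24-2 shows that $\hat{\mathbf{s}} \times (\hat{\mathbf{s}} \times \hat{\mathbf{R}}) = -\hat{\mathbf{R}}$,
we have

$$\frac{d\mathbf{F}}{ds} = -\frac{\mu_0 i_1 i_2}{2\pi R}\,\hat{\mathbf{R}} \tag{24-9}$$

Because of the symmetry of the situation, $d\mathbf{F}/ds$ is the same everywhere
along wire 1. So Eq. (24-9) applies to any length L of the wire, and not just
to an infinitesimal length ds. We therefore have $d\mathbf{F}/ds = \mathbf{F}/L$, and we can
rewrite the equation in the form

$$\frac{\mathbf{F}}{L} = -\frac{\mu_0 i_1 i_2}{2\pi R}\,\hat{\mathbf{R}} \tag{24-10}$$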

The force on wire 1 is directed toward wire 2.

The currents $i_1$ and $i_2$ appear in this equation in perfectly symmetrical
form. Thus we can conclude without further calculation that the force per
unit length exerted on wire 2, by virtue of the presence of the magnetic
field $\boldsymbol{\mathcal{B}}_1$ of wire 1, is directed toward wire 1 and, furthermore, is equal in
magnitude to the force on wire 1 given by Eq. (24-10). Additional support
for this conclusion is found in its consistency with Newton's third law. Note
that the magnetic forces on the two wires are attractive if the currents flow
in the same sense.

If the currents flow in opposite senses, the forces will be repulsive. You
can verify this last statement by rederiving Eq. (24-10), but setting $\hat{\mathbf{s}}_2 = -\hat{\mathbf{s}}_1$.

It is easy to make the currents $i_1$ and $i_2$ exactly equal by connecting the
two wires in series. In this special case, Eq. (24-10) can be written

$$\frac{\mathbf{F}}{L} = -\frac{\mu_0 i^2}{2\pi R}\,\hat{\mathbf{R}} \tag{24-11}$$


This result is amply confirmed by experiment. The first demonstration of
this was carried out by the French physicist Andre Marie Ampere
(1775-1836) within a week after he had heard, in June 1820, of Oersted's
discovery that a magnetic field is produced by a steady electric current.
Ampere could not know that current is carried by charged particles, and he
did not couch his reasoning in terms of a microscopic argument based on
them, as we have. Rather, his line of reasoning must have run something
like this: Two magnets experience forces when they are near each other. A
current-carrying wire behaves like a magnet, in the sense that a compass
needle is deflected in its vicinity, as Oersted has shown. Will two
current-carrying wires act like magnets in the further sense that they experience
magnetic forces when they are near each other? Ampere's demonstration
that the answer is Yes was carried out with the apparatus illustrated
schematically in Fig. 24-5. The experiment is described in the caption.

Fig. 24-5 Schematic drawing of Ampere's current balance. A wire shaped as shown in the
figure is suspended from two knife edges, so that it can swing freely. Running parallel to its
lower part is a fixed wire. When equal electric currents flow through the two wires in the same
direction, each wire experiences a force directed toward the other. This force can be measured
by restraining the movable wire by means of the calibrated spring shown in the figure.
(In his work Ampere actually used counterweights attached to the movable wire near the
knife edges.) The magnitude of the force depends on the currents flowing through the wires,
the distance R between them, and the length L of the balanced wire, according to Eq. (24-11).
Only the section of wire 1 which is parallel to wire 2 is significant in this experiment. You can
use Eq. (24-4), together with Eq. (24-8a), to convince yourself that the magnetic force on each
vertical section of wire 1 is directed away from the other vertical section. By symmetry the two
forces are equal in magnitude. Since they are opposite in direction, they cancel.


The magnitude part of Eq. (24-11) can be solved for the current i to
give

$$i = \left(\frac{2\pi R F}{\mu_0 L}\right)^{1/2} \tag{24-12}$$

The magnitude F of the force, the distance R between the wires, and the
length L of the movable wire are all directly measurable. The permeability
of free space $\mu_0$ is defined to be $\mu_0 = 4\pi \times 10^{-7}$ T·m/A (see Sec. 23-4).
Thus the unit of current can be expressed entirely in terms of an experimental
procedure. Indeed, Eq. (24-12) is used to define the ampere, the SI
unit of current. That is, if the Ampere experiment is performed and numerical
values are determined for F in newtons and for R and L in meters,
then Eq. (24-12) yields the value of the (common) current flowing through
the two wires, expressed in amperes.

In Example 24-2 the Ampere experiment is considered from the practical
standpoint of the magnitudes of the quantities measured.


EXAMPLE  24-2 


You  plan  to  repeat  Ampere’s  experiment  with  an  apparatus  having  a movable  wire 
50.0  cm  long.  The  two  wires  can  carry  a current  i = 15.0  A without  overheating. 
You  would  like  to  keep  the  distance  between  them  no  less  than  1 cm,  so  that  you  can 
avoid  any  later  arguments  concerning  the  effect  on  the  force  of  the  fact  that  the 
thickness  of  the  wires  is  not  negligible  compared  to  their  separation.  You  therefore 
adjust  that  separation  to  be  1.00  cm.  Is  the  magnitude  of  the  force  you  can  expect  to 
observe  large  enough  to  make  possible  a practical  experiment? 




■ Beginning with Eq. (24-11), you can write the magnitude of the expected force
as

$$F = \frac{\mu_0}{2\pi}\,i^2\,\frac{L}{R} = \frac{4\pi \times 10^{-7}\ \text{T·m/A}}{2\pi} \times (15.0\ \text{A})^2 \times \frac{0.500\ \text{m}}{0.0100\ \text{m}} = 2.25 \times 10^{-3}\ \text{N}$$

A counterweight having a mass of 230 mg exerts this much force. Thus the experiment
is feasible, but you probably cannot expect to obtain great accuracy.
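The numbers in Example 24-2 can be reproduced, and the measurement inverted to recover the current, with the short sketch below. The function names are illustrative, and the value g = 9.8 m/s² used to convert the force into an equivalent counterweight mass is an assumption of this sketch.

```python
import math

MU_0 = 4 * math.pi * 1e-7   # permeability of free space, T*m/A
G_ACCEL = 9.8               # assumed gravitational acceleration, m/s^2


def balance_force(current, length, separation):
    """Force magnitude between the wires, from the magnitude of Eq. (24-11)."""
    return MU_0 * current ** 2 * length / (2 * math.pi * separation)


def current_from_force(force, length, separation):
    """Common current in both wires, from Eq. (24-12)."""
    return math.sqrt(2 * math.pi * separation * force / (MU_0 * length))


force = balance_force(15.0, 0.500, 0.0100)
mass_mg = force / G_ACCEL * 1e6     # equivalent counterweight mass in milligrams
recovered = current_from_force(force, 0.500, 0.0100)

print(f"F = {force:.3e} N")
print(f"equivalent counterweight = {mass_mg:.0f} mg")
print(f"current recovered from F = {recovered:.2f} A")
```

Running the force back through `current_from_force` returns the 15.0 A that was put in, which is the sense in which Eq. (24-12) turns a force measurement into a current measurement.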


From a practical point of view, the geometry of the Ampere apparatus is not
ideal for high-precision measurements. In modern practice, the actual measurement
is carried out by means of an apparatus called a current balance. Its principle
of operation is exactly the same as that of the Ampere apparatus, but the current is
carried by coils containing many turns of wire. The apparatus used by the U.S. National
Bureau of Standards for this purpose is shown in Fig. 24-6.

Fig. 24-6 The standard current balance of the U.S. National Bureau of
Standards. This is the apparatus used to define the standard ampere.
While the current-carrying coils (seen in the bottom case) have a different
geometric arrangement from the simple one of the original Ampere
apparatus, the principle of operation is the same. The current flowing
through the coils is given in terms of the force F exerted on the balance by
the expression $i = G[(2\pi/\mu_0)F]^{1/2}$, where the constant factor G is calculated
from the coil sizes, shapes, numbers of turns, spacings, and other
geometrical factors.

We are now in a position to see the train of logic along which the fundamental
SI electromagnetic units are defined.

1. The permeability of free space $\mu_0$ is defined to be $4\pi \times 10^{-7}$ N/A$^2$.

2. By using this definition, together with the usual SI mechanical units, the
ampere is defined and measured as described immediately above.

3. The coulomb is defined to be 1 C = 1 A·s.

4a. The permittivity of free space is defined and measured by Coulomb's law:

$$\epsilon_0 = \frac{1}{4\pi}\,\frac{q_1 q_2}{F_e r^2} \tag{24-13}$$

where $F_e$ is electric force, $q_1$ and $q_2$ are charges, and r is their separation.

Or:

4b. The value of the permittivity of free space $\epsilon_0$ is defined in terms of $\mu_0$ and
the value of the speed of light c according to Eq. (23-39), $1/\mu_0\epsilon_0 = c^2$, a relation
which is derived in Chap. 27. Thus measurement of c serves to evaluate $\epsilon_0$.
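As a brief illustration of step 4b, the relation $1/\mu_0\epsilon_0 = c^2$ can be evaluated directly. In this sketch the speed of light is entered as a rounded measured value, which is an assumption made here and is not quoted from the text.

```python
import math

MU_0 = 4 * math.pi * 1e-7   # defined permeability of free space, N/A^2
C_LIGHT = 2.998e8           # assumed rounded measured speed of light, m/s

# Step 4b: 1 / (mu0 * eps0) = c^2, solved for eps0
epsilon_0 = 1.0 / (MU_0 * C_LIGHT ** 2)

print(f"epsilon_0 = {epsilon_0:.4e} C^2/(N m^2)")
```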


24-2  RELATIVISTIC 
ORIGIN  OF  THE 
MAGNETIC  FORCE 


Fig. 24-7 The Ampere experiment
reduced to its essentials. Two long,
straight, parallel wires, 1 and 2, are
oriented along the direction designated
by the unit vector $\hat{\mathbf{s}}$. Each carries an
electric current i, whose sense is in the
positive direction $\hat{\mathbf{s}}$. The distance from
wire 2 to wire 1 is specified by the magnitude
of the vector $\mathbf{R}$, which is perpendicular
to the wires. Wire 1 experiences
a magnetic force $\mathbf{F}_m$ directed toward
wire 2.


In this section we use relativistic mechanics to develop the basic relation
between the magnetic force and the electric force. But first we make a preliminary
study of the magnetic force using newtonian mechanics.

Newtonian mechanics applies in inertial frames in which all observed
speeds are small compared to the speed of light c. Observer O' is moving at
a small constant velocity $\mathbf{V}$ relative to observer O, who is at rest with respect
to an Ampere apparatus which both are observing and which lies in an
inertial reference frame. Each will agree that both are situated in an inertial
frame of reference. They therefore attempt to interpret their observations
of the Ampere experiment in newtonian terms. We will see that this attempt
is only partially successful. In spite of the fact that $V \ll c$, a fundamental
understanding of the Ampere experiment requires a relativistic interpretation!
Figure 24-7 shows the Ampere apparatus reduced to its
essentials. Two identical copper wires of indefinite length lie a distance R
apart. They carry equal parallel currents i. The charge carriers are electrons.
The positive ions, whose concentration is identical to that of the electrons,
do not move. Observer O, who is at rest with respect to the wires, observes
an attractive magnetic force per unit length $\mathbf{F}_m/L$ to be exerted on
wire 1. According to Eq. (24-11), she finds this quantity to be


Fm  /xq  i « 

T ~ ~ 2 kR 


(24-14) 


where $\hat{\mathbf{R}}$ is the unit vector along a perpendicular from wire 2
to wire 1. In order to express the observations of O in terms of the motion of
charge at the drift velocity $\mathbf{v}_-$ of the electrons, we use the
equation

$$i\hat{\mathbf{s}} = -nea\,\mathbf{v}_- \tag{24-15}$$


As before, the unit vector $\hat{\mathbf{s}}$ designates the sense of current
flow, $n$ is the number of free electrons per unit volume, $a$ is the
cross-sectional area of the wire, and $-e$ is the electron charge. This
equation can be rewritten in terms of the total mobile charge per unit length
of the wire, $\lambda_-$. We have

$$\lambda_- = \frac{\text{charge per unit volume} \times \text{volume}}{\text{length}} = \frac{(-ne)(aL)}{L} = -nea \tag{24-16}$$


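The quantity $\lambda_-$ has a negative value. In terms of $\lambda_-$,
Eq. (24-15) can be written

$$i\hat{\mathbf{s}} = \lambda_-\mathbf{v}_- \tag{24-17a}$$

Taking the dot product of each side of this equation with itself gives

$$i^2\,\hat{\mathbf{s}}\cdot\hat{\mathbf{s}} = \lambda_-^2\,\mathbf{v}_-\cdot\mathbf{v}_-$$

or

$$i^2 = \lambda_-^2 v_-^2 \tag{24-17b}$$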




Thus observer O can write the force per unit length exerted on wire 1 in the
Ampere apparatus, as given by Eq. (24-14), in the form

$$\frac{\mathbf{F}_m}{L} = -\frac{\mu_0}{2\pi}\frac{\lambda_-^2 v_-^2}{R}\,\hat{\mathbf{R}} \tag{24-18}$$


Now let us put ourselves in the shoes of observer O', who moves with
respect to the wires at a velocity $\mathbf{V} = \mathbf{v}_-$ exactly equal to
the drift velocity of the electrons in the wires. From his point of view, the
electrons are at rest. That is, his measurement of the drift velocity of the
electrons (using, say, an ideal microscope) yields the result
$\mathbf{v}'_- = 0$. Therefore O' can observe no magnetic field at wire 1
resulting from the motion of the electrons through wire 2, since he does not
perceive them to be moving. Furthermore, the electrons in wire 1 are not
moving from the point of view of O'. Thus even if there were a magnetic field
at wire 1, he would still have to conclude that it would exert no force on the
electrons in wire 1. Consequently, O' cannot attribute the force exerted on
the wires to magnetic effects arising from the motion of the electrons in the
two wires, as O does.

But  we  argue  that  O'  must  observe  the  same  force  per  unit  length  as  O. 
The  drift  speed  of  the  charge  carriers  in  metal  wires  carrying  reasonable 
currents  is  very  modest.  You  saw  in  Example  22-2  that  in  a no.  14  copper 
wire  carrying  a current  of  15  A,  the  drift  speed  is  something  like  0.5  mm/s. 
It  is  very  easy  for  an  observer  to  move  at  such  a speed  — indeed,  it  may  be 
difficult  to  stand  so  still  as  not  to  do  so.  Nevertheless,  we  most  certainly 
cannot  make  the  force  on  wire  1 disappear  just  by  moving  very  slowly. 

The newtonian solution to this apparent paradox lies in the presence
of the positive ions of the metal of which the wire is made. These ions are
fixed with respect to the wires. Therefore they are at rest from the point of
view of O, who is stationary with respect to the wires. But O', who is moving
with respect to the wires, sees the positive ions as moving with a velocity
$\mathbf{v}'_+ = -\mathbf{v}_-$. That is, he sees the positive ions moving in
one direction exactly as fast as O sees the electrons moving in the opposite
direction. From the newtonian point of view, moreover, the positive charge per
unit length $\lambda_+$ must be equal to $-\lambda_-$, since the wire is
electrically neutral overall. Thus O' can also claim that the force he
measures on wire 1 is a magnetic force. From his point of view, it is
associated with a current flowing through the two wires, which he calls $i'$.
In order to be consistent with O, he chooses the same direction
$\hat{\mathbf{s}}$ to denote the positive sense of current. In complete
analogy with Eqs. (24-15) and (24-16), he writes

$$i'\hat{\mathbf{s}} = nea\,\mathbf{v}'_+ \quad\text{and}\quad \lambda_+ = nea$$

where $e$ is the proton charge. Combining these two equations gives

$$i'\hat{\mathbf{s}} = \lambda_+\mathbf{v}'_+$$

But since we have $\lambda_+ = -\lambda_-$ and
$\mathbf{v}'_+ = -\mathbf{v}_-$, this becomes

$$i'\hat{\mathbf{s}} = -\lambda_-(-\mathbf{v}_-) = \lambda_-\mathbf{v}_-$$

Comparing this result with Eq. (24-17a) gives

$$i'\hat{\mathbf{s}} = i\hat{\mathbf{s}}$$

Thus, in spite of their different views as to which charges are moving in
which direction, O and O' see the same current flowing in the same sense.
They therefore agree completely that the magnetic force per unit length is
given by Eq. (24-14).

In order to make the basis for this agreement between O and O'
clearer, consider what O' observes if his velocity $\mathbf{V}$ with respect
to O is arbitrary (but still small). He now observes both the electrons and
the positive ions moving with respect to himself. He measures the velocity of
the electrons to be $\mathbf{v}'_-$ and that of the positive ions to be
$\mathbf{v}'_+$. From this new point of view, both kinds of charges carry
currents, which we call $i'_-$ and $i'_+$, respectively. These currents are
given by expressions just like Eq. (24-17a):

$$i'_+\hat{\mathbf{s}} = \lambda_+\mathbf{v}'_+ \quad\text{and}\quad i'_-\hat{\mathbf{s}} = \lambda_-\mathbf{v}'_-$$

The total current $i'$ flowing in either wire, as observed by O', is found by
taking the sum of these two currents, which gives

$$i'\hat{\mathbf{s}} = (i'_+ + i'_-)\,\hat{\mathbf{s}} = \lambda_+\mathbf{v}'_+ + \lambda_-\mathbf{v}'_-$$

But since $\lambda_+ = -\lambda_-$, this expression can be written in the form

$$i'\hat{\mathbf{s}} = \lambda_-(\mathbf{v}'_- - \mathbf{v}'_+) \tag{24-19}$$

There is a notable distinction between Eq. (24-19), which describes the
electric current from the point of view of an arbitrary observer, and the
more restricted Eq. (24-17a), which describes the current from the point of
view of an observer who is stationary with respect to the wires. In Eq.
(24-19), the observed charge-carrier velocities appear only in the form of a
difference. In the newtonian realm, the difference between the velocities of
two objects, as observed by the same observer, is independent of the velocity
of the observer with respect to either of the objects. [You can prove this
explicitly by using the Galilean velocity transformation, Eq. (3-57), to
rewrite Eq. (24-19) in terms of the velocities $\mathbf{v}_-$ and
$\mathbf{v}_+$ which are observed by O.] Consequently, the directed current,
which for observer O' moving with respect to the wires is the quantity
$i'\hat{\mathbf{s}}$ evaluated in Eq. (24-19), will have the same value for
all observers. In particular, we again have

$$i'\hat{\mathbf{s}} = i\hat{\mathbf{s}}$$

where $i\hat{\mathbf{s}}$ is the directed current observed by observer O, who
is at rest with respect to the wires. Taking the dot product of each side of
this equation with itself gives
$i'^2\,\hat{\mathbf{s}}\cdot\hat{\mathbf{s}} = i^2\,\hat{\mathbf{s}}\cdot\hat{\mathbf{s}}$,
or $i'^2 = i^2$. But the magnetic force per unit length depends on the square
of the current according to Eq. (24-14),
$\mathbf{F}_m/L = -(\mu_0/2\pi)(i^2/R)\hat{\mathbf{R}}$. Thus the magnetic
force per unit length will be the same for all observers, and in particular

$$\frac{\mathbf{F}'_m}{L} = \frac{\mathbf{F}_m}{L}$$

It would thus appear that we have resolved what seemed at first to be a
paradox: that the magnetic force $\mathbf{F}_m$ observed in the Ampere
experiment would depend on the velocity of the observer.
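
The observer independence of the velocity difference, which the bracketed
remark above leaves as an exercise, can be sketched in two lines. The sketch
assumes that the Galilean velocity transformation of Eq. (3-57) has the form
$\mathbf{v}' = \mathbf{v} - \mathbf{V}$, where $\mathbf{V}$ is the velocity of
O' relative to O, and it uses the fact that O sees the positive ions at rest,
$\mathbf{v}_+ = 0$.

```latex
\begin{align*}
\mathbf{v}'_- - \mathbf{v}'_+
  &= (\mathbf{v}_- - \mathbf{V}) - (\mathbf{v}_+ - \mathbf{V})
   = \mathbf{v}_- - \mathbf{v}_+ = \mathbf{v}_- \\
i'\hat{\mathbf{s}}
  &= \lambda_-(\mathbf{v}'_- - \mathbf{v}'_+)
   = \lambda_-\mathbf{v}_- = i\hat{\mathbf{s}}
\end{align*}
```

The observer velocity $\mathbf{V}$ cancels in the first line, so the result
holds for any small $\mathbf{V}$, not only for $\mathbf{V} = \mathbf{v}_-$.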

But this resolution of the paradox works only because there are two
equal and opposite charge densities present, so that the wires are
electrically neutral. We now repeat the analysis in an idealized but
"simplified" situation, where only one kind of charge is present. This will
lead to a new and fundamental insight into the nature of the magnetic force:
that it is essentially relativistic, even though the speeds involved are all
small. Afterward, we will reconsider the electrically neutral case from a
relativistic perspective.




Fig. 24-8 The idealized Ampere experiment repeated in such a way that
only positive charges are present. Two identical parallel beams of protons are
shown. In each the particle velocity is $\mathbf{v}$, and the total electric
charge per unit length is $\lambda$. The forces are explained in the text.


We repeat the Ampere experiment in idealized form, using electric
charge of one sign only. You can imagine two parallel beams of protons, of
arbitrary length, which contain identical charge densities and which move
with identical (small) velocities $\mathbf{v}$ with respect to the apparatus
that produces them, as shown in Fig. 24-8. Since the particle beams are not
electrically neutral, as ordinary current-carrying wires are, there will be an
electric force as well as a magnetic force exerted on each beam of protons.
The net force will tend to drive apart the beams, as we soon will see. So we
must imagine some external apparatus (not shown in the figure) which holds the
protons in each of the beams in the desired path by exerting on each beam
an inward force per unit length just great enough to balance the outward
force per unit length exerted on it on account of the presence of the other
proton beam. Prior to the experiment, O and O' stand at rest with respect
to this apparatus, adjust it, and record the balancing force per unit length
that it exerts on each of the particle beams, just as in Ampere's own
experiment.

In calculating the force per unit length exerted on beam 1 by the
presence of beam 2, the electric force must be taken into account as well as
the magnetic force. If the charge per unit length in either beam is
$\lambda$, Eq. (20-48) expresses the electric field $\boldsymbol{\mathcal{E}}$
of the charge in beam 2, at the location of beam 1, as

$$\boldsymbol{\mathcal{E}} = \frac{1}{2\pi\epsilon_0}\frac{\lambda}{R}\,\hat{\mathbf{R}} \tag{24-20}$$


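The resulting electric force per unit length exerted on beam 1 is

$$\frac{\mathbf{F}_e}{L} = \frac{1}{2\pi\epsilon_0}\frac{\lambda^2}{R}\,\hat{\mathbf{R}} \tag{24-21a}$$

The fact that $\mathbf{F}_e$ is parallel to $\hat{\mathbf{R}}$ implies that
the electric force is repulsive.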

From the point of view of observer O, who remains at rest with respect
to the apparatus, there is also a magnetic force exerted on beam 1. This
force per unit length is given by Eq. (24-18), from which we delete the
subscript minus signs because the charge carriers are now positive. We thus
have

$$\frac{\mathbf{F}_m}{L} = -\frac{\mu_0}{2\pi}\frac{\lambda^2 v^2}{R}\,\hat{\mathbf{R}} \tag{24-21b}$$


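The fact that $\mathbf{F}_m$ is antiparallel to $\hat{\mathbf{R}}$ implies
that the magnetic force is attractive. Observer O can write the Lorentz force
per unit length which she observes on beam 1, that is, the sum of the electric
and magnetic forces per unit length. Adding Eqs. (24-21a) and (24-21b), she
obtains

$$\frac{\mathbf{F}}{L} = \frac{\mathbf{F}_e}{L} + \frac{\mathbf{F}_m}{L} = \frac{\lambda^2}{R}\left(\frac{1}{2\pi\epsilon_0} - \frac{\mu_0}{2\pi}v^2\right)\hat{\mathbf{R}}$$

or

$$\frac{\mathbf{F}}{L} = \frac{1}{2\pi\epsilon_0}\frac{\lambda^2}{R}\left(1 - \mu_0\epsilon_0 v^2\right)\hat{\mathbf{R}}$$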


This equation can be written in more convenient form by using the relation
among $\mu_0$, $\epsilon_0$, and the speed of light $c$, given by Eq. (23-39).
This relation, which is derived in Chap. 27, is

$$\mu_0\epsilon_0 = \frac{1}{c^2} \tag{24-22}$$




By using it, the force per unit length on proton beam 1, as seen by observer
O, becomes

$$\frac{\mathbf{F}}{L} = \frac{1}{2\pi\epsilon_0}\frac{\lambda^2}{R}\left(1 - \frac{v^2}{c^2}\right)\hat{\mathbf{R}} \tag{24-23a}$$


or

$$\frac{\mathbf{F}}{L} = \frac{\mathbf{F}_e}{L}\left(1 - \frac{v^2}{c^2}\right) \tag{24-23b}$$


Since $(1 - v^2/c^2) > 0$, the Lorentz force $\mathbf{F}$ is repulsive, just
as the electric force $\mathbf{F}_e$ is.


In the meantime, observer O' moves with the beam so that the protons
are at rest from his point of view. What does O' see? He observes an
electric force, but the magnetic force must vanish for him, since from his
point of view the protons are not moving. The force per unit length he
measures is given by

$$\frac{\mathbf{F}'}{L'} = \frac{1}{2\pi\epsilon_0}\frac{\lambda'^2}{R}\,\hat{\mathbf{R}} \tag{24-24a}$$


or

$$\frac{\mathbf{F}'}{L'} = \frac{\mathbf{F}'_e}{L'} \tag{24-24b}$$


Now O and O' meet to compare their observations. They must have
observed the same Lorentz force per unit length, since this was balanced
throughout the experiment by the force per unit length exerted by the
apparatus they adjusted together beforehand (and can recheck if they
wish). During the experiment, O' continued to observe the beams as parallel,
just as O did. Consequently, they must agree to the fact that the observed
force per unit length is the same for both of them; that is,

$$\frac{\mathbf{F}}{L} = \frac{\mathbf{F}'}{L'}$$

On the basis of this agreement, the right sides of Eqs. (24-23a) and (24-24a)
can be set equal, to obtain

$$\lambda^2\left(1 - \frac{v^2}{c^2}\right) = \lambda'^2 \tag{24-25}$$

Since the quantity $1 - v^2/c^2$ is not equal to 1, it follows that $\lambda$
cannot equal $\lambda'$. Thus O and O' observe different values for the charge
per unit length (also called the charge density or charge concentration).

How can O and O' reconcile their observations? The quantities $\lambda$ and
$\lambda'$ can be written in the form

$$\lambda = \frac{Q}{L} \quad\text{and}\quad \lambda' = \frac{Q'}{L'}$$

where $Q$ is the total electric charge contained in a length $L$ of either
beam 1 or beam 2 as measured by O and $Q'$ is the total charge contained in a
length $L'$ of either beam as measured by O'. We assume that the charge on a
proton is independent of its velocity with respect to the observer. Therefore,
since O and O' are observing the same number of protons, they cannot
reconcile their observations by saying that $Q'$ differs from $Q$. They (and
we) must therefore set $Q' = Q$. Squaring the resulting expressions for the
charge densities $\lambda$ and $\lambda'$ yields

$$\lambda^2 = \frac{Q^2}{L^2} \quad\text{and}\quad \lambda'^2 = \frac{Q^2}{L'^2}$$

Substituting these values of $\lambda^2$ and $\lambda'^2$ into Eq. (24-25) and
then dividing through by $Q^2$ give the result

$$\frac{1}{L^2}\left(1 - \frac{v^2}{c^2}\right) = \frac{1}{L'^2}$$

or

$$L = \sqrt{1 - \frac{v^2}{c^2}}\;L' \tag{24-26}$$

This  is  the  Lorentz  contraction,  Eq.  (14-7)!  It  is  arrived  at  here,  however,  on 
the  basis  of  an  argument  quite  different  from  that  of  Sec.  14-5.  (The 
present  argument  is  much  closer  to  that  originally  followed  by  Einstein.) 
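
The consistency of Eqs. (24-23a), (24-24a), and (24-25) can be checked
numerically. The following is a minimal sketch, not part of the original
argument. The beam charge density, separation, and speed are hypothetical
illustrative values, and the speed is deliberately chosen large ($v = 0.6c$)
so that the factor $1 - v^2/c^2$ is visible in ordinary floating-point
arithmetic. The function and variable names are assumptions made for this
illustration.

```python
import math

# SI constants. EPS0 is derived from MU0 and C so that MU0*EPS0*C**2 == 1,
# which is the relation of Eq. (24-22).
C = 2.998e8                    # speed of light, m/s
MU0 = 4.0e-7 * math.pi         # permeability of free space, T*m/A
EPS0 = 1.0 / (MU0 * C**2)      # permittivity of free space, C^2/(N*m^2)


def force_per_length_moving(lam, v, R):
    """Net outward force per unit length seen by O (beams move at speed v)."""
    electric = lam**2 / (2.0 * math.pi * EPS0 * R)         # Eq. (24-21a)
    magnetic = MU0 * lam**2 * v**2 / (2.0 * math.pi * R)   # Eq. (24-21b)
    return electric - magnetic                             # repulsion minus attraction


def force_per_length_rest(lam_rest, R):
    """Purely electric force per unit length seen by O' (beams at rest)."""
    return lam_rest**2 / (2.0 * math.pi * EPS0 * R)        # Eq. (24-24a)


# Hypothetical values chosen only for illustration.
lam = 1.0e-9     # charge per unit length measured by O, C/m
R = 0.01         # beam separation, m
v = 0.6 * C      # beam speed relative to O, m/s

# Eq. (24-25): the charge density that O' must measure.
lam_rest = lam * math.sqrt(1.0 - v**2 / C**2)

f_moving = force_per_length_moving(lam, v, R)
f_rest = force_per_length_rest(lam_rest, R)

# Because Q' = Q, the ratio of charge densities is the ratio L/L' of Eq. (24-26).
length_ratio = lam_rest / lam

print(f_moving, f_rest, length_ratio)
```

With these inputs the two forces per unit length agree, and the length ratio
is $\sqrt{1 - 0.36} = 0.8$, as Eq. (24-26) requires.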


Starting from the experimental phenomenon of magnetism, we have
derived the Lorentz contraction. Alternatively, it is possible to assume the
Lorentz contraction and derive from it the properties of magnetism. Thus
we can say that magnetism is an essentially relativistic effect, in spite of
the fact that it is observable at quite small speeds. (Example 24-3 gives a
specific instance of what is meant by "quite small speeds.")

Comparison of Eqs. (24-23a), (24-23b), (24-24a), and (24-24b) in the
light of this discussion leads to the conclusion that it is the Lorentz force,
and not the electric or the magnetic force individually, which is fundamental.
Whether an observer interprets that force as being purely electric or purely
magnetic or some combination of electric and magnetic forces depends entirely
on the speed of the observed charges relative to the observer. In the
idealized experiment just discussed, observer O' sees all the charges at rest.
He therefore interprets the force per unit length $\mathbf{F}'/L'$ as the
purely electric repulsion given by Eqs. (24-24),

$$\frac{\mathbf{F}'}{L'} = \frac{\mathbf{F}'_e}{L'} = \frac{1}{2\pi\epsilon_0 R}\,\lambda'^2\,\hat{\mathbf{R}}$$


Observer O, on the other hand, sees the particle beam to be moving. While
the Lorentz force per unit length $\mathbf{F}/L$ she observes is exactly equal
to $\mathbf{F}'/L'$, she infers a greater electric repulsive force per unit
length,

$$\frac{\mathbf{F}_e}{L} = \frac{1}{2\pi\epsilon_0 R}\,\lambda^2\,\hat{\mathbf{R}}$$

than does O'. This is because the Lorentz contraction of the particle beams
from her point of view leads to a greater charge density $\lambda$ than that
observed by O'. Observer O reconciles her observations with those of O' by
invoking an attractive magnetic force per unit length

$$\frac{\mathbf{F}_m}{L} = -\frac{1}{2\pi\epsilon_0 R}\,\lambda^2\,\frac{v^2}{c^2}\,\hat{\mathbf{R}} = -\frac{\mu_0}{2\pi}\frac{\lambda^2 v^2}{R}\,\hat{\mathbf{R}}$$


And this force per unit length is just large enough to cancel the additional
force per unit length she infers, so that the total Lorentz force per unit
length is the same as that inferred by O'. Adding the two forces per unit
length, and using Eqs. (24-23), (24-24), and (24-25), we have

$$\frac{\mathbf{F}_e + \mathbf{F}_m}{L} = \frac{\mathbf{F}'_e}{L'}$$

or

$$\frac{\mathbf{F}}{L} = \frac{\mathbf{F}'}{L'}$$

for the Lorentz forces per unit length in the reference frames of the two
observers.


Our conclusions show that it is possible to transform a magnetic force
into an electric force simply by changing the state of motion of the observer
relative to the particles in the beam. Let us see how this applies to the
familiar case of an electric current flowing in a long copper wire. As a
"detector" let us use an electron which moves parallel to the wire, so that
its position with respect to the wire can always be described by the vector
$\mathbf{R}$. This vector is perpendicular to the wire and moves along with
the electron, as shown in Fig. 24-9. The velocity of the detector electron
with respect to the wire is made to be equal to the drift velocity
$\mathbf{v}$ of the electrons which carry current in the wire. How do two
observers O and O' see the force exerted on the detector electron?

Observer O is at rest with respect to the wire. By making tests with
charges which do not move in her reference frame, she can satisfy herself
that the wire is electrically neutral. That is, for O the positive and
negative charge densities per unit length add to zero:

$$\lambda_+ + \lambda_- = 0$$

Therefore O observes no electric forces. Nevertheless, she observes a force
acting on the moving detector electron, which she calls magnetic force. It is
directed toward the wire (in the $-\hat{\mathbf{R}}$ direction) and has
magnitude

$$F = F_m = ev\mathcal{B}$$

If we substitute the value of $\mathcal{B}$ for a current in a long, straight
wire, given by Eq. (23-42a), the magnitude of the force on the detector
electron is given by

$$F = ev\,\frac{\mu_0 i}{2\pi R} \tag{24-27}$$


Fig. 24-9 Applications of the Lorentz contraction to the analysis of
the force exerted on a charge which is moving in the vicinity of a
current-carrying wire. A current $i$ flows in the copper wire shown.
This current arises from the motion of the charge carriers, which
are electrons, with drift velocity $\mathbf{v}$. A "detector" electron moves
parallel to the wire with the same velocity $\mathbf{v}$. As explained in the
text, observer O, who is at rest with respect to the wire, and observer O',
whose velocity relative to O is $\mathbf{V} = \mathbf{v}$, find a common
relativistic basis for understanding the force exerted on the detector
electron.

In order to facilitate the later comparison of this result with the
observations of O', we make some substitutions. Using Eq. (24-22), we write
$\mu_0$ in the form

$$\mu_0 = \frac{1}{\epsilon_0 c^2}$$


The desired substitution for the current $i$ is obtained by beginning with
Eq. (24-17b), $i^2 = \lambda_-^2 v^2$. Since $\lambda_-^2 = \lambda_+^2$, this
equation can be written $i^2 = \lambda_+^2 v^2$. Taking the positive square
root of each side, we obtain

$$i = \lambda_+ v$$

Making these substitutions into Eq. (24-27), we have

$$F = ev\,\frac{1}{\epsilon_0 c^2}\,\frac{\lambda_+ v}{2\pi R} = \frac{1}{2\pi\epsilon_0}\frac{e\lambda_+}{R}\frac{v^2}{c^2}$$

Taking into account the direction of the force as well, O observes the
magnetic force

$$\mathbf{F} = \mathbf{F}_m = -\frac{1}{2\pi\epsilon_0}\frac{e\lambda_+}{R}\frac{v^2}{c^2}\,\hat{\mathbf{R}} \tag{24-28}$$


Now let us take the point of view of O', who is moving with respect to O
with a velocity $\mathbf{v}$ equal to that of the detector electron and of the
electrons in the wire. From the point of view of O', there can be no magnetic
force on the detector electron, since it has zero velocity. However, the
positive ions in the wire, which O perceives as stationary, are moving with
velocity $\mathbf{v}' = -\mathbf{v}$ according to O'. On account of this
motion, there is a (very tiny) Lorentz contraction, and O' will detect a
concentration of positive charge $\lambda'_+$ which is greater than the
concentration $\lambda_+$ observed by O. But the electrons in the wire are
stationary according to O'. He therefore does not measure the (very tiny)
increase in the magnitude of the negative charge density measured by O due to
the Lorentz contraction. Thus O' observes a concentration of electrons which
is less than the concentration observed by O. But remember that O began the
experiment by verifying that the force on a test charge which she held
stationary was zero. Thus O observes equal magnitudes for the densities of the
positive and negative charges in the wire. Since O' observes a greater
magnitude than O for the positive charge density and a smaller magnitude than
O for the negative charge density, O' observes a net positive charge on the
wire! He claims that it is the electric field associated with this net
positive charge which results in an electric force exerted on the detector
electron in the direction toward the wire.

The positive and negative charge concentrations $\lambda'_+$ and
$\lambda'_-$ measured by O' are related to the charge concentrations
$\lambda_+$ and $\lambda_-$ measured by O by means of the Lorentz contraction.
Specifically, each observer finds a contraction for that stream of charges
which is moving with respect to him and therefore a greater charge density
than that found by the other observer. To make this statement clear in
quantitative terms, we rewrite Eq. (24-25) with the following change of
notation. We give the name $\lambda_s$ to the density of charge (of whatever
sign) seen by one of the observers as stationary. The other observer sees the
same charge to be moving, and we call the charge density measured by that
observer $\lambda_m$. Equation (24-25) then becomes

$$\lambda_m^2\left(1 - \frac{v^2}{c^2}\right) = \lambda_s^2$$

where $v$ is the speed of the charges in question relative to the observer and
$c$ is the speed of light. Taking the square root of both sides of this
equation, we have

$$\lambda_m\left(1 - \frac{v^2}{c^2}\right)^{1/2} = \lambda_s \tag{24-29a}$$


We now apply Eq. (24-29a) to the positive charge in the wire. Observer
O sees this charge to be stationary. Hence the positive charge density which
she infers is $\lambda_+ = \lambda_s$. Observer O' sees the positive charge to
be moving, and thus for him $\lambda'_+ = \lambda_m$. In this case Eq. (24-29a)
becomes

$$\lambda'_+\left(1 - \frac{v^2}{c^2}\right)^{1/2} = \lambda_+$$

or

$$\lambda'_+ = \lambda_+\left(1 - \frac{v^2}{c^2}\right)^{-1/2} \tag{24-29b}$$

Next, consider the negative charge in the wire. Observer O sees this
charge to be moving, so that $\lambda_- = \lambda_m$ in this case. Observer O'
sees the negative charge to be stationary, and thus for him
$\lambda'_- = \lambda_s$. Hence for the negative charge Eq. (24-29a) becomes
$\lambda_-(1 - v^2/c^2)^{1/2} = \lambda'_-$, or

$$\lambda'_- = \lambda_-\left(1 - \frac{v^2}{c^2}\right)^{1/2} \tag{24-29c}$$

Now remember again that O has made a direct experimental observation of
the electrical neutrality of the wire. Thus she knows that
$\lambda_- + \lambda_+ = 0$, or $\lambda_- = -\lambda_+$. By making this
substitution, Eq. (24-29c) becomes

$$\lambda'_- = -\lambda_+\left(1 - \frac{v^2}{c^2}\right)^{1/2} \tag{24-29d}$$


Equations (24-29b) and (24-29d) express the charge densities for the
charges of both signs as observed by O' in terms of the charge density
$\lambda_+$ of the stationary charges observed by O, who is herself stationary
with respect to the wire as a whole. We now use the binomial expansion to
express these two equations in simpler form. Taking advantage of the fact
that the quantity $v^2/c^2$ is extremely small compared to 1, we use the
approximation

$$\left(1 - \frac{v^2}{c^2}\right)^{\pm 1/2} \simeq 1 \mp \frac{1}{2}\frac{v^2}{c^2} \quad\text{for}\quad \frac{v^2}{c^2} \ll 1$$

This gives us

$$\lambda'_+ \simeq \lambda_+\left(1 + \frac{v^2}{2c^2}\right)$$

and

$$\lambda'_- \simeq -\lambda_+\left(1 - \frac{v^2}{2c^2}\right)$$

The quantities $\lambda'_+$ and $\lambda'_-$ have values of opposite sign but,
unlike $\lambda_+$ and $\lambda_-$, they are not of equal magnitude. The net
charge concentration observed by O' in the wire is given by the sum

$$\lambda'_+ + \lambda'_- = \lambda_+\frac{v^2}{c^2} \tag{24-30}$$


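There is an electric field $\boldsymbol{\mathcal{E}}'$ associated with this
charge distribution. It is given by Eq. (24-20), if we substitute into that
equation the charge concentration $\lambda_+ v^2/c^2$. This gives

$$\boldsymbol{\mathcal{E}}' = \frac{1}{2\pi\epsilon_0}\frac{\lambda_+ v^2/c^2}{R}\,\hat{\mathbf{R}}$$

Thus the force $\mathbf{F}'$ on the detector electron, as observed by O', is
the electric force
$\mathbf{F}' = \mathbf{F}'_e = -e\boldsymbol{\mathcal{E}}'$, or

$$\mathbf{F}' = -\frac{1}{2\pi\epsilon_0}\frac{e\lambda_+}{R}\frac{v^2}{c^2}\,\hat{\mathbf{R}} \tag{24-31}$$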


Subject to the limits of accuracy imposed by the mathematical
approximation made on the basis of the assumption $v^2/c^2 \ll 1$, this
electric force is equal to the magnetic force $\mathbf{F} = \mathbf{F}_m$
observed by O and given by Eq. (24-28). (Observer O' also observes a magnetic
field associated with the motion of the positive charge in the wire which he
observes. Why need this magnetic field not be considered in calculating the
total force $\mathbf{F}'$ exerted on the detector electron from the point of
view of O'?)
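
The binomial approximation used to obtain Eq. (24-30) can be checked
numerically, with one caution: for drift speeds in a wire, $v^2/c^2$ is far
below the resolution of ordinary double-precision arithmetic. The sketch below
is an illustration only. It uses Python's standard `decimal` module with an
assumed working precision of 60 digits, and it takes
$v^2/c^2 = 5.89 \times 10^{-24}$, the value that arises in Example 24-3. The
variable names are hypothetical.

```python
from decimal import Decimal, getcontext

# Assumed working precision: enough digits to resolve terms of order 1e-24
# and their squares.
getcontext().prec = 60

x = Decimal("5.89e-24")   # v^2/c^2, value from Example 24-3
one = Decimal(1)

# Exact contraction factors appearing in Eqs. (24-29b) and (24-29d).
factor_plus = one / (one - x).sqrt()    # (1 - x)^(-1/2)
factor_minus = (one - x).sqrt()         # (1 - x)^(+1/2)

# Net charge density in units of lambda_+: exact value and the
# binomial result of Eq. (24-30).
net_exact = factor_plus - factor_minus
net_approx = x
relative_error = abs(net_exact - net_approx) / net_exact

# The same subtraction in double precision loses the effect entirely,
# because 1.0 - 5.89e-24 rounds to exactly 1.0.
xf = float(x)
net_float = (1.0 - xf) ** -0.5 - (1.0 - xf) ** 0.5

print(net_exact, relative_error, net_float)
```

The relative error of the approximation is of order $v^2/c^2$ itself, which is
why the approximation is harmless here. The double-precision result shows why
the net charge must be computed from the formula $\lambda_+ v^2/c^2$ rather
than by subtracting the two nearly equal densities directly.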


It  is  fair  to  ask,  How  can  relativistic  effects  be  significant  at  such  tiny 
speeds  as  that  of  an  electron  drifting  through  a copper  wire?  The  answer  is 
suggested  by  Example  24-3. 


EXAMPLE 24-3

a. According to Example 23-7, the room-temperature Hall coefficient of
copper is $R_H = -1.01 \times 10^{-10}$ m$^3$/C. Find $-ne$, the negative
charge per unit volume measured by an observer who is stationary with respect
to the wire.

b. The radius of a no. 14 copper wire is $r = 8.141 \times 10^{-4}$ m, and its
cross-sectional area is $a = 2.082 \times 10^{-6}$ m$^2$. Find $\lambda_-$, the
negative charge per unit length in this wire, from the point of view of an
observer at rest with respect to the wire.

c. As in Example 22-2, the wire carries its maximum rated current $i = 15.0$ A.
Find the magnitude of the magnetic field $\mathcal{B}$ at a distance
$R = 1.00$ cm from the wire.

d. Recalculate the electron drift speed $v$. (In Example 22-2, the value
$v = 0.55$ mm/s was based on a crude estimate of the charge per unit volume
$-ne$.)

e. Find the net charge per unit length $\lambda'_+ + \lambda'_-$ from the point
of view of an observer moving along the wire at a velocity $\mathbf{v}$ equal
to the drift velocity of the electrons in the wire.

f. Calculate the magnitude of the electric field $\mathcal{E}'$ at a distance
$R = 1.00$ cm from the wire, as seen by the moving observer.

g. An electron located 1.00 cm from the wire is moving parallel to the wire
with velocity $\mathbf{v}$ equal to the electron drift velocity in the wire.
Find the force exerted on this electron by the magnetic field calculated in
part c and by the electric field calculated in part f.




■ a. Equation (23-33), $R_H = 1/nq$, can be used to determine the negative
mobile charge per unit volume $-ne$. Setting $q = -e$, you have

$$-ne = \frac{1}{R_H} = \frac{1}{-1.01 \times 10^{-10}\ \text{m}^3/\text{C}} = -9.90 \times 10^{9}\ \text{C/m}^3$$

b. To find the negative mobile charge per unit length $\lambda_-$, you use
Eq. (24-16),

$$\lambda_- = -nea$$

and obtain

$$\lambda_- = -9.90 \times 10^{9}\ \text{C/m}^3 \times 2.08 \times 10^{-6}\ \text{m}^2 = -2.06 \times 10^{4}\ \text{C/m}$$

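c. You can calculate the magnitude of the magnetic field directly from the
current, using Eq. (23-42a),

$$\mathcal{B} = \frac{\mu_0 i}{2\pi R}$$

You have

$$\mathcal{B} = \frac{4\pi \times 10^{-7}\ \text{T}\cdot\text{m/A} \times 15.0\ \text{A}}{2\pi \times 1.00 \times 10^{-2}\ \text{m}} = 3.00 \times 10^{-4}\ \text{T}$$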


d. The most direct way to calculate the electron drift speed is to use Eq.
(24-17a), $i\hat{\mathbf{s}} = \lambda_-\mathbf{v}$. Considering magnitudes
only, you have

$$v = \frac{i}{|\lambda_-|} = \frac{15.0\ \text{A}}{2.06 \times 10^{4}\ \text{C/m}} = 7.28 \times 10^{-4}\ \text{m/s} = 0.728\ \text{mm/s}$$

(This is a little greater than the value of $v$ obtained in Example 22-2
because the crude guess underlying the result of that example overestimated
the charge per unit volume by about 30 percent.)

e. From the point of view of the moving observer, the Lorentz contraction
results in a net charge per unit length given by Eq. (24-30),

$$\lambda'_+ + \lambda'_- = \lambda_+\frac{v^2}{c^2}$$

Using $\lambda_+ = -\lambda_-$, and inserting the numerical values you have
found in parts b and d above, you find

$$\lambda'_+ + \lambda'_- = +2.06 \times 10^{4}\ \text{C/m} \times \left(\frac{7.28 \times 10^{-4}\ \text{m/s}}{3.00 \times 10^{8}\ \text{m/s}}\right)^2 = 1.21 \times 10^{-19}\ \text{C/m}$$

This net charge is quite small, the equivalent of only about one extra
electron charge per meter of wire, because $v^2/c^2$ has the very small value
$5.89 \times 10^{-24}$.

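f. You use Eq. (21-48),
$\mathcal{E} = (1/2\pi\epsilon_0)(\lambda/R)$, to calculate the magnitude of
the electric field in the reference frame of the moving observer. Using
$\lambda = \lambda'_+ + \lambda'_-$, you have

$$\mathcal{E}' = \frac{1}{2\pi\epsilon_0}\frac{\lambda'_+ + \lambda'_-}{R} = \frac{1}{2\pi \times 8.85 \times 10^{-12}\ \text{C}^2/(\text{N}\cdot\text{m}^2)} \times \frac{1.21 \times 10^{-19}\ \text{C/m}}{1.00 \times 10^{-2}\ \text{m}} = 2.18 \times 10^{-7}\ \text{N/C}$$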


g. Call the magnetic force exerted on the detector electron in the reference
frame of the stationary observer $F_m$. You have for its magnitude

$$F_m = ev\mathcal{B} = 1.60 \times 10^{-19}\ \text{C} \times 7.28 \times 10^{-4}\ \text{m/s} \times 3.00 \times 10^{-4}\ \text{T} = 3.49 \times 10^{-26}\ \text{N}$$

Call the electric force exerted on the detector electron in the reference
frame of the moving observer $F'_e$. You have for its magnitude

$$F'_e = e\mathcal{E}' = 1.60 \times 10^{-19}\ \text{C} \times 2.18 \times 10^{-7}\ \text{N/C} = 3.49 \times 10^{-26}\ \text{N}$$
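
The arithmetic of parts a through g can be reproduced in a few lines. The
following is a sketch only: the input data are those given in the example,
while the values used for $c$ and $e$ and all variable names are assumptions
of this illustration. The permittivity is derived from $\mu_0$ and $c$ through
Eq. (24-22), so that the two forces can be compared without rounding
differences between constants.

```python
import math

# Constants (SI). EPS0 follows from Eq. (24-22): MU0 * EPS0 = 1 / C**2.
C = 2.998e8                  # speed of light, m/s (assumed value)
MU0 = 4.0e-7 * math.pi       # permeability of free space, T*m/A
EPS0 = 1.0 / (MU0 * C**2)    # permittivity of free space, C^2/(N*m^2)
E_CHARGE = 1.602e-19         # magnitude of the electron charge, C (assumed value)

# Data given in the example.
R_H = -1.01e-10    # Hall coefficient of copper, m^3/C
AREA = 2.082e-6    # cross-sectional area of no. 14 wire, m^2
CURRENT = 15.0     # current in the wire, A
R = 1.00e-2        # distance from the wire, m

rho_neg = 1.0 / R_H                             # (a) -ne, C/m^3
lam_neg = rho_neg * AREA                        # (b) lambda_-, C/m
b_field = MU0 * CURRENT / (2.0 * math.pi * R)   # (c) magnetic field, T
v_drift = CURRENT / abs(lam_neg)                # (d) drift speed, m/s
lam_net = -lam_neg * v_drift**2 / C**2          # (e) Eq. (24-30), C/m
e_field = lam_net / (2.0 * math.pi * EPS0 * R)  # (f) electric field for O', N/C
f_magnetic = E_CHARGE * v_drift * b_field       # (g) force according to O, N
f_electric = E_CHARGE * e_field                 # (g) force according to O', N

for name, value in [("-ne", rho_neg), ("lambda_-", lam_neg), ("B", b_field),
                    ("v", v_drift), ("net lambda'", lam_net), ("E'", e_field),
                    ("F_m", f_magnetic), ("F'_e", f_electric)]:
    print(f"{name:12s} {value:.3e}")
```

Small differences in the third significant figure relative to the worked
solution come from the rounding of intermediate results there and from the
slightly different constants assumed here.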


The electric force $F'_e$ found by the moving observer is thus the same as
the magnetic force $F_m$ found by the stationary observer. This is not a large
force. But remember that the mass of an electron is less than $10^{-30}$ kg.
The acceleration experienced by a free electron subjected to the force would
be nearly 40,000 m/s$^2$.

Next, suppose that instead of a single detector electron, a second wire
carrying a current equal to that in the first were located parallel to the
first at a distance of 1 cm. Each of the vast number of electrons in the second
wire would experience a force equal to $F_m$ or $F'_e$ (depending on the
observer's point of view). The resulting force exerted on the wire as a whole
is by no means small.


Example 24-3 makes evident the fact that the very small Lorentz
contraction is significant because the electron and ion concentrations in a
wire are so large. The tiny Lorentz contraction thus significantly upsets the
very delicate balance involved in electrical neutrality. Thus, even though all
the velocities involved in the ordinary process of conduction of electric
current through a wire satisfy the condition $v^2/c^2 \ll 1$ extremely well,
the condition does not suffice in this case to place the phenomenon of
magnetism in the newtonian domain. This is because magnetism is an essentially
relativistic phenomenon.


24-3  MAGNETIC
DIPOLES  AND
THEIR  APPLICATIONS


In Example 23-9 and the discussion following it, a strong analogy was
found between the magnetic field of a circular loop of current-carrying
wire and the electric field of an electric dipole. At all points along the axis
of the circular current loop, the magnetic field is directed along the axis,
just as is the case for the electric field of the electric dipole along its axis.
Furthermore, at points along the loop’s axis and far from it, the magnitude
of the magnetic field has the same dependence on distance as the magnitude
of the electric field of an electric dipole at points along its axis and far
from it. In both cases the magnitude of the “far axial field” is inversely
proportional to the cube of the distance: in the former case the distance from
the center of the current loop and in the latter case the distance from the
center of the electric dipole. It was also found that the magnitude of the
magnetic field of a circular current loop is directly proportional to the
magnitude of its magnetic dipole moment $\mathbf{m}$, defined to be

$$\mathbf{m} = i a \hat{\mathbf{z}} \tag{24-32}$$

Here $i$ is the current flowing through the loop, $a$ is its area, and $\hat{\mathbf{z}}$ is the unit
vector in the direction of the magnetic field along its axis, given by the
right-hand rule for magnetic field lines, which relates the sense of these
lines to the sense of the associated current. The magnetic field of a current
loop depends on the vector $\mathbf{m}$. This vector is named the magnetic dipole
moment of the current loop since it plays a role analogous to that of the
electric dipole moment of an electric dipole in determining its electric field.
Even though it is not composed of two separated, equal but opposite magnetic
poles, a current loop is often called a magnetic dipole because of the
analogies just described.

A bar magnet is another object which often is called a magnetic dipole.
This application of the name is certainly appropriate. As described in Sec.
23-1, a bar magnet is composed, in effect, of two equal but opposite magnetic
poles, each located very near one end of the bar. A magnetic pole is
characterized by the quantity $\psi$ (lowercase Greek psi), called the magnetic
pole strength. If a bar magnet is sufficiently long, the magnetic pole
strength of one of its poles can be determined experimentally by placing
the pole in a magnetic field $\boldsymbol{\mathcal{B}}$, which extends over a region that does not
include the other pole, and measuring the force $\mathbf{F}$ exerted on the pole. The
relation between these quantities is

$$\mathbf{F} = \psi\boldsymbol{\mathcal{B}} \tag{24-33}$$

This relation is analogous to $\mathbf{F} = q\boldsymbol{\mathcal{E}}$, giving the force exerted on an electric
charge $q$ by an applied electric field $\boldsymbol{\mathcal{E}}$. Measurements show that the two
poles of a bar magnet have pole strengths of the same magnitude, with $\psi$
positive for the North pole and negative for the South pole.

Fig. 24-10 A circular loop enclosing an area $a$ carries an electric current $i$ in the
sense shown. It possesses a magnetic dipole moment $\mathbf{m} = ia\hat{\mathbf{z}}$. A bar magnet
having the same dipole moment $\mathbf{m}$ is shown.

The characteristics of a bar magnet as a whole are specified by its magnetic
dipole moment $\mathbf{m}$. For a bar magnet this quantity is defined to be

$$\mathbf{m} = |\psi|\,2\mathbf{d}_+ \tag{24-34}$$

Here $\mathbf{d}_+$ is the vector extending from the center of the bar magnet to its
positive (North) pole. Equation (24-34) is completely analogous to the one
determining the electric dipole moment of an electric dipole, $\mathbf{p} = |q|\,2\mathbf{d}_+$.
Analysis shows that at all points far from a bar magnet its magnetic field is
indistinguishable from that of a circular current loop, provided that the
magnetic dipole moment of the bar magnet, as determined by Eq. (24-34),
equals the magnetic dipole moment of the circular current loop, as determined
by Eq. (24-32). Figure 24-10 shows a circular current loop and its
equivalent bar magnet.
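
To make the equivalence concrete, the short sketch below computes the dipole moment of a circular loop from Eq. (24-32) and then the pole strength a bar magnet of given length would need, from Eq. (24-34), to have the same moment. The function names and all numerical values are assumptions chosen for illustration.

```python
import math

def loop_moment(current, radius):
    """Magnitude of m = i*a for a circular loop of the given radius, in A*m^2."""
    return current * math.pi * radius ** 2

def equivalent_pole_strength(moment, pole_separation):
    """|psi| of a bar magnet with m = |psi| * 2*d_plus.

    pole_separation is the full pole-to-pole distance 2*d_plus, in meters.
    """
    return moment / pole_separation

# Assumed example: 2.0 A in a loop of radius 5.0 cm, matched by a bar
# magnet whose poles are 10 cm apart.
m = loop_moment(2.0, 0.050)
psi = equivalent_pole_strength(m, 0.10)

print(f"m   = {m:.5f} A*m^2")
print(f"psi = {psi:.5f} A*m")
```

The pole strength comes out in ampere-meters, which is the same as newtons per tesla, in agreement with Eq. (24-33).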


EXAMPLE 24-4

Demonstrate that the two different definitions of magnetic dipole moment, $\mathbf{m} = ia\hat{\mathbf{z}}$
and $\mathbf{m} = |\psi|\,2\mathbf{d}_+$, are consistent in the sense that they both specify the same units for
$\mathbf{m}$.

■ Inspecting Eq. (24-8b), $\mathbf{F}/L = i\hat{\mathbf{s}} \times \boldsymbol{\mathcal{B}}$, and remembering that the magnitude of
any unit vector is the pure number 1, you can conclude that

$$\text{units of } i = \frac{\text{force}}{(\text{length})(\text{magnetic field})}$$

Hence since the units of $a$ are (length)², you have

$$\text{units of } ia\hat{\mathbf{z}} = \frac{(\text{force})(\text{length})^2}{(\text{length})(\text{magnetic field})} = \frac{(\text{force})(\text{length})}{\text{magnetic field}}$$

From Eq. (24-33), you find that

$$\text{units of } \psi = \frac{\text{force}}{\text{magnetic field}}$$

Thus since length is the unit of $\mathbf{d}_+$, you have

$$\text{units of } |\psi|\,2\mathbf{d}_+ = \frac{(\text{force})(\text{length})}{\text{magnetic field}}$$

Comparison shows that the two definitions of $\mathbf{m}$ are consistent as far as units are
concerned; that is, they are dimensionally consistent.
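
The same bookkeeping can be carried out mechanically by tracking exponents of the base dimensions M, L, T, and C. The sketch below is one possible way to do this; the helper functions `dim`, `mul`, `power`, and `div` are hypothetical names introduced here, not part of any library.

```python
def dim(**exponents):
    """A dimension as a mapping from base symbol (M, L, T, C) to its exponent."""
    return {k: v for k, v in exponents.items() if v != 0}

def mul(a, b):
    """Dimension of a product: exponents add."""
    out = dict(a)
    for k, v in b.items():
        out[k] = out.get(k, 0) + v
    return {k: v for k, v in out.items() if v != 0}

def power(a, n):
    """Dimension of a quantity raised to the integer power n."""
    return {k: v * n for k, v in a.items()}

def div(a, b):
    """Dimension of a quotient: exponents subtract."""
    return mul(a, power(b, -1))

LENGTH = dim(L=1)
FORCE = dim(M=1, L=1, T=-2)
CURRENT = dim(C=1, T=-1)

# Eq. (24-8b), F/L = i s x B, gives: field = force / (current * length).
FIELD = div(FORCE, mul(CURRENT, LENGTH))

# Current-loop definition, m = i a z: current times area.
m_loop = mul(CURRENT, power(LENGTH, 2))

# Bar-magnet definition, m = |psi| 2 d+: pole strength (force / field) times length.
POLE_STRENGTH = div(FORCE, FIELD)
m_bar = mul(POLE_STRENGTH, LENGTH)

print("field:", FIELD)
print("m (loop):", m_loop)
print("m (bar): ", m_bar)
print("consistent:", m_loop == m_bar)
```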


In this section we continue the investigation of magnetic dipoles,
both current loops and bar magnets. But here we are not concerned principally
with the fields of these objects. Instead we focus our attention on what
happens when a uniform external magnetic field is applied to magnetic dipoles.
Then we consider several important practical applications of the
behavior of magnetic dipoles when they are located in uniform applied
magnetic fields.

It is easier to analyze the effect of applying a magnetic field to a bar
magnet than it is to analyze the effect of applying the field to a current
loop. So we begin by considering a bar magnet lying in a uniform applied
magnetic field $\boldsymbol{\mathcal{B}}$. Figure 24-11 shows the magnet and the forces $\mathbf{F}_+$ and $\mathbf{F}_-$
exerted by the applied magnetic field on its North and South poles, respectively.
These forces have equal magnitude and opposite direction because
the field is uniform. Hence, it is apparent that no net force is exerted on a
magnetic dipole by a uniform applied magnetic field, just as is true in the
corresponding electric case.

But if the axis of the magnet is not oriented parallel to the applied
magnetic field, the magnet experiences a net torque. This torque is exemplified
by the tendency of a compass needle to orient itself in a
northerly-southerly direction. To evaluate the torque on the bar magnet
in Fig. 24-11, we choose the center of the magnet as the origin about which
torque is to be measured. Then the net torque $\mathbf{T}$ acting on the magnet is
the sum of the vector products of the vectors $\mathbf{d}_+$ and $\mathbf{d}_-$ which extend from
the central origin to the North and South poles, respectively, with the
forces $\mathbf{F}_+$ and $\mathbf{F}_-$ exerted on these poles. That is,

$$\mathbf{T} = \mathbf{d}_+ \times \mathbf{F}_+ + \mathbf{d}_- \times \mathbf{F}_-$$

We make use of Eq. (24-33) to determine $\mathbf{F}_+$ and $\mathbf{F}_-$, expressing the pole
strength of the North pole as $|\psi|$ and that of the South pole as $-|\psi|$. We have

$$\mathbf{F}_+ = |\psi|\boldsymbol{\mathcal{B}} \quad \text{and} \quad \mathbf{F}_- = -|\psi|\boldsymbol{\mathcal{B}}$$


Fig. 24-11 The equivalent bar magnet of Fig. 24-10 oriented so that the magnetic
moment $\mathbf{m}$ makes an angle $\theta$ with a uniform, externally applied magnetic
field $\boldsymbol{\mathcal{B}}$. The vector $\mathbf{d}_+$ extends from the center of the bar magnet to its North
pole, which is represented ideally as though it were localized at a single
point. The forces $\mathbf{F}_+$ and $\mathbf{F}_- = -\mathbf{F}_+$ exerted on the North and South poles,
respectively, are shown.


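Employing these equations and the relation $\mathbf{d}_- = -\mathbf{d}_+$, we obtain

$$\mathbf{T} = \mathbf{d}_+ \times |\psi|\boldsymbol{\mathcal{B}} + (-\mathbf{d}_+) \times (-|\psi|)\boldsymbol{\mathcal{B}} = |\psi|\mathbf{d}_+ \times \boldsymbol{\mathcal{B}} + |\psi|\mathbf{d}_+ \times \boldsymbol{\mathcal{B}}$$

We write this as

$$\mathbf{T} = |\psi|\,2\mathbf{d}_+ \times \boldsymbol{\mathcal{B}} \tag{24-35}$$

We then apply the definition of the magnetic dipole moment $\mathbf{m}$, given in
Eq. (24-34), to obtain the result

$$\mathbf{T} = \mathbf{m} \times \boldsymbol{\mathcal{B}} \tag{24-36}$$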


Fig. 24-12 (a) Perspective view and (b) edge-on view of a current-carrying
rectangular loop. The magnetic forces exerted on the sides of the loop are shown
as if they were concentrated at the centers of the respective sides; actually,
these forces are distributed uniformly along the sides.


The  torque  exerted  on  a bar  magnet  by  a uniform  applied  magnetic  field  equals  the 
cross  product  of  its  magnetic  dipole  moment  vector  and  the  applied  magnetic  field 
vector. 

Note that Eq. (24-36) is completely analogous to Eq. (21-32), $\mathbf{T} = \mathbf{p} \times \boldsymbol{\mathcal{E}}$,
which describes the net torque exerted on an electric dipole, of electric
dipole moment $\mathbf{p}$, by a uniform applied electric field $\boldsymbol{\mathcal{E}}$. Note also that
the verbal statement of Eq. (24-36) does not specify the origin about which
the torque is measured. The reason is that it has the same value, no matter
which origin is used, just as is the case for the electric dipole in the uniform
applied electric field. Verify this by repeating the calculation with a
different choice of origin.
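
The origin independence can also be checked numerically. The sketch below sums the torques of the two pole forces about an arbitrary origin and compares the result with $\mathbf{m} \times \boldsymbol{\mathcal{B}}$. The helper names and the numerical values of the pole strength, the vector $\mathbf{d}_+$, and the field are assumptions made only for illustration.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(k, a):
    return tuple(k * x for x in a)

def pole_torque(psi, d_plus, B, origin):
    """Net torque about `origin` on a bar magnet centered at (0, 0, 0)."""
    north, south = d_plus, scale(-1.0, d_plus)
    F_plus, F_minus = scale(psi, B), scale(-psi, B)   # Eq. (24-33) for each pole
    return add(cross(sub(north, origin), F_plus),
               cross(sub(south, origin), F_minus))

psi = 0.20                    # pole strength magnitude, A*m (assumed)
d_plus = (0.03, 0.04, 0.0)    # center-to-North-pole vector, m (assumed)
B = (0.0, 0.5, 0.0)           # uniform applied field, T (assumed)

m = scale(2 * psi, d_plus)    # Eq. (24-34)
T_formula = cross(m, B)       # Eq. (24-36)
T_center = pole_torque(psi, d_plus, B, (0.0, 0.0, 0.0))
T_shifted = pole_torque(psi, d_plus, B, (1.0, -2.0, 0.7))

print("m x B          :", T_formula)
print("about center   :", T_center)
print("about elsewhere:", T_shifted)
```

The shift of origin adds the same displacement to both lever arms, and since the two forces are equal and opposite, the extra contributions cancel.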


Now we will calculate the net torque acting on a current-carrying loop
to which a uniform magnetic field is applied. We will first treat not a circular
current loop, but the simpler case of a rectangular current loop.
Then we will see that it is easy to generalize our results so that they apply to
a current loop which is circular (or has any other shape).

Figure 24-12a shows a rectangular loop having sides of length $2g$ and
$h$. The loop carries a current $i$ and lies in a uniform applied magnetic field
$\boldsymbol{\mathcal{B}}$. The sides of length $h$ are perpendicular to $\boldsymbol{\mathcal{B}}$. The angle between $\boldsymbol{\mathcal{B}}$ and
the direction $\hat{\mathbf{z}}$ normal to the plane of the loop is $\theta$. Just as is the case for the
circular current loop illustrated in Fig. 24-10, the unit vector $\hat{\mathbf{z}}$ is in the
direction of the axial magnetic field of the loop (not the external magnetic
field applied to the loop), given by the right-hand rule for magnetic field
lines which relates the sense of these lines to that of the current $i$. We
choose the origin at the center of the loop.

According to Eq. (24-8b), the force per unit length $\mathbf{F}/L$ acting on a side
of the loop in which current flows in a direction $\hat{\mathbf{s}}$ is given by

$$\frac{\mathbf{F}}{L} = i\hat{\mathbf{s}} \times \boldsymbol{\mathcal{B}}$$

There are forces acting on each element of a particular side of the loop.
But since these forces are distributed uniformly along the side, their effect
is that of a single net force acting at the midpoint of the side, as indicated in
the figure. Applying the equation displayed above, we see that the net
forces $\mathbf{F}_g$ and $-\mathbf{F}_g$ exerted on the sides of the loop of length $2g$ are both
directed away from the origin at the center of the loop, and therefore they
exert no torque about that origin. The net forces $\mathbf{F}_h$ and $-\mathbf{F}_h$ exerted on
the sides of length $h$ are directed as shown in the figure. They do result in a
net torque about the origin, as can be seen more clearly from the end-on
sketch in Fig. 24-12b.




Fig. 24-13 A planar loop of circular shape carries current $i$ and lies in a
uniform, externally applied magnetic field $\boldsymbol{\mathcal{B}}$. For the purpose of evaluating the
torque acting on the loop, it is replaced by an infinite number of infinitesimally
thin, rectangular subloops, each of which carries current $i$ in the sense
indicated. The orientation of the rectangles is chosen so that their short sides are
perpendicular to $\boldsymbol{\mathcal{B}}$.


Since $\hat{\mathbf{s}}$ is perpendicular to $\boldsymbol{\mathcal{B}}$ for the sides of length $h$, the magnitude
of $\hat{\mathbf{s}} \times \boldsymbol{\mathcal{B}}$ is $\mathcal{B}$ for these sides. Hence $\mathbf{F}_h$ and $-\mathbf{F}_h$ have magnitudes satisfying
the relation $F_h/h = i\mathcal{B}$, or $F_h = ih\mathcal{B}$. The net torque which these forces
exert about the origin at the center of the loop has the magnitude

$$T = g\sin\theta\,F_h + g\sin\theta\,F_h = 2g\sin\theta\,ih\mathcal{B} = i\,2gh\sin\theta\,\mathcal{B} = ia\sin\theta\,\mathcal{B}$$

Here $a = 2gh$ is the area enclosed by the loop. It is evident from Fig. 24-12b
that the direction of this torque is into the plane of the page. Its direction,
as well as its magnitude, can be expressed by the vector equation

$$\mathbf{T} = ia\hat{\mathbf{z}} \times \boldsymbol{\mathcal{B}} \tag{24-37a}$$

To see that this equation gives the proper direction, apply the right-hand
rule for cross products to the vectors $\hat{\mathbf{z}}$ and $\boldsymbol{\mathcal{B}}$ illustrated in Fig. 24-12b.

Although  Eq.  (24-37a)  was  obtained  by  choosing  the  origin  about 
which  the  torque  T is  measured  to  be  at  the  center  of  the  rectangular  cur- 
rent loop,  it  is  valid  no  matter  where  the  origin  is  chosen.  You  can  verify 
this  by  repeating  the  calculation  leading  to  the  equation,  but  using  a dif- 
ferent choice  of  origin. 
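
One way to carry out this verification is numerically. The sketch below sums $\mathbf{r} \times \mathbf{F}$ over the four sides of the rectangle, with each side's net force applied at its midpoint, and compares the sum with $ia\hat{\mathbf{z}} \times \boldsymbol{\mathcal{B}}$ for two choices of origin. The coordinate axes, function names, and numerical values are assumptions chosen for the illustration: the field is taken along $+x$ and the sides of length $h$ along $y$.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(k, a):
    return tuple(k * x for x in a)

def rectangle_torque(i, g, h, theta, B0, origin=(0.0, 0.0, 0.0)):
    """Return (summed torque, i*a*z_hat x B) for a loop centered at (0, 0, 0).

    The sides of length h lie along y, perpendicular to the field B0 x_hat.
    The loop normal z_hat makes the angle theta with the field.
    """
    B = (B0, 0.0, 0.0)
    w = (0.0, 1.0, 0.0)                              # along the sides of length h
    z_hat = (math.cos(theta), 0.0, math.sin(theta))  # loop normal
    u = cross(w, z_hat)                              # along the sides of length 2g
    # Corners in the order the current passes them (right-handed about z_hat).
    corners = [add(scale(-g, u), scale(-h / 2, w)),
               add(scale(+g, u), scale(-h / 2, w)),
               add(scale(+g, u), scale(+h / 2, w)),
               add(scale(-g, u), scale(+h / 2, w))]
    torque = (0.0, 0.0, 0.0)
    for k in range(4):
        start, end = corners[k], corners[(k + 1) % 4]
        segment = sub(end, start)               # side length times direction s
        force = scale(i, cross(segment, B))     # F = i L s x B, Eq. (24-8b)
        midpoint = scale(0.5, add(start, end))  # net force acts at the midpoint
        torque = add(torque, cross(sub(midpoint, origin), force))
    formula = scale(i * 2 * g * h, cross(z_hat, B))  # Eq. (24-37a), a = 2gh
    return torque, formula

T_center, T_eq = rectangle_torque(3.0, 0.02, 0.05, 0.6, 0.4)
T_shifted, _ = rectangle_torque(3.0, 0.02, 0.05, 0.6, 0.4, origin=(0.3, -1.0, 2.0))
print("sum about center   :", T_center)
print("sum about elsewhere:", T_shifted)
print("i a z x B          :", T_eq)
```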

To extend our results to the case of a circular current loop, we construct
Fig. 24-13 in which such a loop is approximated as a set of adjacent
rectangular current loops. The rectangles are supposed to have infinitesimal
widths. All the short sides of these rectangles are perpendicular to the
uniform applied magnetic field $\boldsymbol{\mathcal{B}}$, just as was the case for the sides of
length $h$ in Fig. 24-12a. Since opposite currents of equal magnitude flow
through the adjoining long sides of all pairs of adjacent rectangular loops,
there will be a cancellation of all magnetic effects due to almost all the
lengths of the long sides of the rectangular loops. In other words, as far as
magnetic effects are concerned, the current distribution is completely
equivalent to that of a current flowing along a zigzag approximation to the
circular loop. This approximation is a very good one, indeed, because the
zigs and zags are supposed to be infinitesimal.

Applying Eq. (24-37a) to evaluate the torque $d\mathbf{T}$ exerted on an infinitesimal
rectangular current loop of area $da$, we obtain

$$d\mathbf{T} = i\,da\,\hat{\mathbf{z}} \times \boldsymbol{\mathcal{B}}$$

We write this as

$$d\mathbf{T} = i\hat{\mathbf{z}} \times \boldsymbol{\mathcal{B}}\,da$$

and then integrate over all the infinitesimal rectangular loops to obtain the
net torque $\mathbf{T}$ exerted on the circular loop:

$$\int_{\text{all rectangles}} d\mathbf{T} = \int_{\text{all rectangles}} i\hat{\mathbf{z}} \times \boldsymbol{\mathcal{B}}\,da$$

The integral on the left side immediately gives $\mathbf{T}$. (Can you see that there
would be difficulties in this step if the individual torques $d\mathbf{T}$ were not
origin-independent?) In the integral on the right side the factor $i\hat{\mathbf{z}} \times \boldsymbol{\mathcal{B}}$ is
the same for each infinitesimal rectangular loop, so it can be moved outside
the integral. Thus we have


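$$\mathbf{T} = i\hat{\mathbf{z}} \times \boldsymbol{\mathcal{B}} \int_{\text{all rectangles}} da = i\hat{\mathbf{z}} \times \boldsymbol{\mathcal{B}}\,a$$

or

$$\mathbf{T} = ia\hat{\mathbf{z}} \times \boldsymbol{\mathcal{B}} \tag{24-37b}$$

Here $a$ is the area of the circular current loop. This result is identical in
form to Eq. (24-37a), but it applies to a circular current loop rather than a
rectangular one.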
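
The integration over thin rectangles can be imitated with a finite number of strips. In the sketch below, a circle is tiled with $n$ rectangles of equal width, each as long as the chord at its center line, and the summed torque magnitude $i\,(\sum da)\,\mathcal{B}\sin\theta$ is compared with $i\pi r^2\mathcal{B}\sin\theta$. The function name and the numerical values are assumptions for illustration only.

```python
import math

def strip_area_sum(radius, n_strips):
    """Total area of n thin rectangles that tile a circle of the given radius.

    Each rectangle has width 2*radius/n_strips and a length equal to the
    chord of the circle at the rectangle's center line.
    """
    width = 2.0 * radius / n_strips
    total = 0.0
    for k in range(n_strips):
        y = -radius + (k + 0.5) * width               # center line of strip k
        chord = 2.0 * math.sqrt(radius ** 2 - y ** 2) # long side of the rectangle
        total += chord * width                        # da for this rectangle
    return total

i, radius, B, theta = 2.0, 0.050, 0.30, math.radians(40.0)  # assumed values
exact = i * math.pi * radius ** 2 * B * math.sin(theta)

for n in (10, 100, 10000):
    approx = i * strip_area_sum(radius, n) * B * math.sin(theta)
    print(f"n = {n:6d}  T = {approx:.8e} N*m  (exact {exact:.8e})")
```

As the strips become thinner, the sum approaches the result for the true circular area, which is the content of the limiting argument above.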

We have defined the quantity $ia\hat{\mathbf{z}}$ for a circular current loop to be its
magnetic dipole moment $\mathbf{m}$. Hence we can write Eq. (24-37b) as

$$\mathbf{T} = \mathbf{m} \times \boldsymbol{\mathcal{B}} \tag{24-37c}$$

Compare this equation with Eq. (24-36), which gives the torque exerted on
a bar magnet of magnetic dipole moment $\mathbf{m}$ by a uniform applied magnetic
field $\boldsymbol{\mathcal{B}}$. The comparison will show you that in both cases the torque is
described by the same equation.

We have demonstrated that a circular current loop of magnetic dipole
moment $\mathbf{m} = ia\hat{\mathbf{z}}$ and a bar magnet of equal magnetic dipole moment
$\mathbf{m} = |\psi|\,2\mathbf{d}_+$ are magnetically equivalent in two quite different ways: (1) The
magnetic fields of these objects are the same at all axial points far from them, as
was discussed at the beginning of this section. (2) A given uniform, external
magnetic field applied to them exerts the same torques on them, as we have
just found.

There is nothing in the argument leading to Eq. (24-37b) which
restricts it to a circular current loop. It should be apparent that if the
argument is repeated for the planar current loop of arbitrary shape shown in
Fig. 24-14, the same equation will be obtained. With this in mind, we define
the magnetic dipole moment $\mathbf{m}$ of an arbitrary planar loop of area $a$
carrying current $i$ to be

$$\mathbf{m} = ia\hat{\mathbf{z}} \tag{24-38}$$

The unit vector $\hat{\mathbf{z}}$ is normal to the plane of the loop and in the direction
agreeing with the sense of the magnetic field lines passing through the
loop, as a result of the current circulating around it. This definition allows
us to write

$$\mathbf{T} = \mathbf{m} \times \boldsymbol{\mathcal{B}} \tag{24-39}$$


Fig. 24-14 A planar current-carrying loop of arbitrary shape replaced by a set of
infinitesimally thin, rectangular current loops, as in Fig. 24-13. Here, again, the
orientation of the rectangles is chosen so that their short sides are perpendicular
to the uniform, externally applied magnetic field $\boldsymbol{\mathcal{B}}$.




The  torque  exerted  on  an  arbitrary  planar  current  loop  by  a uniform  applied  mag- 
netic field  equals  the  cross  product  of  its  magnetic  dipole  moment  vector  and  the  ap- 
plied magnetic  field  vector. 

A current-carrying loop in a uniform, applied magnetic field $\boldsymbol{\mathcal{B}}$ will
tend to orient itself so that its magnetic dipole moment $\mathbf{m}$ is parallel to $\boldsymbol{\mathcal{B}}$.
Then the torque on the loop will be $\mathbf{T} = \mathbf{m} \times \boldsymbol{\mathcal{B}} = 0$. This will occur when
the plane of the loop is normal to the direction of the applied magnetic
field. It is sometimes useful to state this result in terms of the magnetic flux
$\Phi_m$ associated with the applied magnetic field which penetrates the loop. In
the case of a planar loop of area $a$ located in a uniform, applied magnetic
field $\boldsymbol{\mathcal{B}}$, that flux will have the maximum value $\mathcal{B}a$ when $\mathbf{m}$ is parallel to $\boldsymbol{\mathcal{B}}$
and will be equal to zero when $\boldsymbol{\mathcal{B}}$ lies in the plane of the loop. Thus the
tendency of the loop to orient itself with $\mathbf{m}$ parallel to $\boldsymbol{\mathcal{B}}$ can be stated in the
alternative fashion: A current-carrying loop tends to orient itself so as to maximize
the externally applied magnetic flux penetrating it.
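
The connection between vanishing torque and maximum flux can be seen by tabulating both against the angle $\theta$ between $\mathbf{m}$ and $\boldsymbol{\mathcal{B}}$. The sketch below assumes the standard expressions $T = ia\mathcal{B}\sin\theta$ for the magnitude of $\mathbf{m} \times \boldsymbol{\mathcal{B}}$ and $\Phi_m = \mathcal{B}a\cos\theta$ for the flux through a planar loop in a uniform field; the function name and numerical values are illustrative.

```python
import math

def torque_and_flux(i, a, B, theta):
    """Torque magnitude |m x B| and applied flux through a planar loop.

    theta is the angle between the moment m = i*a*z_hat and a uniform
    field of magnitude B.
    """
    torque = i * a * B * math.sin(theta)
    flux = B * a * math.cos(theta)
    return torque, flux

i, a, B = 1.5, 4.0e-3, 0.20   # assumed: amperes, square meters, teslas
for deg in (0, 30, 60, 90):
    torque, flux = torque_and_flux(i, a, B, math.radians(deg))
    print(f"theta = {deg:2d} deg   T = {torque:.3e} N*m   flux = {flux:.3e} T*m^2")
```

The torque is zero at the orientation where the flux is largest, and largest where the flux is zero.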

We have shown that the torque exerted on a planar current loop
placed in a certain uniform magnetic field does not depend on its individual
characteristics, such as shape, area, and current carried. Instead it depends
on only the combination of characteristics specified by its magnetic
dipole moment $\mathbf{m} = ia\hat{\mathbf{z}}$. It can be shown, though we do not do so here, that
the same situation pertains to the distant parts of the current loop's own
magnetic field. That is, at points far from a loop of area $a$ carrying current $i$ its
magnetic field is determined by its characteristics only through the combination
$\mathbf{m} = ia\hat{\mathbf{z}}$.


According to Eq. (24-39), $\mathbf{T} = ia\hat{\mathbf{z}} \times \boldsymbol{\mathcal{B}}$, the torque $\mathbf{T}$ exerted on a loop
of wire located in a uniform, applied magnetic field $\boldsymbol{\mathcal{B}}$ is proportional to the
current $i$ carried by the wire. This fact is exploited in the instrument called
the galvanometer. The galvanometer can be used to measure electric currents,
in which case it is usually called an ammeter. When connected in
series with a relatively large resistor, the galvanometer can be used to measure
the (small) current flowing through the resistor and thus, by means of
Ohm's law, the voltage across it. In this application it is called a voltmeter.

Figure 24-15 illustrates the basic design principles of the galvanometer.
The horseshoe-shaped permanent magnet has pole faces cut to
form a cylindrical cavity, as shown. This cavity is nearly (but not quite)
filled with a somewhat smaller cylindrical core of soft iron, which is magnetized
by induction. The cylindrical symmetry of the remaining annular gap
is such that the magnetic field lines passing through the gap are everywhere
very nearly parallel to the radii of the core.

A small coil of $N$ turns, each having rectangular shape and area $a$, is
suspended so that two of its sides lie in the gap, as shown in the figure. The
current to be measured passes through very flexible leads and flows
through the coil. As a consequence of the current flow, there is a torque
exerted on the coil, tending to rotate it about its axis, which coincides with the
axis of the core. Over most of the angular distance around the annular gap,
the magnetic field $\boldsymbol{\mathcal{B}}$ has essentially constant magnitude. Its direction is
nearly parallel to the plane of the coil and thus perpendicular to the direction
$\hat{\mathbf{z}}$ normal to that plane. Thus, the magnitude $T$ of the torque on the
coil is independent of the angular position of the coil and is proportional to
the current $i$. Specifically, $T = Nia\mathcal{B}$, since each of the $N$ turns experiences
torque $ia\mathcal{B}$. This property is exploited to render the scale of the galvanometer
linear. That is, the deflection of the needle is made directly proportional
to the quantity being measured, as is usually desirable in measuring
instruments. This is accomplished as follows.

Fig. 24-15 (a) Schematic view of a galvanometer, showing the general arrangement
of the permanent magnet, the soft-iron core, the suspended coil and its
restraining spring, the attached indicator needle, and the scale. (b) Detail,
showing the locations of the magnet, the core, and the coil.

A small  spring  is  attached  to  the  loop.  There  are  several  ways  of  doing 
this;  Fig.  24-15  shows  a small  spiral  spring  like  the  hair  spring  on  the  bal- 
ance wheel  of  a watch.  The  spring  is  carefully  designed  to  conform  very 
closely  to  Hooke’s  law.  That  is,  its  restoring  torque  is,  as  closely  as  possible, 
directly  proportional  to  the  angle  through  which  the  spring  is  twisted.  (In 
order to render the galvanometer reading insensitive to the ambient temperature,
often the spring is made of an alloy whose elastic constants depend
only weakly on temperature.) As the coil turns, the spring is twisted
until the magnitude of its restoring torque is equal to that of the magnetic
dipole torque $T = Nia\mathcal{B}$. The coil comes to rest in this equilibrium orientation,
and the current is read on a calibrated scale. The greater the current,
the  greater  the  torque  and  the  larger  the  angle  through  which  the  coil 
turns  before  it  is  brought  to  rest  by  the  spring.  In  conformity  with  Hooke’s 
law,  that  angle  is  proportional  to  the  current. 
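
The balance between the spring torque and the magnetic torque can be written out explicitly. The sketch below assumes a spring obeying Hooke's law with a torsion constant `kappa` (a symbol introduced here, not used in the text), so that equilibrium requires `kappa * angle = N * i * a * B`. All numerical values are assumed for illustration.

```python
import math

def deflection(current, turns, area, B, kappa):
    """Equilibrium angle in radians of a galvanometer coil.

    The spring torque kappa*angle balances the magnetic torque N*i*a*B.
    """
    return turns * current * area * B / kappa

# Assumed coil and spring parameters.
N = 100          # number of turns
a = 2.0e-4       # area of each turn, m^2
B = 0.25         # field in the annular gap, T
kappa = 1.0e-7   # torsion constant of the spring, N*m/rad

for microamps in (2.0, 5.0, 10.0):
    angle = deflection(microamps * 1e-6, N, a, B, kappa)
    print(f"i = {microamps:5.1f} uA   angle = {math.degrees(angle):6.2f} deg")
```

Doubling the current doubles the deflection, which is what makes the scale linear.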

The  name  “galvanometer”  for  a current  detector  based  on  magnetic  dipole 
torque  was  first  used  by  Ampere  in  1820,  in  honor  of  the  Italian  physiologist  Luigi 
Galvani  (1737-1798),  who  in  1786  accidentally  discovered  (though  he  did  not 
properly  understand)  the  fact  that  an  electric  current  can  be  made  to  flow  between 
two  dissimilar  metals  which  are  in  contact  with  a conducting  solution.  However, 
the  galvanometer  did  not  take  the  form  just  described  until  the  1880s.  This  form, 
now  used  in  all  but  very  special  applications,  is  called  the  d’Arsonval  galvanom- 
eter after  its  inventor,  a French  physiologist. 




Fig. 24-16 (a) A galvanometer connected for use as a multirange
ammeter. The galvanometer proper is contained within the dashed
rectangle and may be considered to consist of an ideal resistanceless
galvanometer in series with an equivalent resistance $r_g$ equal
to the actual resistance of the device. The switch and the so-called
shunt resistors are usually located inside the instrument case. (b) A
galvanometer connected for use as a multirange voltmeter.


The galvanometer can be converted into a multirange ammeter by connecting
it in the circuit shown in Fig. 24-16a. The galvanometer itself has an “internal”
resistance $r_g$. This is mainly the resistance of the wire forming the coil. If the range
switch connects the galvanometer in parallel with a calibrated resistor $R$, you can
easily show that the total current $i$ flowing through the device is given by

$$i = i_g\,\frac{R + r_g}{R} \tag{24-40}$$

where $i_g$ is the current flowing through the galvanometer itself.

In Fig. 24-16b, the galvanometer is a part of a multirange voltmeter. It is
evident that the voltage $V$ required to make a current $i_g$ pass through the voltmeter is
given by the equation

$$V = i_g(R + r_g) \tag{24-41}$$
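
Equations (24-40) and (24-41) can be solved for the resistor needed to give a chosen full-scale range. The sketch below does this for an assumed galvanometer movement; the function names and the values of the full-scale coil current and internal resistance are illustrative, not taken from the text.

```python
def shunt_for_range(full_scale_current, i_g_full, r_g):
    """Parallel (shunt) resistor for an ammeter range.

    Solving Eq. (24-40), i = i_g (R + r_g)/R, for R gives
    R = i_g r_g / (i - i_g).
    """
    return i_g_full * r_g / (full_scale_current - i_g_full)

def series_for_range(full_scale_voltage, i_g_full, r_g):
    """Series resistor for a voltmeter range.

    Solving Eq. (24-41), V = i_g (R + r_g), for R gives R = V/i_g - r_g.
    """
    return full_scale_voltage / i_g_full - r_g

# Assumed movement: full-scale deflection at 50 uA, internal resistance 2000 ohm.
i_g_full, r_g = 50e-6, 2000.0

R_shunt = shunt_for_range(1.0, i_g_full, r_g)     # 1 A full-scale ammeter
R_series = series_for_range(10.0, i_g_full, r_g)  # 10 V full-scale voltmeter

print(f"shunt  R = {R_shunt:.6f} ohm")
print(f"series R = {R_series:.1f} ohm")
```

The shunt for a high current range is much smaller than $r_g$, while the series resistor for a voltmeter is much larger, so that the instrument disturbs the circuit under test as little as possible.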

Remarkably  rugged  galvanometers  of  moderate  sensitivity  can  be  made  quite 
cheaply.  Whatever  the  quantity  marked  off  on  the  scale  may  be,  the  great  majority 
of  meters  in  general  use  are  galvanometers.  Examples  are  the  fuel  level  and  oil 
pressure  gauges  on  automobile  dashboards  and  the  input  and  output  level  meters 
on  many  high-fidelity  sound  systems.  But  the  modern  tendency  is  to  replace  them 
with  electronic  systems  having  no  moving  parts,  which  use  digital  readouts. 

The operation of the electric motor is based, like that of the galvanometer,
on the torque exerted on a current-carrying loop in a magnetic
field. The simplest form is the permanent-magnet motor shown in Fig. 24-17.
The rotor consists of a rotatable shaft on which a coil of wire is mounted,
with the shaft parallel to the plane of the coil. The ends of the coil are
connected to the two sides of the split-ring commutator in the figure. Electric
contact is made to a source of emf in the outside world by means of sliding
contacts, called brushes. The rotor is surrounded by a permanent magnet,
whose pole faces only are illustrated. Its function is to provide the magnetic
field $\boldsymbol{\mathcal{B}}$.

An electric current supplied through the brushes flows in the coil,
whose magnetic dipole moment $\mathbf{m}$ is perpendicular to the plane of the coil.
The torque exerted by the external magnetic field $\boldsymbol{\mathcal{B}}$ sets the rotor into motion.
The commutator and brushes are aligned so that just as the magnetic
moment vector of the coil comes to be parallel with the magnetic field applied
to it by the permanent magnet, the electric contacts are reversed.
Thus the current flows through the coil in a reversed sense, and its new
magnetic moment vector is antiparallel to the externally applied magnetic
field. The resulting torque impels the rotor through another half turn, at
the end of which the current reverses again, and so on.

Fig. 24-17 A simple electric motor. Current flows from the positive
brush through one of the commutator segments and then via a
lead to the rotor coil, whose plane is shown as horizontal in the diagram.
From the coil the current flows through another lead to the
other commutator segment and then to the negative brush and
back to the source of emf (not shown). As a result of the current
flow through it, the coil possesses a magnetic dipole moment $\mathbf{m}$.
The torque $\mathbf{T} = \mathbf{m} \times \boldsymbol{\mathcal{B}}$ about the shaft axis tends to turn the coil
so that its plane becomes normal to $\boldsymbol{\mathcal{B}}$, thus maximizing the external
magnetic flux $\Phi_m$ penetrating it. At this point, $\mathbf{m}$ is parallel to
$\boldsymbol{\mathcal{B}}$, and the torque is zero. However, the rotation of the commutator
in this process results in a reversal of the sense of the current
through the coil, and the magnetic dipole moment reverses, becoming
antiparallel to $\boldsymbol{\mathcal{B}}$. The inertia of the rotor carries it past this
orientation, and the torque $\mathbf{T} = \mathbf{m} \times \boldsymbol{\mathcal{B}}$ tends to keep the shaft in
continuing rotation in the sense shown.
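
The role of the commutator can be illustrated by comparing the shaft torque over one full turn with and without the current reversal. The sketch below is a simplified model under stated assumptions: a single coil of fixed moment magnitude $m$, a uniform field of magnitude $\mathcal{B}$, and a rotor angle `phi` measured in the sense of rotation from the orientation in which $\mathbf{m}$ is antiparallel to $\boldsymbol{\mathcal{B}}$. The function names and numerical values are illustrative.

```python
import math

def torque_without_commutator(m, B, phi):
    """Shaft torque for a coil whose current never reverses.

    The moment is antiparallel to the field at phi = 0 and parallel at
    phi = pi; beyond pi the torque opposes further rotation.
    """
    return m * B * math.sin(phi)

def torque_with_commutator(m, B, phi):
    """Shaft torque when the current reverses every half turn.

    The reversal flips the moment at phi = pi, 2*pi, ..., so the torque
    always acts in the sense of rotation.
    """
    return m * B * abs(math.sin(phi))

def mean_over_turn(torque, m, B, n=3600):
    """Average of torque(m, B, phi) over one full turn, by midpoint sampling."""
    step = 2 * math.pi / n
    return sum(torque(m, B, (k + 0.5) * step) for k in range(n)) / n

m, B = 0.015, 0.30   # assumed moment (A*m^2) and field (T)
avg_plain = mean_over_turn(torque_without_commutator, m, B)
avg_comm = mean_over_turn(torque_with_commutator, m, B)

print(f"mean torque, no commutator  : {avg_plain:.3e} N*m")
print(f"mean torque, with commutator: {avg_comm:.3e} N*m")
print(f"2 m B / pi                  : {2 * m * B / math.pi:.3e} N*m")
```

Without the reversal the torque averages to zero over a turn, so the coil would merely oscillate about the aligned orientation. With it, the average is $2m\mathcal{B}/\pi$ in this model, and the rotation is sustained.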

Permanent-magnet  motors  are  suitable  only  in  small-scale  applications 
using  direct  current  (dc).  For  larger  motors  the  permanent  magnets  be- 
come too  bulky  and  expensive,  and  electromagnets  are  used  to  provide  the 
external  magnetic  field  which  acts  on  the  magnetic  dipole  moment  of  the 
rotating  coil  (or  coils)  of  the  rotor. 

Motors  are  used  to  do  mechanical  work.  What  is  the  mechanism  by 
which  electric  energy  is  converted  into  the  kinetic  energy  of  motion  of  the 
shaft?  An  increase  in  the  output  work  done  by  the  motor — resulting,  say, 
from  an  increase  in  the  mechanical  load  attached  to  the  shaft — must  be  re- 
flected in  an  increase  in  the  work  required  to  pass  a quantity  of  electric 
charge through the rotor coil. This work must be independent of the Joule
heating, $i^2R$, which has to do with the resistance of the wire in the coil and
which thus has nothing to do with the mechanical work done by the motor.
The answer to the question cannot lie in the equations with which we have
been dealing, such as Eq. (24-8b), $\mathbf{F}/L = i\,\hat{\mathbf{s}} \times \boldsymbol{\mathcal{B}}$. This equation describes
the  magnetic  force  exerted  on  moving  charges  in  a direction  perpendicular 
to  their  motion.  But  the  performance  of  mechanical  work  requires  that  the 
applied  force  have  a component  along  their  direction  of  motion.  Another 
fundamental  law  of  electromagnetism  is  needed  to  complete  the  explana- 
tion. This  law,  called  Faraday’s  law,  is  discussed  in  Chap.  25. 




24-4 AMPERE’S CONJECTURE AND DIAMAGNETISM


Fig. 24-18 (a) A ring-shaped “horseshoe” magnet. Shown are a few field
lines, representing schematically the pattern of the magnetic field of the
magnet in the region of the gap in the ring. In principle, such a magnet could
be made by bending a bar magnet into a nearly, but not quite, closed ring. (b)
The pattern of the magnetic field in the region of the gap of this
not-quite-closed toroidal current-carrying coil is the same as that of the
permanent magnet of part a. In principle, such a coil could be made by bending
a solenoid into a nearly, but not quite, closed ring.


The time has now come to bridge the tantalizing gap that remains between
what seem to be two quite different sources of magnetic fields. A bar
magnet has a magnetic dipole moment given by Eq. (24-34), $\mathbf{m} = |\psi|\,2\mathbf{d}_+$,
where $|\psi|$ is the magnitude of the pole strength of either pole and $\mathbf{d}_+$ is the
vector from its center to its North pole. A current-carrying loop of wire has
a magnetic dipole moment given by Eq. (24-32), $\mathbf{m} = ia\hat{\mathbf{z}}$, where $i$ is the
current flowing through the wire, $a$ is the area of the loop, and $\hat{\mathbf{z}}$ is the
direction of the axial magnetic field. The magnetic field of a bar magnet looks
just  like  that  of  a solenoid — that  is,  a “stack  of  loops”  of  the  same  length 
and  thickness — carrying  the  proper  amount  of  current.  Similarly,  the  mag- 
netic field  of  a horseshoe  magnet  can  be  imitated  by  cutting  a small  piece 
out  of  a current-carrying  toroid,  as  in  Fig.  24-18.  Thus  magnetic  fields  arise 
from  magnetic  solids,  where  we  cannot  see  directly  what  goes  on  “inside.” 
Magnetic  fields  also  arise  from  various  configurations  of  current-carrying 
wires,  where  we  can  “see”  better. 

As  early  as  1820  — the  year  in  which  Oersted,  Ampere,  Biot,  and  Sa- 
vart  first  investigated  the  connections  between  electric  currents  and  mag- 
netic fields — Ampere  had  a brilliant  insight  which  to  this  day  has  not  been 
pursued to exhaustion. Ampere’s conjecture is easy to state: The magnetic
field of permanent magnets arises from the flow of electric currents within the
magnetic material.


How big are the internal currents? Consider, for example, the bar
magnet shown in Fig. 24-19a, which has length $2d$, cross-sectional area $a$,
and pole strength $|\psi|$. Its dipole moment has magnitude $m = |\psi|\,2d$. This is
equal to the magnitude $m = ia$ of the magnetic dipole moment of a loop of
cross-sectional area $a$ which carries a current $i$. But suppose we identify the
cross-sectional area of the loop with that of the magnet. That is, we assume
that the “loop” is the surface of the magnet itself. According to Ampere's
conjecture, a sheet of current $i_s$, called the amperean surface current,
circulates indefinitely around the magnet as shown in Fig. 24-19b. While the
current cannot be measured directly, its magnitude can be inferred from
the dipole moment of the magnet, which can be measured. The current is

$$i_s = \frac{m}{a} \tag{24-42}$$


Fig. 24-19 Ampere’s conjecture. (a) A bar magnet of cross-sectional area $a$
and length $2d$ has pole strength of magnitude $|\psi|$. Its magnetic dipole
moment is thus $\mathbf{m} = |\psi|\,2\mathbf{d}_+$. (b) The same magnet is thought of as a bar
surrounded by a thin conducting sheet which carries a uniformly distributed
amperean surface current $i_s$, having a dipole moment $\mathbf{m} = i_s a\hat{\mathbf{z}}$. While this
current cannot be observed directly, the magnetic field of the actual magnet
can be accounted for completely if it is assumed that $i_s = |\psi|\,2d/a$.




The similarity of the hypothetical picture of Fig. 24-19b to a solenoid
wound  with  a single  turn  of  wide,  flat  conducting  ribbon  is  striking. 

Example  24-5  will  give  you  a feel  for  the  magnitude  of  the  amperean 
surface  current  in  a typical  situation  involving  a permanent  magnet. 


EXAMPLE  24-5 

The needle of a small magnetic pocket compass has length $l = 3$ cm. It has
rectangular cross section, and its width $w$ and thickness $t$ are both small
compared to its length. When the needle is disturbed slightly, it oscillates
about its equilibrium north-south orientation with a frequency $\nu = 2$ Hz. If
you take the value of the horizontal component of the earth’s magnetic field to
be $\mathcal{B}_h = 5 \times 10^{-5}$ T, what is the value of the amperean surface current
$i_s$ which circulates around the axis of the needle? Take the density of the
steel of which the needle is made to be $\rho = 8 \times 10^3$ kg/m$^3$.

■ You think of the compass needle as a torsion pendulum, in which the restoring
torque is provided not by a torsion fiber, but by the magnetic forces exerted on the
magnetized needle by the earth’s magnetic field. If the magnetic dipole moment of
the needle is $\mathbf{m}$, the torque is given by Eq. (24-37), $\mathbf{T} = \mathbf{m} \times \boldsymbol{\mathcal{B}}$. In this case, only the
horizontal component of the earth’s magnetic field is effective, since the needle
can turn freely only in a horizontal plane. So you can write the magnitudes of the
quantities given in Eq. (24-37) in the form

$$T = m\mathcal{B}_h \sin\theta \tag{24-43a}$$

where $\theta$ is the (smaller) angle between the magnetic field lines of the earth (which
run roughly south to north) and the orientation of the needle at any particular
instant. For small angular displacements $\theta$, you can use the approximation
$\sin\theta \approx \theta$, so that

$$T = m\mathcal{B}_h\,\theta \tag{24-43b}$$

Since $m$ and $\mathcal{B}_h$ are constants, the torque $T$ obeys the rotational form of Hooke’s
law, being directly proportional to the angular displacement $\theta$. The torsion constant
is $\kappa = m\mathcal{B}_h$.

The oscillation frequency of a torsion pendulum is given by Eq. (10-26),
$\nu = (1/2\pi)(\kappa/I)^{1/2}$, where $\kappa$ is the torsion constant and $I$ is the moment of
inertia about the vertical axis through the center of the needle. In Example 10-1,
you found that the moment of inertia of a thin bar about a perpendicular axis
through its center is given by $I = \tfrac{1}{12}Ml^2$, where $M$ is the mass of the bar and
$l$ is its length. For the compass needle, you find $M$ by multiplying its volume
$lwt$, where $w$ is its width and $t$ its thickness, by the density $\rho$, obtaining
$M = \rho lwt$. The moment of inertia is thus

$$I = \tfrac{1}{12}\,\rho w t l^3$$

Substituting the values of the torsion constant $\kappa$ and the moment of inertia $I$
thus found into the expression for the frequency $\nu$, you obtain

$$\nu = \frac{1}{2\pi}\left(\frac{12\,m\mathcal{B}_h}{\rho w t l^3}\right)^{1/2}$$

Solving this equation for the magnitude $m$ of the magnetic dipole moment of the
needle, you find

$$m = \frac{\pi^2\nu^2\rho w t l^3}{3\mathcal{B}_h}$$

Now you can use Eq. (24-42) to find the amperean surface current $i_s = m/a$.
Since the cross-sectional area of the needle is $a = wt$, you have

$$i_s = \frac{\pi^2\nu^2\rho l^3}{3\mathcal{B}_h} \tag{24-44}$$




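You have numerical values for all the quantities on the right side of this equation.
You can thus calculate the value

$$i_s = \frac{\pi^2 \times (2\ \text{Hz})^2 \times 8 \times 10^3\ \text{kg/m}^3 \times (3 \times 10^{-2}\ \text{m})^3}{3 \times 5 \times 10^{-5}\ \text{T}} = 6 \times 10^4\ \text{A}$$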


This  is  not  a small  current  to  be  flowing  around  the  surface  of  an  object  the  size  of  a 
pocket  compass  needle! 
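
The arithmetic of Example 24-5 can be checked with a short Python sketch. It is offered only as an illustration: the function names are invented here, and the needle width and thickness are hypothetical values (the example does not give them), included solely to show that they cancel out of the surface current.

```python
import math

def moment_from_oscillation(nu, rho, w, t, l, B_h):
    """Dipole moment m of a thin bar needle oscillating at frequency nu in field B_h."""
    inertia = rho * w * t * l**3 / 12           # I = (1/12) M l^2 with M = rho*l*w*t
    kappa = (2 * math.pi * nu) ** 2 * inertia   # from nu = (1/2pi) * sqrt(kappa / I)
    return kappa / B_h                          # torsion constant kappa = m * B_h

def surface_current(nu, rho, l, B_h):
    """Amperean surface current as given by Eq. (24-44)."""
    return math.pi**2 * nu**2 * rho * l**3 / (3 * B_h)

# Values given in Example 24-5 (SI units)
nu, rho, l, B_h = 2.0, 8e3, 3e-2, 5e-5
# Hypothetical cross section; it cancels out of the surface current
w, t = 2e-3, 0.5e-3

m = moment_from_oscillation(nu, rho, w, t, l, B_h)
i_s = surface_current(nu, rho, l, B_h)
print(f"m   = {m:.2e} A·m^2")
print(f"i_s = {i_s:.1e} A")
```

Dividing `m` by the cross-sectional area `w * t` reproduces `i_s`, which rounds to the value quoted in the example.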


Example  24-5  shows  that  amperean  surface  currents  associated  with 
ordinary permanent magnets are, in general, quite large. That is, given a
permanent  magnet  of  a certain  size,  it  would  take  a considerable  current, 
flowing  through  a hollow-core  solenoid  of  the  same  size  and  shape,  to  pro- 
duce the  same  magnetic  field. 

What  is  the  origin  of  the  internal  current  associated  with  magnetism  in 
matter?  Here  we  can  pursue  the  answer  only  partially,  by  considering  qual- 
itatively the  three  general  kinds  of  magnetic  behavior  in  matter.  They  are 
called  diamagnetism,  paramagnetism,  and  ferromagnetism. 

Most  matter  exhibits  the  property  of  diamagnetism,  to  which  we  de- 
vote most  of  the  remainder  of  this  section.  (We  treat  paramagnetism  and 
ferromagnetism in Sec. 24-5.) When a sample of diamagnetic material is
placed in an external magnetic field $\boldsymbol{\mathcal{B}}_0$, the magnetic field $\boldsymbol{\mathcal{B}}_{\text{int}}$ measured
inside the sample (say, in a tiny cavity hollowed out in the sample) is slightly
smaller than $\boldsymbol{\mathcal{B}}_0$. This behavior is analogous to that of a dielectric, inside
which the electric field $\boldsymbol{\mathcal{E}}_{\text{int}}$ is smaller than the applied external electric
field $\boldsymbol{\mathcal{E}}_0$ (see Sec. 21-7). Diamagnetism is, in most cases, a rather weak effect.
The  metal  bismuth,  however,  is  notable  for  exhibiting  diamagnetism  rela- 
tively strongly. 

While  diamagnetism  is  a property  of  all  matter,  it  is  overwhelmed  in 
many substances by an opposing effect called paramagnetism. In paramagnetic
materials located in an external magnetic field, the internal field $\mathcal{B}_{\text{int}}$
is slightly larger than $\mathcal{B}_0$.

Both diamagnetism and paramagnetism are examples of induced
magnetism. In both cases, the existence of a measurable macroscopic magnetic
moment depends on the presence of the external magnetic field $\boldsymbol{\mathcal{B}}_0$.
That is, if $\boldsymbol{\mathcal{B}}_0$ is zero, then the induced field will be zero as well. Besides
these rather small effects, there is also the relatively rare, but dramatic and
well-known, effect called ferromagnetism, the kind of magnetism exhibited by
iron and other materials we usually think of as “magnetic materials.” While
ferromagnetism is a special case of paramagnetism, there are important
differences. One difference is that even in the absence of an externally applied
magnetic field, a strong spontaneously induced field $\boldsymbol{\mathcal{B}}_{\text{int}}$ can remain under
proper conditions. This is the phenomenon called permanent magnetism.


To gain insight into the phenomenon of diamagnetism, we will consider
the simplest possible (hypothetical) case. Suppose that a “sample” of
matter consists of a single electron moving with a speed $v$, typical of the
speeds of electrons in solids, inside a box so large that the electron rarely
collides with the walls. When an external magnetic field $\boldsymbol{\mathcal{B}}_0$ is applied, the
electron moves in a helical path, as shown in Sec. 23-2. If the magnitude of
the component of the thermal velocity in the plane perpendicular to $\boldsymbol{\mathcal{B}}_0$ is
$v_\perp$, the radius $r$ of the helical path, given by Eq. (23-9), is the “cyclotron
radius”

$$r = \frac{m_e}{e}\,\frac{v_\perp}{\mathcal{B}_0} \tag{24-45}$$




Fig. 24-20 (a) A free electron moves with velocity $\mathbf{v}$ in a region of uniform,
externally applied magnetic field $\boldsymbol{\mathcal{B}}_0$. If the component of its velocity in the
plane normal to $\boldsymbol{\mathcal{B}}_0$ is of magnitude $v_\perp$, it will revolve in the sense shown in a
circular orbit of radius $r = (m_e/e)(v_\perp/\mathcal{B}_0)$. (b) The revolution of the electron
in the sense shown in part a is equivalent to an electric current $i = ev_\perp/2\pi r$
having the opposite sense. Associated with this induced current is an induced
magnetic dipole moment $\mathbf{m}$ whose direction is antiparallel to that of $\boldsymbol{\mathcal{B}}_0$. At
or near the center of the electron orbit, the direction of the magnetic field of
the dipole is also antiparallel to $\boldsymbol{\mathcal{B}}_0$, as shown. This field is called the
demagnetizing field $\boldsymbol{\mathcal{B}}_d$.

where $m_e/e$ is the reciprocal of the charge-to-mass ratio of the electron. For
the purposes of this discussion, the motion of the electron along the magnetic
field can be ignored. Figure 24-20a shows the motion of the electron
in the plane perpendicular to $\boldsymbol{\mathcal{B}}_0$. The sense of rotation shown can be
verified by using the procedure developed in Example 23-1 and illustrated in
Fig. 23-10.

When an electron moves, there is a transportation of electric charge,
that is, an electric current. The electron revolving in the sense shown in
Fig. 24-20a is equivalent to a current $i$ flowing in the opposite sense, as
shown in Fig. 24-20b. This current is called the induced amperean current.
Connected with this loop current $i$ is a magnetic dipole moment of magnitude
$ia = i\pi r^2$, where $r$ is the path radius given by Eq. (24-45). This magnetic
moment is called an induced dipole moment. The word “induced” is
used because the existence of the dipole moment depends on the presence
of the externally applied magnetic field $\boldsymbol{\mathcal{B}}_0$. In the absence of the externally
applied field, the electron does not revolve, there is no amperean
current, and consequently there is no dipole moment. Using the right-hand
rule for magnetic field lines, which relates their sense to that of the
electric current with which they are associated, you can see that the direction
of the induced dipole moment is antiparallel to that of the externally
applied magnetic field $\boldsymbol{\mathcal{B}}_0$, as shown in Fig. 24-20b. This is contrary to the
case of the electric dipole discussed in Sec. 21-4. The electric dipole moments
induced in a sample of matter by the application of an external electric
field line up parallel, not antiparallel, to the field. But there is really
no contradiction, because a magnetic dipole behaves analogously to an electric
dipole only “outside” the dipole. “Inside,” where the dipole field is
strongest, the effect is the same in both the electric and magnetic cases: The
induced dipole field opposes the external field and thus reduces the magnitude of the
resultant field. In the electric case, this opposing field is called the depolarizing
field $\boldsymbol{\mathcal{E}}_d$. In the magnetic case it is called the demagnetizing field $\boldsymbol{\mathcal{B}}_d$.
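
To get a feel for the sizes involved in this single-electron picture, the following Python sketch evaluates the cyclotron radius of Eq. (24-45), the equivalent loop current, and the induced dipole moment $ia = i\pi r^2$. It is a rough illustration only: the perpendicular speed and the applied field are assumed values chosen for the example, and the function names are not from the text.

```python
import math

E_CHARGE = 1.602176634e-19  # C, magnitude of the electron charge
E_MASS = 9.1093837e-31      # kg, electron mass

def cyclotron_radius(v_perp, B0):
    """Orbit radius from Eq. (24-45): r = (m_e / e) * (v_perp / B0)."""
    return (E_MASS / E_CHARGE) * v_perp / B0

def induced_current(v_perp, r):
    """Charge e passing a point once per revolution period 2*pi*r / v_perp."""
    return E_CHARGE * v_perp / (2 * math.pi * r)

def induced_moment(v_perp, B0):
    """Magnitude of the induced dipole moment, i * a with a = pi * r^2."""
    r = cyclotron_radius(v_perp, B0)
    return induced_current(v_perp, r) * math.pi * r**2

v_perp = 1.0e6  # m/s, assumed perpendicular speed
B0 = 1.0        # T, assumed applied field

r = cyclotron_radius(v_perp, B0)
print(f"r = {r:.2e} m")
print(f"i = {induced_current(v_perp, r):.2e} A")
print(f"m = {induced_moment(v_perp, B0):.2e} A·m^2")
```

Combining the three expressions algebraically gives $m = m_e v_\perp^2 / 2\mathcal{B}_0$, which the sketch reproduces numerically.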

The internal field $\boldsymbol{\mathcal{B}}_{\text{int}}$ near the center of the electron orbit is found by
taking the vector sum of the external and demagnetizing fields:

$$\boldsymbol{\mathcal{B}}_{\text{int}} = \boldsymbol{\mathcal{B}}_0 + \boldsymbol{\mathcal{B}}_d \tag{24-46}$$

Because of the opposing directions of $\boldsymbol{\mathcal{B}}_0$ and $\boldsymbol{\mathcal{B}}_d$, we have $\mathcal{B}_{\text{int}} < \mathcal{B}_0$. The
permeability $\mu$ of the region near the center of the electron orbit is defined
by the equation

$$\frac{1}{\mu}\,\mathcal{B}_{\text{int}} = \frac{1}{\mu_0}\,\mathcal{B}_0 \tag{24-47a}$$

For diamagnetic materials, the permeability $\mu$ is smaller than the permeability
of free space $\mu_0$, because $\mathcal{B}_{\text{int}}$ is smaller than $\mathcal{B}_0$. The name “permeability”
comes from the idea that $\mu$ is a measure of the ease with which magnetism
“permeates” matter. The units of $\mu$, like those of $\mu_0$, are tesla-meters
per ampere. The permeability of free space is the permeability
appropriate in the complete absence of matter.

The relative permeability $K_m$ is defined as the ratio of the permeability
of a sample of matter (in the present case, a single electron) to that of
free space. Thus we can write $K_m$ as the dimensionless quantity

$$K_m = \frac{\mu}{\mu_0} = \frac{\mathcal{B}_{\text{int}}}{\mathcal{B}_0} \tag{24-47b}$$

(In the hypothetical case of a single electron, $K_m$ and $\mu$ depend on position.



However, in real materials containing many electrons, this is not so because
$K_m$ and $\mu$ are average values over the sample.) The relative permeability $K_m$
is less than 1 for diamagnetic materials. Like the isolated free electron we
have been discussing, diamagnetic materials tend to exclude magnetic flux
from the region surrounded by the induced amperean current. In the
simple case of the isolated free electron, for example, this region is the
region within the electron orbit, as shown in Fig. 24-21. This tendency to
exclude magnetic flux is the fundamental property underlying diamagnetism.


Fig. 24-21 (a) The field line pattern of a magnetic dipole is shown
schematically. It is superimposed on the uniform external magnetic field lines
as though they existed separately. The magnetic dipole moment arises from the
orbital revolution of an electron induced by the external magnetic field. The
associated current loop is shown. (b) Schematic diagram of actual field line
pattern resulting from superposition of induced and external magnetic fields.


Fig. 24-22 The relation among the externally applied magnetic field $\boldsymbol{\mathcal{B}}_0$, the
amperean surface current $i_s$ to which it gives rise, and the resulting
magnetization $\mathbf{M}$.


Usually  we  are  interested  not  in  the  diamagnetism  of  a single  electron, 
but  in  the  magnetic  properties  of  a macroscopic  sample  of  matter.  The  in- 
duced magnetic  dipole  moment  of  the  entire  sample  will  be  the  vector  sum 
of  the  microscopic  magnetic  dipole  moments  which  it  contains.  In  dis- 
cussing macroscopic  effects,  it  is  useful  to  define  the  magnetic  dipole  mo- 
ment per  unit  volume,  which  is  called  the  magnetization  M.  That  is,  we 
define 

$$\mathbf{M} = \frac{\text{magnetic dipole moment vector of a homogeneous sample}}{\text{volume of the sample}} \tag{24-48}$$

The  SI  units  of  M are  ampere-square  meters  per  cubic  meter,  or  amperes 
per  meter. 

The magnetization has a direct physical meaning in terms of the amperean
surface current $i_s$. In Fig. 24-22, a cylindrical sample of material has
length $l$ and cross-sectional area $a$. It lies in an external magnetic field $\boldsymbol{\mathcal{B}}_0$.
As a result, there is induced inside the sample a field $\boldsymbol{\mathcal{B}}_d$ so that the
internal field $\boldsymbol{\mathcal{B}}_{\text{int}}$ is given by the sum

$$\boldsymbol{\mathcal{B}}_{\text{int}} = \boldsymbol{\mathcal{B}}_0 + \boldsymbol{\mathcal{B}}_d$$

With the field $\boldsymbol{\mathcal{B}}_d$ can be associated an amperean surface current $i_s$ in the
sample. The induced magnetic moment of the sample has magnitude equal
to $i_s a$. And since the magnetization $M$ is the magnetic dipole moment
per unit volume, the magnitude of the induced magnetic dipole moment of
the sample is also equal to the product of $M$ with the sample volume $al$.
Thus we have for the magnitude of the total magnetic dipole moment of
the sample

$$Mal = i_s a$$

Solving for the magnitude of the magnetization, we obtain

$$M = \frac{i_s}{l} \tag{24-49}$$

The quantity $i_s/l$ is the amperean surface current per unit length of
sample. (The current flows uniformly around the surface, so that the
external magnetic field induces a greater current in a longer sample.)

Now imagine that the surface layer of the cylindrical sample, where
the amperean current flows, is replaced by a single-turn solenoid winding.
The demagnetizing field magnitude $\mathcal{B}_d$ associated with the current $i_s$ in the
winding can be found from Eq. (23-62), $\mathcal{B} = \mu_0 ni$. This equation gives the
magnetic field inside a solenoid with $n$ turns per unit length carrying
current $i$. For the hypothetical single-turn solenoid we have $n = 1/l$, and the
induced field is

$$\mathcal{B}_d = \frac{\mu_0 i_s}{l} \tag{24-50a}$$

Combining this equation with Eq. (24-49) gives the result

$$\mathcal{B}_d = \mu_0 M \tag{24-50b}$$

which expresses the induced field in terms of the induced magnetic moment
per unit volume.


Experimental measurements made on many materials over a wide
range of conditions indicate that the magnitude $M$ of the induced magnetization
is directly proportional to that of the externally applied magnetic
field, $\mathcal{B}_0$. Since for such materials $M$ is proportional to $\mathcal{B}_0$, it follows from
Eq. (24-50b) that $\mathcal{B}_d$ is also proportional to $\mathcal{B}_0$. With this in mind, we define
the magnetic susceptibility $\chi$ (lowercase Greek chi) as the constant given
by the experimentally determinable ratio

$$\chi = \frac{\mu_0 M}{\mathcal{B}_0} \tag{24-51a}$$

And since according to Eq. (24-50b), $\mu_0 M = \mathcal{B}_d$, we have

$$\chi = \frac{\mathcal{B}_d}{\mathcal{B}_0} \tag{24-51b}$$


Since the magnetic susceptibility is a measure of the amount of magnetic
moment induced per unit of external magnetic field applied, it measures
the “ease” of magnetizing a given material. To put it slightly differently, it
measures the “susceptibility” of the material to magnetization. While $\chi$ is a
scalar, it is conventional to denote the fact that $\mathbf{M}$ and $\boldsymbol{\mathcal{B}}_d$ are antiparallel to
$\boldsymbol{\mathcal{B}}_0$ in diamagnetic materials by assigning negative values to their magnitudes,
as though they were signed scalars. Thus $\chi$ has a negative value for
diamagnetic materials.

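According to Eq. (24-47b), the relative permeability $K_m$ is defined as
the ratio of the resultant internal field $\mathcal{B}_{\text{int}}$ to the inducing field $\mathcal{B}_0$:

$$K_m = \frac{\mathcal{B}_{\text{int}}}{\mathcal{B}_0}$$

Writing $\mathcal{B}_{\text{int}} = \mathcal{B}_0 + \mathcal{B}_d$ and using $\chi = \mathcal{B}_d/\mathcal{B}_0$, we have

$$K_m = \frac{\mathcal{B}_0 + \mathcal{B}_d}{\mathcal{B}_0} = 1 + \chi \tag{24-52}$$

For diamagnetic materials, $K_m$ is less than 1, since $\chi$ is negative.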

Example  24-6  considers  the  values  of  the  magnetic  susceptibility  and 
the  relative  permeability  for  the  metal  bismuth,  which  exhibits  compar- 
atively strong  diamagnetic  properties. 


EXAMPLE  24-6 


The magnetic susceptibility of bismuth at room temperature is $\chi = -1.3 \times 10^{-5}$.
a. Find the relative permeability $K_m$ for this metal.

■ Using  Eq.  (24-52),  you  find 




$$K_m = 1 + \chi = 1 - 1.3 \times 10^{-5}$$


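Fig. 24-23 A view along the externally applied magnetic field $\boldsymbol{\mathcal{B}}_0$ of a planar
slice through a cylindrical container filled with free electrons. The orbits of
the electrons are shown. (Labels in the figure: amperean curve, with
$\oint \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = 0$; radius $r$; field direction inward.)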


or 


Km  = 0.999987 

b. A sample of bismuth at room temperature is placed inside a high-field
solenoid, so that it experiences an externally applied field $\mathcal{B}_0 = 10$ T. Evaluate the
demagnetizing field $\mathcal{B}_d$.

■ Solving Eq. (24-51b), $\chi = \mathcal{B}_d/\mathcal{B}_0$, for $\mathcal{B}_d$ gives you

$$\mathcal{B}_d = \chi\mathcal{B}_0 = -1.3 \times 10^{-5} \times 10\ \text{T} = -1.3 \times 10^{-4}\ \text{T}$$

The negative sign of $\mathcal{B}_d$ denotes the fact that the demagnetizing field is directed
opposite to the externally applied field $\boldsymbol{\mathcal{B}}_0$. In spite of the fact that $\mathcal{B}_0$ is quite large,
the magnitude of $\mathcal{B}_d$ is only between 2 and 3 times that of the relatively weak magnetic
field of the earth. According to Eq. (24-46), $\boldsymbol{\mathcal{B}}_{\text{int}} = \boldsymbol{\mathcal{B}}_0 + \boldsymbol{\mathcal{B}}_d$, the magnitude $\mathcal{B}_{\text{int}}$ of the
magnetic field inside the bismuth sample is very slightly smaller than $\mathcal{B}_0$, that of the
external field.
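
The numbers in Example 24-6 follow directly from Eqs. (24-51a), (24-51b), (24-52), and (24-46), and can be reproduced with a few lines of Python. This is a sketch for checking the arithmetic only. The function names are illustrative, and the last quantity computed, the magnetization $M$, is not asked for in the example; it is included here as an additional consequence of Eq. (24-51a).

```python
import math

MU_0 = 4 * math.pi * 1e-7  # T·m/A, permeability of free space

def relative_permeability(chi):
    """K_m = 1 + chi, Eq. (24-52)."""
    return 1 + chi

def demagnetizing_field(chi, B0):
    """Signed induced field B_d = chi * B0, from Eq. (24-51b)."""
    return chi * B0

def magnetization(chi, B0):
    """Signed magnetization M = chi * B0 / mu_0, from Eq. (24-51a)."""
    return chi * B0 / MU_0

chi = -1.3e-5  # susceptibility of bismuth quoted in the example
B0 = 10.0      # T, applied field in part b

K_m = relative_permeability(chi)
B_d = demagnetizing_field(chi, B0)
B_int = B0 + B_d  # Eq. (24-46), treating the fields as signed scalars
M = magnetization(chi, B0)

print(f"K_m   = {K_m:.6f}")
print(f"B_d   = {B_d:.1e} T")
print(f"B_int = {B_int:.5f} T")
print(f"M     = {M:.1f} A/m")
```

The negative signs of `B_d` and `M` carry the convention described in the text: both are directed opposite to the applied field in a diamagnetic material.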


We  are  now  prepared  to  consider  the  effect  of  an  externally  applied 
magnetic  field  on  a solid.  As  we  have  done  before,  we  make  a first  approxi- 
mation by  considering  the  solid  to  be  an  idealized  container  holding  a “gas” 
of free electrons. If the average magnetic dipole moment of the electrons is
$\langle\mathbf{m}\rangle$ and if there are $n$ electrons per unit volume, the magnetization (that is,
the magnetic dipole moment per unit volume) of the electron gas is given
by the expression

$$\mathbf{M} = n\langle\mathbf{m}\rangle \tag{24-53}$$

What  follows  is  a qualitative  discussion  of  the  magnetization  of  a sample  of 
solid,  considered  as  a “box”  of  free  electrons. 

Figure 24-23 shows a planar slice of a cylindrical box containing many
free electrons. There is an externally applied magnetic field $\boldsymbol{\mathcal{B}}_0$ directed
perpendicular to the slice, into the plane of the page (so that you are looking
along the direction of $\boldsymbol{\mathcal{B}}_0$). The cylindrical slice has been divided into
two  regions.  The  larger  region,  within  the  dot-dash  circle,  comprises  all  the 
electron  gas  lying  more  than  one  cyclotron  radius  r from  the  surface,  with  r 
being  given  by  Eq.  (24-45).  The  small  outer  region  lies  within  a distance  r of 
the  surface. 

First,  consider  the  inner  region.  All  the  electrons  revolve  in  their  orbits 
in  the  same  sense  (clockwise  in  the  figure)  under  the  influence  of  the  mag- 
netic field.  At  any  point  on  an  electron  orbit,  therefore,  the  revolution  of 
the  electron  amounts  to  a current  flowing  opposite  to  the  current  arising 
from  the  motion  of  its  immediate  neighbor.  A small  amperean  curve  like 
that  shown  in  the  figure  encloses  zero  net  current,  and  thus  the  circulation 
around  that  amperean  curve  is  zero.  At  any  point  well  inside  the  container, 
then,  there  is  no  net  current  to  contribute  to  an  induced  magnetic  field. 

Now  consider  what  happens  at  the  dot-dash  circle  in  Fig.  24-23.  If  we 
ignore  for  the  time  being  the  electrons  outside  this  circle,  the  small  currents 
of  the  outermost  electrons  taken  together  amount  to  a large  amperean  sur- 
face current  iinner  flowing  along  the  dot-dash  circle  in  a counterclockwise 
sense,  since  the  sense  of  the  current  is  opposite  to  that  of  the  circulation  of 
the  electrons.  The  analysis  is  essentially  the  same  as  that  accompanying  Fig. 
24-14,  where  the  magnetic  moment  of  a large  loop  is  found  to  be  equiva- 
lent to  that  of  many  very  small  subloops.  This  amperean  surface  current 




should have a macroscopic magnetization $\mathbf{M}_{\text{inner}}$ associated with it, directed
antiparallel to $\boldsymbol{\mathcal{B}}_0$. But now we must take into account the electrons in the
surface layer, outside the dot-dash circle. Because they are so close to the
surface, they never complete an orbit. Such an electron moves along an arc
of a circle in the clockwise sense, just like the inner electrons. But it collides
with the surface and rebounds in a random direction. Whatever the initial
direction of rebound, the electron begins a new clockwise orbit which again
brings it into collision with the wall. As you can see from the figure, the net
result is a transport of the electron in the counterclockwise sense around the
container and hence a net current in the clockwise sense. The magnetization
$\mathbf{M}_{\text{outer}}$ associated with this “outer” surface current $i_{\text{outer}}$ is parallel to $\boldsymbol{\mathcal{B}}_0$ and
tends to cancel the magnetization $\mathbf{M}_{\text{inner}}$ arising from the electrons in the
region inside the dot-dash circle, farther than one cyclotron radius from
the surface.

Using  a more  detailed  and  more  general  argument  along  these  lines, 
Hendrika  Johanna  van  Leeuwen,  a Dutch  graduate  student  working  with 
Lorentz, showed in 1919 that the two magnetizations always cancel exactly;
that is, $\mathbf{M}_{\text{inner}} + \mathbf{M}_{\text{outer}} = 0$. Stated more generally, this conclusion is called
van  Leeuwen’s  theorem:  No  system  of  electric  charges  which  strictly  obeys  the  laws 
of  newtonian  mechanics  can  exhibit  induced  magnetism. 


24-5 PARAMAGNETISM AND FERROMAGNETISM

In spite of van Leeuwen’s theorem, all materials do exhibit magnetic effects.
In most materials, the observed effects are diamagnetic. That is, the induced
field $\boldsymbol{\mathcal{B}}_d$ is directed so as to oppose the inducing field $\boldsymbol{\mathcal{B}}_0$, and the
susceptibility has a negative value. Diamagnetism does indeed arise from
the  influence  of  an  externally  applied  magnetic  field  on  the  motion  of 
electrons  in  matter.  All  matter  contains  electrons,  and  the  effect  is  univer- 
sal, although  the  details  vary  from  substance  to  substance.  It  is  specifically 
because  electrons  conform  to  the  laws  of  quantum  mechanics,  and  not  new- 
tonian mechanics,  that  diamagnetism  is  actually  observed.  Indeed,  all  mag- 
netic effects  in  matter  have  bases  which  are  explicitly  quantum-mechanical. 
The  remainder  of  this  chapter  must  therefore  be  limited  in  large  measure 
to  a qualitative  description  of  the  most  important  of  these  effects. 

Because  diamagnetism  is  a nonspecific  phenomenon  arising  from  the 
presence  in  matter  of  electrons,  there  is  a diamagnetic  effect  in  all  matter 
tending  to  produce  a diamagnetic  susceptibility  — that  is,  a susceptibility  of 
negative  value.  In  some  materials,  however,  the  diamagnetism  is  overshad- 
owed by  a larger  effect  whose  value  is  of  opposite  sign,  called  paramagne- 
tism, for  which  the  susceptibility  has  a positive  value.  Paramagnetism  de- 
pends on  the  presence  in  a sample  of  matter  of  permanent  atomic  or 
molecular  magnetic  dipole  moments.  To  give  one  example,  these  perma- 
nent moments  can  arise  from  atoms  which  have  odd  numbers  of  electrons, 
such  as  aluminum,  which  has  13.  Electrons  tend  to  pair  up  with  their  mag- 
netic moments  in  opposite  directions,  as  do  two  bar  magnets  put  close 
together.  When  two  bar  magnets  so  oriented  come  into  contact,  they  form  a 
closed  loop  through  which  all  the  magnetic  flux  can  circulate,  so  that  there 
is  no  magnetic  field  external  to  the  magnets.  In  the  same  way,  two  electrons 
oriented  so  that  they  have  equal,  but  opposite,  magnetic  moments  provide 
a closed  loop  for  magnetic  flux,  and  no  magnetic  field  can  be  observed  out- 
side the  immediate  vicinity  of  the  pair  of  electrons.  But  if  there  is  an  odd 
electron,  there  is  an  unpaired  magnetic  moment  left  over.  In  the  simplest 




case,  this  magnetic  moment  is  not  the  orbital  magnetic  moment  due  to  the 
revolution  of  the  electron  in  an  orbit  around  the  nucleus,  but  a purely 
quantum-mechanical  entity  called  electron-spin  magnetic  moment.  It  may  be 
thought  of  as  deriving  from  the  amperean  current  arising  from  the  rota- 
tion of  the  charge  on  the  electron  when  the  electron  rotates  or  “spins” 
around  its  own  “axis.”  But  such  a picture  should  not  be  taken  too  seriously, 
helpful  as  it  is  in  visualizing  the  effect.  The  electron-spin  magnetic  dipole 
moment  has  a fixed  magnitude  which  can  be  observed  experimentally.  It 
is called the Bohr magneton $m_B$, and its value is

$$m_B = 9.27 \times 10^{-24}\ \mathrm{A \cdot m^2} \tag{24-54}$$

Thus  an  individual  electron  acts  like  a microscopic  permanent  magnet. 

In spite of the presence of permanent atomic magnetic moments, a
paramagnetic material does not exhibit magnetic properties in the absence
of an externally applied magnetic field. This is because the individual
magnetic moments are oriented randomly, as in Fig. 24-24a. They are too weak
to affect each other’s orientations significantly, and their orientations are
randomized by thermal agitation. In the presence of an external magnetic
field $\boldsymbol{\mathcal{B}}_0$, the magnetic moments tend to line up parallel to the field, as in
Fig. 24-24b. The induced magnetic field $\boldsymbol{\mathcal{B}}_i$ is thus parallel to $\boldsymbol{\mathcal{B}}_0$.
Consequently, the magnitude of the internal field, $\boldsymbol{\mathcal{B}}_{\text{int}} = \boldsymbol{\mathcal{B}}_0 + \boldsymbol{\mathcal{B}}_i$, is greater
than that of $\boldsymbol{\mathcal{B}}_0$ in a paramagnetic material. The magnetization $M$ has
a positive value, and the paramagnetic susceptibility $\chi$, given by
Eq. (24-51b), $\chi = \mu_0 M/\mathcal{B}_0$, has a positive value as well.


Fig. 24-24 (a) A collection of n electrons per unit volume, each having
magnetic dipole moment of magnitude $m_B$, is shown with the moments
randomly oriented. The magnetization of the sample containing them is zero.
(b) The application of an external magnetic field $\boldsymbol{\mathcal{B}}_0$ tends to align the
individual magnetic dipole moments. However, the randomizing effect of
thermal agitation prevents complete alignment. The magnetization of the
sample is greater than zero, but less than $nm_B$. (c) If the magnetic field
$\boldsymbol{\mathcal{B}}_0$ is very large or the temperature T is very low, the magnetic dipole
moments are effectively aligned parallel to the field. The magnetization has
magnitude $nm_B$ and cannot be increased further. This is the condition called
saturation.

1158 


Magnetic  Fields,  II 


The energy of interaction of a magnetic dipole with an applied magnetic
field can be found by analogy with Eq. (21-35), which gives the
orientational potential energy of an electric dipole moment $\mathbf{p}$ with a uniform,
externally applied electric field $\boldsymbol{\mathcal{E}}$ as

$$U = -\mathbf{p} \cdot \boldsymbol{\mathcal{E}} \tag{24-55}$$

An identical argument (in which the magnetic dipole is considered as two
separated poles of pole strength $|\psi|$) gives the orientational potential energy
of a magnetic dipole moment $\mathbf{m}$ in a uniform, externally applied magnetic
field $\boldsymbol{\mathcal{B}}$ as

$$U = -\mathbf{m} \cdot \boldsymbol{\mathcal{B}} \tag{24-56}$$

The  orientational  potential  energy  of  a magnetic  dipole  in  a uniform,  externally  ap- 
plied magnetic  field  is  equal  to  the  negative  of  the  dot  product  of  its  magnetic  dipole 
moment  vector  and  the  applied  magnetic  field  vector. 

The amount by which the magnetic dipole energy U decreases as the
individual magnetic dipole moments $\mathbf{m}$ line up parallel to the externally
applied magnetic field $\boldsymbol{\mathcal{B}}_0$ is a measure of their tendency to line up. Similarly,
the thermal energy kT (where k is Boltzmann’s constant and T is the
absolute temperature) is a measure of the tendency toward randomization of
the orientation of the moments. (Remember that thermal motion is
disordered motion.) Thus the extent to which the magnetic moments align
themselves with $\boldsymbol{\mathcal{B}}_0$ is determined by the quantity U/kT. In the extreme case
of very large field or very low temperature, the magnetic moments may be
aligned substantially parallel to the field. This condition, called saturation,
is described in Fig. 24-24c and its caption. Given magnetic fields of
magnitude typical of laboratory magnets, saturation is not usually observed in
paramagnetic materials except at temperatures far below room temperature.
This point is made in quantitative terms in Example 24-7.

EXAMPLE  24-7 

a. A free electron in a metal is oriented so that its spin magnetic moment is
antiparallel to an externally applied magnetic field $\boldsymbol{\mathcal{B}}_0$. This external field, produced
by a pulsed coil, has a magnitude $\mathcal{B}_0 = 35.0$ T which can be regarded as constant
over the brief duration of the experiment. The electron experiences a “spin flip.”
That is, the direction of its spin magnetic moment $\mathbf{m}_B$ changes by 180°, so that the
final orientation of $\mathbf{m}_B$ is parallel to $\boldsymbol{\mathcal{B}}_0$. Find the change $\Delta U = U_f - U_i$ in the
orientational potential energy of the electron-spin magnetic moment with the externally
applied magnetic field. Express your answer in joules and in electron volts.

■ Using Eq. (24-56), $U = -\mathbf{m} \cdot \boldsymbol{\mathcal{B}}$, and remembering that $\mathbf{m}_B$ is initially
antiparallel to $\boldsymbol{\mathcal{B}}_0$, you have for the initial energy $U_i$

$$U_i = -\mathbf{m}_B \cdot \boldsymbol{\mathcal{B}}_0 = -(-m_B \mathcal{B}_0) = m_B \mathcal{B}_0$$

Using the value $m_B = 9.27 \times 10^{-24}\ \mathrm{A \cdot m^2}$ given in Eq. (24-54) and the value of $\mathcal{B}_0$
given above, you find

$$U_i = 9.27 \times 10^{-24}\ \mathrm{A \cdot m^2} \times 35.0\ \mathrm{T} = 3.24 \times 10^{-22}\ \mathrm{J}$$

And since the final orientation of $\mathbf{m}_B$ is parallel to $\boldsymbol{\mathcal{B}}_0$, you have for the final energy
$U_f$ the expression

$$U_f = -\mathbf{m}_B \cdot \boldsymbol{\mathcal{B}}_0 = -m_B \mathcal{B}_0 = -9.27 \times 10^{-24}\ \mathrm{A \cdot m^2} \times 35.0\ \mathrm{T} = -3.24 \times 10^{-22}\ \mathrm{J}$$

The change in orientational energy is thus

$$\Delta U = U_f - U_i = -3.24 \times 10^{-22}\ \mathrm{J} - (3.24 \times 10^{-22}\ \mathrm{J}) = -6.48 \times 10^{-22}\ \mathrm{J}$$




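To express this result in electron volts, you use the equivalence
$1\ \mathrm{eV} = 1.60 \times 10^{-19}\ \mathrm{J}$ to obtain

$$\Delta U = \frac{-6.48 \times 10^{-22}\ \mathrm{J}}{1.60 \times 10^{-19}\ \mathrm{J/eV}} = -4.06 \times 10^{-3}\ \mathrm{eV}$$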
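The arithmetic of part a can be checked with a short script. What follows is a minimal Python sketch rather than part of the original example: the function and constant names are illustrative assumptions, and the numerical inputs are the ones quoted above. Because it carries unrounded intermediate values, its last digit may differ slightly from the hand calculation.

```python
# Spin-flip energy change for an electron-spin moment in an applied field,
# using U = -m . B (Eq. 24-56). All names here are illustrative.

M_BOHR = 9.27e-24        # Bohr magneton, A*m^2 (Eq. 24-54)
EV_IN_JOULES = 1.60e-19  # 1 eV expressed in joules, as used in the example


def dipole_energy(moment, field, cos_angle):
    """Orientational energy U = -m B cos(angle), in joules."""
    return -moment * field * cos_angle


def spin_flip_energy_change(field):
    """Return (delta_U in J, delta_U in eV) for an antiparallel -> parallel flip."""
    u_initial = dipole_energy(M_BOHR, field, -1.0)  # moment antiparallel to field
    u_final = dipole_energy(M_BOHR, field, +1.0)    # moment parallel to field
    delta_u = u_final - u_initial
    return delta_u, delta_u / EV_IN_JOULES


delta_u_joules, delta_u_ev = spin_flip_energy_change(35.0)
print(f"delta U = {delta_u_joules:.3g} J = {delta_u_ev:.3g} eV")
```

The sign of the result is negative, as expected for a moment that ends up parallel to the field.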


b. Suppose that the metal of which the free electron is a part is at room
temperature (T = 300 K). Is it likely that saturation will take place, that is, that the
electron-spin magnetic moments of the many free electrons in the metal will all be
substantially aligned parallel to the externally applied field $\boldsymbol{\mathcal{B}}_0$? Is it likely that
saturation will take place if the metal is cooled to a temperature T = 0.10 K?

■ The fact that the value of $\Delta U$ just determined is negative indicates that there
is a tendency for the electrons to orient themselves so that their spin magnetic
moments are parallel to $\boldsymbol{\mathcal{B}}_0$ and the orientational energy is minimized. However, you
must compare the magnitude $|\Delta U|$ with the thermal energy of the electron. The
value of that thermal energy is kT multiplied by a factor whose value is of the order
of magnitude of 1. For the purposes of this approximate calculation, it is
sufficiently accurate to set the thermal energy equal to kT. For T = 300 K, you use the
value $k = 1.4 \times 10^{-23}$ J/K to find

$$kT = 1.4 \times 10^{-23}\ \mathrm{J/K} \times 300\ \mathrm{K} = 4.2 \times 10^{-21}\ \mathrm{J}$$

To compare this value with that of $|\Delta U|$, it is convenient to take the ratio $|\Delta U|/kT$.
(This is the quantity which appears in the Boltzmann factor in a statistical
calculation to determine the distribution of electron spin orientations. The Boltzmann
factor is discussed in Sec. 18-5.) Using the numerical values just obtained, you have
for the value of the ratio

$$\frac{|\Delta U|}{kT} = \frac{6.5 \times 10^{-22}\ \mathrm{J}}{4.2 \times 10^{-21}\ \mathrm{J}} = 0.15$$

Since this value is less than 1, it is fair to guess that the electron-spin magnetic
moments of the free electrons in the metal are not substantially all aligned parallel to
the externally applied magnetic field $\boldsymbol{\mathcal{B}}_0$. (How can you justify this guess on the
basis of a calculation like that of Example 18-6?) You could not be sure that such
was the case unless the value of $|\Delta U|/kT$ was substantially greater than 1.

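Now consider the case T = 0.10 K. Here you have for the thermal energy kT the
value

$$kT = 1.4 \times 10^{-23}\ \mathrm{J/K} \times 0.10\ \mathrm{K} = 1.4 \times 10^{-24}\ \mathrm{J}$$

In this case the value of the ratio $|\Delta U|/kT$ becomes

$$\frac{|\Delta U|}{kT} = \frac{6.5 \times 10^{-22}\ \mathrm{J}}{1.4 \times 10^{-24}\ \mathrm{J}} \approx 4.6 \times 10^{2}$$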


This value is very much greater than 1. So you can safely guess that the tendency
toward alignment of the spin magnetic moments measured by $|\Delta U|$ dominates the
tendency toward random alignment measured by kT, and saturation therefore
takes place.
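The comparison in part b can be made slightly more quantitative with a short sketch. The Python below is an illustration under a stated assumption, not the text's own calculation: it treats the spins as independent two-state systems whose populations follow the Boltzmann factor, so that the net aligned fraction is $\tanh(|\Delta U|/2kT)$. This ignores the quantum statistics of electrons in a real metal. The function names are hypothetical; the rounded inputs are those used above.

```python
import math

K_BOLTZMANN = 1.4e-23  # Boltzmann's constant in J/K, rounded as in the example
DELTA_U = 6.5e-22      # |delta U| in J from part a, rounded as in the example


def alignment_ratio(delta_u, temperature):
    """Ratio |delta U| / kT comparing alignment energy with thermal energy."""
    return abs(delta_u) / (K_BOLTZMANN * temperature)


def two_state_polarization(delta_u, temperature):
    """Net aligned fraction (n_par - n_anti) / (n_par + n_anti).

    Assumes n_par / n_anti = exp(|delta U| / kT), which gives tanh(x / 2)
    with x = |delta U| / kT.
    """
    return math.tanh(alignment_ratio(delta_u, temperature) / 2.0)


for temperature in (300.0, 0.10):
    ratio = alignment_ratio(DELTA_U, temperature)
    fraction = two_state_polarization(DELTA_U, temperature)
    print(f"T = {temperature} K: ratio = {ratio:.3g}, net aligned fraction = {fraction:.3g}")
```

In this simple model only a small net fraction of the moments is aligned at 300 K, while at 0.10 K the alignment is essentially complete, in line with the conclusions of the example.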


Over  a wide  range  of  temperatures,  the  magnetic  susceptibilities  of 
most  paramagnetic  substances  conform  to  the  rule 


$$\chi = \frac{C}{T} \tag{24-57}$$


where  C is  an  experimentally  determined  quantity  called  Curie’s  constant. 
That  is,  provided  the  magnetic  field  is  not  so  large  as  to  cause  saturation, 
the  paramagnetic  susceptibility  is  inversely  proportional  to  the  absolute 




temperature. Equation (24-57) is one form of what is called Curie’s law,
after the French physicist Pierre Curie (1859-1906), who first studied the
behavior of paramagnetic materials in detail.

Curie  is  known  more  widely  for  his  later  collaboration  with  his  wife,  Marie 
Sklodowska  Curie,  in  the  discovery  of  radium.  Grasping  the  importance  of  what 
she  was  doing,  he  dropped  his  own  first-rate  work  on  magnetism  in  favor  of  the 
collaboration.  His  untimely  death  in  an  accident  cut  short  his  research.  But  he 
nevertheless  founded  a distinguished  French  school  of  studies  in  magnetism 
which  continues  to  this  day. 

Figure 24-25 is a graph of the magnetic susceptibility versus $1/T$, the
reciprocal of the absolute temperature, for the paramagnetic salt
chromium potassium alum. At temperatures above 20 K, the data conform
quite well to Curie’s law.
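Curie's law, Eq. (24-57), is simple enough to tabulate directly. The following Python sketch is illustrative only: the function name and the value of the Curie constant are hypothetical choices, not data for chromium potassium alum. It shows that the susceptibility plotted against $1/T$ falls on a straight line through the origin with slope C.

```python
def curie_susceptibility(curie_constant, temperature):
    """Paramagnetic susceptibility chi = C / T (Eq. 24-57).

    curie_constant is in kelvins (chi is dimensionless); temperature is the
    absolute temperature in kelvins and must be positive.
    """
    if temperature <= 0.0:
        raise ValueError("absolute temperature must be positive")
    return curie_constant / temperature


C_EXAMPLE = 2.0e-3  # hypothetical Curie constant, in K

for temperature in (300.0, 77.0, 20.0):
    chi = curie_susceptibility(C_EXAMPLE, temperature)
    # chi divided by 1/T recovers the slope C of the straight-line plot
    slope = chi / (1.0 / temperature)
    print(f"T = {temperature:6.1f} K  1/T = {1.0 / temperature:.4f} 1/K  "
          f"chi = {chi:.3e}  slope = {slope:.3e} K")
```

The law is expected to hold only when the field is too small to cause saturation, so the sketch should not be pushed to very low temperatures.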

The  most  dramatic  and  most  familiar  case  of  magnetic  behavior  in 
solids  is  that  of  ferromagnetism,  in  spite  of  the  fact  that  the  number  of  fer- 
romagnetic materials  is  relatively  small.  The  best  known  ferromagnetic 
materials  are  iron,  cobalt,  and  nickel  — the  so-called  transition  metals  because 
of  their  position  in  the  periodic  table  of  the  elements — together  with  many 
of  their  alloys  with  one  another  and  with  other  metals.  Several  of  the 
rare-earth  metal  group  are  ferromagnetic  as  well,  notably  gadolinium.  It  is 


Fig. 24-25 Plot of the magnetic susceptibility of chromium potassium alum, a
paramagnetic salt, as a function of $1/T$, the reciprocal of the absolute
temperature. According to Curie’s law, this plot should be a straight line. The data,
shown as points, conform quite well to Curie’s law except at large values of
$1/T$ (that is, at low temperatures), where they must be interpreted in terms
of a more detailed theory. The temperature T (in K) is plotted on a separate axis
at the top of the graph. (Data after W. J. de Haas and C. J. Gorter.)




worth  noting  in  this  connection  that  the  atomic  structure  of  the  rare-earth 
metals  bears  strong  resemblance  to  that  of  the  transition  metals.  In  fact,  the 
possibility  of  ferromagnetism  arises  from  atomic  structure.  But  it  is  also  af- 
fected strongly  by  the  way  the  atoms  are  put  together  into  solids.  The  ferro- 
magnetic properties  of  different  alloys  and  chemical  compounds  of  (say) 
iron  differ  widely  or  do  not  exist  at  all.  Ferromagnetism  occurs  only  in  crys- 
talline solids,  which  comprise  the  most  highly  ordered  state  of  matter.  As 
you  will  see  shortly,  this  is  because  ferromagnetism  depends  on  the  exis- 
tence of  a very  high  degree  of  order  on  the  atomic  scale. 

Unlike  paramagnetic  materials,  ferromagnetic  materials  contain  atomic 
magnetic  moments  which  interact  strongly.  This  so-called  exchange  interac- 
tion is  again  a purely  quantum-mechanical  phenomenon.  It  cannot  be  ac- 
counted for  on  the  atomic  level  by  a simple  lining  up  of  each  magnetic  mo- 
ment in  the  magnetic  field  of  its  neighbors,  like  iron  filings  in  the  field  of  a 
bar  magnet.  As  a result  of  the  exchange  interaction,  each  atomic  magnet 
lines  up  with  its  neighbors,  and  the  magnetization  is  saturated  in  the  ab- 
sence of  any  external  magnetic  field.  Calculation  of  the  magnetization, 
$M = nm$, leads to values which correspond to internal fields $\mathcal{B}_{\text{int}}$ lying
usually in the range 1 T to 10 T.
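An order-of-magnitude check of the quoted range can be sketched as follows. The Python below rests on assumptions that are not in the text: it takes the field associated with a magnetization M to be $\mu_0 M$ (consistent with the definition $\chi = \mu_0 M/\mathcal{B}_0$ used in Exercise 24-10), and it uses rough illustrative values for the number density of iron atoms and for the magnetic moment per atom. All names are hypothetical.

```python
import math

MU_0 = 4.0e-7 * math.pi  # permeability of free space, T*m/A
M_BOHR = 9.27e-24        # Bohr magneton, A*m^2 (Eq. 24-54)

# Assumed illustrative inputs for iron (not given in the text):
N_IRON = 8.5e28              # atoms per cubic meter
MOMENT_IRON = 2.2 * M_BOHR   # magnetic moment per atom, A*m^2


def saturation_magnetization(number_density, moment_per_atom):
    """Saturated magnetization M = n m, in A/m."""
    return number_density * moment_per_atom


def field_from_magnetization(magnetization):
    """Field mu_0 * M associated with a magnetization M, in tesla."""
    return MU_0 * magnetization


m_sat = saturation_magnetization(N_IRON, MOMENT_IRON)
b_int = field_from_magnetization(m_sat)
print(f"M = {m_sat:.2e} A/m, mu_0 M = {b_int:.2f} T")
```

With these assumed inputs the result falls within the range of 1 T to 10 T quoted above.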

Even  with  these  internal  fields,  the  sample  of  material  may  not  exhibit 
the  familiar  attractive  and  repulsive  macroscopic  forces  which  characterize 
a permanent  magnet.  While  microscopic  regions  of  the  sample,  called  fer- 
romagnetic domains,  contain  very  many  atoms  in  nearly  perfect  align- 
ment, the  domains  themselves  can  be  oriented  at  random  with  respect  to 
one  another,  so  that  their  magnetic  moments  cancel  out  in  the  large.  This 
domain  structure  can  be  made  visible,  as  in  Fig.  24-26.  Although  there  is 
considerable  regularity  in  the  sizes  and  shapes  of  the  domains  in  the  sample 
shown,  the  orientation  of  the  magnetic  moment  is  not  regular  from  one  do- 
main to  the  next. 


Fig.  24-26  Photograph  of  the  domain  structure  in  a ferromagnetic  metal  sample.  To  make  the 
photograph,  the  sample  is  carefully  polished  and  then  dipped  into  a liquid  containing  a sus- 
pension of  very  finely  divided  ferromagnetic  particles.  The  particles  tend  to  cling  to  the  do- 
main boundaries,  where  the  magnetic  field  emerges  from  the  sample.  (Why?)  The  sample  is 
then  photographed  through  an  ordinary  microscope.  This  technique,  devised  by  the  U.S. 
physicist  Francis  W.  Bitter,  is  a sophisticated  extension  of  the  “iron  filings”  method  used  to 
visualize  the  field  lines  around  magnets. 


The  random  orientation  of  the  domains  reflects  the  fact  (which  we  do 
not  justify  here)  that  it  is  energetically  favorable  for  the  sample  as  a whole 
to  have  zero  magnetization,  even  though  this  is  not  the  case  on  the  atomic 
scale  because  of  the  exchange  interaction.  Thus  the  randomly  oriented  do- 
mains in  a ferromagnetic  material  play  on  a larger  scale  the  role  of  the  ran- 
domly oriented  atomic  moments  in  a paramagnetic  material. 

As  a fully  macroscopic  analogue  to  the  microscopic  picture  just  given,  con- 
sider the  following  thought  experiment.  A large  number  of  small  bar  magnets  are 
thrown  into  a sack,  which  is  vigorously  shaken.  When  the  sack  is  opened,  the 
magnets  will  have  arranged  themselves  so  that  most  of  them  are  in  contact  with 
their  neighbors  with  North  pole  to  South  pole.  As  a consequence,  little  of  the  mag- 
netic field  will  be  observable  in  the  region  outside  the  mass  of  magnets. 

When a ferromagnetic sample is placed in an externally applied magnetic
field $\mathcal{B}_0$, there is a tendency for the magnetic moments of the individual
domains to line up with the external field. Also, the domain boundaries
move in such a way as to make those domains which are aligned with the
external field grow at the expense of those not so aligned. The value of $\mathcal{B}_0$
required to do this to an appreciable extent depends on the sample and on
such other conditions as the temperature. However, that value is typically
much smaller than the 1 to 10 T typical of the magnetic field $\mathcal{B}_{\text{int}}$ within the
domains. Consequently, the sample as a whole exhibits an internal magnetic
field $\mathcal{B}_{\text{int}}$ which can be much larger than the external field $\mathcal{B}_0$. This is
the principle of the iron-core electromagnet, invented in 1820 by Ampere
and his friend François Arago, and independently by the British chemist
Humphry Davy. To make an electromagnet, a coil of wire is wound
around an iron core of the desired shape. The relatively weak field $\mathcal{B}_0$ of
the coil produces a much larger field $\mathcal{B}_{\text{int}}$, which can be put to many
practical uses. Most electric motors, generators, and transformers depend
on this principle, as do innumerable other useful devices.

The  maximum  magnetic  field  which  can  be  produced  with  the  aid  of 
an  iron-core  electromagnet  is  limited  by  the  phenomenon  of  saturation. 
When  substantially  all  the  domains  in  the  core  have  been  aligned  parallel  to 
the  external  field,  increasing  the  electric  current  in  the  coil  will  have  little 
effect.  For  this  reason,  practical  iron  electromagnets  are  limited  to  max- 
imum magnetic  fields  around  3 T.  For  higher  fields,  “brute-force”  methods 
using  coreless  high-current  coils  must  be  used. 

In  all  ferromagnetic  materials,  energy  is  required  to  move  domain 
walls  around.  The  process  is  frictional,  in  the  sense  that  the  energy  is  dissi- 
pated as  heat.  As  a result,  it  can  be  energetically  favorable  for  a domain 
structure  to  persist  after  it  has  been  brought  into  existence  by  an  external 
magnetic  field,  even  though  the  external  field  has  been  removed.  This  phe- 
nomenon, called  magnetic  remanence,  underlies  the  existence  of  perma- 
nent magnets.  Remanence  depends  in  a complicated  (and  not  completely 
understood)  manner  on  crystal  structure,  chemical  composition,  and  in- 
ternal mechanical  strain.  Thus  the  manufacture  of  so-called  “magnetically 
hard”  materials,  having  large  remanence,  is  partly  a matter  of  art. 

Permanent  magnets  are  made  of  magnetically  hard  materials,  such  as  the 
series  of  alloys  called  Alnico.  The  material,  fabricated  into  the  desired  shape,  is 
placed  inside  a winding  through  which  a large  pulse  of  electric  current  is  passed, 




Table 24-1  Curie Temperatures of Some Ferromagnetic Materials

Substance                            Tc (in K)
Iron                                 1043
Cobalt                               1403
Nickel                               630
Gadolinium                           290
Y3Fe5O12 (yttrium iron garnet)       130
EuS (europium sulfide)               17

usually  for  0.01  to  0.1  s.  The  momentary  magnetic  field  exceeds  that  required  to 
saturate  the  magnetization  of  the  sample.  When  the  external  field  is  removed,  the 
domains  retain  some  of  (though  not  all)  their  alignment,  and  the  object  is  a per- 
manent magnet. 

All  ferromagnetic  materials  have  a Curie  temperature  Tc  above  which 
they  cease  to  be  ferromagnetic  and  become  simply  paramagnetic.  Table 
24-1  lists  representative  Curie  temperatures.  It  is  found  experimentally 
that  above  its  Curie  temperature,  a ferromagnetic  material  obeys  a modi- 
fied form  of  the  Curie  law,  Eq.  (24-57),  called  the  Curie-Weiss  law: 

$$\chi = \frac{C}{T - T_c} \tag{24-58}$$

Figure 24-27 is a plot of the reciprocal of the magnetic susceptibility $1/\chi$
versus temperature for the ferromagnetic metal gadolinium. [Since $\chi$ is
proportional to $1/(T - T_c)$, $1/\chi$ is proportional to $T - T_c$.]
Compared to the simple Curie law, it is as if the absolute zero of temperature
had  been  shifted  to  a higher  temperature  Tc,  as  far  as  the  paramagnetic 
properties  of  the  material  are  concerned.  This  can  be  understood  in  a gen- 
eral way  by  imagining  the  ferromagnetic  material  to  be  a paramagnetic 
material  in  which  the  exchange  interaction  provides  an  additional  magnetic 


Fig. 24-27 Plot of the reciprocal $1/\chi$ of the magnetic
susceptibility of gadolinium as a function of absolute
temperature T (in K). The data points lie on a straight line, in
conformity with the Curie-Weiss law. The Curie temperature,
determined by the intercept of the straight line
with the horizontal axis, is $T_c = 310$ K. The behavior
displayed above this temperature is paramagnetic.
Gadolinium actually becomes ferromagnetic at 290 K. In
general, the Curie-Weiss law fails at temperatures close to
the Curie temperature. (Data after S. Arajs and R. V.
Colvin.)




field $\mathcal{B}_E$. This field helps in aligning the atomic magnetic moments. At
temperatures below the Curie temperature, the field $\mathcal{B}_E$ suffices by itself to
produce perfect alignment (that is, saturation). At the Curie temperature,
the disordering effect of thermal agitation becomes adequate to overcome
the ordering influence of the field $\mathcal{B}_E$. At higher temperatures, alignment
requires  the  “help”  of  an  externally  applied  magnetic  field.  This  phenom- 
enon is  analogous  to  the  melting  of  a solid,  which  occurs  when  the  effect  of 
thermal  agitation  is  large  enough  to  overcome  the  influence  of  the  at- 
tractive interaction  holding  the  atoms  together  in  fixed  order. 
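The way a Curie temperature is read off a plot such as Fig. 24-27 can be sketched numerically. The Python below is a hedged illustration: it generates synthetic susceptibilities from Eq. (24-58) with hypothetical values of C and $T_c$ (not the measured gadolinium data), fits a least-squares straight line to $1/\chi$ versus T, and recovers $T_c$ from the intercept with the temperature axis. The function names are assumptions made for this example.

```python
def curie_weiss_susceptibility(curie_constant, curie_temperature, temperature):
    """chi = C / (T - Tc), Eq. (24-58); valid only for T above Tc."""
    if temperature <= curie_temperature:
        raise ValueError("the Curie-Weiss law applies only above Tc")
    return curie_constant / (temperature - curie_temperature)


def fit_curie_weiss(temperatures, susceptibilities):
    """Least-squares line through (T, 1/chi); returns (C, Tc).

    Since 1/chi = T/C - Tc/C, the slope is 1/C and the line crosses
    the temperature axis at T = Tc.
    """
    inverse_chi = [1.0 / chi for chi in susceptibilities]
    count = len(temperatures)
    mean_t = sum(temperatures) / count
    mean_y = sum(inverse_chi) / count
    s_tt = sum((t - mean_t) ** 2 for t in temperatures)
    s_ty = sum((t - mean_t) * (y - mean_y)
               for t, y in zip(temperatures, inverse_chi))
    slope = s_ty / s_tt
    intercept = mean_y - slope * mean_t
    return 1.0 / slope, -intercept / slope


# Synthetic, noise-free data from hypothetical parameters
C_TRUE, TC_TRUE = 0.5, 310.0  # K
temps = [400.0, 600.0, 800.0, 1000.0, 1200.0]
chis = [curie_weiss_susceptibility(C_TRUE, TC_TRUE, t) for t in temps]

c_fit, tc_fit = fit_curie_weiss(temps, chis)
print(f"fitted C = {c_fit:.4g} K, fitted Tc = {tc_fit:.4g} K")
```

With real data the points closest to the Curie temperature would normally be left out of such a fit, since the Curie-Weiss law is least reliable there.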

If  a permanent  magnet  is  heated  above  its  Curie  temperature  and  allowed  to 
cool  in  the  absence  of  an  external  magnetic  field,  the  domain  structure  it  acquires 
as  it  passes  through  the  Curie  temperature  will  lead  to  zero  overall  magnetization. 
But  note  that,  according  to  Eq.  (24-58),  the  magnetic  susceptibility  is  very  large  at 
temperatures just above the Curie temperature. Thus a very small external field
$\boldsymbol{\mathcal{B}}_0$, applied as a ferromagnetic substance cools through its Curie temperature, will
induce a relatively large magnetization $\mathbf{M}$ parallel to $\boldsymbol{\mathcal{B}}_0$. This magnetization is
“frozen  in”  as  the  substance  continues  to  cool  and  becomes  ferromagnetic.  Once 
the  temperature  is  well  below  the  Curie  temperature,  the  material  becomes  mag- 
netically hard,  and  only  a large  external  field  can  change  the  magnetization. 

This  sequence  of  events  occurs  naturally  in  the  course  of  the  geophysical  evo- 
lution of  the  earth.  As  a consequence,  the  orientation  of  ferromagnetic  rocks  with 
respect  to  the  earth’s  magnetic  field  at  the  time  of  their  original  formation  and 
cooling  can  be  discovered  by  measuring  their  permanent  magnetization.  By  com- 
paring such  rocks  from  many  locations  around  the  world,  it  is  possible  to  con- 
struct a record  of  the  slow  movement  of  continents  with  respect  to  one  another. 
The  motion,  called  continental  drift,  has  been  traced  back  at  least  300  million 
years  in  this  way  (which  can  be  corroborated  by  other  methods). 

One  of  the  major  driving  forces  of  continental  drift  is  sea-floor  spreading.  Hot 
material  from  beneath  the  earth’s  crust  wells  up  into  the  crust  along  certain  long, 
narrow  zones.  Notable  among  these  zones  is  the  mid-Atlantic  ridge  which  runs,  as 
its  name  suggests,  generally  north-south  almost  all  the  way  along  the  midline  of 
the  Atlantic  Ocean.  The  upwelling  material  cools  and  forms  new  crust,  forcing  the 
older  material  away  in  both  directions  perpendicular  to  the  ridge.  As  a result,  the 


Fig. 24-28 Reversals of the earth’s magnetic field, as evidenced by the direction of
magnetization of ferromagnetic rocks collected in many parts of the world. Each rock sample was
magnetized along the earth’s magnetic field at the time when it cooled through its Curie
temperature. Correlation of data from many widely distributed samples makes possible a
reconstruction of the changes in orientation of the samples as a result of the drift of the land masses
or sea floors of which they are part. This information is used in charting the relative motions
of the crustal plates of the earth. Many of these plates correspond roughly to continents or
major parts of continents (such as India). Other plates mainly or entirely comprise the sea
floor.

When the slow orientation changes due to relative motion of the crustal plates are taken
into account, there remains clear evidence of gross (and rather sudden) reversal of the earth’s
magnetic field. The graph (columns labeled EVENTS and EPOCHS) is a “timetable” of these
reversals, which are of two kinds.
The epochs, shown in the right column, have durations of order of magnitude $10^6$ yr. They
are named after scientists who have made important contributions to the study of
geomagnetism. “Normal” epochs, denoted in gray, are those when the general orientation of the earth’s
magnetic field was the same as it is at present, when the North pole of a magnetic compass
points in the general direction of the geographic (and astronomical) north pole. “Reverse”
epochs, denoted in white, are those when the North pole of the compass (if it had been around)
would have pointed in the generally opposite direction, toward the geographic south pole.
The epochs are interrupted by relatively brief reversals of the epochal direction of the earth’s
magnetic field. These brief reversals, called events, have durations of order of magnitude
$10^5$ yr and are named after the geographical regions where the rocks recording the evidence
of their existence were first found. (After A. Cox, G. B. Dalrymple, and R. R. Doell.)




Atlantic  Ocean  is  widening  fairly  steadily  at  a rate  of  about  1 cm/yr.  Conse- 
quently, the  age  of  ocean-bottom  crust  from  the  time  of  its  formation  from  hot 
material  is  roughly  proportional  to  its  distance  from  the  mid-Atlantic  ridge. 
Samples  of  ferromagnetic  material  from  the  sea  bottom  show  magnetization  direc- 
tions oriented  toward  the  magnetic  poles  of  the  earth  at  the  time  of  their  formation 
and  cooling,  as  expected.  While  the  magnetization  orientations  record  the  slow 
drift  of  the  magnetic-pole  locations  within  a confined  region  close  to  the  geo- 
graphic poles  of  the  earth,  there  is  a much  more  dramatic  effect.  The  direction  of 
magnetization  reverses  abruptly  (on  a geological  time  scale)  and  irregularly.  Suc- 
cessive bands  of  sea  bottom  parallel  to  the  mid-Atlantic  ridge  have  opposite  mag- 
netic field  orientations. 

The earth’s magnetic field has passed through four major epochs of
orientation in the last $4.5 \times 10^6$ yr. Each epoch has lasted roughly 1 million years and has
begun  and  ended  (except,  of  course,  for  the  present  one)  with  a field  reversal. 
Interrupting  these  relatively  long  epochs  are  relatively  brief  reversal  intervals, 
called  events,  of  duration  typically  about  100,000  yr.  The  reversal  history  of  the 
earth’s  magnetic  field  for  the  period  studied  to  date  is  shown  in  Fig.  24-28. 
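The proportionality between the age of the sea-floor crust and its distance from the ridge can be turned into rough numbers. The Python sketch below is an illustration under an assumption not stated in the text: spreading is taken to be symmetric, so that crust on each side moves away from the ridge at half the quoted widening rate of about 1 cm/yr. The function names are hypothetical.

```python
WIDENING_RATE = 0.01  # m/yr, the "about 1 cm/yr" widening rate quoted in the text


def crust_age_years(distance_from_ridge_m, widening_rate=WIDENING_RATE):
    """Approximate age of crust at a given distance from the ridge.

    Assumes symmetric spreading: each flank recedes from the ridge at
    half the total widening rate.
    """
    half_rate = widening_rate / 2.0
    return distance_from_ridge_m / half_rate


def band_width_m(interval_years, widening_rate=WIDENING_RATE):
    """Width, on one side of the ridge, of crust formed during an interval."""
    return interval_years * widening_rate / 2.0


print(f"epoch band (1e6 yr): about {band_width_m(1.0e6):.0f} m wide")
print(f"event band (1e5 yr): about {band_width_m(1.0e5):.0f} m wide")
print(f"crust 20 km from the ridge: about {crust_age_years(20.0e3):.1e} yr old")
```

On these assumptions the magnetized bands associated with epochs are kilometers wide, while those associated with events are roughly ten times narrower.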

The  mechanism  for  the  field  reversal  is  understood  at  present  in  only  a very 
general  way.  There  is  no  detailed  knowledge  as  yet  concerning  the  electric  cur- 
rents in  the  earth’s  core  which  give  rise  to  the  earth’s  magnetic  field. 


EXERCISES 

Group  A 

24-1. Geomagnetic force on a transmission line. In the
United States, the vertical component of the earth’s
magnetic field is about $5 \times 10^{-5}$ T downward, and the
horizontal component is about $2 \times 10^{-5}$ T northward. A
north-south electrical transmission line carries a steady
current of 500 A northward. What are the magnitude and
direction of the force per meter on the transmission line
produced by

a.  the  vertical  component? 

b.  the  horizontal  component? 

24-2. Magnetic pump. One type of nuclear reactor
employs molten sodium metal as the medium to carry heat
from the reactor to the steam boiler. The liquid sodium is
circulated by magnetic pumping. Electrodes A and B are
placed in the inner surfaces of the nonconducting pipe
carrying liquid sodium, and a current is passed in at A,
through the liquid sodium, and out at B. See Fig. 24E-2. A
magnetic field is applied to the region. What must be its
direction to pump the liquid sodium to the right?

Fig.  24E-2 


24-3.  Levitation.  In  Fig.  24E-3,  wire  AC,  which  is  part 
of  the  circuit,  can  slide  without  friction  on  two  upright 
metal  wires.  The  linear  density  of  AC  is  2.0  g/m.  What 
must  be  the  magnitude  of  a uniform  magnetic  field 
oriented  normal  to  the  plane  of  the  page  if  it  will  support 


the  wire  against  gravity  when  the  current  is  10  A?  What  is 
the  direction  for  the  sense  of  the  current  shown? 


A 


Fig.  24E-3 


24-4. Force on current elements. There is a
counterclockwise current $i = 10$ A in the large circular loop in Fig.
24E-4, which is in a uniform magnetic field of 1.0 T, in the
positive x direction. Angles AOC, AOD, and AOE are 60°,
90°, and 120°, respectively. What are the magnitude and
direction of the force on current elements 1.0 cm long at
C, D, and E?




24-5. Force on square current loop. In Fig. 24E-5, the
current in the long straight wire is 10.0 A. A square loop
whose sides are 10.0 cm long is placed so that one side is
5.0 cm from the long wire. If the current in the loop is 5.0
A, what net force acts on the loop? Does the net force
attract or repel the loop?

24-6. Drift speed. A current of 5.0 A flows in a wire.
There are $1.0 \times 10^{22}$ free electrons per meter in the wire.
What is the drift speed of these electrons?

24-7. Fly by. A rectangular flat metallic plate carries
a net charge per unit area $\sigma$ precisely equal to
$1 \times 10^{-6}\ \mathrm{C \cdot m^{-2}}$. What charge per unit area would be observed
by a physicist passing the plate on a jet airplane travelling
at a speed of 600 m/s? What current does she observe?
Are your answers affected by the orientation of the plate
relative to the direction of motion of the jet?

24-8.  Current  loop.  The  rectangular  loop  of  Fig.  24-12 
has  100  turns,  carries  a current  of  2.0  A,  and  has  a length 
and  width  of  20  cm  and  10  cm,  respectively. 

a.  What  are  the  magnitude  and  direction  of  the  mag- 
netic dipole  moment? 

b.  If  the  strength  of  the  uniform  magnetic  field  ap- 
plied to  the  loop  is  0.60  T and  the  angle  between  it  and  the 
magnetic  dipole  moment  is  30°,  what  net  torque  acts  on 
the  loop? 

c.  What  net  force  acts  on  the  loop? 

24-9. Galvanometer. For the galvanometer in Fig.
24-15, there are N turns in the coil and each has area a.
The magnetic field in the gap has magnitude $\mathcal{B}$, and the
spring constant is k. Find an expression relating the angle
$\theta$ through which the coil rotates before coming to rest to
the current i flowing through the coils.

24-10. Dimensionless susceptibility. By analyzing the dimensions
of the quantities $\mu_0$, $M$, and $\mathcal{B}_0$, prove that the
magnetic susceptibility $\chi = \mu_0 M/\mathcal{B}_0$ is dimensionless.

24-11. Magnetization of aluminum. The magnetic susceptibility
of aluminum at room temperature is $\chi = 2.2 \times 10^{-5}$.
A sample of aluminum at room temperature is
placed inside a solenoid so that it experiences an externally
applied field of magnitude $\mathcal{B}_0 = 1.0$ T. What is the
magnitude of the magnetization $M$ in the sample? What is
the relation between the direction of the externally applied
field and the direction of the magnetization?

24-12. Susceptibility of paramagnetic salt. For a certain
paramagnetic salt, the Curie constant is $C = 1.0 \times 10^{-3}$ K.
Compute the susceptibility at T = 300 K.

Group  B 

24-13. Semicircular current loop. A wire in the form of
a semicircle lies on the top of a smooth table. A
downward-directed uniform magnetic field of magnitude
$\mathcal{B}$ is confined to the region shown in Fig. 24E-13. The
ends of the semicircle are attached to springs, C and D,
whose other ends are fixed. The current i is introduced by
attaching a battery to the ends of the springs. Show that
the sum of the tensions of the springs is $2\mathcal{B}ir$, where r is
the radius of the semicircle.


24-14. Three-sided frame. In Fig. 24E-14, a three-sided
frame is pivoted at AC and hangs vertically. Its sides are
each of the same length and have a linear density of
1.0 g/cm. A current of 10.0 A is sent through the frame,
which is in a uniform magnetic field of $1.0 \times 10^{-2}$ T directed
upward. Through what angle will the frame be
deflected?


Fig.  24E-14 


24-15. Diverging magnetic field. A symmetric diverging
magnetic field has a magnitude $\mathcal{B}$ at a circular loop of wire
of radius R carrying current i. See Fig. 24E-15. The field
at a given point of the loop lies in a plane determined by
the normal to the loop and the radius to the point. The
direction of the field in this plane makes an angle $\theta$ with
the normal to the plane of the loop in the direction away
from its center. What is the magnitude of the force on the
loop?


24-16. Everything is relative, I. A beam of protons of
speed $5.00 \times 10^{7}$ m/s passes a given point at the rate of
$1.00 \times 10^{15}$ per second. What is the charge per meter in


Exercises  1167 


the  beam  determined  by  an  observer: 

a.  stationed  at  the  point? 

b.  moving  with  the  protons? 

24-17.  Everything  is  relative,  II.  A straight  copper  con- 
ductor is  made  of  lengths  of  wire  with  diameters  of 
0.0500,  0.100,  and  0.150  cm  as  shown  in  Fig.  24E-17.  The 
conductor  carries  a current  of  1.00  A.  Take  the  number  of 
conduction  electrons  per  unit  volume  from  Example 
24-3a,  and  let  the  current  flow  to  the  right. 

a.  Calculate  the  average  drift  speed  of  the  conduc- 
tion electrons  in  each  section  of  the  wire,  as  measured  by 
an  observer  stationed  at  the  conductor. 

b. An observer moves along the conductor to the
right with speed $9.40 \times 10^{-5}$ m/s. Calculate the current
due to positive ions, $i'_+$, and the current due to electrons,
$i'_-$, as measured by this observer.

Fig.  24E-17 


24-18. Lorentz force, I. In Fig. 24E-18, wire 1 carries a
positive net charge per unit length $\lambda_+$. Wire 2 carries a net
charge per unit length $\lambda_- = -\lambda_+$. Their separation is r.

a.  Calculate  the  Lorentz  force  per  unit  length  on  wire 
2 due  to  wire  1,  as  measured  by  an  observer  at  rest  with 
respect  to  the  wires. 

b.  Calculate  the  Lorentz  force  per  unit  length  on  wire 
2 due  to  wire  1 as  seen  by  an  observer  moving  to  the  right 
at  a slow  speed  v with  respect  to  the  wires.  Is  the  force  seen 
by  this  observer  due  to  electric  forces  or  magnetic  forces? 
Is  there  a reference  frame  in  which  the  force  is  totally 
magnetic? 

Fig. 24E-18

24-19.  Lorentz  force,  II.  A beam  of  N protons  per 
meter  is  constrained  to  travel  with  a small  speed  v,  at  uni- 
form distance  r from  a neutral  wire  carrying  current  i. 

a.  What  is  the  Lorentz  force  on  a unit  length  of  the 
wire,  as  determined  by  an  observer  at  rest  with  respect  to 
the  wire?  Is  the  force  entirely  electric,  entirely  magnetic, 
or  a superposition  of  the  two? 

b.  Substitute  physically  reasonable  numbers  for  N,  v, 
and  i.  Comment  on  the  magnitude  of  the  resulting  force. 

c.  Is  there  an  observer  who  interprets  the  Lorentz 
force  per  unit  length  as  being  entirely  a magnetic  force? 
Entirely  an  electric  force? 

d.  What  is  the  form  of  the  Lorentz  force  per  unit 
length  determined  by  an  observer  who  moves  along  with 
the  protons? 

24-20.  Torque  produced  by  a couple. 

a. Verify that the torque exerted on a bar magnet by
a uniform applied magnetic field is given correctly by Eq.
(24-36), no matter what origin is used to specify the
torque. Do this by repeating the calculation leading to the
equation, but using the north pole of the magnet as the
origin. Compare the situation here with the one involving
an electric dipole in a uniform applied electric field.

b.  Generalize  the  argument  to  show  that  if  a pair  of 
equal  magnitude  and  oppositely  directed  but  noncollinear 
forces  act  on  any  object,  the  torque  which  they  produce 
depends  only  on  the  force  vectors  and  on  the  vector  ex- 
tending between  the  points  of  application  of  the  forces,  so 
that  it  is  independent  of  the  choice  of  origin.  Such  a pair 
of  forces  is  called  a couple. 


24-21.  Internal  resistance  of  a galvanometer.  The  follow- 
ing method  can  be  used  to  determine  the  internal  re- 
sistance of  a sensitive  galvanometer.  A variable  high 
resistance  connected  in  series  with  the  galvanometer  and  a 
battery  is  adjusted  until  the  galvanometer  reads  full  scale. 
A variable  low  resistance  is  now  connected  across  the  ter- 
minals of  the  galvanometer  without  changing  anything 
else.  This  resistance  is  in  parallel  with  the  moving  galvanom- 
eter coil.  It  is  varied  until  the  galvanometer  reads  half 
scale.  Prove  that  the  galvanometer  internal  resistance  is 
then  equal  to  the  variable  resistance  across  its  terminals. 

24-22. Multirange ammeter. In Fig. 24E-22, the shunt
resistance arrangement ABCD and switch S convert the
galvanometer G into a multirange ammeter. The resistance
between points A and B is $R_{AB} = 0.0100\ \Omega$; the other
resistances are $R_{BC} = 0.0900\ \Omega$ and $R_{CD} = 0.900\ \Omega$. If
full-scale deflection occurs when $1.00 \times 10^{-3}$ A flows
through the galvanometer, whose internal resistance is
$99.0\ \Omega$, what is the maximum current that can be measured
with S in position D? in position C? in position B?


24-23.  Electric  motor. 

a. The electric motor in Fig. 24-17 has a coil with N
turns, each of area a, and the current flowing through it is
i. Show that the magnitude of the torque acting on the coil
is $\tau = Nia\mathcal{B}\sin\theta$, where $\theta$ is the angle between the applied
magnetic field $\boldsymbol{\mathcal{B}}$ and the coil's magnetic dipole moment.

b. Show that the power output of the motor is
$P = 4\nu Nia\mathcal{B}$, where $\nu$ is the number of revolutions of the coil
per second.


24-24.  Measuring  a magnetic  dipole  moment.  An  iron  bar 
magnet  is  10  cm  long  and  has  a rectangular  cross  section 




1.0 cm by 0.50 cm. When the magnet is suspended horizontally
by a thread tied to its midpoint, it oscillates about
a north-south direction with a period of 20 s. Estimate the
magnetic dipole moment of the magnet, using the value of
the horizontal component of the earth's magnetic field
in Exercise 24-1. The density of iron is $7.9 \times 10^{3}\ \mathrm{kg/m^3}$.

24-25.  Pole  strength.  An  iron  bar  magnet  of  length 
10  cm  and  cross  section  1.0  cm2  has  a magnetization  of 
102  A/m.  Calculate  the  magnet’s  magnetic  pole  strength. 

24-26. Iron-core electromagnet. A toroid has an iron
core. Measurement shows that the internal magnetic field
in the iron core is $\mathcal{B}_{\text{int}} = 0.60$ T when the current in the
winding of 1000 turns per meter is 1.0 A.

a. What is $\mathcal{B}_0$, the magnetic field applied externally to
the core (the field present if there were no iron core)?

b. What is $\mathcal{B}_d$, the demagnetizing field?

c. What is $i_s/l$, the equivalent amperian surface current
per unit length?

d. If there were no iron core, what current would be
necessary in the winding for $\mathcal{B}$ to equal 0.60 T?

e. What is $K_m$, the relative permeability of iron,
under the conditions of this exercise?

24-27.  Curie  constant.  Using  Fig.  24-25,  compute  the 
Curie  constant  for  chromium  potassium  alum. 

24-28.  Curie  constant  and  Curie  temperature.  Using  Fig. 
24-27,  compute  the  Curie  constant  and  the  Curie  temper- 
ature for  gadolinium. 

Group  C 

24-29. Current balance. Figure 24E-29 illustrates schematically
a current balance used to measure current. A pivoted
metal frame ECDF, with three sides of equal length
$l$, hangs vertically with the lower part CD touching a long
straight wire AG. The frame and wire are connected in
series (not shown in the figure) to a source of current i.
This causes CD to move away from AG, with the square
frame pivoting about EF. The angle of rotation $\theta$ is measured.
For the small angles involved, the distance between
C'D' and AG can be approximated by $l\theta$.


a. Show that the torque $\tau_m$ on the frame about an origin
on the rotation axis due to the current is
$\tau_m = F_m l\cos(\theta/2) \simeq F_m l$, where $F_m$ is the magnetic force on CD.

b. Show that $F_m = 2 \times 10^{-7}\, i^2/\theta$ in SI units.


c. The restoring torque applied to the frame by gravity
is due to the weight of the frame, which acts at its center
of mass. Show that this is located at a distance $\tfrac{2}{3}l$ from the
axis of rotation.

d. Show that the restoring torque is
$\tau_g = Mg\,\tfrac{2}{3}l\sin\theta \simeq \tfrac{2}{3}Mgl\theta$, where M is the mass of the frame.

e. For a frame with M = 0.022 kg, a current i causes a
rotation $\theta = 0.023$ rad. Evaluate i.

24-30. Just drifting along. A wire carries a current of
0.100 A. An electron moves parallel to the current a distance
of 1.00 cm from the wire at the drift speed of the
electrons in the wire of $1.00 \times 10^{-3}$ m/s.

a.  Calculate  the  magnetic  and  electric  forces  on  the 
electron  measured  by  an  observer  at  rest  with  respect  to 
the  wire. 

b.  Calculate  the  magnetic  and  electric  forces  on  the 
electron  measured  by  an  observer  moving  with  the  elec- 
tron. 

24-31. Everything is relative, III. A beam of protons
passing a given point moves at a velocity of magnitude
$5.00 \times 10^{7}$ m/s and carries a charge per unit length of
$2.00 \times 10^{-12}$ C/m, as measured by an observer in the laboratory
frame of reference. An isolated proton moves at
the same velocity, parallel to and at a distance of 0.0100 m
from the beam.

a.  Compute  the  electric  and  magnetic  forces  on  the 
proton  as  measured  by  an  observer  in  the  laboratory. 

b.  Compute  the  electric  and  magnetic  forces  on  the 
proton  as  measured  by  an  observer  moving  with  the  pro- 
ton. 

c. Compute the electric and magnetic forces on the
proton as measured by an observer moving at a velocity of
magnitude $1.00 \times 10^{8}$ m/s in the direction of the beam.

24-32. Antimatter?

a. In Sec. 24-2 the text describes the action of a “detector
electron” near a current-carrying wire. Describe
how the experiment or its results would be affected if the
detector were an antielectron (positron) with mass $m_e$ and
charge $+e$.

b.  For  each  particle,  there  is  an  antiparticle  with  the 
same  mass  as  the  particle,  and  a charge  opposite  to  that  of 
the  particle.  Could  the  experiment  described  in  Sec.  24-2 
be  performed  in  a distant  galaxy,  to  determine  whether  a 
current-carrying  wire  was  made  of  copper  or  of  anti- 
copper? 

24-33. Far field of a magnetic dipole. Current flows
around a circular loop. Evaluate the magnetic field of this
current loop at a distant point on the z axis, which is
normal to the plane of the loop in the direction of its magnetic
dipole moment and has its origin at the center of the
loop. Show that the field at such a point has only a z component,
whose value is given by $\mathcal{B}_z = (\mu_0/4\pi)(2m/z^3)$,
where m is the magnetic dipole moment magnitude of the
loop. Compare your result with the far electric field of an
electric dipole by setting x = 0 in Eq. (21-28b).




24-34. Galvanometer design. The galvanometer of Fig.
24-15 is to have an internal resistance of $1.00\ \Omega$. It is to be
wound with no. 30 copper wire whose resistance is
$338.6\ \Omega$/km. The dimensions of the rectangular coil are
2.50 cm by 2.00 cm.

a. How many turns will the coil have?

b. The magnetic field in the galvanometer is to be
0.40 T. The spring constant k equals $5.00 \times 10^{-6}$
N·m/rad. The galvanometer is to give full-scale deflection
with a current of $1.00 \times 10^{-3}$ A. What is the full-scale
deflection angle in radians?

24-35. Multimeter design. Figure 24E-35 represents a
variable range ammeter-voltmeter. The internal resistance
R of the galvanometer movement is $100\ \Omega$. The galvanometer
current required for full-scale deflection is
$1.00 \times 10^{-3}$ A.

a. What must be the resistance of each of the “voltage
multipliers” E, F, and G?

b. What must be the resistance of each of the
“shunts” B, C, and D?

24-36. Dipole moment-angular momentum ratio. A uniformly
charged disk whose total charge has magnitude $|q|$
and whose radius is r rotates with constant angular velocity
of magnitude $\omega$.

a. Show that the magnetic dipole moment has the
magnitude $\omega|q|r^2/4$.

b. If the mass of the disk is m, what is the magnitude
of its angular momentum?

c. What is the ratio of the magnitude of its magnetic
dipole moment to the magnitude of its angular momentum?
How does the sign of the charge govern the relation
between the directions of these vector quantities?
The result is quite general; the ratio of the magnetic dipole
moment magnitude to the angular momentum magnitude
is $|q|/2m$ for all macroscopic objects.

24-37. Spinning electron. An electron has an intrinsic
magnetic dipole moment of magnitude $0.93 \times 10^{-23}$
A·m². It has an intrinsic angular momentum of magnitude
$0.53 \times 10^{-34}$ kg·m²/s.

a. What is the ratio of the magnitudes of its magnetic
dipole moment to its angular momentum? Is this ratio
equal to the value $e/2m_e$ that would be predicted from Exercise
24-36 for a rotating macroscopic body with the
same charge-to-mass ratio as an electron?

b. The intrinsic angular momentum could be attributed
to a spinning motion of the electron. Assuming that
the electron is a spinning sphere, calculate the speed of a
point on the electron’s equator from the value of its angular
momentum magnitude and the value of its “classical
radius,” $2.8 \times 10^{-15}$ m. Why isn't the model tenable?

24-38. Effect of an air gap.

a. The number of turns for an iron-core toroid is
N = 20, and the current it carries is i = 10 A. The average
length of the core is $l = 25$ cm, and its relative permeability
is $K_m = 1000$. What is the internal magnetic field $\mathcal{B}_{\text{int}}$ in
the core?

b. Suppose a piece of iron of length $l_1 = 1.0$ cm is
sawed out to make the magnetic field in the iron accessible,
as in Fig. 24E-38. $\mathcal{B}_{\text{int}}$ will be weakened considerably.
To see how to calculate the new value, note that Ampère’s
law for the original toroid can be written as
$\mathcal{B}_{\text{int}}\, l = K_m \mu_0 N i$ or $\mathcal{B}_{\text{int}}\, l/K_m = \mu_0 N i$. If there were no iron,
Ampère’s law could be written $\mathcal{B}_{\text{int}}\, l = \mu_0 N i$. Thus the effect
of the iron is to reduce $l$ by a factor of $K_m$. If there is
an air gap, Ampère’s law now states that
$\mathcal{B}_{\text{int}}\, l_1 + \mathcal{B}_{\text{int}}\, l_2/K_m = \mu_0 N i$, where $l_2$ is the average length of the
remaining iron. Apply this form to calculate the new value
of $\mathcal{B}_{\text{int}}$.

Fig. 24E-35


Fig.  24E-38 
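The two forms of Ampère's law quoted in Exercise 24-38 can be compared numerically. The following is only a sketch under stated assumptions: the function names are hypothetical, and the length of the remaining iron is taken to be $l_2 = l - l_1$, which the exercise implies but does not state outright.

```python
import math

MU0 = 4 * math.pi * 1e-7  # permeability of free space, T*m/A


def b_int_closed(N, i, l, K_m):
    """Closed iron toroid: B_int * l / K_m = mu0 * N * i."""
    return K_m * MU0 * N * i / l


def b_int_with_gap(N, i, l, l1, K_m):
    """Toroid with an air gap of length l1:
    B_int * l1 + B_int * l2 / K_m = mu0 * N * i.
    Assumption: the remaining iron has length l2 = l - l1.
    """
    l2 = l - l1
    return MU0 * N * i / (l1 + l2 / K_m)


# Values quoted in the exercise: N = 20, i = 10 A, l = 25 cm, l1 = 1.0 cm, K_m = 1000
closed = b_int_closed(N=20, i=10.0, l=0.25, K_m=1000.0)
gapped = b_int_with_gap(N=20, i=10.0, l=0.25, l1=0.010, K_m=1000.0)
print(f"closed core: {closed:.3f} T, with gap: {gapped:.4f} T")
```

Because $l_2/K_m$ is much smaller than $l_1$ here, the gap term dominates the denominator, which is why even a short gap weakens the field so strongly.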


24-39.  Quantum  phenomenon.  When  a ferromagnetic 
material  is  saturated  magnetically,  all  its  atomic  magnetic 
dipoles  are  aligned  and  their  orientational  potential  en- 
ergies are  a minimum. 

a. Predict approximately the energy that must be
supplied per dipole to destroy the alignment from the following
information: The magnetic dipole moment magnitude
m of an iron atom is $2 \times 10^{-23}$ A·m². The spacing
between dipoles is $1 \times 10^{-10}$ m. The (far) magnetic field of
a dipole at a point along its axis, the z axis, has only the
component given by $\mathcal{B}_z = (\mu_0/4\pi)(2m/z^3)$; see Exercise
24-33.

b. It is found experimentally that ferromagnetism is
destroyed at the Curie temperature, $T_c = 1043$ K. Only
when this temperature is reached is there actually enough
thermal energy per dipole to destroy their alignment. Estimate
the value of this thermal energy.

The  thermal  energy  actually  required  to  destroy 
alignment  is  seen  to  be  three  orders  of  magnitude  greater 
than  the  predicted  energy  for  destroying  alignment.  This 
discrepancy  could  be  resolved  only  by  quantum  me- 
chanics. 




25  Electromagnetic Induction


25-1  FARADAY’S LAW: INDUCED CURRENTS

We have seen that a steady magnetic field comes into existence when a
steady electric current is made to flow. Since the usual experimental procedure
is to turn on an electric current and then to detect the magnetic field,
it seems natural to report the result of the experiment by saying that the
electric current “produces” the magnetic field.

If a steady electric current “produces” a steady magnetic field, it is
tempting to argue on the basis of symmetry that a steady magnetic field
should “produce” a steady electric current. During the 1820s, many investigators
searched for evidence of just such a phenomenon. Figure 25-1a is a
sketch of an experiment which fails to demonstrate the sought-for connection.
A coil of wire is wound around a bar magnet and then attached to a
galvanometer. No current is observed. The equivalent experiment of Fig.
25-1b, in which the permanent magnet is replaced by a solenoid carrying a
steady current, leads to similar negative results. It makes no difference
whether the source of the steady magnetic field is a permanent magnet or a
solenoid: no current flows through the coil attached to the galvanometer.

As happens again and again in the progress of science, a crucial experimental
observation was necessary to provoke a fruitful line of scientific
thought. This observation was made in 1831. Two investigators independently
saw and seized on what must have been seen many times before and
overlooked. In doing the experiment of Fig. 25-1, Michael Faraday and Joseph
Henry noted that there is indeed no detectable deflection of the galvanometer
when the magnet is held in place (or the current in the inner solenoid
continues to flow steadily). But there is a substantial, if short-lived,
deflection when the magnet is inserted into the coil (or the current in the
inner solenoid is turned on by closing the switch). Inserting the magnet
causes a short-lived current, called the induced current, to flow through
the outer coil and the galvanometer in one sense. Withdrawing the magnet
causes a short-lived flow in the opposite sense. The same observations are
made upon closing and opening the switch in the solenoid circuit of Fig.
25-1b. Reversing the orientation of the poles of the magnet or interchanging
the leads to the battery simply reverses all the observed galvanometer
deflections.

Fig. 25-1 (a) A coil of copper wire is
wound around a permanent magnet.
The circuit is completed by attaching
the ends of the coil to the terminals of a
galvanometer G. If the system is undisturbed,
no current flows. But if the magnet
is removed from the coil or reinserted
into it, an electric current flows
through the circuit (as evidenced by the
galvanometer reading) during the process
of removal or reinsertion. (b) A
current-carrying solenoid is substituted
for the permanent magnet of part a. As
long as the switch is closed and a steady
current flows through the solenoid, no
current is observed in the coil connected
to the galvanometer. But if the switch
is opened (interrupting the current in
the solenoid) or reclosed (allowing the
current to resume), a current is observed
in the galvanometer circuit for a
brief interval.

These  observations  are  the  experimental  foundation  of  electromag- 
netic induction,  to  which  this  chapter  is  devoted.  Faraday  and  Henry  in- 
ferred on  the  basis  of  the  experiment  that  it  is  the  change  in  the  magnetic 
field  over  an  interval  of  time,  and  not  the  field  itself,  that  induces  an  electric 
current.  We  will  refine  this  idea  at  length;  it  is  called  Faraday’s  law. 


Michael  Faraday  (1791-1867),  whose  work  had  enormous  impact  in  the 
fields  of  electricity,  magnetism,  and  electrochemistry,  was  born  into  a very  poor 
family  living  in  London.  His  childhood  was  one  of  great  deprivation  and  not  infre- 
quent hunger.  Since  he  did  not  have  the  physical  strength  to  follow  his  father  as  a 
blacksmith,  he  was  apprenticed  in  1804,  at  age  thirteen,  to  a bookbinder.  His 
master  was  a kindly  man  who  encouraged  Faraday  to  read  the  books  in  the 
bindery. 

In  1812,  toward  the  end  of  his  apprenticeship,  Faraday  attended  his  first  of 
two  courses  of  lectures  in  chemistry  at  the  Royal  Institution,  borrowing  the 
one-shilling  fee — a penny  a lecture — from  his  older  brother.  It  was  Faraday’s 
good  fortune  to  be  at  the  right  place  at  the  right  time.  Rumford  had  founded  the 
Royal  Institution  only  eleven  years  earlier,  and  one  of  its  major  purposes — then 
nearly  a unique  one — was  to  provide  educational  opportunities  in  science  and 
technology  to  the  general  public  at  low  or  no  cost. 

Faraday  resolved  to  abandon  a promising  career  as  a bookbinder  for  the  un- 
certain gamble  that  he  might  someday  become  a natural  philosopher.  He  bound 
his  meticulous  lecture  notes  and  presented  them  to  Sir  Humphry  Davy,  the 
director  of  the  Royal  Institution,  in  support  of  a job  application.  He  was  hired  as  a 
bottle  washer,  general  laboratory  assistant,  and  occasional  (though  rebellious) 
valet  to  Davy.  By  1816  he  was  publishing  his  own  work;  in  1821  he  gained  wide 
public  attention  by  inventing  one  form  of  the  electric  motor. 

Joseph  Henry  (1797-1878)  passed  the  first  half  of  his  life  in  Albany,  New 
York.  At  the  time  of  his  experimental  researches  in  electromagnetism,  he  was  a 
teacher  in  what  we  would  today  call  a high  school.  While  his  discovery  of  induced 
current  came  slightly  earlier  than  Faraday’s,  he  did  not  publish  his  results  until 
somewhat  later. 


As is often the case with an important experimental discovery, the
Faraday-Henry experiment of Fig. 25-1 does not provide the simplest approach
to Faraday’s law, the physical law which underlies it. We therefore
begin the quantitative study of Faraday’s law by means of the idealized
experiment illustrated in Fig. 25-2. (This experiment may seem at first
glance to be unrelated to the experiment of Fig. 25-1, but the connection
will become clear in due course.) A conducting wire of length Y extends in a
direction parallel to the y axis. It is pulled along in the direction of the x axis
(perpendicular to its length) with velocity $\mathbf{v}$. There is a uniform magnetic
field $\boldsymbol{\mathcal{B}}$ directed along the z axis, as shown. The field $\boldsymbol{\mathcal{B}}$ extends over the entire
region through which the wire is moved.

Fig. 25-2 A conducting wire oriented along the y direction
is pulled with constant velocity $\mathbf{v}$ in the positive x
direction. A uniform magnetic field $\boldsymbol{\mathcal{B}}$ is oriented along
the positive z direction. Because of the magnetic force exerted
on the mobile charge carriers in the wire, an electric
field $\boldsymbol{\mathcal{E}}_{\text{opp}}$ is induced in the wire, as shown.

The experiment is reminiscent of the Hall effect and the Thomson
$e/m_e$ experiments of Sec. 23-3. As in those experiments, electric charge carriers
are moved along in a direction perpendicular to a magnetic field.
Although here the carriers are moved through the field in a different way,
the result is the same: They experience a magnetic force $\mathbf{f}$ whose direction
is perpendicular both to $\mathbf{v}$ and to $\boldsymbol{\mathcal{B}}$. (Throughout this chapter we use lowercase
$\mathbf{f}$ to denote a magnetic force exerted on a single charge carrier and uppercase
$\mathbf{F}$ to denote the magnetic force exerted on a conductor.) Positive
charges q in the wire experience a force

$$\mathbf{f} = q\mathbf{v} \times \boldsymbol{\mathcal{B}} \qquad (25\text{-}1)$$

in the negative y direction, while negative charges experience a force of
equal magnitude in the positive y direction. If either the positive or the negative
charges (or both) are mobile, the result is a separation of charge, just
as in the Hall effect discussed in Sec. 23-4. And just as in the Hall effect, the
charge separation continues until an electric field $\boldsymbol{\mathcal{E}}_{\text{opp}}$ builds up which is
large enough to oppose further motion of the charges, as shown numerically
in Example 25-1.
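The directions quoted for the force of Eq. (25-1) can be checked with a few lines of code. This is a minimal sketch, assuming unit magnitudes for the charge, speed, and field (only the directions matter here); the helper names `cross` and `magnetic_force` are hypothetical.

```python
def cross(a, b):
    """Return the cross product a x b of two 3-component vectors."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)


def magnetic_force(q, v, B):
    """Magnetic force f = q v x B on a single charge carrier."""
    return tuple(q * c for c in cross(v, B))


v = (1.0, 0.0, 0.0)  # wire pulled along +x (unit speed, assumed)
B = (0.0, 0.0, 1.0)  # field along +z (unit magnitude, assumed)

f_pos = magnetic_force(+1.0, v, B)  # positive carrier
f_neg = magnetic_force(-1.0, v, B)  # negative carrier
print("force on positive charge:", f_pos)
print("force on negative charge:", f_neg)
```

The only nonzero component is along y in both cases, negative for the positive carrier and positive for the negative carrier, in agreement with the text.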


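EXAMPLE 25-1

The wire in Fig. 25-2 has length Y = 50 cm. It is pulled to the right at constant
speed v = 1.5 m/s through a uniform magnetic field of magnitude $\mathcal{B} = 1.8$ T,
directed as shown in the figure. Find the magnitude $\mathcal{E}_{\text{opp}}$ of the opposing electric
field and the magnitude |V| of the potential difference between the ends of the wire.

■ When the steady state is reached, the Lorentz force on a charge carrier must be
zero, just as in the Hall effect. You can thus write

$$q\boldsymbol{\mathcal{E}}_{\text{opp}} + q\mathbf{v} \times \boldsymbol{\mathcal{B}} = q(\boldsymbol{\mathcal{E}}_{\text{opp}} + \mathbf{v} \times \boldsymbol{\mathcal{B}}) = 0$$

The vectors $\boldsymbol{\mathcal{E}}_{\text{opp}}$, $\mathbf{v}$, and $\boldsymbol{\mathcal{B}}$ are mutually perpendicular and oriented so that $\boldsymbol{\mathcal{E}}_{\text{opp}}$ is
directed oppositely to $\mathbf{v} \times \boldsymbol{\mathcal{B}}$. Thus, as far as the magnitudes are concerned, you
satisfy this equation if

$$\mathcal{E}_{\text{opp}} = v\mathcal{B}$$

Inserting the numerical values gives you

$$\mathcal{E}_{\text{opp}} = 1.5\ \text{m/s} \times 1.8\ \text{T} = 2.7\ \text{V/m}$$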




Since the electric field is uniform throughout the wire, you can find the magnitude
|V| of the potential difference between the ends of the wire (without regard to sign)
by taking the product

$$|V| = \mathcal{E}_{\text{opp}}\, Y$$

The value you obtain is

$$|V| = 2.7\ \text{V/m} \times 0.50\ \text{m} = 1.4\ \text{V}$$
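The arithmetic of Example 25-1 can be reproduced in a short script. This is only an illustrative sketch; the variable names are chosen here and are not the text's notation.

```python
# Values given in Example 25-1
Y = 0.50  # length of the wire, m
v = 1.5   # speed of the wire, m/s
B = 1.8   # magnitude of the magnetic field, T

# Steady state: the electric force balances the magnetic force, so E_opp = v * B
E_opp = v * B

# The field is uniform along the wire, so |V| = E_opp * Y
V = E_opp * Y

print(f"E_opp = {E_opp:.2f} V/m")
print(f"|V|   = {V:.2f} V")
```

The unrounded product is 1.35 V, which the example reports to two significant figures as 1.4 V.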

We now modify the experiment so as to avoid the buildup of the opposing
electric field $\boldsymbol{\mathcal{E}}_{\text{opp}}$. As the wire is pulled through the magnetic field, it
makes sliding contact at its ends with a stationary U-shaped conductor, as
shown in Fig. 25-3. This conductor provides a path for flow of the separated
charge. Excess electrons are drained off the moving wire at point b
and flow around to recombine with excess positive charge at point a. This
amounts to a current i flowing through the system in the sense shown:
from a to b in the stationary U-shaped conductor and from b back to a in
the moving wire.

From  the  point  of  view  of  the  effect  produced  in  the  circuit  external 
to  the  moving  wire,  it  is  as  though  there  were  a potential  difference  between 
points a and b. This is so because potential differences across conductors
make currents flow through them, and a current is flowing through the external
circuit from a to b. But there is not a potential difference between these two
points in any other meaning of the term. (A potential difference is associated
with  a conservative  force,  whereas  the  magnetic  force  that  drives  current 
through  the  external  circuit  is,  as  we  shall  see  later,  not  conservative.)  We 
describe  the  situation  by  saying  that  the  magnetic  force  acting  on  the  charge 
carriers  in  the  moving  wire  is  acting  as  a source  of  an  electromotive  “force,”  that 
is,  as  a source  of  emf.  (It  was  noted  in  Sec.  22-1  that  a force  need  not  be  con- 
servative to  act  as  a source  of  emf.)  To  say  it  again,  moving  the  wire 
through  the  magnetic  field  results  in  an  emf  which  has  the  same  effect  in 
the  external  circuit  as  if  there  were  a potential  difference  between  the  “ter- 
minals” a and  b,  with  a at  the  higher  potential. 


Fig.  25-3  The  wire  is  pulled  through  a 
magnetic  field  as  in  Fig.  25-2.  Here, 
however,  the  U-shaped  stationary  wire 
completes  the  electric  circuit  between 
ends  a and  b of  the  moving  wire  exter- 
nally. A current  i flows  in  the  sense 
shown.  Its  direction  through  the 
moving  wire  is  denoted  by  the  unit 
vector  s.  An  external  force  Fext  = — Fm 
must  now  be  applied  to  the  wire  to  keep 
it  moving  at  constant  velocity,  since  the 
moving  charges  in  the  wire  experience  a 
total  magnetic  force  Fm , as  shown. 




Energy  must  be  supplied  to  maintain  the  current  flow  against  the  fric- 
tion of  electric  resistance.  The  motion  of  the  wire  through  the  magnetic 
field  induces  a current  i through  it,  as  you  have  just  seen.  Thus  the  moving 
wire  is  a current-carrying  wire,  and  there  is  a magnetic  force  exerted  on  it 
per  unit  length  which  is  given  by  Eq.  (24-86), 


$$\frac{\mathbf{F}_m}{Y} = i\,\hat{\mathbf{s}} \times \boldsymbol{\mathcal{B}} \tag{25-2}$$


The  unit  vector  s is  chosen  in  the  direction  of  current  flow,  as  shown  in 
Fig.  25-3.  Since  the  length  of  the  wire  is  Y,  the  total  magnetic  force  exerted 
on  it  is 


$$\mathbf{F}_m = iY\,\hat{\mathbf{s}} \times \boldsymbol{\mathcal{B}} \tag{25-3a}$$

Using  the  right-hand  rule  for  cross  products,  you  can  easily  verify  that  this 
force  is  directed  toward  the  inside  of  the  loop  in  Fig.  25-3,  so  that 

$$\mathbf{F}_m = -i\mathcal{B}Y\,\hat{\mathbf{x}} \tag{25-3b}$$

where $\hat{\mathbf{x}}$ is the unit vector in the positive x direction shown in the figure. (If
the wire is to continue to move at the constant velocity v, an external force
$\mathbf{F}_{\text{ext}} = -\mathbf{F}_m$ must be supplied, so that the net force on the wire will be zero.)

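As the wire moves through a displacement $d\mathbf{x}$, the work dW done on it
by the magnetic force $\mathbf{F}_m$ is $dW = \mathbf{F}_m \cdot d\mathbf{x}$. Using Eq. (25-3b) gives

$$dW = -i\mathcal{B}Y\,\hat{\mathbf{x}} \cdot d\mathbf{x} = -i\mathcal{B}Y\,dx \tag{25-4}$$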

But  Y dx  is  the  area  da  swept  out  by  the  wire  when  it  moves  through  the  dis- 
tance dx.  So  Eq.  (25-4)  can  be  written 

$$dW = -i\mathcal{B}\,da$$

According to Eq. (23-49), the magnetic flux element $d\Phi_m$ is defined by the
equation

$$d\Phi_m = \boldsymbol{\mathcal{B}} \cdot d\mathbf{a} \tag{25-5a}$$

In this case, where $\boldsymbol{\mathcal{B}}$ penetrates the surface element at right angles, the
vectors $\boldsymbol{\mathcal{B}}$ and $d\mathbf{a}$ are parallel, both being oriented in the positive z direction.
Equation (25-5a) thus takes the form

$$d\Phi_m = \mathcal{B}\,da \tag{25-5b}$$

In terms of $d\Phi_m$, the work dW is given by

$$dW = -i\,d\Phi_m \tag{25-6}$$

The rate dW/dt at which work is done by the magnetic force is the
power P. Equation (25-6) can be divided by dt on both sides to obtain

$$P = \frac{dW}{dt} = i\left(-\frac{d\Phi_m}{dt}\right) \tag{25-7}$$

In order to understand the physical role of the term $-d\Phi_m/dt$, we compare
this equation with Joule’s law, Eq. (22-50),

$$P = iV$$


Joule’s law describes the electric power, or work done per unit time, by an
electric potential difference V when it drives a current i through a circuit
(or part of a circuit). In Eq. (25-7), P is the electrical work done per unit
time when a current i is driven through a circuit as a consequence of a
change in the magnetic flux penetrating it. The quantity $-d\Phi_m/dt$ is thus
the “driving force” or emf. We assign it the symbol V, the same symbol
used for electric potential difference, because it drives a current as if it
were an electric potential difference. We therefore write the relation
obtained by comparing Eqs. (25-7) and (22-50) as


$$V = -\frac{d\Phi_m}{dt} \tag{25-8}$$


The  emf  induced  in  an  electric  circuit  is  equal  to  the  negative  of  the  time  rate  of 
change  of  the  magnetic  flux  penetrating  the  circuit.  This  is  Faraday’s  law.  The 
significance  of  the  negative  sign  will  become  apparent  shortly.  Example 
25-2  illustrates  one  application  of  Faraday’s  law. 
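
The energy bookkeeping behind Eq. (25-8) can be checked with a few lines of arithmetic. The sketch below is only illustrative: it assumes a sliding-wire circuit with a fixed total resistance R = 10 Ω (a hypothetical value, since the text does not specify R at this point) and uses sample values for the field, wire length, and speed. It compares the mechanical power delivered by the external force with the electrical power i|V| and with the heating rate i²R.

```python
# Sliding-wire energy balance (illustrative values; R is a hypothetical assumption).
B = 1.8      # magnetic field magnitude, T
Y = 0.50     # length of the moving wire, m
v = 1.5      # speed of the wire, m/s
R = 10.0     # total circuit resistance, ohm (assumed constant for this sketch)

emf = B * Y * v          # |V| = |dPhi_m/dt|, since dPhi_m = B da and da = Y dx
i = emf / R              # induced current from Ohm's law
F_ext = i * B * Y        # external force needed to balance the magnetic force
P_mech = F_ext * v       # rate at which the external force does work
P_elec = i * emf         # electrical power, Joule's law P = iV
P_heat = i**2 * R        # rate of heating in the resistance

print(f"P_mech = {P_mech:.5f} W, P_elec = {P_elec:.5f} W, P_heat = {P_heat:.5f} W")
```

Under these assumptions the three powers agree, which is the content of comparing Eq. (25-7) with Joule’s law: the work supplied mechanically reappears as electrical work done on the current.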


EXAMPLE  25-2 

Figure 25-4 shows the apparatus of Fig. 25-3 constructed of no. 36 copper wire,
stiffened with wooden rods. As in Example 25-1, the length Y of the movable wire is 50
cm, and the magnetic field is $\mathcal{B}$ = 1.8 T. The wire has resistance per unit length r =
1.7 Ω/m. The experiment is started with the movable wire at the left end of the
apparatus, and the wire moves to the right at a constant speed v = 1.5 m/s.

a. Use Faraday’s law to evaluate the magnitude |V| of the emf between ends a
and b of the movable wire.

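■ The magnetic field $\boldsymbol{\mathcal{B}}$ is everywhere the same and is normal to the loop.
Thus you have $\boldsymbol{\mathcal{B}} \cdot d\mathbf{a} = \mathcal{B}\,da$. So you can substitute Eq. (25-5b) into Eq. (25-8) and
write Faraday’s law in the special form

$$|V| = \frac{d}{dt}(\mathcal{B}a) = \mathcal{B}\,\frac{da}{dt}$$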

Since  the  width  of  the  loop  remains  constant  while  its  length  increases,  you  can 
write  the  rate  of  change  of  the  loop  area  as  da/dt  = Y dx/dt.  But  dx/dt  is  just  the 
speed  v of  the  movable  wire,  so  you  have  da/dt  = Yv,  and 

$$|V| = \mathcal{B}Yv$$

Inserting the numerical values into this equation, you obtain

$$|V| = 1.8\ \text{T} \times 0.50\ \text{m} \times 1.5\ \text{m/s} = 1.4\ \text{V}$$
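
As a numerical cross-check of part a, the following sketch builds the flux through the loop as a function of time, differentiates it with a central difference, and compares the result with the closed form. It is an illustration only: the function names, the starting position `x0`, and the step size `dt` are assumptions made for this sketch and do not appear in the example.

```python
def flux(t, B=1.8, Y=0.50, v=1.5, x0=0.0):
    """Flux B*Y*x(t) in webers for a wire at position x(t) = x0 + v*t."""
    return B * Y * (x0 + v * t)

def emf_magnitude(t, dt=1e-6):
    """|V| = |dPhi_m/dt|, estimated with a central difference."""
    return abs(flux(t + dt) - flux(t - dt)) / (2 * dt)

numeric = emf_magnitude(1.0)       # evaluated at t = 1.0 s (any time gives the same value)
closed_form = 1.8 * 0.50 * 1.5     # B * Y * v

print(f"numerical |V| = {numeric:.3f} V")
print(f"B * Y * v     = {closed_form:.3f} V")
```

Both values come out near 1.35 V, which rounds to the 1.4 V quoted above.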


Fig. 25-4 Illustration for Example 25-2. The field $\mathcal{B}$ = 1.8 T is directed outward.




This result agrees with that of Example 25-1, where a Hall-effect calculation was
used. ■

b. What is the reading on the ammeter 2.0 s after the experiment is begun?

■ The initial resistance of the circuit is $R_0 = 2Yr$. Thereafter it increases as the
movable wire slides farther and farther to the right. At time t, the additional wire
length is 2x = 2vt, and the resistance is given by

$$R = R_0 + 2xr = 2(Y + vt)r$$

At time t = 2.0 s, you have

$$R = 2(0.50\ \text{m} + 1.5\ \text{m/s} \times 2.0\ \text{s}) \times 1.7\ \Omega/\text{m} = 12\ \Omega$$

The current at any moment is i = |V|/R. From part a you have |V| = 1.4 V. At
t = 2.0 s, the current is

$$i = \frac{1.4\ \text{V}}{12\ \Omega} = 0.12\ \text{A}$$


25-2 FARADAY’S LAW: THE CRUCIAL ROLE OF CHANGING MAGNETIC FLUX


Fig. 25-5 Idealized picture of the Faraday experiment.


Up to this point it may appear that Faraday’s law, written in the form of Eq.
(25-8), is just a new and somewhat special way of expressing a sort of Hall
voltage produced by the magnetic force $\mathbf{f} = q\mathbf{v} \times \boldsymbol{\mathcal{B}}$ exerted on each of the
charge carriers (on the average) in terms of the magnetic flux $\Phi_m$ instead of
the field $\boldsymbol{\mathcal{B}}$. But there is more to Faraday’s law than this. Faraday’s original
experiment, sketched in Fig. 25-16, is shown in idealized form in Fig. 25-5.
Here there is no gross mechanical motion. An electric current is turned on
(or off) in one coil, and an induced current flows briefly in a nearby coil.
There is no change in the area enclosed by the second coil, and there is no
velocity v to account for a force $\mathbf{f} = q\mathbf{v} \times \boldsymbol{\mathcal{B}}$ on each of the charge carriers
in the coil. Nevertheless, there must be a force exerted on the carriers,
since they do move through the circuit in spite of the tendency of the
friction of electric resistance to bring them to rest. (This is evidenced by the
reading on the galvanometer.) What is happening in the second coil as the
switch in the first circuit is closed is as follows: At any location, the magnetic
field $\mathcal{B}$ resulting from the current in the first circuit changes at some rate
$d\mathcal{B}/dt$. Hence the magnetic flux penetrating the second coil changes at a
rate $d\Phi_m/dt$. In the sliding-wire experiment, on the other hand, the flux
change is accomplished by increasing the area enclosed by the circuit. Since
$\mathcal{B}$ is kept constant, the induced emf can be calculated by using the magnetic
force equation $F = iY\mathcal{B}$. (We use the symbol F to denote the magnitude of
the force on the entire wire.) But if a unified account is to be made of both
the sliding-wire experiment of Fig. 25-3 and the Faraday experiment of
Fig. 25-5, it must be the changing magnetic flux which is crucial.

To see this in a more general way, consider the planar loop in Fig.
25-6, which encloses an area whose magnitude and orientation are
specified by the vector $\mathbf{a}$. The loop lies in a region of uniform magnetic field $\boldsymbol{\mathcal{B}}$,
and there is an arbitrary angle $\theta$ between $\boldsymbol{\mathcal{B}}$ and $\mathbf{a}$. According to the
definition of magnetic flux $\Phi_m$ in Sec. 23-6, the flux penetrating the loop is
given by the expression

$$\Phi_m = \boldsymbol{\mathcal{B}} \cdot \mathbf{a} = \mathcal{B}a\cos\theta$$

which may be compared with Eq. (23-49), $d\Phi_m = \boldsymbol{\mathcal{B}} \cdot d\mathbf{a} = \mathcal{B}\,da\cos\theta$.




Fig. 25-6 A planar surface whose area
and orientation are denoted by the
vector $\mathbf{a}$ is located in a region of uniform
magnetic field $\boldsymbol{\mathcal{B}}$. The (smaller) angle
between $\mathbf{a}$ and $\boldsymbol{\mathcal{B}}$ is $\theta$.


Fig.  25-7  A search  coil  magnetometer. 
A small  coil  mounted  on  the  end  of  a 
rod  is  withdrawn  from  the  magnetic 
field  to  be  measured.  The  coil  leads  are 
connected  to  a ballistic  galvanometer 
(labeled  BG  in  the  figure).  The  relation 
between  the  magnetic  field  magnitude 
and  the  total  charge  Q flowing 
through  the  circuit,  as  measured  by  the 
ballistic  galvanometer,  is  explained  in 
the  text. 


There are three ways to change the flux penetrating the loop. The first
is to change the magnitude $\mathcal{B}$ of the magnetic field; this is what takes place
in the Faraday experiment of Fig. 25-5. The second is to change the area a
of the loop; this is what takes place in the apparatus of Fig. 25-3. The third
is to change the angle $\theta$ between the magnetic field $\boldsymbol{\mathcal{B}}$ and the area vector $\mathbf{a}$,
either by rotating the loop or by manipulating the source of the magnetic
field; this is what takes place in the electric generator, which is discussed in
detail in Sec. 25-4.

We now apply the physical argument given in the preceding paragraph
to a quantitative evaluation of the emf induced in the loop by the
change in the magnetic flux penetrating it. Substituting the value of $\Phi_m$
given in the equation displayed above into Faraday’s law in the form of Eq.
(25-8), we obtain

$$V = -\frac{d\Phi_m}{dt} = -\frac{d}{dt}(\mathcal{B}a\cos\theta)$$


Application  of  the  rule  for  differentiation  of  a product  to  this  expression 
gives 


$$V = -\frac{d\mathcal{B}}{dt}\,a\cos\theta - \mathcal{B}\,\frac{da}{dt}\cos\theta - \mathcal{B}a\,\frac{d(\cos\theta)}{dt}$$


Rearranging terms and using the result $d(\cos\theta)/dt = -\sin\theta\,d\theta/dt$, we have

$$V = -a\cos\theta\,\frac{d\mathcal{B}}{dt} - \mathcal{B}\cos\theta\,\frac{da}{dt} + \mathcal{B}a\sin\theta\,\frac{d\theta}{dt} \tag{25-9}$$


The  first  term  in  this  expression  depends  on  the  rate  of  change  of  the  field 
magnitude  S3,  the  second  term  on  the  rate  of  change  of  the  loop  area  a,  and 
the  third  term  on  the  angular  rotation  rate  of  the  loop  relative  to  the  mag- 
netic field.  In  most  practical  situations,  only  one  of  the  three  terms  on  the 
right  side  of  Eq.  (25-9)  has  a nonzero  value.  (However,  this  need  not  always 
be the case.) What is most important is the experimental observation that a
change in any of the three quantities $\mathcal{B}$, a, and $\theta$ will result in an induced
emf V in accordance with Eq. (25-9). This verifies the assertion that the
quantity of fundamental importance is $d\Phi_m/dt$, the rate of change of the
magnetic flux penetrating the loop, and not the rate of change of one of
the quantities in terms of which the magnetic flux is defined. Thus
Faraday’s  law  is  an  independent  law  of  nature.  It  tells  us  something  new  and  is 
not  merely  a restatement  of  those  laws  of  electromagnetism  previously 
developed  in  Chaps.  23  and  24. 
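
The decomposition in Eq. (25-9) can be verified numerically. In the sketch below, the time dependences chosen for the field magnitude, the loop area, and the angle are arbitrary assumptions made purely for illustration; the code differentiates the flux directly and compares the result with the sum of the three terms of Eq. (25-9).

```python
import math

# Arbitrary (assumed) smooth time dependences, for illustration only.
def B(t):     return 0.20 + 0.05 * t      # field magnitude, T
def a(t):     return 0.010 + 0.002 * t    # loop area, m^2
def theta(t): return 0.30 + 0.40 * t      # angle between field and area vector, rad

def flux(t):
    """Phi_m = B a cos(theta)."""
    return B(t) * a(t) * math.cos(theta(t))

def ddt(f, t, h=1e-5):
    """Central-difference estimate of df/dt."""
    return (f(t + h) - f(t - h)) / (2 * h)

def emf_direct(t):
    """V = -dPhi_m/dt, Eq. (25-8)."""
    return -ddt(flux, t)

def emf_three_terms(t):
    """The three terms on the right side of Eq. (25-9)."""
    term_field = -a(t) * math.cos(theta(t)) * ddt(B, t)
    term_area = -B(t) * math.cos(theta(t)) * ddt(a, t)
    term_angle = B(t) * a(t) * math.sin(theta(t)) * ddt(theta, t)
    return term_field, term_area, term_angle

t0 = 1.0
terms = emf_three_terms(t0)
print("direct      :", emf_direct(t0))
print("three terms :", sum(terms), terms)
```

Setting any two of the three rates to zero in this sketch isolates the remaining term, which mirrors the three ways of changing the flux described in the text.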


Figure  25-7  shows  a simple  device  called  a search  coil  magnetometer,  long 
used  for  the  measurement  of  magnetic  fields.  A small  coil  of  cross-sectional  area  a, 
having  N turns,  is  mounted  on  the  end  of  a wooden  rod.  The  coil  leads  are  con- 
nected to  a ballistic  galvanometer,  a device  which  measures  the  total  electric 
charge Q which passes through it. The search coil is placed between the poles of a
magnet whose field $\mathcal{B}_0$ is to be measured. It is aligned so that its plane is parallel to
the pole faces, thus maximizing the magnetic flux which passes through it. The
coil  is  then  withdrawn  from  the  magnet,  and  the  total  charge  Q passing  through 
the  ballistic  galvanometer  is  given  by  its  reading. 

If the resistance of the entire circuit consisting of coil, leads, and ballistic
galvanometer is R, the current i flowing at any instant is related to the emf at that
instant by Ohm’s law,

$$iR = V$$




Using  Faraday’s  law,  Eq.  (25-8),  we  have 


$$iR = -\frac{d\Phi_m}{dt} \tag{25-10}$$


Over the small area of the search coil, the field of the magnet has the essentially
uniform magnitude $\mathcal{B}_0$. As the coil is withdrawn from the magnet, the magnetic flux
penetrating it decreases in some unknown fashion from its original value $Na\mathcal{B}_0$ to
zero. (Under ordinary circumstances the magnetic field of the earth can be
neglected, since it is small compared to $\mathcal{B}_0$.) Integrating both sides of Eq. (25-10)
over time from the initial time $t_i$ (when the coil is in the magnetic field) to the final
time $t_f$ (when the coil is withdrawn completely) gives the equation

$$\int_{t_i}^{t_f} iR\,dt = -\int_{t_i}^{t_f} \frac{d\Phi_m}{dt}\,dt \tag{25-11}$$


To evaluate the integral on the left side of the equation, note that R is constant and
i = dq/dt, so that

$$\int_{t_i}^{t_f} iR\,dt = R\int_{t_i}^{t_f} \frac{dq}{dt}\,dt$$


The integrand on the right side of this equation is simply dq. To evaluate the
proper limits for the integral of dq, note that at the initial time $t_i$, zero charge has
passed through the ballistic galvanometer. And at the final time $t_f$, the total charge
Q has passed through the ballistic galvanometer. We thus have for the left side of
Eq. (25-11)

$$\int_{t_i}^{t_f} iR\,dt = R\int_0^Q dq = RQ$$


The integrand on the right side of Eq. (25-11) is simply $d\Phi_m$. To evaluate the
proper limits for the integral of $d\Phi_m$, remember that at the initial time $t_i$ the flux
penetrating the search coil has the value $Na\mathcal{B}_0$, while at the final time $t_f$ the coil is
completely withdrawn from the magnetic field so that the flux penetrating it has
the value zero. Thus the right side of Eq. (25-11) can be written

$$-\int_{t_i}^{t_f} \frac{d\Phi_m}{dt}\,dt = -\int_{Na\mathcal{B}_0}^{0} d\Phi_m = \int_{0}^{Na\mathcal{B}_0} d\Phi_m$$


(In the second step shown here, we have interchanged the limits of integration
and changed the sign of the integral.) Next, note that N and a are constants, so that
at any instant when the magnetic field is $\mathcal{B}$, we have $d\Phi_m = d(Na\mathcal{B}) = Na\,d\mathcal{B}$. So
we can substitute the integrand $Na\,d\mathcal{B}$ for the integrand $d\Phi_m$ on the right side of
the equation immediately above. The proper limits of integration are found by
noting that when $\Phi_m = Na\mathcal{B}_0$, $\mathcal{B}$ has the value $\mathcal{B}_0$. And when $\Phi_m = 0$, $\mathcal{B}$ has the
value 0 also. Making these substitutions, we have

$$-\int_{t_i}^{t_f} \frac{d\Phi_m}{dt}\,dt = \int_0^{Na\mathcal{B}_0} d\Phi_m = \int_0^{\mathcal{B}_0} Na\,d\mathcal{B} = Na\int_0^{\mathcal{B}_0} d\mathcal{B} = Na\mathcal{B}_0$$


We have now evaluated the integrals on both sides of Eq. (25-11). Equating
the two values thus obtained, we find that

$$RQ = Na\mathcal{B}_0$$

Finally we solve for $\mathcal{B}_0$, the magnitude of the magnetic field we wish to measure.
The result is

$$\mathcal{B}_0 = \frac{RQ}{Na} \tag{25-12}$$


which gives $\mathcal{B}_0$ entirely in terms of measurable quantities. The result depends not
on the value of the emf $-d\Phi_m/dt$ at any moment, but only on the total change in
magnetic flux $-\Delta\Phi_m = Na\mathcal{B}_0 - 0 = Na\mathcal{B}_0$. Thus the rate at which the search coil is
withdrawn from the magnet is not important.
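
The claim that the withdrawal rate does not matter can be illustrated with a short numerical integration of Eq. (25-10). Everything specific in the sketch below is an assumption chosen for illustration: the coil parameters, the resistance, and the shape of the flux fall-off, which stands in for the "unknown fashion" mentioned in the text.

```python
def total_charge(withdraw_time, N=100, a=1.0e-4, B0=0.30, R=50.0, steps=20000):
    """Integrate i = -(1/R) dPhi_m/dt while the flux falls from N*a*B0 to zero.

    All parameter values are hypothetical; the fall-off profile is arbitrary.
    """
    def flux(t):
        s = min(max(t / withdraw_time, 0.0), 1.0)   # fraction of withdrawal completed
        return N * a * B0 * (1.0 - s) ** 2

    dt = withdraw_time / steps
    q = 0.0
    for k in range(steps):
        t = k * dt
        emf = -(flux(t + dt) - flux(t)) / dt        # Faraday's law over one time step
        q += (emf / R) * dt                         # dq = i dt, with i = emf / R
    return q

q_fast = total_charge(0.1)    # coil pulled out in 0.1 s
q_slow = total_charge(5.0)    # coil pulled out in 5 s
expected = 100 * 1.0e-4 * 0.30 / 50.0   # N a B0 / R from Eq. (25-12)

print(q_fast, q_slow, expected)
```

The fast and slow withdrawals deliver the same total charge, equal to $Na\mathcal{B}_0/R$, even though the instantaneous emf differs by a factor of 50 between the two runs.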

The  use  of  the  search  coil  magnetometer  is  illustrated  in  Example  25-3. 


EXAMPLE 25-3

A search coil of diameter d = 1.53 cm has 100 turns. The resistance of the
magnetometer circuit is R = 66.1 Ω. If the ballistic galvanometer reads a total charge Q =
92.4 μC when the coil is withdrawn from the field of a magnet, find the magnitude
$\mathcal{B}_0$ of the field.

■ From Eq. (25-12) you have

$$\mathcal{B}_0 = \frac{66.1\ \Omega \times 92.4 \times 10^{-6}\ \text{C}}{100 \times \pi \times (1.53 \times 10^{-2}\ \text{m}/2)^2} = 0.332\ \text{T}$$
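
The arithmetic of this example is easy to reproduce. The helper below is a minimal sketch of Eq. (25-12) for a circular coil; the function name `search_coil_field` is an assumption introduced here, not something defined in the text.

```python
import math

def search_coil_field(R, Q, N, diameter):
    """Field magnitude from Eq. (25-12), B0 = R*Q / (N*a), for a circular coil."""
    area = math.pi * (diameter / 2) ** 2    # cross-sectional area a, m^2
    return R * Q / (N * area)

B0 = search_coil_field(R=66.1, Q=92.4e-6, N=100, diameter=1.53e-2)
print(f"B0 = {B0:.3f} T")
```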


So  far,  we  have  discussed  only  the  magnitudes  of  the  quantities  in- 
volved in  Faraday’s  law.  Consider  now  the  sense  of  the  emf  induced  in  a 
loop  when  the  magnetic  flux  penetrating  the  loop  changes.  In  the 
moving-wire  experiment  of  Fig.  25-3,  an  externally  applied  force  pulls  the 
wire outward, that is, in a direction which tends to increase the area
enclosed by the loop. As a result, a current flows in such a direction that the
magnetic force exerted on the moving wire tends to oppose the external force. This
point is demonstrated in both microscopic and macroscopic terms in the
caption to Fig. 25-8. If either the direction of the field $\boldsymbol{\mathcal{B}}$ or the direction of
the externally applied force $\mathbf{F}_{\text{ext}}$ is reversed, the sense of the induced
current i will reverse also. Various combinations of possibilities are shown in
Fig.  25-8.  Thus  the  magnetic  force  Fm  exerted  on  the  moving  wire  will 
always  tend  to  oppose  the  change  in  the  state  of  the  system  being  brought 
about  by  the  application  of  the  external  force. 

The  fact  that  the  current  always  flows  in  the  sense  leading  to  opposition  to  a 
change  in  the  state  of  the  system  may  be  viewed  as  a consequence  of  the 
energy-conservation  principle.  Suppose,  for  instance,  that  the  current  in  Fig. 
25-8a  flowed  in  the  opposite  sense.  Then  the  magnetic  force  Fm  exerted  on  the 
moving  wire  would  aid,  rather  than  oppose,  the  driving  force  Fext.  A very  small 
initial  pull  to  the  right  on  the  movable  wire  would  induce  a current.  This,  in  turn, 
would result in a magnetic force Fm having a direction which would lead to further
acceleration of the wire, and thus to the induction of still more current, and so
on.  The  system  would  be  a perpetual-motion  machine  of  the  first  kind,  in  violation 
of  the  principle  of  energy  conservation. 
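
The runaway described in this argument can be made concrete with a toy simulation. The sketch below is a rough illustration under stated assumptions: the wire is given a hypothetical mass m, the circuit resistance R is held constant, and a small constant external force acts. With `sign = -1` the magnetic force opposes the motion, as Lenz’ law requires; with `sign = +1` it aids the motion, the hypothetical case the text rules out.

```python
def simulate(sign, F_ext=0.01, B=1.8, Y=0.50, R=10.0, m=0.05, dt=1e-3, steps=5000):
    """Speed of the sliding wire after steps*dt seconds (simple Euler integration).

    sign = -1: magnetic force opposes the motion (actual behavior).
    sign = +1: magnetic force aids the motion (hypothetical, unphysical case).
    The mass m, resistance R, and force F_ext are assumed values.
    """
    v = 0.0
    for _ in range(steps):
        i = B * Y * v / R                 # magnitude of the induced current
        F_m = sign * i * B * Y            # magnetic force on the moving wire
        v += (F_ext + F_m) / m * dt       # Newton's second law, one Euler step
    return v

v_lenz = simulate(-1)       # settles at a terminal speed
v_runaway = simulate(+1)    # grows without limit as time goes on

print(f"opposing force: v = {v_lenz:.4f} m/s")
print(f"aiding force:   v = {v_runaway:.1f} m/s")
```

In the opposing case the speed levels off where the magnetic force balances the pull, at $F_{\text{ext}}R/(\mathcal{B}Y)^2$. In the aiding case the speed, and with it the kinetic energy and the dissipated power, grows exponentially from an arbitrarily small push, which is the perpetual-motion behavior that energy conservation forbids.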

It  follows  from  the  arguments  of  the  previous  two  paragraphs  that  the 
induced  current  produces  effects  which  tend  to  oppose  any  change  in  the  state  of  the 
system.  This  tendency  is  called  Lenz’  law,  after  the  German-Estonian  physi- 
cist Heinrich  F.  E.  Lenz  (1804-1865),  who  spent  most  of  his  working  life  in 
St.  Petersburg.  Lenz’  law  is  quite  general.  It  is  not  restricted  to  the  determi- 
nation of  the  direction  of  forces,  as  in  Fig.  25-8.  Rather,  it  determines  the 
sense  of  the  induced  current  as  it  affects  all  the  changing  quantities  in  an 
electromagnetic  system.  Most  important  of  these  quantities  is  the  magnetic 
flux.  In  Fig.  25-8a,  for  example,  the  motion  of  the  wire  to  the  right  in- 
creases the  flux  penetrating  the  loop  by  increasing  the  enclosed  area  a. 
But the induced current gives rise to a dipole magnetic field. According
to the right-hand rule of Fig. 23-17 for relating the sense of the magnetic
field lines to that of the induced current, its magnetic field inside the
loop is directed into the page, that is, in the direction opposite to the
externally applied field $\boldsymbol{\mathcal{B}}$. Thus the induced current reduces the flux penetrating
the area enclosed by the loop by reducing the magnitude of the total magnetic
field inside the loop.


Fig. 25-8 The experiment of Fig. 25-3 is repeated for all possible
combinations of magnetic field directions normal to the plane of the loop and
directions of motion of the movable wire. In each case, the sense of the
resulting electric current i is shown. Also shown are the direction of the
magnetic force Fm exerted on the moving wire because it carries an
electric current and the direction of the externally applied force Fext required
to keep the wire moving at constant velocity v. (a) The wire is moving to
the right, and the magnetic field $\boldsymbol{\mathcal{B}}$ is directed out of the plane of the page,
toward the reader. Consider a hypothetical mobile positive charge q
situated in the moving wire. Because it is moving to the right with velocity v, it
experiences a magnetic force $\mathbf{f} = q\mathbf{v} \times \boldsymbol{\mathcal{B}}$ which, according to the
right-hand rule for cross products, is directed along the unit vector s, toward
the bottom of the page. Since the charge is mobile, it moves in this
direction, resulting in a current i whose sense is shown in the figure. Again
apply the right-hand rule to determine the direction of the cross product
of v and $\boldsymbol{\mathcal{B}}$, this time to the positive charge moving in the direction s. You
can see that there is also a magnetic force acting on the charge which is
directed toward the left, that is, opposite to the direction of motion of
the wire which contains the charge. This leftward-directed force can also
be understood from a macroscopic point of view. According to Eq.
(25-3a), the magnetic force Fm acting on a wire of length Y which carries a
current i in the direction s is given by the equation $\mathbf{F}_m = iY\,\hat{\mathbf{s}} \times \boldsymbol{\mathcal{B}}$.
Applying the right-hand rule for cross products to this equation in the situation
shown in part a again yields a leftward direction for Fm. You can repeat
the argument for a mobile negative charge in the wire; you will obtain the
same sense for the current and the same direction for the force Fm. Similar
arguments applied to the cases shown in parts (b), (c), and (d) will yield the
results shown for the sense of i and the direction of Fm. Identical results
are also obtained from an energy-conservation argument in the text.

Like  all  currents  flowing  through  conductors  having  electric  resistance,  the 
induced  current  quickly  dies  down  if  the  process  inducing  it  is  stopped.  This  is 
why  Lenz'  law  is  expressed  as  a tendency  to  resist  change.  But  if  the  loop  has  zero 
resistance,  the  reduction  in  net  magnetic  field  due  to  the  induced  current  reduces 
the  flux  penetrating  the  loop  exactly  as  much  as  the  increase  in  area  due  to  the 
driving  force  Fext  increases  it.  The  net  result  in  this  special  case  is  zero  change  in 
the  flux.  This  actually  happens  when  the  loop  is  superconducting.  It  requires  the 
induced  current  to  flow  unceasingly  after  the  driving  force  stops  acting,  in  order 
to  keep  the  flux  change  zero. 

To see how Lenz’ law applies in a case where there is no macroscopic
motion, consider the circular loop of copper wire in Fig. 25-9. It is located
in a uniform, but time-varying magnetic field $\boldsymbol{\mathcal{B}}$. The area vector $\mathbf{a}$ of the
loop is parallel to $\boldsymbol{\mathcal{B}}$. Suppose that $\mathcal{B}$ increases at a constant rate $d\mathcal{B}/dt$.
According to Lenz’ law, the current i induced in the loop must flow in such a
sense as to oppose the change in the system. In this case, the change is an
increase in the flux $\Phi_m = \mathcal{B}a$ penetrating the loop. Thus the current acts as
the source of a magnetic field $\boldsymbol{\mathcal{B}}'$ whose direction within the loop is
antiparallel to $\boldsymbol{\mathcal{B}}$. According to the right-hand rule for the sense of the
magnetic field lines associated with a current, the sense of the current i must be
that shown in the figure if $\boldsymbol{\mathcal{B}}'$ is to oppose $\boldsymbol{\mathcal{B}}$ inside the loop.


Fig. 25-9 Lenz’ law. The externally applied
magnetic field $\boldsymbol{\mathcal{B}}$ (which is uniform
in the region of the loop of area a) is
changing at a rate represented by the
vector $d\boldsymbol{\mathcal{B}}/dt$. (The fact that $d\boldsymbol{\mathcal{B}}/dt$ is
parallel to $\boldsymbol{\mathcal{B}}$ implies that the magnitude $\mathcal{B}$ is increasing
while the direction of $\boldsymbol{\mathcal{B}}$ remains the same.) The
increase in the externally applied
magnetic flux $\mathcal{B}a$ penetrating the loop is
opposed by the induced electric current i
in the loop. The sense of i is such that
its magnetic field has a general direction
within the loop shown by the vector $\boldsymbol{\mathcal{B}}'$,
which is antiparallel to $\boldsymbol{\mathcal{B}}$.

The  flow  of  current  around  a ring  enclosing  a changing  magnetic  flux 
is  a result  of  the  emf 


$$V = -\frac{d\Phi_m}{dt}$$

which  drives  it.  That  emf  is  not  located  at  any  particular  place  on  the  ring. 
Unlike  a source  of  emf  such  as  a battery,  which  might  be  hooked  in  at  some 
particular  place  in  the  loop,  the  induced  Faraday  emf  is  a property  of  the 
loop  as  a whole.  To  put  it  another  way,  there  are  no  places  on  the  loop  which 
we  can  mark  plus  and  minus  as  though  they  were  terminals  with  a specified 
potential  difference  between  them.  Nevertheless,  we  can  assign  a sense  to 
the induced emf: In a current-carrying circuit, the sense of the induced emf is the
same as the sense of the current it drives, as shown in Fig. 25-10. This assignment
is  consistent  with  the  previous  use  of  emf  to  describe  a “driving  influence.” 

The significance of the negative sign in Faraday’s law, $V = -d\Phi_m/dt$,
can now be made apparent. In order to do this, we use the right-hand rule
for magnetic field lines, which relates the sense of electric current flow
around a loop to the sense of the associated magnetic field lines threading
the loop. If we call the sense of the current positive, then we must call the
sense given by this rule for the field lines positive as well; if we call the sense
of the current negative, then we must call the sense specified by the rule for
the field lines negative. (This point has not previously arisen because until
now we have always been free to call the sense of the current positive.) With
this convention in mind, we explain the meaning of the negative sign in
Faraday’s law.

Consider the case shown in Fig. 25-10a. The externally applied
magnetic flux $\Phi_m$ which penetrates the stationary loop is changing because the
uniform magnetic field $\boldsymbol{\mathcal{B}}$ in the vicinity of the loop is increasing at a rate
specified by the vector $d\boldsymbol{\mathcal{B}}/dt$, which makes a constant acute angle $\theta$ with the
constant area vector $\mathbf{a}$. According to the discussion leading to Eq. (25-9),
the rate of change of $\Phi_m$ is therefore given by the expression

$$\frac{d\Phi_m}{dt} = a\cos\theta\,\frac{d\mathcal{B}}{dt}$$


This quantity has a positive value, since a and $d\mathcal{B}/dt$ are the magnitudes of
vectors, and $\cos\theta$ is positive because the angle $\theta$ lies between 0° and 90°.


Fig. 25-10 Illustration for the argument, given in the text, which
shows how Lenz’ law is expressed by means of the negative sign in
the equation for Faraday’s law, $V = -d\Phi_m/dt$. Shown for two cases
are the area vector $\mathbf{a}$ of the planar conducting loop, the vector
$d\boldsymbol{\mathcal{B}}/dt$ which expresses the rate of change of the uniform magnetic
field in which the loop is immersed, the angle $\theta$ between $\mathbf{a}$ and
$d\boldsymbol{\mathcal{B}}/dt$, the senses of the induced emf and the electric current i
which it drives, and the sense of the magnetic field lines $\boldsymbol{\mathcal{B}}'$
produced by the current.


Figure 25-10a shows the senses of the induced emf, of the current i
which it drives, and of this current’s magnetic field lines, labeled $\boldsymbol{\mathcal{B}}'$. These
senses, which are mutually consistent, have been determined by fixing the
sense of the induced field lines in accordance with Lenz’ law and then
working backward to determine the senses of i and the induced emf. Since
$d\Phi_m/dt$ has a positive value, the sense of the induced field lines must be
negative, because they thread the loop in the sense opposite to the sense in
which $d\boldsymbol{\mathcal{B}}/dt$ threads the loop. For consistency with the right-hand rule
relating the sense of the current flow around a loop to that of the current’s
magnetic field lines threading the loop, we must therefore say that the
sense of the current i is negative. Hence we must also say that the sense of
the induced emf which drives this current is negative. We have already
shown that the magnitude of the induced emf V is equal to $d\Phi_m/dt$.
Faraday’s law is therefore written as in Eq. (25-8),

$$V = -\frac{d\Phi_m}{dt}$$

The negative sign in this equation gives the required negative sense to the
induced emf, V.

In Fig. 25-10b, the vector $d\boldsymbol{\mathcal{B}}/dt$ makes an obtuse angle with the area
vector $\mathbf{a}$. Can you make an argument parallel to the one immediately above,
showing that (1) $d\Phi_m/dt$ has a negative value, (2) the sense of the induced
emf is positive, and (3) Faraday’s law, $V = -d\Phi_m/dt$, is valid in this case?
Suppose that you choose the direction of the area vector $\mathbf{a}$ opposite to that
shown in Fig. 25-10. Why is the sign in Faraday’s law not affected?

The discussion of the four preceding paragraphs can be summarized
as follows: Lenz’ law is incorporated into Faraday’s law by means of the negative
sign in the equation $V = -d\Phi_m/dt$.
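
The sign bookkeeping of the last few paragraphs can be spelled out in a short sketch. It is only one possible way to organize the convention: the function name, the idea of a fixed "reference normal," and the numerical values are all assumptions introduced here. The emf is computed from Faraday’s law relative to the chosen area vector and is then referred to the fixed reference normal, so that the effect of flipping the area vector can be seen.

```python
import math

def emf_about_reference(dB_dt, area, theta_ref, a_along_reference=True):
    """Return (V, circulation) for a stationary loop in a uniform, changing field.

    theta_ref is the angle between dB/dt and a fixed reference normal to the loop.
    V is the emf with its sign defined relative to the chosen area vector a.
    circulation is the same emf expressed relative to the fixed reference normal.
    """
    orientation = 1.0 if a_along_reference else -1.0
    cos_theta = orientation * math.cos(theta_ref)   # cosine of angle between dB/dt and a
    V = -area * cos_theta * dB_dt                    # Faraday's law, V = -dPhi_m/dt
    return V, orientation * V

# Acute angle (the situation of Fig. 25-10a): the emf is negative.
V_a, circ_a = emf_about_reference(0.5, 0.01, math.radians(30))

# Obtuse angle (the situation of Fig. 25-10b): the emf is positive.
V_b, circ_b = emf_about_reference(0.5, 0.01, math.radians(150))

# Same physical situation as the first case, but with the area vector reversed.
V_flip, circ_flip = emf_about_reference(0.5, 0.01, math.radians(30), a_along_reference=False)

print(V_a, V_b, V_flip, circ_a, circ_flip)
```

Reversing the area vector changes the sign of V, but it also reverses the sense that is called positive, so the physical sense of the induced current (`circ_a` and `circ_flip`) is unchanged. This is one way to see why the sign in Faraday’s law is not affected by the choice of direction for the area vector.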


25-3 FARADAY’S LAW: INDUCED ELECTRIC FIELDS

As it is described in Sec. 25-2, Faraday’s law is important as a new and
independent relationship between the rate of change of the magnetic flux
penetrating a conducting loop and the emf around the loop which drives an
induced current around the loop. But it still does not display the symmetry
which was the original motivation for the experiments leading to its
discovery. To recapitulate, Faraday and his contemporaries were looking for a
physical phenomenon symmetrical to the phenomenon summarized in
Ampere’s law. According to Ampere’s law, the magnetic circulation (the
integral of the magnetic field $\boldsymbol{\mathcal{B}}$ around a closed curve consisting of
elements $d\mathbf{l}$) is directly proportional to the net electric current i penetrating
the area enclosed by that curve:


(25-13) 


closed 

curve 


1184  Electromagnetic  Induction 


Fig. 25-11 The “failed” experiment of Fig. 25-1b reduced to its essentials. A
conducting loop enclosing an area defining the vector a is placed in a uniform
magnetic field ℬ which does not change in time. The integral of the directed
current i dl around the loop, which is divided into infinitesimal length
elements dl, must be zero because direct experimental observation shows that
no current flows in the loop. Thus this integral cannot be used to write an
equation parallel to Ampere’s law.

Fig. 25-12 Faraday’s successful experiment reduced to its essentials. (a) A
current i is induced in a conducting loop when the flux penetrating it is
changing. Here the change is shown as being due to a change in the uniform
magnetic field ℬ at a rate dℬ/dt. (b) An electric current implies the motion
of electric charge. This, in turn, implies the existence of a force acting on
the mobile charges in the conductor. The force can be interpreted as the
consequence of the existence of an electric field ℰ.

Now consider the steady-state experiment, discussed at the beginning of
Sec. 25-1, that “failed.” Reduced to its essentials, it is sketched in Fig. 25-11.
When a conducting loop lies in a steady magnetic field (so that it is
penetrated by a constant magnetic flux), no current flows through it. The
negative result of the experiment can be written

$$\oint_{\text{closed curve}} i\, dl = 0 \neq \text{constant} \times \Phi_m$$

It will be useful in acquiring a deeper understanding of this negative result
to write it in a mathematical form as closely parallel as possible to that of
Eq. (25-13). To do this, we introduce the unit vector dl̂ (a vector of unit
length directed along dl), so that we can write the integrand i dl as the dot
product (i dl̂) · dl, in analogy to the dot product ℬ · dl in the integrand on
the left side of Eq. (25-13). We then have

$$\oint_{\text{closed curve}} (i\, \widehat{dl}) \cdot d\mathbf{l} \neq \text{constant} \times \Phi_m$$

By comparing Eq. (25-13) and the inequality immediately above, you can
see that the failure of the analogy (and of the experiment) is due to the
fact that the equations are not symmetrical enough! The magnetic field ℬ
in the integral of Eq. (25-13) is static, that is, steady or unchanging. The
corresponding quantity in the inequality is the current (i dl̂), written in
vector form to express its local direction. But a current signifies a flow or
motion; it is not a static quantity. Or consider the terms on the right side of
the equation and of the inequality. The current i in Eq. (25-13) implies
motion. But in the inequality, the magnetic flux Φ_m is a static quantity. (Do
not be confused by the fact that it is a flux. It is a static flux, like the
electric flux Φ_e, and not a dynamic flux involving physical motion, like the
electric current i.)

Faraday’s experiment suggests a remedy for the lack of symmetry
between the right sides of Eq. (25-13) and the inequality. For the quantity
Φ_m, which produces no direct effect, we substitute the rate of change of
magnetic flux dΦ_m/dt, which induces a current, as Faraday’s sharp eyes and
prepared mind discovered.

We can remedy the lack of symmetry between the left sides of Eq.
(25-13) and the inequality by studying the equation, Ampere’s law,
carefully. The magnetic circulation on its left side is the dot product of a
magnetic field ℬ and a length element dl, integrated over a closed curve.
Can we define

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{l}$$

the analogous electric circulation? Indeed we can. In Fig. 25-12a, current
is induced in a conducting loop because it is penetrated by a magnetic flux
which changes at a rate dΦ_m/dt, as a result of the fact that the magnetic
field is changing at a rate dℬ/dt. Now, the current exists because there is
motion of electric charge. The electric charge moves because it is propelled
by a force F. And a force on an electric charge q can be expressed in terms of
an electric field ℰ = F/q. In the general case we cannot evaluate this induced
electric field at any particular point on the loop by using Faraday’s law,
which gives only the total emf V = −dΦ_m/dt around the loop. But we can
integrate the induced electric field around the loop, as in Fig. 25-12b, and
set the integral equal to the induced emf V:

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{l} = V \tag{25-14}$$

This equation is akin to the relation between a static electric field and the
applied potential difference V associated with this electric field, found by
evaluating the integral

$$\int_{s_i}^{s_f} \boldsymbol{\mathcal{E}} \cdot d\mathbf{s} = -V$$

between the points s_i and s_f on an arbitrary path s. [See Eq. (21-44b). The
sign reversal arises because the sense of an induced emf is the same as the
sense that it drives a positive charge, whereas the sense of an applied
potential difference is opposite to the sense of the force exerted on a
positive charge by the associated electric field.] In the static electric field
case, however, integrating around a closed curve always gives zero, as shown
in Sec. 23-6.

We now use Faraday’s law in the form of Eq. (25-8), V = −dΦ_m/dt.
Substituting this into Eq. (25-14) gives

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{l} = -\frac{d\Phi_m}{dt} \tag{25-15a}$$

Equation (25-15a) is Faraday’s law written in a form which reveals its
symmetry with Ampere’s law, Eq. (25-13). (Ampere’s law involves the constant
factor μ_0, while Faraday’s law has no such factor. But this is merely a
reflection of the way in which the system of units has been defined.) The
symmetry of the two laws can now be stated in words as well.

Ampere’s  law:  The  magnetic  circulation  evaluated  around  a closed  curve  is 
equal to (a constant times) the rate of flow of electric charge through the area enclosed
by  the  curve. 

Faraday’s  law:  The  electric  circulation  evaluated  around  a closed  curve  is  equal 
to  the  negative  of  the  rate  of  change  of  the  magnetic  flux  through  the  area  enclosed  by 
the  curve. 

Equation (25-15a) can be written in a slightly different form, which
will be useful in Chap. 27. Remember that the magnetic flux Φ_m
penetrating a loop is given by the integral over the area enclosed by the loop
of the dot product of the magnetic field ℬ and the surface element vector da
at whose location the value of ℬ is specified. This integral, given by Eq.
(23-51a), can be written for the case of the loop of interest here in the form

$$\Phi_m = \int_{\text{enclosed surface}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{a}$$

Substituting into Eq. (25-15a), we have

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{l} = -\frac{d}{dt} \int_{\text{enclosed surface}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{a}$$



Now  assume  the  loop  and  its  surface  elements  do  not  change  with  time.  Then 
the  operations  of  taking  the  derivative  with  respect  to  time  and  integrating 
over  the  area  of  the  loop  can  be  interchanged,  giving 

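$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{l} = -\int_{\text{enclosed surface}} \frac{\partial \boldsymbol{\mathcal{B}}}{\partial t} \cdot d\mathbf{a} \tag{25-15b}$$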

In this equation, we have written the time derivative as a partial derivative.
This is necessary because the magnetic field ℬ may be a function of
position as well as of time. (Such is not the case for the total flux Φ_m which
penetrates the loop. The reason is that the value of Φ_m is characteristic of
the entire area enclosed by the loop.) Equation (25-15b) is another form of
Faraday’s law. In it the sense of traversal of the closed curve specifies which
normal direction is that of the surface element vectors da. The specification
is given by the right-hand rule for surface element vectors, stated near the
end of Sec. 23-6.
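The interchange of differentiation and integration that leads from Eq. (25-15a) to Eq. (25-15b) can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the field function, the square loop of side 0.10 m, and the midpoint-rule integrator are all hypothetical choices, not taken from the text. For a loop that does not change with time, the time derivative of the total flux should agree with the surface integral of the partial time derivative of the field.

```python
import math


def field(x, y, t):
    """Hypothetical field component normal to the loop (tesla), varying in space and time."""
    return 0.3 * (1.0 + 2.0 * x + y * y) * math.sin(5.0 * t)


def field_rate(x, y, t):
    """Partial derivative of field() with respect to t at fixed (x, y)."""
    return 0.3 * (1.0 + 2.0 * x + y * y) * 5.0 * math.cos(5.0 * t)


def surface_integral(f, t, side=0.10, cells=50):
    """Midpoint-rule integral of f(x, y, t) over a fixed square loop of the given side."""
    h = side / cells
    total = 0.0
    for i in range(cells):
        for j in range(cells):
            total += f((i + 0.5) * h, (j + 0.5) * h, t) * h * h
    return total


t0, dt = 0.2, 1e-5
# d(Phi_m)/dt: differentiate the total flux, which depends on time only.
d_flux_dt = (surface_integral(field, t0 + dt) - surface_integral(field, t0 - dt)) / (2 * dt)
# Integral of the partial time derivative of the field over the same fixed surface.
integral_of_rate = surface_integral(field_rate, t0)
print(d_flux_dt, integral_of_rate)
```

The two printed numbers agree to within the accuracy of the finite-difference estimate. The agreement depends on the loop being fixed; if the integration region itself changed with time, the two operations could not simply be interchanged.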


Expressing  Faraday’s  law  in  terms  of  the  induced  electric  field  rather 
than  the  induced  current  makes  it  possible  to  dispense  entirely  with  the 
conducting  loop.  An  electric  field  can  exist  whether  or  not  there  is  a cur- 
rent! As  you  have  already  seen  in  connection  with  Ampere’s  law,  only  in 
cases  of  sufficiently  high  symmetry  can  the  field  be  evaluated  at  a specific 
point.  Example  25-4  illustrates  just  such  a case. 


EXAMPLE 25-4

Figure 25-13 shows a segment of a very long solenoid of radius 5.0 cm. It is closely
wound with no. 36 wire, so that the number n of turns per meter is 7900 m⁻¹ to two
significant figures. The solenoid winding is part of a circuit in which the current
oscillates at a frequency of 60 Hz. (That is, the current changes its magnitude and
direction in a periodic fashion, going through one cycle in 1/60 s.) At a certain
instant, the current is changing at a rate di/dt = 250 A/s.

a. Evaluate the emf’s V_1 and V_2 around the two circular paths shown in the
figure, whose radii are, respectively, r_1 = 10 cm and r_2 = 20 cm. The circular paths lie
in a plane perpendicular to the axis of the solenoid, and their centers lie in its axis.

Fig. 25-13 Illustration for Example 25-4. (a) A solenoid carries a varying
current i. Two circular paths are shown about which the emf is to be
evaluated, by using Faraday’s law. (b) Schematic end view of the same long
solenoid. The sense of the current i is shown. Within the solenoid there is a
magnetic field ℬ associated with the current. Its direction is into the page
(away from the reader) as denoted by the ×’s.

■ Since the solenoid is very long, the magnetic flux inside it is the only flux you
need consider. The return paths of the flux lines outside the solenoid are so widely
dispersed that only a negligible “return” flux penetrates the two circular paths.
Thus if a is the cross-sectional area of the solenoid and ℬ is the uniform magnetic
field within the solenoid, the magnetic flux penetrating either circle at any moment
is

$$\Phi_m = \mathcal{B} a$$

Using Eq. (23-62), you have ℬ = μ_0 n i, so that the flux is given by

$$\Phi_m = \mu_0 n i a$$

The rate of change of flux penetrating either circular path is

$$\frac{d\Phi_m}{dt} = \mu_0 n a \frac{di}{dt}$$

(This  equation  is  valid  for  any  closed  path  surrounding  the  solenoid,  as  long  as  the 
path  is  not  so  long,  relative  to  the  length  of  the  solenoid,  that  it  is  penetrated  by  a 
significant  amount  of  return  flux.)  Faraday’s  law  in  the  form  of  Eq.  (25-8)  gives  the 
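emf around either circle to be

$$V = -\frac{d\Phi_m}{dt} = -\mu_0 n a \frac{di}{dt}$$

Inserting the numerical values, you have

$$V = -4\pi \times 10^{-7}\ \text{T·m/A} \times 7900\ \text{m}^{-1} \times \pi \times (0.050\ \text{m})^2 \times 250\ \text{A/s} = -1.9 \times 10^{-2}\ \text{V}$$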


The  negative  sign  indicates  that  the  sense  of  the  emf  is  opposite  to  that  of  the 
instantaneous  current  in  the  winding  of  the  solenoid.  That  is,  the  emf  would  drive  a 
current  around  a conducting  loop  surrounding  the  solenoid  in  the  sense  that  would 
induce  an  instantaneous  magnetic  field  opposing  the  instantaneous  field  of  the  so- 
lenoid. So  you  see  that  the  negative  sign  in  Faraday’s  law  gives  the  same  result  as  to 
directions  as  does  Lenz’  law. 
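The arithmetic of part a can be reproduced with a few lines of Python. This is a minimal sketch: the function name `solenoid_emf` and its parameter names are assumptions introduced for the illustration, and the function simply evaluates V = −μ_0 n a di/dt with the values given in the example.

```python
import math

MU_0 = 4.0e-7 * math.pi  # permeability constant, T·m/A


def solenoid_emf(turns_per_meter, solenoid_radius, di_dt):
    """emf (volts) around any path encircling a long solenoid once: V = -mu0 * n * a * di/dt."""
    area = math.pi * solenoid_radius ** 2  # cross-sectional area a of the solenoid, m^2
    return -MU_0 * turns_per_meter * area * di_dt


emf = solenoid_emf(turns_per_meter=7900.0, solenoid_radius=0.050, di_dt=250.0)
print(f"{emf:.2e} V")
```

To two significant figures the result is −1.9 × 10⁻² V. The radius of the circular path does not appear in the function at all, which is why V_1 and V_2 are equal: both paths are penetrated by the same flux.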

b.  Find  the  magnitude  and  direction  of  the  induced  electric  field  at  a point  on 
each  circle. 

■ Just as in Ampere’s-law calculations, you must use a symmetry argument to
establish the magnitude and direction of the induced electric field on the circular
paths shown in Fig. 25-13. The situation is closely analogous to that in Sec. 23-6.
There it was shown that the magnitude of the magnetic field was the same
everywhere on a circle centered on a current-carrying wire and that the magnetic field
was everywhere tangent to that circle, which coincided with a magnetic field line. In
this case, the circles of Fig. 25-13b coincide with electric field lines. Thus, by
symmetry the electric field has the same magnitude at every point on such a circle and
is everywhere tangent to the circle. It is directed in the sense that it would drive a
current around the circle if the circle represented a conducting loop. As concluded
in part a, this is the opposite of the sense of the current in the solenoid windings.
Thus ℰ is directed as shown in Fig. 25-13b. You take the elements dl of the closed
curve formed by the circle of radius r to have the sense of ℰ, as in Fig. 25-12b
defining the electric circulation. You then find the electric circulation to be


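$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{l} = \oint_{\text{closed curve}} \mathcal{E}\, dl = \mathcal{E} \oint_{\text{closed curve}} dl = \mathcal{E}(2\pi r)$$

According to Eq. (25-14), this quantity is equal to V, so you have ℰ(2πr) = V, or

$$\mathcal{E} = \frac{V}{2\pi r} \tag{25-16}$$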




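You can now use the numerical values given to find the magnitude of the
induced electric field. For the radial distance r_1 = 10 cm, you have

$$\mathcal{E}_1 = \frac{-1.9 \times 10^{-2}\ \text{V}}{2\pi \times 0.10\ \text{m}} = -3.0 \times 10^{-2}\ \text{V/m}$$

For the distance r_2 = 20 cm, you have

$$\mathcal{E}_2 = \frac{-1.9 \times 10^{-2}\ \text{V}}{2\pi \times 0.20\ \text{m}} = -1.5 \times 10^{-2}\ \text{V/m}$$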
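Equation (25-16) is easy to evaluate for any path radius. In the short sketch below, the function name `induced_field` is an assumption made for the illustration, and the emf is entered as the rounded value −1.9 × 10⁻² V found in part a.

```python
import math


def induced_field(emf, path_radius):
    """Eq. (25-16): tangential induced field (V/m) on a circular path of the given radius (m)."""
    return emf / (2.0 * math.pi * path_radius)


emf = -1.9e-2  # volts, the rounded result of part a
field_r1 = induced_field(emf, 0.10)
field_r2 = induced_field(emf, 0.20)
print(f"{field_r1:.1e} V/m, {field_r2:.1e} V/m")
```

Although the emf is the same around both paths, the field on the larger circle is half that on the smaller one, since the same emf is spread over twice the path length.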


It is again evident in Example 25-4 that the induced emf is not a
potential difference, in spite of the fact that it is expressed in volts. To see this,
suppose that the emf were a potential difference. What would be the value of
the integral of ℰ · dl, taken around a closed curve as in the example?
Similarly, the induced electric field is not conservative. A charge q taken
exactly once around a loop encircling the solenoid will experience a change
qV in its energy. But in a conservative field, the energy change around
a closed path must be zero.


Is Faraday’s law applicable in a region where there is a static (that is,
steady) electric field ℰ_stat as well as an induced electric field ℰ_ind resulting
from a changing magnetic flux? In Sec. 23-6, all the basic laws governing
the behavior of steady electric and magnetic fields were summarized in Eqs.
(23-61a) through (23-61d). One of these, Eq. (23-61c), states the
conservative nature of steady electric fields:

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{E}}_{\text{stat}} \cdot d\mathbf{l} = 0 \tag{25-17}$$

Equation  (25-15)  gives  Faraday’s  law  in  the  form 

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{E}}_{\text{ind}} \cdot d\mathbf{l} = -\frac{d\Phi_m}{dt} \tag{25-18}$$

This certainly looks different. But there is no conflict. Equation (25-17)
applies to situations in which the electric field ℰ_stat arises from the presence of
stationary electric charges. Equation (25-18) applies to situations in which
the electric field ℰ_ind arises from the presence of electric charges moving in
such a way as to result in a changing electric current and hence a nonzero
value of dΦ_m/dt. In general, both kinds of electric field can be present
simultaneously. To find the overall effect, the two equations are added to
give

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{E}}_{\text{stat}} \cdot d\mathbf{l} + \oint_{\text{closed curve}} \boldsymbol{\mathcal{E}}_{\text{ind}} \cdot d\mathbf{l} = -\frac{d\Phi_m}{dt}$$

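If the two integrals are evaluated around the same closed curve, this
equation can be written

$$\oint_{\text{closed curve}} (\boldsymbol{\mathcal{E}}_{\text{stat}} + \boldsymbol{\mathcal{E}}_{\text{ind}}) \cdot d\mathbf{l} = -\frac{d\Phi_m}{dt}$$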



That is, the two different kinds of electric fields add like vectors. Physically,
this means that a test charge q acted on simultaneously by a static field and
an induced field will experience a force F given by

$$\mathbf{F} = q(\boldsymbol{\mathcal{E}}_{\text{stat}} + \boldsymbol{\mathcal{E}}_{\text{ind}})$$

The test charge does not “know” whether the electric field acting on it is a static field,
an induced field, or some combination of the two. What it “sees” is a total electric
field ℰ given by the vector sum

$$\boldsymbol{\mathcal{E}} = \boldsymbol{\mathcal{E}}_{\text{stat}} + \boldsymbol{\mathcal{E}}_{\text{ind}}$$

So the sum of Eqs. (25-17) and (25-18) can be written

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{l} = -\frac{d\Phi_m}{dt} \tag{25-19}$$

This is the same whether or not there is a static electric field present (that
is, a field arising from the presence of electric charges at rest with respect
to the observer). Thus Faraday’s law is perfectly general.
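The statement that a static field contributes nothing to the circulation can be checked by direct numerical integration. The sketch below makes several assumptions purely for illustration: the static field is that of a hypothetical 1.0-nC point charge placed off-center in the plane of the path, the induced field is taken to be tangential with magnitude V/(2πr) as in Example 25-4, and the path is a circle of radius 10 cm traversed counterclockwise. The function names are likewise invented for the example.

```python
import math

K = 8.99e9  # Coulomb constant, N·m^2/C^2


def static_field(x, y, charge=1.0e-9, x0=0.03, y0=0.0):
    """Field (V/m) of a hypothetical point charge at rest at (x0, y0)."""
    dx, dy = x - x0, y - y0
    r3 = (dx * dx + dy * dy) ** 1.5
    return K * charge * dx / r3, K * charge * dy / r3


def induced_field(x, y, emf=-1.9e-2):
    """Tangential field whose counterclockwise circulation about the origin equals emf."""
    rho2 = x * x + y * y
    return -emf * y / (2 * math.pi * rho2), emf * x / (2 * math.pi * rho2)


def total_field(x, y):
    """Vector sum of the static and induced fields."""
    sx, sy = static_field(x, y)
    ix, iy = induced_field(x, y)
    return sx + ix, sy + iy


def circulation(field, radius=0.10, steps=20000):
    """Line integral of field . dl once counterclockwise around a circle about the origin."""
    dtheta = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * dtheta
        x, y = radius * math.cos(theta), radius * math.sin(theta)
        ex, ey = field(x, y)
        # dl = radius * (-sin(theta), cos(theta)) * dtheta
        total += (-ex * math.sin(theta) + ey * math.cos(theta)) * radius * dtheta
    return total


static_part = circulation(static_field)
induced_part = circulation(induced_field)
combined = circulation(total_field)
print(static_part, induced_part, combined)
```

The static field is far stronger than the induced field at every point of the path, yet its circulation vanishes to within rounding error, and the circulation of the total field equals that of the induced field alone.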

You  can  visualize  the  behavior  of  a charged  particle  under  the  influence  of 
both  static  and  induced  electric  fields  by  analogy  with  the  behavior  of  a bit  of 
paper  suspended  in  water  which  is  about  to  flow  down  a bathtub  drain.  If  (as  often 
happens)  a whirlpool  forms,  the  paper  will  move  down  the  drain  in  a helical 
path,  under  the  influence  of  the  “static  field”  which  has  zero  circulation  and  the 
“induced  field”  which  has  nonzero  circulation.  The  axial  part  of  the  helical  mo- 
tion is  analogous  to  the  effect  of  the  static  field,  and  the  circular  part  to  that  of  the 
induced  field. 


Equation (23-61a), Gauss’ law for static electric fields, is unaffected by
the presence of induced electric fields. The equation is

$$\oint_{\text{closed surface}} \boldsymbol{\mathcal{E}}_{\text{stat}} \cdot d\mathbf{a} = \frac{1}{\epsilon_0} q \tag{25-20}$$



where q is the net electric charge enclosed within the closed surface. But
induced electric fields, which are associated with changing magnetic fluxes,
do not begin or end at charges. Thus no additional charges are needed
within the surface to account for the induced field. And since they neither
begin nor end at charges, induced electric field lines must be closed curves, just as
magnetic field lines are closed curves. (A field line can never begin or end in
empty space!) Consequently, the argument which led to Gauss’ law for
magnetism, Eq. (23-61b),

$$\oint_{\text{closed surface}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{a} = 0$$


applies  as  well  to  induced  electric  fields.  We  can  write 


$$\oint_{\text{closed surface}} \boldsymbol{\mathcal{E}}_{\text{ind}} \cdot d\mathbf{a} = 0$$




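Adding this equation to Eq. (25-20) gives

$$\oint_{\text{closed surface}} (\boldsymbol{\mathcal{E}}_{\text{stat}} + \boldsymbol{\mathcal{E}}_{\text{ind}}) \cdot d\mathbf{a} = \frac{1}{\epsilon_0} q$$

or

$$\oint_{\text{closed surface}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = \frac{1}{\epsilon_0} q$$

which is precisely the same as Gauss’ law in the absence of induced electric
fields. Thus Gauss’ law, as well as Faraday’s law, is completely general. You
will see in Chap. 26, however, that Ampere’s law in the form

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \mu_0 i$$

is valid only for steady currents and needs to be modified to make it
perfectly general.


25-4 ELECTRIC GENERATORS AND MOTORS
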

The  electric  generator  takes  many  forms,  but  they  all  have  in  common  the 
property  that  they  convert  mechanical  energy  into  electric  energy.  The 
class  of  generators  discussed  in  this  section — by  far  the  most  common 
class — depend  for  their  operation  on  Faraday’s  law. 

The  sliding-wire  apparatus  of  Sec.  25-1  is  an  electric  generator, 
though  not  a practical  one.  The  motion  of  the  sliding  wire  induces  an  emf; 
consequently  a current  flows  through  the  circuit. 

Fig. 25-14 A magneto.

A much more practical approach is to rotate a coil of wire in a constant
magnetic field, thus changing the magnetic flux which penetrates the coil.
Nearly all commercial generators operate in this fashion. The simplest
form of the rotating-coil generator, called the magneto, is illustrated in Fig.
25-14. The magneto is essentially identical to the permanent-magnet motor
shown in Fig. 24-17. An external torque applied to the shaft rotates the coil
in the field of the permanent magnet at a constant angular velocity given by
the signed scalar ω. The coil has N turns. Assume for simplicity that each of
the turns encloses an equal planar area a. The magnetic field ℬ is constant.
Also for simplicity, assume that it is uniform over the region occupied by
the coil. To apply Faraday’s law to this situation, we express it in the form
of Eq. (25-9). Then we allow for the fact that there are N turns, and obtain


$$V = N\left(-a \cos\theta\, \frac{d\mathcal{B}}{dt} - \mathcal{B} \cos\theta\, \frac{da}{dt} + \mathcal{B} a \sin\theta\, \frac{d\theta}{dt}\right)$$

where θ is the instantaneous value of the angle between ℬ and a. In this
case, the first two terms on the right side of this equation are equal to zero,
since ℬ and a are constant in time. Thus only the last term makes a
contribution to the induced emf V, and we have

$$V = N \mathcal{B} a \sin\theta\, \frac{d\theta}{dt} \tag{25-21}$$

The coil orientation angle θ is a function of time. Since the angular
velocity is constant, θ is given by

$$\theta = \omega t$$

where it is assumed for simplicity that the time t = 0 is chosen at an instant
when the area vector a of the coil is parallel to ℬ. The time derivative dθ/dt
can be evaluated immediately to obtain

$$\frac{d\theta}{dt} = \omega$$

Fig. 25-15 Output emf V of the magneto of Fig. 25-14 as a function of time t.
The angular speed of the rotor coil is |ω|. (The time axis is marked at
π/|ω|, 2π/|ω|, 3π/|ω|, and 4π/|ω|.)

Fig. 25-16 Schematic end view of the rotor of a more practical generator.
Here the three windings shown produce emf’s which are out of phase with one
another. This makes possible the production of an output emf which does
not fluctuate as much as the output emf of the magneto of Fig. 25-14.


Substitution of these values for θ and dθ/dt into Eq. (25-21) yields the
following result for the induced emf:

$$V = \omega N \mathcal{B} a \sin(\omega t) \tag{25-22}$$

The output emf of the generator is a sinusoidal function of time. However,
the commutator, through which the emf reaches the outside world,
reverses the connections to the coil twice during each revolution. The
commutator and its brushes are oriented so that the reversal takes place just as
the value of sin(ωt) passes through zero. This occurs when a and ℬ are
parallel, or antiparallel, that is, when the rotating coil presents maximum
area to the magnetic field. The resulting output emf is plotted in Fig.
25-15. While its magnitude varies sinusoidally, its sign is always the same. If
an external resistor is connected across the brushes which are the terminals
of the source of emf, the current through the resistor will always flow in the
same sense. This type of output is called pulsating direct current, or
pulsating dc.
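The action of the commutator on the output of Eq. (25-22) can be sketched in a few lines of Python. The numerical values below (100 turns, a 0.20-T field, a coil area of 2.0 × 10⁻³ m², and 50 revolutions per second) are hypothetical, as are the function names. The commutator is modeled simply by taking the absolute value, which assumes the output sign is chosen to be positive.

```python
import math


def coil_emf(t, omega, turns, field, area):
    """Eq. (25-22): emf (volts) induced in the rotating coil, before the commutator."""
    return omega * turns * field * area * math.sin(omega * t)


def magneto_output(t, omega, turns, field, area):
    """Pulsating dc: the connections are reversed each time sin(omega t) changes sign."""
    return abs(coil_emf(t, omega, turns, field, area))


# Hypothetical magneto parameters.
omega = 2 * math.pi * 50.0  # rad/s
params = dict(omega=omega, turns=100, field=0.20, area=2.0e-3)
peak = omega * 100 * 0.20 * 2.0e-3

quarter_turn = (math.pi / 2) / omega            # a perpendicular to the field
three_quarter_turn = (3 * math.pi / 2) / omega  # half a revolution later

print(peak)
print(coil_emf(three_quarter_turn, **params), magneto_output(three_quarter_turn, **params))
```

The coil emf reverses sign between the two instants, but the output at the brushes has the same sign and the same peak magnitude at both.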

The fluctuation in output is undesirable for most dc applications. In practical
generators, the variation is smoothed out by a design technique which amounts to
mounting on the rotor a number of coils with different orientations, as in Fig.
25-16. The number of segments in the commutator is increased accordingly.
Because each coil passes through the angle for which dΦ_m/dt = 0 at a different time,
their output emf's are out of phase with one another. The resulting net output emf
is shown in Fig. 25-17.
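The smoothing effect of several windings can be estimated with a simple model. The sketch below rests on an assumption that is not stated in the text: the commutator is taken to connect the brushes, at each instant, to whichever winding has the largest rectified emf. Real machines may combine the windings differently, so the numbers indicate only the trend. Emf’s are expressed as fractions of the peak value ωNℬa, and the function name is invented for the illustration.

```python
import math


def commutated_output(theta, windings=3):
    """Largest rectified emf among equally spaced windings, as a fraction of the peak emf.

    Assumption: the brushes always contact the winding whose emf is currently largest.
    """
    spacing = math.pi / windings  # angular offset between successive windings
    return max(abs(math.sin(theta + k * spacing)) for k in range(windings))


# Sample one full revolution in steps of 0.1 degree.
angles = [2 * math.pi * k / 3600 for k in range(3600)]
single_min = min(commutated_output(a, windings=1) for a in angles)
triple_min = min(commutated_output(a, windings=3) for a in angles)
print(single_min, triple_min)
```

With a single winding the output falls to zero twice per revolution. With three windings spaced 60° apart, the output in this model never falls below cos 30°, or about 87 percent of the peak value.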

If  alternating  current  (ac)  is  desired,  in  which  the  output  emf  follows  a com- 
plete sine  curve,  the  coil  leads  of  the  simple  magneto  shown  in  Fig.  25-14  can  be 
connected  to  slip  rings  instead  of  a commutator.  Figure  25-18  shows  such  slip 
rings.  There  is  no  reversal  of  connection  as  the  rotor  shaft  turns,  and  the  emf 
between  the  brushes  follows  Eq.  (25-22). 




Fig. 25-17 The emf's V_1, V_2, and V_3 produced by the individ-


For  larger  generators,  as  for  larger  motors,  it  becomes  impractical  to  provide  a 
magnetic  field  by  means  of  a permanent  magnet.  In  medium-size  generators,  some 
of  the  output  current  of  the  generator  is  passed  through  field  coils.  These  are  sim- 
ply stationary  electromagnets  which  provide  the  magnetic  field  in  which  the  rotor 
turns.  Such  generators  are  called  self-excited.  In  very  large  generators  of  the  kind 
used  in  electric  power  stations,  the  current  for  the  field  coils  is  usually  provided 
by  smaller  auxiliary  generators. 

When no external electric load is connected to a generator, no
mechanical energy input is required to keep it rotating, beyond the small
amount required to overcome friction and similar losses. Such losses are
neglected in the discussion which follows. But if an external load (say a
resistor) is connected across the generator terminals, the emf between the
terminals will drive a current. This current will flow through the load and
through the rotor coil as well, since they are connected in a loop. The
current flowing through the coil results in a magnetic dipole moment, so that
the rotor becomes a magnet, as shown in Fig. 25-19. According to Lenz’
law, the direction of this dipole moment m is that which opposes the
change in flux penetrating the coil (which in Fig. 25-19 is an increase
caused by the rotation of the coil). Thus the magnetic dipole moment is in
the direction shown. Since a magnetic dipole moment tends to align itself
with an externally applied magnetic field (as discussed in Sec. 24-3), the
rotor shaft must be turned against an opposing magnetic torque.

Fig. 25-18 The general design principle of slip rings, used to make electric
connection between the rotor shaft of a generator and the outside world when
an alternating-current (ac) output is desired.

Fig. 25-19 The rotor coil of the magneto in Fig. 25-14 rotates in the
externally applied magnetic field represented by the vector ℬ. Its angular
velocity is ω. The magneto is connected to an external load which completes
the electric circuit of which the rotor coil is a part, and a current i flows
through the circuit as a result of the emf induced in the coil. The sense of
the current is such that it flows away from the reader in the upper part of
each turn and toward the reader in the lower part, as indicated, respectively,
by the × and the • in the figure. Resulting from this current is an induced
magnetic dipole moment m. The corresponding induced magnetic field at the
center of the coil is represented by the vector ℬ_ind. According to Lenz’ law,
the direction of ℬ_ind must be that shown, which opposes the externally
applied field ℬ. Working backward, the sense of the induced current can be
found by applying the right-hand rule for relating the sense of magnetic
field lines to the sense of the associated current.

The  larger  the  current  drawn  from  the  generator,  the  greater  is  the 
magnetic  dipole  moment  of  the  rotor  coil.  The  corresponding  increase  in 
the  externally  applied  shaft  torque  required  to  overcome  the  opposing 
magnetic  torque  means  that  the  mechanical  work  input  must  increase  if  the 
angular  speed  of  the  generator  is  to  be  kept  constant. 
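The energy balance implied by this argument can be checked numerically. The sketch below assumes the standard expression for the magnitude of the torque on a magnetic dipole, mℬ sin θ with m = Nia, of the kind discussed in Sec. 24-3. It also assumes a purely resistive load and neglects the resistance of the coil. All numerical values and function names are hypothetical.

```python
import math


def generator_emf(theta, omega, turns, field, area):
    """Eq. (25-22), with omega * t written as the coil angle theta."""
    return omega * turns * field * area * math.sin(theta)


def opposing_torque(theta, current, turns, field, area):
    """Magnitude of the magnetic torque on a dipole of moment m = N i a: m B sin(theta)."""
    return turns * current * area * field * math.sin(theta)


# Hypothetical values: 100 turns, 0.20 T, 2.0e-3 m^2, 50 rev/s, 10-ohm load, coil at 60 degrees.
omega, turns, field, area, load = 2 * math.pi * 50.0, 100, 0.20, 2.0e-3, 10.0
theta = math.radians(60.0)

emf = generator_emf(theta, omega, turns, field, area)
current = emf / load  # coil resistance neglected
mechanical_power = opposing_torque(theta, current, turns, field, area) * omega
electrical_power = emf * current
print(mechanical_power, electrical_power)
```

At every instant the rate at which the shaft does work against the magnetic torque equals the rate at which electric energy is delivered to the load. Halving the load resistance doubles the current, the opposing torque, and the mechanical power required.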


Fig. 25-20 The magneto of Fig. 25-19 run “in reverse” as a motor. An external
voltage V imposed across the rotor coil makes a current i flow in the same sense
shown in Fig. 25-19. The coil is thus a magnetic dipole, and there is a magnetic
torque T_m exerted on the coil whose direction is away from the reader, as
denoted in the figure. If the external load is not too great, the rotor will turn
in the sense shown. This is opposite to the sense of rotation of the magneto.
The angular velocity is denoted by the signed scalar ω′. As explained in the
text, this rotation leads to a reduction in the current from the value it has if
the rotor is held stationary. The induced magnetic field at the center of the
dipole, ℬ_ind, is therefore reduced below the stationary value, shown as a dashed
vector.


An  electric  motor  is  essentially  a generator  run  in  reverse,  with  electric 
energy  converted  into  mechanical  energy,  instead  of  vice  versa.  Thus  the 
above  discussion  can  be  reversed  to  answer  the  question  left  pending  in 
Sec.  24-3:  How  is  electric  energy  converted  to  mechanical  energy  in  a 
motor? 

In  answering  this  question,  we  assume  for  simplicity  that  the  mechan- 
ical friction  of  the  rotor  is  negligible.  Let  the  electric  resistance  of  the  rotor 
winding  be  R.  When  an  external  voltage  V is  imposed  across  the  winding,  a 
current  i flows  through  the  rotor  and  a magnetic  torque  is  exerted  on  it. 
Suppose,  to  begin  with,  that  the  mechanical  load  on  the  rotor  is  so  large 
that  the  motor  stalls.  The  current  flowing  through  the  rotor  coil  induces  a 
magnetic  dipole  moment,  and  there  is  thus  a magnetic  torque  exerted 
on  the  motor,  as  shown  in  Fig.  25-20.  Now  let  the  load  be  reduced  until  the 
motor  starts  turning  slowly.  The  flux  penetrating  the  rotor  coil  changes  as  a 
result  (in  Fig.  25-20  it  is  decreasing).  According  to  Lenz’  law,  a current 
must  be  induced  which  opposes  the  change.  As  depicted  in  Fig.  25-20,  the 
opposition  to  change  requires  that  the  magnetic  field  of  the  externally 
driven  current  flowing  through  the  rotor  coil  be  reduced,  so  as  to  increase 
the  flux  penetrating  the  coil.  But  this  reduction  in  field  must  be  associated 
with a reduction in the current flowing through the coil. Since the externally
imposed voltage does not change, there is an emf Vb present in the coil,
opposing the external voltage V, which drives the current through the
rotor.  This  internal  emf  Vb  is  called  the  back  emf,  because  of  its  “backward” 
sense,  which  is  opposite  to  the  sense  of  the  externally  imposed  voltage  V. 

Now,  the  rotor  coil  does  not  "know"  whether  it  is  rotating  as  a motor 
shaft driving an external load or is being driven as a generator by an external
mechanical power source. The back emf induced according to Faraday's law
when the coil is driven as a motor is exactly the same as the emf induced
according to Faraday's law when it is driven with the same angular
velocity as a generator. Equation (25-22) can therefore be used to evaluate
the back emf Vb. If the motor is turning with an angular velocity $\omega'$, Vb is
given by the expression

$V_b = \omega' N \mathcal{B} a \sin(\omega' t)$  (25-23)

The  lighter  the  mechanical  load,  the  faster  the  motor  turns.  And  the 
faster the motor turns, the greater is $\omega'$, so that the back emf increases as
well.  If  the  motor  spins  perfectly  freely,  under  zero  load,  the  back  emf  is 
exactly  equal  to  the  external  voltage,  and  no  current  will  flow  through  the 
rotor  coil.  (In  real  motors  this  ideal  situation  can  only  be  approximated.) 

The value of the back emf Vb can also be written by applying to the
rotor coil an equation which superficially resembles Ohm's law. The quantity
Vb is subtracted from the known value of the externally imposed emf,
V. The magnitude of Vb is chosen so that the "net" emf V - Vb has the magnitude
which would be required to drive the measured current i through




Fig.  25-21  An  electric  motor  regarded 
as  an  ideal  resistor  R connected  in  series 
with  an  ideal  source  of  emf  Vb.  The 
motor  is  driven  by  an  externally  imposed 
voltage  V,  shown  here  as  supplied  by 
a battery.  The  sense  of  Vb  is  always 
opposite to that of V, and the magnitude
of Vb is always less than (or, in
the  ideal  case  of  a frictionless  motor 
not  connected  to  a mechanical  load, 
equal  to)  the  magnitude  of  V.  The  sense 
of  the  current  i through  the  circuit 
is  shown. 


the coil, whose resistance is R, if the coil were ohmic. Stated mathematically,
we have

V - Vb  = iR  (25-24) 

or 

Vb  = V - iR  (25-25) 

According  to  Eq.  (25-24),  the  value  of  the  back  emf  Vb  is  such  that  the  externally 
imposed  emf  V will  drive  the  same  current  i through  the  rotating  coil  as 
is  driven  through  the  same  coil,  when  it  is  stationary,  by  an  externally 
imposed  emf  having  the  value  V — Vb . 

A convenient  way  to  view  an  electric  motor  in  these  terms  is  shown  in 
the  ideal  electric  circuit  of  Fig.  25-21.  Here  the  motor  is  regarded  as  an 
ideal  resistance  R in  series  with  an  ideal  source  of  back  emf  Vb. 


The electric power input $P_{in}$ to the motor can be determined by applying
Joule's law in the form of Eq. (22-50), P = Vi, to the circuit of Fig.
25-21. The voltage drop V across the motor may be regarded as having two
parts. The first is the iR drop across the resistor. The second is the "voltage
drop" Vb across the source of back emf. We thus have V = iR + Vb, and
Joule's law can be written

$P_{in} = Vi = (iR + V_b)i$

or

$P_{in} = i^2R + V_b i$  (25-26)

The  first  term  on  the  right  side  of  this  equation  is  the  power  dissipated  as 
heat  in  the  flow  of  electric  charge  through  the  windings,  as  a result  of  the 
electric resistance of the windings. We soon show that the second term, $V_b i$,
represents the electric power input which the motor converts into mechanical
power output. (In a well-designed motor operating under rated conditions,
the quantity $V_b i$ usually is considerably greater than the quantity $i^2R$.
Thus electric motors are quite efficient in the sense that they convert most
of the total input power into useful output power. Efficiencies in excess of
90 percent are not at all unusual.) Using the value of Vb given by Eq.
(25-23), we can write the input power to the motor as

$P_{in} = i^2R + \omega' i N \mathcal{B} a \sin(\omega' t)$  (25-27)

Next we consider the mechanical power output of the motor. If a current
i is flowing through the rotor coil, the magnetic dipole moment of the
coil is $\mathbf{m} = Ni\mathbf{a}$. Since this magnetic dipole moment lies in a uniform,
externally applied magnetic field $\boldsymbol{\mathcal{B}}$, it has a potential energy of orientation U
given by Eq. (24-56), $U = -\mathbf{m} \cdot \boldsymbol{\mathcal{B}}$. Substituting into this equation the value
of $\mathbf{m}$ given immediately above, we have $U = -Ni\mathbf{a} \cdot \boldsymbol{\mathcal{B}}$. At time t, the rotor
is oriented so that the angle between $\mathbf{a}$ and $\boldsymbol{\mathcal{B}}$ is $\omega' t$, and the potential
energy can be written

$U = -Nia\mathcal{B} \cos(\omega' t)$

If the speed of the motor remains constant, there is no change in its kinetic
energy, and so any change in its potential energy U must be absorbed by
output work the motor does. The mechanical output power $P_{out}$ is thus




given  by  the  rate  of  change  of  the  potential  energy  and  is  Pout  = dU/dt,  or 

$P_{out} = -Nia\mathcal{B}\,\frac{d}{dt}\cos(\omega' t) = \omega' N i a \mathcal{B} \sin(\omega' t)$  (25-28)

This  quantity  is  identical  to  the  second  term  on  the  right  side  of  Eq.  (25-27), 
which  gives  the  electric  input  power  to  the  motor.  That  is,  aside  from  the 
input  power  dissipated  as  Joule  heat  in  the  motor  windings,  the  electric 
input  power  is  indeed  converted  into  mechanical  output  power. 
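
The power bookkeeping of Eqs. (25-24) to (25-26) can be tried out numerically. The short Python sketch below treats the motor as the ideal circuit of Fig. 25-21 and splits the instantaneous input power into its heat and mechanical parts. The function name and the sample values (a 12 V supply and a 2 Ω winding) are illustrative assumptions, not values from the text.

```python
def motor_power_balance(V, R, V_b):
    """Split the electric input power of the ideal motor circuit of Fig. 25-21.

    V   : externally imposed voltage (volts)
    R   : resistance of the rotor winding (ohms)
    V_b : back emf at this instant (volts), assumed to satisfy 0 <= V_b <= V
    """
    i = (V - V_b) / R        # Eq. (25-24): V - V_b = iR
    p_heat = i**2 * R        # first term of Eq. (25-26), Joule heating
    p_mech = V_b * i         # second term of Eq. (25-26), converted to mechanical power
    p_in = V * i             # Joule's law, P = Vi
    # Guard against division by zero for the freely spinning motor (i = 0).
    efficiency = p_mech / p_in if p_in > 0 else 0.0
    return i, p_heat, p_mech, p_in, efficiency


# Hypothetical supply and winding: 12 V and 2 ohms.
for V_b in (0.0, 6.0, 10.0, 12.0):
    i, p_heat, p_mech, p_in, eff = motor_power_balance(12.0, 2.0, V_b)
    print(f"V_b={V_b:5.1f} V  i={i:4.1f} A  heat={p_heat:5.1f} W  "
          f"mech={p_mech:5.1f} W  efficiency={eff:.2f}")
```

With zero back emf (the stalled motor) all the input power goes into heat, and as the back emf approaches the supply voltage the current falls toward zero, in line with the discussion of the unloaded motor above.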

Suppose  the  load  torque  on  an  electric  motor  is  suddenly  increased. 
What  happens  to  the  back  emf,  the  input  current,  and  the  output  power? 


25-5 INDUCTANCE AND INDUCTORS

Figure 25-22 shows two coils set up in a slight modification of the Faraday
experiment. When a steady current $i_1$ flows through coil 1, some of the
associated magnetic flux penetrates the area enclosed by coil 2. This part of
the flux is said to link the two coils; call it $\Phi_{12}$. If the current $i_1$ is changed at a
rate $di_1/dt$, the magnetic flux penetrating coil 2 will vary at a rate $d\Phi_{12}/dt$ which is
proportional to $di_1/dt$. According to Faraday's law, the voltmeter across coil 2
will read an emf

$V_2 = -\frac{d\Phi_{12}}{dt} \propto \frac{di_1}{dt}$


Thus  the  emf  induced  in  coil  2 is  proportional  to  the  rate  of  change  of  the 
current  flowing  through  coil  1.  The  proportionality  constant  is  called  the 
mutual  inductance  M12.  In  terms  of  the  mutual  inductance,  the  emf  V2  can 
be  written 

$V_2 = -M_{12}\frac{di_1}{dt}$  (25-29)

According  to  Eq.  (25-29),  the  unit  of  mutual  inductance  is  the  volt-second 
per ampere. This unit is called the henry (H) after Joseph Henry, who was
the  first  to  study  inductance  in  1829.  The  definition  of  the  henry  is  thus 

1 H = 1 V-s/A  (25-30) 


Fig.  25-22  Modification  of  the  Faraday  apparatus  in  terms  of 
which  the  mutual  inductance  of  the  two  coils  is  defined  in  the  text. 
When  an  increasing  current  flows  in  the  sense  shown  in  the  circuit 
on  the  left,  containing  coil  1,  the  induced  emf  read  by  the  voltmeter 
in  the  circuit  on  the  right,  containing  coil  2,  has  the  sense  shown. 
Can you use Lenz' law to verify this fact? Hint: Imagine that the resistance
of the voltmeter is large, but not infinite, so that a small
current  can  flow  in  the  circuit  on  the  right. 




Mutual  inductance  depends  on  the  geometry  of  the  system.  It  depends 
on  the  size  and  shape  of  each  of  the  coils,  the  distance  between  them,  and 
their  relative  orientation.  If  either  or  both  of  the  coils  are  extended  in 
length  (like  a solenoid),  some  of  the  magnetic  flux  may  link  only  part  of  one 
coil  to  all  or  part  of  the  other.  (This  is  the  case  in  Fig.  25-22.)  Equation 
(25-29)  is  nevertheless  valid. 

Example  25-5  considers  a very  important  case  of  mutual  inductance. 


EXAMPLE 25-5

Two solenoids are nested coaxially one inside the other, as shown in Fig. 25-23.
Both solenoids have the same length l = 30 cm. The inner one has radius $r_1$ = 1.5
cm and a winding density $n_1$ = 7000 turns per meter. The corresponding specifications
for the outer solenoid are $r_2$ = 3.0 cm and $n_2$ = 5000 turns per meter. Find the
mutual inductance $M_{12}$. Then find the mutual inductance $M_{21}$, defined by the equation
$V_1 = -M_{21}\,di_2/dt$, where $i_2$ is a current flowing in the outer coil and $V_1$ is the
emf induced in the inner coil as a result of a change in $i_2$.

■ Since the solenoids are quite long relative to their diameters, you can assume
that when a current $i_1$ flows through the inner solenoid, nearly all the magnetic field
lines of the inner solenoid have return paths outside the outer solenoid. Thus the
total magnetic flux $\Phi_1$ passing through the inner solenoid is the flux linking the two
solenoids. If $\mathcal{B}_1$ is the magnetic field inside the inner solenoid and $a_1$ is its
cross-sectional area, you have for that flux, just as in Example 25-4,

$\Phi_{12} = \Phi_1 = \mathcal{B}_1 a_1 = \mu_0 n_1 i_1 \pi r_1^2$

Using Faraday's law, you have for the emf induced in each of the $n_2 l$ turns of
solenoid 2

$V = -\frac{d\Phi_{12}}{dt} = -\pi\mu_0 n_1 r_1^2 \frac{di_1}{dt}$


Since the emf's of the individual turns add, you have for the total emf induced in
solenoid 2

$V_2 = \left(-\pi\mu_0 n_1 r_1^2 \frac{di_1}{dt}\right)(n_2 l)$


The mutual inductance is defined as the quantity multiplying $-di_1/dt$, so you have

$M_{12} = \pi\mu_0 n_1 n_2 l r_1^2$  (25-31a)


In calculating the mutual inductance $M_{21}$, you must take into account the fact
that the current $i_2$ flowing through the outer solenoid produces a total magnetic
flux $\Phi_2$, but only the part $\Phi_{21}$ of that flux which passes through the smaller
inner solenoid links the two. Since the magnetic field inside a long solenoid is
uniform, you have

$\Phi_{21} = \Phi_2 \frac{a_1}{a_2} = \Phi_2 \frac{\pi r_1^2}{\pi r_2^2}$


Fig.  25-23  Illustration  for  Example  25-5,  in 
which  the  mutual  inductance  of  a pair  of  coaxial 
coils  is  calculated. 




where $a_2$ is the cross-sectional area of solenoid 2. The flux $\Phi_2$ is given by

$\Phi_2 = \mathcal{B}_2 a_2 = \mu_0 n_2 i_2 \pi r_2^2$

where $\mathcal{B}_2$ is the magnetic field of the current $i_2$. Thus

$\Phi_{21} = \pi\mu_0 n_2 i_2 r_1^2$

Again using Faraday's law and adding the emf's of the individual turns, you have

$V_1 = -\frac{d\Phi_{21}}{dt}(n_1 l) = -\pi\mu_0 n_2 r_1^2 \frac{di_2}{dt}(n_1 l)$


or

$M_{21} = \pi\mu_0 n_1 n_2 l r_1^2$  (25-31b)

Comparing this with the value of $M_{12}$, you see that

$M_{21} = M_{12}$

Using  the  numerical  values  given,  you  find 

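$M_{21} = M_{12} = \pi \times (4\pi \times 10^{-7}\ \mathrm{T\cdot m/A}) \times (7.0 \times 10^{3}\ \mathrm{m^{-1}}) \times (5.0 \times 10^{3}\ \mathrm{m^{-1}}) \times (0.30\ \mathrm{m}) \times (0.015\ \mathrm{m})^2$

$= 9.3 \times 10^{-3}\ \mathrm{H}$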
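
The arithmetic of Example 25-5 can be checked with a few lines of Python. This is only a sketch under the example's long-solenoid assumption; the function names are made up for illustration, and the rate of change of 100 A/s used at the end is a hypothetical value, not one given in the example.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # permeability of free space, T·m/A


def mutual_inductance_coaxial(n1, n2, length, r_inner, mu=MU_0):
    """Mutual inductance of two long coaxial solenoids, Eq. (25-31).

    n1, n2  : turns per meter of the two windings
    length  : common length of the solenoids (m)
    r_inner : radius of the inner solenoid (m); only this radius enters
    """
    return math.pi * mu * n1 * n2 * length * r_inner**2


def induced_emf(M, di_dt):
    """Emf induced in one coil by a changing current in the other, Eq. (25-29)."""
    return -M * di_dt


M = mutual_inductance_coaxial(n1=7000, n2=5000, length=0.30, r_inner=0.015)
print(f"M = {M:.2e} H")

# Hypothetical rate of change of the current in the inner solenoid: 100 A/s.
print(f"V2 = {induced_emf(M, 100.0):.2f} V")
```

Because the expression contains the product of the two winding densities and only the inner radius, exchanging the roles of the two coils leaves the result unchanged, which is the equality of $M_{12}$ and $M_{21}$ found in the example.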


In the relatively symmetrical arrangement of the two solenoids of Example
25-5, it has been possible to show by direct calculation that $M_{21}$ and
$M_{12}$ are equal. It is by no means obvious, but it is also true for the mutual
inductances of all possible pairs of circuits in all possible geometric
arrangements. That is, the equation

$M_{21} = M_{12} = M$  (25-32)

is always true. If an emf V is induced in one circuit as a result of a change di/dt in
the current flowing through a second, nearby circuit, the same emf V will be induced
in the second circuit by the same rate of change of current di/dt in the first circuit.
For this reason, we now drop the subscripts and refer to the mutual
inductance M.

The  mutual  inductance  of  the  pair  of  solenoids  in  Example  25-5  can  be 
greatly  increased  by  filling  the  inner  solenoid  with  a core  of  iron  or  some 
other substance with a large permeability $\mu$. In that case Eqs. (25-31)
become

$M = \pi\mu n_1 n_2 l r_1^2$  (25-33)

Since $\mu$ can be several thousand times larger than $\mu_0$, the presence of an
iron  core  increases  the  inductance  by  that  factor. 

The  phenomenon  of  inductance  is  not  limited  to  pairs  of  unconnected 
coils.  When  a current  flows  through  a single  solenoid,  for  example,  each 
turn makes a contribution to the total magnetic flux. And that flux contribution
links each turn with at least some of the neighboring turns. Indeed,
in a long solenoid or a toroid, the magnetic flux contribution of each turn
links essentially all the other turns. The phenomenon of self-inductance is a
consequence of this fact. Suppose the current flowing through the solenoid
is increased. The amount of flux linking each turn to other turns then
increases. According to Faraday's law (remembering the negative sign!),
there is induced a back emf which opposes the growth of current. This
self-induced emf V is related to di/dt, the rate of change of current which
induces it, by a factor L:

$V = -L\frac{di}{dt}$  (25-34)


The  factor  L is  called  the  self-inductance,  or  simply  the  inductance  when 
there  is  no  possibility  of  confusion.  The  unit  of  L,  like  the  unit  of  M,  is  the 
henry. 

Every conductor has a self-inductance, although it may be small. Consider,
for example, a long, straight wire. No real wire has zero thickness.
The magnetic field associated with the current flowing in the inner part of
the wire links the inner part with the outer part of the wire. When the current
changes, there must be an induced emf which opposes the change and
therefore  a self-inductance. 


EXAMPLE  25-6 


Find  the  self-inductance  of  a solenoid  identical  to  the  outer  solenoid  of  Example 
25-5. 

■ Just as in Example 25-5, the magnetic flux penetrating the outer solenoid when
a current $i_2$ flows through it is given by

$\Phi_2 = \mu_0 n_2 i_2 \pi r_2^2$

All this flux links with every one of the $n_2 l$ turns of the solenoid. Thus you have

$V_2 = -\frac{d\Phi_2}{dt}(n_2 l) = -\mu_0 n_2 \pi r_2^2 \frac{di_2}{dt}(n_2 l)$

so that

$L = \pi\mu_0 n_2^2 l r_2^2$

The  numerical  values  give  you 

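$L = \pi \times (4\pi \times 10^{-7}\ \mathrm{T\cdot m/A}) \times (5.0 \times 10^{3}\ \mathrm{m^{-1}})^2 \times (0.30\ \mathrm{m}) \times (0.030\ \mathrm{m})^2 = 2.7 \times 10^{-2}\ \mathrm{H}$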

You can use a symmetry argument to obtain the same result without calculation.
Begin with Eq. (25-31a) for the mutual inductance of solenoids with $n_1 = n_2$.
Let  the  radius  of  the  inner  one  increase  until  it  is  equal  to  the  radius  of  the  outer  one. 
Then  you  can  think  of  the  inner  and  outer  solenoids  as  being  the  same  solenoid, 
and  you  will  have  evaluated  the  “mutual”  inductance  of  the  solenoid  with  itself! 
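
The symmetry argument can be illustrated numerically. In the Python sketch below, the self-inductance formula found in this example is compared with the mutual-inductance formula of Eq. (25-31a) evaluated for equal winding densities and equal radii. The function names are hypothetical and chosen only for this illustration.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # permeability of free space, T·m/A


def solenoid_self_inductance(n, length, radius, mu=MU_0):
    """Self-inductance of a long solenoid: L = pi * mu * n^2 * l * r^2."""
    return math.pi * mu * n**2 * length * radius**2


def mutual_inductance_coaxial(n1, n2, length, r_inner, mu=MU_0):
    """Mutual inductance of two long coaxial solenoids, Eq. (25-31a)."""
    return math.pi * mu * n1 * n2 * length * r_inner**2


# Outer solenoid of Example 25-5: 5000 turns/m, 30 cm long, 3.0 cm radius.
L_direct = solenoid_self_inductance(5000, 0.30, 0.030)

# Symmetry argument: let the inner solenoid grow until it coincides with the outer one.
L_symmetry = mutual_inductance_coaxial(5000, 5000, 0.30, 0.030)

print(f"direct calculation : {L_direct:.2e} H")
print(f"symmetry argument  : {L_symmetry:.2e} H")
```

Both lines print the same value, about 2.7 x 10^-2 H, as found in the example.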


Since  the  equation  for  the  self-inductance  obtained  in  Example  25-6  is 
perfectly  general,  we  can  drop  the  subscripts  and  write  it  in  the  form 

$L = \pi\mu_0 n^2 l r^2$  (25-35)

This  equation,  which  is  exact  for  an  infinitely  long  solenoid,  is  reasonably 
accurate  for  any  closely  wound  solenoid  which  is  long  and  thin. 

An  electric  circuit  element  possessing  an  appreciable  self-inductance  is 
called an inductor. The symbol for an inductor is a row of small loops suggesting a coil. If a circuit



Fig.  25-24  A current  i flows  through 
two  inductors  in  series.  Their  respective 
self-inductances are $L_1$ and $L_2$, and it is
assumed  that  they  are  so  located  that 
their  mutual  inductance  M is  negligible. 


contains two inductors in series, as in Fig. 25-24, the total inductance of the
circuit is the sum of the two self-inductances $L_1$ and $L_2$, provided the two
inductors are sufficiently far apart as to have negligible mutual inductance.
The proof of this statement is as follows. The quantity di/dt is the same for
both inductors. The back emf V induced in the entire system between a and
b can therefore be written

$V = -L_1\frac{di}{dt} - L_2\frac{di}{dt} = -(L_1 + L_2)\frac{di}{dt}$  (25-36)


The  value  of  V is  just  the  same  as  if  the  two  inductors  were  replaced  by  a 
single  inductor  whose  self-inductance  is 




Fig.  25-25  A current  i flows  through 
two  inductors  in  parallel.  Here  again  it 
is  assumed  that  the  mutual  inductance 
M of  the  two  inductors  is  negligible 
compared to the self-inductances $L_1$ and
$L_2$.


$L = L_1 + L_2$  for series connection, well separated  (25-37)


In Fig. 25-25, two inductors having self-inductance $L_1$ and $L_2$ are connected
in parallel. They are well separated physically. A varying current i
flows between a and b. Since the voltage drop between points a and b (that
is, the reading on a voltmeter connected between these points) must be the
same through either path, the back emf's induced in the two inductors
must be the same. Thus Eq. (25-34) gives

$-V = L_1\frac{di_1}{dt} = L_2\frac{di_2}{dt}$

or

$-\frac{V}{L_1} = \frac{di_1}{dt}$  (25-38a)

and

$-\frac{V}{L_2} = \frac{di_2}{dt}$  (25-38b)


At any instant of time, the total current i divides into two parts, $i_1$ and
$i_2$, which satisfy the condition

$i = i_1 + i_2$

Differentiating with respect to time yields the relation

$\frac{di}{dt} = \frac{di_1}{dt} + \frac{di_2}{dt}$

Adding Eqs. (25-38a) and (25-38b), and using this relation, we have


Fig. 25-26 A real inductor usually possesses
electric resistance as well as self-inductance.
It may be represented as an
ideal resistor having resistance R in
series with an ideal inductor having
self-inductance L.




$-V\left(\frac{1}{L_1} + \frac{1}{L_2}\right) = \frac{di}{dt}$

If a single inductor L were substituted for $L_1$ and $L_2$ connected in parallel,
the relation between V and di/dt would be the same if the condition

$\frac{1}{L} = \frac{1}{L_1} + \frac{1}{L_2}$  for parallel connection, well separated  (25-39)

were  satisfied.  Equation  (25-39)  is  thus  the  combination  rule  for  inductors 
in  parallel. 
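
The two combination rules, Eqs. (25-37) and (25-39), are easy to put into code. The following Python sketch assumes, as the text does, that the inductors are well separated so that their mutual inductance is negligible. The function names and the sample values of 2.0 mH and 3.0 mH are illustrative assumptions.

```python
def series_inductance(L1, L2):
    """Eq. (25-37): well-separated inductors in series add directly."""
    return L1 + L2


def parallel_inductance(L1, L2):
    """Eq. (25-39): well-separated inductors in parallel add as reciprocals."""
    return 1.0 / (1.0 / L1 + 1.0 / L2)


# Hypothetical pair of inductors, 2.0 mH and 3.0 mH.
L1, L2 = 2.0e-3, 3.0e-3
print(f"series   : {series_inductance(L1, L2) * 1e3:.2f} mH")
print(f"parallel : {parallel_inductance(L1, L2) * 1e3:.2f} mH")
```

The series value is larger than either inductance and the parallel value is smaller than either, just as for resistors.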

The  combination  rules  for  inductors  in  series  and  in  parallel  are  of  the 
same form as the rules for resistors in series and in parallel. But the underlying
reasons are different in the two cases. The voltage drop across a resistor
depends on the current i flowing through it. The voltage drop across
an  inductor  depends  on  the  time  rate  of  change  di/dt  of  the  current  flowing 
through  it. 

The ideal inductor has zero electric resistance. With the exception of superconducting
inductors, no real inductor satisfies this condition. In applications
where  the  resistance  of  an  inductor  is  significant,  it  is  represented  symbolically  by 
an  ideal  resistor  RL  in  series  with  an  ideal  inductor  L,  as  in  Fig.  25-26. 


25-6  ENERGY  IN 
INDUCTORS  AND 
MAGNETIC  FIELDS 


If an increasing current is to be driven through an inductor, work must be
done against the back emf V induced by the change in the current. In the
case of the electric motor discussed in Sec. 25-4, this work was converted
into mechanical work done on a load attached to the motor shaft. In the
case of the inductor, the work is stored as potential energy. You will soon
see that this energy is stored in the form of potential energy of the magnetic
field. Something similar happens when a capacitor is charged and energy
is stored in the form of potential energy of the electric field.

Suppose  that  the  current  flowing  through  an  inductor  having  self- 
inductance  L is  changed  by  an  infinitesimal  amount  di  over  an  infinitesimal 
time  interval  dt.  The  emf  thus  induced  is  given  by  Eq.  (25-34): 


$V = -L\frac{di}{dt}$

If di/dt is positive, V opposes the passage of the entire current i = dq/dt. To
drive this current through the inductor against the induced emf V, you
must apply an external voltage $V_{ext} = -V$. But according to Eq. (25-34),
the induced emf is given by the expression $V = -L\,di/dt$, so that

$V_{ext} = L\frac{di}{dt}$


On the basis of this equation, it is possible to show that the behavior of an
inductor in an electric circuit is analogous to the behavior of a particle having a
certain mass in a mechanical system. That is, the two quite different quantities,
inductance and mass, play the same mathematical role in the equations describing
the behavior of the systems of which the inductor and the particle are parts. (A
kindred discussion in Sec. 21-7 led to the conclusion that the behavior of a capacitor
in a circuit is analogous to the behavior of a spring in a mechanical system.)

To  see  the  analogy  between  inductance  and  mass,  we  use  the  definition  of 
current  as  the  time  derivative  of  electric  charge  q,  i = dq/dt,  to  write  the  quantity 
di/dt  in  the  equation  displayed  above  in  the  form 

$\frac{di}{dt} = \frac{d}{dt}\left(\frac{dq}{dt}\right) = \frac{d^2q}{dt^2}$

Making this substitution in the equation and dividing both sides by L, we have

$\frac{V_{ext}}{L} = \frac{d^2q}{dt^2}$

Now consider a particle of mass m which is acted on by a force of magnitude
F. If the force acts in the positive x direction, the acceleration of the particle is
$d^2x/dt^2$ and is given by Newton's second law,

$\frac{F}{m} = \frac{d^2x}{dt^2}$




The  two  equations  displayed  above  are  identical  in  mathematical  form.  Thus  the 
applied  voltage  Vext  (the  electromotive  “force”)  in  the  electric  circuit  plays  the 
role  of  the  force  in  the  mechanical  system.  The  rate  of  change  of  electric  current 
$di/dt = d^2q/dt^2$ is analogous to the mechanical acceleration $d^2x/dt^2$. And the
inductance L is analogous to the mass m. Mass is a measure of mechanical inertia,
the tendency of the rate of change of position of a particle, dx/dt, to remain
constant. Inductance is a measure of electrical inertia, the tendency of the rate of
charge  flow,  dq/dt,  to  remain  constant. 
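
The analogy can be made concrete with a small numerical experiment. The Python sketch below integrates the same equation, d(rate)/dt = drive/inertia, once with an electrical reading (a constant applied voltage across an ideal inductor) and once with a mechanical reading (a constant force on a particle). The function name, the forward Euler step, and the sample numbers are assumptions made only for this illustration.

```python
def integrate_constant_drive(drive, inertia, t_end, steps=1000):
    """Integrate d(rate)/dt = drive / inertia from rest with forward Euler steps.

    Electrical reading : drive = V_ext, inertia = L, rate = current i = dq/dt
    Mechanical reading : drive = F,     inertia = m, rate = velocity v = dx/dt
    """
    dt = t_end / steps
    rate = 0.0
    for _ in range(steps):
        rate += (drive / inertia) * dt
    return rate


# Hypothetical values: 6 V across a 2 H inductor, and 6 N acting on a 2 kg particle.
i_final = integrate_constant_drive(drive=6.0, inertia=2.0, t_end=0.5)
v_final = integrate_constant_drive(drive=6.0, inertia=2.0, t_end=0.5)

print(f"current after 0.5 s  : {i_final:.3f} A")
print(f"velocity after 0.5 s : {v_final:.3f} m/s")
```

The two results are numerically identical because the two equations are identical in form. Doubling the inductance halves the rate at which the current grows, just as doubling the mass halves the acceleration.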


In driving an infinitesimal charge dq through the inductor, the external
voltage $V_{ext}$ does work dW given by the equation

$dW = V_{ext}\,dq = L\frac{di}{dt}\,dq$

From the definition of electric current, i = dq/dt, we have dq = i dt. Substituting
this value of dq into the equation displayed immediately above, we
have

$dW = L\frac{di}{dt}\,i\,dt = Li\,di$


Suppose now that at time $t_i$ the inductor has zero current flowing
through it. The external voltage is applied, and the current increases until
at time $t_f$ it attains a final value $i_f$. The total work done by $V_{ext}$ against the
back emf V is

$W = \int_{W_i}^{W_f} dW = \int_0^{i_f} Li\,di = \frac{Li_f^2}{2}$  (25-40a)

This  work  is  not  done  against  a dissipative  force,  but  results  in  the  storage 
of energy. To see this, consider what happens if the current i is subsequently
reduced. During the process of reduction, the quantity di/dt is negative.
Thus the induced emf $V = -L\,di/dt$ has a positive value. That is, its
sense is such as to aid the flow of current and hence to do work on the
charge passing through the inductor. In doing so, the induced emf is opposing
the change (in this case, a decrease) in current, in conformity with
Lenz’  law.  Thus  the  energy  stored  during  the  increase  in  current  does  work 
by  helping  to  drive  the  current  as  the  current  decreases. 

The work done in increasing the current flowing through the inductor
is stored as potential energy U in its magnetic field, as the previous paragraph
implies. Setting the energy stored in the system equal to the work W
done on the system gives

$U = \frac{Li^2}{2}$  (25-40b)

The subscript f has been dropped, since $i_f$ can have any value.
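
The integration leading to Eq. (25-40a) can be mimicked numerically by adding up the increments dW = Li di. The Python sketch below does this with a midpoint sum and compares the total with $Li^2/2$. The function name and the sample values (L = 2.7 x 10^-2 H, borrowed from Example 25-6, and a hypothetical final current of 10 A) are assumptions for illustration only.

```python
def work_to_build_current(L, i_final, steps=10_000):
    """Midpoint sum of dW = L * i * di as the current rises from 0 to i_final."""
    di = i_final / steps
    work = 0.0
    for k in range(steps):
        i_mid = (k + 0.5) * di      # current at the middle of this small step
        work += L * i_mid * di      # increment of work done against the back emf
    return work


L = 2.7e-2      # self-inductance in henries
i_f = 10.0      # hypothetical final current in amperes

W_numeric = work_to_build_current(L, i_f)
U_formula = 0.5 * L * i_f**2        # Eq. (25-40b)

print(f"summed work  : {W_numeric:.4f} J")
print(f"L i^2 / 2    : {U_formula:.4f} J")
```

The two numbers agree to within rounding error, since the midpoint sum is exact for an integrand that is linear in i.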


If  the  inductor  is  a long  solenoid,  the  uniform  magnetic  field  inside  it 
makes it relatively easy to calculate the stored energy $Li^2/2$ in terms of the
magnitude  of  the  magnetic  field.  In  order  to  avoid  the  question  of 
whether  significant  energy  is  stored  in  the  field  outside  the  solenoid,  we 
bend  the  solenoid  into  a toroid,  as  in  Example  23-12.  For  a toroid,  all  the 
flux  is  confined  inside  the  winding.  Owing  to  the  great  length  and  small 
diameter  of  the  solenoid,  the  magnetic  field  inside  the  resulting  toroid  will 
not  be  appreciably  different  from  that  of  the  unbent  solenoid  and  will  still 


be given by Eq. (23-62):

$\mathcal{B} = \mu_0 n i$  (25-41)

where  n is  the  number  of  turns  per  unit  length  of  the  solenoid. 

The self-inductance of the toroid is given by Eq. (25-35), $L = \pi\mu_0 n^2 l r^2$,
where r is the radius of the solenoid, l is its length, and n is the number of
turns per unit length. Substituting this value into Eq. (25-40b) gives

$U = \frac{Li^2}{2} = \frac{\pi\mu_0 n^2 l r^2 i^2}{2}$

Solving Eq. (25-41) for i gives $i = \mathcal{B}/\mu_0 n$. Thus $i^2 = \mathcal{B}^2/\mu_0^2 n^2$, and the
expression for U immediately above becomes

$U = \frac{\pi\mu_0 n^2 l r^2}{2}\left(\frac{\mathcal{B}}{\mu_0 n}\right)^2$

This equation simplifies to the form

$U = \frac{\pi r^2 l\,\mathcal{B}^2}{2\mu_0}$  (25-42)

In Sec. 21-7 the electric field energy density $\rho_e$ was defined as the energy
per unit volume stored in an electric field. The magnetic field energy density
$\rho_m$ is defined analogously to be the energy per unit volume stored in a
magnetic field. Since the volume enclosed by the toroid is $\pi r^2 l$, we have
from Eq. (25-42)

$\rho_m = \frac{U}{\pi r^2 l} = \frac{\mathcal{B}^2 \pi r^2 l}{2\mu_0 \pi r^2 l}$

or

$\rho_m = \frac{\mathcal{B}^2}{2\mu_0}$  (25-43)


We are justified in calculating the energy per unit volume in the magnetic
field of this very long toroid by dividing the total energy by the total volume.
This is because the magnitude $\mathcal{B}$ of the field is everywhere the same.
Equation (25-43) can be compared with the analogous expression given in
Eq. (21-56) for the electric field energy density,

$\rho_e = \frac{\epsilon_0 \mathcal{E}^2}{2}$  (25-44)


Equation (25-43) has been derived for the special case of a very long
toroid where the magnetic field is uniform. However, just like the corresponding
equation for the electric field energy density, Eq. (25-43) applies
even when the field varies from place to place. The reason is that Eq.
(25-43) relates the value of $\rho_m$ at a point in the field to the value of $\mathcal{B}$ at that
point, regardless of what is true at some other point. When $\mathcal{B}$ varies from
point to point, the total magnetic energy U stored in a given volume is the
integral of the infinitesimal contributions $dU = \rho_m\,dv$ stored in the infinitesimal
volume elements dv comprising that volume.
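
For the uniform-field case, the consistency of the field picture with the circuit picture can be checked numerically. The Python sketch below computes the stored energy once as $Li^2/2$ and once as the energy density of Eq. (25-43) times the enclosed volume. The solenoid dimensions are those of the outer solenoid of Example 25-5, while the current of 2.0 A and all variable names are assumptions made for this illustration.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # permeability of free space, T·m/A


def magnetic_energy_density(B, mu=MU_0):
    """Eq. (25-43): energy per unit volume stored in a magnetic field of magnitude B."""
    return B**2 / (2 * mu)


# Long solenoid (or the toroid made by bending it): 5000 turns/m, 30 cm, 3.0 cm radius.
n, length, radius = 5000, 0.30, 0.030
current = 2.0                                # hypothetical current in amperes

# Field picture: uniform field, Eq. (25-41), times the enclosed volume.
B = MU_0 * n * current
volume = math.pi * radius**2 * length
U_field = magnetic_energy_density(B) * volume

# Circuit picture: self-inductance, Eq. (25-35), in U = L i^2 / 2.
L = math.pi * MU_0 * n**2 * length * radius**2
U_circuit = 0.5 * L * current**2

print(f"energy from field density : {U_field:.5f} J")
print(f"energy from L i^2 / 2     : {U_circuit:.5f} J")
```

The two values coincide, as they must, since Eq. (25-43) was obtained from Eq. (25-40b) by exactly this substitution.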

The fact that the energy density can be expressed in terms of the
purely magnetic quantities $\mathcal{B}$ and $\mu_0$ gives further weight to our earlier
statement that the energy is indeed stored in the magnetic field. The ultimate
justification of Eqs. (25-42) and (25-43) lies in the correctness of
experimentally verifiable quantities derived from them, as in Example
25-7.




EXAMPLE  25-7 


Find the magnetic energy density $\rho_m$ and the magnetic energy U stored in the
coaxial cable shown in Fig. 25-27 when it carries a current i = 10.0 A. Use the
expression for U to evaluate the inductance L of the cable. The cable length is l = 10
km. The radius of the inner conductor is $k_1$ = 1.0 mm, and that of the outer
conductor is $k_2$ = 3.0 mm. Find numerical values for U and L.

■ In Example 23-13 you found the magnitude of the magnetic field between the
conductors of a coaxial cable in terms of the current it carries. When the current is i,
you have

$\mathcal{B} = \frac{\mu_0 i}{2\pi r}$  for $k_1 < r < k_2$  (25-45)

The field is zero everywhere else. For any point a distance r from the axis of the
cable, you can insert the square of this value of $\mathcal{B}$ into Eq. (25-43) to write the
energy density in the form

$\rho_m = \frac{\mathcal{B}^2}{2\mu_0} = \frac{\mu_0 i^2}{8\pi^2 r^2}$  for $k_1 < r < k_2$  (25-46)

This energy density is uniform over a cylindrical shell of infinitesimal thickness dr
and length l, whose volume is $dv = 2\pi l r\,dr$. The magnetic energy contained in such
a shell is thus

$dU = \rho_m\,dv = \frac{\mu_0 i^2}{8\pi^2 r^2}\,2\pi l r\,dr = \frac{\mu_0 i^2 l}{4\pi}\,\frac{dr}{r}$


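The total magnetic energy contained in the cable is the integral of dU over the
volume between the conductors:

$U = \int dU = \frac{\mu_0 i^2 l}{4\pi}\int_{k_1}^{k_2}\frac{dr}{r} = \frac{\mu_0 i^2 l}{4\pi}\ln\frac{k_2}{k_1}$  (25-47)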


The easiest way to obtain the inductance is to solve Eq. (25-40b), $U = Li^2/2$, for
L. (This is analogous to the procedure followed in Example 21-146.) You have

$L = \frac{2U}{i^2} = \frac{\mu_0 l}{2\pi}\ln\frac{k_2}{k_1}$  (25-48)

You can now insert the numerical values to find


Fig.  25-27  A coaxial  cable,  considered 
in  Example  25-7. 




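$U = \frac{(4\pi \times 10^{-7}\ \mathrm{T\cdot m/A}) \times (10.0\ \mathrm{A})^2 \times (1.0 \times 10^{4}\ \mathrm{m})}{4\pi} \times \ln\frac{3.0\ \mathrm{mm}}{1.0\ \mathrm{mm}} = 1.1 \times 10^{-1}\ \mathrm{J}$

and

$L = \frac{2U}{i^2} = \frac{2 \times 1.1 \times 10^{-1}\ \mathrm{J}}{(10.0\ \mathrm{A})^2} = 2.2 \times 10^{-3}\ \mathrm{H}$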


Note that the energy U and the inductance L in Example 25-7 are proportional
to the length l of the cable, but they depend on only the ratio of
the  radii  of  the  conductors  of  the  coaxial  cable,  not  on  the  radii  themselves. 
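
The results of Example 25-7 can be reproduced, and the shell-by-shell integration imitated, with the Python sketch below. The closed-form function follows Eq. (25-47); the numerical function simply adds up the energy in thin cylindrical shells using the energy density of Eq. (25-46). All function names and the number of shells are assumptions made for this illustration.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # permeability of free space, T·m/A


def coax_energy_density(i, r):
    """Eq. (25-46): magnetic energy density at radius r between the conductors."""
    return MU_0 * i**2 / (8 * math.pi**2 * r**2)


def coax_energy_closed_form(i, length, k1, k2):
    """Eq. (25-47): total magnetic energy stored between the conductors."""
    return MU_0 * i**2 * length / (4 * math.pi) * math.log(k2 / k1)


def coax_energy_numeric(i, length, k1, k2, shells=10_000):
    """Add up dU = rho_m dv over thin cylindrical shells of volume 2 pi l r dr."""
    dr = (k2 - k1) / shells
    total = 0.0
    for s in range(shells):
        r = k1 + (s + 0.5) * dr                       # mid-radius of this shell
        total += coax_energy_density(i, r) * 2 * math.pi * length * r * dr
    return total


# Values from Example 25-7: 10.0 A, 10 km of cable, radii 1.0 mm and 3.0 mm.
i, length, k1, k2 = 10.0, 1.0e4, 1.0e-3, 3.0e-3

U = coax_energy_closed_form(i, length, k1, k2)
U_numeric = coax_energy_numeric(i, length, k1, k2)
L = 2 * U / i**2                                      # Eq. (25-40b) solved for L

print(f"U (closed form) = {U:.3e} J")
print(f"U (shell sum)   = {U_numeric:.3e} J")
print(f"L               = {L:.3e} H")
```

Scaling both radii by the same factor leaves U and L unchanged in this sketch, which illustrates the remark that only the ratio of the radii matters.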

EXERCISES 


Group  A 

25-1.  Broken  voltmeter?  In  performing  the  experiment 
described  in  Example  25-1,  you  design  a simple  apparatus 
to  measure  the  potential  difference  between  the  ends  of 
the  wire.  You  fasten  the  test  wire  ol  length  50  cm  to  a 
wooden  board,  and  connect  the  ends  to  the  leads  of  a 
small  voltmeter,  which  is  also  fastened  to  the  board.  The 
strength  of  the  magnetic  held  is  1.8  T,  and  you  pull  the 
board  through  the  held  at  a speed  of  1.5  m/s.  But  the  volt- 
meter reads  zero.  Explain. 

25-2.  Same  dimensions.  Show  that  the  dimensions  of 
d&m/dt  and  of  V are  the  same. 

25-3. Emf induced in a coil. The plane of a circular coil
of area 200 cm² is normal to a magnetic field of 0.50 T. The
coil has 100 turns.

a. What is the magnetic flux through the coil?

b. What is the magnitude of the induced emf if the
field decreases to zero in 1.0 s?


25-4. Emf induced in a loop. A loop of cross section
15 cm² is broadside to a uniform magnetic field of 0.30 T.
The loop is rotated so that it is parallel to the field in
0.10 s. What is the magnitude of the induced emf?


25-5. Clockwise or counterclockwise? In Fig. 25E-5 the
rectangular loop of wire is being pulled to the right, away
from the long straight wire through which a steady
current i flows upward. Does the current induced in the loop
flow in the clockwise sense or in the counterclockwise
sense?


Fig.  25E-5 


25-6.  Jumping  ring.  An  aluminum  ring  is  placed 
around  the  projecting  core  of  a powerful  electromagnet. 
See  Fig.  25E-6.  When  the  circuit  is  closed  the  ring  jumps 
up  to  a surprising  height.  Explain. 


25-7.  Magnetic  braking,  I. 

a.  A pendulum  consists  of  a pivoted  rod  at  the  lower 
end  of  which  there  is  a metal  ring.  The  pendulum  swings 
in  the  plane  of  the  ring.  The  ring  is  raised  and  released. 
At the bottom of its swing, the ring enters a magnetic field
normal to the plane of the ring, in the gap of a strong
horseshoe  magnet.  On  entering  this  region,  it  soon  comes 
to  a dead  stop.  This  is  an  example  of  electromagnetic 
braking.  Explain  why  the  pendulum  stops  swinging. 

b.  If  a small  piece  of  the  ring  in  part  a is  cut  out  and 
the  experiment  repeated,  the  pendulum  keeps  swinging 
for  some  time.  Explain. 

25-8. Induced electric field. A ring whose radius is
5.0 cm lies on a table. A downward-directed magnetic flux
through the ring is increasing at the rate of 10 Wb/s.
What are the magnitude and direction of the electric field
induced in the metal of the ring?

25-9.  Generating  emf.  A rectangular  coil  with  500 
turns  is  30  cm  long  and  20  cm  wide  and  rotates  at  1800 
revolutions/min in a uniform magnetic field of 0.50 T.
What  is  the  maximum  value  of  the  emf  this  coil  generates? 




Fig. 25E-16 (figure label: i = 10 A)


25-10. Hand-operated generator. A hand-operated generator
is easy to turn when it is not connected to any electrical
device. However, it becomes quite difficult to turn
when it is connected, particularly if the device has a low
resistance. Explain.

25-11.  Nested  solenoids.  The  inner  solenoid  of  the  set 
of  nested  solenoids  treated  in  Example  25-5  is  connected 
to  a circuit  which  drives  a current  through  its  windings 
that increases at the rate of 10 A/s. Evaluate the magnitude
of the emf induced in the outer solenoid.

25-12.  Spark  coil.  The  spark  coil  used  in  automobiles 
depends  on  mutual  induction  between  two  solenoids 
called  the  primary  and  the  secondary.  If  the  primary  coil 
carries  a current  of  5.0  A which  falls  to  zero  in  0.0020  s, 
what mutual inductance is required if an emf of
$2.5 \times 10^{4}$ V must be induced in the secondary?

25-13.  Self-inductance  of  a toroid.  The  self-inductance 
of  a toroid  having  100  turns  is  1.0  H.  What  is  the  magnetic 
flux  through  the  toroid  when  it  carries  a current  of 
0.50  A? 

25-14.  Equivalent  inductance. 

a. What is the inductance between terminals A and B
in the network shown in Fig. 25E-14a, assuming the three
inductors are well separated?

Fig. 25E-14 (a) Network with inductors labeled 1.0 H, 2.0 H, and 3.0 H. (b) Inductors labeled 4.0 H and 5.0 H between terminals A and B.

b. If the separation between the two inductors in Fig.
25E-14b is decreased to the point that it becomes small,
and the inductors are wound in the same sense, would you
expect the inductance between A and B to be larger or
smaller than it is when they are well separated? Why?

25-15.  Magnetic  field  energy.  A current  of  10  A is 
flowing  through  the  solenoid  treated  in  Example  25-6. 

a.  Evaluate  the  energy  stored  in  the  magnetic  field. 

b.  Evaluate  the  energy  density  of  the  magnetic  field 
in  the  interior  of  the  solenoid. 

Group  B 

25-16. Potential difference in a moving wire. In Fig.
25E-16, a wire perpendicular to a long straight wire is
moving parallel to the latter with a speed v = 10 m/s in
the direction of the current flowing in the latter. The
current is 10 A. What is the magnitude of the potential
difference between the ends of the moving wire? What is the
sign of the potential difference?

(Fig. 25E-16 labels: 1.0 cm, 10.0 cm, v = 10 m/s.)

25-17. Faraday’s disk. Figure 25E-17 illustrates
Faraday’s disk, the first generator. A copper disk of radius
R rotates about O with clockwise angular speed ω. The
lowest part of the disk dips into a trough of mercury at A.
A voltmeter V makes contact with the mercury at D and
sliding contact with the metal axle. The disk is in a uniform
magnetic field of magnitude $\mathcal{B}$, directed into the
plane of the page.


Fig.  25E-17 


a.  In  what  direction  does  current  flow  through  the 
voltmeter? 

b. Show that the emf is given by $|V| = \tfrac{1}{2}\mathcal{B}\omega R^2$.

c. If $\mathcal{B}$ = 0.50 T, ω = 1200 revolutions/min, and
R = 10 cm, evaluate the emf.

25-18. Easy letdown. A conductor of length l and mass
m can slide along a pair of vertical metal guides connected
by a resistor R, as in Fig. 25E-18. Friction and the resistance
of the conductor and the guides are negligible.
There is a uniform horizontal magnetic field of strength
$\mathcal{B}$ normal to the plane of the page and directed outward.
Show that the final steady speed of fall under the influence
of gravity is given by $mgR/\mathcal{B}^2 l^2$.

Fig. 25E-18

25-19. Energy loss in a search-coil magnetometer. Referring
to Example 25-3, suppose that the coil is withdrawn
from the magnetic field in 0.50 s. Assuming that $d\Phi_m/dt$ is
constant during the withdrawal, find the energy E dissipated
in the resistance of the magnetometer circuit during
the withdrawal.

25-20. Rotating loop. A circular loop of radius r is
fixed to a rotation axis along the z direction, as shown in
Fig. 25E-20, so that the plane of the loop is always perpendicular
to the xy plane. The loop is rotating with constant
angular velocity $\omega_L$ directed along the z axis. At t = 0,
the loop lies in the yz plane.

a. Take the direction $\hat{\mathbf{n}}$ of the normal to the rotating
loop to lie along the x axis at t = 0. That is, $\hat{\mathbf{n}}(0) = +\hat{\mathbf{x}}$.
Express $\hat{\mathbf{n}}(t)$ in terms of $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\omega_L t$.

b. There is a uniform and constant externally applied
magnetic field $\boldsymbol{\mathcal{B}} = \mathcal{B}\hat{\boldsymbol{\mathcal{B}}} = \mathcal{B}(\cos\alpha\,\hat{\mathbf{x}} + \sin\alpha\,\hat{\mathbf{z}})$. Evaluate
the magnetic flux $\Phi_m(t)$ through the loop.

c. Determine the emf V(t) induced in the loop.

25-21. Rotating ring. A circular ring of diameter
20 cm has a resistance of 0.01 Ω. How much charge will
flow through the ring if it is turned from a position perpendicular
to a uniform magnetic field of 2.0 T to a position
parallel to the field?

25-22. Induced electric field inside a solenoid. Refer to
the situation described in Example 25-4, and consider a
circular path in a plane perpendicular to the axis of the solenoid,
centered on the axis. The path has a radius r which
is smaller than the inner radius of the windings.

a. Determine the induced emf V(r) around the path.
First express your result in symbolic form. Then give a
numerical value for each of the following values of r:
0.50 cm, 2.0 cm, 4.0 cm.

b. Find the induced electric field $\mathcal{E}(r)$, and then obtain
a numerical value for each radius in part a.


25-23. Between the sheets. A flat conducting sheet lies
in the xy plane and is carrying a current in the x direction
with uniform surface current density (that is, current per
unit length perpendicular to the direction of flow) given
by $\mathbf{j}_1 = j_1\hat{\mathbf{x}}$. A parallel sheet at z = D carries a surface current
density $\mathbf{j}_2 = -j_1\hat{\mathbf{x}}$.

a. Assuming that the magnetic field is zero for z < 0
and for z > D, find the magnetic field between the sheets.

b. The magnitude of the surface current densities
varies slowly with time. Find the induced electric field
$\boldsymbol{\mathcal{E}}(x, y, z, t)$ in terms of x, y, z, D, and $dj_1/dt$. Is your result
consistent with Lenz’s law?

25-24. Primitive motor. Figure 25E-24 represents a
primitive motor. A metal wire slides on a horseshoe-shaped
metal loop of width 0.25 m. These have negligible
resistance, but there is a 1.0-Ω resistor in the circuit as well
as a 6.0-V battery. There is a uniform magnetic field
directed into the plane of the page of magnitude 0.50 T.
The slide wire is pushed to the right by the magnetic
force. A force of 0.25 N to the left is required to keep it
moving with constant speed to the right.

Fig. 25E-24

a. What is the current i in the circuit?

b. What is the voltage drop across the 1.0-Ω resistor?

c. What is the back emf generated by the moving wire?

d. With what speed is the wire moving?

e. What mechanical power does the motor produce?

f. Show that the electric power converted to mechanical
power is equal to the power in part e. What is the
efficiency of the motor?

25-25. Magneto. The resistance of the rotating coil of
a magneto is R. It is connected to an external resistance
$R_e$.

a. If the magneto is rotated at constant angular velocity,
show that the power delivered is a maximum when
$R_e = R$.

b. What is the efficiency at maximum power?

c. How can the efficiency be increased without altering
the magneto?

25-26. Torque in a simple motor. The instantaneous mechanical
output of an electric motor can be written in the
form $P_\text{out} = T\omega'$, where T is the magnitude of the magnetic
torque exerted on the rotor at any instant and $\omega'$ is
the angular velocity of the motor at that instant, expressed
as a signed scalar. Show that for the simple motor discussed
in Sec. 25-4, the torque is given by the expression
$T = Nia\mathcal{B}\sin(\omega' t)$.




25-27. Evaluating mutual inductance. Evaluate the
mutual  inductance  between  the  rectangular  loop  in  Fig. 
25E-27  and  the  long  straight  wire. 

Fig.  25E-27 


25-28. Measuring mutual inductance. Figure 25E-28
shows an arrangement for measuring the mutual inductance
$M_{12}$ of a pair of coils, 1 and 2. Coil 1 is connected to a
battery, an ammeter A, and a switch. Coil 2 is connected to
a ballistic galvanometer G which measures the total charge
flowing through it during a current pulse. The switch is
closed, the galvanometer reading is observed, and the
steady reading of the ammeter is noted. Show that
$M_{12} = q_2 R_2 / i$, where $q_2$ is the charge read by the galvanometer,
$R_2$ is the total resistance of the galvanometer and coil 2,
and i is the ammeter reading.


Fig.  25E-28 


25-29. Cooperating solenoids. Prove that for solenoids 1
and 2 of equal radius r, situated end to end and wound in
the same sense, the total self-inductance L is
$L = L_1 + L_2 + 2M_{12}$. Here $L_1$ and $L_2$ are the self-inductances and
$M_{12}$ is the mutual inductance. What if the two solenoids
are wound in opposite senses?

25-30. Thermal versus magnetic energy densities in the
solar atmosphere. Sunspots are regions in the lower solar
atmosphere which are cooler (and therefore less bright)
than neighboring regions. It has been known for nearly a
century that fairly strong magnetic fields exist within
sunspots. Modern estimates of the physical conditions
within sunspots include these typical values: number of
free particles per unit volume $n \approx 10^{22}$ m$^{-3}$; gas temperature
$T \approx 4000$ K; magnetic field strength $\mathcal{B} \approx 0.1$ T.

a. Use the ideal-gas law to estimate the typical gas
pressure within a sunspot. Compare it to atmospheric
pressure.

b. Assuming that the gas is ideal and monatomic,
find the typical thermal energy density $\rho_t$ in a sunspot.

c. Find the typical magnetic energy density $\rho_m$ in a
sunspot.

d. Evaluate the ratio $\rho_t/\rho_m$ of the thermal and magnetic
energy densities. This ratio of energy densities indicates
whether pressure forces or magnetic forces dominate
the gross behavior of the gas. (If the ratio greatly
exceeds one, pressure forces dominate; if the ratio is
much less than one, magnetic forces dominate.) Within a
typical sunspot, which forces are dominant?

e. Away from sunspots, typical conditions in the
lower solar atmosphere are given by $n \approx 10^{22}$ m$^{-3}$,
$T \approx 6000$ K, and $\mathcal{B} \approx 10^{-4}$ T. Which forces are dominant
here?

Group  C 

25-31. Magnetic braking, II. A uniform metal rod of
mass M is able to slide with negligible friction along a pair
of fixed horizontal rails a distance w apart, as shown in
Fig. 25E-31. The rails and the fixed cross-connection at
the left are very highly conductive, so that they make no
significant contribution to the electric resistance of the
rectangular circuit. The free rod plus its contacts with the
fixed rails have total electric resistance R. There is a uniform
and steady externally applied magnetic field of magnitude
$\mathcal{B}$ directed vertically upward.

Fig. 25E-31

a. Find the current i induced in the circuit in terms of
w, R, $\mathcal{B}$, and v, the instantaneous velocity of the rod. Take
the positive sense for current to be counterclockwise as
viewed from above. In determining the induced current,
neglect the magnetic field produced by the current itself.

b. Suppose that at t = 0 the rod has position $x_0$ and
velocity $v_0$. Determine v(t) and x(t).

c. Obtain numerical results for i(t), v(t), and x(t) in the
following case: M = 0.10 kg, w = 1.0 m, R = 1.0 Ω, $\mathcal{B}$ =
0.20 T, $x_0$ = 3.0 m, and $v_0$ = 10 m/s. How far does the rod
slide before coming to rest?

d. How small must the coefficient of friction be in
order for the mechanical frictional force to be less than 10
percent as strong as the initial magnetic force?

e. The neglect of the magnetic field produced by the
current, $\mathcal{B}_c$, as in part a, is justified as long as its magnitude
is everywhere small compared to the applied magnetic
field $\mathcal{B}$. The maximum value of $\mathcal{B}_c$ is approximately
given by $\mu_0 i/2\pi a$, where a is the thickness of the metal rod.
Using a = 3.0 mm and the other numerical values given
in part c, is it reasonable to neglect $\mathcal{B}_c$ in solving this
problem?




Fig.  25E-34 


25-32. Heating an iron ring. A fixed metal ring of density
ρ and conductivity σ in the shape of a toroid is shown
in Fig. 25E-32. The ring lies in the xy plane and is immersed
in a spatially uniform but sinusoidally varying
magnetic field $\boldsymbol{\mathcal{B}}(t) = \mathcal{B}_0\cos(\omega t)\,\hat{\mathbf{z}}$.

Fig. 25E-32

a. Find the mass M and electric resistance R of the
ring. Give your results in terms of ρ, σ, $r_1$, and $r_2$, where
the radii $r_1$ and $r_2$ are defined in the figure.

b. Find the induced current i(t) in the ring, taking the
positive sense to be counterclockwise as viewed from
above. Express your result in terms of $r_2$, ω, $\mathcal{B}_0$, and R.

c. What is the average power P dissipated in the ring?

d. If the specific heat capacity of the ring material is
c, find the rate of temperature rise dT/dt of the ring,
assuming that no heat is lost.

e. Obtain numerical results for parts a through d for
the case of an iron ring, with $r_1$ = 2.0 mm, $r_2$ = 20 mm,
$\mathcal{B}_0 = 1.0 \times 10^{-2}$ T, and ω/2π = 60 Hz. For iron, ρ =
$7.87 \times 10^{3}$ kg/m³, σ ≈ $1.0 \times 10^{7}$ S/m, and c =
0.107 kcal/(kg·°C).

f. As in part e of Exercise 25-31, obtain an expression
which can be evaluated to check the validity of neglecting
the magnetic field produced by the ring itself. Evaluate
your expression for the case of the iron ring described
above. Comment on your result.

25-33. Concentric conductors. A conducting rod of
radius a lies along the z axis and carries a current i in the
positive z direction. The rod is surrounded by a concentric
conducting tube with inner radius b and outer radius c.
The tube carries an equal current, oppositely directed. Let
r denote distance from the z axis. That is, $r = (x^2 + y^2)^{1/2}$.

a. Find the magnetic field $\boldsymbol{\mathcal{B}}$ in the region between
the conductors (a < r < b) and in the region outside the
tube (r > c).

b. The magnitude i of the currents is slowly varied.
Show that the induced electric field $\boldsymbol{\mathcal{E}}$ has magnitude that
depends only upon r and is purely axial in direction.

c. Show that $\mathcal{E}(b) - \mathcal{E}(a) = (\mu_0/2\pi)\ln(b/a)\,di/dt$.


25-34. At the gap. A highly conductive ring of radius
R is perpendicular to and concentric with the axis of a
long solenoid, as shown in Fig. 25E-34. The ring has a
narrow gap of width δ in its circumference. The solenoid
has cross-sectional area a and a uniform internal field of
magnitude $\mathcal{B}$.

a. What is the induced emf around the loop if $\mathcal{B}$ is
constant?

b. Beginning at t = 0, the solenoid current is steadily
increased, so that the field magnitude is $\mathcal{B}(t) = \mathcal{B}_0 + (d\mathcal{B}/dt)\,t$,
where $d\mathcal{B}/dt > 0$. What is the induced emf V(t)
around the ring? Assuming that no charge can flow across
the gap, which face of the gap ($F_1$ or $F_2$) will accumulate an
excess of positive charge?

c. The accumulation of charge on the gap faces will
cease when the total electric field within the ring material
is zero. When this happens, what will be the electric field
in the gap? Does your expression depend upon R?

d. A spark will jump the gap if the electric field magnitude
exceeds the breakdown field $\mathcal{E}_b$ necessary to ionize
air. Find the minimum gap width $\delta_\text{min}$ that can tolerate the
field found in part c, in terms of a, $d\mathcal{B}/dt$, and $\mathcal{E}_b$.

e. Obtain a numerical value of $\delta_\text{min}$ for the case a =
0.10 m², $d\mathcal{B}/dt = 1.0 \times 10^{3}$ T/s, and $\mathcal{E}_b = 3.0 \times 10^{6}$ V/m.

25-35. Convenient property of an electric motor. The stationary
coil producing the magnetic field and the rotating
coil of a motor are connected in parallel. The current in
the rotating coil is 1.4 A, and its resistance R is 5.0 Ω. The
voltage applied to the motor is 220 V.

a. What is the back emf?

b. What is the mechanical power output?

c. If an added mechanical load slows the motor by 5
percent, what is now the back emf?

d. What is the current in the rotor now?

e. What is the new mechanical power output?

f. How do the results of parts c, d, and e show that an
electric automobile does not require a gear shift?

25-36. The solenoid connection. In Fig. 25E-36, coil AA'
(solid line) and coil BB' (dashed line) are wound on a long
plastic tube in the same sense.

a. Ends A and B are joined together and a source of
current i is connected to A' and B'. Show that the induced
emf is equal to $(L_A + L_B - 2M_{AB})\,di/dt$, where L is self-inductance.

b. Ends A and B' are joined together and the current
source is connected to A' and B. Show that the induced
emf is equal to $(L_A + L_B + 2M_{AB})\,di/dt$.

c. If the two equal-length coils have the same number
N of turns and $L_A$ = 0.010 H, what is the effective inductance
(i) in part a? (ii) in part b?

d. Ends A and B are joined, and ends A' and B' are
also joined. The current source is connected to A' and B'.
What is the effective inductance?

25-37. Magnetic field energy within a wire. A long
straight wire of circular cross section is made of nonmagnetic
material (that is, $K_m$ = 1 to a good approximation).
It is of radius a. The wire carries a current i which is uniformly
distributed over its cross section. Compute the energy
per unit length stored in the magnetic field contained
within the wire.

25-38. Energy content of the earth’s magnetic field. The
external magnetic field of a spherical object of radius R,
surrounded by vacuum and carrying an (idealized) central
point magnetic dipole, contains magnetic energy
$U = (\pi/2)(\mathcal{B}_p^2 R^3/2\mu_0)$. The quantity $\mathcal{B}_p$ is the maximum magnetic
field strength at the surface of the object, that is,
the value of $\mathcal{B}$ at the object’s magnetic poles.

a. The earth's external magnetic field is not that of a
pure centered point dipole, nor is the earth surrounded
by a perfect vacuum. Nevertheless, a reasonable estimate
of the energy content U in the external field is
$U \approx \mathcal{B}_p^2 R^3/2\mu_0$. Evaluate this, using $\mathcal{B}_p = 6.0 \times 10^{-5}$ T and
$R = 6.4 \times 10^{6}$ m.

b. Compare the energy U to the total annual usage of
electrical energy in the United States. It was
$1.7 \times 10^{12}$ kW·h in 1972.

c. The terrestrial magnetic field is believed to be
maintained by internal currents resulting from an interplay
between the earth’s rotation and the thermal convection
of an electrically conductive liquid in the earth’s core.
Estimate the earth’s rotational kinetic energy K and evaluate
the ratio U/K.




Changing  Electric 
Currents 


26-1 INDUCTANCE, RESISTANCE, AND CAPACITANCE IN ELECTRIC CIRCUITS


In Chap. 22, we considered at length the properties of electric circuits in
which steady currents flow. The key components of such circuits are resistors
and sources of emf, which may be connected simply or in complex
networks. There is much wider practical application, however, for circuits
in which the current is not steady, but changes with time both in magnitude
and in sense, sometimes in a very complex way. Such circuits are loosely
called alternating-current circuits, or ac circuits for short. (Strictly
speaking, this name applies only to circuits in which the variation of the
current is periodic in nature. However, the broader application of the term
is commonplace, and it is usually not necessary to make finer distinctions in
the terminology.) Alternating-current circuits are of enormous practical
importance; they have essential applications in every aspect of modern science,
technology, and everyday life.

Sources of emf and resistors are important elements of ac circuits, just
as they are of circuits carrying steady currents. (In ac circuits, however, the
emf’s produced by the sources may vary with time.) Capacitors and inductors
(discussed, respectively, in Chaps. 21 and 25) also play key roles in
the behavior of ac circuits. In this chapter, we first review the current-carrying
behavior of inductors, resistors, and capacitors as individual elements.
Then we study the behavior of combinations of two different types.
Finally we study the behavior of combinations of all three types. While we
confine our considerations to relatively simple circuits, the general principles
developed are applicable to ac circuits in general.


When an inductor, a resistor, or a capacitor is considered as a circuit
element, we do not concern ourselves with what is going on within it.
Rather, interest centers on what can be measured by applying probes to the
terminals of the circuit element, namely, the voltage imposed across the
element and the current flowing into or out of it. We now bring together
the rules relating the imposed voltage V to the current i for the three different
kinds of elements under consideration:

Inductor [see Eq. (25-34)]:

$$V = L\frac{di}{dt} \tag{26-1a}$$

Resistor [see Eq. (22-18)]:

$$V = iR \tag{26-1b}$$

Capacitor [see Eq. (21-45a)]:

$$V = \frac{q}{C} = \frac{1}{C}\int i\,dt \tag{26-1c}$$

Note that in Eq. (26-1a) there is no minus sign. The reason is that the voltage
of interest across a circuit element is the external voltage V imposed to
make a current flow through it, and not the back emf $V_b$ induced by the
current. For an ideal inductor (having zero resistance and zero capacitance
between its terminals) we have $V = -V_b$. We thus have from Eq. (25-34)
$V = -V_b = -(-L\,di/dt) = L\,di/dt$.

The voltage across an inductor depends on the time derivative of the
current. The voltage across an ideal resistor (having zero inductance and
capacitance) depends on the current itself. And the voltage across an ideal
capacitor (having zero resistance and inductance) depends on the time
integral of the current. Because of these different, but related, behaviors, a
rich variety of effects can be obtained by proper combinations of any two or
all three of these circuit elements.

Equations (26-1a) through (26-1c) can be written in an alternative form
by using the definition i = dq/dt to express them in terms of the charge q.
This yields

Inductor:

$$V = L\frac{d^2q}{dt^2} \tag{26-2a}$$

Resistor:

$$V = R\frac{dq}{dt} \tag{26-2b}$$

Capacitor:

$$V = \frac{q}{C} \tag{26-2c}$$

Just as the analysis of circuits with steady currents was carried out by
multiple application of Ohm’s law [Eq. (26-1b) or Eq. (26-2b)] to the resistors
in the circuit, so the analysis of circuits with changing currents is carried
out by multiple application of Eqs. (26-1a) through (26-1c) or Eqs.
(26-2a) through (26-2c) to the richer variety of circuit elements comprising
them. This is our task for the remainder of this chapter.
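The three element rules can be compared side by side numerically. The sketch below is illustrative only: the sinusoidal test current, the component values, and the function names are assumptions made for this example, and the capacitor is assumed to be uncharged at t = 0. It evaluates the inductor voltage from a finite-difference derivative, the resistor voltage from the current itself, and the capacitor voltage from a trapezoidal time integral.

```python
import math

I0 = 2.0                    # hypothetical current amplitude, A
OMEGA = 2 * math.pi * 60    # hypothetical angular frequency, rad/s


def sample_current(t):
    """Assumed test current i(t) = I0 sin(omega t)."""
    return I0 * math.sin(OMEGA * t)


def inductor_voltage(i, t, L, h=1e-6):
    """V = L di/dt, Eq. (26-1a), using a central difference for di/dt."""
    return L * (i(t + h) - i(t - h)) / (2 * h)


def resistor_voltage(i, t, R):
    """V = iR, Eq. (26-1b)."""
    return R * i(t)


def capacitor_voltage(i, t, C, steps=10_000):
    """V = q/C, Eq. (26-1c), with q the trapezoidal integral of i from 0 to t.

    Assumes the capacitor carries no charge at t = 0.
    """
    dt = t / steps
    q = 0.5 * (i(0.0) + i(t)) * dt
    for k in range(1, steps):
        q += i(k * dt) * dt
    return q / C


t = 3.0e-3  # s
v_L = inductor_voltage(sample_current, t, L=0.10)    # L in henrys
v_R = resistor_voltage(sample_current, t, R=5.0)     # R in ohms
v_C = capacitor_voltage(sample_current, t, C=1.0e-4) # C in farads
print(f"V_L = {v_L:.3f} V, V_R = {v_R:.3f} V, V_C = {v_C:.3f} V")
```

For this test current the three voltages vary as cos ωt, sin ωt, and 1 − cos ωt respectively, which is one way to see how the same current produces differently timed voltages across the three kinds of element.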


26-2 THE RL CIRCUIT

The circuit shown in Fig. 26-1 is the simplest form of the so-called RL circuit.
It consists of a source of emf supplying a constant voltage $V_0$, an ideal
resistor R, and an ideal inductor L in series. Let us begin to study this circuit
by considering qualitatively what happens when the current is “turned
on.” Prior to the moment when the switch is first moved from its initial position
C to position A, no current has been flowing through the circuit. As is
generally true of physical processes, the current cannot achieve its
steady-state value instantaneously, but must build up over a finite period.
At some instant very early in this buildup period, the current i through the
circuit is very small (although the rate di/dt at which it is increasing may be
large). As a consequence, the iR drop (as we continue to call the voltage
drop across the resistor) is also small. Indeed, we can choose the instant so
early in the buildup process that the iR drop is negligible compared to the
voltage $V_0$. Then, to a close approximation, the voltage V which appears
across the inductor must be the entire battery voltage $V_0$. How can this be
when the inductor has zero resistance? The answer is that the changing
current through the inductor induces a back emf $V_b = -V$. In order for
this to take place, the current must change at a rate di/dt which depends on
the inductance L of the inductor. At the very early instant when the voltage
across the inductor has the value $V = V_0$, it follows immediately from Eq.
(26-1a) that the instantaneous value of di/dt must be given by the expression

$$\frac{di}{dt} = \frac{V}{L} = \frac{V_0}{L}$$

As the current i grows (and it must grow because di/dt is greater than
zero), the iR drop across the resistor increases in proportion. When a sufficiently
long time has passed, the current i through the circuit reaches a
steady state. In this steady state, i = constant and so di/dt = 0. Thus,
the voltage drop across the inductor, V = L di/dt, is then also zero. This
means the entire battery voltage $V_0$ must appear across the resistor so that
$V_0 = iR$, just as in a dc circuit containing only a battery and a resistor. In
the steady state, it is just as though the inductor were not present, and the
current flowing is $i = V_0/R$. (This is why it was unnecessary to consider inductors
in studying dc circuits in Chap. 22.)


This qualitative sketch of the current-growth process must be reflected
in a quantitative treatment of the circuit. The treatment is based on Kirchhoff’s
loop rule: The algebraic sum of the voltages across all the circuit elements in
a loop is zero. (Kirchhoff’s rules are developed in Sec. 22-7, where the sign
conventions for their application are also given.) While the current changes
continually, it must be true at any particular instant that the sum of the voltage
changes around the circuit adds to zero. We can therefore apply the
loop rule to the circuit of Fig. 26-1. Going around in the clockwise sense, we
obtain

$$V_0 - iR - L\frac{di}{dt} = 0 \tag{26-3}$$

In this equation, $V_0$ is the voltage increase in passing from the negative to
the positive pole of the battery, iR is the voltage drop across the resistor,
and L di/dt is the voltage drop across the inductor. [In writing the loop
rule, a voltage increase is given a positive sign, and a voltage drop (decrease) a
negative sign.] The quantity i is the current at the particular instant being
considered, and di/dt is the rate of change of current at that instant.

We  can  find  an  analytical  solution  for  this  differential  equation  for  i by 
gathering  all  terms  involving  the  dependent  variable  i on  one  side  and  all 
terms  involving  the  independent  variable  t on  the  other.  This  can  be  done 
as  follows.  First,  divide  both  sides  of  the  equation  by  R to  obtain 


$$\frac{V_0}{R} - i - \frac{L}{R}\,\frac{di}{dt} = 0$$


Next, transpose the last term on the left side to the right side of the
equation, and interchange the resulting left and right sides. This gives

$$\frac{L}{R}\,\frac{di}{dt} = \frac{V_0}{R} - i$$


Finally, divide both sides of the equation by V0/R - i and multiply both
sides by (R/L) dt to obtain the desired form


$$\frac{di}{V_0/R - i} = \frac{R}{L}\,dt \tag{26-4}$$


The  solution  of  this  equation,  like  that  of  all  differential  equations,  de- 
pends on  the  initial  conditions.  In  this  case,  we  have  chosen  the  initial  time 
to  be  that  when  the  switch  in  Fig.  26-1  is  moved  from  C to  A,  so  that  the  ini- 
tial current  is  zero.  Expressed  mathematically,  when  t = 0,  we  have  i = 0. 
Integrating both sides of Eq. (26-4) from these initial values of current and
time to final values $i_f$ and $t_f$, we have

$$\int_0^{i_f} \frac{di}{V_0/R - i} = \int_0^{t_f} \frac{R}{L}\,dt$$


Evaluating  the  integrals  on  both  sides  of  the  equation  yields 


$$-\ln\left(\frac{V_0/R - i_f}{V_0/R}\right) = \frac{R}{L}\,t_f$$


Since $t_f$ can be any desired time later than t = 0, the subscript f can be
dropped from both t and the corresponding value of i. Doing so, and
making some algebraic simplifications, gives


$$\ln\left(1 - \frac{R}{V_0}\,i\right) = -\frac{R}{L}\,t$$

or

$$1 - \frac{R}{V_0}\,i = e^{-(R/L)t}$$


Thus  the  solution  for  i is 


$$i = \frac{V_0}{R}\left[1 - e^{-(R/L)t}\right] \tag{26-5}$$


This  equation  is  consistent  with  the  preceding  qualitative  discussion. 
When t = 0, the quantity in brackets is zero and the initial current is i =
0. After a long time, when t → ∞, the equation reduces to the expression
i = V0/R, since the second term in brackets approaches zero. This
value  of  i is  the  maximum  value  achieved  by  the  current,  and  it  is  identical 
to  the  value  obtained  by  applying  Ohm's  law  to  the  battery  and  the  resistor 
in  the  circuit,  as  though  the  inductor  were  not  present. 
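
As a rough numerical cross-check of Eq. (26-5), the following Python sketch integrates Eq. (26-3) step by step and compares the result with the closed form. The function names, the forward-Euler scheme, and the step count are illustrative assumptions rather than anything prescribed in the text; the component values are the ones used in Example 26-1.

```python
import math


def rl_current_exact(t, v0, r, l):
    """Closed form of Eq. (26-5): i(t) = (V0/R) [1 - exp(-(R/L) t)]."""
    return (v0 / r) * (1.0 - math.exp(-(r / l) * t))


def rl_current_euler(t_end, v0, r, l, steps=100_000):
    """Integrate Eq. (26-3), L di/dt = V0 - i R, from i(0) = 0.

    Forward Euler is an assumed, deliberately simple scheme.
    """
    dt = t_end / steps
    i = 0.0
    for _ in range(steps):
        i += dt * (v0 - i * r) / l
    return i


# Values taken from Example 26-1.
V0, R, L = 6.0, 3.2, 2.5
TAU_L = L / R  # time constant L/R

for multiple in (0.5, 1.0, 3.0):
    t = multiple * TAU_L
    exact = rl_current_exact(t, V0, R, L)
    numeric = rl_current_euler(t, V0, R, L)
    print(f"t = {multiple:.1f} tau_L: exact = {exact:.5f} A, Euler = {numeric:.5f} A")
```

At t equal to one time constant both values should sit near 63 percent of V0/R, in line with the discussion of the time constant.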

Between  these  extremes,  the  current  follows  the  dashed  curve  in  Fig. 
26-2,  rising  asymptotically  toward  the  value  i = V0/R.  In  particular,  at  the 
moment when t = L/R, the term in brackets in Eq. (26-5) has the value
1 - 1/e ≈ 0.63, so that the current has reached about 63 percent of its
final, maximum value. The quantity $\tau_L$, defined to be

$$\tau_L = \frac{L}{R} \tag{26-6}$$

is called the time constant of the circuit. If $\tau_L$ is large, as happens if the
inductance is large and/or the resistance is small, the current rises slowly. If
$\tau_L$ is small, as happens if the inductance is small and/or the resistance large,
the current rises rapidly. (As noted in Sec. 25-6, an inductor in an electric
circuit behaves in an “inertial” manner analogous to that of a body having
mass in a mechanical system.) Both inductors and resistors are available
over enormous ranges of values. Thus by proper choice of components,
the circuit designer can pick any desired time constant over a very wide
range. A specific case is discussed in Example 26-1.

Fig. 26-2 Behavior of the RL circuit of Fig. 26-1. When the switch is moved
initially from C to A, the current i grows from an initial value of zero to an
asymptotic value V0/R along the dashed curve. In this graph, the dimensionless
quantity iR/V0 is plotted against the dimensionless quantity t/(L/R). The curve
is thus applicable to any circuit like that of Fig. 26-1, regardless of the
particular values of V0, L, and R. When the switch is initially moved from A,
where it has been held for a time long compared to L/R, to B, the current decays
from an initial value V0/R (equal to the asymptotic value of the dashed curve)
to a final asymptotic value of zero along the solid curve.


EXAMPLE 26-1

In an RL circuit like that of Fig. 26-1, the inductance has the value L = 2.5 H, and
the resistance is R = 3.2 Ω. The battery voltage is V0 = 6.0 V. Find the maximum
current $i_{\max}$, the time constant $\tau_L$, and the time required for the current to reach the
value $i = i_{\max}/2$ from an initial value of zero.

■ Since $i_{\max} = V_0/R$, you have for the maximum current

$$i_{\max} = \frac{6.0\ \text{V}}{3.2\ \Omega} = 1.9\ \text{A}$$


The  time  constant,  given  by  Eq.  (26-6),  is 


$$\tau_L = \frac{L}{R} = \frac{2.5\ \text{H}}{3.2\ \Omega} = 0.78\ \text{s}$$


If  the  current  is  to  rise  from  an  initial  value  of  zero  to  half  its  maximum  value,  the 
term in brackets in Eq. (26-5) must rise to the value 1/2. You therefore write

$$1 - e^{-(R/L)t} = 1 - e^{-t/\tau_L} = \frac{1}{2}$$

or

$$e^{-t/\tau_L} = \frac{1}{2}$$

Taking the natural logarithm of both sides of this equation, you have

$$-\frac{t}{\tau_L} = \ln\frac{1}{2} = -\ln 2$$

so that

$$t = \tau_L \ln 2 = 0.78\ \text{s} \times \ln 2 = 0.54\ \text{s}$$
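
The arithmetic of Example 26-1 can be retraced with a few lines of Python. This is only a sketch: the variable names are chosen here for illustration, and the last line substitutes the computed time back into Eq. (26-5) as a consistency check.

```python
import math

# Given values from Example 26-1.
L = 2.5    # inductance in henries
R = 3.2    # resistance in ohms
V0 = 6.0   # battery voltage in volts

i_max = V0 / R                  # steady-state current, V0/R
tau_L = L / R                   # time constant, Eq. (26-6)
t_half = tau_L * math.log(2)    # time at which i = i_max / 2

# Substitute t_half back into Eq. (26-5) to confirm that i = i_max / 2.
i_at_t_half = i_max * (1.0 - math.exp(-t_half / tau_L))

print(f"i_max  = {i_max:.2g} A")
print(f"tau_L  = {tau_L:.2g} s")
print(f"t_half = {t_half:.2g} s")
print(f"i(t_half) / i_max = {i_at_t_half / i_max:.3f}")
```

Rounded to two significant figures, the three results agree with the values 1.9 A, 0.78 s, and 0.54 s found above.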


Fig. 26-3 The battery has been removed from the RL circuit by moving
the switch to B after it has been in position A for some time. The current
decays as explained in the text.


When  the  current  through  the  circuit  of  Fig.  26-1  has  flowed  long 
enough that $i = i_{\max} = V_0/R$, the switch is moved to position B, removing
the battery from the circuit as shown in Fig. 26-3. Because of the energy
stored in the magnetic field of the inductor, the current does not stop
flowing immediately. Rather, the decrease in current induces an emf across
the inductor. According to Lenz’ law, the sense of the emf is such as to
oppose change in the circuit. In this case, this means that the sense of the emf is
that which tends to maintain the flow of current in the clockwise sense
around the circuit. (Here again, note the “inertial” behavior of the inductor,
which tends to oppose changes in the current.)

It is somewhat tricky to keep the signs straight in applying Kirchhoff’s
loop rule to the circuit of Fig. 26-3, because an inductor can
be regarded in two ways. If it is regarded as a source of emf, the voltage
across it is Vb = -L di/dt. But if it is regarded as a passive impedance (that
is, as an “obstacle” which impedes the change of current through the
circuit), the voltage across it is V = L di/dt. In Fig. 26-1, we regarded it in
the latter fashion. In this case it may cause confusion to do so; if the
inductor is not the source of emf in the circuit, what is? One way to
circumvent this difficulty is to begin with Eq. (26-3), V0 - iR - L di/dt = 0, which
results from applying Kirchhoff’s loop rule to the circuit of Fig. 26-1,
where the inductor is regarded as a passive impedance. But the two circuits
are identical, except that the battery has been removed from the circuit of Fig. 26-3.
So in order to make Eq. (26-3) apply to the present case, we need merely
remove from it the term V0, which represents the contribution of the battery
to the circuit. This gives us

$$-iR - L\,\frac{di}{dt} = 0 \tag{26-7a}$$

To  interpret  this  equation,  we  write  it  in  the  mathematically  identical  form 

$$-iR + \left(-L\,\frac{di}{dt}\right) = 0 \tag{26-7b}$$

Here,  as  in  Eq.  (26-3),  the  iR  drop  across  the  resistor  is  given  a negative  sign 
in  the  loop  equation  because  it  represents  a voltage  drop  across  the  resistor 
in the sense specified as positive. The term (-L di/dt) has a positive value
because L di/dt has a negative value. Thus in Eq. (26-7b) this term does not
represent a voltage “drop,” but a voltage increase. The positive sign
preceding the term (-L di/dt) is therefore the correct one.

The  solution  to  the  differential  equation  (26-7a)  is  carried  out  similarly 
to  that  of  Eq.  (26-3).  Again,  we  gather  all  terms  involving  i on  one  side  of 
the  equation  and  all  terms  involving  t on  the  other.  This  yields  the  form 

$$\frac{di}{i} = -\frac{R}{L}\,dt \tag{26-8}$$

Here  again,  we  must  specify  the  initial  conditions.  Let  t = 0 at  the  moment 
when  the  switch  is  moved  to  position  B.  At  that  initial  moment,  the  current 
has  its  steady-state  value  i = V0/R.  Integrating  from  these  initial  values  of 
current and time to final values $i_f$ and $t_f$, we have

$$\int_{V_0/R}^{i_f} \frac{di}{i} = -\frac{R}{L}\int_0^{t_f} dt$$




Fig. 26-4 A large, steady current $i_0 = V_0/R$ is flowing through the circuit shown when a
sudden break occurs at some point X in the circuit. As a result, the current falls rapidly, in a
manner whose details depend on how the circuit is broken. A general idea of what happens
can be based on the assumption that the fall of current is represented approximately by the
mathematical expression $i = i_0 e^{-t/\delta t} = (V_0/R)e^{-t/\delta t}$. In this expression, $\delta t$ roughly represents the
time over which the break takes place, perhaps of the order of milliseconds. (The reason for
using an exponential function is that it makes the subsequent calculations easy.) The voltage
drop $V_R$ across the resistor can be calculated by substituting the expression for i into Ohm’s
law. This gives $V_R = iR = V_0 e^{-t/\delta t}$, a quantity which decays smoothly to zero from its original
value $V_0$. But the voltage drop across the inductor must be calculated from Eq. (26-1a), V =
L di/dt. The quantity di/dt is found by differentiating the assumed expression for the current i.
We have $di/dt = (V_0/R)(-1/\delta t)e^{-t/\delta t}$, so the voltage drop across the inductor is $V =
V_0(L/R)(-1/\delta t)e^{-t/\delta t}$. By using the definition $\tau_L = L/R$ given by Eq. (26-6), this equation
becomes $V = -V_0(\tau_L/\delta t)e^{-t/\delta t}$. In the present case, $\tau_L$ is large because the circuit contains a large
inductance L but a small resistance R. In particular, $\tau_L$ is much greater than the “break time”
$\delta t$, and $\tau_L/\delta t$ is much greater than 1. Just after the break occurs, when $t \ll \delta t$, the exponential
term in the equation for V is only slightly less than 1, and the voltage drop across the inductor
is approximately $V = -(\tau_L/\delta t)V_0$, which can be much greater in magnitude than $V_0$, the rated
voltage of the circuit. This short-lived induced voltage can lead to insulation breakdown and
similar failures which render the circuit incapable of carrying its rated voltage $V_0$. The
damage may take place at the location of the break or elsewhere in the circuit.


Carrying out the integrations yields

$$\ln\left(\frac{i_f}{V_0/R}\right) = -\frac{R}{L}\,t_f$$

This can be rewritten in the form

$$i = \frac{V_0}{R}\,e^{-(R/L)t} \tag{26-9}$$

where we have again dropped the subscript f because $t_f$ can be chosen to be
any time later than t = 0. The current now decays exponentially, reaching
1/e (or about 37 percent) of its initial value in the time $\tau_L = L/R$. The decay
curve is plotted by the solid line in Fig. 26-2.

In  circuits  containing  large  inductances  but  having  low  resistance  (such  as 
power  lines  serving  electric  motors)  there  can  be  considerable  danger  in  a sudden 
accidental  breaking  of  the  circuit,  which  is  shown  schematically  in  Fig.  26-4.  The 
analysis  in  the  caption  to  the  figure  shows  that  the  sudden  interruption  of  the  cur- 
rent leads  to  the  appearance  of  a very  large  induced  emf  across  the  inductor.  This 
emf  can  be  large  enough  to  lead  to  the  establishment  of  an  electric  arc  (a  sustained 
spark)  over  the  break  in  the  circuit,  or  to  insulation  failure  and  subsequent  arcing 
elsewhere  in  the  system.  To  make  matters  worse,  such  an  arc,  once  established, 
can  be  sustained  by  the  much  smaller  emf  furnished  by  the  source  V0,  and  can 
lead  to  considerable  damage.  To  avoid  arcing  when  current  is  shut  off  intention- 
ally, heavy-duty  circuit  breakers  in  power  lines  often  use  a spray  of  oil  between 
the  rapidly-opening  switch  contacts  to  “quench”  the  spark  which  tends  to  form  as 
the  contacts  open. 
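
To get a feeling for the size of the induced voltage, the estimate from the caption of Fig. 26-4, V = -(L/R)(V0/δt) exp(-t/δt), can be evaluated numerically. The Python sketch below does this for component values that are purely assumed for illustration (a 240-V supply, 0.50 Ω, 1.0 H, and a 1-ms break time); none of these numbers comes from the text.

```python
import math


def inductor_voltage_after_break(t, v0, r, l, break_time):
    """Voltage drop across the inductor under the exponential-break model
    of the Fig. 26-4 caption: V = -V0 (tau_L / break_time) exp(-t / break_time).
    """
    tau_l = l / r  # time constant of the unbroken circuit, Eq. (26-6)
    return -v0 * (tau_l / break_time) * math.exp(-t / break_time)


# Hypothetical circuit values, chosen only to illustrate the scale.
V0 = 240.0          # supply voltage in volts
R = 0.50            # circuit resistance in ohms
L = 1.0             # circuit inductance in henries
BREAK_TIME = 1e-3   # duration of the break in seconds

peak = inductor_voltage_after_break(0.0, V0, R, L, BREAK_TIME)
print(f"tau_L / break time = {(L / R) / BREAK_TIME:.0f}")
print(f"peak inductor voltage = {peak:.3g} V (rated voltage {V0:.0f} V)")
```

With these assumed values the time constant exceeds the break time by a factor of 2000, so the model predicts an induced voltage of several hundred kilovolts in magnitude from a 240-V source.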


26-3 THE RC CIRCUIT

The circuit shown in Fig. 26-5 is the simplest form of the so-called RC
circuit. It consists of a source of constant voltage V0, an ideal resistor R, and
an ideal capacitor C in series. (The capacitor is discussed in Secs. 21-6 and
21-7. As noted there, the behavior of a capacitor in an electric circuit is
analogous to that of a spring in a mechanical system.) Suppose that the
switch has been set at position B for some time, so that the capacitor is
completely discharged and no current is flowing. At the instant when the switch
is first moved to position A, a current i = V0/R flows through the circuit. As
charge flows onto the capacitor plates, however, an opposing voltage
appears across the capacitor and the current diminishes. Eventually, the
voltage across the capacitor becomes equal in magnitude to that of the source,
and the current falls to zero.

Fig. 26-5 A series RC circuit.

The  quantitative  analysis  of  the  RC  circuit  is  carried  out  in  a manner 
not very different from that for the RL circuit. Kirchhoff’s loop rule is used
to find the sum of the voltage changes around the circuit at a moment when
the  current  has  the  value  i.  Applying  the  loop  rule  around  the  circuit  in  the 
clockwise  sense  gives 


$$V_0 - iR - \frac{q}{C} = 0 \tag{26-10}$$


In this equation, V0 is the voltage increase in passing from the negative to
the positive pole of the battery, iR is the voltage drop across the resistor,
and q/C is the voltage drop across the capacitor, given by Eq. (26-1c), with q
being  the  charge  on  the  left  plate. 

Equation (26-10) can be solved for q, if this quantity is needed, by
making the substitution i = dq/dt. However, interest in a circuit usually
centers on the easily measurable quantity i. In order to solve Eq. (26-10)
directly in terms of i rather than q, we use the trick of differentiating both
sides of the equation with respect to time. Since V0 is a constant, we obtain


$$\frac{d}{dt}\left(V_0 - iR - \frac{q}{C}\right) = -R\,\frac{di}{dt} - \frac{1}{C}\,i = 0$$

or

$$R\,\frac{di}{dt} = -\frac{1}{C}\,i \tag{26-11}$$


Dividing  both  sides  of  this  equation  by  iR  and  multiplying  by  dt  yields 


$$\frac{di}{i} = -\frac{1}{RC}\,dt \tag{26-12}$$


At  the  initial  time  t = 0,  the  current  is  i = V0/R.  Integrating  from  these  ini- 
tial values  to  the  time  tf  and  the  corresponding  current  if,  we  obtain 

$$\int_{V_0/R}^{i_f} \frac{di}{i} = -\frac{1}{RC}\int_0^{t_f} dt$$

Carrying  out  the  integrations,  we  have 

$$\ln\left(\frac{i_f}{V_0/R}\right) = -\frac{1}{RC}\,t_f$$

Again dropping the subscript f and solving for i, we obtain the result

$$i = \frac{V_0}{R}\,e^{-t/RC} \tag{26-13}$$


This  exponential  decay  of  the  current  is  illustrated  graphically  in  Fig.  26-6. 
The time constant $\tau_C$ for an RC circuit is defined to be

$$\tau_C = RC \tag{26-14}$$
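
One way to check Eq. (26-13) is to substitute it back into the loop equation (26-10). The Python sketch below does so numerically. It relies on the capacitor charge q(t) = C V0 [1 - exp(-t/RC)], which is not written out in the text but follows from integrating Eq. (26-13) from 0 to t with q(0) = 0. The function names and component values are assumptions made for illustration.

```python
import math


def rc_charging_current(t, v0, r, c):
    """Eq. (26-13): i(t) = (V0/R) exp(-t / RC)."""
    return (v0 / r) * math.exp(-t / (r * c))


def rc_capacitor_charge(t, v0, r, c):
    """Charge from integrating Eq. (26-13) over [0, t], with q(0) = 0."""
    return c * v0 * (1.0 - math.exp(-t / (r * c)))


def loop_rule_residual(t, v0, r, c):
    """Left side of Eq. (26-10), V0 - iR - q/C, which should be zero."""
    i = rc_charging_current(t, v0, r, c)
    q = rc_capacitor_charge(t, v0, r, c)
    return v0 - i * r - q / c


# Hypothetical component values.
V0, R, C = 5.0, 1.0e3, 1.0e-6
TAU_C = R * C  # time constant, Eq. (26-14)

for multiple in (0.0, 1.0, 5.0):
    t = multiple * TAU_C
    fraction = rc_charging_current(t, V0, R, C) / (V0 / R)
    residual = loop_rule_residual(t, V0, R, C)
    print(f"t = {multiple:.0f} tau_C: i/i0 = {fraction:.4f}, residual = {residual:.1e} V")
```

The residual should vanish to within rounding error at every instant, and the current should have dropped to 1/e of its initial value after one time constant.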


Fig. 26-6 Behavior of the RC circuit of Fig. 26-5. The
current i falls off from its initial magnitude V0/R to an
asymptotic value of zero. In this graph, the dimensionless
quantity iR/V0 has been plotted against the dimensionless
quantity t/RC; compare with Fig. 26-2.

Now let us consider what happens when the capacitor in Fig. 26-5 is
charged to the potential difference V0 and the switch is moved to position
B. The battery is thus removed from the circuit, and the capacitor
discharges by means of a flow of charge from one plate to the other through
the resistor R. We can again apply Kirchhoff’s loop rule. Proceeding in a
manner analogous to that employed in Sec. 26-2, we note that the circuit of
Fig. 26-5 with the switch in position B is identical to the circuit with the
switch in position A, except for the absence of the battery. So in order to
make Eq. (26-10) apply to the present case, we need merely remove from it
the term V0, which represents the contribution of the battery to the circuit.
This gives us


$$-iR - \frac{q}{C} = 0 \tag{26-15}$$


Differentiating both sides of this equation with respect to time, as we did
with Eq. (26-10), yields the result -R di/dt - (1/C)i = 0, or

$$R\,\frac{di}{dt} = -\frac{1}{C}\,i$$

This is identical to Eq. (26-11), and the same rearrangement of terms
carried out on this differential equation leads to the form

$$\frac{di}{i} = -\frac{1}{RC}\,dt$$
which  is  Eq.  (26-12).  The  integration  is  carried  out  in  the  same  way  as  be- 
fore, except  that  the  initial  conditions  are  different.  The  capacitor,  rather 
than  the  battery,  is  now  the  source  of  the  voltage  which  drives  the  current. 
While it is initially charged to a voltage of magnitude V0 equal to the voltage
of  the  battery,  the  sense  of  the  capacitor  voltage  is  such  that  it  will  drive  the 
current  in  a sense  opposite  to  the  sense  of  the  current  when  the  switch  in 
Fig.  26-5  is  at  position  A and  the  capacitor  is  being  charged.  (To  see  this, 
compare  the  signs  of  the  charges  on  the  capacitor  plates  with  the  signs  of 
the  battery  terminals  in  the  figure.) 

This reversal of sense is denoted by writing the initial capacitor voltage
as -V0. Then the initial current, when the switch is moved to position B,
has the value i = -V0/R. That is, the sense of the capacitor voltage and the
sense of the current are opposite those we chose as positive when the switch
was in position A. Choosing t = 0 to represent the time when the switch is
moved to position B, we integrate Eq. (26-12) from the initial values to the
time $t_f$ and the corresponding current $i_f$, and we obtain


$$\int_{-V_0/R}^{i_f} \frac{di}{i} = -\frac{1}{RC}\int_0^{t_f} dt$$

Carrying out the integrations, we have

$$\ln\left(\frac{i_f}{-V_0/R}\right) = -\frac{1}{RC}\,t_f$$


Again dropping the subscript f and solving for i, we obtain the result

$$i = -\frac{V_0}{R}\,e^{-t/RC} \tag{26-16}$$

where  the  negative  sign  denotes  the  fact  that  this  discharging  current  flows 
in  a sense  opposite  to  the  charging  current  of  Eq.  (26-13). 


Up  to  this  point,  we  have  treated  currents  as  quantities  having  magni- 
tude only,  and  we  have  denoted  the  sense  (when  it  was  necessary  to  do  so) 
by  other  means.  But  in  dealing  with  alternating  currents,  there  is  a re- 
peated reversal  of  the  sense  of  current  flow.  It  is  therefore  convenient  to 
treat  currents  as  signed  scalars,  and  we  often  do  so  from  now  on.  However, 
the  sign  of  the  quantity  i does  not  denote  a direction,  as  is  the  case  when  a 
signed  scalar  is  used  to  denote  a one-dimensional  vector.  Rather,  the  sign 
denotes  a sense  of  current  flow.  The  choice  of  the  positive  sense  is  arbitrary 
and  is  made  in  a manner  convenient  to  each  particular  case.  As  always, 
however,  it  is  necessary  to  adhere  consistently  to  the  convention,  once  it  is 
chosen  for  that  case. 


The  mathematical  analysis  developed  in  this  section  is  applied  to  a par- 
ticular RC  circuit  in  Example  26-2. 


EXAMPLE  26-2 

In an RC circuit like that shown in Fig. 26-5 with the switch in position A, the
resistance is R = 32 kΩ (= 32 × 10^3 Ω), and the capacitance is C = 3.5 μF (= 3.5 ×
10^-6 F). The battery voltage is V0 = 6.0 V. Find the initial current i0, the time
constant $\tau_C$, and the time required for the current to decrease to the value i = i0/2
from its initial value.

■ From  Eq.  (26-13)  you  find  that  the  current  at  t = 0 s is  given  by  i0  = V0/R.  You 
thus  have 


$$i_0 = \frac{6.0\ \text{V}}{3.2 \times 10^{4}\ \Omega} = 1.9 \times 10^{-4}\ \text{A}$$


The time constant is defined by Eq. (26-14) to be $\tau_C = RC$, so you have

$$\tau_C = 3.2 \times 10^{4}\ \Omega \times 3.5 \times 10^{-6}\ \text{F} = 0.11\ \text{s}$$

If the current is to fall from an initial value i0 to half that value, the exponential
term in Eq. (26-13), $i = i_0 e^{-t/RC}$, must fall to the value 1/2. You therefore write

$$e^{-t/RC} = \frac{1}{2}$$

Taking the natural logarithm of both sides of this equation gives you -t/RC = ln(1/2),
or t = RC ln 2 = $\tau_C$ ln 2. You thus find the numerical value

$$t = \tau_C \ln 2 = 0.11\ \text{s} \times \ln 2 = 7.6 \times 10^{-2}\ \text{s}$$
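
The same steps can be written as a short Python sketch that also handles fractions other than one-half. The helper name `time_to_fall_to` is hypothetical. Because the sketch carries the unrounded time constant RC = 0.112 s through the calculation, it gives a half-time of about 0.078 s, slightly larger than the 7.6 × 10^-2 s obtained above from the rounded value 0.11 s.

```python
import math


def time_to_fall_to(fraction, tau):
    """Time for exp(-t / tau) to fall to `fraction`: t = tau ln(1 / fraction)."""
    return tau * math.log(1.0 / fraction)


# Given values from Example 26-2.
R = 32e3     # resistance in ohms
C = 3.5e-6   # capacitance in farads
V0 = 6.0     # battery voltage in volts

i0 = V0 / R                          # initial current, from Eq. (26-13) at t = 0
tau_C = R * C                        # time constant, Eq. (26-14)
t_half = time_to_fall_to(0.5, tau_C)

print(f"i0     = {i0:.2g} A")
print(f"tau_C  = {tau_C:.3g} s")
print(f"t_half = {t_half:.3g} s")
```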




26-4  THE  LC  CIRCUIT 


Fig.  26-7  An  LC  circuit. 


The circuit shown in Fig. 26-7 is the simplest form of what is called an LC
circuit. After the switch has been in position A for some time, the ideal
capacitor C is fully charged by the battery with its left-hand plate positive,
and no current is flowing. Then the switch is moved to position B. The
battery is thus disconnected, and the capacitor begins to discharge by driving
current through the ideal inductor L in the counterclockwise sense. The
condition immediately after the switch is moved to position B is that the
magnitude |i| of the current is essentially zero, but the magnitude |di/dt| of
its rate of change is relatively large. That is, the magnitude of the current is
building up rapidly from its initial value of zero. The changing current in
the inductor causes an induced voltage to appear across the inductor
which, according to Lenz’ law, is required to have the sense that opposes
the change in current. According to Faraday’s law, the magnitude of the
induced voltage is L|di/dt|. Since the inductor is connected across the
capacitor, this quantity must be equal to the magnitude |V| of the voltage between
the capacitor plates. Thus we must have L|di/dt| = |V|, or |di/dt| = |V|/L.
But because current is flowing, the capacitor must discharge, and so the
magnitude |V| of the voltage between its plates must tend toward zero.
Thus |di/dt| also tends toward zero. In other words, the current continues
to grow in magnitude, but at a decreasing rate. When |V| = 0, the equation
|di/dt| = |V|/L demands that |di/dt| = 0 as well. That is, the rate of change
of the current is zero, and the current itself therefore attains its maximum
magnitude. This is not what you might have expected in advance: the
current has its greatest magnitude when the voltage across the capacitor is
zero!

Further  flow  of  current  results  in  further  transfer  of  charge  between 
the  capacitor  plates,  with  the  right-hand  plate  now  becoming  positive.  Thus 
the  voltage  across  the  capacitor  now  tends  to  oppose  further  flow  of  cur- 
rent, and  the  magnitude  of  the  current  begins  to  diminish.  As  a conse- 
quence, Lenz’  law  now  demands  that  the  sense  of  the  voltage  induced 
across  the  inductor  be  such  as  to  oppose  the  diminution  of  current — that 
is,  the  induced  voltage  tends  to  drive  the  current  in  the  same  counterclock- 
wise sense  as  before.  But  the  magnitude  of  the  current  decreases,  neverthe- 
less, as  the  capacitor  charges.  When  the  current  reaches  zero,  the  capacitor 
is  fully  charged  with  its  left-hand  plate  negative.  At  this  instant  the  situation 
is just as it was immediately after the switch was moved to position B, except
that the voltage between the capacitor plates is reversed. Current then
begins to flow, but now in the opposite, clockwise, sense. This initiates a
second  half-cycle,  which  is  just  the  reverse  of  the  first  one.  The  second 
half-cycle  ends  when  the  system  has  returned  to  the  same  state  as  at  the 
beginning  of  the  first  half-cycle.  The  system  then  executes  the  third  half- 
cycle, which  is  just  like  the  first  one,  and  then  the  fourth,  which  is  just  like 
the  second,  and  so  on. 

The  behavior  of  the  LC  circuit  is  thus  oscillatory.  It  is  worthwhile  to  con- 
sider the  oscillation  from  an  energy  point  of  view.  Initially,  the  capacitor  is 
charged and, according to Eq. (21-54), the energy $U_e = CV_0^2/2$ is stored in
the electric field between its plates. As current starts to flow, the capacitor
begins to discharge, and its energy diminishes. But the current flow
through the inductor implies the existence of a magnetic field in and
around it and, according to Eq. (25-40b), there is a magnetic energy $U_m =
Li^2/2$. As described in the previous paragraph, the current reaches its
extremal value $\pm i_{\max}$ just when the capacitor is completely discharged. At
that instant, all the energy in the circuit is in the form of magnetic energy
$U_m$. And when the capacitor is charged to the extremal voltage $\pm V_0$, the
current is zero, so that all the energy in the circuit is in the form of electric
energy $U_e$. In general, there is a continual interchange of energy between
electric and magnetic forms as the magnetic field grows at the expense of
the electric field, and vice versa.

There  is  a strong  analogy  to  the  ideal,  frictionless  system  consisting  of 
a body  of  mass  m attached  to  a spring  having  force  constant  k,  which  is  dis- 
cussed in  detail  in  Chap.  6.  Specifically,  the  inductance  L is  analogous  to  the 
mass  m,  while  the  reciprocal  of  the  capacitance,  1/C,  is  analogous  to  the 
force  constant  k.  While  the  LC  circuit  and  the  body-and-spring  system  are 
not  identical  physically,  the  physical  analogy  is  striking  when  seen  in  terms 
of  energy  interchange.  In  the  body-and-spring  system,  the  velocity  v of  the 
body increases in magnitude, and it gains kinetic energy K = mv^2/2, as the
displacement x of the end of the spring from its equilibrium position
decreases in magnitude. As this happens, the system loses potential energy
U = kx^2/2. The opposite happens as x increases in magnitude.

The  mathematical  similarity  of  the  LC  circuit  to  the  body-and-spring 
system  is  even  closer  than  the  physical  similarity.  Indeed,  the  two  systems 
are  mathematically  equivalent.  This  equivalence  will  be  a great  advantage  as 
we  develop  a mathematical  description  of  the  LC  circuit.  We  begin  this 
development  by  writing  Kirchhoff’s  loop  rule  for  the  LC  circuit.  As  we  did 
in studying the RL circuit in Sec. 26-2, first we write the equation for the
circuit  containing  the  battery — that  is,  the  circuit  of  Fig.  26-7  with  the 
switch  in  position  A.  Beginning  at  the  switch  and  going  around  the  loop  in 
the  clockwise  sense,  we  obtain 

$$V_0 - L\,\frac{di}{dt} - \frac{q}{C} = 0 \tag{26-17}$$

The first term on the left side of this equation represents the voltage
increase in passing through the battery from the negative to the positive
terminal. The second term represents the voltage drop across the inductor,
and the third represents the voltage drop across the capacitor when its
left-hand plate holds a charge q and its right-hand plate holds a charge -q.
[Note that when an increasing current is flowing in the sense shown in Fig.
26-7, i and di/dt both have positive values, and the left-hand plate of the
capacitor is acquiring a positive charge. Thus the signs of the three terms
on the left side of Eq. (26-17) are mutually consistent.]

It  is  possible  to  solve  Eq.  (26-17).  But  what  is  of  interest  here  is  the 
behavior  of  the  circuit  after  the  capacitor  has  been  charged  and  the  switch 
moved to position B, removing the battery from the circuit. (Similarly, for
the  body-and-spring  system  we  often  wish  to  study  the  motion  of  the 
system  after  an  externally  applied  force  has  been  used  to  produce  an  initial 
displacement  of  the  system  from  its  equilibrium  position  and  the  system 
has  been  released  to  oscillate  freely.)  Just  as  in  Sec.  26-2,  we  argue  that  the 
removal  of  the  battery  from  the  LC  circuit  can  be  described  mathematically 
simply  by  removing  the  corresponding  term  V0  from  Eq.  (26-17).  Thus  the 
circuit of Fig. 26-7, with the switch in position B, is described by the
equation

$$-L\,\frac{di}{dt} - \frac{q}{C} = 0 \tag{26-18}$$




Making  the  substitution  i = dq/dt  in  Eq.  (26-18)  yields 


$$-L\,\frac{d^2q}{dt^2} - \frac{q}{C} = 0$$

or

$$\frac{d^2q}{dt^2} = -\frac{1}{LC}\,q \tag{26-19a}$$


This  equation  can  be  expressed  in  terms  of  the  current  i by  differentiating 
both  sides  with  respect  to  time,  to  obtain 


$$\frac{d^2i}{dt^2} = -\frac{1}{LC}\,i \tag{26-19b}$$


Both Eqs. (26-19a) and (26-19b) are the mathematical equivalent of Eq.
(6-16), which governs the motion of a body of mass m attached to a spring
of force constant k. Equation (6-16) was derived by applying Newton’s
second law for the one-dimensional case, F = m d^2x/dt^2, to the body-and-
spring system in which the spring obeys Hooke’s law, F = -kx, to obtain
the form

$$m\,\frac{d^2x}{dt^2} = -kx \tag{26-20a}$$

or

$$\frac{d^2x}{dt^2} = -\frac{k}{m}\,x \tag{26-20b}$$


Let  us  make  evident  the  physical  analogy  between  Eq.  (26-20a)  and  Eq. 
(26-19a).  To  do  so,  we  multiply  both  sides  of  Eq.  (26-19a)  by  L and  write  the 
resulting  form  together  with  Eq.  (26-20a)  for  comparison: 

$$m\,\frac{d^2x}{dt^2} = -kx \qquad\qquad L\,\frac{d^2q}{dt^2} = -\frac{1}{C}\,q \tag{26-21}$$


Here  the  mathematical  equivalence  between  m and  L,  and  between  k and 
1/C,  is  especially  striking.  But  the  parallelism  goes  further  than  this.  For 
each  quantity  having  physical  meaning  for  the  oscillating  body-and-spring 
system — or,  indeed,  any  mechanical  system  containing  bodies  and 
springs — there  is  an  analogous  quantity  having  physical  significance  for 
the  LC  oscillator — or,  indeed,  any  electric  circuit  containing  inductors  and 
capacitors.  These  analogies,  and  others  to  be  discussed,  are  listed  in  Table 
26-1. 


The frequency of oscillation of the LC circuit can be determined
without actually solving Eq. (26-19a) or (26-19b), because the mathematically
equivalent Eq. (26-20b) has already been solved in Chap. 6. Here we
write $\omega_0$ for the angular frequency of the harmonic oscillations of the
body-and-spring system. Equation (6-23) gives its value to be


$$\omega_0 = \sqrt{\frac{k}{m}} \tag{26-22}$$


For the LC circuit, the quantity mathematically equivalent to k/m is 1/LC, as
you can see by comparing Eqs. (26-19a) and (26-19b) with Eq. (26-20b).
Thus we have for the angular frequency $\omega_0$ of the oscillatory LC circuit


$$\omega_0 = \frac{1}{\sqrt{LC}} \tag{26-23}$$




Table 26-1

Analogy between Ideal Electric Circuits and Mechanical Systems

Electric circuit                                   Mechanical system
-------------------------------------------------  ------------------------------------------
Capacitor charge $q$                               Displacement $x$
Current $i = dq/dt$                                Velocity $v = dx/dt$
Rate of change of current $di/dt = d^2q/dt^2$      Acceleration $a = d^2x/dt^2$
Inductance $L$                                     Mass $m$
Reciprocal of capacitance $1/C$                    Spring constant $k$
Resistance $R$                                     Frictional drag coefficient $r$
$Ri$ "drop" $-R\,dq/dt$                            Frictional force $-r\,dx/dt$
Q factor                                           Q factor
Magnetic potential energy $U_m = Li^2/2$           Kinetic energy of body $K = mv^2/2$
Electric potential energy $U_e = q^2/2C$           Potential energy of spring $U = kx^2/2$
Angular frequency $\omega_0 = 1/\sqrt{LC}$         Angular frequency $\omega_0 = \sqrt{k/m}$
Maximum capacitor charge $q_0$                     Amplitude $A$


The general solution of Eq. (26-19a), $d^2q/dt^2 = -(1/LC)q$, can be
found by analogy with Eq. (6-24), which is the solution to the equation displayed
above as Eq. (26-20b). Equation (6-24) gives the displacement of a
mechanical oscillator as a function of time to be

$$x = A\cos(\omega_0 t + \delta) \quad\text{where}\quad \omega_0 = \sqrt{\frac{k}{m}} \tag{26-24}$$


By  analogy,  the  solution  for  Eq.  (26-19a)  has  the  form 


$$q = q_0\cos(\omega_0 t + \delta) \quad\text{where}\quad \omega_0 = \frac{1}{\sqrt{LC}} \tag{26-25}$$


Here q is the charge on the left-hand plate of the capacitor (which can have
a positive or a negative value), $q_0$ is the magnitude of the maximum charge on
the capacitor, and $\delta$ is an arbitrary phase constant. The phase constant is determined
by the choice made for the time t = 0, as shown in Fig. 26-8, which is
a graphic display of Eq. (26-25). In describing the startup of the LC oscillator
of Fig. 26-7, we have set t = 0 when the left-hand plate of the capacitor
has a charge $q = q_0$. Thus for this particular initial condition we have
$\delta = 0$, and Eq. (26-25) becomes

$$q = q_0\cos(\omega_0 t) \tag{26-26}$$


Fig. 26-8 Graph of the charge q on the left-hand
plate of the capacitor in Fig. 26-7 as a function of
$\omega_0 t$, when the switch is in position B. The maximum
charge on the capacitor is $q_0$. The period of
oscillation is T. The phase constant $\delta$ depends on
the choice made for the time t = 0. Compare with
Fig. 6-15, which shows graphically the general solution
for the analogous body-and-spring mechanical
oscillator system.


The current can be found from the relation i = dq/dt. The value of i
can  be  of  either  sign,  with  positive  values  corresponding  to  the  clockwise 
sense  in  Fig.  26-7.  Differentiation  of  Eq.  (26-26)  with  respect  to  time  gives 


$$i = -\omega_0 q_0 \sin(\omega_0 t) = -\frac{q_0}{\sqrt{LC}}\,\sin(\omega_0 t) \tag{26-27}$$


Comparison  of  Eqs.  (26-26)  and  (26-27)  shows  that  the  charge  on  the 
capacitor  and  the  current  flowing  through  the  system  are  90°  out  of  phase. 
The  current  reaches  its  maximum  magnitude  just  when  the  capacitor  is 
completely discharged; note that when $\sin(\omega_0 t) = 1$, $\cos(\omega_0 t) = 0$. That
maximum  magnitude  is 

$$i_{\max} = \omega_0 q_0 = \frac{q_0}{\sqrt{LC}} \tag{26-28}$$


When $i = \pm i_{\max}$, the energy $Li^2/2$ stored in the magnetic field of the inductor is

$$\frac{L i_{\max}^2}{2} = \frac{L}{2}\,\frac{q_0^2}{LC} = \frac{q_0^2}{2C}$$


This is equal to the value attained by the energy $q^2/2C$ stored in
the electric field of the capacitor when $q = \pm q_0$. Again this is analogous to
the situation in the body-and-spring system. When the body attains an extreme
displacement $\pm x_{\max}$, it is at rest and the energy of the system is the
potential energy of the spring, $kx_{\max}^2/2$. When the spring is completely relaxed,
the body, moving in either the positive or the negative direction, has
an extreme velocity $\pm v_{\max}$, and the energy of the system is the kinetic
energy $mv_{\max}^2/2$.
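The energy bookkeeping described above can be checked numerically. The following is a minimal sketch, assuming illustrative component values (L = 0.100 H, C = 10.0 µF, q0 = 2.50 × 10⁻³ C) chosen only for this check; the function names are hypothetical. It evaluates Eqs. (26-26) and (26-27) at several instants within one period and sums the magnetic and electric energies.

```python
import math

# Illustrative component values (assumptions chosen for this sketch).
L = 0.100      # inductance, in henries
C = 10.0e-6    # capacitance, in farads
q0 = 2.50e-3   # maximum capacitor charge, in coulombs

omega0 = 1.0 / math.sqrt(L * C)   # Eq. (26-23)

def charge(t):
    """Capacitor charge from Eq. (26-26)."""
    return q0 * math.cos(omega0 * t)

def current(t):
    """Current from Eq. (26-27)."""
    return -omega0 * q0 * math.sin(omega0 * t)

def total_energy(t):
    """Magnetic energy L i^2 / 2 plus electric energy q^2 / (2C)."""
    return 0.5 * L * current(t) ** 2 + charge(t) ** 2 / (2.0 * C)

period = 2.0 * math.pi / omega0
# Sample the total energy at nine instants spread over one period.
energies = [total_energy(k * period / 8.0) for k in range(9)]
i_max = omega0 * q0               # Eq. (26-28)
```

Under these assumptions every entry of `energies` should equal $q_0^2/2C$ to within rounding, which is the statement that the energy merely shuttles between the inductor and the capacitor.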


The correctness of the solution to Eqs. (26-19a) and (26-19b) for the LC circuit,
given by Eqs. (26-26) and (26-27), can be checked by differentiating the latter once
again with respect to time, to obtain

$$\frac{di}{dt} = -\omega_0^2 q_0 \cos(\omega_0 t) \tag{26-29}$$

Squaring Eq. (26-23) gives $\omega_0^2 = 1/LC$. Substituting this value into Eq. (26-29)
yields

$$\frac{di}{dt} = -\frac{1}{LC}\,q_0 \cos(\omega_0 t)$$

Using Eq. (26-26), we have

$$\frac{di}{dt} = -\frac{q}{LC}$$


Multiplying this equation on both sides by L gives $L\,di/dt = -q/C$. This can
immediately be rewritten as

$$-L\frac{di}{dt} - \frac{q}{C} = 0$$

which is Eq. (26-18), Kirchhoff's loop rule applied to the LC circuit. Thus at any
moment Eqs. (26-26) and (26-27) satisfy the loop rule.


The mathematical analysis developed in this section is applied to a particular
LC circuit in Example 26-3.


EXAMPLE 26-3

A 3.5-µF (= 3.5 × 10⁻⁶ F) capacitor is charged to a potential difference $V_0 = 250$ V
and connected across an inductor having inductance L = 2.5 mH
(= 2.5 × 10⁻³ H). Find the magnitude $q_0$ of the initial electric charge on the capacitor,
the total energy E stored in the system, the angular frequency of oscillation $\omega_0$,
the corresponding frequency $\nu_0$, and the maximum current $i_{\max}$ which flows
through the system.

■ You have C = 3.5 × 10⁻⁶ F, and the charge is given by the expression $q_0 = CV_0$.
Hence you obtain

$$q_0 = 3.5 \times 10^{-6}\ \text{F} \times 250\ \text{V} = 8.8 \times 10^{-4}\ \text{C}$$

The energy in the system is always equal to the energy $CV^2/2$ stored in the
capacitor when V has its initial value $V_0$. You thus have

$$E = \frac{CV_0^2}{2} = \frac{3.5 \times 10^{-6}\ \text{F} \times (250\ \text{V})^2}{2} = 0.11\ \text{J}$$

The angular frequency of oscillation is

$$\omega_0 = \frac{1}{\sqrt{LC}} = (2.5 \times 10^{-3}\ \text{H} \times 3.5 \times 10^{-6}\ \text{F})^{-1/2} = 1.1 \times 10^{4}\ \text{rad/s}$$

The frequency $\nu_0$ is related to $\omega_0$ by the expression $\nu_0 = \omega_0/2\pi$. So you have

$$\nu_0 = \frac{1}{2\pi} \times 1.1 \times 10^{4}\ \text{rad/s} = 1.7 \times 10^{3}\ \text{Hz}$$

To find the maximum current, you can use Eq. (26-28) to obtain

$$i_{\max} = \omega_0 q_0 = \omega_0 C V_0$$

Inserting numerical values gives

$$i_{\max} = 1.1 \times 10^{4}\ \text{rad/s} \times 3.5 \times 10^{-6}\ \text{F} \times 250\ \text{V} = 9.6\ \text{A}$$

Even though the charge is quite small, the maximum current $i_{\max}$ is appreciable.
This is because the charge is transferred back and forth quite rapidly; it
moves from one plate of the capacitor to the other every half-cycle, that is,
2 × 1700 = 3400 times every second.


26-5 THE LRC CIRCUIT

If a resistor R is added to an LC circuit, as in Fig. 26-9, it becomes an LRC
circuit. Just as the mathematics of an ideal harmonic oscillator is directly
adaptable to the study of the LC circuit, the mathematics of the damped oscillator,
discussed in Sec. 6-6, applies directly to the LRC circuit. The resistor
supplies the "friction" which removes energy from the circuit as time
passes. An LRC circuit is physically analogous to the damped oscillator. Thus
the equations governing the two systems are mathematically equivalent.




Fig.  26-9  A series  LRC  circuit. 


Almost every equation in Sec. 6-6 can be “translated” into terms appropriate
to the LRC circuit by substituting quantities according to Table 26-1.
The  pair  of  “frictional”  terms  in  that  table  are  the  only  ones  needed  beyond 
those  already  discussed  in  connection  with  the  undamped  systems  in  Sec. 
26-4.  In  order  to  make  the  physical  situation  clear,  however,  the  equations 
describing  the  LRC  circuit  will  be  derived  from  Kirchhoff’s  loop  rule.  You 
will  see  how  close  the  analogy  is  to  the  damped  mechanical  oscillator 
system. 

Just as we did for the LC circuit, we begin our analysis of the LRC circuit
by writing the loop rule for a circuit containing a battery. With the
switch in Fig. 26-9 in position A, we begin at the switch and go around the
loop in the clockwise sense. We obtain

$$V_0 - L\frac{di}{dt} - Ri - \frac{q}{C} = 0 \tag{26-30}$$


The first term on the left side of this equation represents the voltage increase
in passing through the battery from the negative to the positive terminal.
The second term represents the voltage drop across the inductor at
an instant when the current has the value i and the rate of change di/dt; the
third, the voltage drop across the resistor at the same instant; and the
fourth, the voltage drop across the capacitor at the same instant, when its
left-hand plate holds a charge q (and its right-hand plate a charge −q).

Our immediate interest lies in the behavior of the system after the
capacitor has been fully charged and the switch moved to position B, removing
the battery from the circuit. Again we represent this situation by
removing the term $V_0$ from Eq. (26-30). The circuit of Fig. 26-9, with the
switch in position B, is thus described by the equation

$$-L\frac{di}{dt} - Ri - \frac{q}{C} = 0 \tag{26-31}$$


In analyzing this circuit, it is more convenient to deal with currents rather
than charges. To obtain an equation involving only currents, we differentiate
each term in Eq. (26-31) with respect to time and use the definition
i = dq/dt to obtain

$$-L\frac{d^2i}{dt^2} - R\frac{di}{dt} - \frac{i}{C} = 0 \tag{26-32}$$


Solving this equation for $d^2i/dt^2$ and rearranging terms slightly give

$$\frac{d^2i}{dt^2} = -\frac{1}{LC}\,i - \frac{R}{L}\,\frac{di}{dt}$$


This is more complicated than the equation for the undamped LC circuit
because it contains a term in di/dt as well as terms in i and $d^2i/dt^2$. However,
it can readily be rewritten in a form which is convenient for numerical
solution. The procedure followed is the same as that used in Sec. 6-6. First
define

$$\alpha = \frac{1}{LC} \quad\text{and}\quad \beta = \frac{R}{L} \tag{26-33}$$

so that

$$\frac{d^2i}{dt^2} = -\alpha i - \beta\frac{di}{dt} \tag{26-34}$$




This equation is mathematically equivalent to Eq. (6-31) for the damped
mechanical oscillator, which is

$$\frac{d^2x}{dt^2} = -\alpha x - \beta\frac{dx}{dt} \tag{26-35}$$


The physical meanings of Eqs. (26-34) and (26-35) are quite different.
The current i and the displacement x are not analogous to each other. The
quantities $\alpha$ and $\beta$ in the two equations stand for quite different (though
analogous) physical quantities. Nevertheless, Eqs. (26-34) and (26-35) are
mathematically equivalent and therefore have mathematically equivalent solutions.


In  Sec.  6-6  we  obtained  solutions  to  Eq.  (26-35)  for  several  particular 
cases by using numerical methods. There we also obtained analytical solutions
to that equation for the condition of light damping and quoted
(without  proof)  its  analytical  solutions  for  the  conditions  of  critical 
damping  and  heavy  damping.  Since  only  a simple  change  of  symbolism  is 
required  to  make  these  analytical  solutions  apply  to  Eq.  (26-34),  it  might 
seem  that  this  is  what  we  should  do  next.  But  in  order  to  gain  a physical 
understanding  of  the  properties  of  the  LRC  circuit  described  by  Eq. 
(26-34),  we  must  study  plots  of  the  current  i flowing  through  the  circuit 
versus  the  time  t,  for  the  distinct  cases  of  light  damping,  critical  damping, 
and  heavy  damping.  It  is  actually  easier  to  obtain  such  plots  by  carrying  out 
numerical solutions to the differential equation than by plotting its analytical
solutions! Either way, numerical work on some computing device must
be done to produce plots. But when this work involves numerical solutions
to the differential equation, it is done in exactly the same way in all cases,
and a program for the device already is available. If the numerical work involves
using the analytical solutions, it must be done differently for each of
the  three  cases  because  the  forms  of  the  analytical  solutions  are  different 
for  each,  and  three  different  programs  must  be  developed. 

Hence we write Eq. (26-34) in the standard form used earlier in carrying
out the numerical solution of second-order differential equations. To
do this, we give the name Q to the right side of Eq. (26-34) (not to be confused
with the electric charge q!). The equation can then be written

$$\frac{d^2i}{dt^2} = Q \quad\text{where}\quad Q = -\alpha i - \beta\frac{di}{dt} \tag{26-36}$$

Now  the  damped  oscillator  program  listed  in  the  Numerical  Calculation 
Supplement  can  be  applied  directly  to  the  LRC  circuit.  Examples  26-4 
through  26-7  investigate  the  behavior  of  the  current  as  a function  of  time 
in  the  LRC  circuit  of  Fig.  26-9.  In  all  cases,  the  capacitor  is  initially  charged 
to a voltage $V_0 = 250$ V with the switch in position A. At time $t = t_0 = 0$,
the switch is thrown to position B, removing the battery from the circuit. In
all the examples, we set C = 10.0 µF and L = 0.100 H. In Example 26-4
the  resistance  R is  given  the  value  0,  so  that  the  system  reduces  to  the  LC 
circuit  discussed  in  Sec.  26-4.  In  subsequent  examples,  increasing  values  of 
resistance  are  used. 

Attention must be given to setting the initial conditions. Since no current
is flowing through the resistor and the inductor at the instant when the
switch is moved to position B, the initial current $i_0$ must be zero. But what is
the initial value of di/dt? Its sense will be negative, since the current i will
initially change from zero to more and more negative values as it begins to
flow in the counterclockwise (negative) sense through the circuit of Fig.
26-9. We reach this conclusion by noting that the initial voltage across the
capacitor is $-V_0$, having the same magnitude as, but opposite sense to, the
voltage $V_0$ across the battery. Since no current is initially flowing, there is
initially a zero voltage drop across the resistor. So the initial magnitudes $|V_0|$
and $L|(di/dt)_0|$ of the voltages across the capacitor and inductor must be
equal, or $|(di/dt)_0| = |V_0|/L$. Including the sign, we have

$$\left(\frac{di}{dt}\right)_0 = -\frac{V_0}{L} = \frac{-250\ \text{V}}{0.100\ \text{H}} = -2.50 \times 10^{3}\ \text{A/s}$$
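The inputs that the numerical examples need follow directly from the circuit values quoted above. As a minimal sketch (the function and variable names are hypothetical, introduced only for illustration), they can be computed as follows.

```python
V0 = 250.0     # magnitude of the initial capacitor voltage, in volts
L = 0.100      # inductance, in henries
C = 10.0e-6    # capacitance, in farads

def parameters(R):
    """Return (alpha, beta) as defined in Eq. (26-33) for a resistance R in ohms."""
    return 1.0 / (L * C), R / L

i0 = 0.0            # no current flows at the instant the switch is thrown
didt0 = -V0 / L     # initial slope; negative because the current grows counterclockwise
alpha, beta = parameters(0.0)   # R = 0 reduces the circuit to the LC case
```

Calling `parameters` with a nonzero resistance changes only `beta`; `alpha` depends on L and C alone.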


EXAMPLE  26-4 

Run the damped oscillator program with the following set of initial conditions and
parameters, corresponding to the values of $V_0$, L, R, and C given in the discussion
immediately above. (The definitions $\alpha = 1/LC$ and $\beta = R/L$ are used to determine
the numerical values of $\alpha$ and $\beta$.)

$i_0 = 0$ (in A); $(di/dt)_0 = -2.5 \times 10^{3}$ (in A/s); $t_0 = 0$; $\Delta t = 0.2 \times 10^{-3}$ (in s);
$\alpha = 1 \times 10^{6}$ (in s⁻²); $\beta = 0$ (in s⁻¹).

Fig. 26-10 Plots of the results of the numerical calculations of Examples 26-4 and 26-5. The
dots display the current i as a function of time for the LRC circuit of Fig. 26-9 when R = 0, so
that the circuit becomes an LC circuit (Example 26-4). Time is measured from the moment,
t = 0, when the switch is moved from position A to position B. The capacitance has the value
C = 10.0 µF and is initially charged so that the voltage across it is $-V_0 = -250$ V. The inductance
has the value L = 0.100 H. The x's display i for the same circuit when R = 50.0 Ω (Example
26-5). The values of L and C and the initial conditions are unchanged from Example
26-4.

■ The sequence of dots in Fig. 26-10 is a graph of the current i as a function of
time. The curve is sinusoidal (as you can verify by measuring selected points) in
agreement with the analytical prediction of Eq. (26-27), $i = -\omega_0 q_0 \sin(\omega_0 t)$. You can
find by direct measurement on the graph that the period is T = 6.28 ms (= 6.28 ×
10⁻³ s). Equation (26-23) predicts an angular frequency $\omega_0 = (LC)^{-1/2}$. Since the
relation between the period and the angular frequency is

$$T = \frac{2\pi}{\omega_0}$$

you have

$$T = 2\pi\sqrt{LC} = 2\pi \times (0.100\ \text{H} \times 10.0 \times 10^{-6}\ \text{F})^{1/2} = 2\pi \times 10^{-3}\ \text{s} = 6.28\ \text{ms}$$

This  agrees  with  the  numerically  calculated  result. 

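According to the graph, the maximum current is $i_{\max} = 2.50$ A. Equation
(26-28) predicts

$$i_{\max} = \frac{q_0}{\sqrt{LC}} = \frac{CV_0}{\sqrt{LC}} = \frac{10.0 \times 10^{-6}\ \text{F} \times 250\ \text{V}}{(0.100\ \text{H} \times 10.0 \times 10^{-6}\ \text{F})^{1/2}} = 2.50\ \text{A}$$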

Again  there  is  good  agreement. 
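The numerical run of Example 26-4 can be imitated with any standard integrator. The sketch below uses the classical fourth-order Runge-Kutta method as an assumed stand-in; it is not the damped oscillator program of the Numerical Calculation Supplement, and the function name `solve` is hypothetical. It integrates the standard form $d^2i/dt^2 = -\alpha i - \beta\,di/dt$ with the inputs listed in the example, then estimates the peak current and the period from the computed points.

```python
def solve(alpha, beta, y0, dydt0, dt, n_steps):
    """Integrate d2y/dt2 = -alpha*y - beta*dy/dt with classical RK4.

    Returns the lists of times and y values. RK4 is an assumption of this
    sketch, chosen because it is simple and accurate at this step size.
    """
    def deriv(y, v):
        return v, -alpha * y - beta * v

    t, y, v = 0.0, y0, dydt0
    ts, ys = [t], [y]
    for _ in range(n_steps):
        k1y, k1v = deriv(y, v)
        k2y, k2v = deriv(y + 0.5 * dt * k1y, v + 0.5 * dt * k1v)
        k3y, k3v = deriv(y + 0.5 * dt * k2y, v + 0.5 * dt * k2v)
        k4y, k4v = deriv(y + dt * k3y, v + dt * k3v)
        y += dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t += dt
        ts.append(t)
        ys.append(y)
    return ts, ys

# Inputs of Example 26-4: i0 = 0, (di/dt)0 = -2.5e3 A/s, alpha = 1e6 s^-2, beta = 0.
dt = 0.2e-3
ts, currents = solve(1.0e6, 0.0, 0.0, -2.5e3, dt, 100)

peak = max(abs(i) for i in currents)

# Locate negative-to-positive zero crossings by linear interpolation.
crossings = []
for k in range(1, len(ts)):
    if currents[k - 1] < 0.0 <= currents[k]:
        frac = -currents[k - 1] / (currents[k] - currents[k - 1])
        crossings.append(ts[k - 1] + frac * dt)
period = crossings[1] - crossings[0]
```

Under these assumptions `peak` and `period` should land close to the analytical values 2.50 A and 6.28 ms, limited mainly by how finely the step $\Delta t$ samples the curve. Passing a nonzero `beta` to the same function handles the damped cases of the later examples.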

In Example 26-5, a resistance R = 50.0 Ω is added, thus making the
LC circuit into an LRC circuit.


EXAMPLE  26-5 

Run the damped oscillator program with the following set of initial conditions and
parameters, corresponding to the values $V_0 = 250$ V, L = 0.100 H, and C =
10.0 µF (all as in Example 26-4) and R = 50.0 Ω. (The definitions $\alpha = 1/LC$ and
$\beta = R/L$ are used to determine the numerical values of $\alpha$ and $\beta$.)

$i_0 = 0$ (in A); $(di/dt)_0 = -2.5 \times 10^{3}$ (in A/s); $t_0 = 0$; $\Delta t = 0.2 \times 10^{-3}$ (in s);
$\alpha = 1 \times 10^{6}$ (in s⁻²); $\beta = 500$ (in s⁻¹).

■ The  sequence  of  x’s  in  Fig.  26-10  is  a graph  of  the  current  i as  a function  of 
time. The oscillation is a lightly damped one. Compare this graph with that of Fig.
6-18,  which  shows  the  displacement  of  a lightly  damped  mechanical  oscillator  as  a 
function  of  time.  In  both  graphs,  the  sinusoidal  oscillation  decays  in  an  apparently 
exponential  fashion. 


As the damping in the circuit is increased by increasing the resistance
R and thus the value of $\beta = R/L$, the decay of the oscillations proceeds
more rapidly. Just as in the case of the mechanical oscillator, a value of $\beta$ is
reached where the LRC circuit becomes critically damped. At this point the
system just ceases to oscillate; instead, after the initial growth in its magnitude
the current decays asymptotically to zero. The criterion for critical
damping can be found from the expression for the angular frequency of
the oscillation in terms of $\beta$ and $\alpha = 1/LC$. If we write the angular frequency
as $\omega_0$, Eq. (6-42) shows that

$$\omega_0 = \left(\alpha - \frac{\beta^2}{4}\right)^{1/2} \tag{26-37}$$

This expression is found by solving the damped oscillator equation, Eq.
(26-35), analytically for the case of light damping, as is done in Sec. 6-6.
Other things remaining the same, an increase in damping is reflected in an
increase in the value of $\beta$ and thus a decrease in the angular frequency $\omega_0$.
In the case of the circuit studied in Examples 26-4 and 26-5, the damping
"friction" is provided by the resistance R. But it is difficult to see the effect
of damping on angular frequency, which is inversely proportional to
period, by comparing the two curves of Fig. 26-10 because $\beta^2/4 \ll \alpha$.




The condition of critical damping is that in which $\omega_0 = 0$. Then the
period, $T = 2\pi/\omega_0$, becomes infinite and there is no oscillation at all. From
Eq. (26-37), the condition for critical damping is $\alpha - \beta^2/4 = 0$, so that

$$\frac{\beta^2}{4} = \alpha$$


For the system considered in Examples 26-4 through 26-7, critical damping
occurs when $\beta$ = 2000 Ω/H = 2000 s⁻¹, as you will see in Example 26-6.
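The critical-damping condition $\beta^2/4 = \alpha$ can be turned into a resistance by inserting $\alpha = 1/LC$ and $\beta = R/L$, which gives $R = 2\sqrt{L/C}$. A minimal sketch follows; the function names are hypothetical, and the tolerance used to recognize the critical case is an arbitrary choice made for floating-point comparison.

```python
import math

def critical_resistance(L, C):
    """Resistance at which beta^2/4 = alpha, with alpha = 1/(LC) and beta = R/L."""
    return 2.0 * math.sqrt(L / C)

def damping_regime(R, L, C):
    """Classify a series LRC circuit by comparing beta^2/4 with alpha."""
    alpha = 1.0 / (L * C)
    beta = R / L
    quarter_beta_sq = beta ** 2 / 4.0
    if math.isclose(quarter_beta_sq, alpha, rel_tol=1e-9):
        return "critical"
    return "light" if quarter_beta_sq < alpha else "heavy"

# Circuit values used in Examples 26-4 through 26-7.
R_crit = critical_resistance(0.100, 10.0e-6)
```

For L = 0.100 H and C = 10.0 µF this gives a critical resistance of 200 Ω, so that $\beta = R/L$ = 2000 s⁻¹ as stated above.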


EXAMPLE 26-6

Run the damped oscillator program with the following set of initial conditions and
parameters, corresponding to the values $V_0 = 250$ V, L = 0.100 H, and C =
10.0 µF (all as in Examples 26-4 and 26-5) and R = 200 Ω.

$i_0 = 0$ (in A); $(di/dt)_0 = -2.5 \times 10^{3}$ (in A/s); $t_0 = 0$; $\Delta t = 0.2 \times 10^{-3}$ (in s);
$\alpha = 1 \times 10^{6}$ (in s⁻²); $\beta = 2000$ (in s⁻¹).

■ The sequence of dots in Fig. 26-11 is a graph of the current i as a function of
time. After an initial growth, the magnitude of the current decays in an approximately
exponential way toward zero.


With  further  increase  in  the  resistance  R,  the  LRC  circuit  of  Fig.  26-9 
becomes  heavily  damped,  or,  as  the  condition  is  sometimes  called,  overdamped. 
In Example 26-7 the resistance is increased to R = 250 Ω.


EXAMPLE 26-7

Run the damped oscillator program with the following set of initial conditions and
parameters, corresponding to the values $V_0 = 250$ V, L = 0.100 H, and C =
10.0 µF (all as in Examples 26-4 through 26-6) and R = 250 Ω.

$i_0 = 0$ (in A); $(di/dt)_0 = -2.5 \times 10^{3}$ (in A/s); $t_0 = 0$; $\Delta t = 0.2 \times 10^{-3}$ (in s);
$\alpha = 1 \times 10^{6}$ (in s⁻²); $\beta = 2500$ (in s⁻¹).

■ The  sequence  of  x’s  in  Fig.  26-11  is  a graph  of  the  current  i as  a function  of 
time. Because the “friction” of the electric resistance removes energy from the heavily
damped system more rapidly than in the critically damped case of Example 26-6,
the  magnitude  of  the  maximum  current  is  smaller  than  in  that  case  and  is  attained 
sooner.  The  subsequent  decay,  however,  is  slower. 
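The comparison between the critically damped and heavily damped cases can be checked with a short calculation. The sketch below is an illustration under stated assumptions: it uses a second-order Runge-Kutta (midpoint) step and a time step of 1 µs, much finer than the $\Delta t$ of the examples, so that the instant of peak current is resolved. It is not the textbook's program, and `peak_current` is a hypothetical name.

```python
L = 0.100        # inductance, in henries
C = 10.0e-6      # capacitance, in farads
V0 = 250.0       # magnitude of the initial capacitor voltage, in volts

def peak_current(R, dt=1.0e-6, t_end=10.0e-3):
    """Step d2i/dt2 = -alpha*i - beta*di/dt forward in time and return
    (largest |i| reached, time at which it is reached)."""
    alpha, beta = 1.0 / (L * C), R / L
    i, didt, t = 0.0, -V0 / L, 0.0
    best, t_best = 0.0, 0.0
    while t < t_end:
        # Midpoint method: evaluate the slopes at the half step.
        i_mid = i + 0.5 * dt * didt
        d_mid = didt + 0.5 * dt * (-alpha * i - beta * didt)
        i += dt * d_mid
        didt += dt * (-alpha * i_mid - beta * d_mid)
        t += dt
        if abs(i) > best:
            best, t_best = abs(i), t
    return best, t_best

peak_crit, t_crit = peak_current(200.0)   # Example 26-6, critical damping
peak_over, t_over = peak_current(250.0)   # Example 26-7, heavy damping
```

With these inputs the heavily damped circuit should show a smaller peak that occurs slightly earlier than in the critically damped circuit, in line with the description of Fig. 26-11.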


Fig. 26-11 Plots of the results of the numerical
calculations of Examples 26-6 and 26-7. The resistance
is increased still further from the case studied
in Example 26-5, all other quantities being
kept the same. The dots display the current i as a
function of time for the case of Example 26-6,
where R = 200 Ω. This is the condition of critical
damping, in which oscillation just ceases and the
magnitude of the current falls asymptotically to
zero after an initial rise. The x's display the corresponding
results for the case of Example 26-7,
where R = 250 Ω. Here the LRC circuit of Fig.
26-9 is heavily damped or overdamped. While the
magnitude of i does not grow to a value as great as
that in the critically damped case, the decay
toward zero is slower and is not asymptotic to zero.
(For practical purposes, nevertheless, the current
does become negligible when a sufficient time has
passed.)




So far we have studied the time dependence of the current i flowing
through the LRC circuit. The same numerical calculation procedure can be
used to study the time dependence of the charge q on the capacitor. In
order to cast Eq. (26-31),

$$-L\frac{di}{dt} - Ri - \frac{q}{C} = 0$$

into a form convenient for numerical solution for q, we again use the definition
i = dq/dt. But this time it is used to eliminate i. This gives

$$-L\frac{d^2q}{dt^2} - R\frac{dq}{dt} - \frac{q}{C} = 0 \tag{26-38}$$

which is equivalent in mathematical form to Eq. (26-32) for the current i.
Further manipulation thus leads to the standard mathematical form used
for numerical solution, that given for the current by Eq. (26-36). The corresponding
equation for the charge q on the left-hand plate of the capacitor is

$$\frac{d^2q}{dt^2} = Q \quad\text{where}\quad Q = -\alpha q - \beta\frac{dq}{dt} \tag{26-39}$$

Here we have $\alpha = 1/LC$ and $\beta = R/L$, just as before.

To solve this equation numerically, the initial conditions must be determined.
When the switch in Fig. 26-9 is first moved to position B, the
capacitor is fully charged with the left-hand plate positive, so that

$$q = q_0 = CV_0 = 10.0 \times 10^{-6}\ \text{F} \times 250\ \text{V} = 2.50 \times 10^{-3}\ \text{C}$$

And since the current must begin at zero, we have $i_0 = (dq/dt)_0 = 0$. These
are the conditions used in Example 26-8. The values of $V_0$ and C given
immediately above, and the values L = 0.100 H and R = 50.0 Ω, correspond
to those used in Example 26-5, the lightly damped case.


EXAMPLE  26-8 

Run  the  damped  oscillator  program  with  the  following  set  of  initial  conditions  and 
parameters. 

$q_0 = 2.5 \times 10^{-3}$ (in C); $(dq/dt)_0 = 0$ (in C/s); $t_0 = 0$; $\Delta t = 0.2 \times 10^{-3}$ (in s);
$\alpha = 1 \times 10^{6}$ (in s⁻²); $\beta = 500$ (in s⁻¹).

■ The  results  of  the  calculation  are  plotted  as  dots  in  Fig.  26-12.  For  comparison, 
the  current  i calculated  for  the  same  system  in  Example  26-5  is  replotted  as  a series 
of  x's  on  the  same  graph. 


An  important  distinction  between  the  LRC  circuit  typified  by  Example 
26-8  and  the  undamped  LC  circuit  has  to  do  with  the  phase  difference 
between  the  current  i and  the  charge  q on  the  capacitor.  Consider  first  the 
LRC circuit of Example 26-8. In Fig. 26-12, the points at which the curves
representing i and q cross the t axis in going from negative to positive values
are marked with ticks for clarity. The phase difference is customarily
represented as an angle $\phi$. In the present case, you can see from the graph
that one complete period (or 360°) requires approximately 6.4 ms to complete.
Also, the i curve crosses the t axis in going from negative to positive
values approximately 1.9 ms before the q curve does so. We say that the
current leads the charge in phase. The amount is specified by the phase
difference angle $\phi$, or phase angle $\phi$ for short, of the current with respect
to the charge. Its value is given by the ratio

$$\frac{\phi}{360^\circ} = \frac{1.9\ \text{ms}}{6.4\ \text{ms}} \tag{26-40a}$$


Fig. 26-12 Plot of the charge q on the
left-hand plate of the capacitor in the
LRC circuit of Fig. 26-9, as calculated in
Example 26-8, and shown as a series of
dots. The conditions are identical to
those of the lightly damped circuit of
Example 26-5. For comparison, the current
i obtained in that example is replotted
here as a series of x's. The locations
where the two curves cross the t
axis in going from negative to positive
values are marked.


The time 6.4 ms is the period of both the current oscillation and the charge
oscillation. Solving for the phase angle of the current with respect to the
charge, we obtain

$$\phi \approx 107^\circ \tag{26-40b}$$

for the particular LRC circuit of Example 26-8.

This phase angle applies to all parts of the pair of curves past the first
half oscillation or so, and not just to the points where they cross the t axis.
(It is simply most convenient to measure $\phi$ at the crossing points.) During
the earliest part of the process depicted in Fig. 26-12, however, the initial
conditions force the phase angle to be 90°. It takes a little time for the phase
angle to assume its "natural" value of about 107° for the particular LRC circuit
discussed here.

In the undamped LC circuit, the charge and the current are 90° out of
phase with each other, with the current leading the charge. You can see
this by comparing Eq. (26-26), $q = q_0\cos(\omega_0 t)$, with Eq. (26-27),
$i = -(q_0/\sqrt{LC})\sin(\omega_0 t)$. The physical reason for this 90° phase difference is
that the magnitudes of the voltages across the capacitor and the inductor
must be equal, in order to satisfy Kirchhoff's loop rule. It follows that q = 0
when |di/dt| = 0. But when q = 0 in an LRC circuit (such as the one considered
in Example 26-8), the magnitude L|di/dt| of the voltage across the inductor
is equal to the magnitude R|i| of the voltage across the resistor, and
|di/dt| cannot be zero at that moment. Therefore the charge and the current
cannot be 90° out of phase when the initial conditions no longer dominate
the behavior of the LRC circuit.

The equations for the time dependence of the current and the charge,
Eqs. (26-32) and (26-38), respectively, are mathematically identical not only
to each other, but also to the corresponding equation for the damped mechanical
oscillator, Eq. (6-31). They therefore possess corresponding analytical
solutions. We can take advantage of the analogies listed in Table 26-1
to write the solutions immediately by analogy with the solutions given by
Eqs. (6-43), (6-45), and (6-46). Equation (6-43) is the solution to the mechanical
oscillator equation for light damping. By using our present notation
$\omega_0$ for the angular frequency of oscillation, it is

$$x = A e^{-(\beta/2)t}\cos(\omega_0 t + \delta) \tag{26-41a}$$

where

$$\omega_0 = \left(\alpha - \frac{\beta^2}{4}\right)^{1/2} \quad\text{and}\quad \frac{\beta^2}{4} < \alpha \quad \text{(light damping)} \tag{26-41b}$$

Here A and $\delta$ are constants which must be adjusted so that Eq. (26-41a) can
be fitted to any pair of initial conditions specified by $x_0$ and $(dx/dt)_0$. In the
mechanical case, $\alpha = k/m$ depends on the spring constant k and the mass m,
while $\beta = r/m$ depends on the viscous frictional drag coefficient r and the
mass. By analogy, the solution for the LRC circuit is

$$q = A e^{-(\beta/2)t}\cos(\omega_0 t + \delta) \tag{26-42a}$$

where

$$\omega_0 = \left(\alpha - \frac{\beta^2}{4}\right)^{1/2} \quad\text{and}\quad \frac{\beta^2}{4} < \alpha \quad \text{(light damping)} \tag{26-42b}$$

Here A and $\delta$ are constants which must be adjusted so that Eq. (26-42a) can
be fitted to any pair of initial conditions specified by $q_0$ and $(dq/dt)_0$. We use
the analogies given in Table 26-1 to go from the mechanical quantities (in
the second column of the table) to the corresponding electrical quantities
(in the first column). Specifically, we substitute the reciprocal of the capacitance,
1/C, for the spring constant k; the inductance L for the mass m; and
the resistance R for the frictional drag coefficient r. This gives us $\alpha = 1/LC$
in place of the corresponding mechanical quantity k/m and $\beta = R/L$
in place of the corresponding mechanical quantity r/m.
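Fitting the constants A and $\delta$ to a pair of initial conditions can be made concrete. The sketch below assumes the lightly damped form $q = A e^{-(\beta/2)t}\cos(\omega_0 t + \delta)$ and solves the two conditions $q(0) = q_0$ and $(dq/dt)_0$ for $A\cos\delta$ and $A\sin\delta$. The function names are hypothetical, and the numbers are the inputs of Example 26-8.

```python
import math

def light_damping_constants(q0, dqdt0, alpha, beta):
    """Return (A, delta, omega0) so that
    q(t) = A * exp(-beta*t/2) * cos(omega0*t + delta)
    satisfies q(0) = q0 and dq/dt(0) = dqdt0. Requires beta**2/4 < alpha."""
    omega0 = math.sqrt(alpha - beta ** 2 / 4.0)
    # q(0)     = A cos(delta)
    # dq/dt(0) = -(beta/2) A cos(delta) - omega0 A sin(delta)
    a_cos = q0
    a_sin = -(dqdt0 + 0.5 * beta * q0) / omega0
    return math.hypot(a_cos, a_sin), math.atan2(a_sin, a_cos), omega0

def charge(t, A, delta, omega0, beta):
    """Evaluate the lightly damped solution at time t."""
    return A * math.exp(-0.5 * beta * t) * math.cos(omega0 * t + delta)

# Initial conditions and parameters of Example 26-8.
alpha, beta = 1.0e6, 500.0
A, delta, omega0 = light_damping_constants(2.5e-3, 0.0, alpha, beta)
```

Because the current must start at zero while the envelope is already decaying, the fitted $\delta$ comes out slightly negative and A slightly larger than $q_0$.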

The angular frequency $\omega_0$ is called the natural frequency of the LRC
or LC circuit. In order to express it directly in terms of the circuit parameters
L, R, and C, we substitute the values of $\alpha$ and $\beta$ just determined into the
definition of $\omega_0$ given in Eq. (26-42b). In these terms, $\omega_0$ becomes

$$\omega_0 = \left(\frac{1}{LC} - \frac{R^2}{4L^2}\right)^{1/2} \tag{26-43}$$

For  the  LC  circuit,  in  which  R = 0,  the  natural  frequency  has  the  value 
a>o  = (1/LC)1/2  given  by  Eq.  (26-23).  It  is  evident  from  inspection  of  Eq. 
(26-43)  that  the  value  of  cd0  decreases  as  R increases,  as  already  noted  in  the 
discussion  following  Example  26-5.  Critical  damping  occurs  when  w0  = 0. 
According  to  Eq.  (26-43),  this  takes  place  when  1/LC  = R2/4L2,  or 

R2  = 4 j;  (26-44) 

In  Example  26-9  the  analytical  discussion  immediately  above  is  applied 
to  the  LRC  circuit  studied  numerically  in  Examples  26-5  and  26-8. 

EXAMPLE  26-9 

Calculate the natural frequency $\omega_0$ for the lightly damped LRC circuit
of Examples 26-5 and 26-8. Assume that the quantities $L$, $R$, and $C$ are all
known to three significant figures.


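Using Eq. (26-43), you have

$$\omega_0 = \left[\frac{1}{0.100\ \text{H} \times 10.0 \times 10^{-6}\ \text{F}} - \frac{(50.0\ \Omega)^2}{4 \times (0.100\ \text{H})^2}\right]^{1/2} = 968\ \text{rad/s}$$

How would you obtain a numerical value of $\omega_0$ from Fig. 26-12 to compare
with this analytical value?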
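The arithmetic of Example 26-9 and the critical-damping condition of Eq.
(26-44) can be checked with a few lines of Python. This is only a sketch: the
function names `natural_frequency` and `critical_resistance` are illustrative
choices, not names taken from the text or from the Numerical Calculation
Supplement.

```python
import math

def natural_frequency(L, R, C):
    """Angular frequency omega_0 of a lightly damped LRC circuit, Eq. (26-43)."""
    alpha = 1.0 / (L * C)            # alpha = 1/LC, in s^-2
    beta = R / L                     # beta = R/L, in s^-1
    radicand = alpha - beta**2 / 4.0
    if radicand <= 0.0:
        # Eq. (26-43) applies only when beta^2/4 < alpha (light damping).
        raise ValueError("circuit is not lightly damped")
    return math.sqrt(radicand)

def critical_resistance(L, C):
    """Resistance for which omega_0 = 0, from Eq. (26-44): R^2 = 4L/C."""
    return 2.0 * math.sqrt(L / C)

# Values of Example 26-9 (SI units).
L, R, C = 0.100, 50.0, 10.0e-6
omega0 = natural_frequency(L, R, C)
R_crit = critical_resistance(L, C)
print(f"omega_0 = {omega0:.0f} rad/s")
print(f"critical resistance = {R_crit:.0f} ohm")
```

With these values the radicand is $10^6 - 6.25 \times 10^4$ s$^{-2}$, so the
script reproduces the 968 rad/s of the example, and it shows that the 50.0-Ω
resistor is well below the roughly 200 Ω needed for critical damping.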


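The equation for the current $i$ in the lightly damped LRC circuit can be
found by differentiating both sides of Eq. (26-42a) with respect to time.
This gives

$$i = -\frac{\beta}{2} A e^{-(\beta/2)t} \cos(\omega_0 t + \delta) - \omega_0 A e^{-(\beta/2)t} \sin(\omega_0 t + \delta)$$

or

$$i = -A e^{-(\beta/2)t} \left[\frac{\beta}{2} \cos(\omega_0 t + \delta) + \omega_0 \sin(\omega_0 t + \delta)\right] \quad \text{for } \frac{\beta^2}{4} < \alpha \text{ (light damping)} \tag{26-45}$$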
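As a quick plausibility check on Eq. (26-45), the sketch below compares it with
a central-difference derivative of the charge given by Eq. (26-42a). The circuit
values are those of Example 26-9; the amplitude `A` and phase constant `delta`
are arbitrary illustrative numbers, since the text leaves them to be fixed by
the initial conditions.

```python
import math

# Circuit values of Example 26-9 (SI units).
L, R, C = 0.100, 50.0, 10.0e-6
alpha, beta = 1.0 / (L * C), R / L
omega0 = math.sqrt(alpha - beta**2 / 4.0)

# Hypothetical constants, chosen only for illustration.
A, delta = 2.5e-3, 0.3        # amplitude in C, phase constant in rad

def q(t):
    """Charge on the capacitor, Eq. (26-42a)."""
    return A * math.exp(-beta * t / 2.0) * math.cos(omega0 * t + delta)

def i_formula(t):
    """Current from Eq. (26-45)."""
    phase = omega0 * t + delta
    return -A * math.exp(-beta * t / 2.0) * (
        beta / 2.0 * math.cos(phase) + omega0 * math.sin(phase)
    )

def i_numeric(t, h=1.0e-7):
    """Central-difference estimate of dq/dt."""
    return (q(t + h) - q(t - h)) / (2.0 * h)

# Compare the two at ten instants spread over the first 10 ms.
sample_times = [k * 1.0e-3 for k in range(10)]
max_diff = max(abs(i_formula(t) - i_numeric(t)) for t in sample_times)
print(f"largest difference: {max_diff:.2e} A")
```

The two agree to within the rounding error of the finite difference, which is
what one expects if the signs in Eq. (26-45) are as written above.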


The  same  process  of  analogy  based  on  Table  26-1  can  be  used  to  find  the 
behavior  of  q and  i as  functions  of  time  in  the  critically  damped  case,  by  beginning 
with  the  analytical  solution  for  the  critically  damped  mechanical  oscillator,  Eq. 
(6-45).  For  the  heavily  damped  case,  the  appropriate  equation  to  begin  with  is  Eq. 
(6-46). 


26-6  ALTERNATING-CURRENT CIRCUITS: NUMERICAL DESCRIPTION


Fig. 26-13 A series ac circuit. The switch is at A for a long time before
$t = 0$ so that the capacitor becomes charged to the battery voltage $V_0$,
with its left plate positive. At $t = 0$ the voltage of the ac source is
$V_0 \cos(\omega t) = V_0$, with the same polarity as the battery, and the
switch is thrown to B. Immediately after, the magnitudes of the voltages
across the capacitor and of the ac source are equal, so there can be no
voltage drop across the resistor or inductor. Hence initially
$i = di/dt = 0$.


In all but the simplest applications, the practical use of electricity involves
alternating currents, which are currents whose sense changes, usually
periodically. This is true in power transmission, where the generator is an
elaboration of the type described in Sec. 25-4 and supplies a voltage
reasonably well approximated by the pure sinusoidal

$$V = V_0 \cos(\omega t + \delta) \tag{26-46}$$

The angular frequency $\omega$ used in power transmission is relatively low.
(Nearly everywhere in the world, the standard power frequency is either
$\nu = \omega/2\pi = 60$ Hz or $\nu = 50$ Hz.) Considerably higher frequencies
usually are of interest in such areas as communication and computation,
and the waveforms often are much more complex than that specified by
Eq. (26-46). Low-frequency alternating currents are essential to the
operation of animal nervous systems, which are in many ways analogous to
electronic communication and computation systems.

Alternating-current  (ac)  circuits  can  be  extremely  complicated,  with 
sources  of  alternating  emf,  inductors,  resistors,  capacitors,  and  other  types 
of  circuit  elements  connected  in  elaborate  networks.  However,  many  of  the 
fundamental  properties  of  such  networks  can  be  seen  in  the  simple  series 
ac  circuit  of  Fig.  26-13.  This  circuit  is  constructed  by  adding  a source  of 
alternating  emf  to  the  LRC  circuit  of  Sec.  26-5. 

A source of alternating emf, called an ac source, produces a voltage
which can be represented by Eq. (26-46). Its angular frequency $\omega$ is
called the driving frequency. In circuit diagrams, the standard symbol (a
circle enclosing a miniature sine curve) is used. The miniature sine curve in
the symbol refers to the time dependence given by Eq. (26-46).




Unless the driving frequency $\omega$ is quite high (in which case
electromagnetic radiation, discussed in Chap. 27, becomes significant),
Kirchhoff's loop rule can be used to describe the voltage changes around the
circuit of Fig. 26-13. Except for the fact that the ac source produces a
variable emf $V$ in place of the steady emf $V_0$ produced by the battery, the
circuit is the same as that of Fig. 26-9. At any instant when the switch is in
position B, the sum of the voltage changes around the circuit must be zero,
and we have

$$V - L\frac{di}{dt} - Ri - \frac{q}{C} = 0 \tag{26-47}$$

This is the same as Eq. (26-30) for the circuit of Fig. 26-9, except that the
variable term $V$ replaces the constant term $V_0$.


Equation (26-47) can be cast into a form suitable for numerical solution
very much in the same way as Eq. (26-31), which described the behavior of the
LRC circuit. Differentiating every term in Eq. (26-47) with respect to time,
and using $dq/dt = i$, yields

$$\frac{dV}{dt} - L\frac{d^2 i}{dt^2} - R\frac{di}{dt} - \frac{i}{C} = 0 \tag{26-48}$$

This can be rearranged into the form

$$\frac{d^2 i}{dt^2} = \frac{1}{L}\left(-\frac{i}{C} - R\frac{di}{dt} + \frac{dV}{dt}\right)$$


We now use Eq. (26-46) to evaluate $dV/dt$ explicitly. Its value, and
therefore the solution of Eq. (26-48), will depend on the choice of the
phase constant $\delta$. We choose the value $\delta = 0$, which gives
Eq. (26-46) in the particular form

$$V = V_0 \cos(\omega t) \tag{26-49}$$

Then we differentiate both sides of Eq. (26-49) with respect to time. This
gives $dV/dt = -\omega V_0 \sin(\omega t)$. We thus have

$$\frac{d^2 i}{dt^2} = \frac{1}{L}\left[-\frac{i}{C} - R\frac{di}{dt} - \omega V_0 \sin(\omega t)\right]$$

or

$$\frac{d^2 i}{dt^2} = -\frac{1}{LC}\,i - \frac{R}{L}\frac{di}{dt} - \frac{\omega V_0}{L}\sin(\omega t) \tag{26-50}$$


The quantities $\alpha = 1/LC$ and $\beta = R/L$ have already been defined. In
the same spirit, the term multiplying the sinusoidal term on the extreme right
of Eq. (26-50) is defined as

$$\gamma = \frac{\omega V_0}{L} \tag{26-51}$$

Introducing $\alpha$, $\beta$, and $\gamma$, we can write Eq. (26-50) in the
form

$$\frac{d^2 i}{dt^2} = Q \quad\text{where}\quad Q = -\alpha i - \beta\frac{di}{dt} - \gamma \sin(\omega t) \tag{26-52}$$

This equation can be solved numerically by using the driven LRC circuit
program given in the Numerical Calculation Supplement. In Examples 26-10
through 26-12 an ac source is added to the lightly damped LRC circuit of
Examples 26-5 and 26-8, as in Fig. 26-13. The natural frequency of the LRC
circuit, calculated in Example 26-9, is $\omega_0 = 968$ rad/s. The driving
frequencies $\omega$ in Examples 26-10 through 26-12 are chosen, respectively,
to be lower than, equal to, and higher than the natural frequency. Just as in
Examples 26-5 and 26-8, the following values are used: $V_0 = 250$ V;
$L = 0.100$ H; $R = 50.0\ \Omega$; $C = 10.0\ \mu$F. The initial time $t = 0$
is chosen as that when the switch in Fig. 26-13 is thrown to B. As is explained
in the caption, this starts the oscillation “gently,” with initial conditions
$i_0 = (di/dt)_0 = 0$.
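For readers without the supplement at hand, the following is a minimal sketch
of how Eq. (26-52) might be integrated in Python. It is not the program of the
Numerical Calculation Supplement: the function name `solve_driven_lrc`, the
fourth-order Runge-Kutta stepping, and the step size are all assumptions made
here for illustration.

```python
import math

def solve_driven_lrc(alpha, beta, gamma, omega, dt, n_steps, i0=0.0, didt0=0.0):
    """Integrate d2i/dt2 = -alpha*i - beta*di/dt - gamma*sin(omega*t), Eq. (26-52).

    Uses classical fourth-order Runge-Kutta steps on the pair (i, di/dt).
    Returns the lists of times and currents.
    """
    def q_term(t, i, didt):
        # The right-hand side Q of Eq. (26-52).
        return -alpha * i - beta * didt - gamma * math.sin(omega * t)

    t, i, didt = 0.0, i0, didt0
    times, currents = [t], [i]
    half = 0.5 * dt
    for _ in range(n_steps):
        k1i, k1d = didt, q_term(t, i, didt)
        k2i, k2d = didt + half * k1d, q_term(t + half, i + half * k1i, didt + half * k1d)
        k3i, k3d = didt + half * k2d, q_term(t + half, i + half * k2i, didt + half * k2d)
        k4i, k4d = didt + dt * k3d, q_term(t + dt, i + dt * k3i, didt + dt * k3d)
        i += dt * (k1i + 2.0 * k2i + 2.0 * k3i + k4i) / 6.0
        didt += dt * (k1d + 2.0 * k2d + 2.0 * k3d + k4d) / 6.0
        t += dt
        times.append(t)
        currents.append(i)
    return times, currents

# Circuit values quoted in the text, driven below the natural frequency.
V0, L, R, C, omega = 250.0, 0.100, 50.0, 10.0e-6, 200.0
alpha, beta, gamma = 1.0 / (L * C), R / L, omega * V0 / L

dt = 1.0e-5                                   # assumed step, in s
times, currents = solve_driven_lrc(alpha, beta, gamma, omega, dt, n_steps=10000)

# Estimate the steady-state amplitude from the last full driving period.
period_steps = int(round(2.0 * math.pi / omega / dt))
amplitude = max(abs(x) for x in currents[-period_steps:])
print(f"steady-state amplitude at omega = {omega:.0f} rad/s: {amplitude:.2f} A")
```

Run with $\omega = 200$ rad/s, the sketch settles to an amplitude of about
0.5 A once the transient has died away. Changing `omega` (and with it `gamma`)
to 968 or 2000 rad/s gives the other two driving conditions considered in this
section.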

In Example 26-10, the driving frequency is $\omega = 200$ rad/s.

EXAMPLE 26-10

Run the driven LRC circuit program with the following set of initial
conditions and parameters, corresponding to the values of $V_0$, $L$, $R$, $C$,
and $\omega$ given immediately above. (The definitions $\alpha = 1/LC$,
$\beta = R/L$, and $\gamma = \omega V_0/L$ are used to determine the numerical
values of $\alpha$, $\beta$, and $\gamma$.)

$i_0 = 0$ (in A); $(di/dt)_0 = 0$ (in A/s); $t_0 = 0$;
$\Delta t = 0.8 \times 10^{-3}$ (in s); $\alpha = 1 \times 10^{6}$ (in
s$^{-2}$); $\beta = 500$ (in s$^{-1}$); $\gamma = 5 \times 10^{5}$ (in
A/s$^2$); $\omega = 200$ (in rad/s).

■ The calculated values of the current $i$ are plotted as a function of time as
dots in Fig. 26-14. At the very beginning of the plot (up to about $t = 2$ ms)
there is a buildup of current in the negative sense, which looks as if it might
be exponential. But soon this nonrecurrent transient behavior comes to an end,
and after about the first half-cycle the current settles into a steady-state
pattern which appears to be sinusoidal. The angular frequency of the
oscillation is not the natural frequency of the corresponding LRC circuit, but
the driving frequency of the ac source. You can check this point by measuring
the time between any two points one cycle apart. The amplitude of the
oscillation is 0.51 A.

The driving voltage,

$$V = V_0 \cos(\omega t) = 250\ \text{V} \times \cos[(200\ \text{rad/s})t]$$

is plotted as a light dashed line on Fig. 26-14. Note that in the steady state
the current leads the voltage. That is, its steady-state waveform is the same
as that of $V$, but is ahead of it in phase by about 0.23 cycle, so that the
phase angle $\phi$ of the current with respect to the voltage is about
$\phi = +85^\circ$. This is equivalent to a lead time of about 8 ms.

Fig. 26-14 Plot of the results of the numerical calculation of Example 26-10.
The circuit of Fig. 26-13 is driven by a voltage whose time dependence is given
by the function $V = (250\ \text{V}) \cos[(200\ \text{rad/s})t]$, where $t$ is
expressed in seconds. This function is represented on the graph by the dashed
line, and its values must be read on the vertical axis at the right side of
the figure. The resulting current is represented by the dots, whose values
must be read on the vertical axis at the left side of the figure. The circuit
parameters, given in the example, lead to a natural frequency
$\omega_0 = 968$ rad/s. The fact that the driving frequency,
$\omega = 200$ rad/s, is substantially smaller than the natural frequency is
associated with the fact, evident in the graph, that the current leads the
voltage. This point is discussed in a more general and rigorous way later in
this section. Note that the angular frequency of oscillation (which can be
determined by measuring the period $T$ from the graph and using the relation
$\omega = 2\pi/T$) is equal to the driving frequency $\omega$ and not to the
natural frequency $\omega_0$.


In Example 26-11 the circuit is exactly the same as that in Example
26-10. However, the driving frequency has been set equal to the natural
frequency $\omega_0$ of the corresponding undriven LRC circuit; that is,
$\omega = \omega_0 = 968$ rad/s.


EXAMPLE  26-11  

Run the driven LRC circuit program with the following set of initial
conditions and parameters, corresponding to $V_0 = 250$ V, $L = 0.100$ H,
$R = 50.0\ \Omega$, $C = 10.0\ \mu$F, and $\omega = 968$ rad/s.

$i_0 = 0$ (in A); $(di/dt)_0 = 0$ (in A/s); $t_0 = 0$;
$\Delta t = 0.2 \times 10^{-3}$ (in s); $\alpha = 1 \times 10^{6}$ (in
s$^{-2}$); $\beta = 500$ (in s$^{-1}$); $\gamma = 2.42 \times 10^{6}$ (in
A/s$^2$); $\omega = 968$ (in rad/s).

■ The results of this calculation are plotted in Fig. 26-15. Here again you see
the quasi-exponential buildup of current in the negative sense at the
beginning. But it takes between two and three cycles to achieve the steady
state. The steady-state current oscillations are again sinusoidal, and you can
check to see that the angular frequency of oscillation is 968 rad/s, which is
the angular frequency of the source, the so-called driving frequency. The
steady-state amplitude is now 4.9 A. This is almost 10 times greater than the
maximum current achieved in the LRC circuit of Fig. 26-14 and Example 26-10.

The driving voltage,

$$V = 250\ \text{V} \times \cos[(968\ \text{rad/s})t]$$

is again plotted as a light dashed line. Here the steady-state current is
approximately in phase with the voltage. That is, the phase angle is
$\phi \approx 0^\circ$, and $i$ and $V$ pass through their maxima and minima
together, just as if the circuit contained only a resistance and not an
inductance and a capacitance as well.

Fig. 26-15 Plot of the results of the numerical calculation of Example 26-11.
The conditions are the same as those of Example 26-10, except that the driving
frequency is now $\omega = \omega_0 = 968$ rad/s. The current, represented by
dots, is to be read on the left vertical axis. The voltage, represented by a
dashed curve, is to be read on the right axis. Note that the current is much
greater than that displayed in Fig. 26-14, although the driving voltage is the
same. Note also that it now takes several cycles of oscillation before the
current builds up to its maximum steady-state amplitude of 4.9 A.


In Example 26-12 the circuit is again the same as that in Examples
26-10 and 26-11. However, the driving frequency has again been increased,
this time to the value $\omega = 2000$ rad/s.

EXAMPLE  26-12 

Run the driven LRC circuit program with the following set of initial
conditions and parameters, corresponding to $V_0 = 250$ V, $L = 0.100$ H,
$R = 50.0\ \Omega$, $C = 10.0\ \mu$F, and $\omega = 2000$ rad/s.

$i_0 = 0$ (in A); $(di/dt)_0 = 0$ (in A/s); $t_0 = 0$;
$\Delta t = 0.1 \times 10^{-3}$ (in s); $\alpha = 1 \times 10^{6}$ (in
s$^{-2}$); $\beta = 500$ (in s$^{-1}$); $\gamma = 5 \times 10^{6}$ (in
A/s$^2$); $\omega = 2000$ (in rad/s).

■ The results of the calculation are plotted in Fig. 26-16. Here it takes about
six cycles to attain the steady state. That is, the transients are of fairly
long duration, relative to the period. The steady-state amplitude here is
1.63 A. It is considerably less than that found in Example 26-11, where the
driving frequency was equal to the natural frequency. Especially notable is the
second cycle of oscillation, where the amplitude falls considerably below that
in the first cycle because the driving voltage is opposed to the current
through much of the cycle. (Why is this? Remember that in this case
$\omega \approx 2\omega_0$.) You can check again to verify that the
steady-state frequency is equal to the driving frequency.

The driving voltage,

$$V = 250\ \text{V} \times \cos[(2000\ \text{rad/s})t]$$

is again plotted as a light dashed line. In this case, with the driving
frequency well above the natural frequency $\omega_0$, the steady-state current
lags (in other words, follows behind) the voltage by about 0.20 cycle, which
corresponds to $70^\circ$. That is, the phase angle of the current with respect
to the voltage is $\phi \approx -70^\circ$. This is equivalent to a time lag of
only about 0.6 ms, because of the relatively small period of the relatively
high-frequency oscillation.

Fig. 26-16 Plot of the results of the numerical calculation of Example 26-12.
The conditions are the same as those of Examples 26-10 and 26-11, except that
the driving frequency is now $\omega = 2000$ rad/s. The current, represented by
dots, is to be read on the left vertical axis. The voltage, represented by a
dashed curve, is to be read on the right axis. The number of cycles required
before the current builds up to its steady-state amplitude is now even greater
than in Example 26-11.
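The conversions between phase angle, fraction of a cycle, and time shift used
in Examples 26-10 and 26-12 follow from the period $T = 2\pi/\omega$. The
helper below is a sketch with an illustrative name (`phase_to_time_shift`); the
phase angles fed to it are the approximate graph readings quoted in the
examples.

```python
import math

def phase_to_time_shift(phase_deg, omega):
    """Time by which the current leads (+) or lags (-) the voltage, in s.

    A phase angle of 360 degrees corresponds to one full period 2*pi/omega.
    """
    period = 2.0 * math.pi / omega
    return phase_deg / 360.0 * period

lead_200 = phase_to_time_shift(+85.0, 200.0)     # Example 26-10
lag_2000 = phase_to_time_shift(-70.0, 2000.0)    # Example 26-12

print(f"omega =  200 rad/s: shift = {lead_200 * 1e3:+.1f} ms")
print(f"omega = 2000 rad/s: shift = {lag_2000 * 1e3:+.2f} ms")
```

The first result comes out a little over 7 ms, in line with the rough 8 ms read
from Fig. 26-14, and the second is about 0.6 ms. Although the two phase angles
are similar in size, the time shifts differ by roughly a factor of 10 because
the periods do.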


In Examples 26-10 through 26-12, the driving voltage is the same. But
there is a quite dramatic dependence of the value of the maximum current
on driving frequency (which is the only parameter varied from example to
example). This dependence is typical of the very important and widespread
physical phenomenon called resonance. Perhaps the most familiar example of
resonance is the game of pushing a child on a swing. If the pushes are timed
so as to be “in step” with the oscillation of the swing, a sustained series of
quite small pushes can result in a large steady-state amplitude of the swing.
The same pushes, incorrectly timed, produce a much smaller amplitude.

Cyclotron resonance, discussed in Sec. 23-3, is another example of the
resonance phenomenon. If the external electric field is applied at a frequency
equal to the cyclotron frequency, the charge carriers in the solid (or the
charged particles in the cyclotron) absorb considerable energy from the
electric field. Otherwise, they do not.

In all systems displaying resonance, an active component of the system
provides a source of energy. In the case of the series ac circuit of Fig. 26-13,
this active component is the ac source. One or more passive components of
the system (in the case of the ac circuit, the inductor, the resistor, and the
capacitor) respond in a manner which depends on the relation of the
driving frequency $\omega$ to a resonant frequency $\omega_r$ which is
characteristic of the system. (In the series ac circuit, the response which is
usually of greatest interest, and which we have studied in Examples 26-10
through 26-12, is the flow of current through the system.) In general, the
response is greatest when $\omega = \omega_r$, that is, when the driving
frequency is equal to the resonant frequency.

Inspection of Figs. 26-14 through 26-16 will make it clear in a qualitative
way why this is so. Figure 26-15 represents the system being driven at a
frequency quite close to its resonant frequency (as you will see in Sec. 26-7,
this is $\omega_r = 1000$ rad/s). In this case, the driving voltage is always
“in step” (that is, in phase) with the current. The voltage thus tends to build
the current still further; you can see this process taking place over the first
few cycles before the steady state is reached. In the steady state, the average
energy dissipation in the resistor becomes equal to the average energy
provided by the ac source, and no further buildup occurs. The situation is
analogous to that which obtains when a child’s swing is pushed in the
normal way, with each push tending to increase the oscillation amplitude
still further until the frictional dissipation just balances the energy input
from the repeated pushes.

This is not the case in the situation depicted in Fig. 26-16. Over the
first oscillation of the current, the sense of the driving voltage is such as to
increase the magnitude of the current about half the time and to decrease it
about half the time. This is reflected in the fact that the amplitude of the
second current oscillation is substantially smaller than that of the first, in
contradistinction to what happens in Fig. 26-15, where each oscillation is
larger than the previous one. Although the driving frequency is “wrong” in
the present case, the system ultimately reaches a steady state. The passive
part of the system is forced to follow the active part, and the system as a
whole oscillates at the frequency set by the ac source. But the response is
considerably smaller than in the system of Fig. 26-15. A similar argument
applies to the system of Fig. 26-14, where the driving frequency is smaller
than the resonant frequency.
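The frequency dependence described here can be seen numerically by repeating
the integration of Eq. (26-52) for several driving frequencies and recording
the steady-state amplitude each time. The sketch below does this with a
fourth-order Runge-Kutta step; the function name `steady_amplitude`, the step
size, and the run length are assumptions chosen for illustration, not features
of the program in the Numerical Calculation Supplement.

```python
import math

# Circuit values of Examples 26-10 through 26-12 (SI units).
V0, L, R, C = 250.0, 0.100, 50.0, 10.0e-6
alpha, beta = 1.0 / (L * C), R / L

def steady_amplitude(omega, dt=1.0e-5, t_end=0.1):
    """Largest |i| over the final driving period of a run starting from rest."""
    gamma = omega * V0 / L

    def deriv(t, i, didt):
        # Returns (di/dt, d2i/dt2), with d2i/dt2 = Q from Eq. (26-52).
        return didt, -alpha * i - beta * didt - gamma * math.sin(omega * t)

    t, i, didt = 0.0, 0.0, 0.0
    period = 2.0 * math.pi / omega
    peak = 0.0
    while t < t_end:
        a1, b1 = deriv(t, i, didt)
        a2, b2 = deriv(t + dt / 2, i + dt / 2 * a1, didt + dt / 2 * b1)
        a3, b3 = deriv(t + dt / 2, i + dt / 2 * a2, didt + dt / 2 * b2)
        a4, b4 = deriv(t + dt, i + dt * a3, didt + dt * b3)
        i += dt * (a1 + 2 * a2 + 2 * a3 + a4) / 6
        didt += dt * (b1 + 2 * b2 + 2 * b3 + b4) / 6
        t += dt
        if t > t_end - period:      # transients have decayed by this time
            peak = max(peak, abs(i))
    return peak

amplitudes = {w: steady_amplitude(w) for w in (200.0, 968.0, 1000.0, 2000.0)}
for w, amp in amplitudes.items():
    print(f"omega = {w:6.0f} rad/s   steady-state amplitude = {amp:.2f} A")
```

The amplitude is largest for the run at 1000 rad/s, only slightly smaller at
968 rad/s, and much smaller at 200 and 2000 rad/s, which is the pattern seen in
Figs. 26-14 through 26-16.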

In  Secs.  26-7  and  26-8,  the  mathematical  description  of  the  series  ac 
circuit  is  developed  further.  Among  other  things,  this  will  make  possible  a 
quantitative  discussion  of  resonance  in  the  series  ac  circuit,  as  well  as  a 
more  specific  discussion  of  the  phenomenon  of  resonance  in  general. 


26-7  ALTERNATING-CURRENT CIRCUITS: PHASOR DESCRIPTION


In Sec. 26-6 we derived and applied a numerical method of determining
the relation between the current in a series ac circuit and the voltage which
drives it. This method is direct and simple, and it describes the transient as
well as the steady-state behavior of the current. As usual, however, it lacks
the generality of an analytical treatment. In this section and the next, we
use the insights gained from the numerical treatment to develop an analytical
treatment.

Our ultimate goal is to develop algebraic expressions which can be
used to calculate the steady-state current, given the circuit parameters $L$,
$R$, and $C$ and the amplitude and frequency of the driving voltage $V$, which
we assume to be sinusoidal. As is often the case, however, there is an
equivalent geometrical description which gives a vivid picture of the function
of the components of the circuit and illuminates the analytical development.
This geometrical description, called the phasor description, is the subject of
this section.


We begin by defining a phasor. In Fig. 26-17, a vector $\mathbf{V}_0$ is shown
making an angle $\theta$ with the $x$ axis of a cartesian coordinate system.
Its component $V$ along the $y$ axis is the signed scalar

$$V = V_0 \sin\theta$$

Now imagine that the vector $\mathbf{V}_0$ is rotating counterclockwise about
the origin at constant angular speed $\omega$. In ac circuit analysis, such a
rotating vector is called a phasor. If $\mathbf{V}_0$ is in the positive $x$
direction at the initial time $t = 0$, its component along the $y$ axis is
given at any moment by the expression

$$V = V_0 \sin(\omega t)$$


Fig. 26-17 Definition of a phasor. (a) The component along the $y$ axis of the
vector $\mathbf{V}_0$ is the signed scalar $V = V_0 \sin\theta$. (b) The vector
$\mathbf{V}_0$ rotates counterclockwise about the origin at a constant angular
speed $\omega$. Such a rotating vector is called a phasor. Its $y$ component
$V = V_0 \sin(\omega t + \delta)$ is a geometrical representation of the
sinusoidal driving voltage in a simple ac circuit. In the numerical discussion
of Sec. 26-6, we found it convenient to assign to $\delta$ the value $\pi/2$,
so $V = V_0 \sin(\omega t + \pi/2) = V_0 \cos(\omega t)$. The same assignment
of phase constant $\delta$ is the convenient one in the analytical discussion
of Sec. 26-8. But here it is most convenient to set $\delta = 0$, so that
$V = V_0 \sin(\omega t)$ as shown in the figure and in the text.


Fig. 26-18 (a) A purely resistive circuit consisting of an ac source and a
resistor. (b) The phasor diagram illustrating the relation between $V$, the
instantaneous voltage of the ac source, and $i$, the current in the circuit.

Fig. 26-19 (a) A purely capacitive circuit consisting of an ac source and a
capacitor. (b) The phasor diagram illustrating the relation between $V$ and
$i$. While the phasors $\mathbf{V}_0$ and $\mathbf{i}_0$ both rotate, they have
the same angular speed and the angle between them remains fixed. This is the
phase angle.


Thus the $y$ component $V$ in Fig. 26-17 is the geometrical equivalent of the
instantaneous voltage produced by an ac source. (See the caption to Fig.
26-17 for a discussion of the phase constant and of the relation of this
expression for $V$ to that used in Sec. 26-6.)
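One convenient way to experiment with a phasor is to represent the rotating
vector as a complex number whose imaginary part plays the role of the $y$
component. This is a sketch of that idea, not a construction used in the text;
the function name `phasor_y_component` and the sample values of $V_0$,
$\omega$, and $t$ are illustrative assumptions.

```python
import cmath
import math

def phasor_y_component(V0, omega, t, delta=0.0):
    """y component of a phasor of length V0 rotating at angular speed omega.

    The phasor is modeled as V0 * exp(j*(omega*t + delta)); its imaginary
    part is V0*sin(omega*t + delta).
    """
    phasor = V0 * cmath.exp(1j * (omega * t + delta))
    return phasor.imag

V0, omega, t = 250.0, 200.0, 1.0e-3        # illustrative values

# delta = 0: the convention of this section, V = V0 sin(omega t).
v_sin = phasor_y_component(V0, omega, t)

# delta = pi/2: the convention of Sec. 26-6, V = V0 cos(omega t).
v_cos = phasor_y_component(V0, omega, t, delta=math.pi / 2.0)

print(f"delta = 0:    V = {v_sin:.2f} V")
print(f"delta = pi/2: V = {v_cos:.2f} V")
```

The two calls differ only in the phase constant, which is the point made in the
caption to Fig. 26-17: the same rotating vector represents either the sine or
the cosine form of the driving voltage, depending on where it starts at
$t = 0$.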

As an introduction to the usefulness of this representation, we apply it
first to the circuit of Fig. 26-18, consisting of an ac source and a resistor
$R$. At any instant, Ohm's law is obeyed and the current is given by

$$i = \frac{V}{R} = \frac{V_0}{R} \sin(\omega t) \tag{26-53}$$

The current $i$ is just the $y$ component $V$ multiplied by the constant factor
$1/R$. It is also the $y$ component of a phasor
$\mathbf{i}_0 = \mathbf{V}_0/R$ which is parallel to $\mathbf{V}_0$ and
rotates with it, as shown in Fig. 26-18b. This diagram is called a phasor
diagram. It illustrates the fact that the ac current flowing through a
resistor is in phase with the ac voltage across it.


Next, consider the circuit of Fig. 26-19a, consisting of an ac source and
a capacitor $C$. The charge $q$ on the capacitor is related to the
instantaneous voltage across it by the expression

$$q = CV = CV_0 \sin(\omega t)$$

Thus the current flowing through the capacitor is

$$i = \frac{dq}{dt} = \omega C V_0 \cos(\omega t)$$

We define the capacitive reactance $X_C$ to be

$$X_C = \frac{1}{\omega C} \tag{26-54}$$

In terms of this quantity, the equation for the current can be written

$$i = \frac{V_0}{X_C} \cos(\omega t) = \frac{V_0}{X_C} \sin(\omega t + 90^\circ) \tag{26-55}$$

That is, the current leads the voltage by a phase angle of $90^\circ$, because
$90^\circ$ has been added to $\omega t$ in the argument of the sine. Note that
the capacitive reactance $X_C$ plays a role in Eq. (26-55), which describes the
behavior of a capacitive circuit, that is analogous to the role played by the
resistance $R$ in Eq. (26-53), which describes the behavior of a resistive
circuit.

Figure 26-19b shows the relation between voltage and current in the
capacitive circuit at a particular moment. The voltage $V$ is represented by
the $y$ component of the phasor $\mathbf{V}_0$. At the same moment, the current
$i$ is represented by the $y$ component of the phasor $\mathbf{i}_0$, whose
magnitude is $V_0/X_C$ and which rotates at the same rate as $\mathbf{V}_0$,
but leads it by $90^\circ$. At any subsequent moment, the current phasor will
lead the voltage phasor by $90^\circ$, but the pair will have rotated
counterclockwise.
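To get a feeling for the size of the capacitive reactance, the sketch below
evaluates Eq. (26-54) for the capacitor and driving frequencies of Examples
26-10 through 26-12, and checks that Eq. (26-55) agrees with the direct
derivative $i = \omega C V_0 \cos(\omega t)$. The function names and the sample
instant `t` are illustrative assumptions.

```python
import math

def capacitive_reactance(omega, C):
    """Capacitive reactance X_C = 1/(omega*C), Eq. (26-54), in ohms."""
    return 1.0 / (omega * C)

def capacitor_current(V0, omega, C, t):
    """Current i = (V0/X_C) sin(omega*t + 90 deg), Eq. (26-55), in amperes."""
    X_C = capacitive_reactance(omega, C)
    return V0 / X_C * math.sin(omega * t + math.pi / 2.0)

V0, C = 250.0, 10.0e-6          # values used in Examples 26-10 through 26-12
for omega in (200.0, 968.0, 2000.0):
    X_C = capacitive_reactance(omega, C)
    print(f"omega = {omega:6.0f} rad/s: X_C = {X_C:6.1f} ohm, "
          f"current amplitude V0/X_C = {V0 / X_C:5.2f} A")

# Consistency check at an arbitrary (hypothetical) instant.
t, omega = 0.4e-3, 2000.0
direct = omega * C * V0 * math.cos(omega * t)    # i = dq/dt
via_reactance = capacitor_current(V0, omega, C, t)
print(f"difference between the two forms: {abs(direct - via_reactance):.1e} A")
```

The reactance falls from 500 Ω at 200 rad/s to 50 Ω at 2000 rad/s, so a
capacitor connected alone to the source would pass ten times more current at
the higher frequency.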


Consider now the circuit of Fig. 26-20a, consisting of an ac source and
an inductor $L$. The voltage imposed across the inductor at any moment is

$$V = V_0 \sin(\omega t)$$

This relation implies that the equation for the current $i$ is

$$i = -\frac{V_0}{\omega L} \cos(\omega t)$$

We can verify it by differentiating with respect to $t$ to show that it
reproduces this relation, as follows:

$$\frac{di}{dt} = -\frac{V_0}{\omega L}[-\omega \sin(\omega t)] = \frac{V_0}{L} \sin(\omega t)$$

or

$$V_0 \sin(\omega t) = L\frac{di}{dt}$$

We define the inductive reactance $X_L$ to be

$$X_L = \omega L \tag{26-56}$$

In terms of this quantity, the equation for the current can be written

$$i = -\frac{V_0}{X_L} \cos(\omega t) = \frac{V_0}{X_L} \sin(\omega t - 90^\circ) \tag{26-57}$$

Thus the current lags the voltage by a phase angle of magnitude $90^\circ$,
because $90^\circ$ has been subtracted from $\omega t$ in the argument of the
sine. The inductive reactance $X_L$ plays a role in Eq. (26-57), which
describes the behavior of an inductive circuit, that is analogous to the role
played by the resistance $R$ in Eq. (26-53), which describes the behavior of a
resistive circuit. It is also analogous to the role played by the capacitive
reactance $X_C$ in Eq. (26-55), which describes the behavior of a capacitive
circuit.

Figure 26-20b shows a moment when the voltage $V$ is represented by
the $y$ component of the phasor $\mathbf{V}_0$. The current $i$ is represented
at the same moment by the $y$ component of a phasor $\mathbf{i}_0$, whose
magnitude is $V_0/X_L$ and which rotates at the same rate as $\mathbf{V}_0$,
but lags it by $90^\circ$. The two phasors maintain this relation as they
rotate counterclockwise.

Fig. 26-20 (a) A purely inductive circuit consisting of an ac source and an
inductor. (b) The phasor diagram illustrating the relation between $V$ and
$i$.

In an ac circuit containing several elements in series, the overall
voltage drop can be found by taking the vector sum of the phasors
representing the voltage drops across the individual elements. This procedure
is justified rigorously by the analytical development of Sec. 26-8. It is
analogous to that used in the derivation of the rule for the equivalent
resistance of resistors in series in a dc circuit. It is illustrated in
Example 26-13.

EXAMPLE 26-13

Construct a phasor diagram for the series LRC circuit of Example 26-12 with
$L = 0.100$ H, $R = 50.0\ \Omega$, $C = 10.0\ \mu$F, and $\omega = 2000$ rad/s.
Use the diagram to find the phase angle $\phi$ of the current $i$ with respect
to the voltage $V$ and compare it with the value measured from Fig. 26-16. Then
use the diagram to find $i_0$, and compare the value thus found with that found
by measurement on Fig. 26-16. Finally, find the maximum voltages $V_{R0}$,
$V_{C0}$, and $V_{L0}$ across the circuit elements.

■ Using polar-coordinate graph paper, you first choose any convenient
instantaneous direction for $\mathbf{i}_0$, as in Fig. 26-21. The phasor
$\mathbf{V}_{R0}$, whose $y$ component represents the instantaneous voltage
across the resistor, is parallel to $\mathbf{i}_0$. Its magnitude is
$i_0 R = i_0 \times 50.0\ \Omega$. You do not yet know $i_0$, so you let its
length in the graph be 1 unit. Thus the length of $\mathbf{V}_{R0}$ can be
drawn as 50.0 units, and you draw it as shown in the figure.

You next construct the phasor $\mathbf{V}_{C0}$. Its magnitude is

$$V_{C0} = i_0 X_C = i_0 \frac{1}{\omega C} = (1\ \text{unit}) \times \frac{1}{2000\ \text{rad/s} \times 10.0 \times 10^{-6}\ \text{F}} = 50.0\ \text{units}$$


Fig. 26-21 Phasor diagram, constructed as described in Example 26-13, for
the series ac circuit discussed in Example 26-12.


Since the current through a capacitor leads the voltage by $90^\circ$, the
voltage lags the current by the same amount. So you draw the phasor
$\mathbf{V}_{C0}$ to be 50.0 units long, in the direction $90^\circ$ behind
(clockwise from) $\mathbf{i}_0$.

The phasor $\mathbf{V}_{L0}$ is constructed in like manner. Its magnitude is

$$V_{L0} = i_0 X_L = i_0 \omega L = (1\ \text{unit}) \times 2000\ \text{rad/s} \times 0.100\ \text{H} = 200\ \text{units}$$

Since the current through an inductor lags the voltage by $90^\circ$, the
voltage leads the current by the same amount. So you draw the phasor
$\mathbf{V}_{L0}$ to be 200 units long, in the direction $90^\circ$ ahead of
(counterclockwise from) $\mathbf{i}_0$.

The phasors $\mathbf{V}_{C0}$ and $\mathbf{V}_{L0}$ are perpendicular to
$\mathbf{V}_{R0}$, and they are antiparallel to one another, as the figure
shows. You can now construct the net reactive voltage phasor
$\mathbf{V}_{X0}$, defined by the equation

$$\mathbf{V}_{X0} = \mathbf{V}_{L0} + \mathbf{V}_{C0}$$

You do this by adding $\mathbf{V}_{C0}$ and $\mathbf{V}_{L0}$ vectorially. The
phasor $\mathbf{V}_{X0}$ is shown in Fig. 26-21. You can now use the vector
addition rule to add $\mathbf{V}_{X0}$ and $\mathbf{V}_{R0}$. This final sum is
the vector sum of the voltage phasors for the entire circuit. Its $y$
component is the total instantaneous voltage $V$ across the resistor, the
capacitor, and the inductor and hence also the instantaneous voltage of the ac
source. Thus the vector sum itself is the phasor $\mathbf{V}_0$.




The angle from $V_0$ to $i_0$ is the phase angle $\phi$ of the current with respect to the
voltage. Measuring with a protractor shows its magnitude to be $|\phi| = 72°$. Measurement
of the phase angle for the same circuit from the graph of the numerical calculation,
Fig. 26-16, yields a magnitude $|\phi| \approx 70°$. And both Fig. 26-16 and Fig. 26-21
agree that the current lags the voltage. So we have $\phi \approx -72°$ for both.

To find $i_0$ from Fig. 26-21, note first that the phasor $V_0$ must represent a voltage
of amplitude 250 V. Since $V_0$ and $V_{R0}$ are drawn in the same (though arbitrary)
units, you can write

$$V_{R0} = V_0 \frac{V_{R0}}{V_0}$$

where $V_{R0}/V_0$ is the ratio of the lengths of the two phasors. Measuring $V_{R0}$ and $V_0$
directly from Fig. 26-21 gives

$$V_{R0} = 250\ \text{V} \times \frac{50.0\ \text{units}}{157\ \text{units}} = 79.6\ \text{V}$$

Thus you have

$$i_0 = \frac{V_{R0}}{R} = \frac{79.6\ \text{V}}{50.0\ \Omega} = 1.59\ \text{A}$$


The  corresponding  steady-state  value  from  Fig.  26-16  is  i0  = 1.63  A.  The  two 
values  of  i0  agree  within  about  3 percent;  better  agreement  between  approximate 
numerical  calculation  and  graphical  construction  cannot  be  expected. 

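The voltages across the circuit components L and C can be found by the same
method used to find $V_{R0}$. You have

$$V_{L0} = V_0 \frac{V_{L0}}{V_0} = 250\ \text{V} \times \frac{200\ \text{units}}{157\ \text{units}} = 319\ \text{V}$$

$$V_{C0} = V_0 \frac{V_{C0}}{V_0} = 250\ \text{V} \times \frac{50.0\ \text{units}}{157\ \text{units}} = 79.6\ \text{V}$$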


Note  that  the  instantaneous  voltage  across  the  inductor  can  be  larger  than  the  peak 
output  voltage  of  the  ac  source!  Is  this  true  as  well  of  the  instantaneous  voltage 
across  the  capacitor?  Across  the  resistor? 
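The graphical construction of Example 26-13 can be cross-checked numerically. The sketch below is not part of the original example: it assumes that each phasor can be represented by a complex number, with the current phasor taken as the 1-unit reference along the real axis, and the function name `phasor_solution` is purely illustrative.

```python
import cmath
import math

def phasor_solution(V0=250.0, R=50.0, L=0.100, C=10.0e-6, omega=2000.0):
    """Add the three voltage phasors, using the current phasor as reference."""
    current = 1.0 + 0.0j                  # current phasor, 1 unit long
    v_r = R * current                     # parallel to the current
    v_l = 1j * omega * L * current        # 90 degrees ahead of the current
    v_c = -1j / (omega * C) * current     # 90 degrees behind the current
    v_total = v_r + v_l + v_c             # phasor for the source voltage

    volts_per_unit = V0 / abs(v_total)    # scale fixed by the 250 V amplitude
    phi = -cmath.phase(v_total)           # angle from V0 to i0, in radians
    return {
        "units_total": abs(v_total),
        "phi_deg": math.degrees(phi),
        "V_R0": volts_per_unit * abs(v_r),
        "V_L0": volts_per_unit * abs(v_l),
        "V_C0": volts_per_unit * abs(v_c),
        "i0": volts_per_unit * abs(v_r) / R,
    }

result = phasor_solution()
for name, value in result.items():
    print(f"{name}: {value:.4g}")
```

The computed values (about 158 units, a phase angle near -71.6°, and a current amplitude near 1.58 A) lie close to the ruler-and-protractor readings of 157 units, 72°, and 1.59 A, which is about as well as a graphical construction can be expected to do.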




Fig.  26-22  A parallel  ac  circuit  which 
can  be  analyzed  by  means  of  the  phasor 
method. 


In the parallel ac circuit of Fig. 26-22, the instantaneous voltage must be the
same  across  all  the  components.  How  would  you  use  the  phasor  representation  to 
analyze  this  circuit? 

When the driving frequency $\omega$ in an ac circuit is relatively low, the
capacitive reactance $X_C = 1/\omega C$ is relatively large. On the other hand,
$X_L = \omega L$ is relatively small. Thus at low frequencies the reactance of an ac circuit
is dominated by capacitive effects. As the frequency increases, $X_C$ decreases and
$X_L$ increases. In the phasor diagram, the two oppositely directed reactance
phasors become closer and closer to one another in magnitude. In particular,
consider what happens when the driving frequency $\omega$ has the special
value $\omega = \omega_r$, such that

$$\omega_r L = \frac{1}{\omega_r C}$$


The necessary value of $\omega_r$ can be found by solving this equation, and it is

$$\omega_r = \frac{1}{\sqrt{LC}} \tag{26-58}$$




Fig. 26-23 Phasor diagram for a series ac circuit when the driving frequency
$\omega$ is equal to the resonant frequency $\omega_r$. The capacitive and inductive
reactances are equal in magnitude, so that the instantaneous voltages across the
capacitor and the inductor are always equal in magnitude and opposite in sense
and cancel. Consequently, the instantaneous voltage $V$ of the ac source is equal
in magnitude to $V_R$, the instantaneous voltage across the resistor.


This equation is called the resonance condition, and $\omega_r$ is called the resonant
frequency. When the driving frequency $\omega$ is equal to the resonant
frequency $\omega_r$, the inductive and capacitive reactances in the circuit cancel
each other. Suppose that an ac source is connected across the two terminals
of a “black box” containing an inductor, a resistor, and a capacitor in series,
comprising a series LRC circuit. Suppose also that the driving frequency is
equal to the resonant frequency of this circuit. As far as the voltage across
the black box and the current flowing through it are concerned, it will act as
if it contained only the resistor $R$, in spite of the individual reactances of the
inductor and the capacitor. The instantaneous current appears “ohmic”; it
is given by

$$i = \frac{V_0}{R} \sin(\omega t) = \frac{V}{R}$$

But you can see from the phasor diagram of Fig. 26-23 that the instantaneous
voltages across the inductor and the capacitor can be very large.
They do not appear across the series LRC circuit as a whole because they
are 180° out of phase and always cancel each other exactly. However, such
resonance, if not anticipated, can lead to insulation failure and other
destructive effects.

As the driving frequency increases still further and exceeds the resonant
frequency $\omega_r$, the inductive reactance begins to dominate. At high
frequencies the reactance of an ac circuit is dominated by inductive effects. In Sec. 26-8
we develop an algebraic method for calculating the combined effects of the
inductance, the resistance, and the capacitance of a series ac circuit at any
frequency.
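The crossover from capacitive to inductive behavior described above can be illustrated with a few numbers. The following is a minimal sketch, assuming the component values of the circuit in Example 26-13 (L = 0.100 H, C = 10.0 μF); the helper name `reactances` is an illustrative choice, not something taken from the text.

```python
def reactances(omega, L=0.100, C=10.0e-6):
    """Return (X_L, X_C, X_L - X_C) in ohms for angular frequency omega in rad/s."""
    x_l = omega * L            # inductive reactance grows with frequency
    x_c = 1.0 / (omega * C)    # capacitive reactance shrinks with frequency
    return x_l, x_c, x_l - x_c

for omega in (200.0, 500.0, 1000.0, 2000.0, 5000.0):
    x_l, x_c, net = reactances(omega)
    regime = "capacitive" if net < 0 else "inductive" if net > 0 else "resonant"
    print(f"omega = {omega:6.0f} rad/s  X_L = {x_l:7.1f}  X_C = {x_c:7.1f}  ({regime})")
```

Because the exact cancellation at resonance is sensitive to floating-point rounding, the printed label at 1000 rad/s may read "capacitive" or "inductive" with a difference of order 1e-14 ohm; a tolerance would be needed for a robust classification.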




26-8 ALTERNATING-CURRENT CIRCUITS: ANALYTICAL DESCRIPTION


We continue the development begun in Secs. 26-6 and 26-7. Our goal is to
derive an expression for the steady-state current $i$ in a series ac circuit in
terms of the sinusoidal voltage $V$ which drives it. In Sec. 26-6, we used
Kirchhoff’s loop rule to sum the voltage changes around such a circuit,
consisting of an ac source, an inductor, a resistor, and a capacitor, as shown
in Fig. 26-13. This sum yielded a differential equation. The equation can
be written in the form of Eq. (26-48), which is

$$\frac{dV}{dt} - L \frac{d^2 i}{dt^2} - R \frac{di}{dt} - \frac{i}{C} = 0$$

A completely general analytical solution of this equation for the current
$i$ is rather complicated, because it must describe a rather complicated
current waveform. As you can see from Figs. 26-14 through 26-16, most of
the complicated behavior of the waveform lies in the “buildup,” or transient
part, of the oscillation. While the imposed voltage $V$ is represented by the
simple sinusoidal function given by Eq. (26-49), $V = V_0 \cos(\omega t)$, the
magnitude of the current begins to build up from zero in an exponential fashion,
then “bends over” into a quasi-sinusoidal oscillation whose amplitude is
variable, or even apparently erratic, and finally “settles down” into the
regular sinusoidal oscillation of the steady state.

In this section we solve the equation for the current only for the steady
state. This makes possible a very considerable simplification, since the
waveform to be described is quite simple. As usual, we begin the process of
analytical solution of a differential equation by guessing at the answer.
Fortunately, this guess is very easy to make, because we already know from
Figs. 26-14 through 26-16 that in the steady state the current $i$ behaves in a
sinusoidal manner. It is especially clear in Fig. 26-14 (although it can be
seen as well in Figs. 26-15 and 26-16) that the function describing the
steady-state current $i$ differs from that describing the driving voltage $V$ only in scale and in
phase. The angular frequencies of the two functions have the same value $\omega$. We
therefore write the solution for $i$ in the form

$$i = i_0 \cos(\omega t + \phi) \tag{26-59}$$

There are two unknown quantities in this equation, which we must now
evaluate. The first is the current amplitude $i_0$, which must be related to the
amplitude $V_0$ of the driving voltage. The other is the phase angle $\phi$, which
describes the phase difference between the voltage $V$ and the current $i$.
When the value of $\phi$ is positive, the current leads the voltage; when the
value of $\phi$ is negative, the current lags the voltage.


Solving the differential equation displayed at the beginning of this section
requires a “plugging in” of the values of the derivatives $dV/dt$, $d^2i/dt^2$,
and $di/dt$ and of the value of $i$. Since we know $V$ and $i$, we can evaluate their
derivatives. Beginning with the value of $i$ given by Eq. (26-59), we have for
the first derivative $di/dt$ the value

$$\frac{di}{dt} = -\omega i_0 \sin(\omega t + \phi) \tag{26-60a}$$

And taking the derivative with respect to time $t$ of both sides of Eq. (26-60a)
gives us for the second derivative $d^2i/dt^2$ the value

$$\frac{d^2 i}{dt^2} = -\omega^2 i_0 \cos(\omega t + \phi) \tag{26-60b}$$




To obtain the value of $dV/dt$, we begin with the expression for $V$ given
by Eq. (26-49), which is

$$V = V_0 \cos(\omega t)$$

Taking the derivative with respect to time of both sides of this equation
yields

$$\frac{dV}{dt} = -\omega V_0 \sin(\omega t) \tag{26-61}$$

Substituting the values given by Eqs. (26-59), (26-60a), (26-60b), and
(26-61) into the differential equation at the beginning of this section, we
have

$$-\omega V_0 \sin(\omega t) + L\omega^2 i_0 \cos(\omega t + \phi) + R\omega i_0 \sin(\omega t + \phi) - \frac{1}{C} i_0 \cos(\omega t + \phi) = 0 \tag{26-62}$$


We next use the fact that the solution given by Eq. (26-59) is restricted to
the steady state to eliminate the quantity $i_0$ from Eq. (26-62). Since in the
steady state $i_0$ is a constant, it can be set equal to the constant $V_0$ divided by
some other constant, which we call $Z$. We thus have, by definition,

$$i_0 \equiv \frac{V_0}{Z} \tag{26-63}$$


This equation looks superficially like Ohm’s law. However, we must expect
that the quantity $Z$, which relates $i_0$ to $V_0$, will depend not only on the
resistance $R$, but also on the inductance $L$ and the capacitance $C$, which play a
role in determining the flow of current through the circuit. We must expect,
moreover, that the angular frequency $\omega$ will appear in the expression
giving the value of $Z$, since the examples in the previous two sections made
it evident that the relation between $i_0$ and $V_0$ depends on $\omega$.

Substituting the value of $i_0$ given by Eq. (26-63) into Eq. (26-62) and
then multiplying through by $-1$ yield

$$\omega V_0 \sin(\omega t) - L\omega^2 \frac{V_0}{Z} \cos(\omega t + \phi) - R\omega \frac{V_0}{Z} \sin(\omega t + \phi) + \frac{1}{C} \frac{V_0}{Z} \cos(\omega t + \phi) = 0$$

We now simplify this equation in three steps. First, we multiply both sides
of the equation by the quantity $Z/\omega V_0$ to obtain

$$Z \sin(\omega t) - \omega L \cos(\omega t + \phi) - R \sin(\omega t + \phi) + \frac{1}{\omega C} \cos(\omega t + \phi) = 0 \tag{26-64a}$$

Next, note that the quantity $\omega L$ in the second term on the left side of this
equation is the inductive reactance $X_L$ defined by Eq. (26-56). Likewise, the
quantity $1/\omega C$ in the fourth term is the capacitive reactance $X_C$ defined by
Eq. (26-54). So Eq. (26-64a) can be written

$$Z \sin(\omega t) - X_L \cos(\omega t + \phi) - R \sin(\omega t + \phi) + X_C \cos(\omega t + \phi) = 0 \tag{26-64b}$$

[For reasons of mathematical convenience, we have chosen to express the
voltage in the form $V = V_0 \cos(\omega t)$, as we did in the numerical treatment of
Sec. 26-6, rather than in the form $V = V_0 \sin(\omega t)$ used in the phasor
treatment of Sec. 26-7. In spite of this phase difference, can you see a
connection between the second, third, and fourth terms in Eq. (26-64b) and the
phasors $V_{L0}$, $V_{R0}$, and $V_{C0}$ discussed in Example 26-13?]

The terms in Eq. (26-64b) which involve $X_L$ and $X_C$ have the common
factor $\cos(\omega t + \phi)$. Collecting these terms, we have

$$Z \sin(\omega t) - R \sin(\omega t + \phi) - (X_L - X_C) \cos(\omega t + \phi) = 0 \tag{26-65}$$

We define the reactance $X$ of the series ac circuit to be the difference

$$X \equiv X_L - X_C \tag{26-66a}$$

or

$$X \equiv \omega L - \frac{1}{\omega C} \tag{26-66b}$$

In terms of the reactance, Eq. (26-65) can be written

$$Z \sin(\omega t) - R \sin(\omega t + \phi) - X \cos(\omega t + \phi) = 0 \tag{26-67}$$

The physical significance of the reactance $X = X_L - X_C$, which is the difference of
the inductive and capacitive reactances in the series ac circuit, is made evident by
the way in which the corresponding phasors $V_{L0}$ and $V_{C0}$ add. These two phasors
are always oppositely directed, with the former leading the resistive phasor $V_{R0}$ by
90° and the latter lagging it by 90°. According to the discussion in Example 26-13,
we have $V_{L0} = i_0 X_L$ and $V_{C0} = i_0 X_C$. So their sum, which is the net reactive voltage
phasor $V_{X0} = V_{L0} + V_{C0}$, is proportional to the difference $X_L - X_C$.


The next step is to apply the trigonometric identities

$$\sin(\omega t + \phi) = \sin(\omega t) \cos\phi + \cos(\omega t) \sin\phi$$

and

$$\cos(\omega t + \phi) = \cos(\omega t) \cos\phi - \sin(\omega t) \sin\phi$$

Making these substitutions into Eq. (26-67), we obtain

$$Z \sin(\omega t) - R \sin(\omega t) \cos\phi - R \cos(\omega t) \sin\phi - X \cos(\omega t) \cos\phi + X \sin(\omega t) \sin\phi = 0$$

Every term on the left side of this equation contains a factor $\sin(\omega t)$ or a
factor $\cos(\omega t)$. Collecting the terms having the same factor, we have

$$[Z - R \cos\phi + X \sin\phi] \sin(\omega t) - [R \sin\phi + X \cos\phi] \cos(\omega t) = 0 \tag{26-68}$$

In this equation, all the terms in brackets are constants while the
factors $\cos(\omega t)$ and $\sin(\omega t)$ are functions of the independent variable $t$. The
equation is thus of the form $A \sin(\omega t) - B \cos(\omega t) = 0$, where $A$ and $B$ are
constants. Now, it is always possible to find some particular value of $t$ for
which the left side of the equation is equal to zero, regardless of the values
of $A$ and $B$. But we require that the left side of the equation be equal to zero
for all values of $t$. The only way to satisfy this condition is to have both $A = 0$ and
$B = 0$. Thus Eq. (26-68) yields the two separate conditions

$$Z - R \cos\phi + X \sin\phi = 0 \tag{26-69a}$$


and

$$R \sin\phi + X \cos\phi = 0 \tag{26-69b}$$
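One way to see why the requirement that $A \sin(\omega t) - B \cos(\omega t) = 0$ hold at all times forces both constants to vanish is to evaluate the expression at two convenient instants. This short check is offered as a supplementary sketch and is not part of the original derivation.

```latex
\begin{aligned}
t = 0:\quad & A \sin(0) - B \cos(0) = -B = 0 \;\Longrightarrow\; B = 0, \\
\omega t = \tfrac{\pi}{2}:\quad & A \sin\!\left(\tfrac{\pi}{2}\right) - B \cos\!\left(\tfrac{\pi}{2}\right) = A = 0 \;\Longrightarrow\; A = 0.
\end{aligned}
```

Conversely, if $A = 0$ and $B = 0$, the expression vanishes identically, so the two conditions are both necessary and sufficient.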


We now have a pair of equations containing the two unknowns $Z$ and
$\phi$. The first of these quantities determines the relation $i_0 = V_0/Z$
between the amplitude $i_0$ of the current and the amplitude $V_0$ of the
driving voltage. The second determines the phase relationship between the
current $i$ and the voltage $V$. Let us solve Eq. (26-69b) for $\phi$ and then
substitute the value obtained back into Eq. (26-69a) to obtain $Z$. From Eq.
(26-69b), we have $R \sin\phi = -X \cos\phi$. This yields


$$\tan\phi = -\frac{X}{R} \tag{26-70a}$$

or

$$\phi = \tan^{-1}\left(-\frac{X}{R}\right) \tag{26-70b}$$

Fig. 26-24 Obtaining the values of $\sin\phi$ and $\cos\phi$, given that
$\tan\phi = -X/R$. For simplicity we assume that the reactance $X$ has a negative
value, so that $-X$ is positive. When $\phi$ is one of the angles of a right triangle,
as shown, $\tan\phi$ is defined to be the ratio of the length of the side opposite
the angle $\phi$ to the length of the side adjacent to the angle. Thus those two
sides must have the lengths $-X$ and $R$ shown, as measured in arbitrary units.
Using the pythagorean theorem, we find the length of the hypotenuse to be
$(R^2 + X^2)^{1/2}$. Since $\sin\phi$ is defined as the ratio of the length of the side
of the triangle opposite the angle $\phi$ to the length of the hypotenuse, we have
$\sin\phi = -X/(R^2 + X^2)^{1/2}$. And since $\cos\phi$ is defined as the ratio of the
length of the side of the triangle adjacent to the angle $\phi$ to the length of
the hypotenuse, we have $\cos\phi = R/(R^2 + X^2)^{1/2}$. Can you explain why
these expressions for $\sin\phi$ and $\cos\phi$ apply no matter whether the value of $X$
is negative or positive?


In order to solve Eq. (26-69a), we need expressions for $\sin\phi$ and $\cos\phi$.
These are obtained from Eq. (26-70a) in Fig. 26-24 and its caption. The
desired expressions are

$$\sin\phi = \frac{-X}{(R^2 + X^2)^{1/2}} \tag{26-71a}$$

and

$$\cos\phi = \frac{R}{(R^2 + X^2)^{1/2}} \tag{26-71b}$$

We substitute Eqs. (26-71a) and (26-71b) into Eq. (26-69a), which we first
rewrite in the form

$$Z = R \cos\phi - X \sin\phi$$


This substitution gives us

$$Z = \frac{R^2}{(R^2 + X^2)^{1/2}} + \frac{X^2}{(R^2 + X^2)^{1/2}} = \frac{R^2 + X^2}{(R^2 + X^2)^{1/2}}$$

or

$$Z = (R^2 + X^2)^{1/2} \tag{26-72a}$$

The quantity $Z$ can also be expressed directly in terms of the phase angle $\phi$.
To do this, note that Eq. (26-71b) can be rewritten in the form

$$(R^2 + X^2)^{1/2} = \frac{R}{\cos\phi}$$

Comparing this equation with Eq. (26-72a) yields the immediate result

$$Z = \frac{R}{\cos\phi} \tag{26-72b}$$

Equation (26-70b) and either Eq. (26-72a) or (26-72b) give us the values
of the unknown quantities in our “guessed” solution, Eq. (26-59), to the
equation for the current in a series ac circuit. The guess, given by Eq.
(26-59), was $i = i_0 \cos(\omega t + \phi)$. If we use the definition $i_0 = V_0/Z$ given by
Eq. (26-63), this becomes


$$i = \frac{V_0}{Z} \cos(\omega t + \phi) \tag{26-73a}$$

or, by using Eq. (26-72b),

$$i = \frac{V_0 \cos\phi}{R} \cos(\omega t + \phi) \tag{26-73b}$$
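The steady-state solution can be checked numerically by substituting it back into the differential equation $dV/dt - L\,d^2i/dt^2 - R\,di/dt - i/C = 0$ and confirming that the left side vanishes at arbitrary times. The sketch below assumes the circuit values used in Example 26-13; the function names `steady_state` and `residual` are illustrative and do not come from the text.

```python
import math

def steady_state(V0, R, L, C, omega):
    """Return (i0, phi, Z) from Eqs. (26-63), (26-70b), and (26-72a)."""
    X = omega * L - 1.0 / (omega * C)     # reactance, Eq. (26-66b)
    Z = math.hypot(R, X)                  # impedance, Eq. (26-72a)
    phi = math.atan2(-X, R)               # phase angle, tan(phi) = -X/R
    return V0 / Z, phi, Z

def residual(t, V0, R, L, C, omega):
    """Left side of dV/dt - L d2i/dt2 - R di/dt - i/C = 0 at time t."""
    i0, phi, _ = steady_state(V0, R, L, C, omega)
    arg = omega * t + phi
    i = i0 * math.cos(arg)                        # Eq. (26-59)
    di_dt = -omega * i0 * math.sin(arg)           # Eq. (26-60a)
    d2i_dt2 = -omega ** 2 * i0 * math.cos(arg)    # Eq. (26-60b)
    dV_dt = -omega * V0 * math.sin(omega * t)     # Eq. (26-61)
    return dV_dt - L * d2i_dt2 - R * di_dt - i / C

params = dict(V0=250.0, R=50.0, L=0.100, C=10.0e-6, omega=2000.0)
scale = params["omega"] * params["V0"]            # size of a typical term
worst = max(abs(residual(k * 1.0e-4, **params)) for k in range(100))
print(f"largest residual relative to omega*V0: {worst / scale:.2e}")
```

The residual is not exactly zero because of floating-point rounding, so the comparison is made relative to the size of the individual terms, which are of order $\omega V_0$.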


What is the physical significance of Eq. (26-73a)? The term $\cos(\omega t + \phi)$
on the right side of the equation expresses the fact that the current $i$ in the
series ac circuit oscillates with the angular frequency of the driving voltage
$V = V_0 \cos(\omega t)$, but in general has a different phase. If the reactance $X =
X_L - X_C$ has a positive value, then the phase angle $\phi = \tan^{-1}(-X/R)$ of the
current with respect to the voltage has a negative value, and the current
lags the driving voltage. If $X$ has a negative value, then $\phi$ has a positive
value and the current leads the driving voltage. The quantity $V_0/Z$
determines the current amplitude according to Eq. (26-63), $i_0 = V_0/Z$. As we
have already noted, this equation bears some resemblance to the equation
which would hold if the circuit contained only an ac source and a resistor.
That equation is Ohm's law, in the form $i_0 = V_0/R$.

The quantity $Z$ is called the impedance. As far as the maximum values
are concerned, the impedance $Z$ plays the same role in ac circuits that the
resistance $R$ plays in dc circuits. But do not forget this restriction: it is not
true that at any instant $i = V/Z$. Indeed, unless $\phi = 0$ (so that $Z = R$), the
current and the voltage do not attain their maximum values simultaneously.

In spite of this limitation, the impedance $Z$ is very useful in understanding
the behavior of ac circuits. To see this, we express $Z$ directly in
terms of the quantities $R$, $L$, $C$, and $\omega$. We substitute into Eq. (26-72a) the
definition of $X$ given by Eq. (26-66b), $X = \omega L - 1/\omega C$. We have

$$Z = \left[ R^2 + \left( \omega L - \frac{1}{\omega C} \right)^2 \right]^{1/2} \tag{26-74}$$


The impedance $Z$ is the square root of the sum of the squares of two terms.
The first is the resistance $R$, and the second is the reactance $X$. Like the
resistance, the reactance and the impedance are both expressed in ohms.

To see that this is the case, consider, for example, the term $\omega L$. The inductance
$L$ is expressed in henrys and, according to Eq. (25-30), 1 H = 1 V·s/A. The units of
angular frequency $\omega$ are 1/s, so the units of $\omega L$ are V·s/(A·s) = V/A = Ω. Can you
use a similar argument to show that the units of the term $1/\omega C$ can also be reduced
to ohms?

While the reactance $X$ is expressed in ohms, it is different from resistance
in several important ways. Its value can be either positive or negative.
But since it always contributes to the impedance in the form $X^2$, the
presence of a nonzero reactance in a circuit can never reduce the impedance
below the minimum value $Z = R$. Unlike the resistance, the reactance
depends on the angular frequency $\omega$.

We can now consider the phenomenon of resonance for the current in
the series ac circuit from a quantitative point of view. According to Eq.
(26-63), $i_0 = V_0/Z$, the current amplitude $i_0$ will attain the greatest
possible value for a fixed value of $V_0$ when the impedance has its minimum
possible value $Z = R$. For a particular ac circuit in which $R$, $L$, and $C$ are
fixed, the impedance can be varied by varying the driving frequency $\omega$.
The impedance will attain its minimum value at the frequency at which the
reactance $X = \omega L - 1/\omega C$ is zero, as can be seen by inspecting Eq. (26-74) or
from Fig. 26-23. This resonance condition is met when the driving frequency is
$\omega = \omega_r$, where the resonant frequency $\omega_r$ satisfies Eq. (26-58),

$$\omega_r = \frac{1}{\sqrt{LC}}$$

Thus the resonant frequency for the current in a series ac circuit is the same as the
natural frequency for the ideal, resistanceless LC circuit, given by Eq. (26-23),
$\omega_0 = 1/\sqrt{LC}$. We note here without proof that the same resonance
condition holds for other types of ac circuits as well. The value of the resonant
frequency, unlike that of the natural frequency for free, undriven
oscillation, is independent of the resistance of the circuit.


EXAMPLE 26-14

Find the resonant frequency $\omega_r$ for the current in the series ac circuit discussed in
Examples 26-10 through 26-12.

■ In this circuit the inductance has the value $L$ = 0.100 H, and the capacitance
has the value $C$ = 10.0 μF. Using Eq. (26-58), you have

$$\omega_r = \frac{1}{(0.100\ \text{H} \times 10.0 \times 10^{-6}\ \text{F})^{1/2}} = 1000\ \text{rad/s}$$


In Example 26-9, the natural frequency $\omega_0$ was calculated for a series LRC
circuit identical to the series ac circuit discussed in Examples 26-10 through 26-12
and 26-14, except that it does not contain an ac voltage source and therefore
oscillates freely, that is, “naturally.” The natural frequency is given by Eq. (26-43),
$\omega_0 = (1/LC - R^2/4L^2)^{1/2}$. Its value does depend on the resistance; for this particular
circuit, it was found to be $\omega_0$ = 968 rad/s. There is only a 3 percent difference
between $\omega_0$ = 968 rad/s and $\omega_r$ = 1000 rad/s in this case, because the system is
lightly damped. But the difference between $\omega_0$ and $\omega_r$ increases rapidly if the
damping is made larger by increasing the resistance $R$.
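The two frequencies compared in Example 26-14 follow directly from Eqs. (26-58) and (26-43). The short sketch below reproduces them for the circuit values quoted in the example; the function names are illustrative choices, not part of the text.

```python
import math

def resonant_frequency(L, C):
    """Resonant angular frequency of the driven circuit, Eq. (26-58)."""
    return 1.0 / math.sqrt(L * C)

def natural_frequency(L, C, R):
    """Natural angular frequency of the freely oscillating LRC circuit, Eq. (26-43)."""
    return math.sqrt(1.0 / (L * C) - R ** 2 / (4.0 * L ** 2))

L, C, R = 0.100, 10.0e-6, 50.0
omega_r = resonant_frequency(L, C)
omega_0 = natural_frequency(L, C, R)
difference = 100.0 * (omega_r - omega_0) / omega_r
print(f"omega_r = {omega_r:.0f} rad/s, omega_0 = {omega_0:.0f} rad/s, "
      f"difference = {difference:.1f} percent")
```

Raising `R` in this sketch shows how quickly the two frequencies separate as the damping grows; note that `natural_frequency` raises a `ValueError` once $R^2/4L^2$ exceeds $1/LC$, where Eq. (26-43) no longer describes an oscillation.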


When the driving frequency $\omega$ of the series ac circuit is varied either
upward or downward from the value $\omega_r$, for which $\omega L = 1/\omega C$, the
impedance $Z = [R^2 + (\omega L - 1/\omega C)^2]^{1/2}$ increases. Consequently, the
value of $i_0 = V_0/Z$ decreases. This is illustrated in Fig. 26-25, which
shows a family of resonance curves. The details of these curves are
discussed in the caption to the figure. The circuits represented by these curves
differ only in the value of the resistance $R$. The height of each curve at
resonance is given by the relation $i_0 = V_0/R$, and these heights, of course,
vary as the value of $R$ varies. Equally significant, however, is the way in
which the sharpness of the curves increases with decreasing $R$.
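The shape of a resonance curve can be explored numerically from $i_0 = V_0/Z$ and Eq. (26-74). The sketch below assumes the values quoted for Fig. 26-25 ($V_0$ = 250 V, $L$ = 0.100 H, $C$ = 10.0 μF). The half-maximum frequencies are found from the condition $Z = 2R$, that is, $|X| = \sqrt{3}\,R$; this closed-form step is an addition made here for illustration, and both function names are hypothetical.

```python
import math

def current_amplitude(omega, V0=250.0, R=50.0, L=0.100, C=10.0e-6):
    """Current amplitude i0 = V0 / Z, with Z from Eq. (26-74)."""
    X = omega * L - 1.0 / (omega * C)
    return V0 / math.hypot(R, X)

def half_max_frequencies(R, L=0.100, C=10.0e-6):
    """Frequencies at which i0 is half its peak value, i.e. where |X| = sqrt(3) R.

    Solving L*omega**2 -/+ s*omega - 1/C = 0 with s = sqrt(3)*R for omega > 0.
    """
    s = math.sqrt(3.0) * R
    root = math.sqrt(s * s + 4.0 * L / C)
    lower = (-s + root) / (2.0 * L)       # branch with X = -sqrt(3) R
    upper = (s + root) / (2.0 * L)        # branch with X = +sqrt(3) R
    return lower, upper

for R in (25.0, 50.0, 250.0):
    peak = current_amplitude(1000.0, R=R)
    lower, upper = half_max_frequencies(R)
    print(f"R = {R:5.1f} ohm: peak i0 = {peak:5.2f} A, "
          f"half-width at half-maximum = {(upper - lower) / 2.0:6.1f} rad/s")
```

In this sketch the half-width works out to $\sqrt{3}\,R/2L$, so it grows in direct proportion to the resistance, which is the quantitative content of the statement that the curves sharpen as $R$ decreases.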

[Figure 26-25: current amplitude $i_0$ (in A) plotted against driving frequency $\omega$ (in rad/s), from 0 to 2000 rad/s.]

Fig. 26-25 A family of resonance curves for the current
in a series ac circuit. Each curve has been plotted by using
the equation $i_0 = V_0/Z$, with the impedance $Z$ given by
Eq. (26-74). For all curves, $V_0$ = 250 V, $L$ = 0.100 H,
and $C$ = 10.0 μF. The middle curve represents the circuit
discussed in Examples 26-10 through 26-12, for
which $R$ = 50.0 Ω. For the upper curve, $R$ = 25.0 Ω; for
the lower curve, $R$ = 250 Ω. All three curves display the
same resonant frequency for current, $\omega_r$ = 1000 rad/s.
However, the resonance is “sharpest” for the smallest
resistance. The sharpness of a resonance curve is one of its
most significant characteristics. It can be conveniently
characterized by its so-called “half-width at half-maximum.”
This quantity $\Delta\omega$ is half the width of the
curve where its value is one-half the peak value which it
attains at $\omega = \omega_r$. The half-width at half-maximum of the
upper curve is denoted by the pair of arrows at $i_0$ = 5.00
A. The steady-state values for $i_0$ found by numerical
calculation in Examples 26-10 through 26-12 are shown for
comparison with the curves drawn on the basis of the
analytical treatment of this section. These values were
taken from the graphs of Figs. 26-14 through 26-16. The
slight discrepancy in the case of the value read from Fig.
26-16 is due to the fact that the oscillation had not quite
“settled down” to the steady state within the range of the
plot.

The sharpness of the resonance curve of an ac circuit is related to a figure
of merit called the quality factor, or Q factor. The sharper the resonance,
the larger the Q factor. The Q factor is defined to be $2\pi$ times the
ratio of the energy stored in the circuit to the energy dissipated per cycle of
oscillation (for example, in the resistor of a series LRC circuit). It is
completely analogous to the Q factor defined for oscillating mechanical systems
in Sec. 8-5. In a mechanical system, if the frictional force acting on the
oscillating body is proportional to the velocity $dx/dt$ of the body, it can be
written in the form $-r\,dx/dt$, where $r$ is the frictional drag coefficient. An
expression for the mechanical Q factor, Eq. (8-42b), was derived in terms of
$r$, the mass $m$ of the body, and the natural frequency $\omega_0$ of a lightly damped
system. The expression is $Q = m\omega_0/r$. Since for light damping $\omega_0 \approx \omega_r$, the
more simply evaluated resonant frequency, to a good approximation we
can write

$$Q = \frac{m\omega_r}{r} \tag{26-75a}$$
It is possible to derive the expression for the Q factor of an ac circuit in a
manner analogous to that employed for the mechanical Q factor in Sec. 8-5.
Or we can take advantage of the analogies between electrical and mechanical
quantities given in Table 26-1 to write the electrical Q factor directly.
Using those analogies, we have

$$Q = \frac{L\omega_r}{R} \tag{26-75b}$$

The Q factor is proportional to both the inductance and the resonant frequency of
oscillation of the circuit and inversely proportional to the resistance. Particularly in
circuits which oscillate at high frequencies, the Q factor can be very large
compared to the practical maximum of about $10^2$ encountered in mechanical
oscillating systems. Several such circuits are discussed in exercises at the
end of this chapter.
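As a rough check on the electrical Q factor, the sketch below evaluates $Q = L\omega_r/R$ for the three circuits of Fig. 26-25 and compares it with the energy definition ($2\pi$ times stored energy over energy dissipated per cycle). The energy bookkeeping is an assumption made here for illustration: at resonance the stored energy is taken as $\tfrac{1}{2}Li_0^2$ and the average dissipated power as $\tfrac{1}{2}i_0^2R$. The function names are hypothetical.

```python
import math

def q_factor(R, L=0.100, C=10.0e-6):
    """Q factor of a series LRC circuit, Q = L * omega_r / R."""
    omega_r = 1.0 / math.sqrt(L * C)
    return L * omega_r / R

def q_from_energy(R, L=0.100, C=10.0e-6, i0=1.0):
    """Q from its definition: 2*pi * (energy stored) / (energy dissipated per cycle).

    Assumes steady-state driving at resonance with current amplitude i0.
    """
    omega_r = 1.0 / math.sqrt(L * C)
    stored = 0.5 * L * i0 ** 2                   # total stored energy at resonance
    period = 2.0 * math.pi / omega_r
    dissipated = 0.5 * i0 ** 2 * R * period      # average power times one period
    return 2.0 * math.pi * stored / dissipated

for R in (25.0, 50.0, 250.0):
    print(f"R = {R:5.1f} ohm: Q = {q_factor(R):.2f} "
          f"(energy definition gives {q_from_energy(R):.2f})")
```

For these low-frequency circuits the values are small (4, 2, and 0.4 for the three resistances), which is consistent with the broad curves of Fig. 26-25; the large Q factors mentioned in the text arise at much higher resonant frequencies.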

Circuits having large Q factors and thus great sharpness of resonance
have many applications in electronics. A familiar and very typical application
is the tuning circuits of radio and television receivers. The antenna,
whose operation is described in Chap. 27, simultaneously picks up a
number of signals whose frequencies are slightly different from one another.
It thus acts as an ac source which produces a superposition of essentially
sinusoidal voltages of differing frequencies. The tuning circuit is a
resonant circuit having a very sharp response curve, whose resonant
frequency can be adjusted by varying its capacitance, its inductance, or both.
When the circuit is “tuned” so that its resonant frequency is equal to the
frequency of the desired signal, the voltage of that frequency drives a much
larger current through the circuit than do voltages of approximately the
same amplitude but different, undesired, frequencies. It is thus possible to
recover a single desired signal from a jumble of superposed signals.

On the other hand, there are practical situations in which it is important to
make the resonance curve of a system as broad as possible. Consider the springing
system of an automobile, in which the springs play a role analogous to that of the
capacitor in the ac circuit and the mass of the automobile body plays a role
analogous to that of the inductor. If the resonance were sharp, the automobile would
oscillate in an uncomfortable and possibly dangerous manner when it was subjected
to a driving force at the resonant frequency. This could occur, for example, if the
car were driven at a critical speed down a concrete road with evenly spaced,
bumpy joints.

The automobile designer has recourse to two major devices to avoid this
possibility. The first, described in Example 6-9, is to employ shock absorbers, which
play a role analogous to that of the resistor in the series ac circuit. By introducing
sufficient viscous friction, the designer ensures that the resonance curve of the
automobile’s oscillation amplitude as a function of frequency will be broad and low,
rather than sharp and tall. The other device is to design the springing system so
that the resonant frequencies of its various parts (say, the front-wheel suspension
and the rear-wheel suspension) are different. Thus there is no single, sharp
resonant frequency for the entire automobile.

A similar problem exists in the design of stringed musical instruments such
as the violin. If the body of the violin had a single, sharp resonant frequency, the
musical notes produced by vibrating strings whose frequency approximated that
resonant frequency would be much louder than notes elsewhere in the range of
the violin. The means for avoiding this situation are identical to those discussed
in connection with automobile springing systems. The “softness” of the wood
construction introduces a “resistance,” while the very complex shape of the
instrument tends to produce a situation in which various parts of the body have
various resonant frequencies. Nevertheless, poorly designed violins are subject to
“wolf notes.” Certain notes played on such an instrument excite resonant
oscillation in the body of the violin, with a characteristic “howling” sound.


The resonance curves of Fig. 26-25 illustrate the variation of the current
amplitude $i_0$ in the series ac circuit as the impedance $Z$ varies as a result
of a variation in the driving frequency. But as the driving frequency
varies, the phase angle $\phi$ of the current with respect to the voltage varies as
well. The phase angle is given by Eq. (26-70a), $\tan\phi = -X/R$. Writing the
reactance $X$ in terms of the quantities $L$, $C$, and $\omega$ gives us the more explicit
expression

$$\tan\phi = -\frac{\omega L - 1/\omega C}{R} \tag{26-76a}$$

or

$$\phi = \tan^{-1}\left(\frac{1/\omega C - \omega L}{R}\right) \tag{26-76b}$$


In Fig. 26-26, the phase angle is plotted as a function of driving frequency
for the three series ac circuits whose resonance curves are shown in Fig.
26-25. A positive value of $\phi$ means that the current leads the voltage; a
negative value of $\phi$ means that the current lags the voltage. Note that the
current is in phase with the driving voltage at resonance.

Fig. 26-26 Plot of the phase angle $\phi$ given by Eq.
(26-76b) as a function of driving frequency $\omega$ for the
series ac circuits whose resonance curves are shown in
Fig. 26-25. The middle curve (solid line) again
represents the circuit discussed in Examples 26-10 through
26-12, the results for which are plotted in Figs. 26-14
through 26-16. For comparison, the phase angles
read from those three plots are shown as heavy
points. It is difficult to measure the phase angle
accurately on such a plot; this accounts for the fact that
the points do not lie exactly on the solid curve.
[Plot labels: Current lags voltage; Current leads voltage]
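
The sign behavior of the phase angle can be illustrated with a few lines of code. This is a minimal sketch of Eq. (26-76b); the component values and the function name `phase_angle` are assumptions chosen only for illustration.

```python
import math

def phase_angle(r, l, c, omega):
    """Phase angle (radians) of the current relative to the voltage, Eq. (26-76b)."""
    return math.atan((1.0 / (omega * c) - omega * l) / r)

# Illustrative values (assumed): R = 100 ohm, L = 1.0 H, C = 1.0 microfarad.
R_OHM, L_HENRY, C_FARAD = 100.0, 1.0, 1.0e-6
omega_r = 1.0 / math.sqrt(L_HENRY * C_FARAD)   # resonant angular frequency

for factor in (0.5, 1.0, 2.0):
    phi = phase_angle(R_OHM, L_HENRY, C_FARAD, factor * omega_r)
    print(f"omega = {factor:.1f} omega_r: phi = {math.degrees(phi):+.1f} deg")
```

Below resonance the capacitive term dominates and the angle is positive (current leads); above resonance the inductive term dominates and the angle is negative (current lags); at resonance the two terms cancel.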


26-9 POWER IN ALTERNATING-CURRENT CIRCUITS


Alternating-current systems are of the greatest importance in the
transmission of power from one place to another. We consider in this section some
of the quantities significant in the analysis of power transmission.

Average voltages and currents in ac circuits are usually given in terms
of their root-mean-square values. The root mean square of a quantity is
defined to be the square root of the mean (average) value of the square of the
quantity; in particular, the root-mean-square value of the voltage $V$, or rms
voltage $V_{\mathrm{rms}}$, is

$$V_{\mathrm{rms}} \equiv \sqrt{\langle V^2 \rangle}$$


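For a sinusoidal voltage such as $V = V_0\cos(\omega t)$, the value of $V_{\mathrm{rms}}$ is given by
an integral over one period $T = 2\pi/\omega$:

$$V_{\mathrm{rms}} = \left[\frac{\int_0^T V^2\,dt}{\int_0^T dt}\right]^{1/2} = V_0\left[\frac{\int_0^{2\pi/\omega}\cos^2(\omega t)\,dt}{2\pi/\omega}\right]^{1/2} \tag{26-77}$$

To carry out the remaining integration, make the substitutions $\theta = \omega t$ and
$d\theta = \omega\,dt$. In terms of this new variable,
the upper limit of the integral becomes $2\pi$. The equation thus becomes

$$V_{\mathrm{rms}} = V_0\left[\frac{(1/\omega)\int_0^{2\pi}\cos^2\theta\,d\theta}{2\pi/\omega}\right]^{1/2}$$

The value of the integral is $\pi$. This leads to the result

$$V_{\mathrm{rms}} = V_0\left[\frac{\pi/\omega}{2\pi/\omega}\right]^{1/2} = \frac{V_0}{\sqrt{2}} \tag{26-78}$$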




The rms voltage is invariably quoted in specifying the voltage of a power
line. (You will soon see that the amount of energy sold is most conveniently
expressed in terms of the rms voltage.) The peak voltage $V_0$, or maximum
instantaneous voltage, in a nominal 110-V power line is thus

$$V_0 = \sqrt{2}\,V_{\mathrm{rms}} = 156\ \mathrm{V}$$

Another voltage sometimes used is the peak-to-peak voltage $V_{\mathrm{pp}}$. This is
the difference between the extremal values of the voltage, and it is twice the
peak voltage. For the 110-V line,

$$V_{\mathrm{pp}} = 2V_0 = 312\ \mathrm{V}$$

The  manufacturers  of  high-fidelity  equipment  sometimes  specify  their 
products  in  terms  of  peak-to-peak  rather  than  root-mean-square  values. 
You  can  see  why! 
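
The relation between rms, peak, and peak-to-peak values can be verified by evaluating the defining integral numerically. The following is a minimal sketch using a midpoint sum over one period; the 60-Hz frequency and the helper name `rms_over_period` are assumptions made for illustration.

```python
import math

def rms_over_period(signal, period, steps=10_000):
    """Square root of the mean of signal(t)**2 over one period (midpoint rule)."""
    dt = period / steps
    mean_square = sum(signal((k + 0.5) * dt) ** 2 for k in range(steps)) * dt / period
    return math.sqrt(mean_square)

V_RMS_NOMINAL = 110.0                     # nominal line voltage from the text
V0 = math.sqrt(2.0) * V_RMS_NOMINAL       # peak voltage
OMEGA = 2.0 * math.pi * 60.0              # assumed 60-Hz line
PERIOD = 2.0 * math.pi / OMEGA

def voltage(t):
    return V0 * math.cos(OMEGA * t)

v_rms = rms_over_period(voltage, PERIOD)
v_pp = 2.0 * V0
print(f"peak = {V0:.1f} V, rms = {v_rms:.1f} V, peak-to-peak = {v_pp:.1f} V")
```

Carried to more figures, the peak-to-peak value is about 311 V; the 312 V quoted above comes from doubling the rounded peak value of 156 V.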


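The root-mean-square current in an ac circuit is defined in a
completely analogous way. If the current is sinusoidal and the phase angle is $\phi$,
we have $i = i_0\cos(\omega t + \phi)$, and $i_{\mathrm{rms}}$ is given by the integral

$$i_{\mathrm{rms}} = \left[\frac{\int_0^T i^2\,dt}{\int_0^T dt}\right]^{1/2} = i_0\left[\frac{\int_0^{2\pi/\omega}\cos^2(\omega t + \phi)\,dt}{2\pi/\omega}\right]^{1/2}$$

The same substitution used in calculating $V_{\mathrm{rms}}$ gives

$$i_{\mathrm{rms}} = i_0\left[\frac{(1/\omega)\int_0^{2\pi}\cos^2(\theta + \phi)\,d\theta}{2\pi/\omega}\right]^{1/2} = i_0\left[\frac{\pi/\omega}{2\pi/\omega}\right]^{1/2}$$

or

$$i_{\mathrm{rms}} = \frac{i_0}{\sqrt{2}} \tag{26-79}$$

Both $i_0$ and $i_{\mathrm{rms}}$ are always positive.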


The  instantaneous  power  input  to  an  ac  circuit  is  given  by  Joule’s  law, 

P = Vi 

The average power is found by averaging this quantity over one complete
cycle. For sinusoidal voltage and current this gives

$$\langle P \rangle = \frac{1}{T}\int_0^T Vi\,dt = \frac{V_0 i_0}{T}\int_0^T \cos(\omega t)\cos(\omega t + \phi)\,dt$$

The substitutions $\theta = \omega t$ and $d\theta = \omega\,dt$ are again made, yielding

$$\langle P \rangle = \frac{V_0 i_0}{\omega(2\pi/\omega)}\int_0^{2\pi}\cos\theta\cos(\theta + \phi)\,d\theta$$

The integrand can be written

$$\cos\theta\cos(\theta + \phi) = \cos\theta\,(\cos\theta\cos\phi - \sin\theta\sin\phi)$$

So the expression for $\langle P \rangle$ becomes

$$\langle P \rangle = \frac{V_0 i_0}{2\pi}\left(\cos\phi\int_0^{2\pi}\cos^2\theta\,d\theta - \sin\phi\int_0^{2\pi}\cos\theta\sin\theta\,d\theta\right)$$




The first integral has the value $\pi$, and the second one the value 0. Thus the
average power is

$$\langle P \rangle = \frac{V_0 i_0}{2}\cos\phi \tag{26-80a}$$

It is customary to write this result in terms of the rms values $V_{\mathrm{rms}}$ and $i_{\mathrm{rms}}$.
This gives

$$\langle P \rangle = V_{\mathrm{rms}}\,i_{\mathrm{rms}}\cos\phi \tag{26-80b}$$

The quantity $\cos\phi$, the cosine of the phase angle, is called the power factor
of the circuit. [Note that the value of the power factor is always positive,
because the phase angle of the current with respect to the voltage can never
be greater than 90° or less than −90°. (Why?) Therefore $\cos\phi$ always has a
positive value. Indeed, as far as the power factor is concerned, it does not
matter whether the current leads or lags the voltage. Only the magnitude
$|\phi|$ of the phase angle between them is significant.]

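We can find alternative expressions for the average power $\langle P \rangle$. To do
so, we substitute Eq. (26-72b), $Z = R/\cos\phi$, into Eq. (26-63), $i_0 = V_0/Z$, and
obtain

$$i_0 = \frac{V_0\cos\phi}{R}$$

Using Eqs. (26-78) and (26-79), we get

$$i_{\mathrm{rms}} = \frac{V_{\mathrm{rms}}\cos\phi}{R}$$

Thus Eq. (26-80b) shows that the average power can be written

$$\langle P \rangle = V_{\mathrm{rms}}\,\frac{V_{\mathrm{rms}}\cos\phi}{R}\,\cos\phi = \frac{V_{\mathrm{rms}}^2\cos^2\phi}{R} \tag{26-80c}$$

or

$$\langle P \rangle = \frac{i_{\mathrm{rms}}R}{\cos\phi}\,i_{\mathrm{rms}}\cos\phi = i_{\mathrm{rms}}^2 R \tag{26-80d}$$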

Equation (26-80d) makes clear what our preceding considerations do
not: power is dissipated in an ac circuit only by the resistance, and not by the
reactive elements. As you have already seen, energy can be stored in an inductor
or a capacitor, but not dissipated.
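
The equivalence of the expressions for the average power can be checked numerically. The sketch below averages the instantaneous power $P = Vi$ over one cycle and compares the result with $V_{\mathrm{rms}} i_{\mathrm{rms}}\cos\phi$ and with $i_{\mathrm{rms}}^2 R$. It assumes the standard series impedance $Z = \sqrt{R^2 + X^2}$; the component values and function names are illustrative assumptions.

```python
import math

def series_circuit(v0, r, l, c, omega):
    """Return (i0, phi) for a series LRC circuit driven by V0 cos(wt)."""
    x = omega * l - 1.0 / (omega * c)     # reactance X
    i0 = v0 / math.hypot(r, x)            # i0 = V0/Z
    phi = math.atan2(-x, r)               # tan(phi) = -X/R
    return i0, phi

def average_power_numeric(v0, i0, phi, omega, steps=10_000):
    """Average of V(t) * i(t) over one period, by a midpoint sum."""
    period = 2.0 * math.pi / omega
    dt = period / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        total += v0 * math.cos(omega * t) * i0 * math.cos(omega * t + phi) * dt
    return total / period

# Illustrative values (assumed), not taken from the text.
V0, R, L, C = 170.0, 50.0, 0.20, 20.0e-6
OMEGA = 2.0 * math.pi * 60.0

i0, phi = series_circuit(V0, R, L, C, OMEGA)
v_rms, i_rms = V0 / math.sqrt(2.0), i0 / math.sqrt(2.0)

p_numeric = average_power_numeric(V0, i0, phi, OMEGA)
p_power_factor = v_rms * i_rms * math.cos(phi)   # Eq. (26-80b)
p_resistive = i_rms ** 2 * R                     # Eq. (26-80d)
print(f"{p_numeric:.3f} W, {p_power_factor:.3f} W, {p_resistive:.3f} W")
```

All three values agree to rounding error, which is the numerical counterpart of the statement that only the resistance dissipates power.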

If the power factor $\cos\phi$ is small, as happens if the circuit has a large
reactance and thus a large phase angle, the power that can be transmitted from source
to load is limited. Current flows back and forth in the transmission line merely to
energize and deenergize the inductor or capacitor. This is a wasteful situation,
since a large rms current results in large $i^2R$ losses in the line. It is therefore
important to keep the power factor as close as possible to 1, especially in motor circuits
with large inductances. There are several devices for accomplishing this. The
most direct is to connect a capacitor in parallel with the motor, which is an
inductive load. Why does this reduce the net reactance?
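
The benefit of a parallel capacitor can be illustrated numerically. The sketch below uses complex admittances, a bookkeeping device equivalent to the phasor diagrams of this chapter but not the notation used in the text. The motor is modeled, as an assumption, by a resistance in series with an inductance, and all numerical values and function names are hypothetical.

```python
import cmath
import math

def line_quantities(v_rms, r, l, omega, c_parallel=0.0):
    """Line current, power factor, and average power for a series R-L load
    with an optional capacitor connected in parallel across it."""
    y_load = 1.0 / complex(r, omega * l)                 # admittance of the motor model
    y_total = y_load + complex(0.0, omega * c_parallel)  # capacitor adds j*w*C
    i_line = v_rms * abs(y_total)
    power_factor = math.cos(cmath.phase(y_total))
    avg_power = v_rms ** 2 * y_total.real                # only the real part dissipates
    return i_line, power_factor, avg_power

def correcting_capacitance(r, l, omega):
    """Capacitance whose admittance cancels the imaginary part of the load admittance."""
    return l / (r ** 2 + (omega * l) ** 2)

# Hypothetical motor model: R = 10 ohm, L = 0.05 H, on a 120-V rms, 60-Hz line.
V_RMS, R, L = 120.0, 10.0, 0.05
OMEGA = 2.0 * math.pi * 60.0

before = line_quantities(V_RMS, R, L, OMEGA)
after = line_quantities(V_RMS, R, L, OMEGA, correcting_capacitance(R, L, OMEGA))
print("without capacitor: i = %.2f A, power factor = %.3f, <P> = %.1f W" % before)
print("with capacitor:    i = %.2f A, power factor = %.3f, <P> = %.1f W" % after)
```

In this model the capacitor leaves the power delivered to the load unchanged while reducing the rms current drawn through the line, and hence the $i^2R$ loss in the line itself.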

The transformer is an important practical device which illustrates a
number of the basic ideas introduced in this chapter and the previous one.
The transformer, shown schematically in Fig. 26-27, is a special type of
mutual inductor. It is represented in circuit diagrams by a standard
symbol. Two separate windings, called the primary and the secondary
windings, are wound on a common core made of a ferromagnetic
material. The purpose of the core is to ensure that essentially all the
magnetic flux resulting from electric current in either winding links that
winding with the other. The primary winding has $N_1$ turns, and the
secondary winding has $N_2$ turns. Assume for simplicity that the resistances of both
windings are small compared to their self-inductances.

Fig. 26-27 A transformer.

The primary is connected to a source of voltage $V_1$ of alternating sense.
Together they comprise the primary circuit. Suppose that the switch is
open, so that no current can flow in the secondary circuit. Because the
inductances are relatively large, the primary voltage $V_1$ drives a rather small
current through the primary winding. This is called the magnetizing
current. Under these conditions, the primary current $i_1$ is nearly 90° out of
phase with the primary voltage. The power factor is close to zero, and
negligible power is consumed.

The magnetizing current is an alternating current, and there is an
oscillating magnetic flux in the core. According to Faraday's law, this flux
induces an emf in each turn of wire surrounding it. This is equally true of
primary and secondary turns. Since the emf's induced in the turns of a coil
add, the emf induced in each of the two windings is proportional to the
number of turns in the winding. The total emf's across the two windings
are thus related by the expression

$$\frac{V_2}{-V_1} = \frac{N_2}{N_1} \tag{26-81}$$


The emf induced in the primary can be written as $-V_1$, because the
externally applied voltage $V_1$ drives only a very small current through the
primary winding. It must therefore be opposed by a nearly equal emf
induced in the primary in accordance with Lenz' law. According to Eq.
(26-81), the emf induced in the secondary can be made any (reasonable)
multiple of the primary voltage by choosing the proper turns ratio $N_2/N_1$.


What happens when the switch in the secondary circuit is closed, so
that current flows through the secondary winding and through the resistor
$R$? According to Lenz' law, the current must flow at any instant in such a
way as to oppose change in the flux penetrating the secondary. Suppose
that at a certain moment the primary current $i_1$ is increasing, so that the
flux is increasing. The secondary current $i_2$ will reduce the flux to a value
smaller than it would have at that moment if the switch in the secondary
circuit were open. But the flux penetrating the primary winding is also
reduced (it is the same flux!). This results in a reduction in the rate of change
of the flux as well. Consequently, the back emf induced in the primary is
reduced. But the externally imposed voltage is unchanged. It therefore
drives an increased primary current $i_1$. So the input power must increase as
a consequence of the flow of secondary current $i_2$ through an external load.

Since the transformer is not a source of energy, the power input to the
primary must be equal to the power dissipated in the secondary circuit.
This gives the general condition

$$i_{1\,\mathrm{rms}}V_{1\,\mathrm{rms}}\cos\phi_1 = i_{2\,\mathrm{rms}}V_{2\,\mathrm{rms}}\cos\phi_2$$

where the subscripts indicate the quantities appropriate to the primary and
secondary circuits, respectively. If the power in the secondary is dissipated
in a resistive load, the equality of power input and power output can be
written by using Eq. (26-80b) to express the power input and Eq. (26-80c) to
express the power output. This gives

$$i_{1\,\mathrm{rms}}V_{1\,\mathrm{rms}}\cos\phi_1 = \frac{V_{2\,\mathrm{rms}}^2\cos^2\phi_2}{R}$$

In a well-designed transformer system, it is possible to make both power
factors quite close to 1 under operating conditions. In this case, combining
the equation immediately above with Eq. (26-81) gives

$$i_{1\,\mathrm{rms}} = \left(\frac{N_2}{N_1}\right)^2\frac{V_{1\,\mathrm{rms}}}{R} \tag{26-82}$$

That is, the current flowing in the primary circuit depends inversely on the
resistance in the secondary circuit through a sort of "Ohm's law" involving
the turns ratio $N_2/N_1$. A resistor $R$ in the secondary circuit has the same
effect on the primary current as a resistor in the primary circuit having the
value $R(N_2/N_1)^{-2}$.
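
The transformer relations can be collected in a short numerical sketch. It assumes, as in the derivation of Eq. (26-82), that both power factors are equal to 1 and uses only magnitudes; the turns numbers, voltages, load resistance, and function names are hypothetical.

```python
import math

def secondary_voltage(v1_rms, n1, n2):
    """Magnitude of the secondary rms voltage from the turns ratio, Eq. (26-81)."""
    return v1_rms * n2 / n1

def primary_current(v1_rms, n1, n2, r_load):
    """Primary rms current for a resistive load, Eq. (26-82), power factors taken as 1."""
    return (n2 / n1) ** 2 * v1_rms / r_load

def equivalent_primary_resistance(r_load, n1, n2):
    """Resistance in the primary circuit that would draw the same primary current."""
    return r_load * (n2 / n1) ** -2

# Hypothetical 10:1 stepdown transformer feeding a 2.4-ohm resistive load.
N1, N2 = 1000, 100
V1_RMS, R_LOAD = 120.0, 2.4

v2 = secondary_voltage(V1_RMS, N1, N2)
i1 = primary_current(V1_RMS, N1, N2, R_LOAD)
r_eq = equivalent_primary_resistance(R_LOAD, N1, N2)

power_in = V1_RMS * i1            # Eq. (26-80b) with cos(phi_1) = 1
power_out = v2 ** 2 / R_LOAD      # Eq. (26-80c) with cos(phi_2) = 1
print(f"V2 = {v2:.1f} V, i1 = {i1:.2f} A, equivalent R = {r_eq:.0f} ohm")
print(f"power in = {power_in:.1f} W, power out = {power_out:.1f} W")
```

The input and output powers agree, as they must for an ideal transformer, and the equivalent primary resistance is the load resistance multiplied by $(N_1/N_2)^2$.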


EXERCISES 

Group  A 

26-1.  Another  interpretation  of  time  constant. 

a.  In  the  situation  described  by  Eq.  (26-9),  what  is  the 
initial  rate  of  decrease  of  the  current? 

b.  How  long  would  it  take  for  the  current  to  decrease 
to  zero  if  it  maintained  the  initial  rate  of  decrease? 

c. How is this time related to the time constant $\tau_L$?

26-2. Current in the circuit. A 5.0-H coil has a resistance
of 50 Ω. It is connected to a 5.0-V battery.

a. When the current is changing at the rate of 0.50
A/s, what is the current in the circuit?

b.  What  was  the  initial  rate  of  change  of  the  current? 

26-3. The dimensions of time. Show that the quantity RC
has the dimensions of time.

26-4. More dimensional analysis. Show that $\omega = 1/\sqrt{LC}$
is a dimensionally correct equation.


26-5. Oscillation frequency. If in Fig. 26-7, the value of
the inductance is 1.0 H and the capacitance is 1.0 μF, what
is the oscillation frequency?

26-6. Tune in. A variable capacitor is used in tuning a
radio in the broadcast band. It is in series with a coil of
negligible resistance whose inductance is $2.5 \times 10^{-4}$ H.
The lowest frequency to be tuned is $5.5 \times 10^{5}$ Hz. This
must be the lowest frequency (not angular frequency) of
the LC circuit. What must be the maximum capacitance of
the variable capacitor?

26-7.  Hummmm  ....  Henry  Ohm  wishes  to  construct 
an  LRC  circuit  that  will  have  a natural  frequency  of  60.0 
Hz.  He  has  an  inductor  with  inductance  L = 0.200  H and 
a capacitor with capacitance C = 10.0 μF. What resistance
value  should  he  select  for  the  resistor?  (The  hum  is  due  to 
vibrations  in  the  winding  of  the  inductor.) 




26-8. Splash. Henry decides that he can't stand the
humming  noise  made  by  the  inductor  of  the  LRC  circuit 
considered  in  Exercise  26-7.  What  resistance  value  should 
he  select  to  make  the  circuit  critically  damped? 

26-9.  Comparing  reactances. 

a. What is the reactance at 60 Hz of (i) a 1.0-H
inductor? (ii) a 1.0-μF capacitor?

b.  At  what  frequency  will  the  two  reactances  in  part  a 
be  equal? 

26-10. Inductive and capacitive reactance. A resistance
draws a current of 2.00 A from a 200-V, 60-Hz ac
generator (the values quoted are amplitudes).

a.  How  large  an  inductive  reactance  in  series  with  the 
resistance  will  reduce  the  current  to  1.00  A? 

b.  Repeat  for  a capacitive  reactance. 

26-11.  Simple  ac  circuit.  A resistor  and  a capacitor  are 
in  series  in  an  ac  circuit.  A meter  across  the  resistor  reads  a 
potential  difference  of  amplitude  100  V;  across  the  capac- 
itor, it  also  reads  100  V. 

a.  What  would  the  meter  read  across  the  two  in 
series? 

b.  What  is  the  phase  relation  between  the  current 
and  the  voltage  in  part  a? 

26-12. Frequencies of LRC circuit. An LRC circuit
contains a 5.0-μF capacitor and a 0.50-H inductor whose
resistance is 400 Ω.

a. Evaluate its natural frequency.

b. Evaluate its resonant frequency.

c. Evaluate its Q factor.

26-13. Avoid breakdown. A 5.0-μF capacitor can
withstand a peak voltage of 1200 V. What is the maximum
root-mean-square value of the current it can carry at a
frequency of 60 Hz?

26-14. No power used by a pure inductance. Draw a
graph of i versus ωt for an alternating current, i =
2 sin(ωt) A, with ω equal to 2 rad/s. On the same graph
draw the voltage, L di/dt, with L equal to 1 H. Use the
graph to explain why the power consumed by a pure
inductance is zero.

26-15. Much ado about a circuit. A 1.0-H inductor, a
5.0-μF capacitor, and a 100-Ω resistor are connected in
series to a 120-V root-mean-square 60-Hz power line.

a. What is the reactance of (i) the inductor? (ii)
the capacitor?

b. What is the impedance of the circuit?

c. What is the root-mean-square value of the
current?

d. What is the root-mean-square voltage across (i) the
inductor? (ii) the capacitor? (iii) the resistor?

e.  What  is  the  phase  angle  of  the  current  with  respect 
to  the  voltage? 

f.  What  is  the  power  factor  of  the  circuit? 

g.  What  is  the  power  dissipated  in  the  circuit? 


Group  B 

26-16. Accounting for energy consumed, I. Show that the
energy consumed in the resistor of Fig. 26-3, when the
switch is thrown from position A to position B, is equal to
the initial magnetic energy stored in the inductor, $Li^2/2$.
Here $i = V_0/R$ is the current flowing at the instant the
switch is thrown.

26-17. Charge on a capacitor. If the switch in Fig. 26-5
is thrown to position B at t = 0 after having been at
position A for some time, show that the charge q on the left
plate of the capacitor is given for t > 0 by $q = CV_0e^{-t/RC}$.

26-18. Accounting for energy consumed, II. Show that
the energy consumed in the resistor of Fig. 26-5, when the
switch is thrown from position A to position B as described
in Exercise 26-17, is equal to the initial electric energy
stored in the capacitor, $q^2/2C$. Here $q = CV_0$ is the charge
on the left plate of the capacitor at the instant the switch is
thrown.

26-19.  Designing  an  LC  circuit.  It  is  desired  to  set  up 
an  LC  circuit  in  which  the  capacitor  is  originally  charged 
to  a difference  of  potential  of  100.0  V.  The  maximum 
current  is  to  be  10.0  A,  and  the  oscillation  frequency  is  to 
be  1000  Hz.  What  are  the  required  values  of  L and  C? 

26-20. Equal time constants. Show that the current in
the circuit through the battery of Fig. 26E-20 rises
instantly to its final value of V/R when the switch S is closed,
provided the time constants of the two branches are equal.
The internal resistance of the battery and of the
connecting wires is negligible.

Fig. 26E-20


26-21. From driven RL to free LC. Consider the circuit
shown in Fig. 26E-21. Prior to t = 0, the switch is in
position A and the capacitor is uncharged. At t = 0, the switch
is moved to position B.

Fig. 26E-21

a. Determine the current in the LC circuit for t > 0.

b. Find the charge q on the lower capacitor plate for
$t \ge 0$.

c. Show that the total energy E stored in the LC
circuit is constant, and express E in terms of $V_0$, R, and L.

26-22. Verification. Show that Eq. (26-42a) satisfies
the differential equation for an LRC circuit, Eq. (26-38).

26-23. Energy loss in an LRC circuit. Use Eq. (26-31) to
show that in an undriven LRC circuit the stored energy E
decays at a rate given by $dE/dt = -i^2R$. Do this by
evaluating $dE/dt$, with $E = Li^2/2 + q^2/2C$.

26-24. Don't let it phase you. In Fig. 26E-24, the
current is described by a phasor whose magnitude is $i_0$ =
1.0 A. Also R = 3.0 Ω, $X_L$ = 4.0 Ω, and $X_C$ = 5.0 Ω.
Draw a phasor diagram showing the voltage from A to B,
from B to C, from C to D, and from A to D. What are
their numerical magnitudes? What is the phase angle
of the current with respect to the driving voltage?

Fig. 26E-24 (R, $X_L$, and $X_C$ in series between points A, B, C, and D)

26-25. Same resonant frequency. $L_1$, $C_1$, and $R_1$ are
connected in series and have a particular resonant
frequency. $L_2$, $C_2$, and $R_2$, also connected in series, happen to
have an identical resonant frequency. Prove that if all of
these six circuit elements are connected in series, the new
circuit will have the same resonant frequency as either of
the circuits first mentioned.

26-26. Shedding light on the problem. In Fig. 26E-26, K
is a light bulb whose filament has a resistance of 120 Ω,
and the amplitude of the oscillatory driving voltage is
150 V. The tap H is connected successively to points B, D,
E, F, and G. With H at B the light bulb glows with its
normal brightness. With H at D, the glow is barely visible.
With H at E, the glow is distinctly visible. With H at F, the
glow is nearly normal. With H at G, the glow is about the
same as at E. The driving frequency is $\omega = 2\pi \times 60$ Hz.

Fig. 26E-26 (light bulb K, 120 Ω)

case.

b. With H at F, calculate the amplitude of the voltage
between B and D; between D and F. The sum of the
magnitudes of these two is much more than 150 V. How is this
possible?

26-27. Deriving the Q-factor formula. Apply the
definition of the Q factor of an LRC circuit given in Sec. 26-8 to
derive Eq. (26-75b), $Q = L\omega_r/R$. Do this by taking the total
energy stored in the circuit in any cycle to be the
maximum value $Li_0^2/2$ of the energy stored in the inductor
during that cycle, and by taking the average power lost in
the resistor during the cycle to be $(i_{\mathrm{rms}})^2R$.

26-28. Properties of an inductor. An inductor, having
appreciable resistance, is connected across a power line,
supplying an rms voltage of 110 V at a frequency of
60 Hz. It draws an rms current of 0.60 A. The average
power consumed is 36 W. Calculate the following
properties of the inductor:

a. impedance

b. power factor

c. resistance

d. inductance

26-29. When you've got to glow, you've got to glow. In Fig.
26E-29, A is a 100-W, 120-V light bulb, and B is a 60-W,
12-V light bulb. A is connected in the primary circuit of a
10:1 stepdown transformer. B is in the secondary circuit.
The self-inductance of the primary coil is 10 H.

a. With the switch S in the secondary open, A does
not glow. Why?

b. When S is closed, B glows and so does A. Explain
qualitatively.

26-30. Mutual inductance in a transformer. In the
transformer of Fig. 26E-30, show that $M_{12} = \sqrt{L_1L_2}$ if there is
no leakage of magnetic flux from the iron core. (The
quantities $L_1$ and $L_2$ are the self-inductances of the coils
having $N_1$ and $N_2$ turns, respectively, and $M_{12}$ is their



Group  C 

26-31. There's a switch! Consider the RL circuit shown
in Fig. 26E-31. Initially, the switch is in position C, so that
the circuit is open. At t = 0, the switch is thrown to A, so
that a battery voltage $+|V_A|$ is applied to the circuit.
Then at time T the switch is thrown to B, which results in
an applied voltage $-|V_B|$.

Fig. 26E-31 (the figure marks the positive sense)

a. Find the current i for $0 \le t \le T$.

b. Find the current i for $t \ge T$.

c. Find the instant t' > T at which the switch could be
thrown from B to C, reopening the circuit, without danger
of producing an arc. Express t' in terms of T, $\tau_L = L/R$,
$|V_A|$, and $|V_B|$.

26-32. Rise and fall. Consider the RL circuit shown in
Fig. 26-1. Prior to t = 0 the switch is at position C;
between t = 0 and t = T it is at A; at t = T it is at B, where
it remains.

a. For all $t \ge 0$, find the current i through the circuit.

b. Find the voltages $V_R$ and $V_L$ across the resistor and
the inductor.

c. Use your results from part b to check that (i) for
0 < t < T, $V_R + V_L = V_0$; (ii) for t > T, $V_R + V_L = 0$.

26-33. Sawtooth voltage. The electron beam deflection
system in a TV tube requires a sawtooth voltage, like the
one shown in Fig. 26E-33a. A simple way of obtaining
such a voltage approximately is by means of the circuit
shown in Fig. 26E-33b. N is a neon glow lamp which does
not conduct until the firing voltage $V_f$ is applied to it and
then conducts almost perfectly. That is, N acts like a switch
which is open when the voltage across it is less than $V_f$ and
closes when this voltage reaches $V_f$.

Fig. 26E-33 (a) Sawtooth voltage as a function of time t. (b) Circuit.

a. Show that the voltage $V_C$ across C is given by
$V_b(1 - e^{-t/RC})$.

b. For $t \ll RC = \tau_C$, show that $V_C = V_bt/RC$, a
linear relation between $V_C$ and t.

c. Show that the period T of the sawtooth voltage is
equal to $V_fRC/V_b$. The resistance of the conducting glow
lamp is negligibly small.

26-34. Setting the initial conditions. From Eqs. (26-42a)
and (26-45), show that if i = 0 and $q = q_0$ at t = 0, then
$i = -(q_0/\omega_0LC)\,e^{-(R/2L)t}\sin(\omega_0t)$.

26-35. Parallel LRC circuit. Consider the parallel LRC
circuit shown in Fig. 26E-35. Initially the switch is in
position A, all currents are zero, and the positive charge on the
left plate of the capacitor is $q_0$. At t = 0, the switch is
moved to position B.

Fig. 26E-35

a. Immediately after the switch is closed, what is (i)
the inductor current $i_L$? (ii) the resistor current $i_R$? (iii) the
capacitor current $i_C$?

b. What final values do these currents eventually
approach?

c. Show that the charge q on the left plate of the
capacitor satisfies the equation

$$\frac{d^2q}{dt^2} + \frac{1}{RC}\frac{dq}{dt} + \frac{q}{LC} = 0$$

d. Show that if $R^2 > L/4C$, the general solution of
the differential equation of part c can be written as
$q = Ae^{-(\beta_p/2)t}\cos(\omega_pt) + Be^{-(\beta_p/2)t}\sin(\omega_pt)$, where the constants A
and B may be freely chosen but the quantities $\beta_p$ and $\omega_p$
are determined by the values of L, R, and C. Find $\beta_p$ and $\omega_p$.

e. Find the values A and B for which the general
solution satisfies the initial conditions of the present problem.

f. As shown in part d, the parallel LRC circuit is
lightly damped when R exceeds a certain minimum value,
while a series LRC circuit (the more common type that is
treated in the text and called there simply an LRC circuit)
is lightly damped only when R is less than a certain value.
Explain the difference in these criteria for light damping.


Fig.  26E-40 


26-36.  Parallel  ac  circuit.  Use  phasors  to  carry  out  an 
analysis  of  the  parallel  ac  circuit  in  Fig.  26-22.  Pattern 
your  analysis  after  the  one  given  in  the  text  for  a series  ac 
circuit. 

26-37. Using a phasor diagram. With the aid of a
phasor diagram, show that the impedance of a capacitor
and a resistor connected in parallel is equal to
$R/\sqrt{1 + \omega^2R^2C^2}$.

26-38. Driven LRC circuit. The resonant frequency of
a certain driven LRC circuit is $\omega_r$. If the amplitude of the
voltage remains constant and the driving frequency $\omega$ is
varied, the steady-state current amplitude diminishes
when $\omega$ departs from $\omega_r$. Let $\omega_1$, greater than $\omega_r$, give a
certain current amplitude and $\omega_2$, less than $\omega_r$, give the
same current amplitude. Prove that $\omega_1\omega_2 = \omega_r^2$.

26-39. Wheatstone bridge. Figure 26E-39 represents an
ac bridge similar to a Wheatstone bridge. From a
knowledge of $R_1$, $R_2$, and either L or C, the unknown fourth
quantity can be determined by varying the three known
ones. The detector between A and B is an ac ammeter. If
the bridge is balanced so that there is no current through
the detector, $V_{AD}$ and $V_{BD}$ must have the same amplitude
and phase. Consideration of either equality leads to the
relation that $L/C = R_1R_2$. Prove this by first considering
the amplitude in part a, then by considering the phase in
part b:

Fig. 26E-39

a. When the bridge is balanced: (i) What is the
current through L and $R_1$? (ii) What is the amplitude of $V_{AD}$?
(iii) What is the current through $R_2$ and C? (iv) What is the
amplitude of $V_{BD}$? (v) Show that $L/C = R_1R_2$ is valid.

b. When the bridge is balanced: (i) What is the
tangent of the angle by which $V_{AD}$ leads the applied
voltage? (ii) What is the tangent of the angle by which $V_{BD}$
leads the applied voltage? (iii) From the last two results,
show that $L/C = R_1R_2$ is valid.

26-40.  Skin  effect.  At  frequencies  in  the  megahertz 
range,  ac  current  in  a wire  is  not  uniformly  distributed 
throughout  its  cross  section.  Instead  the  current  density  is 
much  greater  near  the  surface,  so  the  wire  could  just  as 
well  be  a hollow  tube.  This  concentration  of  current  near 
the surface is called the skin effect, and the reason for it
can be seen if it is kept in mind that there is magnetic flux
inside a wire as well as in the space around it.

Figure 26E-40 shows a loop of wire carrying a current.
Filament ABCDE is at the center of the wire. Filament
A'B'C'D'E' is at the inner surface of the wire rectangle.
The filaments both have the same cross-sectional area and
therefore about the same resistance.

a.  Why  is  there  more  flux  linked  with  the  unprimed 
than  with  the  primed  circuit? 

b.  Compare  the  self-inductance  and  impedance  of 
the  two  filament  circuits. 

c.  If  the  current  is  high-frequency  ac,  compare  the 
currents  and  the  current  densities. 

d.  Why  will  increasing  the  frequency  enhance  the 
difference  in  current  density? 

e. Filament A"B"C"D"E" is also on the surface of
the wire. It will have a greater current density if it
encloses a smaller flux. Show that it has the same flux as
A'B'C'D'E'.

26-41. Measuring power consumption in an inductor.
Figure 26E-41 shows a circuit for measuring the power
consumption of an inductor, represented by its inductance $L$
in series with the resistance $R_L$ of its windings. A known
resistor $R$ is connected in parallel with the inductor. The
root-mean-square currents in each branch, $i_1$ and $i_2$, are
measured with ammeters, as well as the combined
root-mean-square current $i_3$.


a. Draw a phasor diagram showing the phase
relations among the three currents.

b. Use it to show that $i_3^2 = i_1^2 + i_2^2 + 2i_2i_1\cos\phi$,
where $\phi$ is the phase angle of the current in the inductor
with respect to the voltage applied across it.




c. Using the relation $V = i_2R$, where $V$ is the
root-mean-square voltage across the known resistor (or across
the inductor), and the relation $\langle P\rangle = Vi_1\cos\phi$, where $\langle P\rangle$
is the average power consumed in the inductor, show that
$\langle P\rangle = R(i_3^2 - i_2^2 - i_1^2)/2$.

Numerical 

26-42. LRC circuit, I. Run the damped oscillator
program with initial conditions and parameters just as in
Example 26-6, except with the inductance of the LRC circuit
doubled. Plot your results and compare with those plotted
in Fig. 26-10.

26-43. LRC circuit, II. Run the damped oscillator
program with initial conditions and parameters just as in
Example 26-6, except reduce the resistance of the LRC
circuit to $R = 180\ \Omega$. Measure the angular frequency $\omega_0$ of
the oscillation. Compare your results with the predictions
of Eq. (26-42b).

26-44. LRC circuit with different initial conditions.
Assume the switch in Fig. 26-9 has been in position B for a
considerable time prior to $t = 0$, and that at $t = 0$ it is
moved to position A. Write a differential equation which
determines the current $i$ flowing in the LRC circuit for $t > 0$.
Modify the damped oscillator program so that it will
solve this equation. Determine the initial conditions $i_0$ and
$(di/dt)_0$ for the circuit parameters of Example 26-5. Then
run the program with these initial conditions and
parameters. Compare your plot of $i$ versus $t$ with the one in Fig.
26-10, and explain both the similarities and the
differences.

26-45. AC circuit, I. Run the driven LRC circuit
program with initial conditions and parameters just as in
Example 26-11, except with the resistance of the driven LRC
circuit halved. Plot your results and compare with those
plotted in Fig. 26-15. Also compare the steady-state
amplitude with the analytical value displayed in Fig. 26-25.

26-46. AC circuit, II. Run the driven LRC circuit
program to obtain points on the $R = 50.0\text{-}\Omega$ resonance curve
of Fig. 26-25 at $\omega = 600$ rad/s and $\omega = 1400$ rad/s. It is
not necessary to plot $i$ versus $t$ to determine the
steady-state current amplitude.

26-47. AC circuit with different source voltage. Modify
the differential equation for the driven LRC circuit so that
the ac source voltage is $V = V_0\sin(\omega t)$. Then modify the
driven LRC circuit program so that it solves this
differential equation. Next determine the initial conditions $i_0$ and
$(di/dt)_0$ for the circuit parameters of Example 26-11, the
switch being thrown to B at $t = 0$. Run the program with
these initial conditions and parameters. Compare your
plot of $i$ versus $t$ with the one in Fig. 26-15, and explain
both the similarities and the differences.
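The numerical exercises refer to a damped oscillator program that is not reproduced in this section. The following is only a minimal sketch of what such a program might look like, not the book's own code. It assumes the charge on the capacitor obeys $L\,d^2q/dt^2 + R\,dq/dt + q/C = 0$, integrates that equation with a fourth-order Runge-Kutta step, and estimates the angular frequency from the zero crossings of $q(t)$. The function names and all component values are hypothetical placeholders, not the parameters of Example 26-6, and the comparison value uses the standard underdamped result $\omega = \sqrt{1/LC - R^2/4L^2}$.

```python
import math

def simulate_lrc(L, R, C, q0, i0, dt, n_steps):
    """Integrate L q'' + R q' + q/C = 0 with classical fourth-order Runge-Kutta."""
    def deriv(q, i):
        # dq/dt = i and di/dt = -(R i + q/C)/L
        return i, -(R * i + q / C) / L

    q, i = q0, i0
    ts, qs = [0.0], [q0]
    for n in range(1, n_steps + 1):
        k1q, k1i = deriv(q, i)
        k2q, k2i = deriv(q + 0.5 * dt * k1q, i + 0.5 * dt * k1i)
        k3q, k3i = deriv(q + 0.5 * dt * k2q, i + 0.5 * dt * k2i)
        k4q, k4i = deriv(q + dt * k3q, i + dt * k3i)
        q += dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
        i += dt * (k1i + 2 * k2i + 2 * k3i + k4i) / 6
        ts.append(n * dt)
        qs.append(q)
    return ts, qs

def angular_frequency(ts, qs):
    """Estimate omega from linearly interpolated zero crossings of q(t)."""
    crossings = []
    for k in range(1, len(qs)):
        if qs[k - 1] * qs[k] < 0:
            frac = qs[k - 1] / (qs[k - 1] - qs[k])
            crossings.append(ts[k - 1] + frac * (ts[k] - ts[k - 1]))
    # Successive zero crossings are half a period apart.
    half_period = (crossings[-1] - crossings[0]) / (len(crossings) - 1)
    return math.pi / half_period

# Hypothetical circuit values chosen only for illustration.
L_H, R_OHM, C_F = 1.0, 180.0, 1.0e-6
ts, qs = simulate_lrc(L_H, R_OHM, C_F, q0=1.0e-6, i0=0.0, dt=1.0e-5, n_steps=5000)

omega_measured = angular_frequency(ts, qs)
omega_predicted = math.sqrt(1.0 / (L_H * C_F) - (R_OHM / (2.0 * L_H)) ** 2)
print(f"measured  omega = {omega_measured:.1f} rad/s")
print(f"predicted omega = {omega_predicted:.1f} rad/s")
```

Changing `L_H`, `R_OHM`, or the initial conditions `q0` and `i0` is the kind of modification the exercises ask for; the driven-circuit exercises would additionally need a source term in `deriv`.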



Maxwell's Equations and Electromagnetic Waves


27-1 THE DISPLACEMENT CURRENT


The properties of electromagnetic fields can be described by a set of four
equations.  These  equations  are  generally  called  Maxwell’s  equations,  after 
the  British  physicist  James  Clerk  Maxwell  (1831-1879).  Two  of  Maxwell's 
equations  are  just  Gauss’  laws  for  electric  fields  and  for  magnetic  fields,  and 
the  third  is  nothing  more  than  Faraday’s  law.  The  fourth  of  Maxwell's 
equations  is  a generalization  of  Ampere's  law.  But  for  this  generalization 
Maxwell  deserves  great  credit.  In  making  the  generalization  Maxwell 
achieved  an  extremely  important  scientific  advance  that  soon  led  to  equally 
important  practical  applications. 

Maxwell  devised  the  fourth  equation  in  1862  by  making  a penetrating 
analysis  of  Ampere’s  law,  Eq.  (23-60): 


$$\oint_{\text{closed curve}} \vec{\mathcal{B}} \cdot d\vec{l} = \mu_0 \int_{\text{enclosed surface}} \vec{j} \cdot d\vec{a}$$


As you know, the integral on the right side of this relation is the total
current crossing a certain surface. The current possesses a magnetic field
which encircles the current, and the relation says that the strength of the
magnetic field is such that its circulation equals the current multiplied by
the permeability constant. In an argument that we present soon, Maxwell
showed that there is a logical inconsistency in Ampere’s law when it is applied to
situations involving time-dependent electric fields. Maxwell showed also that the
inconsistency is removed if a term is added to the right side, so that it reads

$$\oint_{\text{closed curve}} \vec{\mathcal{B}} \cdot d\vec{l} = \mu_0 \left( \int_{\text{enclosed surface}} \vec{j} \cdot d\vec{a} + \epsilon_0 \int_{\text{enclosed surface}} \frac{\partial \vec{\mathcal{E}}}{\partial t} \cdot d\vec{a} \right)$$




The additional term acts as a source of the magnetic field just as the first
term does, even though it has to do with the rate of change of an electric
field and not with moving charges. That is, it functions as a current.
Maxwell called this term the displacement current.


Maxwell  used  his  significantly  modified  form  of  Ampere’s  law, 
together  with  Faraday’s  law  and  the  two  Gauss’  laws,  to  study  the  relations 
between  electric  and  magnetic  fields.  He  found  a complete  symmetry 
between  the  two  kinds  of  fields.  This  led  him  to  the  conclusion  that  when 
varying  electric  fields  are  present  in  a vacuum,  there  must  also  be  varying 
magnetic  fields,  and  vice  versa.  Furthermore,  he  found  that  these  paired 
fields,  called  collectively  the  electromagnetic  field,  can  propagate  through 
the  vacuum  in  the  manner  of  waves.  He  obtained  an  expression  for  the 
propagation speed of these electromagnetic waves. When evaluated
numerically, this theoretical expression gave a result in excellent agreement with
the experimentally measured speed of light! Thus Maxwell answered a
very long-standing question concerning the nature of light. His work made
it  clear  that  light  waves  are  electromagnetic  waves. 

Maxwell’s  work  suggested  that  there  could  also  be  electromagnetic 
waves  with  wavelengths  different  from  those  of  visible  light.  Following  this 
suggestion,  in  1887  Heinrich  Hertz  (1857-1894)  produced  and  detected 
for  the  first  time  the  much  longer  wavelength  electromagnetic  waves  we 
now  call  radio  waves.  Only  a few  years  later  Guglielmo  Marconi 
(1874-1937) began to put them to practical use in long-distance
communication. Electromagnetic waves with wavelengths much shorter than
those  of  light  were  discovered  by  Wilhelm  Konrad  Roentgen  (1845-1923) 
in  1895.  They  are  usually  known  as  X rays. 


In this section we begin our study of Maxwell’s equations and
electromagnetic waves by considering the essential point in Maxwell’s analysis of
Ampere’s law. Figure 27-1 depicts the application of Ampere’s law,

$$\oint_{\text{closed curve}} \vec{\mathcal{B}} \cdot d\vec{l} = \mu_0 \int_{\text{enclosed surface}} \vec{j} \cdot d\vec{a} \tag{27-1}$$

to a wire through which a current is driven by a battery. Associated with the
current is a magnetic field described by the vector $\vec{\mathcal{B}}$, with field lines
circling the wire. The figure shows a closed curve with elements $d\vec{l}$ on
which the integral of $\vec{\mathcal{B}} \cdot d\vec{l}$, the magnetic circulation, is evaluated. Also
shown are two different surfaces, 1 and 2, that are enclosed by the curve.
Ampere’s law says that the integral of $\vec{j} \cdot d\vec{a}$, where $\vec{j}$ is the current density
and $d\vec{a}$ is the surface element, can be evaluated on any such surface to
determine the value of the integral of $\vec{\mathcal{B}} \cdot d\vec{l}$. It makes no difference which
surface is used. Although $\vec{j} \cdot d\vec{a}$ varies over surface 1 in a way that is not the
same as its variation over surface 2, we know that its integral over either
surface has the same value, namely, the total current $i$ flowing in the wire.
Thus the right side of Eq. (27-1) can be evaluated without ambiguity and
used in the law to determine the magnetic circulation. There usually is no
difficulty in applying Ampere’s law in any situation in which current flows
in a continuous conducting circuit.

Fig. 27-1 The application of Ampere’s law to a circuit consisting of a
continuous conductor.




Fig. 27-2 The application of Ampere’s
law to a circuit in which the conducting
path is not continuous. Current is
flowing because the capacitor is in the
process of charging. For the sake of
clarity, the spacing between the capacitor
plates is exaggerated in this figure
and in those through Fig. 27-5.


Fig.  27-3  A surface  of  integration  used 
to  obtain  Maxwell’s  generalization  of 
Ampere’s  law. 


But there is a very serious difficulty with Ampere’s law in this form, if
it is applied to a situation in which the current flows in a circuit which is not
a continuous conductor. Figure 27-2 shows a simple example. Here the
terminals of the same battery are connected at some instant to the wires
leading to a plane-parallel capacitor, instead of to a continuous wire. Until
the voltage that develops across the capacitor as it charges equals the
battery voltage, a current $i$ will flow in the leads to the capacitor. To determine
the magnetic field encircling a lead wire while this happens, Ampere’s law
is applied to the closed curve shown in the figure. Also shown are two
surfaces, 1 and 2. Both have peripheries lying on the curve. Surface 1 passes
through the lead, and surface 2 passes between the plates of the capacitor.
The difficulty is apparent. The integral of $\vec{j} \cdot d\vec{a}$ over surface 1 yields the
current $i$ flowing in the lead. But the integral of $\vec{j} \cdot d\vec{a}$ over surface 2 yields
zero because the current density $\vec{j}$ is zero everywhere on that surface. That
is, no charge is flowing through the empty space between the capacitor
plates. Thus Eq. (27-1),

$$\oint_{\text{closed curve}} \vec{\mathcal{B}} \cdot d\vec{l} = \mu_0 \int_{\text{enclosed surface}} \vec{j} \cdot d\vec{a}$$

cannot be correct when applied to the situation illustrated in Fig. 27-2! The
integral on the left side has some unique nonzero value since $\vec{\mathcal{B}}$ has a
certain value at each point on the curve. But the integral on the right side does
not have a unique value for all surfaces which are enclosed by the curve.
Equation (27-1) is internally inconsistent.

Maxwell found that the inconsistency could be removed by adding a
second term to the right side of Eq. (27-1). This term expresses the fact that
while no charge flows through the gap, the electric field in the gap changes
as long as charge is flowing through the leads. The term is proportional to
the integral over the surface enclosed by the curve of the quantity
$(\partial \vec{\mathcal{E}}/\partial t) \cdot d\vec{a}$, where $\partial \vec{\mathcal{E}}/\partial t$ is the time derivative of the electric field at the
surface element $d\vec{a}$. A partial time derivative is used for the sake of clarity, since
the electric field $\vec{\mathcal{E}}$ generally depends on position as well as time. You can see
what might have motivated Maxwell to try such a term by considering the
following facts.

While the current flows in the capacitor leads, the electric field $\vec{\mathcal{E}}$
between its plates is changing, and so $\partial \vec{\mathcal{E}}/\partial t$ has a nonzero value on the part
of surface 2 lying between the plates. Thus the integral of $(\partial \vec{\mathcal{E}}/\partial t) \cdot d\vec{a}$ over
surface 2 will be nonzero. By multiplying this integral by the proper
constant factor and adding the product to the right side of Eq. (27-1), it should
be possible to make that side have the same value when surface 2 is used as
when surface 1 is used. Furthermore, adding the term to the right side of
the equation will not affect its value when surface 1 is used. The reason is
that the capacitor produces no significant electric field on surface 1, and so
$\partial \vec{\mathcal{E}}/\partial t$ is essentially zero there.

Now we work out the details. This is most easily done if we take the
part of the surface 2 lying between the capacitor plates to be parallel to the
plates, as in Fig. 27-3. Furthermore, we ignore “edge effects.” That is, we
say that inside the capacitor the electric field $\vec{\mathcal{E}}$ is everywhere the same at
any instant and is always directed normal to the capacitor plates, and
that $\vec{\mathcal{E}}$ is everywhere zero outside the capacitor. Then in the only region
where $\vec{\mathcal{E}}$ is nonzero it depends solely on time, so that $\partial \vec{\mathcal{E}}/\partial t = d\vec{\mathcal{E}}/dt$.
Moreover, in that region between the plates $d\vec{\mathcal{E}}/dt$ has a uniform magnitude
everywhere. We evaluate the integral around the closed curve by traversing
the curve in the sense indicated in the figure by $d\vec{l}$. Then according to the
right-hand rule for surface-element vectors, the vector $d\vec{a}$ will point to the
right, as shown in the figure. This is also the direction of $d\vec{\mathcal{E}}/dt$ in this
region. So we have


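$$\int_{\text{surface 2}} \frac{\partial \vec{\mathcal{E}}}{\partial t} \cdot d\vec{a} = \int_{\text{surface 2 between plates}} \frac{d\mathcal{E}}{dt}\, da = \frac{d\mathcal{E}}{dt} \int_{\text{surface 2 between plates}} da$$

or

$$\int_{\text{surface 2}} \frac{\partial \vec{\mathcal{E}}}{\partial t} \cdot d\vec{a} = \frac{d\mathcal{E}}{dt}\, a \tag{27-2}$$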


where  a is  the  area  of  each  plate. 

The magnitude $\mathcal{E}$ of the electric field between the capacitor plates can
be obtained by writing Eq. (21-47) as

$$\mathcal{E} = \frac{\sigma}{\epsilon_0}$$

The positive quantity $\sigma$ is the charge density on the left-hand plate. Since
its value is given by $\sigma = q/a$, where $q$ is the positive charge on that plate, we
have

$$\mathcal{E} = \frac{q}{\epsilon_0 a}$$

Differentiating with respect to time produces

$$\frac{d\mathcal{E}}{dt} = \frac{1}{\epsilon_0 a} \frac{dq}{dt}$$


But  the  rate  at  which  the  charge  q on  the  capacitor  plates  is  changing  equals 
the  current  i flowing  in  its  leads.  So 


$$\frac{dq}{dt} = i$$

and we have

$$\frac{d\mathcal{E}}{dt}\, a = \frac{i}{\epsilon_0}$$

Using  this  in  Eq.  (27-2),  we  obtain  the  result 


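$$\int_{\text{surface 2}} \frac{\partial \vec{\mathcal{E}}}{\partial t} \cdot d\vec{a} = \frac{i}{\epsilon_0}$$

or

$$\epsilon_0 \int_{\text{surface 2}} \frac{\partial \vec{\mathcal{E}}}{\partial t} \cdot d\vec{a} = i \tag{27-3}$$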




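It follows that we can achieve our goal of making the value of the right
(or “current”) side of Ampere’s law independent of the particular surface
of integration chosen by adding to the integral of $\vec{j} \cdot d\vec{a}$ the left side of Eq.
(27-3), which equals $i$ on surface 2. That is, the right side of Eq. (27-1) should be
modified by adding to what we will here call the conduction current,

$$\int_{\text{enclosed surface}} \vec{j} \cdot d\vec{a}$$

the quantity that is called the displacement current:

$$\epsilon_0 \int_{\text{enclosed surface}} \frac{\partial \vec{\mathcal{E}}}{\partial t} \cdot d\vec{a}$$

The result is Maxwell’s generalization of Ampere’s law:

$$\oint_{\text{closed curve}} \vec{\mathcal{B}} \cdot d\vec{l} = \mu_0 \left( \int_{\text{enclosed surface}} \vec{j} \cdot d\vec{a} + \epsilon_0 \int_{\text{enclosed surface}} \frac{\partial \vec{\mathcal{E}}}{\partial t} \cdot d\vec{a} \right) \tag{27-4}$$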


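The magnetic circulation equals the product of the permeability constant and the sum
of the conduction and displacement currents. By making this addition, the
difficulty found in analyzing Fig. 27-2 is removed. To see this, evaluate the
right side of Eq. (27-4) on surface 1. Doing so yields

$$\mu_0 \left( \int_{\text{surface 1}} \vec{j} \cdot d\vec{a} + \epsilon_0 \int_{\text{surface 1}} \frac{\partial \vec{\mathcal{E}}}{\partial t} \cdot d\vec{a} \right) = \mu_0 (i + 0) = \mu_0 i$$

Now evaluate the right side on surface 2, using Eq. (27-3), to obtain

$$\mu_0 \left( \int_{\text{surface 2}} \vec{j} \cdot d\vec{a} + \epsilon_0 \int_{\text{surface 2}} \frac{\partial \vec{\mathcal{E}}}{\partial t} \cdot d\vec{a} \right) = \mu_0 (0 + i) = \mu_0 i$$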


The  values  found  in  the  evaluations  are  the  same,  no  matter  which  surface 
is  used.  Thus  Maxwell’s  generalization  of  Ampere's  law  does  not  suffer 
from  internal  inconsistency. 
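The two evaluations above can be checked numerically. The sketch below assumes an ideal plane-parallel capacitor with no edge effects, as in the text, and uses $\mathcal{E} = q/(\epsilon_0 a)$ together with $dq/dt = i$. The function names, the lead current, and the plate area are hypothetical values picked only for illustration.

```python
EPS0 = 8.85e-12  # permittivity constant, C^2/(N m^2), as quoted in Example 27-2

def field_rate_between_plates(i, area):
    """dE/dt between the plates, from E = q/(eps0 a) and dq/dt = i."""
    return i / (EPS0 * area)

def total_current(conduction_current, dE_dt, flux_area):
    """Conduction current plus displacement current eps0 * (dE/dt) * area."""
    return conduction_current + EPS0 * dE_dt * flux_area

i = 2.0e-3      # hypothetical current in the leads, A
area = 1.0e-2   # hypothetical plate area, m^2

dE_dt = field_rate_between_plates(i, area)

# Surface 1 cuts the lead: all conduction current, no changing field.
surface_1 = total_current(i, 0.0, 0.0)
# Surface 2 passes between the plates: no conduction current, only dE/dt.
surface_2 = total_current(0.0, dE_dt, area)

print(surface_1, surface_2)
```

Both surfaces give the same total, which is the consistency that the added term was designed to provide.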


The  flow  of  charge  through  a conductor  is  called  “conduction  current” 
in  the  remainder  of  this  section  and  in  the  next.  After  that  there  is  no  need 
to  distinguish  between  it  and  displacement  current,  and  so  we  then  revert  to 
calling  the  flow  of  charge  through  a conductor  simply  “current.”  The  name 
“displacement  current”  was  introduced  by  Maxwell,  for  reasons  that  are 
not relevant to the modern view of electromagnetic fields in vacuum. A
displacement current in vacuum is not a current in the sense that it describes
any motion of charge carriers. But it is a current in the sense that it
possesses the essential property of electric currents. Namely, it has associated
with it a magnetic field.


Example  27-1  concerns  a situation  where  both  terms  on  the  right  side 
of  Eq.  (27-4)  make  a contribution. 




EXAMPLE  27-1 


The plane-parallel capacitor in Fig. 27-4 is charged by current flowing to the plates
through the lead wires. Evaluate the magnitude $\mathcal{B}$ of the magnetic field on the
closed curve of radius $r_2$ shown in the figure at an instant when the current in the
leads is $i$. Do this by applying Eq. (27-4) to the enclosed stovepipe-hat-shaped
surface labeled in the figure as surface 3.

■ For this surface there are contributions from both terms on the right side of
Eq. (27-4) because both conduction and displacement currents flow through it.
There is conduction current since the charge entering the center of the plate at the
lead wire must distribute itself uniformly over the plate. This gives rise to a
conduction current passing through the “stovepipe of the hat” (the curved surface of the
cylinder that is part of surface 3), where it intersects the left-hand capacitor plate.
While this is happening, the electric field between the plates is increasing.
Consequently, there is a displacement current passing through the part of surface 3 which
is the flat surface covering the end of the cylinder.

At a time when the charge per unit area on the positively charged left capacitor
plate is $\sigma$, the total charge $q_{r>r_1}$ outside the cylindrical surface is $\sigma$ times the area of
this outer part of the plate. So if the radius of the plate is $R$, we have

$$q_{r>r_1} = \sigma(\pi R^2 - \pi r_1^2)$$

The rate of change of this charge,

$$\frac{dq_{r>r_1}}{dt} = \frac{d\sigma}{dt}(\pi R^2 - \pi r_1^2)$$

is the rate at which charge flows through surface 3 to the outer part of the plate from the inner
part, as the charge entering at the lead wire distributes itself uniformly over the
plate. This is the conduction current, so you have

$$\int_{\text{surface 3}} \vec{j} \cdot d\vec{a} = \frac{d\sigma}{dt}(\pi R^2 - \pi r_1^2)$$


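You now evaluate the displacement current by setting

$$\mathcal{E} = \frac{\sigma}{\epsilon_0}$$

Thus you have

$$\frac{\partial \mathcal{E}}{\partial t} = \frac{d\mathcal{E}}{dt} = \frac{1}{\epsilon_0} \frac{d\sigma}{dt}$$

Since $\vec{\mathcal{E}}$ is parallel to the curved surface of the cylinder and normal to the flat
surface covering it, you can write

$$\int_{\text{surface 3}} \frac{\partial \vec{\mathcal{E}}}{\partial t} \cdot d\vec{a} = \int_{\text{surface 3},\ r<r_1} \frac{d\mathcal{E}}{dt}\, da = \frac{d\mathcal{E}}{dt} \int_{\text{surface 3},\ r<r_1} da = \frac{1}{\epsilon_0} \frac{d\sigma}{dt}\, \pi r_1^2$$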


Fig.  27-4  A surface  of  integration  which  is  penetrated  by  both  a 
conduction  current  and  a displacement  current. 




Adding  the  conduction  and  displacement  currents  you  obtain  the  right  side  of 
Eq.  (27-4)  for  surface  3: 


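$$\begin{aligned}
\mu_0 \left( \int_{\text{enclosed surface}} \vec{j} \cdot d\vec{a} + \epsilon_0 \int_{\text{enclosed surface}} \frac{\partial \vec{\mathcal{E}}}{\partial t} \cdot d\vec{a} \right)
&= \mu_0 \left[ \frac{d\sigma}{dt}(\pi R^2 - \pi r_1^2) + \epsilon_0 \frac{1}{\epsilon_0} \frac{d\sigma}{dt}\, \pi r_1^2 \right] \\
&= \mu_0 \frac{d\sigma}{dt}\, \pi R^2 = \mu_0\, a \frac{d\sigma}{dt} = \mu_0 \frac{dq}{dt} = \mu_0 i
\end{aligned}$$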


Here  a is  the  total  area  of  the  positively  charged  capacitor  plate,  and  q is  the  total 
charge  on  it.  It  is  gratifying  to  see  that  this  result  agrees  with  those  obtained  in  the 
text  for  surfaces  1 and  2. 

The left side of the equation is evaluated by noting that the vectors $\vec{\mathcal{B}}$ and $d\vec{l}$ at
the same point are parallel. Thus you have

$$\oint_{\text{closed curve}} \vec{\mathcal{B}} \cdot d\vec{l} = \oint_{\text{closed curve}} \mathcal{B}\, dl = \mathcal{B} \oint_{\text{closed curve}} dl = \mathcal{B}\, 2\pi r_2$$

Equating this to the value obtained for the right side gives

$$\mathcal{B}\, 2\pi r_2 = \mu_0 i$$

or

$$\mathcal{B} = \frac{\mu_0 i}{2\pi r_2}$$

The same result was obtained in Eq. (23-42a) for the magnetic field encircling a
continuous wire carrying current $i$. Thus the displacement current will, under all
circumstances, make the same contribution to the magnetic field as the conduction
current it replaces. As far as its connection with magnetic field is concerned, a displacement
current is no different from a conduction current.
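A short numerical sketch of Example 27-1 can make the bookkeeping concrete. It assumes a circular plate of radius $R$ with uniform charge density, so that $d\sigma/dt = i/(\pi R^2)$, and it splits the total into the conduction part through the curved wall of surface 3 and the displacement part through its flat end. The function names and the numerical values of `i`, `R_plate`, and `r2` are hypothetical.

```python
import math

MU0 = 4 * math.pi * 1e-7  # permeability constant, T m/A

def surface3_currents(i, R, r1):
    """Conduction and displacement currents through surface 3 for cylinder radius r1 <= R."""
    dsigma_dt = i / (math.pi * R**2)             # uniform charging of the plate
    conduction = dsigma_dt * (math.pi * R**2 - math.pi * r1**2)
    displacement = dsigma_dt * math.pi * r1**2   # eps0 * (1/eps0) * dsigma/dt * pi r1^2
    return conduction, displacement

def field_on_curve(i, r2):
    """Field magnitude on the circle of radius r2, B = mu0 i / (2 pi r2)."""
    return MU0 * i / (2 * math.pi * r2)

i = 0.50         # hypothetical lead current, A
R_plate = 0.10   # hypothetical plate radius, m
r2 = 0.15        # hypothetical radius of the closed curve, m

totals = []
for r1 in (0.0, 0.02, 0.05, 0.10):
    conduction, displacement = surface3_currents(i, R_plate, r1)
    totals.append(conduction + displacement)
    print(f"r1 = {r1:.2f} m: conduction {conduction:.3f} A, displacement {displacement:.3f} A")

B = field_on_curve(i, r2)
print(f"B on the curve: {B:.3e} T")
```

The split between the two contributions changes with the cylinder radius, but their sum stays equal to the lead current, so the field on the curve does not depend on how surface 3 is drawn.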


Example 27-2 will give you a feel for the magnitude of the magnetic
field which must be present when a displacement current is present in a
typical circuit.


EXAMPLE  27-2 


a. Obtain an expression for the magnetic field strength $\mathcal{B}$ at the point between
the capacitor plates indicated in Fig. 27-5. Express $\mathcal{B}$ in terms of the rate of change
$d\mathcal{E}/dt$ of the electric field strength between the plates.

■ First you visualize a circle of radius $r_1$ inside the plates in a plane parallel to
them. Applying Eq. (27-4) to this closed curve and to the plane enclosed by the
circle, you have

$$\oint_{\text{closed curve}} \vec{\mathcal{B}} \cdot d\vec{l} = \mu_0 \left( \int_{\text{enclosed surface}} \vec{j} \cdot d\vec{a} + \epsilon_0 \int_{\text{enclosed surface}} \frac{\partial \vec{\mathcal{E}}}{\partial t} \cdot d\vec{a} \right)$$

Fig.  27-5  A plane-parallel  capacitor 
considered  in  Example  27-2. 


This  immediately  gives 


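$$\oint_{\text{closed curve}} \mathcal{B}\, dl = \mathcal{B} \oint_{\text{closed curve}} dl = \mathcal{B}\, 2\pi r_1$$

for the left side. Since no conduction current crosses the surface between the
plates, the right side is

$$\mu_0 \epsilon_0 \int_{\text{enclosed surface}} \frac{\partial \mathcal{E}}{\partial t}\, da = \mu_0 \epsilon_0 \frac{d\mathcal{E}}{dt} \int_{\text{enclosed surface}} da = \mu_0 \epsilon_0 \frac{d\mathcal{E}}{dt}\, \pi r_1^2$$

So you have

$$\mathcal{B}\, 2\pi r_1 = \mu_0 \epsilon_0 \frac{d\mathcal{E}}{dt}\, \pi r_1^2$$

or

$$\mathcal{B} = \frac{\mu_0 \epsilon_0 r_1}{2} \frac{d\mathcal{E}}{dt} \qquad \text{for } r_1 \le R$$

Thus $\mathcal{B}$ is proportional to $d\mathcal{E}/dt$ and increases with increasing $r_1$ to achieve a
maximum value at $r_1 = R$, the edge of the region between the capacitor plates. The
maximum value is

$$\mathcal{B}_{\max} = \frac{\mu_0 \epsilon_0 R}{2} \frac{d\mathcal{E}}{dt} \tag{27-5}$$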


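What happens to $\mathcal{B}$ for $r_1 > R$?

b. Evaluate $\mathcal{B}$ for $r_1 = R = 0.100$ m and $d\mathcal{E}/dt = 1.00 \times 10^{10}$ V/(m·s).

■ By using

$$\mu_0 = 4\pi \times 10^{-7}\ \text{T} \cdot \text{m/A}$$

and

$$\epsilon_0 = 8.85 \times 10^{-12}\ \text{C}^2/(\text{N} \cdot \text{m}^2)$$

Eq. (27-5) gives you

$$\mathcal{B}_{\max} = \frac{4\pi \times 10^{-7}\ \text{T} \cdot \text{m/A} \times 8.85 \times 10^{-12}\ \text{C}^2/(\text{N} \cdot \text{m}^2) \times 1.00 \times 10^{-1}\ \text{m} \times 1.00 \times 10^{10}\ \text{V}/(\text{m} \cdot \text{s})}{2} = 5.56 \times 10^{-9}\ \text{T}$$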
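The arithmetic of part b is easy to reproduce. The sketch below evaluates the expression found in part a, using the constants quoted in the example; the function name `b_inside` is a hypothetical label, and the formula is applied only for $r_1 \le R$, where it was derived.

```python
import math

MU0 = 4 * math.pi * 1e-7   # T m/A
EPS0 = 8.85e-12            # C^2/(N m^2)

def b_inside(r1, dE_dt):
    """Field magnitude at radius r1 between the plates (valid only for r1 <= R)."""
    return MU0 * EPS0 * r1 / 2 * dE_dt

R = 0.100        # plate radius, m
dE_dt = 1.00e10  # rate of change of the electric field, V/(m s)

b_max = b_inside(R, dE_dt)
print(f"{b_max:.2e} T")  # prints 5.56e-09 T
```

Because the result is linear in both `r1` and `dE_dt`, halving either one halves the field.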


Example 27-2 shows that the generalized Ampere’s law predicts an
extremely small magnetic field, even when the rate of change of electric
field is quite large, in situations where the changing electric field is the only
source of the magnetic field. This contrasts sharply with the very
appreciable electric field predicted by Faraday’s law for a system in which there is
a magnetic field having a modest rate of change. See Example 25-4. As a
consequence of this situation, it is easy to demonstrate the Faraday-Henry
effect directly in the laboratory, while it is not possible to do the same for
Maxwell’s effect. So it is not surprising that the former was discovered by
experimental investigation, while the latter was discovered by theoretical
analysis.

Insight into this contrast can be gained by writing Faraday’s law, Eq.
(25-15b):

$$\oint_{\text{closed curve}} \vec{\mathcal{E}} \cdot d\vec{l} = -\int_{\text{enclosed surface}} \frac{\partial \vec{\mathcal{B}}}{\partial t} \cdot d\vec{a}$$


Now  write  the  generalized  Ampere’s  law  for  a situation  in  which  a 
magnetic  field  is  present  only  because  there  is  a changing  electric  field.  We 
have 

$$\oint_{\text{closed curve}} \vec{\mathcal{B}} \cdot d\vec{l} = \mu_0 \epsilon_0 \int_{\text{enclosed surface}} \frac{\partial \vec{\mathcal{E}}}{\partial t} \cdot d\vec{a}$$

Except for the sign and the factor $\mu_0 \epsilon_0$ in the latter, the two equations have a very
symmetrical appearance. But the magnitude of the factor is $\mu_0 \epsilon_0 \approx 10^{-17}$
T·C²/(N·m·A). Its minute value explains why Maxwell’s effect is of no
practical consequence in many familiar cases. It plays a vital role in
electromagnetic waves, however, as we will see later in this chapter.
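The size of the factor is quickly checked with the constants quoted in Example 27-2. As a hedged aside, the sketch also evaluates $1/\sqrt{\mu_0\epsilon_0}$; this is the standard form of the propagation speed that Sec. 27-1 says Maxwell compared with the measured speed of light, and it is assumed here rather than derived, since the derivation comes later in the chapter.

```python
import math

MU0 = 4 * math.pi * 1e-7   # T m/A
EPS0 = 8.85e-12            # C^2/(N m^2)

product = MU0 * EPS0
# Assumed standard result, not derived in this section: speed = 1/sqrt(mu0 * eps0)
speed = 1 / math.sqrt(product)

print(f"mu0 * eps0         = {product:.3e} T C^2/(N m A)")
print(f"1/sqrt(mu0 * eps0) = {speed:.3e} m/s")
```

The product is about $1.1 \times 10^{-17}$ in SI units, consistent with the order of magnitude quoted in the text.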


27-2 MAXWELL’S EQUATIONS


The  complete  set  of  Maxwell’s  equations  can  be  written 

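$$\oint_{\text{closed surface}} \vec{\mathcal{E}} \cdot d\vec{a} = \frac{1}{\epsilon_0} \int_{\text{enclosed volume}} \rho\, dv \tag{27-6a}$$

$$\oint_{\text{closed surface}} \vec{\mathcal{B}} \cdot d\vec{a} = 0 \tag{27-6b}$$

$$\oint_{\text{closed curve}} \vec{\mathcal{E}} \cdot d\vec{l} = -\int_{\text{enclosed surface}} \frac{\partial \vec{\mathcal{B}}}{\partial t} \cdot d\vec{a} \tag{27-6c}$$

$$\oint_{\text{closed curve}} \vec{\mathcal{B}} \cdot d\vec{l} = \mu_0 \left( \int_{\text{enclosed surface}} \vec{j} \cdot d\vec{a} + \epsilon_0 \int_{\text{enclosed surface}} \frac{\partial \vec{\mathcal{E}}}{\partial t} \cdot d\vec{a} \right) \tag{27-6d}$$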

closed  enclosed  enclosed 

curve  surface  surface 

The first of these is just Gauss’ law for the electric field $\vec{\mathcal{E}}$, Eq. (20-37), with
the total charge within the closed surface written as the integral over the
enclosed volume of the charge density $\rho$. As you know, it is based
fundamentally on Coulomb’s experimental work. The second of Maxwell’s
equations is Gauss’ law for the magnetic field $\vec{\mathcal{B}}$, Eq. (23-54). Its basis is the
experimental observation that the magnetic “charge” density so far has
always been found to be zero; in other words, all experimental results
obtained to date show that there are no magnetic monopoles. Maxwell’s third
equation is Faraday’s law, written in the form of Eq. (25-15b). It is founded
on laboratory work of Faraday and Henry. As you saw in Sec. 27-1, the
fourth equation is justified by a combination of Ampere’s measurements,
showing that a nonzero conduction current density $\vec{j}$ is always accompanied
by a magnetic field, and Maxwell’s analysis, according to which a nonzero
rate of change of electric field $\partial \vec{\mathcal{E}}/\partial t$, multiplied by the constant $\epsilon_0$, also acts
as a current density and therefore also is accompanied by a magnetic field.
The complete experimental basis for electromagnetism is summarized in
these four equations, together with the equation defining $\vec{\mathcal{E}}$ and $\vec{\mathcal{B}}$ in terms
of the force $\vec{F}$ acting on a test charge $q$ moving at velocity $\vec{v}$. This is the
Lorentz force, given by Eq. (23-19),

$$\vec{F} = q\vec{\mathcal{E}} + q\vec{v} \times \vec{\mathcal{B}}$$
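As a small illustration of how the Lorentz force ties the two fields to a measurable force, the sketch below evaluates $q\vec{\mathcal{E}} + q\vec{v} \times \vec{\mathcal{B}}$ component by component. The helper names `cross` and `lorentz_force` and all numerical values are hypothetical, chosen so that the second case has the electric and magnetic forces cancelling.

```python
def cross(a, b):
    """Cross product of two 3-component vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def lorentz_force(q, E, v, B):
    """Force on a charge q moving at velocity v in fields E and B (SI units)."""
    v_cross_B = cross(v, B)
    return tuple(q * (E[k] + v_cross_B[k]) for k in range(3))

q = 1.6e-19               # hypothetical test charge, C
v = (1.0e5, 0.0, 0.0)     # velocity along x, m/s
B = (0.0, 0.0, 1.0e-2)    # magnetic field along z, T

# Magnetic force alone: v x B points along -y for these directions.
F_magnetic = lorentz_force(q, (0.0, 0.0, 0.0), v, B)

# Add an electric field along +y of magnitude v*B: the two forces cancel.
F_balanced = lorentz_force(q, (0.0, 1.0e3, 0.0), v, B)

print(F_magnetic)
print(F_balanced)
```

Measuring the force on a charge at rest isolates the electric field, and the additional force that appears when the charge moves determines the magnetic field.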


Maxwell’s equations relate the electric and magnetic fields to each other and
to the density of electric charge and the electric conduction current. Since the
universe does contain electric monopoles but not, apparently, magnetic monopoles,
the only charge and conduction current densities that exist are electric. If
magnetic monopoles existed, there would be a volume integral of the magnetic
“charge” density on the right side of Eq. (27-6b) and a surface integral of the
magnetic “conduction current” density on the right side of Eq. (27-6c). Then the entire
set of equations would be quite symmetrical. This is not what we have observed to
happen in nature. But what can and often does happen is that the electric charge
and conduction current densities are both zero, so that the volume integral on the
right side of Eq. (27-6a) and the first surface integral on the right side of Eq.
(27-6d) are equal to zero. This is the case for electric and magnetic fields in a
region completely devoid of the charged constituents found in all matter, that is,
for electric and magnetic fields in vacuum.


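If electric and magnetic fields are present in vacuum, they must be
related to each other in such a way as to satisfy Maxwell’s equations for the
case $\rho = 0$ and $\vec{j} = 0$. Imposing these special but often encountered
conditions on Eqs. (27-6), we obtain Maxwell’s equations in vacuum:

$$\oint_{\text{closed surface}} \vec{\mathcal{E}} \cdot d\vec{a} = 0 \tag{27-7a}$$

$$\oint_{\text{closed surface}} \vec{\mathcal{B}} \cdot d\vec{a} = 0 \tag{27-7b}$$

$$\oint_{\text{closed curve}} \vec{\mathcal{E}} \cdot d\vec{l} = -\int_{\text{enclosed surface}} \frac{\partial \vec{\mathcal{B}}}{\partial t} \cdot d\vec{a} \tag{27-7c}$$

$$\oint_{\text{closed curve}} \vec{\mathcal{B}} \cdot d\vec{l} = \mu_0 \epsilon_0 \int_{\text{enclosed surface}} \frac{\partial \vec{\mathcal{E}}}{\partial t} \cdot d\vec{a} \tag{27-7d}$$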

The  first  and  second  equations  state  that  in  vacuum  there  is  no  net  electric 
or  magnetic  flux  penetrating  any  closed  surface.  The  third  and  fourth 
equations  state  relations  that  must  be  satisfied  in  vacuum  between  the  space 
dependence  of  the  electric  field  and  the  time  dependence  of  the  magnetic 
field,  and  also  between  the  space  dependence  of  the  magnetic  field  and  the 
time  dependence  of  the  electric  field. 


Maxwell’s  third  and  fourth  equations  show  that  there  are  relations 
between  the  space  and  time  dependences  of  the  quantities  describing  an 
electromagnetic  field  in  vacuum.  Recall  that  the  wave  equation  for  a 
stretched  string  describes  in  a similar  fashion  a relation  between  the  space 
and  time  dependence  of  the  transverse  displacement  of  a string  as  a wave 
propagates  along  it,  as  you  saw  in  Chap.  12.  The  similarity  suggests  that  it 
may  be  possible  to  use  Eqs.  (27-7)  to  obtain  wave  equations  for  electric 
and  magnetic  fields  in  a vacuum.  We  do  so  in  Sec.  27-3. 


27-3 THE ELECTROMAGNETIC WAVE EQUATIONS


In Chap. 12 we applied Newton’s equations of mechanics to develop the
wave equation for transverse displacements $y = f(x, t)$ of a wave traveling in
a string of density per unit length $\mu$ stretched along the $x$ axis by the
tension $F$:

$$\frac{\partial^2 f(x, t)}{\partial x^2} = \frac{\mu}{F} \frac{\partial^2 f(x, t)}{\partial t^2}$$

Here we apply Maxwell's equations to develop wave equations for transverse electric and magnetic fields in a vacuum. These electromagnetic wave equations will prove to be of exactly the same mathematical form as the mechanical wave equation, despite the fact that their physical origins are so different. As a consequence, the electromagnetic wave equations will have exactly the same kinds of solutions as the mechanical wave equation. So nearly all the work we did in analyzing the behavior of mechanical waves in Chaps. 12 and 13 will be applicable to our work with electromagnetic waves in this chapter and the next. For instance, once we obtain the electromagnetic wave equations, we will find traveling-wave solutions by inspection and immediately will be able to predict the speed of such waves.

Fig. 27-6  Part of one wave front of an electromagnetic plane wave propagating in the direction of the positive or negative x axis. Other wave fronts are parallel to the one shown. At any instant the same electric field vector describes the field at all points on any given wave front. If the wave is a perfect plane wave, all the wave fronts extend to infinity in the directions perpendicular to the direction of propagation. Of course, this condition is never strictly met in practice. Nevertheless, a plane wave can be used to approximate well the central part of a wave propagating in nearly a single direction.

Fig. 27-7  An infinitesimal cubic surface of integration enclosing an evacuated region.

Let us consider the simplest situation in which electromagnetic waves can exist and travel in a direction along the x axis. Suppose that an evacuated region contains electric and magnetic fields. Take first the electric field. It must vary both in space and in time if it is associated with a wave. Let us assume for simplicity that at any particular time the electric field is the same everywhere in any plane parallel to the yz plane in Fig. 27-6, although its value varies from one plane to the next. That is, we say that the electric field $\boldsymbol{\mathcal{E}}$ depends on the coordinate x, but not on the coordinate y or z. Since it also depends on the time t, the electric field we are considering can be written as $\boldsymbol{\mathcal{E}}(x, t)$.

We are dealing with the electric field of what is called a plane wave. The word "plane" refers to the shape of the surfaces on which $\boldsymbol{\mathcal{E}}$ has constant values at any given instant. These surfaces are called wave fronts; the terminology is the same as that used with sound waves to describe the surfaces on which the pressure has constant values at a given instant, to cite just one example. For the case being considered, the plane wave fronts are normal to the x axis. (Later in this chapter we investigate how electromagnetic waves are produced. For now, it suffices to say that an electromagnetic plane wave is found to a very good approximation in a beam from a suitably adjusted laser, in a region that is not too close to the edges of the beam. If the beam extends along the x axis, then these plane wave fronts are normal to the x axis.)

Our first task in developing the electromagnetic wave equation is to prove that if a plane electromagnetic wave exists, both its $\boldsymbol{\mathcal{E}}$ and its $\boldsymbol{\mathcal{B}}$ vectors must be perpendicular to the direction of propagation. That is, we will prove that the wave must be transverse. To do this, we apply Maxwell's first equation in vacuum, Eq. (27-7a),

$$\oint_{\text{closed surface}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = 0$$

to the closed surface surrounding an evacuated region shown in Fig. 27-7. The surface comprises the six faces of an infinitesimal cube with edge lengths dx, dy, dz. Since $\boldsymbol{\mathcal{E}}$ does not depend on y, any contribution to the flux integral from the lower face of area dx dz at coordinate y will be canceled exactly by a contribution from the upper face, also of area dx dz, at coordinate y + dy. To see this, first note that at the upper face the outward flux will be $\mathcal{E}_y\,dx\,dz$, where $\mathcal{E}_y$ is the y component of $\boldsymbol{\mathcal{E}}$. The reason is that the outward-pointing surface-element vector, whose magnitude is dx dz, is in the positive y direction at that face. At the lower face, however, the outward flux is $-\mathcal{E}_y\,dx\,dz$, since the outward-pointing surface-element vector is in the negative y direction at that face. Thus the net outward flux for these two faces is $\mathcal{E}_y\,dx\,dz - \mathcal{E}_y\,dx\,dz = 0$, as stated. Similarly, there will be no net contribution to the outward electric flux from the pair of front and back faces of area dx dy, which are separated by the distance dz, because $\boldsymbol{\mathcal{E}}$ does not depend on z.

Maxwell's first equation says that there is no net electric flux penetrating outward through the closed surface, because it surrounds a region containing no charge. Since we have argued that there is no net flux passing through two of its three pairs of faces, it must also be true that there is no net flux passing through the third pair. This pair consists of the left face of area dy dz at coordinate x and the right face of the same area at coordinate x + dx. The outward flux penetrating the latter is $(\mathcal{E}_x)_{x+dx}\,dy\,dz$, where the subscript x + dx means that $\mathcal{E}_x$ is to be evaluated at the coordinate specifying the location of that face. The outward flux at the face whose coordinate is x is $-(\mathcal{E}_x)_x\,dy\,dz$. Thus we have

$$\oint_{\text{closed surface}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{a} = (\mathcal{E}_x)_{x+dx}\,dy\,dz - (\mathcal{E}_x)_x\,dy\,dz = 0$$

The second equality can be divided by dy dz, and the negative term can be transposed to the opposite side of the equality, to give us

$$(\mathcal{E}_x)_{x+dx} = (\mathcal{E}_x)_x$$

In words, we have found that the x component of $\boldsymbol{\mathcal{E}}$ has the same value at coordinate x + dx as it has at coordinate x. In other words, the component $\mathcal{E}_x$ does not depend on x. It is possible that an x component of the electric field could be present when, as assumed, we have electromagnetic plane waves with wave fronts normal to the x axis. But if such a component is present, it cannot depend on x and therefore can have nothing to do with the wave. It just represents a uniform electric field in the x direction which happens also to be there. Henceforth we ignore any such field.
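
The face-by-face flux bookkeeping in this argument can be tried out numerically. The following is a minimal Python sketch, not part of the original derivation: the function name `outward_flux_by_face_pair`, the two sample fields, and the cube dimensions are all illustrative assumptions. It approximates the flux through each face by the field value at the face center times the face area, for fields that depend on x only.

```python
import math

def outward_flux_by_face_pair(field, x, y, z, dx, dy, dz):
    """Return the net outward flux through the x, y, and z pairs of faces.

    `field(x, y, z)` returns a tuple (Ex, Ey, Ez). Each face is approximated
    by the field value at its center multiplied by its area.
    """
    xc, yc, zc = x + dx / 2, y + dy / 2, z + dz / 2
    flux_x = (field(x + dx, yc, zc)[0] - field(x, yc, zc)[0]) * dy * dz
    flux_y = (field(xc, y + dy, zc)[1] - field(xc, y, zc)[1]) * dx * dz
    flux_z = (field(xc, yc, z + dz)[2] - field(xc, yc, z)[2]) * dx * dy
    return flux_x, flux_y, flux_z

def plane_wave_field(x, y, z):
    # Depends on x only, and its x component is uniform (hypothetical values).
    return (5.0, math.cos(x), math.sin(x))

def forbidden_field(x, y, z):
    # Hypothetical field whose x component varies with x.
    return (5.0 + 2.0 * x, math.cos(x), math.sin(x))

cube = (0.3, 0.1, -0.2, 1e-3, 1e-3, 1e-3)  # corner (x, y, z) and edges dx, dy, dz
allowed = outward_flux_by_face_pair(plane_wave_field, *cube)
forbidden = outward_flux_by_face_pair(forbidden_field, *cube)
print("uniform x component:    ", allowed)
print("x-dependent x component:", forbidden)
```

In this sketch the y and z pairs of faces cancel for both fields, because neither field depends on y or z. Only the field with an x-dependent x component leaves a net flux through the charge-free cube, which is what Maxwell's first equation in vacuum rules out.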

We can apply the second of Maxwell's equations, Eq. (27-7b), to our electromagnetic wave whose wave fronts we have assumed to be planes normal to the x axis. On any of these wave fronts, the magnetic field has everywhere the same value $\boldsymbol{\mathcal{B}}$ at a given time. By means of an argument identical to that above, we can show that $\boldsymbol{\mathcal{B}}$ cannot have a component $\mathcal{B}_x$ which depends on x. Thus it is the same with both $\boldsymbol{\mathcal{E}}$ and $\boldsymbol{\mathcal{B}}$: any parts of these electric and magnetic field vectors that are associated with an electromagnetic plane wave have only components which lie in the wave fronts. That is, both field vectors lie everywhere in planes parallel to the yz plane.

The  analogy  to  sound  waves  suggests  that  electromagnetic  waves  prop- 
agate in  the  direction  normal  to  the  wave  fronts,  that  is,  in  the  x direction 
for  the  wave  we  are  considering.  We  soon  prove  that  this  is  true.  Thus  our 
conclusion  can  be  expressed  by  saying  that  the  electric  and  magnetic  fields  in 
the  wave  can  have  components  only  in  directions  transverse  to  the  direction  of  propa- 
gation. The  electromagnetic  wave  must  be  a transverse  wave. 

You will recall that waves in a stretched string are also transverse waves. For such waves, the quantity that is restricted to directions transverse to the propagation direction is the displacement of segments of the string, rather than the electric and magnetic fields of transverse electromagnetic waves. Nevertheless, the analogy between the two types of transverse waves is strong enough to suggest we can assume that the electromagnetic wave we are investigating is what is called a polarized wave. That is, we take the electric field to be restricted to the y direction, just as in Chap. 12 we took the displacement of string segments to be restricted to the y direction. If we had allowed the transverse displacement in the string to have a z component as well, we would have complicated the mathematics considerably without a corresponding gain in the understanding of the physics to which it leads. (Covering the front of the laser mentioned before with a suitably oriented sheet of Polaroid film will polarize the essentially plane wave emerging from it with the electric field restricted to the y direction.)

Fig. 27-8  An infinitesimal square surface of integration, parallel to the xy plane, enclosed by a square curve of integration.

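Now we apply the third of Maxwell's equations, Eq. (27-7c),

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{l} = -\int_{\text{enclosed surface}} \frac{\partial \boldsymbol{\mathcal{B}}}{\partial t} \cdot d\mathbf{a}$$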

We take the electric field to be polarized in the y direction so that it has only the component $\mathcal{E}_y$. We evaluate the integrals on the closed curve indicated in Fig. 27-8 and on the plane surface enclosed by it. The surface is a square of infinitesimal sides dx and dy, constructed in a plane parallel to the xy plane. Let us integrate the electric circulation counterclockwise around the square at some particular instant of time, starting from the corner on the lower left. For the first side of the square, $\boldsymbol{\mathcal{E}}$ is perpendicular to the element $d\mathbf{l}$ of the curve, so $\boldsymbol{\mathcal{E}} \cdot d\mathbf{l}$ is zero and we obtain no contribution to the integral. The second side gives a contribution $(\mathcal{E}_y)_{x+dx}\,dy$, since here the two vectors are in the same direction, the magnitude of $\boldsymbol{\mathcal{E}}$ is $(\mathcal{E}_y)_{x+dx}$, and the magnitude of $d\mathbf{l}$ is dy. The third side again yields zero because the vectors are perpendicular. From the fourth side we obtain $-(\mathcal{E}_y)_x\,dy$ because here the vectors point in opposite directions. Adding all the contributions, we have

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{l} = (\mathcal{E}_y)_{x+dx}\,dy - (\mathcal{E}_y)_x\,dy = \left[(\mathcal{E}_y)_{x+dx} - (\mathcal{E}_y)_x\right] dy$$

The quantity in brackets is the difference between the value of $\mathcal{E}_y$ at x + dx and its value at x. It can be written in terms of the rate of change of $\mathcal{E}_y$ with respect to x, multiplied by the change in x:

$$(\mathcal{E}_y)_{x+dx} - (\mathcal{E}_y)_x = \frac{\partial \mathcal{E}_y}{\partial x}\,dx$$

We use partial-derivative notation because $\mathcal{E}_y$ depends on both x and t, and the time t is fixed while the position x is varied in calculating the derivative. The derivative must be evaluated in the range x to x + dx. But since $\mathcal{E}_y$ has no y dependence, $\partial \mathcal{E}_y/\partial x$ must likewise be independent of y. So it can be evaluated at any y. We choose to evaluate $\partial \mathcal{E}_y/\partial x$ at the center of the square of sides dx and dy. Using this relation, we have

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{E}} \cdot d\mathbf{l} = \frac{\partial \mathcal{E}_y}{\partial x}\,dx\,dy \tag{27-8}$$
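
The step from the four-sided circulation to Eq. (27-8) can be checked numerically. The sketch below is only an illustration under stated assumptions: the sinusoidal field `e_y`, the value of `WAVELENGTH`, and the size of the square are hypothetical choices, not taken from the text. It sums the two nonzero sides of the circulation and compares the result with the derivative at the center of the square times the area dx dy.

```python
import math

def circulation_around_square(e_y, x, dx, dy):
    """Counterclockwise circulation of a y-polarized field around the square.

    The two sides parallel to the x axis contribute nothing because the
    field is perpendicular to them; only the sides at x + dx and x count.
    """
    right_side = e_y(x + dx) * dy   # field parallel to dl
    left_side = -e_y(x) * dy        # field antiparallel to dl
    return right_side + left_side

WAVELENGTH = 1.0  # hypothetical value, arbitrary length units

def e_y(x):
    # Assumed snapshot of the field at one instant of time.
    return math.cos(2 * math.pi * x / WAVELENGTH)

def de_y_dx(x):
    # Analytic derivative of the assumed field with respect to x.
    return -(2 * math.pi / WAVELENGTH) * math.sin(2 * math.pi * x / WAVELENGTH)

x, dx, dy = 0.1, 1e-4, 1e-4
circulation = circulation_around_square(e_y, x, dx, dy)
estimate = de_y_dx(x + dx / 2) * dx * dy  # derivative at the center of the square
print(circulation, estimate)
```

For a square this small the two numbers agree to within a very small relative difference, which is the content of Eq. (27-8).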


Now let us evaluate the integral on the right side of Eq. (27-7c) over the surface of the same infinitesimal square shown in Fig. 27-8. Applying the right-hand rule for surface-element vectors to the situation illustrated in the figure, we see that $d\mathbf{a}$ will be in the positive z direction. This tells us that if the integral is to have a nonzero value, and so satisfy Maxwell's third equation, the magnetic field $\boldsymbol{\mathcal{B}}$ must have a z component $\mathcal{B}_z$. Only then can its time derivative $\partial \boldsymbol{\mathcal{B}}/\partial t$ have a z component $\partial \mathcal{B}_z/\partial t$, and this is the only component that will contribute to the dot product in the integral.

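Thus the polarized plane wave we are dealing with has a magnetic field $\mathcal{B}_z$ as well as an electric field $\mathcal{E}_y$, and we have

$$\frac{\partial \boldsymbol{\mathcal{B}}}{\partial t} \cdot d\mathbf{a} = \frac{\partial \mathcal{B}_z}{\partial t}\,da$$

Integrating gives

$$\int_{\text{enclosed surface}} \frac{\partial \boldsymbol{\mathcal{B}}}{\partial t} \cdot d\mathbf{a} = \int_{\text{enclosed surface}} \frac{\partial \mathcal{B}_z}{\partial t}\,da$$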


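The value of this integral can be expressed exactly as the average value of $\partial \mathcal{B}_z/\partial t$ over the surface of the square times the area dx dy of the square. Since the square is infinitesimal, this average value of $\partial \mathcal{B}_z/\partial t$ must be extremely close to the value of $\partial \mathcal{B}_z/\partial t$ at the center of the square. Thus we have

$$\int_{\text{enclosed surface}} \frac{\partial \mathcal{B}_z}{\partial t}\,da = \frac{\partial \mathcal{B}_z}{\partial t}\,dx\,dy$$

where it is to be understood that the quantity $\partial \mathcal{B}_z/\partial t$ outside the integral is evaluated at the center of the square of sides dx and dy. Combining the last two equations, we find

$$\int_{\text{enclosed surface}} \frac{\partial \boldsymbol{\mathcal{B}}}{\partial t} \cdot d\mathbf{a} = \frac{\partial \mathcal{B}_z}{\partial t}\,dx\,dy \tag{27-9}$$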


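We now use Eqs. (27-8) and (27-9) in Maxwell's third equation, Eq. (27-7c), to obtain

$$\frac{\partial \mathcal{E}_y}{\partial x}\,dx\,dy = -\frac{\partial \mathcal{B}_z}{\partial t}\,dx\,dy$$

or

$$\frac{\partial \mathcal{E}_y}{\partial x} = -\frac{\partial \mathcal{B}_z}{\partial t} \tag{27-10}$$

That is, a variation of $\mathcal{E}_y$ in space must be associated with a variation of $\mathcal{B}_z$ in time.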


Next we make use of Maxwell's fourth equation in vacuum, Eq. (27-7d):

$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = \mu_0 \epsilon_0 \int_{\text{enclosed surface}} \frac{\partial \boldsymbol{\mathcal{E}}}{\partial t} \cdot d\mathbf{a}$$

Fig. 27-9  An infinitesimal square surface of integration, parallel to the xz plane, enclosed by a square curve of integration.

We apply it to the closed curve in Fig. 27-9. This curve is also an infinitesimal square, but it lies parallel to the xz plane and has sides dx and dz. Its center coincides with the center of the square in Fig. 27-8. We start the integration around the curve at the left rear corner, and again we proceed in the counterclockwise sense. Thus the right-hand rule for surface-element vectors specifies that $d\mathbf{a}$ points in the positive y direction. Since the third and fourth of Maxwell's equations are very similar mathematically, we evaluate the two integrals in the fourth equation just as we did the corresponding integrals of the third. We obtain for the integral on the left side


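$$\oint_{\text{closed curve}} \boldsymbol{\mathcal{B}} \cdot d\mathbf{l} = (\mathcal{B}_z)_x\,dz - (\mathcal{B}_z)_{x+dx}\,dz = -\left[(\mathcal{B}_z)_{x+dx} - (\mathcal{B}_z)_x\right] dz = -\frac{\partial \mathcal{B}_z}{\partial x}\,dx\,dz$$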


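And for the integral on the right side we have

$$\int_{\text{enclosed surface}} \frac{\partial \boldsymbol{\mathcal{E}}}{\partial t} \cdot d\mathbf{a} = \int_{\text{enclosed surface}} \frac{\partial \mathcal{E}_y}{\partial t}\,da = \frac{\partial \mathcal{E}_y}{\partial t} \int_{\text{enclosed surface}} da = \frac{\partial \mathcal{E}_y}{\partial t}\,dx\,dz$$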


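Thus Maxwell's fourth equation gives us

$$-\frac{\partial \mathcal{B}_z}{\partial x}\,dx\,dz = \mu_0 \epsilon_0 \frac{\partial \mathcal{E}_y}{\partial t}\,dx\,dz$$

or

$$\frac{\partial \mathcal{B}_z}{\partial x} = -\mu_0 \epsilon_0 \frac{\partial \mathcal{E}_y}{\partial t} \tag{27-11}$$

Both derivatives are evaluated at the same point as the derivatives in Eq. (27-10). We have found that a variation of $\mathcal{B}_z$ in space must be associated with a variation of $\mathcal{E}_y$ in time. The near symmetry of this equation and Eq. (27-10) is striking.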


We have been dealing with a plane wave polarized so that its electric field has only a y component. That is, we have taken $\partial \mathcal{E}_y/\partial x \neq 0$ and $\partial \mathcal{E}_y/\partial t \neq 0$ but $\partial \mathcal{E}_z/\partial x = 0$ and $\partial \mathcal{E}_z/\partial t = 0$. Equations (27-10) and (27-11) show that the plane wave must also contain a magnetic field with a z component. That is, they show that $\partial \mathcal{B}_z/\partial x \neq 0$ and $\partial \mathcal{B}_z/\partial t \neq 0$. But they have nothing to say about the y component of the magnetic field. In fact, $\partial \mathcal{B}_y/\partial x = 0$ and $\partial \mathcal{B}_y/\partial t = 0$. That is, the magnetic field of the plane wave is polarized so that it has only a z component. This statement can be proved by going through an analysis essentially the same as the one we have gone through, but with the plane wave polarized so that the electric field has only a z component. The analysis leads to equations, much like Eqs. (27-10) and (27-11), which show that the value of $\partial \mathcal{B}_y/\partial x$ is proportional to the value of $\partial \mathcal{E}_z/\partial t$ and that the value of $\partial \mathcal{B}_y/\partial t$ is proportional to the value of $\partial \mathcal{E}_z/\partial x$. Since in the case we are dealing with here we have $\partial \mathcal{E}_z/\partial t = 0$ and $\partial \mathcal{E}_z/\partial x = 0$, the statement that we also have $\partial \mathcal{B}_y/\partial x = 0$ and $\partial \mathcal{B}_y/\partial t = 0$ follows immediately.




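Now we will eliminate $\mathcal{B}_z$ from Eqs. (27-10) and (27-11) and thereby obtain an equation involving only $\mathcal{E}_y$ as a dependent variable. First we take the partial derivatives with respect to x of both sides of Eq. (27-10), producing $(\partial/\partial x)(\partial \mathcal{E}_y/\partial x) = -(\partial/\partial x)(\partial \mathcal{B}_z/\partial t)$, or

$$\frac{\partial^2 \mathcal{E}_y}{\partial x^2} = -\frac{\partial^2 \mathcal{B}_z}{\partial x\,\partial t} \tag{27-12}$$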


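Then we take the partial derivatives with respect to t of both sides of Eq. (27-11). We obtain $(\partial/\partial t)(\partial \mathcal{B}_z/\partial x) = -\mu_0 \epsilon_0 (\partial/\partial t)(\partial \mathcal{E}_y/\partial t)$, or

$$\frac{\partial^2 \mathcal{B}_z}{\partial t\,\partial x} = -\mu_0 \epsilon_0 \frac{\partial^2 \mathcal{E}_y}{\partial t^2} \tag{27-13}$$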


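But

$$\frac{\partial^2 \mathcal{B}_z}{\partial x\,\partial t} = \frac{\partial^2 \mathcal{B}_z}{\partial t\,\partial x}$$

This is true because x and t are independent variables. Thus the same result is produced whether $\mathcal{B}_z$ is differentiated first with respect to t and then with respect to x or first with respect to x and then with respect to t. So Eqs. (27-12) and (27-13) give us

$$\frac{\partial^2 \mathcal{E}_y}{\partial x^2} = \mu_0 \epsilon_0 \frac{\partial^2 \mathcal{E}_y}{\partial t^2}$$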


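To make this result more explicit, we indicate that $\mathcal{E}_y$ is a dependent variable whose value depends on both the independent variables x and t by writing $\mathcal{E}_y = \mathcal{E}_y(x, t)$. Then we have

$$\frac{\partial^2 \mathcal{E}_y(x, t)}{\partial x^2} = \mu_0 \epsilon_0 \frac{\partial^2 \mathcal{E}_y(x, t)}{\partial t^2} \tag{27-14}$$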


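To get a similar equation for $\mathcal{B}_z$, we reverse the above process. That is, we go back to Eqs. (27-10) and (27-11) and eliminate $\mathcal{E}_y$ by differentiating both sides of the first with respect to t and both sides of the second with respect to x. We obtain

$$\frac{\partial^2 \mathcal{E}_y}{\partial t\,\partial x} = -\frac{\partial^2 \mathcal{B}_z}{\partial t^2}$$

and

$$\frac{\partial^2 \mathcal{B}_z}{\partial x^2} = -\mu_0 \epsilon_0 \frac{\partial^2 \mathcal{E}_y}{\partial x\,\partial t}$$

These combine to give us

$$\frac{\partial^2 \mathcal{B}_z}{\partial x^2} = \mu_0 \epsilon_0 \frac{\partial^2 \mathcal{B}_z}{\partial t^2}$$

or, more explicitly,

$$\frac{\partial^2 \mathcal{B}_z(x, t)}{\partial x^2} = \mu_0 \epsilon_0 \frac{\partial^2 \mathcal{B}_z(x, t)}{\partial t^2} \tag{27-15}$$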

Equations (27-14) and (27-15) are the electromagnetic wave equations. We have obtained them by carrying out mathematical manipulations on Maxwell's equations in vacuum. We study their physical significance in Sec. 27-4.


27-4  ELECTROMAGNETIC WAVES


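Compare the stretched-string wave equation with the electromagnetic wave equations:

$$\frac{\partial^2 f(x, t)}{\partial x^2} = \frac{\mu}{F} \frac{\partial^2 f(x, t)}{\partial t^2}$$

$$\frac{\partial^2 \mathcal{E}_y(x, t)}{\partial x^2} = \mu_0 \epsilon_0 \frac{\partial^2 \mathcal{E}_y(x, t)}{\partial t^2}$$

$$\frac{\partial^2 \mathcal{B}_z(x, t)}{\partial x^2} = \mu_0 \epsilon_0 \frac{\partial^2 \mathcal{B}_z(x, t)}{\partial t^2}$$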

Although the physical quantities related by the first equation are very different from those related by the second and third, from a mathematical point of view the differences are inconsequential. In Chap. 12 you learned that the equation for the mechanical waves in a string has traveling-wave solutions of the form

$$f(x, t) = f(x - vt)$$

These solutions describe profiles of transverse displacement $f(x, t)$ in the string which maintain fixed shapes as they move along the string in the x direction at velocity v. You also learned that the propagation speed $|v|$ of the traveling wave is given by the expression

$$|v| = \sqrt{\frac{F}{\mu}}$$

where $F$ and $\mu$ are the tension in the string and its linear density. Note that $\sqrt{F/\mu}$ is just the square root of the reciprocal of the constant multiplying the time derivative in the string wave equation.

Because of the similarity of the three wave equations, it would be natural to guess that the two electromagnetic equations have traveling-wave solutions of the form

$$\mathcal{E}_y(x, t) = \mathcal{E}_y(x - vt) \tag{27-16}$$

and

$$\mathcal{B}_z(x, t) = \mathcal{B}_z(x - vt) \tag{27-17}$$

where the propagation speed $|v|$ of these waves is given by

$$|v| = \frac{1}{\sqrt{\mu_0 \epsilon_0}} \tag{27-18}$$

This speed is also the square root of the reciprocal of the constant multiplying the time derivative in either of the electromagnetic wave equations. You should verify the guess by substituting Eq. (27-16) into the wave equation for $\mathcal{E}_y(x, t)$, after evaluating the required derivatives, and then employing Eq. (27-18). The procedure is identical to that carried out in Sec. 12-4.
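
The substitution check suggested here can also be approximated numerically. The sketch below is an illustration under stated assumptions rather than the analytic verification the text asks for: the Gaussian pulse shape `profile`, the sample point, and the step sizes are hypothetical choices. It estimates both second derivatives of a traveling pulse with central differences and compares the two sides of the wave equation for the electric field.

```python
import math

MU_0 = 4 * math.pi * 1e-7   # permeability constant, T·m/A
EPS_0 = 8.854e-12           # permittivity constant, C²/(N·m²)
V = 1.0 / math.sqrt(MU_0 * EPS_0)   # propagation speed from Eq. (27-18)

def profile(u):
    # Hypothetical pulse shape; any smooth function of u = x - vt would do.
    return math.exp(-u * u)

def field(x, t):
    # Traveling-wave form of Eq. (27-16).
    return profile(x - V * t)

def second_difference(g, s, h):
    """Central-difference estimate of the second derivative of g at s."""
    return (g(s + h) - 2 * g(s) + g(s - h)) / (h * h)

x0, t0 = 0.3, 2.0e-9   # sample point (m) and time (s), chosen arbitrarily
hx = 1e-3              # step in x, m
ht = hx / V            # matching step in t, s

lhs = second_difference(lambda x: field(x, t0), x0, hx)
rhs = MU_0 * EPS_0 * second_difference(lambda t: field(x0, t), t0, ht)
print(lhs, rhs)
```

The two printed values agree closely, as expected if a profile of fixed shape moving at the speed of Eq. (27-18) satisfies the wave equation. Replacing `V` by any other speed in `field` breaks the agreement.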




EXAMPLE  27-3 


Use Eq. (27-18) to determine the numerical value of the speed of propagation $|v|$ for electromagnetic waves, and comment on the result.

■ Recall from Eq. (23-38b) that the value of the permeability constant $\mu_0$ is defined to be

$$\mu_0 = 4\pi \times 10^{-7}\ \mathrm{T \cdot m/A}$$

The value of the permittivity constant $\epsilon_0$ can be obtained from static electric field measurements on a capacitor of accurately known dimensions, as described in Sec. 21-5. The value obtained from these measurements is

$$\epsilon_0 = 8.854 \times 10^{-12}\ \mathrm{C^2/(N \cdot m^2)}$$

Using these values in Eq. (27-18), you find for the speed of electromagnetic waves

$$|v| = \frac{1}{\sqrt{4\pi \times 10^{-7}\ \mathrm{T \cdot m/A} \times 8.854 \times 10^{-12}\ \mathrm{C^2/(N \cdot m^2)}}}$$

or

$$|v| = 2.998 \times 10^8\ \mathrm{m/s}$$

Time-of-flight measurements of the speed of light c were discussed in Sec. 14-2. The value given there is, to four significant figures,

$$c = 2.998 \times 10^8\ \mathrm{m/s}$$

The speed of electromagnetic waves is the same as the speed of light!
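
The arithmetic of this example is easy to reproduce. The following short Python sketch uses the two constants quoted above; the function name `wave_speed` is an illustrative choice. Note that the units combine as (T·m/A)(C²/(N·m²)) = s²/m², since 1 T = 1 N/(A·m) and 1 A = 1 C/s, so the result comes out in meters per second.

```python
import math

MU_0 = 4 * math.pi * 1e-7   # permeability constant, T·m/A
EPS_0 = 8.854e-12           # permittivity constant, C²/(N·m²)

def wave_speed(mu_0, eps_0):
    """Propagation speed from Eq. (27-18): |v| = 1 / sqrt(mu_0 * eps_0)."""
    return 1.0 / math.sqrt(mu_0 * eps_0)

speed = wave_speed(MU_0, EPS_0)
print(f"|v| = {speed:.3e} m/s")   # prints |v| = 2.998e+08 m/s
```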


The  first  person  ever  to  make  the  calculation  in  Example  27-3,  and 
thereby  discover  the  relation  between  the  speed  of  electromagnetic  waves 
and  the  speed  of  light,  was  Maxwell  (although  he  used  a very  different 
system  of  units). 


Maxwell's surprise and delight in making this discovery are evident in the following excerpt from a letter he wrote to William Thomson (later Lord Kelvin) toward the end of 1861: "I made out the equations in the country before I had any suspicion of the nearness between the two values [of what in modern terms are $(\mu_0 \epsilon_0)^{-1/2}$ and the speed of light c], so that I think I have reason to believe that light consists in the transverse undulations [oscillations] ... of electric and magnetic phenomena." You can imagine how he must have rushed to the university library to obtain the numerical values when he returned from vacation.


Maxwell’s  result  can  be  expressed  by  saying  that  light  propagates  as  an 
electromagnetic  wave.  This  discovery  was  a triumph  that  could  not  have  been 
anticipated  when  he  set  out  to  repair  the  flaw  in  Ampere’s  law. 

To summarize Maxwell's discovery, he showed five things: (1) If there is a varying electric field $\mathcal{E}_y(x, t)$ in vacuum, then there is also a varying magnetic field $\mathcal{B}_z(x, t)$, and vice versa. The electric and magnetic fields are transverse to the direction of propagation and perpendicular to each other. (2) These fields obey wave equations with identical propagation speeds, so a configuration of electric and magnetic fields will not change as it propagates in a beam in the x direction as t increases. (3) The propagation speed $|v|$, predicted by the wave equations in terms of the values of the electric and magnetic constants $\epsilon_0$ and $\mu_0$, is the same as the measured speed of light c. (4) Light waves can therefore be identified as electromagnetic waves. (5) This being the case, Eq. (27-18) can be written as

$$c = \frac{1}{\sqrt{\mu_0 \epsilon_0}} \tag{27-19}$$


For a specific example of a wave with the general form of Eq. (27-16), consider the sinusoidal polarized plane wave propagating in the positive x direction whose electric field is described by the function

$$\mathcal{E}_y(x, t) = A \cos\left[\frac{2\pi}{\lambda}(x - ct)\right] \tag{27-20}$$

Its propagation speed has been written as the speed of light c. Here A is the amplitude of the wave. That is, A is the maximum value attained by the electric field $\mathcal{E}_y(x, t)$. The quantity $\lambda$ is the wavelength of the wave. In other words, $\lambda$ is the change in position x that takes the sinusoidal function through one cycle, if the time t is fixed. If x is fixed, then the change in t required to take it through one cycle is the period T of the wave. Since in going through one cycle the argument of the cosine changes by $2\pi$, the value of T is given by

$$\frac{2\pi}{\lambda}\,cT = 2\pi$$

or

$$T = \frac{\lambda}{c}$$

The reciprocal of T is the frequency $\nu$ of the wave. So we have $\nu = 1/T$, and therefore

$$\nu = \frac{c}{\lambda} \tag{27-21}$$

This is just Eq. (12-11b) with $|v|$ set equal to c.
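
As a quick numerical illustration of the relations among wavelength, period, and frequency, the sketch below evaluates them for one sample wavelength. The value 5.5 × 10⁻⁷ m is a hypothetical choice in the visible range, not a figure taken from the text, and the function names are illustrative.

```python
SPEED_OF_LIGHT = 2.998e8  # m/s, the four-figure value from Example 27-3

def period(wavelength_m, c=SPEED_OF_LIGHT):
    """Period T = wavelength / c, in seconds."""
    return wavelength_m / c

def frequency(wavelength_m, c=SPEED_OF_LIGHT):
    """Frequency nu = c / wavelength, in hertz."""
    return c / wavelength_m

wavelength = 5.5e-7  # hypothetical visible-light wavelength, m
T = period(wavelength)
nu = frequency(wavelength)
print(f"T = {T:.3e} s, nu = {nu:.3e} Hz")
```

The product of the two results is 1, as it must be since the frequency is the reciprocal of the period.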

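If the electric field described by Eq. (27-20) is traveling through vacuum, then there is also a magnetic field $\mathcal{B}_z(x, t)$ related to it by Eq. (27-11). To evaluate this field, we write the equation as

$$\frac{\partial \mathcal{B}_z(x, t)}{\partial x} = -\mu_0 \epsilon_0 \frac{\partial \mathcal{E}_y(x, t)}{\partial t}$$

and then employ Eq. (27-19) to express it as

$$\frac{\partial \mathcal{B}_z(x, t)}{\partial x} = -\frac{1}{c^2} \frac{\partial \mathcal{E}_y(x, t)}{\partial t}$$

Then we differentiate Eq. (27-20) with respect to t, obtaining

$$\frac{\partial \mathcal{E}_y(x, t)}{\partial t} = A\,\frac{2\pi c}{\lambda} \sin\left[\frac{2\pi}{\lambda}(x - ct)\right]$$

So we have

$$\frac{\partial \mathcal{B}_z(x, t)}{\partial x} = -A\,\frac{1}{c}\,\frac{2\pi}{\lambda} \sin\left[\frac{2\pi}{\lambda}(x - ct)\right]$$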


This is an equation determining $\mathcal{B}_z(x, t)$. It has the solution

$$\mathcal{B}_z(x, t) = \frac{1}{c}\,A \cos\left[\frac{2\pi}{\lambda}(x - ct)\right]$$

as can be verified immediately by differentiating with respect to x. (Another solution is the same function plus any function that does not depend on x. But an x-independent function does not describe any part of a wave of magnetic field, so we reject such a solution.) Comparing the solution for $\mathcal{B}_z(x, t)$ just obtained with the value of $\mathcal{E}_y(x, t)$ given in Eq. (27-20), we see that the value of the magnetic field $\mathcal{B}_z(x, t)$ is just equal to the value of the electric field $\mathcal{E}_y(x, t)$ at the same location and time, divided by the speed of light c. So we can write $\mathcal{B}_z(x, t)$ in the simple form

$$\mathcal{B}_z(x, t) = \frac{\mathcal{E}_y(x, t)}{c} \tag{27-22}$$

The mutually perpendicular electric and magnetic fields in the electromagnetic wave oscillate in unison (that is, in phase) with proportional strengths.
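
The in-phase relation of Eq. (27-22) can be tested against the two coupled equations, Eqs. (27-10) and (27-11), with a small numerical experiment. The following is a sketch under stated assumptions: the amplitude of 100 V/m, the wavelength of 1 m, the sample point, and the step sizes are hypothetical values chosen only for illustration. Derivatives are estimated with central differences.

```python
import math

MU_0 = 4 * math.pi * 1e-7   # permeability constant, T·m/A
EPS_0 = 8.854e-12           # permittivity constant, C²/(N·m²)
C = 1.0 / math.sqrt(MU_0 * EPS_0)   # Eq. (27-19)

AMPLITUDE = 100.0   # hypothetical electric field amplitude, V/m
WAVELENGTH = 1.0    # hypothetical wavelength, m
K = 2 * math.pi / WAVELENGTH

def e_y(x, t):
    # Electric field of Eq. (27-20).
    return AMPLITUDE * math.cos(K * (x - C * t))

def b_z(x, t):
    # Magnetic field of Eq. (27-22).
    return e_y(x, t) / C

def d_dx(f, x, t, h):
    return (f(x + h, t) - f(x - h, t)) / (2 * h)

def d_dt(f, x, t, h):
    return (f(x, t + h) - f(x, t - h)) / (2 * h)

x0, t0 = 0.2, 1.0e-9      # sample point (m) and time (s), chosen arbitrarily
hx = 1e-4 * WAVELENGTH    # step in x, m
ht = hx / C               # matching step in t, s

# Each residual should be close to zero if the equation is satisfied.
residual_27_10 = d_dx(e_y, x0, t0, hx) + d_dt(b_z, x0, t0, ht)
residual_27_11 = d_dx(b_z, x0, t0, hx) + MU_0 * EPS_0 * d_dt(e_y, x0, t0, ht)
scale_27_10 = abs(d_dx(e_y, x0, t0, hx))
scale_27_11 = abs(d_dx(b_z, x0, t0, hx))
print(residual_27_10 / scale_27_10, residual_27_11 / scale_27_11)
print(f"magnetic field amplitude = {AMPLITUDE / C:.3e} T")
```

Both relative residuals come out negligibly small compared with 1. The last line also shows how small the magnetic field amplitude is in SI units for an electric field amplitude of this size.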


Figure 27-10 is a plot of the values of $\mathcal{E}_y$ and $\mathcal{B}_z$ given by Eqs. (27-20) and (27-22), as a function of position x for a fixed time t. It gives, so to speak, a snapshot showing what a polarized electromagnetic plane wave looks like. Such a wave can be polarized with the $\boldsymbol{\mathcal{E}}$ field in any fixed direction in the plane normal to the propagation axis, that is, in the plane of a wave front. Whatever the direction of $\boldsymbol{\mathcal{E}}$, the $\boldsymbol{\mathcal{B}}$ field must be perpendicular to it and must also lie in the plane of a wave front. If the wave is not polarized, then $\boldsymbol{\mathcal{E}}$ has components along any two perpendicular directions in a wave front, and so does $\boldsymbol{\mathcal{B}}$. But at each instant the vector $\boldsymbol{\mathcal{B}}$ is perpendicular to the vector $\boldsymbol{\mathcal{E}}$ and in vacuum has a magnitude given by

$$\mathcal{B} = \frac{\mathcal{E}}{c} \tag{27-23}$$

This equation holds just as well for waves traveling in the negative x direction. And, since a standing wave can be composed of oppositely directed traveling waves, it is also valid for standing waves. For traveling waves it remains valid whether the functions describing them are sinusoidal or of some other form. It even holds for electromagnetic waves which are not plane waves but spherical expanding waves, or waves of any other geometry. The reason is that the relation was obtained by considering the relation between $\boldsymbol{\mathcal{E}}$ and $\boldsymbol{\mathcal{B}}$ in an infinitesimal region of space. Over such a small region, wave fronts of any shape are indistinguishable from plane wave fronts.

Fig. 27-10  The x dependence of the electric field $\boldsymbol{\mathcal{E}}$ and magnetic field $\boldsymbol{\mathcal{B}}$ of a plane electromagnetic wave propagating in the x direction, at a particular instant of time t. The wave is polarized so that the only component of $\boldsymbol{\mathcal{E}}$ is $\mathcal{E}_y$. Thus $\boldsymbol{\mathcal{E}}$ is directed everywhere in the positive or negative y direction. With this polarization of $\boldsymbol{\mathcal{E}}$, there must also be a polarization of $\boldsymbol{\mathcal{B}}$ so that it has only a component $\mathcal{B}_z$. That is, $\boldsymbol{\mathcal{B}}$ is directed everywhere in the positive or negative z direction. The magnitudes of $\boldsymbol{\mathcal{E}}$ and $\boldsymbol{\mathcal{B}}$ are not drawn to the same scale since $\mathcal{B}_z = \mathcal{E}_y/c$. Indeed, the two vectors cannot be drawn to the same scale since they have different dimensions. You should describe to yourself what $\boldsymbol{\mathcal{E}}$ and $\boldsymbol{\mathcal{B}}$ do at a fixed value of x as the value of t increases.


How  does  Maxwell’s  theoretical  work  concerning  the  speed  of  light  in  vac- 
uum relate  to  the  experimental  work  of  Michelson  and  Morley  and  to  Einstein's 
special  theory  of  relativity?  Maxwell  and  his  contemporaries  believed  that  the  par- 
ticular numerical  value  c would  be  obtained  for  a measurement  of  the  speed  of 
light  in  vacuum  only  if  the  measurement  was  carried  out  in  a particular  inertial 
reference  frame.  The  reason  for  this  view  is  that  when  the  Galilean  position-time 
transformation,  that  is,  Eqs.  (14-8),  is  applied  to  Maxwell’s  equations  in  vacuum, 
the  new  equations  produced  have  mathematical  forms  which  are  significantly  dif- 
ferent from  those  displayed  in  Eqs.  (27-7a)  through  (27-7d).  As  a consequence  of 
these  changes  in  the  Galilean-transformed  Maxwell’s  equations,  there  are  changes 
in  the  wave  equations  that  are  obtained  from  them.  These  changes,  in  turn,  lead 
to  a different  predicted  value  for  the  speed  of  light,  as  measured  in  the  new  inertial 
reference  frame.  Thus  the  measured  value  of  the  speed  of  light  would  be  different 
in  different  inertial  frames,  if  both  Maxwell's  equations  and  the  equations  of  the 
Galilean  transformation  were  valid  under  all  circumstances.  In  fact,  Maxwell’s 
equations  do  have  this  universal  validity.  But,  as  you  saw  in  Chap.  14,  the  Galilean 
transformation  equations  are  not  valid  when  the  two  inertial  frames  are  moving 
relative  to  each  other  at  a speed  comparable  to  the  speed  of  light.  After  the  discov- 
ery of  the  Lorentz  position-time  transformation,  Eqs.  (14-16),  Einstein's  applica- 
tion of  this  transformation  to  Maxwell's  equations  made  the  relation  between 
theory  and  experiment  completely  clear.  Maxwell's  equations  in  vacuum  are  un- 
changed in  form  by  the  application  of  a Lorentz  transformation  (providing  the 
electric  and  magnetic  fields  in  the  two  inertial  frames  are  related  as  described  in 
Sec.  24-2).  So  the  new  equations  lead  to  identical  electromagnetic  wave  equations 
and to the prediction of identical values for the measured speed of light in vacuum,
for all inertial reference frames, even though such frames move relative to
one  another.  Actually  carrying  out  a Lorentz  transformation  of  Maxwell's  equa- 
tions in  vacuum  would  involve  too  lengthy  a calculation  to  reproduce  here. 


If an electromagnetic wave has wavelength $\lambda$, it necessarily has
frequency $\nu$ satisfying Eq. (27-21):

$$\nu = \frac{c}{\lambda}$$


This restriction is imposed by the electromagnetic wave equations. But
these equations impose no other restrictions on $\nu$ and $\lambda$. That is, any pair of
$\nu$ and $\lambda$ related by Eq. (27-21) can be possible values for the frequency and
wavelength of an electromagnetic wave. The values of these quantities for a
particular wave are dictated by the process that produced the wave. At the
end of this chapter you will find that electromagnetic waves very commonly
are produced by electric charges undergoing some sort of oscillatory motion,
and that the frequency $\nu$ of the wave produced equals the frequency
of the oscillation producing it.

For the particularly important case of visible light, the frequency $\nu$ and
wavelength $\lambda$ range from about $5 \times 10^{14}$ Hz and $7 \times 10^{-7}$ m to about
$8 \times 10^{14}$ Hz and $4 \times 10^{-7}$ m. The first pair of values corresponds to red light,
while the second pair corresponds to violet light. Frequencies of light waves
are difficult to measure directly because they are so high. But wavelengths
of light are measured with no difficulty by using instruments such as the


27-4  Electromagnetic  Waves  1285 


Fig. 27-11 The electromagnetic spectrum. [Figure: parallel logarithmic scales of
wavelength (in m) and frequency (in Hz), with frequencies running from about
$10^{21}$ Hz down to about $10^{2}$ Hz. Named regions, in order of decreasing
frequency: γ ray, X ray, ultraviolet, visible light, infrared, EHF microwaves,
radar, UHF television, citizen bands, FM radio, VHF television, short-wave radio,
AM radio, radio direction finding, VLF radio, AC power.]

Michelson  interferometer  described  in  Chap.  14.  So  for  light  the  frequency 
usually is deduced from the wavelength by employing the relation $\nu = c/\lambda$.
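
As a small numerical illustration of the relation between frequency and wavelength, the following Python sketch converts a vacuum wavelength into a frequency. The function name `frequency_from_wavelength` and the rounded value of c are choices made here for illustration, not taken from the text.

```python
# Speed of light in vacuum, rounded as in the worked examples (m/s).
C = 3.0e8


def frequency_from_wavelength(wavelength_m):
    """Return the frequency (Hz) of an electromagnetic wave in vacuum."""
    if wavelength_m <= 0:
        raise ValueError("wavelength must be positive")
    return C / wavelength_m


if __name__ == "__main__":
    # Approximate ends of the visible range, in meters.
    for name, wavelength in (("red", 7e-7), ("violet", 4e-7)):
        print(f"{name}: {frequency_from_wavelength(wavelength):.1e} Hz")
```

The same function applies anywhere in the spectrum, since every vacuum wave obeys the same relation.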

Figure 27-11 displays the names given to electromagnetic waves in
various regions of what is called the electromagnetic spectrum, and it indicates
the frequencies and wavelengths characterizing each. Taken
together,  they  constitute  electromagnetic  radiation.  As  we  go  through  the 
spectrum,  we  find  striking  changes  in  those  properties  of  electromagnetic 
radiation  that  are  associated  with  its  emission  or  reception.  But  all  forms  of 
electromagnetic  radiation  travel  from  the  point  of  emission  to  the  point  of 
reception  in  the  same  way,  if  they  are  traveling  through  vacuum.  They 
move  like  waves  at  speed  c because  they  are  all  described  by  solutions  to  the 
same  electromagnetic  wave  equations. 

When Maxwell first obtained the electromagnetic wave equations, the
only  known  example  of  an  electromagnetic  wave  phenomenon  was  light. 
But  in  1887  Heinrich  Hertz  succeeded  in  generating  the  electromagnetic 
waves  that  we  now  call  radio  waves.  Compared  to  light  waves,  radio  waves 
have  much  lower  frequencies  and,  therefore,  much  longer  wavelengths.  As 
a consequence,  the  techniques  required  to  generate  and  detect  such  waves 




are quite different from those used for light. Hertz' rudimentary radio
transmitter consisted of a capacitor discharging across the resistance of a
spark gap. If the inductance of the capacitor lead wires is also taken into account,
it can be seen that he had an LRC circuit in which electrons oscillated
with damped harmonic motion. The oscillation frequency was somewhat
higher than $10^8$ Hz. So electromagnetic waves were produced that had this
frequency and the corresponding wavelength of about 1 m. The capacitor
lead wires acted as the transmitting antenna.

Hertz’  radio  receiver  was  even  simpler.  It  consisted  of  a loop  of  wire 
acting  as  a receiving  antenna,  with  a narrow  gap  at  one  point  in  the  loop. 
The  gap  acted  as  a detector  of  oscillating  current  in  the  loop,  because  a 
weak  spark  appeared  across  the  receiver  gap  when  the  electric  field  of  the 
radio wave produced a sufficiently large emf across the gap. Hertz found
that  the  waves  traveling  between  the  transmitter  and  receiver  could  be  re- 
flected, refracted,  and  made  to  superpose  in  a manner  completely  analogous 
to  the  behavior  of  light.  This  further  confirmed  the  essential  identity  of  elec- 
tromagnetic waves  and  light  waves. 

But  Hertz’  insensitive  receiver  could  detect  signals  from  the  weak 
transmitter  only  if  it  was  quite  close.  The  Italian-British  inventor  Gug- 
lielmo  Marconi  soon  made  a series  of  significant  technical  improvements  in 
both  the  receiver  and  the  transmitter.  After  numerous  shorter-range  suc- 
cesses, he  achieved  transatlantic  radio  communication  in  1901. 

A simple  form  of  receiving  antenna  for  electromagnetic  waves  in  the 
television  range  of  frequencies  and  wavelengths  is  shown  in  Fig.  27-12.  It  is 
called  a dipole  antenna  because  it  consists  of  two  metallic  rods  which,  when 
oppositely  charged,  comprise  an  electric  dipole.  The  rods  are  coupled  to 
the  input  amplifier  of  the  television  receiver.  One  way  to  do  this  is  through 
the  air-core  transformer  shown  in  the  figure.  When  the  antenna  is  aligned 
along the electric field $\boldsymbol{\mathcal{E}}$ of the wave being received, the field exerts forces
on the conduction electrons in the rods and so sets up a current which flows
through the few turns of the primary coil of the transformer. Since the direction
of $\boldsymbol{\mathcal{E}}$ at the antenna changes at the frequency $\nu$ of the electromagnetic
wave, the current it produces in the antenna changes direction at that
frequency.

Inductance,  resistance,  and  capacitance  are  distributed  along  the  an- 
tenna rods,  so  that  these  metallic  conductors  act  like  a system  of  many  in- 
terconnected LRC  circuits.  Analysis  shows  that  if  a pulse  of  current  is  pro- 
duced at  some  location  in  a conductor,  say,  by  applying  an  electric  field  to 
the  conduction  electrons  at  that  location,  the  pulse  will  propagate  at  a finite 
speed  along  the  conductor.  It  obeys  the  so-called  transmission  line  wave 
equation.  What  happens  is  quite  like  what  occurs  when  a pulse  of  water 


Fig.  27-12  A dipole  receiving  antenna.  The  electric  field  of  the  wave 
impinging  on  the  antenna  is  indicated  at  a particular  instant,  as  in 
Fig.  27-10. 




current  is  produced  somewhere  in  an  already  filled  water  pipe  by  pushing 
on  the  water  there— the  pulse  propagates  along  the  pipe  at  a finite  speed. 
For  a conductor  of  negligible  resistance,  the  electric  current  pulse  travels  at 
the  speed  of  light.  When  it  comes  to  the  end  of  the  conductor,  the  pulse  is 
reflected,  with  the  direction  of  the  current  reversing. 

Thus  a wave  of  current  continuously  produced  in  the  antenna  travels 
to  one  end,  is  reflected,  travels  back  to  the  other  end,  is  reflected  again,  and 
so  on.  The  oppositely  directed  traveling  current  waves  form  a longitudinal 
standing  wave  in  which  the  current  pattern  has  nodes  at  both  ends  of  the 
antenna  and  an  antinode  at  its  center.  That  is,  half  a wavelength  of  the 
standing  wave  of  current  just  fits  into  the  total  length  L of  the  antenna  rods 
(if  the  inductance  of  the  small  transformer  at  its  center  can  be  neglected). 
The frequency of the standing wave of current can be calculated from the
relation $\nu_{\text{current}} = c/\lambda_{\text{current}}$, since $c$ is the speed of current waves in the
antenna rods. Setting $\lambda_{\text{current}}/2 = L$, or $\lambda_{\text{current}} = 2L$, we find the frequency of
the standing wave to be $\nu_{\text{current}} = c/2L$. For the antenna to have maximum
efficiency, this frequency should equal the frequency of the oscillating $\boldsymbol{\mathcal{E}}$
field driving the wave of current, that is, the frequency of the electromagnetic
wave being received by the antenna. This frequency is $\nu = c/\lambda$,
where $\lambda$ is the wavelength of the electromagnetic wave. Equating the two
frequencies gives $c/2L = c/\lambda$, or $L = \lambda/2$. So the total length $L$ of the dipole
antenna should be $L = \lambda/2$, half the wavelength of the wave. Such antennas
are called half-wave dipole antennas.

Example  27-4  evaluates  the  frequency  of  the  electromagnetic  wave 
being  received  by  a half-wave  dipole  antenna  of  a certain  length. 


EXAMPLE  27-4 

Estimate  the  frequency  of  the  electromagnetic  waves  used  in  television  transmis- 
sion. 

■ If  you  have  been  observant,  you  have  noticed  that  antennas  used  to  receive 
television  signals  usually  consist  of  sets  of  dipoles.  In  many  antennas,  the  dipoles  are 
graded  in  length.  Why?  A typical  length  (for  a VHF  antenna)  is  about  L = 2.0  m. 
Using  the  fact  that  the  dipoles  are  half-wave  dipoles,  you  can  obtain  the  wavelength 
of  the  electromagnetic  wave  with  which  the  typical  one  resonates  by  evaluating 

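$$\lambda = 2L = 2 \times 2.0\ \text{m} = 4.0\ \text{m}$$

The corresponding frequency is

$$\nu = \frac{c}{\lambda} = \frac{3.0 \times 10^{8}\ \text{m/s}}{4.0\ \text{m}} = 7.5 \times 10^{7}\ \text{Hz}$$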


Hence  the  frequency  best  received  by  the  2.0-m-long  half-wave  dipole  antenna  is 
75  MHz. 
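
The half-wave condition used in this example can be sketched in a few lines of Python. The function names below are illustrative choices, and the sketch assumes, as the text does, that current waves travel along the rods at speed c and that the small transformer at the center can be neglected.

```python
# Speed of light in vacuum, rounded as in the example (m/s).
C = 3.0e8


def half_wave_dipole_frequency(length_m):
    """Frequency (Hz) best received by a half-wave dipole of total length L."""
    wavelength = 2.0 * length_m  # half-wave condition: L = wavelength / 2
    return C / wavelength


def half_wave_dipole_length(frequency_hz):
    """Total length (m) of a half-wave dipole matched to a given frequency."""
    return C / (2.0 * frequency_hz)


if __name__ == "__main__":
    print(half_wave_dipole_frequency(2.0) / 1e6, "MHz")
    # Hypothetical channel frequencies, chosen only to show how the
    # matched dipole length shrinks as the frequency rises.
    for f_mhz in (60.0, 75.0, 100.0):
        print(f_mhz, "MHz ->", half_wave_dipole_length(f_mhz * 1e6), "m")
```

Because each length is matched to one frequency, an antenna meant to cover several channels needs dipoles of several lengths.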

If  you  have  inspected  antennas  in  a locale  far  from  television  transmitters,  you 
have  noted  that  they  are  oriented  so  that  the  dipoles  lie  perpendicular  to  the  direc- 
tion to  the  transmitters.  Why? 


27-5 ENERGY AND MOMENTUM IN ELECTROMAGNETIC RADIATION


It  is  common  knowledge  that  electromagnetic  radiation  carries  energy. 
This fact motivates the very serious efforts being made to find ways of
using  solar  power  as  at  least  a partial  solution  to  the  world’s  energy  crisis.  In 
this  section  we  determine  just  how  much  energy  radiation  does  carry  by  in- 
vestigating quantitatively  the  relations  among  the  energy  content,  energy 




transport,  and  field  strengths.  We  also  study  a much  less  well-known  prop- 
erty of  radiation  — its  momentum  content. 

If  an  electromagnetic  wave  is  present  in  an  evacuated  region  of  space, 
then  there  is  an  electric  field  and  also  a magnetic  field  in  that  region.  Let  us 
again  suppose  that  the  x direction  is  the  direction  of  propagation  and  that 
the  electric  field  in  the  transverse  plane  is  confined  to  the  y direction.  Then 
the magnetic field will be in the z direction, as indicated in Fig. 27-13. According
to Eq. (21-56), the energy $\rho_e$ per unit volume contained in the electric
field is

$$\rho_e = \frac{\epsilon_0 \mathcal{E}_y^2}{2} \tag{27-24}$$


The energy $\rho_m$ per unit volume in the magnetic field is given by Eq.
(25-43)  to  be 


Fig. 27-13 The electric field $\boldsymbol{\mathcal{E}}$ and
magnetic field $\boldsymbol{\mathcal{B}}$ at some position and
time, in a plane wave propagating along
the x direction with $\boldsymbol{\mathcal{E}}$ polarized in the y
direction. The energy flux vector $\mathbf{S}$ will
be introduced soon to describe the energy
flow associated with the electromagnetic
radiation.


$$\rho_m = \frac{\mathcal{B}_z^2}{2\mu_0} \tag{27-25}$$


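To facilitate comparison of the two energy densities, we use Eq. (27-22),

$$\mathcal{B}_z = \frac{\mathcal{E}_y}{c}$$

to write $\rho_m$ as

$$\rho_m = \frac{\mathcal{E}_y^2}{c^2\, 2\mu_0}$$

Then we employ Eq. (27-19), in the form

$$\frac{1}{c^2} = \mu_0 \epsilon_0$$

to obtain

$$\rho_m = \frac{\mu_0 \epsilon_0 \mathcal{E}_y^2}{2\mu_0} = \frac{\epsilon_0 \mathcal{E}_y^2}{2}$$

Comparing this with Eq. (27-24), we see that

$$\rho_e = \rho_m \tag{27-26}$$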

For  electromagnetic  radiation  in  vacuum,  the  electric  and  magnetic  fields  have 
the  same  energy  densities. 

The total energy per unit volume in the radiation is the sum of the
electric and magnetic energy densities. This total energy density $\rho$ has a
value that can be expressed as

$$\rho = \rho_e + \rho_m = 2\rho_e = \epsilon_0 \mathcal{E}_y^2 \tag{27-27a}$$

But it is better to express $\rho$ in a form indicating that it involves both an electric
and a magnetic field. To do this, we use $\mathcal{B}_z = \mathcal{E}_y/c$ to write

$$\rho = c\,\epsilon_0 \mathcal{E}_y \mathcal{B}_z \tag{27-27b}$$

For  electromagnetic  radiation  in  a gas  such  as  air,  Eqs.  (27-24)  through 
(27-27)  are  not  exact,  as  they  are  for  vacuum.  But  they  are  very  good 
approximations. 
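
The equality of the electric and magnetic energy densities can be checked numerically. The sketch below is only an illustration: the function name, the field value used in the demonstration, and the rounded constants are assumptions made here. The speed c is computed from the two constants so that the relation between c, the permeability, and the permittivity holds within the code.

```python
import math

EPSILON_0 = 8.854e-12     # permittivity of free space, F/m (rounded)
MU_0 = 4.0e-7 * math.pi   # permeability of free space, T*m/A
C = 1.0 / math.sqrt(MU_0 * EPSILON_0)  # speed of light from the two constants


def energy_densities(e_y):
    """Return (rho_e, rho_m) in J/m^3 for a plane wave with electric field e_y (V/m)."""
    b_z = e_y / C                          # field relation for a plane wave
    rho_e = 0.5 * EPSILON_0 * e_y ** 2     # electric energy density
    rho_m = b_z ** 2 / (2.0 * MU_0)        # magnetic energy density
    return rho_e, rho_m


if __name__ == "__main__":
    rho_e, rho_m = energy_densities(100.0)  # 100 V/m is an arbitrary test value
    print(rho_e, rho_m, rho_e + rho_m)
```

The two returned values agree to within floating-point rounding, and their sum is the total energy density.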


27-5  Energy  and  Momentum  in  Electromagnetic  Radiation  1289 


Fig.  27-14  Schematic  representation 
of  the  energy,  flowing  at  speed  c,  which 
in  the  infinitesimal  time  interval  dt 
passes  across  an  infinitesimal  area  da  of 
a fixed  marker  surface  that  is  normal  to 
the  direction  of  flow.  In  that  time  all  the 
energy  in  the  shaded  region  flows  past 
the  indicated  area  of  the  marker  sur- 
face. The  shaded  region  extends  a dis- 
tance along  the  direction  of  flow  equal 
to c dt. Its volume is c dt da. The total energy
content in the region is the product
of its energy content per unit volume $\rho$
and its volume c dt da, or $\rho c\,dt\,da$. Since
all this energy flows in time dt past the
area da of the marker surface, the energy
flow per unit time per unit area of
the marker surface is $\rho c\,dt\,da/dt\,da$, or
$\rho c$. This quantity is the energy flux $S$;
therefore $S = \rho c$.


Just  as  in  mechanical  traveling  waves,  the  energy  in  an  electromagnetic 
traveling  wave  is  carried  along  with  the  wave.  The  rate  at  which  energy  is 
transported  by  the  wave  past  a fixed  location  is  given  the  same  name  as  that 
used  for  mechanical  waves.  This  rate  of  flow  of  energy  is  called  the  energy 
flux $S$. It is the energy carried per unit time by the wave per unit area
normal  to  the  propagation  direction.  The  relation  between  energy  flux  and 
total  energy  density  for  electromagnetic  waves  has  precisely  the  same  form 
as  the  relation  between  these  two  quantities  for  mechanical  waves.  Figure 
27-14  and  the  argument  given  in  its  caption  show  that 

$$S = \rho c \tag{27-28}$$

The energy flux equals the energy density times the speed c at which energy is being
transported.  Compare  this  statement  with  the  one  used  to  give  verbal  expres- 
sion to  Eq.  (12-56). 

If we use Eq. (27-27b) to evaluate the energy density, we have

$$S = c^2 \epsilon_0 \mathcal{E}_y \mathcal{B}_z$$

This can be simplified by employing the relation $c^2 = 1/\mu_0\epsilon_0$. We obtain

$$S = \frac{\mathcal{E}_y \mathcal{B}_z}{\mu_0} \tag{27-29a}$$


This result tells us the instantaneous magnitude of the energy flux carried by
the  polarized  plane  wave  past  any  area  normal  to  the  propagation  direc- 
tion, in  terms  of  the  instantaneous  magnitudes  of  its  electric  and  magnetic 
fields  at  that  area.  But  the  energy  flux  also  has  a direction,  namely,  the 
direction  of  propagation  of  the  wave.  Thus  S is  really  the  magnitude  of  an 
electromagnetic  energy  flux  vector  S.  The  vector  S is  also  called  the 
Poynting  vector  after  J.  H.  Poynting,  who  first  investigated  its  properties. 
A vector form of Eq. (27-29a), valid for any combination of electric and
magnetic fields, is

$$\mathbf{S} = \frac{\boldsymbol{\mathcal{E}} \times \boldsymbol{\mathcal{B}}}{\mu_0} \tag{27-29b}$$


In the case at hand, $\boldsymbol{\mathcal{E}}$ has only a y component and $\boldsymbol{\mathcal{B}}$ has only a z component,
so that $|\boldsymbol{\mathcal{E}} \times \boldsymbol{\mathcal{B}}| = \mathcal{E}_y \mathcal{B}_z$. Thus Eq. (27-29b) specifies a magnitude for
$\mathbf{S}$ in agreement with Eq. (27-29a). Also, $\boldsymbol{\mathcal{E}} \times \boldsymbol{\mathcal{B}}$ is in the x direction in this
case, so the vector equation gives the correct direction for $\mathbf{S}$. (See Fig. 27-13
again.) But no matter what the geometry of the electromagnetic wave, Eq.
(27-29b) always gives the correct energy flux vector $\mathbf{S}$.
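
The direction and magnitude claims for the energy flux vector can be illustrated with a short Python sketch that forms the cross product of the two fields explicitly. The helper names and the sample field values are assumptions made here for illustration; vectors are plain (x, y, z) tuples.

```python
import math

MU_0 = 4.0e-7 * math.pi  # permeability of free space, T*m/A


def cross(a, b):
    """Cross product of two 3-component vectors given as (x, y, z) tuples."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)


def poynting_vector(e_field, b_field):
    """Energy flux vector S = (E x B) / mu_0, in W/m^2."""
    return tuple(component / MU_0 for component in cross(e_field, b_field))


if __name__ == "__main__":
    # Sample fields: E along +y (V/m) and B along +z (T), as in Fig. 27-13.
    e_field = (0.0, 3.0, 0.0)
    b_field = (0.0, 0.0, 1.0e-8)
    print(poynting_vector(e_field, b_field))  # only the x component is nonzero
```

For fields along y and z the only surviving component lies along x, and reversing either field reverses the direction of the flux.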


In  many  cases  the  electric  and  magnetic  fields  of  an  electromagnetic 
wave  are  polarized  and  have  sinusoidal  time  dependences,  like  those  given 
by  Eqs.  (27-20)  and  (27-22).  Let  us  write  the  fields  as 


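$$\mathcal{E}_y = \mathcal{E}_{y0} \cos\left[\frac{2\pi}{\lambda}(x - ct)\right]$$

and

$$\mathcal{B}_z = \mathcal{B}_{z0} \cos\left[\frac{2\pi}{\lambda}(x - ct)\right]$$

where $\mathcal{E}_{y0}$ is the amplitude of the $\mathcal{E}_y$ wave and $\mathcal{B}_{z0}$ is the amplitude of the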




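$\mathcal{B}_z$ wave. The energy flux carried by such an electromagnetic wave has the
magnitude

$$S = \frac{\mathcal{E}_y \mathcal{B}_z}{\mu_0} = \frac{\mathcal{E}_{y0} \mathcal{B}_{z0}}{\mu_0} \cos^2\left[\frac{2\pi}{\lambda}(x - ct)\right]$$

Hence, at any position x the value of $S$ varies through each cycle of oscillation
of $\mathcal{E}_y$ and $\mathcal{B}_z$ from 0 (when the square of the cosine equals 0) to
$\mathcal{E}_{y0}\mathcal{B}_{z0}/\mu_0$ (when the square of the cosine equals 1). The average value of
the energy flux over one cycle is

$$\langle S \rangle = \left\langle \frac{\mathcal{E}_{y0} \mathcal{B}_{z0}}{\mu_0} \cos^2\left[\frac{2\pi}{\lambda}(x - ct)\right] \right\rangle = \frac{\mathcal{E}_{y0} \mathcal{B}_{z0}}{\mu_0} \left\langle \cos^2\left[\frac{2\pi}{\lambda}(x - ct)\right] \right\rangle$$

Over one cycle the average of the square of any sinusoid is $\tfrac{1}{2}$ (see Example
12-6). We therefore have

$$\langle S \rangle = \frac{\mathcal{E}_{y0} \mathcal{B}_{z0}}{2\mu_0} \tag{27-30a}$$

An alternative expression of Eq. (27-30a) is obtained by using the relation
$\mathcal{B}_{z0} = \mathcal{E}_{y0}/c$, to write $\langle S \rangle$ as

$$\langle S \rangle = \frac{\mathcal{E}_{y0}^2}{2\mu_0 c} \tag{27-30b}$$

The values of $\mathcal{E}_{y0}$ and $\mathcal{B}_{z0}$ are estimated from Eq. (27-30b) in Example
27-5 for the most familiar form of electromagnetic radiation: sunlight.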
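
The factor of one-half in the cycle average can be confirmed numerically. The following Python sketch samples the squared cosine at equally spaced phases over one full cycle; the function names and the sample count are arbitrary choices made here, not part of the text.

```python
import math


def cycle_average_cos_squared(samples=1000):
    """Average of cos^2 over one full cycle, using equally spaced phase samples."""
    total = 0.0
    for k in range(samples):
        phase = 2.0 * math.pi * k / samples
        total += math.cos(phase) ** 2
    return total / samples


def average_flux(e_y0, b_z0, mu_0=4.0e-7 * math.pi):
    """Cycle-averaged energy flux (W/m^2) from the two field amplitudes."""
    peak_flux = e_y0 * b_z0 / mu_0
    return peak_flux * cycle_average_cos_squared()


if __name__ == "__main__":
    print(cycle_average_cos_squared())
```

The numerical average agrees with one-half to within rounding, so the averaged flux is half the peak flux.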


EXAMPLE  27-5 

Measurements  of  the  temperature  rise  of  an  absorbing  plate  oriented  normal  to  the 
sun’s  rays  show  that  the  energy  flux  delivered  by  sunlight  to  the  surface  of  the  earth 
on a clear day is about $1.0 \times 10^{3}$ W/m$^2$. Determine the amplitude of the electric and
magnetic  fields  in  the  sunlight. 

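■ Solving Eq. (27-30b) for $\mathcal{E}_{y0}^2$, taking the square root, and then inserting the
numerical values, you find

$$\mathcal{E}_{y0} = \sqrt{2\mu_0 c \langle S \rangle} = \sqrt{2 \times 4\pi \times 10^{-7}\ \text{T·m/A} \times 3.0 \times 10^{8}\ \text{m/s} \times 1.0 \times 10^{3}\ \text{W/m}^2} = 8.7 \times 10^{2}\ \text{V/m}$$

Since $\mathcal{B}_{z0} = \mathcal{E}_{y0}/c$, you also find

$$\mathcal{B}_{z0} = \frac{8.7 \times 10^{2}\ \text{V/m}}{3.0 \times 10^{8}\ \text{m/s}} = 2.9 \times 10^{-6}\ \text{T}$$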
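
The arithmetic of this example can be reproduced with a short Python sketch. The function name is an illustrative choice, and the constants are rounded as in the example.

```python
import math

MU_0 = 4.0e-7 * math.pi  # permeability of free space, T*m/A
C = 3.0e8                # speed of light in vacuum, m/s (rounded)


def field_amplitudes(average_flux):
    """Return (E_y0 in V/m, B_z0 in T) for a plane wave of given average flux (W/m^2)."""
    e_y0 = math.sqrt(2.0 * MU_0 * C * average_flux)  # from the averaged-flux relation
    b_z0 = e_y0 / C                                  # amplitudes obey B = E / c
    return e_y0, b_z0


if __name__ == "__main__":
    e_y0, b_z0 = field_amplitudes(1.0e3)  # sunlight at the earth's surface
    print(f"E_y0 = {e_y0:.2e} V/m, B_z0 = {b_z0:.2e} T")
```

Because the amplitudes scale as the square root of the flux, a fourfold increase in flux only doubles the fields.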


The magnitude of $\mathcal{B}_{z0}$ obtained in Example 27-5 is quite small. Nevertheless,
by the standards of the everyday world, sunlight carries considerable
energy to the surface of the earth. Indeed, the sun supplies nearly all
the  world’s  energy  (including  the  energy  stored  in  petroleum  and  coal,  ac- 
cumulated over  the  ages  from  sunlight  by  plants  through  the  process  of 
photosynthesis,  and  the  transport  of  water  in  the  form  of  rain  and  snow 
which  provide  hydroelectric  power). 




Sunlight and other forms of electromagnetic radiation also carry momentum.
You can see qualitatively that this is the case by considering a polarized
electromagnetic wave with electric and magnetic fields $\mathcal{E}_y$ and $\mathcal{B}_z$,
which travels in the x direction to a material that absorbs the wave completely.
The absorption occurs through the interaction of the electric and
magnetic  fields  in  the  wave  with  electrons  bound  to  the  atoms  or  molecules 
of  the  absorbing  material.  (A  material  which  absorbs  an  electromagnetic 
wave,  in  contrast  to  reflecting  it,  is  not  a metal.  Essentially  all  the  electrons 
in  such  a material  are  bound  to  its  atoms  or  molecules.)  It  is  sufficient  for 
our  purposes  to  consider  the  absorbing  material  to  be  a collection  of  elec- 
trons bound  to  certain  locations  in  the  material  by  springs.  The  force  ex- 
erted on  each  electron  by  its  spring  represents  the  force  binding  the  elec- 
tron to  an  atom  or  molecule. 

The  Lorentz  force  equation,  Eq.  (23-19), 


Fig.  27-15  A polarized  electromag- 
netic plane  wave  impinging  on  an  elec- 
tron in  an  absorbing  material.  The  elec- 
tron is  represented  schematically  as  a 
small  negatively  charged  sphere. 


$$\mathbf{F} = q\boldsymbol{\mathcal{E}} + q\mathbf{v} \times \boldsymbol{\mathcal{B}}$$

allows  us  to  follow  the  processes  involved  in  the  absorption  of  the  wave.  As 
the  wave  impinges  on  an  electron  in  the  material,  the  electron  experiences 
an electric force $\mathbf{F}_e = q\boldsymbol{\mathcal{E}}$ exerted by the electric field of the wave. Since the
electron charge $q$ has a negative value $-e$, the direction of $\mathbf{F}_e$ is opposite to
the direction of $\boldsymbol{\mathcal{E}}$. At the instant illustrated in Fig. 27-15, $\boldsymbol{\mathcal{E}}$ is in the positive
y direction, and so $\mathbf{F}_e$ is in the negative y direction. This electric force is
incrementing the momentum of the electron in the negative y direction and
so  is  incrementing  its  velocity  in  that  direction.  Half  an  oscillation  cycle  of 
the  electromagnetic  wave  later,  the  direction  of  Fe  reverses,  and  the  elec- 
tron begins  to  receive  velocity  increments  in  the  positive  y direction.  And 
half  a cycle  later  another  direction  reversal  occurs.  The  net  result  is  that  the 
electron  will  act  like  a driven  harmonic  oscillator,  performing  oscillations 
along  the  y axis  at  the  frequency  of  the  electromagnetic  wave.  Thus  the 
instantaneous  electron  velocity  v always  lies  along  the  y axis  and  has  a direc- 
tion which  changes  in  an  oscillatory  manner  at  the  frequency  of  the  oscilla- 
tory driving  force  Fe. 

The  analysis  of  a driven  harmonic  oscillator  is  very  similar  to  the  analy- 
sis in  Secs.  26-6  through  26-8  of  a driven  LRC  circuit.  The  results  of  the 
analysis  show  that  the  instantaneous  direction  of  v is  related  to  the  instanta- 
neous direction  of  Fe  in  a way  that  depends  on  the  relation  between  the  fre- 
quency of  the  oscillatory  driving  force  and  the  resonant  frequency  of  the 
electron  oscillator.  The  simplest  situation  occurs  when  the  oscillator  is 
driven  at  its  resonant  frequency.  In  these  circumstances  the  velocity  is  ex- 
actly in  phase  with  the  driving  force.  In  other  words,  the  direction  of  v is 
always  the  same  as  the  direction  of  Fe  if  the  oscillator  is  driven  at  its  reso- 
nant frequency.  We  assume  this  to  be  the  case,  as  implied  in  the  figure. 
(This  assumption  does  not  affect  the  results  that  will  be  obtained  from  the 
argument.  The  reason  is  that  the  results  concern  properties  of  the  electro- 
magnetic wave  which  are  independent  of  the  properties  of  the  material 
absorbing it. The same results can be obtained without assuming resonance.
In that case, complicated phase relationships must be taken into account
until the last step of the argument. Then averages are taken over one
oscillation cycle, and the phase relationships are found not to affect the
results of the argument.)


Now  that  we  have  considered  the  effect  on  the  electron  of  the  electric 
force $\mathbf{F}_e$, we turn to the interaction between the magnetic field $\boldsymbol{\mathcal{B}}$ of the




electromagnetic wave and the oscillating electron. This field exerts a magnetic
force $\mathbf{F}_m$ on the electron, given by $\mathbf{F}_m = q\mathbf{v} \times \boldsymbol{\mathcal{B}}$. At the instant depicted
in the figure, $\boldsymbol{\mathcal{B}}$ is in the positive z direction. As just explained, we
take $\mathbf{v}$ to have the same direction as the electric force. Therefore the direction
of $\mathbf{v}$ at this instant is the negative y direction, perpendicular to $\boldsymbol{\mathcal{B}}$.
Applying the right-hand rule for cross products and remembering that the
electron charge $q$ is negative, you can see that $\mathbf{F}_m$ is in the positive x direction,
as shown in the figure.

In  the  course  of  each  oscillation  cycle  of  the  electromagnetic  wave,  the 
direction  of  the  electric  force  Fe  always  lies  along  the  y axis  but  goes 
through  a cycle  of  reversals.  Thus  there  is  no  net  force  acting  on  the  elec- 
tron in  the  y direction  when  the  force  is  averaged  over  a complete  cycle. 
This  means  that  no  net  momentum  is  given  to  the  electron  in  the  y direc- 
tion. But  the  magnetic  force  Fm  remains  acting  in  the  same  positive  x direc- 
tion throughout  any  cycle.  You  should  prove  this  to  yourself,  using  the  fact 
that $\mathcal{E}_y$ and $\mathcal{B}_z$ oscillate in phase [see Eq. (27-22)]. Hence there will be a net
force  acting  on  the  electron  in  the  positive  x direction,  which  is  the  direc- 
tion of  the  incident  traveling  electromagnetic  wave.  This  force  imparts  mo- 
mentum in  that  direction  to  the  electron  in  the  absorbing  material.  Conse- 
quently, it  gives  momentum  in  the  positive  x direction  to  the  material.  Since 
momentum  must  be  conserved,  we  can  conclude  that  the  momentum  ac- 
quired by  the  absorbing  material  must  come  from  the  wave  that  it  absorbs. 
Therefore  the  electromagnetic  wave  must  carry  momentum  in  its  direction  of  propa- 
gation. 

To  find  out  how  much  momentum  is  absorbed  from  the  wave,  we  eval- 
uate the  magnitude  Fm  of  the  magnetic  force.  It  is 

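$$F_m = |q\mathbf{v} \times \boldsymbol{\mathcal{B}}| = |q| v \mathcal{B}_z$$

In the second equality, account has been taken of the perpendicularity
between $\mathbf{v}$ and $\boldsymbol{\mathcal{B}}$, and the magnitude of the latter has been expressed as
$\mathcal{B}_z$. Since $\mathcal{B}_z = \mathcal{E}_y/c$, this force on the electron moving at speed $v$ can be
written as

$$F_m = \frac{|q| v \mathcal{E}_y}{c}$$

But $|q|\mathcal{E}_y = F_e$, where $F_e$ is the magnitude of the electric force. So $F_m$ also is
given by

$$F_m = \frac{F_e v}{c} \tag{27-31}$$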

Now  Fm  is  the  magnitude  of  the  force  continually  acting  on  the  electron  in 
the  positive  x direction.  Hence  Newton’s  law  of  motion,  F = dp/dt,  states 
that  the  electron  will  absorb  momentum  in  the  positive  x direction  from  the  wave  at 
the  rate  Fm,  the  value  of  which  is  given  by  Eq.  (27-31). 

Although  the  magnetic  force  leads  to  a transfer  of  momentum  from  the 
wave  to  the  electron  in  the  absorbing  material,  the  magnetic  force  never 
does  work  on  the  electron.  The  reason  is  that  the  magnetic  force  is  always 
directed  perpendicular  to  the  electron’s  motion  along  the  y axis.  But  work 
is  done  on  the  electron  by  the  electric  force.  In  fact,  Fev  is  the  rate  at  which 
this  work  is  done.  This  is  so  because  the  work  done  by  Fe  when  it  displaces 
the  electron  an  amount  dy  is,  by  definition,  Fe  dy.  With  the  displacement  oc- 
curring in  time  dt,  the  rate  at  which  work  is  done  is  Fe  dy/dt  = Fev.  Since 




only  the  electric  force  does  work,  Fev  represents  the  total  rate  at  which 
work  is  being  done  on  the  surface  electron  absorbing  the  incident  wave. 

The necessary energy comes from the wave. Hence $F_e v$ is just equal to the rate
at  which  energy  is  absorbed  from  the  wave  by  the  electron. 
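
The relation between the two absorption rates can be illustrated numerically. In the Python sketch below, the function names, the sample field strength, and the sample electron speed are all hypothetical values chosen for illustration; the only physics used is the pair of force magnitudes and the plane-wave field relation from the text.

```python
C = 3.0e8             # speed of light in vacuum, m/s (rounded)
E_CHARGE = 1.602e-19  # magnitude of the electron charge, C (rounded)


def forces_on_electron(e_y, speed):
    """Return (F_e, F_m): electric and magnetic force magnitudes in newtons."""
    f_e = E_CHARGE * e_y          # |q| E_y
    b_z = e_y / C                 # plane-wave relation between the fields
    f_m = E_CHARGE * speed * b_z  # |q| v B_z, with v perpendicular to B
    return f_e, f_m


def absorption_rates(e_y, speed):
    """Return (energy rate in W, momentum rate in N) absorbed by the electron."""
    f_e, f_m = forces_on_electron(e_y, speed)
    energy_rate = f_e * speed  # rate at which the electric force does work
    momentum_rate = f_m        # rate of momentum transfer along the propagation direction
    return energy_rate, momentum_rate


if __name__ == "__main__":
    # Hypothetical instantaneous values: 900 V/m field, 1.0e3 m/s electron speed.
    energy_rate, momentum_rate = absorption_rates(900.0, 1.0e3)
    print(energy_rate / momentum_rate)  # ratio of the two rates
```

Whatever field and speed are chosen, the ratio of the energy rate to the momentum rate comes out equal to c.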

With  this  conclusion  and  the  conclusion  drawn  just  below  Eq.  (27-31), 
we  can  interpret  Eq.  (27-31)  as  saying  that  the  rate  Fm  at  which  momentum  is 
absorbed  from  the  wave  equals  the  rate  Fev  at  which  energy  is  absorbed,  divided  by  c. 
If  in  a certain  time  the  electromagnetic  wave  contained  in  a unit  volume  is 
incident  on  the  surface  where  it  is  absorbed  by  electrons,  then  the  energy 
absorbed  equals  the  energy  per  unit  volume  in  the  wave.  This  is  the  energy 
density. We have previously designated it by the symbol $\rho$, but here it is
given the more explicit symbol $\rho_{\text{energy}}$. The momentum absorbed in the
same time is equal to the momentum per unit volume in the wave. This is
the momentum density, which we symbolize by $\rho_{\text{momentum}}$. So, finally, we
can conclude from Eq. (27-31) that

$$\rho_{\text{momentum}} = \frac{\rho_{\text{energy}}}{c} \tag{27-32}$$

In  Chap.  30  an  equation  relating  the  energy  content  in  electromagnetic 
radiation  to  its  momentum  content  is  obtained  from  a relativistic  argument. 
It  is  quite  different  from  the  one  given  here,  but  leads  to  a conclusion  in 
perfect  agreement  with  Eq.  (27-32). 


The momentum and energy densities at any given point vary through
each cycle of oscillation of the radiation, just as the energy flux $S$ does. To
establish the relations among the averages of these quantities over one
cycle, first take such an average of the terms in Eq. (27-28). If we use the
present symbolism, the result is

$$\langle S \rangle = \langle \rho_{\text{energy}} \rangle c$$

or

$$\langle \rho_{\text{energy}} \rangle = \frac{\langle S \rangle}{c} \tag{27-33}$$

Then average the terms of Eq. (27-32) over one cycle, to obtain

$$\langle \rho_{\text{momentum}} \rangle = \frac{\langle \rho_{\text{energy}} \rangle}{c}$$

With Eq. (27-33), this gives

$$\langle \rho_{\text{momentum}} \rangle = \frac{\langle S \rangle}{c^2} \tag{27-34}$$


It is easy to find the average force exerted on a unit area of the absorbing
surface. This quantity is a pressure, called the radiation pressure, and
is symbolized by $\langle N \rangle$. Newton's second law shows that its value equals the
average momentum absorbed by the unit surface in unit time. And an
argument completely similar to the one illustrated in Fig. 27-14 shows that
this radiation pressure is the average momentum density $\langle \rho_{\text{momentum}} \rangle$ multiplied
by c. Thus

$$\langle N \rangle = \langle \rho_{\text{momentum}} \rangle c$$




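or, using Eq. (27-34), we have

$$\langle N \rangle = \frac{\langle S \rangle}{c} \tag{27-35}$$

[Note that comparison with Eq. (27-33) shows that $\langle N \rangle = \langle \rho_{\text{energy}} \rangle$.]

In Example 27-6, $\langle \rho_{\text{energy}} \rangle$, $\langle \rho_{\text{momentum}} \rangle$, and $\langle N \rangle$ are evaluated for sunlight.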


EXAMPLE  27-6 


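Use the value $\langle S \rangle = 1.0 \times 10^{3}$ W/m$^2$ for the average energy flux in sunlight at the
earth's surface, quoted in Example 27-5, to evaluate $\langle \rho_{\text{energy}} \rangle$, $\langle \rho_{\text{momentum}} \rangle$, and $\langle N \rangle$
for sunlight.

■ Employing Eqs. (27-33), (27-34), and (27-35), you have

$$\langle \rho_{\text{energy}} \rangle = \frac{\langle S \rangle}{c} = \frac{1.0 \times 10^{3}\ \text{W/m}^2}{3.0 \times 10^{8}\ \text{m/s}} = 3.3 \times 10^{-6}\ \text{J/m}^3$$

$$\langle \rho_{\text{momentum}} \rangle = \frac{\langle S \rangle}{c^2} = \frac{1.0 \times 10^{3}\ \text{W/m}^2}{(3.0 \times 10^{8}\ \text{m/s})^2} = 1.1 \times 10^{-14}\ \text{kg·m/(s·m}^3)$$

and

$$\langle N \rangle = \frac{\langle S \rangle}{c} = \frac{1.0 \times 10^{3}\ \text{W/m}^2}{3.0 \times 10^{8}\ \text{m/s}} = 3.3 \times 10^{-6}\ \text{N/m}^2$$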


The radiation pressure exerted by sunlight at, or near, the earth’s surface is so
small that it is very difficult to detect. Nevertheless, the experiment was
successfully performed by P. N. Lebedev in 1900. The pressure exerted on a surface
can be doubled by making the surface a perfect reflector, instead of a perfect
absorber. Can you explain why? There has been discussion about the use of radiation
pressure to propel space vehicles, by using reflecting “sails” that would be opened
when going “down wind.” Vehicles have been seriously proposed that use very thin
sails of aluminized plastic film with area of about 10 km². Even with such a large
area, the force is quite small. But the final speed can be considerable if the voyage
is long enough. Radiation pressure is partly responsible for the way comet tails
stream away from the sun. Much stronger radiation pressure is believed to have
played an important role at very early stages in the development of the
universe, shortly after the “big bang,” when the energy density in electromagnetic
radiation is thought to have been extremely high. Radiation pressure is also
of consequence in procedures, currently being developed, to produce energy by
inducing nuclear fusion with a set of converging laser beams of very high
intensity.
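The arithmetic of Example 27-6 can be checked with a few lines of Python. The
sketch below is an illustration only. The function name `radiation_quantities`
is not from the text, and the sail estimate rests on assumptions added here: a
perfectly reflecting sail of area 10 km², facing the sun squarely, in sunlight
of the same average flux as that quoted for the earth's surface.

```python
# Rounded values used in Example 27-6.
C = 3.0e8        # speed of light, m/s
S_AVG = 1.0e3    # average energy flux of sunlight, W/m^2


def radiation_quantities(s_avg, c=C):
    """Return (energy density, momentum density, pressure on an absorber)."""
    rho_energy = s_avg / c         # Eq. (27-33), J/m^3
    rho_momentum = s_avg / c**2    # Eq. (27-34), kg*m/(s*m^3)
    pressure = s_avg / c           # Eq. (27-35), N/m^2
    return rho_energy, rho_momentum, pressure


rho_e, rho_p, n_abs = radiation_quantities(S_AVG)
print(f"energy density   = {rho_e:.1e} J/m^3")
print(f"momentum density = {rho_p:.1e} kg*m/(s*m^3)")
print(f"pressure         = {n_abs:.1e} N/m^2")

# Hypothetical sail (assumption): a perfect reflector doubles the pressure.
SAIL_AREA = 10 * 1.0e3**2            # 10 km^2 expressed in m^2
sail_force = 2 * n_abs * SAIL_AREA   # N
print(f"force on sail    = {sail_force:.0f} N")
```

Under these assumptions the force on the sail comes out at a few tens of
newtons, which is the sense in which the force is "quite small" for so large
an area.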


27-6 EMISSION OF RADIATION BY ACCELERATED CHARGES


The absorption of a continuous train of electromagnetic waves is explained
in the theory of electromagnetism by arguing that electrons in the absorbing
material are set into oscillation by the electric field. We saw in Sec. 27-5
that this requires the expenditure of energy, with the energy being taken
from that contained in the waves. Absorption of an electromagnetic wave
consisting of a single pulse can be explained in much the same way. In such
a case, the force acting on an electron gives it a single pulse of acceleration,
instead of the repeated accelerations that produce oscillatory motion. But
in both cases the absorption of electromagnetic radiation involves the
acceleration of electrons. According to the electromagnetic theory, emission of
radiation involves the inverse of this process. The theory predicts that
electromagnetic radiation is emitted by electrons, or other charged particles,
when they accelerate.

Fig. 27-16 The electric field of a charged particle at a location fixed with
respect to the inertial frame of the observer.

Fig. 27-17 The electric field of a charged particle moving uniformly
through the inertial frame of the observer at a speed small compared to the
speed of light. The charge occupies the position where its field is indicated by
the gray field lines and then later the position where its field is indicated by
the black field lines. The direction of the field line going through a point such as
P changes as time passes since at every instant the field at P is in the direction
from the location of the charge at that instant. Also the strength of the field at
P changes as time passes. The gray and black vectors at P indicate both the
direction and the strength of the electric field at that point for the two positions
of the charge. Although at any point the field changes continually as the charge
moves, these changes are gradual. Subsequent changes in the field can be
predicted completely from preceding changes.

If a charged particle is stationary in some inertial reference frame, it
will appear in that frame to produce a time-independent electric field. This
familiar field is indicated in Fig. 27-16, which shows the uniform distribution
of field lines emanating from the position of the charge. There is energy
stored in the electric field. But there is no transport of energy, and the
energy is certainly not radiated away.

Now consider the charged particle when it is seen to be moving
through the inertial reference frame with a constant velocity, whose magnitude
is small compared to the speed of light. It has a magnetic field as well
as an electric field. Both fields must be time-dependent, since they move
with the charge. This can be appreciated by considering the electric field. As
indicated in Fig. 27-17, the electric field at each instant consists of a uniform
distribution of field lines emanating from the location of the charge at
that instant. With the passage of time the charge moves. Therefore the
direction of the field lines at each fixed point in the observer’s reference
frame must change in time in such a way that the field lines continue to
emanate from the present position of the charge. The geometry of the
magnetic field of the uniformly moving charge is more complicated than
that of its electric field. But it is apparent that at each fixed point in the
reference frame there are continuous changes in the magnetic field resulting
from the continued motion of the charge.

For the uniformly moving charge, there is energy stored in its magnetic
field as well as in its electric field. Setting the charge into motion at a
speed  small  compared  to  the  speed  of  light  has  no  effect  on  the  energy 
stored  in  the  electric  field.  Thus  the  additional  energy  in  the  magnetic  field 
must  come  from  the  work  done  by  the  force  which  initially  gave  the  charge 
its  uniform  motion.  The  energy  in  both  fields  is  transported  through  space 
at  the  same  velocity  as  the  velocity  of  the  charge,  and  the  fields  themselves 
remain  concentrated  around  the  location  of  the  charge.  The  energy  in  the 
fields  is  not  radiated  away  to  distant  regions.  That  this  must  be  true  can  be 
realized  by  remembering  that  the  same  charge  will  appear  to  possess  the 
time-independent  electric  field  illustrated  in  Fig.  27-16,  when  it  is  viewed 
from  an  inertial  reference  frame  that  moves  along  with  it.  Since  the  charge 
certainly  does  not  radiate  when  viewed  from  that  frame,  the  principle  of 
the  equivalence  of  inertial  reference  frames  in  Einstein’s  theory  of  relativity 
requires  that  it  cannot  radiate  when  viewed  from  the  inertial  frame  used  in 
Fig.  27-17.  Thus,  when  a charged  particle  moves  with  constant  velocity 
through  a reference  frame,  the  particle  possesses  electric  and  magnetic 
fields  which,  while  they  vary  in  space  and  time  in  that  reference  frame,  vary 
in  such  a way  that  they  do  not  satisfy  the  condition  for  electromagnetic 
radiation.  This  is  possible  because  the  fields  change  gradually  in  space  and 
time.  Through  the  connections  between  the  space  and  time  dependences 
of  the  electric  and  magnetic  fields  described  by  Maxwell’s  equations,  the 
fields  in  every  infinitesimal  region  can  anticipate  what  changes  should  be 
made  next  from  the  changes  that  have  just  been  made  in  that  region  and  in 
the  neighboring  regions.  (The  meaning  of  “anticipate”  is  clarified  soon  by 
means  of  a mechanical  analogy.) 


Electromagnetic radiation will be emitted if the charged particle producing
the electric and magnetic fields is accelerating with respect to the inertial
reference frame. An understanding of what happens can be achieved
most easily by focusing attention on its electric field, which has a simpler
geometry than its magnetic field. Consider a charged particle whose behavior
is illustrated in Fig. 27-18. Before time t the particle has been stationary
at location L in the inertial reference frame of the figure. Then it begins a
constant upward acceleration a, which lasts until time t', when it is located
at L'. At that time the upward velocity has increased to the value v =
a(t' - t), which we assume to be small compared to c. The particle then
continues moving at the velocity v, passing the location L" at t". The figure is
constructed for a t" such that the time interval t' - t of acceleration is about
one-fourth of the time interval t" - t' of constant-velocity motion. Shown
in the figure is a typical field line of the electric field produced by the
charged particle at t", for a case in which the magnitude of its final velocity
v is about one-fifth the speed of light c. (This makes the magnitude of v a
little large to satisfy well our assumption that it is small compared to c. But
clarity would be lost if the figure were constructed so that v had an
appreciably smaller magnitude.)

Fig. 27-18 A view from an inertial reference frame of a typical electric field
line emanating at time t" from an accelerated charged particle. Initially the
charge was at rest at L. Then it experienced a constant acceleration while
moving from L at time t to L' at time t'. Then it continued moving at constant
velocity of magnitude small compared to c, passing L" at time t". There is a kink
in the field line found between the inner sphere, which is centered on L', and the
outer sphere, which is centered on L. With increasing values of t" the kink
propagates out along the field line at speed c.

If there had been no acceleration, the charged particle would not have
started moving and the field line everywhere would be directed away from
its original location L. But there was acceleration. So at t" the field line near
the particle is directed away from its actual location L". Far from the
particle, on the other hand, the field line continues to point away from the
location L. The reason is that the effects of the change in motion of the
particle do not propagate to distant locations at infinite speed. Indeed, the
theory of relativity requires that no evidence that something happened at
location L can travel to the location of a distant observer at a speed faster
than the universal limiting speed c. Consequently, if the distance from L to
the observer is greater than c multiplied by the time interval t" - t, there
can be no way for the observer to know at time t" that the particle
accelerated away from L at time t. This means that the directions of the field lines
produced by the particle cannot instantaneously change throughout their
entire length. Such behavior would give instantaneous evidence at distant
locations of the fact that the charged particle producing the lines had
accelerated. Instead, this evidence propagates to the distant parts of its field at
speed c.




Inside a sphere of radius c(t" - t') centered on L' the typical field line
emanates from the actual location L" of the charged particle. The radius
equals the distance through which evidence has propagated at speed c into
the field at time t" that the particle commenced its uniform final motion at
location L' and time t'. Within this region the electric field is like that of the
uniformly moving charged particle illustrated in Fig. 27-17. The effect of
the acceleration already has passed through the region, and the field is the
same as if the particle had always been moving uniformly. Outside a sphere
of radius c(t" - t) centered on L the field line continues at time t" to point
away from the location L that the particle had at the instant t when the
acceleration commenced, as we have already concluded. Between the two
spheres there must be a region in which there is a kink in the field line
joining its two parts. The thickness of the region depends on the difference
between the radii of the two spheres, c(t" - t) - c(t" - t') = c(t' - t). As t"
increases, figures illustrating the situation would show the kink moving out
along the field line at speed c. In the process, the kink becomes more and
more like two successive right-angle bends in the field line. This is so
because the separation v(t" - t') + v(t' - t)/2 between the points of origin of
the inner and outer parts of the field line increases with t", while the
thickness c(t' - t) of the region in which the kink occurs remains constant.
Thus, far from the particle the field line at the kink becomes essentially
transverse to the direction in which the kink is propagating. In other
words, a transverse electric field moves away from the particle at speed c
along the typical field line shown in the figure. The same is true of all the
other field lines.
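The statement that the kink turns more nearly transverse as it travels outward
can be made concrete by tabulating the two lengths involved. The following is a
minimal sketch, not part of the text: the function name `kink_geometry` and the
numerical values (a final speed of 0.01c reached after 1 ns of acceleration) are
assumptions chosen only for illustration.

```python
C = 3.0e8  # speed of light, m/s


def kink_geometry(a, t_accel, t_coast, c=C):
    """Thickness of the kink region and offset between the two field-line origins.

    a        constant acceleration (m/s^2), lasting t_accel = t' - t
    t_coast  time t'' - t' spent at the final velocity v = a * t_accel
    """
    v = a * t_accel
    thickness = c * t_accel                      # c(t' - t), independent of t''
    separation = v * t_coast + v * t_accel / 2   # distance from L to L''
    return thickness, separation


# Illustrative numbers only: v = 0.01 c after 1 ns of acceleration.
a = 0.01 * C / 1.0e-9
for t_coast in (1.0e-9, 1.0e-7, 1.0e-5):
    thickness, separation = kink_geometry(a, 1.0e-9, t_coast)
    print(f"t'' - t' = {t_coast:.0e} s: thickness = {thickness:.2f} m, "
          f"separation = {separation:.3g} m, "
          f"ratio = {separation / thickness:.3g}")
```

The thickness stays fixed while the separation grows in proportion to the
coasting time, so the ratio of the two increases without limit as t" increases.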

There is a strong analogy between a kink propagating along an electric field
line when the charged particle on which it originates has accelerated and what
happens in a stretched string when one end of the string is accelerated. Imagine
you are holding the free end of a string, whose other end is attached to a support,
and are applying tension to the string. With your hand initially at rest, you
accelerate the hand to some final velocity in a direction transverse to the string. In so
doing, you induce a kink in the string, which propagates along the string as a
transverse wave. The kink, traveling at a speed characteristic of the stretched
string, carries evidence to distant locations that the point on which the string
originates has accelerated.

The analogy between a stretched string and an electric field line clarifies the
earlier statement concerning the way the field of a charge moving at constant
velocity is able to “anticipate” what changes should be made next from the
changes just made. Imagine that the string continues indefinitely, as do the field
lines we have been considering, so that reflection can be ignored. Also imagine
that when you finish bringing your hand to its final transverse velocity, you
continue moving it at that velocity. Then after the kink has traveled out of view, no
further transverse wave motion will be seen in the string. Every element of the
string will continue moving uniformly in the transverse direction, while the end
you hold continues moving uniformly. In these circumstances the connection
imposed by Newton’s laws of motion between the space and time dependences of the
transverse coordinate allows every element of the string to “anticipate” what
changes should be made next in its transverse coordinate from the changes it and
its neighbors have just made in that coordinate.

The accelerated charged particle in Fig. 27-18 is surrounded by a
region of transverse electric field which expands away from it at speed c.
Investigation of the behavior of the magnetic field lines shows that there is
also a transverse magnetic field traveling along with the electric field. These
transverse fields are the pulse of radiation that the charged particle emits
when it experiences a pulse of acceleration. The radiation travels away
from the particle at the universal speed of electromagnetic waves, c. Figure
27-19 shows the complete set of electric field lines surrounding a charged
particle, initially at rest in the inertial frame of the figure, shortly after the
particle experienced an upward acceleration that gave it an upward-directed
motion at a final speed small compared to c. In all directions of
emission the pulse of radiation is polarized. That is, the transverse electric
field in the pulse always lies in a plane containing the direction of the
acceleration and the direction of emission.

Fig. 27-19 Kinks in the electric field lines surrounding a
charged particle soon after it has experienced a pulse of acceleration
in the upward direction. The acceleration changed its
speed, relative to the inertial frame of the figure, from an initial
value of zero to a final value small compared to c. Note that the
kinks are most pronounced in the directions perpendicular to
the direction of the acceleration.

Note that for each field line shown in Fig. 27-19 (and for the single line shown
in Fig. 27-18) the part outside the kink is parallel to the part inside the kink. The
reason is that the part outside the kink is part of a field line emanating from the
charged particle when the particle is at rest in the inertial frame of the figure,
while the part inside the kink is part of the corresponding field line when the
particle is moving upward through the frame at its constant final speed. Since we
assume the final speed to be small compared to the speed of light, there is no
distortion in the electric field of the particle as a result of the motion. All the motion
does is to make the uniformly distributed set of field lines move through the
reference frame. This does not change the angle between a particular field line and the
line of motion. Thus after the charged particle is set into motion, each field line
emanating from it is parallel to the corresponding field line emanating from the
charge when it is at rest.

The situation is different if the final speed of the charged particle is comparable
to the speed of light. The theory of relativity shows that for such a case the
field lines are bunched in the plane perpendicular to the line of motion. Hence a
field line of a rapidly moving particle forms a larger angle with the line of motion
than that formed by the corresponding field line of the stationary particle. This
effect complicates matters considerably. So we avoid it by restricting our attention
to cases where the final speed of the charged particle is small compared to the
speed of light.




Fig. 27-20 A typical electric field line
emanating from a charged particle which
undergoes a brief constant acceleration
from rest to a final speed small compared to
c. The time interval during which it is
accelerated from L to L' is supposed to be so
small, compared to the time interval during
which it moves at its constant final speed
from L' to L", that L and L' are drawn in the
same location.


It is not difficult to evaluate the energy flux S in the radiation emitted
by a particle of charge q when the particle is given an acceleration a to a
final speed small compared to the speed of light. We first construct Fig.
27-20, showing a typical electric field line. It is very similar to Fig. 27-18,
except that the time interval during which the charged particle is accelerated
from L to L' is assumed to be quite small compared to the time interval
during which it coasts from L' to L". Thus the distance from L to L' is quite
small compared to the distance from L' to L", and so in the figure no
distinction has been made between the locations L and L'.


The figure shows the two components of the electric field just beyond
the inner limit of the kink. These are the component $\mathcal{E}_\parallel$
parallel to the direction of the field interior or exterior to the kink and the
component $\mathcal{E}_\perp$ perpendicular to that direction. The components
form two sides of a right triangle ABC which is similar to a larger right
triangle AB'C' defined by the extent of the kink measured in the parallel and
perpendicular directions. The length of the perpendicular side AB' of the
larger triangle is the distance traveled by the particle moving at velocity v
for time t" - t', projected on the perpendicular direction. Since the field
line makes an angle $\theta$ with the direction of the particle’s acceleration
and subsequent motion, the length of AB' is v(t" - t') sin $\theta$. The length
of the parallel side B'C' of the larger triangle is the difference between the
radii of the spheres defining the outer and inner limits of the kink, that is,
c(t" - t) - c(t" - t') = c(t' - t).

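Using the proportionality of corresponding sides of similar triangles,
we have

$$\frac{\mathcal{E}_\perp}{\mathcal{E}_\parallel} = \frac{v(t'' - t') \sin\theta}{c(t' - t)}$$

But

$$v = a(t' - t)$$

So this is

$$\frac{\mathcal{E}_\perp}{\mathcal{E}_\parallel} = \frac{a(t' - t)(t'' - t') \sin\theta}{c(t' - t)} = \frac{a(t'' - t') \sin\theta}{c}$$

We can write the relation as

$$\frac{\mathcal{E}_\perp}{\mathcal{E}_\parallel} = \frac{a r \sin\theta}{c^2} \tag{27-36}$$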




where the distance r is defined to be

$$r = c(t'' - t')$$

This is the distance from L', the location of the charged particle when its
acceleration ended, to the inner limit of the kink in its electric field line.
Now, the figure has been constructed to illustrate the relations between the
similar triangles. It does not make very evident the fact that, since v is
supposed to be small compared to c, the distance v(t" - t') through which the
charged particle has moved at time t" is in actuality small compared to the
distance c(t" - t') through which the inner limit of the kink has moved at
that time. A figure better illustrating this fact would make it clear that r =
c(t" - t') is also essentially equal to the distance from the particle’s location
L", at the time t" depicted in the figure, to the inner limit of the kink in its
electric field. That is, the distance r is not substantially different from the
distance L"A.


The part of the electric field interior to the kink has only a parallel
component. The field in this region is produced by a particle of charge q
moving at velocity v. But since its magnitude is small compared to c,
Coulomb’s law (equivalent to Maxwell’s first equation) can be used to evaluate the
parallel component of the interior field. Just before the inner limit of the
kink it is

$$\mathcal{E}_\parallel = \frac{q}{4\pi\epsilon_0 r^2} \tag{27-37}$$

The parallel component of the field just beyond the inner limit of the kink
must have this same value. (If this is not apparent, it can be proved by
applying Gauss’ law to a cylinder enclosing one end of a kink.) Thus
$\mathcal{E}_\parallel$ represents the same value in Eqs. (27-36) and (27-37),
and we can combine them, eliminating $\mathcal{E}_\parallel$, to yield

$$\mathcal{E}_\perp = \frac{q a \sin\theta}{4\pi\epsilon_0 c^2 r} \tag{27-38}$$

Here q is the charge of the particle, a the value of the brief acceleration
it has recently experienced, r the distance from the particle to the point
where the transverse electric field is being measured, and $\theta$ the angle
between the direction of the acceleration and the direction to that point.
The transverse field at the point is present for a time interval equal to the
duration of the pulse of radiation, and it does not arrive until after a time
delay equal to r/c.
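Because the transverse field of Eq. (27-38) falls off as 1/r while the Coulomb
field of Eq. (27-37) falls off as 1/r², the transverse part dominates at large
distances. The sketch below tabulates their ratio. It is an illustration under
assumed values: the function names are not from the text, and the charge,
acceleration, and distances are arbitrary choices, not data from the chapter.

```python
import math

EPS0 = 8.85e-12   # permittivity constant, C^2/(N*m^2)
C = 3.0e8         # speed of light, m/s


def e_parallel(q, r):
    """Coulomb field of Eq. (27-37), N/C."""
    return q / (4 * math.pi * EPS0 * r**2)


def e_perp(q, a, r, theta):
    """Transverse field in the kink, Eq. (27-38), N/C."""
    return q * a * math.sin(theta) / (4 * math.pi * EPS0 * C**2 * r)


# Illustrative values (assumptions): q = 1.6e-19 C, a = 1e18 m/s^2, theta = 90 deg.
q, a, theta = 1.6e-19, 1.0e18, math.pi / 2
for r in (1.0e-3, 1.0, 1.0e3):
    ratio = e_perp(q, a, r, theta) / e_parallel(q, r)
    print(f"r = {r:.0e} m: E_perp / E_par = {ratio:.3g}")
```

The ratio printed is just a r sin θ / c², the right side of Eq. (27-36), so it
grows in direct proportion to r.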

Knowing the transverse electric field in the pulse, we can obtain the
corresponding transverse magnetic field immediately by applying the relation
between these two fields that is always satisfied by electromagnetic
radiation in vacuum. This is Eq. (27-23), which we write here as

$$\mathcal{B}_\perp = \frac{\mathcal{E}_\perp}{c}$$

It gives us

$$\mathcal{B}_\perp = \frac{q a \sin\theta}{4\pi\epsilon_0 c^3 r} \tag{27-39}$$




Fig. 27-21 A polar plot of the energy flux emitted by a charged particle that has experienced
a pulse of acceleration. The arrow indicates the vertical direction of the acceleration.


Now we can evaluate the energy flux S by applying the definition of Eq.
(27-29a), written as

$$S = \frac{\mathcal{E}_\perp \mathcal{B}_\perp}{\mu_0}$$

We find

$$S = \frac{q^2 a^2 \sin^2\theta}{16\pi^2 \mu_0 \epsilon_0^2 c^5 r^2}$$

or, using $c^2 = 1/\mu_0\epsilon_0$, we get

$$S = \frac{q^2 a^2 \sin^2\theta}{16\pi^2 \epsilon_0 c^3 r^2} \tag{27-40}$$

Figure 27-21 is a polar plot showing the dependence of the energy flux
S on the angle $\theta$ between the direction of the acceleration and the direction
of the line from the charge along which S is being measured. There is no
flux of energy emitted in the direction of acceleration or in the opposite
direction, and the flux has a maximum value at right angles to this
direction. Can you use Fig. 27-19 to explain why? The energy flux is
symmetrical about an axis along the direction of acceleration. What is the physical
reason for this? We also see from Eq. (27-40) that the energy flux S
decreases in inverse proportion to the square of the distance r from the
accelerated charge to the point where S is measured. Why does S, the energy
passing per unit time through a unit normal area, obey this inverse-square
law?
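The angular and radial dependence of Eq. (27-40) can be explored numerically.
What follows is a rough sketch rather than anything prescribed by the text: the
function name `energy_flux` and the values of q, a, and r are assumptions made
for illustration.

```python
import math

EPS0 = 8.85e-12   # permittivity constant, C^2/(N*m^2)
C = 3.0e8         # speed of light, m/s


def energy_flux(q, a, r, theta):
    """Energy flux of Eq. (27-40), W/m^2."""
    return (q * a * math.sin(theta))**2 / (16 * math.pi**2 * EPS0 * C**3 * r**2)


# Illustrative values (assumptions).
q, a, r = 1.6e-19, 1.0e18, 1.0
s_max = energy_flux(q, a, r, math.pi / 2)
for deg in (0, 30, 60, 90, 120, 150, 180):
    s = energy_flux(q, a, r, math.radians(deg))
    print(f"theta = {deg:3d} deg: S / S_max = {s / s_max:.3f}")

# Doubling the distance should reduce the flux by a factor of four.
print(energy_flux(q, a, 2 * r, math.pi / 2) / s_max)
```

The tabulated ratios follow sin² θ, vanishing along the line of acceleration
and peaking at right angles to it, which is the pattern drawn in Fig. 27-21.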


Fig. 27-22 An element of area on a sphere. The element is in the form of a
band cut from the surface of the sphere by radial lines making angles $\theta$
and $\theta + d\theta$ with the axis of the sphere, which is along the direction
of acceleration of the charge. The width of the band is $r\,d\theta$. The radius
of the band is $r \sin\theta$, and so its periphery has a length
$2\pi r \sin\theta$. Since the area of the band is its width times its length,
the area is $r\,d\theta \cdot 2\pi r \sin\theta = 2\pi r^2 \sin\theta\,d\theta$.

Finally, let us evaluate the total energy emitted per unit time in all
directions while a charged particle accelerates. This is the radiated power
P. It can be obtained by integrating the energy per unit time per unit area,
S, over the area of a sphere of arbitrary radius r centered on the charge.
We have

$$P = \int_0^\pi S \, 2\pi r^2 \sin\theta \, d\theta$$

Here $2\pi r^2 \sin\theta\,d\theta$ is an element of area of the sphere for which
the angle lies in the range $\theta$ to $\theta + d\theta$. This is demonstrated
in Fig. 27-22 and its caption. Using Eq. (27-40) for S, we get

$$P = \frac{q^2 a^2}{8\pi\epsilon_0 c^3} \int_0^\pi \sin^3\theta \, d\theta$$

What is the significance of the fact that r drops out? The value of the
integral, found by standard analytical methods, is 4/3. Thus we find

$$P = \frac{q^2 a^2}{6\pi\epsilon_0 c^3} \tag{27-41}$$

This is the rate of emission of energy by the accelerated charge. Note that it
is proportional to the square of its acceleration.
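If you would rather not take the value of the angular integral on trust, it can
be estimated numerically. The sketch below is one way to do so, using a simple
midpoint rule. The function names and the sample values of q and a are
assumptions made here for illustration.

```python
import math

EPS0 = 8.85e-12   # permittivity constant, C^2/(N*m^2)
C = 3.0e8         # speed of light, m/s


def sin_cubed_integral(n=100_000):
    """Midpoint-rule estimate of the integral of sin^3(theta) from 0 to pi."""
    d_theta = math.pi / n
    return sum(math.sin((k + 0.5) * d_theta)**3 for k in range(n)) * d_theta


def radiated_power(q, a):
    """Total radiated power of Eq. (27-41), W."""
    return q**2 * a**2 / (6 * math.pi * EPS0 * C**3)


integral = sin_cubed_integral()
print(f"integral of sin^3 = {integral:.6f} (exact value 4/3)")

# The same power assembled from the prefactor of the integral and its value.
q, a = 1.6e-19, 1.0e18   # illustrative values (assumptions)
p_from_integral = q**2 * a**2 / (8 * math.pi * EPS0 * C**3) * integral
print(p_from_integral, radiated_power(q, a))
```

The radius r never enters the calculation, which reflects the fact that the
same power crosses every sphere centered on the charge.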

In most cases electromagnetic radiation is emitted by charges that
undergo a repetitive series of accelerations, rather than a single pulse of
acceleration. An example is found in the emission of light by atoms. Atoms emit
light because of the accelerations of their electron charge distributions as the charge
distributions perform harmonic oscillations. The frequency of the sinusoidal
wave produced in the process equals the frequency of the sinusoidal
oscillation producing it, just as the frequency of a wave produced in a stretched
string by making one end oscillate equals the frequency of the oscillation.
The charges in any particular atom can be thought of as oscillating along a
line which has a particular orientation. This means that they emit radiation
which is polarized, with the direction of the transverse electric field confined
to a plane passing through the line of oscillation and the direction of
propagation. But in most circumstances there are atoms in an emitter which
have charges oscillating along lines with all orientations and with all phases.
Then the total radiation emitted contains transverse electric fields in all
planes passing through the propagation direction and having all phases.
Such radiation is not polarized and does not have a particular phase.




Let us apply Eq. (27-41) to evaluate the average power radiated in the
idealized case of a single atomic electron whose y coordinate, measured
from the essentially fixed nucleus, oscillates with amplitude $y_0$ and
frequency $\nu$. That is,

$$y = y_0 \cos(2\pi\nu t)$$

The acceleration at some instant is found by calculating

$$a = \frac{d^2 y}{dt^2} = -4\pi^2 \nu^2 y_0 \cos(2\pi\nu t)$$

So Eq. (27-41) tells us that the power radiated at this instant is

$$P = \frac{16\pi^4 \nu^4 q^2 y_0^2}{6\pi\epsilon_0 c^3} \cos^2(2\pi\nu t)$$

Averaging over an oscillation cycle, we have the average radiated power

$$\langle P \rangle = \frac{16\pi^4 \nu^4 q^2 y_0^2}{6\pi\epsilon_0 c^3} \cdot \frac{1}{2}$$

since $\langle \cos^2(2\pi\nu t) \rangle = \tfrac{1}{2}$. This simplifies to

$$\langle P \rangle = \frac{4\pi^3 \nu^4 q^2 y_0^2}{3\epsilon_0 c^3} \tag{27-42}$$


Example  27-7  applies  this  equation  to  a radiating  atom. 


EXAMPLE  27-7 

Estimate the power radiated by an atom emitting light of wavelength
$\lambda = 5.0 \times 10^{-7}$ m. Assume that a single electron of charge
$q = -1.6 \times 10^{-19}$ C is responsible for the radiation and that its
oscillation amplitude equals the typical atomic dimension $1.0 \times 10^{-10}$ m.

■ You can calculate the frequency of the light emitted from

$$\nu = \frac{c}{\lambda} = \frac{3.0\times10^{8}\ \text{m/s}}{5.0\times10^{-7}\ \text{m}} = 6.0\times10^{14}\ \text{Hz}$$

This is also the oscillation frequency of the electron. Using it and the other values
given in Eq. (27-42), you have

$$\langle P\rangle = \frac{4 \times \pi^3 \times (6.0\times10^{14}\ \text{Hz})^4 \times (1.6\times10^{-19}\ \text{C})^2 \times (1.0\times10^{-10}\ \text{m})^2}{3 \times 8.85\times10^{-12}\ \text{C}^2/(\text{N}\cdot\text{m}^2) \times (3.0\times10^{8}\ \text{m/s})^3} = 5.7\times10^{-12}\ \text{W}$$
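The arithmetic of Example 27-7 can be checked with a short script. This is only a sketch: the function name `average_radiated_power` and the constant names are illustrative choices, not part of the text, and the constants are rounded to the same precision the example uses.

```python
import math

EPS0 = 8.85e-12   # permittivity of free space, C^2/(N m^2), as rounded in the text
C_LIGHT = 3.0e8   # speed of light, m/s, as rounded in the text


def average_radiated_power(freq_hz, charge_c, amplitude_m):
    """Average power of an oscillating charge, Eq. (27-42):
    <P> = 4 pi^3 nu^4 q^2 y0^2 / (3 eps0 c^3)."""
    return (4 * math.pi**3 * freq_hz**4 * charge_c**2 * amplitude_m**2
            / (3 * EPS0 * C_LIGHT**3))


wavelength = 5.0e-7                  # m, value given in Example 27-7
frequency = C_LIGHT / wavelength     # Hz
power = average_radiated_power(frequency, 1.6e-19, 1.0e-10)

print(f"nu  = {frequency:.2e} Hz")
print(f"<P> = {power:.2e} W")
```

Because the power scales as the fourth power of the frequency, halving the wavelength in this sketch raises the result by a factor of 16.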


The energy flux S radiated by an individual atom has the $\sin^2\theta$ angular
dependence of Eq. (27-40), with $\theta$ measured from the oscillation axis of the
charge emitting the radiation. The radiation emitted from the atom in a
particular direction is polarized. Its electric field lies in the plane containing
the directions of oscillation and emission, and its magnetic field is normal
to that plane. The power emitted by the atom is quite small indeed, as we
saw in Example 27-7. But any light source contains a very large number of
atoms emitting simultaneously, and the total radiated power need not be
small. Except in the case of the laser, the electrons in different atoms
oscillate independently. And if no external magnetic field is applied to the
source, there will be no correlation between the oscillation directions of the
various atoms. Thus in common light sources the radiation pattern and


1304  Maxwell’s  Equations  and  Electromagnetic  Waves 


polarization  characterizing  a single  emission  process  are  averaged  over  all 
possible  oscillation  directions,  and  the  light  produced  is  neither  polarized 
nor  directional.  The  light  from  such  a source  spreads  out  uniformly  in  all 
directions.  It  can  be  focused  into  a beam  by  using  an  appropriately  shaped 
mirror  or  lens.  These  focusing  techniques  are  studied  in  Chaps.  28  and  29. 
Techniques  for  polarizing  an  unpolarized  beam  of  light  are  described  in 
Chap.  28. 


Fig. 27-23 A half-wave dipole transmitting antenna, with its leads coming
from the transmitter. The arrows indicate the directions of currents flowing
in the antenna and in its leads at some instant. Half an oscillation cycle
later all these directions are reversed.


The electrons in a simple radio transmitting antenna are doing much
the same thing as electrons in radiating atoms, although the scale is very
different. A half-wave dipole transmitting antenna is shown in Fig. 27-23.
Just as for the receiving antenna discussed in Sec. 27-4, its total length L is
made equal to $\lambda/2$, half the wavelength of the wave it is designed to emit
most efficiently. The reason is that the natural frequency of current
oscillations in the antenna will then resonate with the frequency of the emitted
wave. Alternating current at this frequency is fed to the dipoles by the
closely spaced lead wires. There is no appreciable radiation from the leads
because the transverse electric and magnetic fields produced by the
accelerations of charges in one lead always nearly cancel those produced by the
opposite accelerations of the nearby charges in the other lead. So the net
effect of the current flowing in the two leads, and then out along the
antenna, is the same as if an alternating current were flowing on the antenna
alone. The strength of the current varies along the antenna. Just as in a
half-wave dipole receiving antenna, the current pattern is a longitudinal
standing wave with nodes at both ends and an antinode at the center.

A calculation of the energy flux emitted by the antenna, for a given
current fed into its center, starts from Eqs. (27-38) and (27-39). The
calculation is complicated because there is a varying current flow along a length
that is comparable to the wavelength of the radiation emitted. The
procedure is as follows: (1) Considering only a single infinitesimal element of the
antenna, evaluate the transverse electric and magnetic fields resulting from
the acceleration of charge in the element. (2) Integrate along the antenna
to get the vector sums of these fields, keeping track of all phase relations
among the fields that are due to the individual elements as they superpose
at a particular location. (3) Compute the energy flux from the net electric
and magnetic fields. [These complications do not arise when the radiation
emitted by an atom is calculated since the length of the atomic "antenna" is
about $1 \times 10^{-10}$ m while the wavelength of the radiation emitted is about
$5 \times 10^{-7}$ m. So the atom acts like a single infinitesimal source to which Eqs.
(27-40) and (27-41) are directly applicable.] The complicated half-wave
dipole calculations predict that the average energy flux radiated is

$$\langle S\rangle = \frac{\langle I^2\rangle}{4\pi^2\epsilon_0 c\, r^2}\,\frac{\cos^2[(\pi/2)\cos\theta]}{\sin^2\theta} \qquad (27\text{-}43)$$


where $\theta$ is the angle to the direction of radiation, measured from the line
formed by the antenna. In this expression I is the amplitude of the
standing current wave at the center of the antenna. In other words, $\langle I^2\rangle$ is
the mean-square current delivered to the antenna by the lead wires. The
angular dependence of the radiation pattern is shown in the polar plot of
Fig. 27-24. It is quite similar to the basic $\sin^2\theta$ dependence of Eq. (27-40)
and Fig. 27-21, but is somewhat more strongly peaked at 90°.
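One way to quantify "somewhat more strongly peaked" is to find the angle at which each pattern falls to half its value at 90°. The sketch below does this by bisection. The function names are illustrative, and the half-power criterion is an assumption introduced here for comparison, not something the text specifies.

```python
import math


def point_source_pattern(theta):
    """sin^2(theta) angular dependence of Eq. (27-40)."""
    return math.sin(theta) ** 2


def half_wave_pattern(theta):
    """Angular factor of Eq. (27-43): cos^2[(pi/2) cos(theta)] / sin^2(theta)."""
    return math.cos(0.5 * math.pi * math.cos(theta)) ** 2 / math.sin(theta) ** 2


def half_power_angle(pattern):
    """Bisect between 1 deg and 90 deg for the angle where the pattern
    equals half its value at 90 deg (both patterns rise monotonically there)."""
    lo, hi = math.radians(1.0), math.radians(90.0)
    target = 0.5 * pattern(hi)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if pattern(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


theta_point = math.degrees(half_power_angle(point_source_pattern))
theta_dipole = math.degrees(half_power_angle(half_wave_pattern))

print(f"sin^2 pattern reaches half power at {theta_point:.1f} deg")
print(f"half-wave dipole reaches half power at {theta_dipole:.1f} deg")
```

The half-wave dipole reaches half power at a larger angle from the antenna axis than the $\sin^2\theta$ pattern does, so its lobe around 90° is narrower.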




Fig. 27-24 A polar plot of the energy flux emitted by a half-wave
dipole transmitting antenna, which is proportional to
$\cos^2[(\pi/2)\cos\theta]/\sin^2\theta$. The vertical alignment of the antenna is
indicated schematically.


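The average radiated power for the half-wave dipole antenna is

$$\langle P\rangle = \int_0^{\pi} \langle S\rangle\, 2\pi r^2 \sin\theta\, d\theta = \frac{\langle I^2\rangle}{2\pi\epsilon_0 c}\int_0^{\pi} \frac{\cos^2[(\pi/2)\cos\theta]}{\sin\theta}\, d\theta$$

The integral cannot be evaluated in closed form with elementary functions.
But it is easy to use the numerical integration program in the Numerical
Calculation Supplement to determine that its value is 1.219. By also
evaluating $2\pi\epsilon_0 c$, the expression for $\langle P\rangle$ can be written as

$$\langle P\rangle = \langle I^2\rangle R \qquad (27\text{-}44a)$$

where

$$R = \frac{1.219}{2\pi\epsilon_0 c} = 73.1\ \Omega \qquad (27\text{-}44b)$$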

The expression for $\langle P\rangle$ has the form of Joule's law: power = (current)² ×
resistance. The so-called radiation resistance R has the value 73.1 Ω for the
half-wave dipole. The radiation resistance of an antenna is a figure of
merit. The larger the radiation resistance, the more electric power will be
taken from a source producing a given current and converted into
electromagnetic radiation.
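The value 1.219 and the resulting 73.1 Ω can be reproduced with a few lines of numerical integration. This is a stand-in sketch using a simple midpoint rule, not the program from the Numerical Calculation Supplement. The function names and the number of subintervals are arbitrary choices made here.

```python
import math

EPS0 = 8.854e-12    # permittivity of free space, C^2/(N m^2)
C_LIGHT = 2.998e8   # speed of light, m/s


def integrand(theta):
    """cos^2[(pi/2) cos(theta)] / sin(theta), the integrand for the half-wave dipole."""
    return math.cos(0.5 * math.pi * math.cos(theta)) ** 2 / math.sin(theta)


def midpoint_integral(f, a, b, n=20000):
    """Midpoint rule; it never samples the endpoints, where the integrand is 0/0."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


integral = midpoint_integral(integrand, 0.0, math.pi)
radiation_resistance = integral / (2 * math.pi * EPS0 * C_LIGHT)

print(f"integral = {integral:.4f}")
print(f"R        = {radiation_resistance:.1f} ohm")
```

The midpoint rule is used because the integrand is an indeterminate 0/0 at both limits, although it tends smoothly to zero there.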

Example 27-8 makes use of Eq. (27-44b).


EXAMPLE  27-8 

Evaluate  the  power  radiated  by  an  ideal  half-wave  dipole  antenna  if  the  amplitude 
of  the  sinusoidal  current  delivered  to  its  center  is  10  A. 




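■ Since the current varies sinusoidally with time, you know [see Eq. (26-79)] that

$$\langle I^2\rangle = \frac{(10\ \text{A})^2}{2} = 50\ \text{A}^2$$

So you have immediately

$$\langle P\rangle = \langle I^2\rangle R = 50\ \text{A}^2 \times 73.1\ \Omega = 3.7\times10^{3}\ \text{W}$$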


Transmitting  antennas  used  for  AM  radio  broadcasting  commonly 
consist  of  a single  vertical  tower,  mounted  on  a base  that  is  insulated  from 
the  earth.  The  tower  acts  as  half  of  a half-wave  dipole.  The  other  half  is 
supplied  by  the  electrical  “image”  of  the  tower  in  the  conducting  plane  of 
the  earth,  which  arises  from  currents  induced  in  the  earth.  In  other  words, 
the  earth  acts  like  a “mirror”  in  providing  the  missing  half  of  the  half-wave 
dipole  antenna. 

Example  27-9  applies  Eq.  (27-41)  to  the  simple  situation  of  a single 
charged  particle  having  an  acceleration  of  constant  magnitude. 


EXAMPLE  27-9 

A proton of kinetic energy K = 50.0 MeV is traveling in a circular orbit of radius
R = 1.00 m through the uniform magnetic field of a cyclotron. Evaluate the energy
it loses to radiation in one trip around the orbit.

■ The first thing to do is to look up the rest-mass energy of a proton. You will
find the value to be 938 MeV. Since this is large compared to the value quoted for
the proton's kinetic energy, you know that the proton is moving at a speed small
compared to the speed of light. Thus Eq. (27-41), and any other nonrelativistic
equations, can be applied to predict the behavior of the proton.

In the circular orbit, the proton undergoes an acceleration

$$a = \frac{v^2}{R}$$

or

$$a = \frac{2mv^2}{2mR} = \frac{2K}{mR}$$


where m is its mass. So it emits electromagnetic radiation, and the energy radiated
per second is

$$P = \frac{e^2a^2}{6\pi\epsilon_0 c^3} = \frac{2e^2K^2}{3\pi\epsilon_0 c^3 m^2 R^2} \qquad (27\text{-}45)$$


where e is the charge of the proton. The energy radiated per orbit traversal is P
multiplied by the orbital period $2\pi R/v$. This energy can come only from the
proton's kinetic energy, so the loss of kinetic energy by radiation is

$$\Delta K = -\frac{4e^2K^2}{3\epsilon_0 c^3 m^2 R v}$$

Setting $v = \sqrt{2K/m}$, you have

$$\Delta K = -\frac{4e^2K^{3/2}}{3\sqrt{2}\,\epsilon_0 c^3 m^{3/2} R}$$

Using the conversion factor 1 MeV = $1.60 \times 10^{-13}$ J, you find the numerical
value of the energy loss per orbit traversal to be

$$\Delta K = -\frac{4 \times (1.60\times10^{-19}\ \text{C})^2 \times (50.0 \times 1.60\times10^{-13}\ \text{J})^{3/2}}{3\sqrt{2} \times 8.85\times10^{-12}\ \text{C}^2/(\text{N}\cdot\text{m}^2) \times (3.00\times10^{8}\ \text{m/s})^3 \times (1.67\times10^{-27}\ \text{kg})^{3/2} \times 1.00\ \text{m}}$$

$$= -3.35\times10^{-29}\ \text{J} = -2.09\times10^{-16}\ \text{MeV}$$

The  energy  lost  to  radiation  per  orbit,  AK,  is  completely  negligible  compared  to  the 
energy  of  the  proton,  K = 50.0  MeV.  So  the  effect  can  be  ignored  in  designing  the 
cyclotron. 
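The numbers in Example 27-9 can be reproduced as follows. The sketch evaluates the closed-form expression for ΔK and cross-checks it against Eq. (27-45) multiplied by the orbital period. The function and constant names are illustrative, and the constants are rounded as in the example.

```python
import math

E_CHARGE = 1.60e-19   # proton charge, C
EPS0 = 8.85e-12       # permittivity of free space, C^2/(N m^2)
C_LIGHT = 3.00e8      # speed of light, m/s
M_PROTON = 1.67e-27   # proton mass, kg
MEV = 1.60e-13        # joules per MeV


def radiated_power(kinetic_j, radius_m, mass_kg):
    """Eq. (27-45): P = 2 e^2 K^2 / (3 pi eps0 c^3 m^2 R^2), nonrelativistic."""
    return (2 * E_CHARGE**2 * kinetic_j**2
            / (3 * math.pi * EPS0 * C_LIGHT**3 * mass_kg**2 * radius_m**2))


def energy_loss_per_orbit(kinetic_j, radius_m, mass_kg):
    """Closed form: Delta K = -4 e^2 K^(3/2) / (3 sqrt(2) eps0 c^3 m^(3/2) R)."""
    return -(4 * E_CHARGE**2 * kinetic_j**1.5
             / (3 * math.sqrt(2) * EPS0 * C_LIGHT**3 * mass_kg**1.5 * radius_m))


kinetic = 50.0 * MEV    # J
radius = 1.00           # m
delta_k = energy_loss_per_orbit(kinetic, radius, M_PROTON)

# Cross-check: power times orbital period 2 pi R / v, with v = sqrt(2K/m).
speed = math.sqrt(2 * kinetic / M_PROTON)
period = 2 * math.pi * radius / speed
delta_k_check = -radiated_power(kinetic, radius, M_PROTON) * period

print(f"Delta K = {delta_k:.3e} J = {delta_k / MEV:.3e} MeV")
print(f"check   = {delta_k_check:.3e} J")
```

The two routes agree because the closed form is just Eq. (27-45) with the period substituted, which is a useful sanity check on the algebra.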


For a particle accelerator in which protons traverse circular orbits with
energies in billions or trillions of electron volts, instead of millions, their
energy loss from radiation can become appreciable. Such machines are called
synchrotrons, and the radiation is called synchrotron radiation. Since the
particles are moving at relativistic speeds, the equations we have derived
are not applicable. But some of their features still hold qualitatively. For
instance, other factors being equal, the smaller the rest mass of the charged
particle, the more intense is its synchrotron radiation (compare this
statement with Eq. (27-45)). The angular dependence of the emitted radiation
becomes more complicated as the orbital speed of the particle becomes
more relativistic. When the speed is highly relativistic, the radiation is
emitted in a narrow range of directions centered on the instantaneous
direction of motion of the particle. The radiation "illuminates" the
particle's path around its orbit like a headlight.
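The mass dependence can be made concrete with Eq. (27-45), in which P varies as $1/m^2$ for fixed charge magnitude, kinetic energy, and orbit radius. The sketch below compares an electron with a proton. It uses the nonrelativistic formula only, so it illustrates the qualitative trend and is not a prediction for a real synchrotron. The function name is hypothetical.

```python
M_ELECTRON = 9.11e-31   # electron mass, kg
M_PROTON = 1.67e-27     # proton mass, kg


def power_ratio_same_k_and_r(mass_a, mass_b):
    """P_a / P_b from Eq. (27-45) for equal charge magnitude, K, and R.
    All other factors cancel, leaving (m_b / m_a)^2."""
    return (mass_b / mass_a) ** 2


ratio = power_ratio_same_k_and_r(M_ELECTRON, M_PROTON)
print(f"P_electron / P_proton = {ratio:.2e}")
```

Even in this nonrelativistic estimate the electron radiates several million times more power than a proton of the same kinetic energy on the same orbit.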


Synchrotron radiation constantly draws energy from particles orbiting
around a particle accelerator. It can be compensated for in the accelerator design
by increasing the energy given to the particles by the accelerating electric
fields, provided the radiation drain is not too great. Because they have very
small rest masses compared to protons, electrons emit so much synchrotron
radiation that it is not practical to use circular accelerators of reasonable radius to give
them very high energies. This is the reason for building such linear electron
accelerators as the more than 3-km-long machine at Stanford University.

On the other hand, circular electron accelerators, of not too high an energy,
have been built for the specific purpose of producing synchrotron radiation. The
highly directional emitted radiation and the very narrow wavelength range in
which it is emitted make synchrotron radiation particularly useful in many
branches of experimental physics.

Synchrotron radiation by charged particles, moving in circular (or helical)
paths in magnetic fields, is known to be responsible for the electromagnetic
radiation emitted from various celestial objects. Such radiation has been observed over
a spectral range extending from radio frequencies to X-ray frequencies. For
example, part of the radiation emitted by the Crab Nebula has properties which
identify it as synchrotron radiation from electrons with energies up to about $10^{6}$ MeV,
moving in magnetic fields within the nebula of strength about $10^{-8}$ T. The
detection and measurement of such magnetic fields provide important information to
astronomers and astrophysicists.




EXERCISES 


Group  A 

27-1. Displacement current, I. Each of the plates of a
plane-parallel capacitor has an area of 0.10 m². The
magnitude of the electric field in the vacuum between the
plates is changing at the rate $d\mathcal{E}/dt = 5.0 \times 10^{10}$ V/(m·s).
What is the displacement current?

27-2. Displacement current, II. A plane-parallel vacuum
capacitor of capacitance $1.0 \times 10^{-5}$ F is connected
across an ac source of voltage with amplitude 150 V and
frequency 60 Hz.

a. Evaluate the displacement current for an instant
when there is no potential difference between the
capacitor plates.

b.  Evaluate  the  displacement  current  for  an  instant 
when  the  potential  difference  is  150  V. 

27-3. One size fits all. Explain why it must be true that
the radius $r_1$ of the "stovepipe of the hat" in Example 27-1
cancels out in the final result of the calculation.

27-4. At the throw of a switch. The plane-parallel vacuum
capacitor C in Fig. 27E-4 has circular plates of radius
0.150 m. The battery V has a voltage of 22.5 V, and the
resistor R has a resistance of 500 Ω. After having been in
position A for a considerable time, the switch S is thrown
to position B. Calculate the magnetic field strength at a
point at the edge of the evacuated region between the
capacitor plates immediately after the switch is thrown.


Fig. 27E-4


27-5.  You  win!  Imagine  that  you  have  just  discovered 
experimentally  the  existence  of  magnetic  monopoles.  To 
make  winning  the  Nobel  Prize  certain,  you  want  to  include 
in  your  research  report  the  modifications  in  Maxwell’s 
equations required by your discovery. What are the
modified forms of these equations?

27-6.  Cyclic  symmetry.  Without  performing  a detailed 
derivation,  transcribe  Eqs.  (27-10)  through  (27-15)  into 
the  analogous  equations  for  a plane  wave  traveling  in  the 
positive  z direction,  with  the  electric  field  polarized  in  the  x 
direction. 

27-7. Frequency and wavelength. The frequency of the
electromagnetic waves emitted by a certain AM radio
station is $9.82 \times 10^{5}$ Hz. What is the wavelength?

27-8. The electromagnetic spectrum. Referring to Fig.
27-11, name the region of the electromagnetic spectrum
in which the wavelengths are comparable to the following:

a. the diameter of an atom (approximately $10^{-10}$ m)

b. the diameter of a living cell (approximately
$10^{-5}$ m)

c.  the  width  of  your  thumb 

d.  the  height  of  a human 

e.  the  length  of  a football  field 

f.  the  distance  from  New  York  to  San  Francisco 

27-9. Dimensional analysis, I. Prove that $1/\sqrt{\mu_0\epsilon_0}$ has
the dimensions of (velocity).

27-10.  Dimensional  analysis,  II.  Prove  that  has 

the  dimensions  of  (energy)/(area-time). 

27-11.  Force  exerted  by  radiation. 

a.  A plane  electromagnetic  wave  of  finite  extent  has 
power  P.  That  is,  the  total  energy  it  carries  per  unit  time 
past an imaginary surface normal to its propagation
direction is P. The wave is incident on a surface that absorbs it
completely.  Show  that  the  strength  F of  the  force  exerted 
on  the  absorbing  surface  has  the  value  F = P/c. 

b.  The  power  in  a certain  plane  wave  is  500  W.  What 
force  does  the  wave  exert  on  a smooth  metal  surface, 
normal  to  the  propagation  direction,  which  reflects  the 
wave  back  on  itself? 

27-12. Radiation pressure. Use the result of Example
27-6 to calculate the radiation pressure of sunlight just
outside the surface of the sun. The radius of the sun is
equal to $6.96 \times 10^{5}$ km. The distance of the sun from the
earth is equal to $1.50 \times 10^{8}$ km.

27-13. Light power! Two particles with the same
electric charge ($q_2 = q_1$) but unequal masses ($m_2 \neq m_1$) are
accelerated by the application of the same net force.

a. Find the ratio $P_2/P_1$ of the powers radiated by the
two particles.

b. Evaluate the ratio in the case of an electron and an
antiproton: $m_{\bar{p}} = 1840\, m_e$.

27-14. Characteristic time for radiative energy loss by
atoms. Use the result of Example 27-7 to estimate how
much time is needed for an atom with excess energy of
1 eV (a typical value for an "excited" atom) to radiate
away that excess energy.

27-15. CB antenna. The type of transmitting antenna
commonly used in the citizens' band consists of half of a
half-wave dipole projecting above the steel roof of an
automobile. The metal surface approximates the "mirror"
supplying the missing half, as described in the text. In a
particular case, the average power radiated by the
antenna is $\langle P\rangle = 3.0$ W when the root-mean-square current
delivered to it is $\langle I^2\rangle^{1/2} = 0.25$ A. Calculate the radiation
resistance of the antenna, and compare it with the
radiation resistance of an ideal half-wave dipole.




Group  B 

27-16.  Displacement  current,  III.  In  Fig.  27E-16,  the 
switch  S is  closed  at  time  t = 0. 

a.  What  is  the  displacement  current  through  the 
capacitor,  and  the  conduction  current  at  (i)  point  b,  (ii) 
point  c,  (iii)  point  d at  a time  t soon  after  t = 0? 

b. Repeat part a, allowing for the batteries' internal
resistance r. Take $t \ll rC$.

Fig. 27E-16


27-17. Magnetic field in a 60-Hz circuit. A resistor R
and a capacitor C are connected in series across an ac
source of voltage $V = V_0 \cos(\omega t)$. The plane-parallel
capacitor has circular plates, and the region between them
is evacuated.

a. Show that when steady-state conditions have been
reached, the charge q on the capacitor can be written as
$q = q_0 \cos(\omega t + \delta)$, with $q_0 > 0$, and determine $q_0$ and $\delta$.
You may assume that $\omega RC \ll 1$.

b. Find the magnetic field in the region between the
capacitor plates.

c. Suppose the capacitor plates are 1.00 m in radius
and 1.00 cm apart. What is the capacitance value?

d. If R = 100 Ω, $V_0$ = 100 V, and $\omega/2\pi$ = 60 Hz,
what is the steady-state charge q on the capacitor?

e. For the numerical values given, evaluate the
magnetic field in the region between the plates.

f. Compare the maximum magnitude of this field to
the magnitude of the earth's magnetic field.

27-18. Displacement current in a spherical capacitor. An
ac source is connected across a spherical vacuum capacitor
of capacitance C, so that the potential of the outer
electrode with respect to the inner one is given by $V_0 \cos(\omega t)$.
Prove that the displacement current from inner to outer
electrode is $-C\omega V_0 \sin(\omega t)$.

27-19. Charge conservation, I. Consider a fixed volume
of space bounded by a fixed closed surface. Use Maxwell's
equations to prove that the electric charge density $\rho$ and
the electric current density j satisfy

$$\oint_{\text{closed surface}} \mathbf{j}\cdot d\mathbf{a} = -\frac{d}{dt}\int_{\text{enclosed volume}} \rho\, dv$$

This shows that Maxwell's equations imply the
conservation of electric charge. Hint: Take the time derivative of
Eq. (27-6a) and apply it to the closed surface and enclosed
volume. Next consider a closed curve that girdles the
closed surface, cleaving the closed surface into two parts.
Then apply Eq. (27-6d) to the girdling curve and one part,
and apply it again to the girdling curve and the other part.

27-20. Charge conservation, II. Sketch patterns for
each of the following time-independent electric current
densities j. (i) $\mathbf{j} = j_0\hat{\mathbf{x}}$. (ii) $\mathbf{j} = j_0 \sin(kx)\hat{\mathbf{x}}$. (iii) $\mathbf{j} = j_0 \sin(ky)\hat{\mathbf{x}}$.
Which of these corresponds to a time-independent charge
density $\rho$?

27-21. The other polarization. By proceeding in a
manner completely analogous to that of Sec. 27-3, derive
wave equations for the electric and magnetic fields of a
plane wave traveling in the positive x direction with the
electric field polarized in the z direction.

27-22. Magnetic field components in a plane wave, I.
Prove that in an electromagnetic wave, whose wave fronts
are planes perpendicular to the x axis, the magnetic field
cannot have a component which depends on x.

27-23. Magnetic field components in a plane wave, II. By
following the procedure suggested in the paragraph
below Eq. (27-11), derive two equations which prove that
when a plane wave propagating in the x direction is
polarized so that its electric field has only a y component,
then its magnetic field has only a z component.

27-24. Have solution, will travel. Verify by substitution
into the electromagnetic wave equations that they have
traveling-wave solutions of the form $\mathcal{E}_y(x, t) = \mathcal{E}_y(x - vt)$
and $\mathcal{B}_z(x, t) = \mathcal{B}_z(x - vt)$, where $|v| = 1/\sqrt{\mu_0\epsilon_0}$.

27-25. Finding the partner. Find an expression for the
magnetic field in a polarized plane wave, traveling in the
negative x direction, which has $\mathcal{E}_z(x, t) = 0$ and $\mathcal{E}_y(x, t) =
A \cos^3[(2\pi/\lambda)(x + ct)]$.

27-26. Standing electromagnetic waves. The heart of a
laser is a cavity with highly reflecting mirrors at opposite
ends. Just as standing mechanical waves of certain definite
wavelengths can be established on a stretched string of
given length, standing electromagnetic waves can be
established within the laser cavity. These can be regarded as
resulting from the superposition of oppositely directed
traveling waves. A laser cavity lies along the x axis. Assume
that the electric field of the rightward-moving wave within
the cavity is $\mathcal{E}_{yr}(x, t) = A \cos[(2\pi/\lambda)(x - ct)]$. Because of
the end mirrors, there is a leftward-moving wave of the
same frequency, amplitude, and polarization: $\mathcal{E}_{yl}(x, t) =
A \cos[(2\pi/\lambda)(x + ct) + \delta]$.

a. Show that the total electric field can be written as
the product of two sinusoidal functions, one of x and the
other of t, so that it is a standing wave.

b. Find the period of the standing wave and prove
that $\lambda$ is its wavelength.

c. Express the total magnetic field as a product of
sinusoidal functions. (Assume that the vacuum form of
Maxwell's equations is correct.)




Fig.  27E-29 


d. Show that the nodes of the magnetic field do not
coincide with those of the electric field.

e.  Compare  the  time  dependences  of  the  total  electric 
and  magnetic  fields. 

27-27. Applying the Poynting vector. A plane-parallel
vacuum capacitor is being charged. The plates are circular
and of radius r. At a certain instant the magnitude of the
electric field between the plates is $\mathcal{E}$.

a. Express in terms of $\mathcal{E}$ the magnitude and direction
of the electric field on the three parts of a closed surface
having the form of a circular cylinder in the region
between the plates of radius slightly less than their radius
and of length slightly less than their separation. (Ignore
"edge effects." That is, assume the electric field terminates
abruptly just outside the capacitor and is uniform
everywhere inside it.)

b. Evaluate the magnitude and direction of the
magnetic field on the three parts of the cylinder in terms of $\mathcal{E}$.

c. Evaluate the Poynting vector on the three parts of
the cylinder.

d. Calculate the rate at which energy is flowing
through the closed surface formed by the three parts of
the cylinder.

e.  Show  that  the  value  obtained  in  part  d equals  the 
rate  of  change  of  the  energy  content  of  the  electric  field  in 
the  volume  enclosed  by  the  surface. 

27-28. What a versatile vector! The Poynting vector
$\mathbf{S} = \boldsymbol{\mathcal{E}} \times \boldsymbol{\mathcal{B}}/\mu_0$ can be used to calculate the flux of energy
carried in a unit time across a unit area by a combination
of electric and magnetic fields $\boldsymbol{\mathcal{E}}$ and $\boldsymbol{\mathcal{B}}$, even when $\boldsymbol{\mathcal{E}}$ and
$\boldsymbol{\mathcal{B}}$ are steady fields. A cylindrical piece of resistance wire of
resistivity $\rho$, radius r, and length l is carrying current i.

a. Determine $\boldsymbol{\mathcal{E}}$ and $\boldsymbol{\mathcal{B}}$ at the surface of the wire, and
then show that at the surface S is everywhere directed
normal to the surface and into the wire. Then evaluate its
magnitude S.

b. Multiply S by the total area of the surface of the
wire to find the total energy flowing from the electric and
magnetic fields into the resistance wire in a unit time.
Then use Joule's law to evaluate the energy consumption
in the resistance wire in a unit time. Compare the two
quantities, and comment.

27-29. Power transmitted along a coaxial cable. A length
of coaxial cable has negligible resistance. There is an
external resistance R connected between the inner and outer
cylinders. A battery supplying a difference of potential V
is similarly connected. See Fig. 27E-29.

a. The electric field $\boldsymbol{\mathcal{E}}$ in the region between the
cylinders is in the radial direction $\hat{\mathbf{r}}$ and equals $(K/r)\hat{\mathbf{r}}$, where K
is a constant to be evaluated, and r is the radial coordinate.
Show that $V_a - V_b = V = K \ln(b/a)$ and therefore that
$\boldsymbol{\mathcal{E}} = [V/(r \ln(b/a))]\hat{\mathbf{r}}$.

b. Show that the magnitude of the magnetic field in
the space between the cylinders is $\mathcal{B} = \mu_0 V/(2\pi r R)$.

c. Show that the magnitude of the Poynting vector is
$S = V^2/[2\pi r^2 \ln(b/a) R]$ and that it is directed along the
length of the cable from right to left.

d. By integrating S, show that the power transmitted
through the region between the cylinders from right to
left in the direction of the cable's length is equal to the
power consumed in the resistor.

27-30.  Falling  radiator. 

a.  Find  the  power  emitted  by  an  electron  falling 
freely  in  the  earth’s  gravitational  field. 

b.  For  an  electron  starting  from  rest  and  dropping 
1.0  km  in  the  earth’s  gravitational  field,  find  the  ratio  of 
radiated  energy  to  the  increase  in  kinetic  energy. 

c. Repeat the analysis of parts a and b for an electron
falling 1.0 km from rest near the surface of a neutron
star, for which the gravitational acceleration is $1.0 \times
10^{13}$ m/s².


Group  C 

27-31. A pure capacitor? I. Can a pure capacitor, a
circuit element which exhibits finite capacitance but zero
resistance and zero inductance, really exist? Consider a pair
of circular perfectly conducting plates of radius R held a
distance d apart in vacuum, with $d \ll R$.

a. What is the capacitance C of this "capacitor"?

b. If the current to the capacitor is i, what is the
magnetic field strength $\mathcal{B}$ at a point between the plates, at a
distance r < R from the central axis?

c. Find the magnetic energy density $\rho_m$ and the total
magnetic energy $U_m$ contained in the region between the
plates, 0 < r < R.

d. Although there is additional magnetic field energy
in the region r > R (as well as in and around the wires
which connect the capacitor to the rest of the circuit), the
result of part c is sufficient to indicate the presence of an
effective inductance L' in the capacitor. Putting $L'i^2/2 =
U_m$, find L'.

e. Suppose the circuit is oscillating sinusoidally at
frequency $\omega$, so that the charge q on one capacitor plate obeys
the relation $q = q_0 \cos(\omega t + \delta)$. Find the time-averaged
electric and magnetic energies, $\langle U_e\rangle$ and $\langle U_m\rangle$, stored in
the capacitor. (In evaluating $\langle U_e\rangle$, neglect the contribution
of Faraday induction to the electric field.)

f. Although the nonzero value of $\langle U_m\rangle$ means the
circuit element is not a "pure" capacitor, the inductance
exhibited by the element is negligibly small when
$\langle U_m\rangle \ll \langle U_e\rangle$. Express the ratio $\langle U_m\rangle/\langle U_e\rangle$ in terms of R,
$\omega$, and c, the speed of light.




27-32. A pure capacitor? II.

a. A circular-plate vacuum capacitor with $C = 1.0 \times
10^{-11}$ F is constructed using R = 20d. (See Exercise
27-31.) Determine the spacing and size of the plates.

b. If this capacitor is connected to an inductor with
$L = 1.0 \times 10^{-2}$ H, what is the angular frequency of
oscillation, $\omega_0$, of the LC circuit? Assume the circuit to be
resistance-free.

c. What is the effective inductance L' of the
capacitor? (See Exercise 27-31d.)

d. If the circuit is oscillating at frequency $\omega_0$, what is
the ratio $\omega_0 R/c$, where c is the speed of light? What is the
ratio $\langle U_m\rangle/\langle U_e\rangle$ of time-averaged magnetic and electric
field energies in the capacitor? (See Exercise 27-31f.)

27-33. Keeping current, I. Consider a region of space
in which there is no net charge density $\rho$ but in which
there is a current density given by $\mathbf{j} = j(x, t)\hat{\mathbf{y}}$. Make
whatever modifications in the development of Sec. 27-3 are
required to describe a plane wave traveling in the positive x
direction with its electric field polarized in the y direction.

27-34. Keeping current, II. Consider a region of space
in which there is no net charge density $\rho$ but in which
there is a current density given by $\mathbf{j} = j_0 \cos(k_0 x - \omega_0 t)\hat{\mathbf{y}}$,
with $k_0^2 > \omega_0^2/c^2$.

a. Show that the following functions describing a
plane wave satisfy the equations obtained in Exercise
27-33:

$$\mathcal{E}_y(x, t) = -\frac{\mu_0\omega_0 j_0}{k_0^2 - \omega_0^2/c^2}\sin(k_0 x - \omega_0 t)$$

$$\mathcal{B}_z(x, t) = \frac{k_0}{\omega_0}\,\mathcal{E}_y(x, t)$$

b. Find the wavelength and propagation speed of the
plane wave.

27-35. Linear and circular polarization. Equations
(27-20) and (27-22) specify the electric and magnetic fields
of a sinusoidal plane wave which is propagating in the
positive x direction with its electric field polarized in the y
direction.

a. Show that the following equations specify the
fields in a sinusoidal plane wave of the same wavelength
and propagation direction, but polarized in the z
direction:

$$\mathcal{E}_z(x, t) = A' \cos[(2\pi/\lambda)(x - ct) + \delta]$$

$$\mathcal{B}_y(x, t) = -\mathcal{E}_z(x, t)/c$$

b. Since Maxwell’s equations are linear in ℰ and ℬ,
any linear combination of solutions is itself a solution.
Describe carefully the solutions obtained by each of the
following superpositions [A is the amplitude appearing in
Eq. (27-20)]: (i) A′ = A and δ = 0; (ii) A′ = A and δ = π;
(iii) A′ = A and δ = −π/2; (iv) A′ = A and δ = π/2. Hint:
Superpositions (i) and (ii) yield what are called linearly
polarized waves; superpositions (iii) and (iv) yield what
are called circularly polarized waves.


27-36. The phase difference makes no difference. Redo
the calculation leading to Eq. (27-32), ρ_momentum =
ρ_energy/c, assuming that there is a phase difference δ
between the oscillating electron velocity v and the oscillating
electric force F_e.

27-37. Alternative derivation of E = mc². Equation
(27-32) shows that electromagnetic radiation with energy
content E will have momentum content E/c. By using this
fact, Einstein was able to derive the relation E = mc² from
the following thought experiment. An initially stationary,
isolated box of length L and mass M has mass M/2
concentrated at each end, as in Fig. 27E-37. Its center of mass
must remain stationary regardless of what happens inside
the box. Suppose the left end emits a pulse of radiation
traveling to the right of energy E and momentum E/c.
The box must recoil to the left at speed v so that the total
momentum of the isolated system remains zero. The pulse
travels to the right end of the box with speed c. While the
pulse is in flight, the box moves to the left a distance d.
When the pulse strikes the right end of the box it is
absorbed, and the box stops moving in order for the system
to maintain zero momentum. Since the box moved to the
left, the only way for the center of mass to remain in the
same place is for the absorbed pulse of radiation to have
mass m.

Fig.  27E-37 




a. By considering the situation when the pulse is in
flight, show that E/c − (M − m)v = 0.

b. By considering the situation both before the pulse
is emitted and also after it is absorbed, show that
(M/2 + m)(L/2 − d) − (M/2 − m)(L/2 + d) = 0.

c. Letting t be the time of flight of the pulse, show
that d = vt = v(L − d)/c.

d. Eliminate d from the equations obtained in parts b
and c.

e. Eliminate v from the equations obtained in parts a
and d. Then solve for E to produce E = mc².

27-38.  Thomson  scattering.  At  t = 0,  a free  electron  is 
at  rest  at  the  origin,  but  it  is  subject  to  the  effects  of  the 
sinusoidal  polarized  plane  wave  given  by  Eq.  (27-20). 

a. Neglecting the change in position of the electron
as a result of the action of the plane wave, find the
acceleration a of the electron. (Ignore the effect of the wave’s
magnetic field.)

b. Use Eq. (27-41) and the result of part a to
determine the average power ⟨P⟩ radiated by the electron.




c. If the principle of the conservation of energy is to
be obeyed, the energy radiated by the electron must be
offset by a loss of energy from the plane wave. Show that
in terms of the amount of energy taken from the plane
wave, the electron effectively blocks a cross-sectional area
σ_T given by σ_T = (8π/3)(e²/4πε₀m_ec²)².

d. The removal and reradiation of energy from the
incident wave, as found above, is called Thomson
scattering, and the cross-sectional area σ_T is called the
Thomson cross section. Evaluate σ_T.
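
The arithmetic called for in part d can be sketched in a few lines of Python. This is only an illustrative check, not part of the exercise as printed: the constant values below are rounded figures supplied here as assumptions, and the function name is hypothetical.

```python
import math

# Physical constants in SI units (rounded values, assumed for illustration).
E_CHARGE = 1.602e-19      # elementary charge e, in C
EPSILON_0 = 8.854e-12     # permittivity of free space, in C^2/(N m^2)
M_ELECTRON = 9.109e-31    # electron mass m_e, in kg
C_LIGHT = 2.998e8         # speed of light c, in m/s


def thomson_cross_section():
    """Return sigma_T = (8*pi/3) * (e^2 / (4*pi*eps0*m_e*c^2))^2 in m^2."""
    # The quantity in parentheses has the dimensions of a length.
    length = E_CHARGE**2 / (4 * math.pi * EPSILON_0 * M_ELECTRON * C_LIGHT**2)
    return (8 * math.pi / 3) * length**2


sigma_t = thomson_cross_section()
print(f"sigma_T = {sigma_t:.3e} m^2")
```

With these rounded constants the result comes out close to 6.65 × 10⁻²⁹ m², so the electron presents a very small effective target to the wave.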

Numerical 


27-39. Using numerical integration. Use the numerical
integration program to evaluate the integral

$$\int_0^{\pi} \frac{\cos^2\left(\frac{\pi}{2}\cos\theta\right)}{\sin\theta}\, d\theta$$


which  determines  the  radiation  resistance  of  a half-wave 
dipole  transmitting  antenna.  Compare  your  result  with 
the  value  quoted  above  Eq.  (27-44a). 
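
A minimal sketch of such a numerical integration is given below. It is a stand-in for the numerical integration program the exercise refers to, not that program itself. It assumes the integrand has the standard half-wave dipole form cos²[(π/2) cos θ]/sin θ and that the integration runs from θ = 0 to θ = π; the factor π/2, the limits, and the function names are assumptions made here.

```python
import math


def dipole_integrand(theta):
    """Integrand cos^2((pi/2) cos(theta)) / sin(theta) for a half-wave dipole."""
    return math.cos(0.5 * math.pi * math.cos(theta)) ** 2 / math.sin(theta)


def midpoint_integral(func, lower, upper, steps=20000):
    """Midpoint rule. It never samples the endpoints, where sin(theta) = 0."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))


integral = midpoint_integral(dipole_integrand, 0.0, math.pi)
print(f"integral = {integral:.4f}")
```

The midpoint rule is chosen because the integrand is a 0/0 form at both limits. The numerator vanishes faster than the denominator there, so the integrand tends to zero and the integral is well behaved.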

27-40. Solving a transcendental equation. An airplane is
flying horizontally at height H above the ground along a
straight course that will carry it directly over a vertical
half-wave dipole radio beacon transmitting antenna
located on the ground. Show that the energy flux at the
position of the airplane passes through a maximum when
the airplane is over a position that is a certain distance D
from the antenna. Solve the resulting transcendental
equation (here an equation involving trigonometric
functions) numerically (by a trial and error procedure) to
determine D in terms of H with an accuracy of 1 percent.




Wave  Optics 


28-1  HUYGENS’ 
CONSTRUCTION 


Our world is bombarded by electromagnetic radiation, coming principally
from the sun. Most of this radiation has wavelengths in a range extending
from about 4 × 10⁻⁷ m to about 7 × 10⁻⁷ m. Through the process of
evolution, our eyes have developed their maximum sensitivity in the same range
of wavelengths. Thus electromagnetic radiation in this wavelength range is
the visible radiation we call light.

In 1800 the British astronomer William Herschel used thermometers to detect
radiation from the sun with wavelengths somewhat longer than that of the red
light at the long-wavelength end of the visible spectrum. Radiation in this
wavelength range is now called infrared radiation. A little later ultraviolet solar
radiation was detected for the first time. Its wavelengths are somewhat shorter than that
of the violet light found at the short-wavelength end of the visible spectrum.
Almost all the discussion of light in this chapter is applicable also to infrared and
ultraviolet radiation. In fact, a large part of the discussion applies to a much wider
range of the electromagnetic spectrum.

The identification of light as a form of electromagnetic radiation,
made in Chap. 27, hinges on Maxwell's theoretical demonstration that
electromagnetic radiation propagates as a transverse wave at a speed which has
the same value as the measured speed of light. This theory was supported
by the experiments of Hertz. Hertz generated radio waves, in a way
suggested by Maxwell’s theory, and showed that they could be manipulated
experimentally by using techniques completely analogous to those long
used for manipulating light. Such manipulations have to do with the
phenomena called reflection, refraction, and diffraction.

The phenomena just mentioned involve the way light moves from
emitter to detector. They are best understood in terms of the wavelike
nature of light. But the qualifying term “like” in the word “wavelike” must be
stressed. Light is light. Insofar as it behaves in ways analogous to
disturbances traveling along a stretched string or across a surface of water, it is
fair to say that light is “like” these mechanical wave phenomena. But there
are also ways in which light is profoundly different from waves in
mechanical systems. These aspects of the behavior of light, which involve the way
light interacts with matter as it is emitted or detected, are not describable in
terms of waves. They are considered in Chap. 30.

In this chapter we investigate the wavelike aspect of the nature of light.
The topic is called wave optics. (The word “optics” means the scientific
study of light.) One procedure for carrying out such an investigation
involves using the electromagnetic wave equation. This is done by finding
solutions to the partial differential equation which fit the particular
conditions of each optical system of interest. But there are very few systems so
simple that the required solutions have the simplicity of the waves, with
plane wave fronts of indefinite extent, treated in Chap. 27. For most
systems, making an investigation of the wavelike nature of light by a direct
application of the electromagnetic wave equation is a very complicated
matter.


Huygens’ construction provides a practical procedure for studying
the way light waves travel through an optical system. It is a universally
applicable geometrical construction which leads to the same predictions
concerning the motion of light waves as those obtained by finding solutions to
the electromagnetic wave equation. But usually it is much easier to carry
out Huygens’ construction than to solve the electromagnetic wave equation.
To use Huygens’ construction, you must know the shape and location of a
wave front at some initial time t = 0, the direction in which it is moving, and
the speed of its motion. The construction allows you to determine from this
information the shape and location of the wave front at a subsequent time. It
involves the following steps, which we will first state and then illustrate by
example.

1. Pick a set of closely spaced points on the initial wave front. Consider
each of these points to be a source of a little wave pulse which is emitted at
the initial time t = 0. These little wave pulses spread away from the source
in a limited range of directions spanning the direction of motion of the
initial wave front at the source. And they travel at a speed equal to the speed
of the wave front. We call each of these little wave pulses a wavelet and its
source a sourcelet.

2. Construct the wavelets at time t = T, where T is a short time
interval. Each wavelet is part of a sphere centered on its sourcelet and having
a radius equal to the speed of the initial wave front multiplied by the time
interval T. (If the wave front belongs to a sinusoidal wave, then usually it is
convenient to let T be its period. This makes the radii of the spheres equal
to the distance traveled by the sinusoidal wave in one period, which is its
wavelength λ.)

3.  Construct  the  surface  tangent  to  all  the  wavelets.  This  common 
tangential  surface  gives  the  shape  of  the  wave  front  at  time  t = T. 
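
The three steps can be sketched numerically for the simplest case, a plane wave front seen in cross section. The Python sketch below is only an illustration under stated assumptions: the speed and time interval are arbitrary illustrative values, the sourcelets are placed on the line x = 0, and the function name `envelope_x` is hypothetical. The common tangent surface is approximated by the farthest forward reach of any wavelet at a given height y.

```python
import math


def envelope_x(y, sourcelet_ys, radius):
    """Largest x reached at height y by wavelets of the given radius centered
    on sourcelets at (0, y_k). For closely spaced sourcelets this approximates
    the common tangent surface of step 3."""
    reach = 0.0
    for y_k in sourcelet_ys:
        offset = y - y_k
        if abs(offset) <= radius:
            reach = max(reach, math.sqrt(radius**2 - offset**2))
    return reach


speed = 3.0e8          # wave speed in m/s (illustrative value)
interval = 2.0e-15     # time interval T in s (illustrative value)
radius = speed * interval   # step 2: wavelet radius = speed * T

# Step 1: closely spaced sourcelets along the initial wave front x = 0.
spacing = radius / 100.0
sourcelet_ys = [k * spacing for k in range(-500, 501)]

# Step 3: sample the envelope at a few heights well inside the modeled region.
for y in (-1.23 * radius, 0.004 * radius, 2.5 * radius):
    x_new = envelope_x(y, sourcelet_ys, radius)
    print(f"y = {y:+.2e} m -> new front at x = {x_new:.4e} m")
print(f"speed * T = {radius:.4e} m")
```

At every sampled height the new front lies at very nearly the same distance, speed × T, from the initial one, which is what a plane front advancing without change of shape requires. The small discrepancy shrinks as the sourcelet spacing is reduced.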

One justification of Huygens’ construction is that it describes correctly
the way light waves move. (In fact, it describes correctly the motion of
waves of any nature, mechanical as well as electromagnetic.) We will show
that Huygens’ construction agrees with observed motions by applying it to
several simple cases.

Fig. 28-1 Huygens’ construction for a
plane wave front propagating to the
right. The plane wave front is of indefinite
extent, but only a limited part can be
shown in the figure. (a) A set of sourcelets
is distributed along the wave front
at t = 0. (b) Wavelets are constructed.
Each is centered on its sourcelet, and
has a radius equal to the speed of the
wave multiplied by the time interval T.
(c) The surface tangent to the wavelets
is constructed. It gives the shape and
location of the wave front at t = T.

Consider first the particularly simple case of a plane wave front which
is moving to the right. Its shape and location at time t = 0 are indicated in
Fig. 28-1a by a line. This line represents the intersection with the plane of
the page of a wave front extending indefinitely in the direction normal
to the page and indefinitely in the direction along the line. The points
represent sourcelets distributed over the wave front. They are supposed to
extend in the direction normal to the page, as well as in the direction
along the line, but this cannot be shown in the figure. In Fig. 28-1b, the
circular arcs represent spherical wavelets emanating from the sourcelets in
the general direction of motion of the wave front. Their radii equal the
distance traveled by the wave in a time interval T. The surface tangent to all
the wavelets is indicated by the solid line in Fig. 28-1c. This common
tangential surface gives the shape and location of the wave front at time t =
T. It is a plane located to the right of the initial wave front at a distance
equal to its speed multiplied by T.

Huygens’ construction predicts that as time proceeds, a wave front
which initially is in the form of a plane of indefinite extent traveling at a
certain speed in a certain direction normal to the plane will continue to
have the form of a plane of indefinite extent traveling at that speed and in
that direction. This statement is in complete agreement with what the
electromagnetic wave equation has to say about light in the form of plane
waves, that is, light waves having infinite plane wave fronts. (And the wave
equation for sound waves says exactly the same thing about sound waves.)

Consider next a case in which a plane wave of indefinite extent is
moving to the left. The pertinent Huygens construction is shown in Fig.
28-2. It is like a condensed version of Fig. 28-1, except for the fact that it
uses wavelets emanating to the left of their sourcelets since that is the
direction of motion of the wave front. Note that Huygens’ construction allows a
plane wave front to propagate in either of the two directions that are
normal to the plane. This agrees with the fact that in nature propagation
can take place in either direction. Which direction is the proper one in a
particular case depends on the direction of emission of the wave front from
its source. The information must be supplied to the Huygens construction.
But the situation is the same when the wave equation is used. In Sec. 27-4
we found that plane wave solutions to the electromagnetic wave equation
must describe wave fronts having a certain speed, but are not restricted as to
the velocity of the wave front. That is, the wave equation does not determine
in which of the two directions normal to its plane the wave front moves.

Figure 28-3 is a Huygens construction for an outward-moving
spherical wave front, whose center lies on the page. Its initial shape and position
are represented by the inner circle that is its intersection with the page. Its
shape and position after a certain time interval are represented similarly by
the outer circle. The wave front remains spherical, but its radius increases
by an amount equal to the product of its speed and the time interval. A
three-dimensional form of the electromagnetic wave equation has solutions
which describe expanding spherical wave fronts. But obtaining these
solutions involves the use of rather sophisticated mathematical techniques.
(Both statements apply to the wave equation for sound.)
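
The claim that the radius grows by the product of speed and time interval can be checked with a short numerical sketch of the construction in the plane of the page. This is an illustration only: the radii are arbitrary illustrative numbers, the function name is hypothetical, and the number of sourcelets is an assumed value chosen to make them closely spaced.

```python
import math


def outer_envelope_radius(direction, initial_radius, wavelet_radius,
                          n_sourcelets=3600):
    """Distance from the center, along a ray at angle `direction` (radians),
    to the outermost point reached by wavelets whose sourcelets lie on a
    circle of radius `initial_radius`."""
    farthest = 0.0
    for k in range(n_sourcelets):
        # Angle between the ray and the line from the center to sourcelet k.
        alpha = 2 * math.pi * k / n_sourcelets - direction
        miss = initial_radius * math.sin(alpha)   # ray's distance from sourcelet
        if abs(miss) <= wavelet_radius:
            along = initial_radius * math.cos(alpha)
            reach = along + math.sqrt(wavelet_radius**2 - miss**2)
            farthest = max(farthest, reach)
    return farthest


initial_radius = 2.0    # radius of the initial wave front (arbitrary units)
wavelet_radius = 0.5    # speed multiplied by the time interval (same units)

for direction in (0.3, 1.7, 4.0):
    r_new = outer_envelope_radius(direction, initial_radius, wavelet_radius)
    print(f"direction {direction:.1f} rad: new radius = {r_new:.5f}")
```

In every direction the outer envelope lies at very nearly 2.0 + 0.5 = 2.5 units from the center, so the new front is again a circle (a sphere in three dimensions) whose radius has grown by the wavelet radius.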

There are also solutions to the electromagnetic wave equation which
describe contracting spherical wave fronts. You should sketch a Huygens
construction for such a wave front. An example of contracting wave fronts
of light that are parts of spheres is found when plane wave fronts are
reflected from a parabolic mirror toward a point at the “focus” of the mirror.
We analyze this process in Sec. 28-2.



Fig.  28-2  Huygens’  construction  for  a 
plane  wave  front  of  indefinite  extent 
propagating  to  the  left. 


Fig. 28-3 Huygens’ construction for
an expanding spherical wave front. The
figure shows only the intersection of the
wave front with the page, and the
intersections with the page of those wavelets
whose sourcelets lie in the page.


The procedure for carrying out Huygens’ construction was stated first
by Christian Huygens in 1678, long before the development of wave
equations. He justified the construction by showing that it correctly
described the behavior of wave fronts in simple cases, just as we have done
in this section. Then he used it to explain the phenomena of reflection and
refraction, just as we will do in Secs. 28-2 and 28-4. In 1820 Augustin Fresnel
modified the procedure slightly into a form equivalent to the one we have
given (by specifying that wavelets propagate from their sourcelets in a
limited range of directions spanning the direction of propagation of the initial
wave front, instead of in all directions). In this form, Huygens’ construction
is able to describe a key property of the phenomenon of diffraction, as we
will see in Sec. 28-8.

An understanding of why Huygens’ construction leads to correct
descriptions of the behavior of wave fronts can be found in two
considerations. First, treating each point on a wave front as a sourcelet which emits
a spreading wavelet certainly is consistent with the behavior of waves
illustrated in Fig. 28-4. The figure is a ripple-tank photo. (See the caption to
Fig. 12-16.) It shows ripples, in the form of straight wave fronts, which
move to the right across the surface of a tank of water. The wave fronts
strike a barrier pierced by a very narrow slit. (The separation between
adjacent wave fronts is a wavelength of the wave, so the photo shows that the slit
width is less than a wavelength.) Only the part of a wave front intercepted
by the slit can continue; the rest is absorbed by the barrier. The apparatus
comes near to making it possible to watch a single wavelet emitted by a
single sourcelet at the intersection of a wave front and the slit. The wavelet
demonstrates just the behavior it is assumed to have in Huygens’
construction. It spreads.


Fig. 28-4 A ripple-tank photograph used at this point in the text to
demonstrate the behavior of a sourcelet. The same photograph is used
later to demonstrate the simplest case of the phenomenon of diffraction.
Both uses depend on the fact that the straight wave fronts moving to
the right impinge on a barrier containing a slit which is very narrow.
Specifically, the slit width (the width of the gap in the vertical black
obstacle) is less than the wavelength of the waves (the separation between
adjacent vertical bright stripes). The caption of Fig. 12-16 explains the
operation of a ripple tank, and the procedure used to obtain a ripple-tank
photograph. Ripple-tank photographs are presented throughout
this chapter because they demonstrate phenomena encountered in wave
optics in such a way that individual wave fronts can be inspected. Water
waves in ripple tanks move in two dimensions, while light waves in space
move in three dimensions. But a light wave with plane wave fronts
corresponds to a water wave with linear wave fronts. In general, the water
wave in a ripple tank can be thought of as representing a cross-sectional
view of an analogous light wave.




Fig.  28-5  A set  of  wavelets  emitted  by 
sourcelets  distributed  along  a wave 
front.  The  open  dots  and  the  crosses 
show  points  at  which  wavelets  superpose 
constructively. 


28-2  REFLECTION 


Second, the common tangential surface in a Huygens construction is
just the surface on which there is constructive superposition of all the
wavelets spreading away from the sourcelets distributed along a wave front. A
plane wave front traveling to the right and a set of wavelets are represented
in Fig. 28-5. All the sourcelets on the wave front emit their wavelets at the
same time, and all the wavelets spread away from their sourcelets with the
same speed. The open dots in Fig. 28-5 are points in space where a pair of
wavelets from sourcelets that are next to each other superpose
constructively because they arrive at the same time. Now imagine the spacing
between sourcelets to decrease. This causes the open dots to fuse into the
plane that is the common tangential surface. The crosses in the figure are
the points where wavelets from next-but-one sourcelets superpose
constructively. These also fuse into the common tangential surface as the
spacing between the sourcelets decreases. The same is true for next-but-two
sourcelets, next-but-three sourcelets, and so forth. Thus the wave front
specified by the common tangential surface is formed by the constructive
superposition of wavelets emitted by sourcelets distributed continuously
along the initial wave front.
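
The way the open dots and crosses fuse into the tangent plane can be illustrated with a little geometry. Two wavelets of equal radius r whose sourcelets are a distance s apart on the initial wave front cross at a distance √(r² − (s/2)²) ahead of that front. The Python sketch below evaluates this for shrinking sourcelet spacing; the numerical values and the function name are assumptions chosen only for illustration.

```python
import math


def crossing_x(separation, radius):
    """Distance ahead of the initial wave front at which two wavelets of equal
    radius cross, for sourcelets `separation` apart on that front."""
    return math.sqrt(radius**2 - (separation / 2) ** 2)


radius = 1.0  # wavelet radius, in arbitrary units
for spacing in (0.5, 0.1, 0.02, 0.004):
    open_dot = crossing_x(spacing, radius)       # neighboring sourcelets
    cross = crossing_x(2 * spacing, radius)      # next-but-one sourcelets
    print(f"spacing {spacing:5.3f}: open dot at {open_dot:.6f}, "
          f"cross at {cross:.6f}")
```

As the spacing decreases, both sets of crossing points approach the distance r, which is the location of the common tangential plane.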

You have seen that Huygens’ construction describes correctly the
behavior of waves in several simple cases, and you now have an
understanding of why it does so. Next you will see it applied to more complicated
cases, starting with those involving reflection. You will find that such
applications require making minor changes in the construction. They are
explained as they are introduced.


When a beam of light strikes a flat, shiny surface called a plane mirror, it is
reflected. The photograph in Fig. 28-6a shows that there is symmetry in the
reflection process. The symmetry is described in terms of angles defined in
Fig. 28-6b. One is the angle θ between the incident beam and a line normal
to the surface of the mirror, called the angle of incidence. The other is the
angle θ′ between the reflected beam and the normal line, called the angle
of reflection. As the photo shows,

$$\theta = \theta'$$  (28-1)

This is the law of reflection: The angle of incidence equals the angle of reflection.
We will demonstrate that the law of reflection is predicted by an
appropriate Huygens construction.

Figure 28-7a depicts at time t = 0 a light beam striking a mirror that
lies in a plane perpendicular to the plane of the page. The light is in the
form of a sinusoidal wave of period T and wavelength λ. The figure shows
parts of three consecutive plane wave fronts separated by distances equal to
λ. (The wave fronts actually extend far beyond the parts that can be
shown.) Since the propagation direction of the light beam is normal to the
planes of the wave fronts, the angle of incidence θ is measured between a
line along this direction and a line in a direction normal to the plane of the
mirror. The part of the first wave front being considered has just
intercepted the mirror at the point labeled i, and its reflection has just begun.

Figure 28-7b shows the parts of the wave fronts under consideration at
time t = T. In the interval since t = 0 the wave fronts have advanced a
distance equal to λ. Thus the first wave front now intercepts the mirror at the
point labeled j, whereas the next one intercepts it at point i. Also shown is a
wavelet of radius λ centered on a sourcelet at i. It was emitted from that
sourcelet when the first wave front arrived at t = 0. In this case involving
reflection, the wavelet does not spread in a range of directions extending
over the direction of motion of the incident wave front, because that would
take it into the mirror. Instead, it spreads away from the mirror. The
figure indicates that half of the first wave front segment has been reflected.
The reflected part is a surface tangent to the wavelet and to a “wavelet of
zero radius” that has just been emitted from a sourcelet at j where the first
wave front has just arrived. (Wavelets emitted from sourcelets between i
and j are not shown. They would all be tangent to the reflected part of the
wave front segment.)

Fig. 28-6 (a) A light beam reflected from the surface of a mirror obeys the law of reflection:
The angle of incidence θ equals the angle of reflection θ′. (From PSSC Physics, 2d ed., D. C.
Heath, Boston, 1965. Courtesy Education Development Corporation.) (b) Definition of the angle of
incidence θ and the angle of reflection θ′.

Fig. 28-7 Huygens’ construction used to derive the
law of reflection.

Fig. 28-8 A ripple-tank photograph
showing ripples with linear wave fronts
moving up a straight barrier at a certain
angle of incidence. They are reflected at
an angle of reflection equal to the angle
of incidence. (From PSSC Physics, 2d ed.,
D. C. Heath, Boston, 1965. Courtesy
Education Development Corporation.)

In Fig. 28-7c wavelets involved in the reflection of the first wave front
segment are illustrated at time t = 2T. Now the wavelet emitted from
sourcelet i at the arrival of the first wave front has radius 2λ, and the
wavelet emitted from sourcelet j at the arrival of that wave front has radius λ.
Also, there is a wavelet of zero radius that has just been emitted from
sourcelet k on account of the arrival of the first wave front. The first wave front
segment has been completely reflected and forms the surface tangent to
the three wavelets. Half of the second wave front segment has been
reflected, but wavelets involved in the process are not shown.

In Fig. 28-7d wavelets to which the reflected first wave front segment
is tangent are depicted at time t = 3T. Also depicted are the fully
reflected second wave front segment and the half-reflected third wave front
segment.

Figure 28-7e illustrates the three wave front segments at time t = 4T,
when all have been fully reflected.

The angle of reflection θ′ is shown in Fig. 28-7c as the angle between
the normal to the mirror and the normal to the reflected wave front. To see
that it must be equal to the angle of incidence θ, consider the right triangle
jlm and the right triangle jnm. Since the side jl of one and the side jn of the
other are both of length λ and since they share the side jm, the triangles are
identical (congruent). Therefore their corresponding angles θ and θ′ are
equal, in agreement with the law of reflection.
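
The same conclusion can be reached numerically from the wavelets themselves. The sketch below is an illustration under stated assumptions, not the construction of Fig. 28-7: the mirror is taken to be the line y = 0, the function name is hypothetical, and only two wavelets are used, the one emitted first and the one of zero radius just being emitted. An incident front advancing a distance λ per period sweeps a distance λ/sin θ along the mirror in that time, and the reflected front is the line tangent to both wavelets.

```python
import math


def reflection_angle_from_wavelets(incidence_deg, wavelength=1.0, periods=2):
    """Angle of reflection in degrees, from the common tangent to two Huygens
    wavelets emitted at a mirror lying along y = 0 (0 < incidence < 90)."""
    theta = math.radians(incidence_deg)
    # Distance the incident front sweeps along the mirror in one period.
    step_along_mirror = wavelength / math.sin(theta)
    # Wavelet emitted at t = 0 has grown for `periods` periods.
    x_first, r_first = 0.0, periods * wavelength
    # Wavelet of zero radius, just being emitted where the front now arrives.
    x_last, r_last = periods * step_along_mirror, 0.0
    # The common tangent makes an angle beta with the mirror, where
    # sin(beta) = (difference of radii) / (separation of centers).
    beta = math.asin((r_first - r_last) / (x_last - x_first))
    # The normal to the reflected front makes the same angle with the
    # normal to the mirror, so beta is the angle of reflection.
    return math.degrees(beta)


for angle in (10.0, 35.0, 60.0, 80.0):
    print(f"incidence {angle:5.1f} deg -> reflection "
          f"{reflection_angle_from_wavelets(angle):.6f} deg")
```

For each angle of incidence tried, the angle of reflection obtained from the tangent agrees with it, independent of the wavelength and of the number of periods chosen.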

Figure  28-8  is  a ripple-tank  photo  which  makes  incident  and  reflected 
wave  fronts  visible.  Note  that  the  law  of  reflection  is  obeyed  because  the 
normal  to  the  reflecting  surface  forms  equal  angles  with  the  normals  to  the 
incident  and  reflected  wave  fronts. 

Mirrors are produced with shapes other than that of a plane. One of
the most useful is the parabolic mirror. The surface of such a mirror forms
the concave side of a paraboloid of revolution. In other words, it is on the
concave side of the surface formed by rotating a parabola about the line
relative to which the parabola is symmetric. Figure 28-9 shows a cross
section of a parabolic mirror, with the plane of the page containing the
symmetry axis of the paraboloid. This axis is used in the figure as an x axis, and
a y axis is constructed perpendicular to the x axis at its intersection with
the mirror. The shape of the cross-sectional curve can be specified in terms
of the x and y coordinates by the equation of a parabola:

$$y^2 = 4fx$$  (28-2)

The constant f in this equation is the distance from the coordinate origin at
x = 0, y = 0 to the point at x = f, y = 0 shown in the figure. This point is
called the focus of the parabola. Also illustrated is a line parallel to the y
axis and passing through the point at x = −f, y = 0, which is called the
directrix of the parabola.

Parabolic mirrors are useful because of their focusing property. That
is, if a beam of light with plane wave fronts strikes a parabolic mirror while
traveling in a direction along its symmetry axis, all the light intercepted by
the mirror will be reflected in such a way that it converges to the focus of
the parabola. This important property is employed in reflecting telescopes,
as you will see in Chap. 29.

Fig. 28-9 The parabola y² = 4fx,
plotted for the case where the constant
f has the numerical value f = 1. The
point x = f, y = 0 is the focus of the
parabola. The straight line x = −f, for
all y, is the directrix of the parabola.
The geometrical significance of the
focus and the directrix will be
explained shortly.

The focusing property of a parabolic mirror is illustrated and explained
by the Huygens construction of Fig. 28-10. Plane wave fronts travel
toward the mirror, moving in a direction parallel to its symmetry axis. The
parts of two of these wave fronts which will be intercepted by the mirror
are shown. They are chosen to have a separation equal to the wavelength λ
of the sinusoidal light wave. Also shown are a set of wavelets and their
common tangent surface forming one of the reflected wave fronts, as in
Fig. 28-7d. The wavelets of radii 5λ were emitted upon the arrival of the
incident wave front at sourcelets i, those of radii 4λ upon its arrival at
sourcelets j, those of radii 3λ upon its arrival at sourcelets k, and those
of radii 2λ upon its arrival at sourcelets l; the wavelet of radius λ was
emitted upon the arrival of the wave front at sourcelet m. This accurate
construction makes it evident that the reflected wave front is part of a
sphere converging on the focus of the parabola, at f. You should confirm
this by constructing on the figure the set of wavelets whose common
tangential surface defines the next reflected wave front.

Formal  proof  that  the  point  to  which  the  reflected  wave  fronts  converge  is  at 
the  focus  of  the  parabola  depends  on  the  fact  that  every  location  on  a parabola  is 
equidistant  from  its  focal  point  and  its  directrix.  [This  is  the  fundamental  defini- 




Fig. 28-10 A Huygens construction which indicates that plane wave
fronts approaching a parabolic mirror in a direction along its symmetry
axis are reflected into spherical wave fronts converging to the
focus of the parabola.


tion  of  a parabola.  If  you  are  unfamiliar  with  it,  you  can  verify  it  geometrically 
by  making  measurements  on  Fig.  28-9  or  algebraically  by  using  Eq.  (28-2).]  Figure 
28-11  shows  an  incident  wave  front,  which  is  part  of  a plane,  and  a reflected  wave 
front,  which  is  part  of  a sphere  of  very  small  radius  centered  on  the  point  of  con- 
vergence C.  Also  shown  is  one  of  the  wavelets  that  form  the  reflected  wave  front.  It 
was emitted from a sourcelet at position B. The time elapsed between the instant
when the wavelet reaches the position shown in the figure and the instant when
the incident wave front giving rise to it was in the position shown is determined
by the distance AB, which that wave front traveled to strike the sourcelet at B,
plus the distance BC, which the wavelet traveled from the sourcelet at B. Now the
reflected wave front is made up of many wavelets, all of which must arrive
simultaneously at the wave front. Hence the distance AB + BC must be independent of
the choice of B. It will be if C is at the focus of the parabola. To see that this is so,
consider point D at the intersection of the directrix and the continuation of the
line from A to B. By the definition of a parabola, BC = BD. Thus AB + BC =
AB + BD. Since the plane of the incident wave front is parallel to the plane
containing the directrix, AB + BD is independent of the choice of B, as required,
providing C is at the focus of the parabola, as stated.
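
The argument above can also be checked numerically. The following is a minimal sketch, not part of the original proof: it assumes a mirror with f = 1 and an incident wave front lying in the plane x = 5 (both values, and the names `path_length`, `x_front`, `heights`, are chosen here only for illustration), and it evaluates AB + BC for several sourcelet positions B on the parabola.

```python
import math

def path_length(y, f=1.0, x_front=5.0):
    """Return AB + BC for a sourcelet B at height y on the mirror y^2 = 4 f x.

    A lies on the plane wave front x = x_front (an assumed position),
    B lies on the parabola, and C is taken to be the focus (f, 0).
    """
    x_b = y**2 / (4.0 * f)        # x coordinate of B, from Eq. (28-2)
    ab = x_front - x_b            # the wave front travels parallel to the x axis
    bc = math.hypot(x_b - f, y)   # distance from B to the focus
    return ab + bc

heights = [-2.0, -1.0, 0.0, 0.5, 1.5, 2.0]
lengths = [path_length(y) for y in heights]
for y, s in zip(heights, lengths):
    print(f"y = {y:5.2f}   AB + BC = {s:.6f}")
```

Every sourcelet gives the same total, x_front + f, which is the distance from the wave front to the directrix. If C is moved off the focus in this sketch, the totals no longer agree.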

Light  waves  (and  all  other  waves)  have  the  property  of  reversibility. 
That  is,  light  can  propagate  with  plane  wave  fronts  traveling  to  the  right 
and,  equally  well,  with  plane  wave  fronts  traveling  to  the  left.  It  can  propa- 
gate with  spherical  wave  fronts  expanding  away  from  a point  and,  just  as 
well,  with  spherical  wave  fronts  converging  toward  the  point.  The  same  re- 
versibility applies  to  the  phenomenon  of  reflection.  If  the  light  source  in 
Fig.  28-6  was  not  visible  in  the  photograph,  then  the  incident  light  could  be 
traveling  from  the  upper  left  and  the  reflected  light  traveling  to  the  upper 




Fig. 28-11 A figure used to prove that the reflected wave fronts actually
converge to the focus of the parabola.


right, or the direction of travel of light through the system could be the
reverse of this. You should make a Huygens construction like the one in Fig.
28-7 in which the incident wave fronts come from the upper right and the
reflected ones go to the upper left, and thereby show that the law of reflection
is obtained nevertheless.

For  reflection  from  a parabolic  mirror,  we  have  assumed  that  plane 
wave  fronts  travel  toward  the  mirror,  in  a direction  parallel  to  the  sym- 
metry axis,  and  have  found  that  the  reflected  wave  fronts  are  then  spheres 
converging to the focus. The reversibility property, applied to this finding,


Fig.  28-12  Ripple-tank  photographs  showing  the 
reflection  of  wave  fronts  emitted  by  a point  source 
and  reflected  from  a parabola-shaped  barrier. 
The  point  source  is  located  at  the  focus  of  the 
parabola.  The  upper  photograph  was  taken  just 
after  the  source  had  executed  several  oscillations 
and  emitted  a circular  wave  containing  several 
consecutive  ripples.  In  the  lower  photograph, 
which  was  taken  a short  time  later,  the  parts  of  the 
circular  ripples  which  have  not  struck  the  mirror 
have  expanded,  but  remain  circular.  The  parts 
which  have  struck  the  mirror  have  been  reflected 
into  straight  ripples.  As  time  continues  to  pass,  the 
straight  ripples  move  to  the  right  along  the  sym- 
metry axis  of  the  parabolic  mirror.  They  consti- 
tute a beam  of  ripples  traveling  along  the  axis. 
(From PSSC Physics, D. C. Heath, Boston, 1965. Courtesy
Education Development Corporation.)




predicts that spherical wave fronts expanding from the focus of a parabolic
mirror will be reflected so as to form plane wave fronts which travel away
from the mirror in a direction parallel to the symmetry axis. The
correctness of the prediction is demonstrated by the ripple-tank photograph
in Fig. 28-12. This attribute of parabolic mirrors is used in flashlights,
automobile headlights, and similar devices. It makes it possible to form a beam
of light from a source, such as a light bulb, from which light diverges into a
wide range of directions. Can you modify the Huygens construction in Fig.
28-10, or the argument concerning Fig. 28-11, so that it applies to wave
fronts diverging from a light source at the focus?

Example  28-1  concerns  a practical  aspect  of  parabolic  mirror  design. 


EXAMPLE  28-1 

You are designing an automobile headlight in which light will be produced by a
filament of length 0.40 cm, placed in a parabolic mirror as shown in Fig. 28-13. You
want the reflected light in the beam to travel within 6.0° of the symmetry axis of the
parabolic mirror used to form the beam. What value must you specify for the
constant f in the equation y^2 = 4fx that determines the shape of the mirror?

■ For the case of a point source, the light waves that are reflected from the
mirror at its intersection with the axis travel to the mirror along the axis and then
are reflected back along the axis. But light waves from the ends of the filament
travel to the intersection of the mirror with the axis along paths that are inclined to
the axis at the angle θ indicated in the figure. After reflection, they travel along
paths which are inclined to the axis at the same angle. The discrepancy between the
paths actually followed and the path followed for the case of a point source is
greatest for these light waves. Justify this statement, so that you can then say the
angle θ in the figure is the angle having the specified value 6.0°.




According to the figure, its value also can be written as

$$\theta = \frac{0.20\ \text{cm}}{f}$$

Solving for f and setting θ = 6.0° ≈ 0.10 rad, you obtain

$$f = \frac{0.20\ \text{cm}}{\theta} = \frac{0.20\ \text{cm}}{0.10} = 2.0\ \text{cm}$$
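
As a rough numerical check of this example, the sketch below evaluates f without rounding the angle. It assumes, as the small-angle expression suggests, that the filament is centered on the focus and perpendicular to the axis, so that tan θ = (0.20 cm)/f; the variable names are illustrative only.

```python
import math

half_length_cm = 0.20               # half of the 0.40 cm filament
theta = math.radians(6.0)           # allowed angle, about 0.105 rad

# Small-angle form used in the example: theta = (0.20 cm) / f
f_small_angle = half_length_cm / theta

# Without the small-angle approximation (assumed geometry): tan(theta) = (0.20 cm) / f
f_exact = half_length_cm / math.tan(theta)

print(f"small-angle f = {f_small_angle:.2f} cm")
print(f"exact-tangent f = {f_exact:.2f} cm")
```

Both values come out near 1.9 cm. The example's 2.0 cm follows from rounding 6.0° to 0.10 rad, which is adequate for a two-figure design estimate.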


28-3 THE SPEED OF LIGHT IN TRANSPARENT MATERIALS


The  speed  at  which  light  travels  through  a transparent  material  depends 
on  the  physical  properties  of  the  material.  While  there  is  a wide  variation  in 
the  speed  from  one  material  to  the  next,  its  value  is  always  less  than  c,  the 
speed  of  light  in  vacuum. 


A detailed  theoretical  explanation  of  this  experimental  observation  is  quite 
complicated.  But  you  will  be  able  to  understand,  in  a general  way,  the  main  fea- 
tures of  the  explanation.  Consider  what  happens  as  an  electromagnetic  wave,  such 
as  a light  wave,  travels  through  a layer  of  a transparent  material.  When  it  is  inci- 
dent on  an  electron  bound  to  an  atom  or  a molecule  of  the  material,  the  electric 
field  of  the  wave  causes  the  electron  to  oscillate  at  the  same  frequency  as  that  of 
the  wave.  The  work  done  on  the  electron  leads  to  a partial  absorption  of  the  wave. 
As  for  the  electron,  the  analysis  of  a driven  harmonic  oscillator  (similar  to  the 
driven  LRC  circuit  analysis  in  Secs.  26-5  through  26-7)  shows  that  the  time  at 
which  the  electron’s  acceleration  reaches  a certain  point  in  its  oscillation  cycle  is 
delayed  relative  to  the  time  at  which  the  electric  field  of  the  wave  incident  on  the 
electron  reaches  the  same  point  in  the  wave’s  oscillation  cycle.  This  is  true  no 
matter  what  the  relation  between  the  frequency  of  the  wave  and  the  resonant  fre- 
quency of  the  electron  oscillator.  Because  it  is  accelerating,  the  electron  emits  an 
electromagnetic  wave  of  its  own.  The  frequency  of  the  emitted  wave  is  the  same  as 
that  of  the  incident  wave.  But  it  is  delayed  relative  to  the  incident  wave  because  of 
the  delay  in  the  acceleration. 

Emerging  from  the  layer  of  material  is  a total  wave  traveling  in  the  direction 
of  the  incident  wave,  comprising  two  components.  One  is  the  incident  wave, 
which  is  attenuated  somewhat  by  partial  absorption  on  electron  oscillators  in  the 
layer.  The  other  component  is  the  waves  emitted  by  these  electron  oscillators  in 
the  direction  of  the  incident  wave.  (Detailed  analysis  shows  that  in  all  other  direc- 
tions the  emitted  waves  superpose  destructively  with  one  another  so  that  there  is 
no  net  emission.)  Since  the  waves  emitted  in  the  direction  of  the  incident  wave  are 
delayed  relative  to  the  incident  wave,  there  is  some  delay  in  the  total  wave  relative 
to  the  incident  wave.  This  delay  causes  the  total  wave  to  take  more  time  in  travel- 
ing through  the  layer  of  material  than  an  electromagnetic  wave  takes  in  traveling 
through  an  evacuated  region  of  the  same  thickness.  Hence  its  speed  is  lower  than 
the  speed  in  vacuum. 


Soon we will develop the connection between the fact that light travels
through transparent materials more slowly than it does in vacuum and
the important phenomenon of refraction, the bending of a beam of light
entering a transparent material oblique to its surface. But first we consider
experimental measurements of the speed of light in various materials.

The first direct attempt to measure the speed of light in a dense,
transparent material was carried out by Foucault in 1850. He used a
modification of the method devised by Fizeau to measure the speed of light in air,
described in Sec. 14-2. The material used by Foucault was water.

Foucault’s  apparatus  is  shown  schematically  in  Fig.  28-14.  It  was  not 
possible  at  the  time  to  make  a direct  measurement  of  the  speed  of  light  in 



water. Fizeau’s light path had been 17.26 km long, and so much water (or
any other nongaseous substance) would be entirely opaque. Foucault’s
apparatus uses a very much shorter path (a few meters long) and compares
the speed of light in water to its speed in air. As shown in the figure, light
from a source passes through a slit S. It then is focused into a beam by a
set of lenses not shown in the figure. The light strikes the mirror R, which
can be rotated at high speed. Suppose for the moment that the mirror is
stationary in position 1. The light is reflected to mirror M1 and then back
along the same path to R. There is a second reflection by R, and the image
of the slit S falls on a scale at S0, where it can be observed. (The scale is
shown slightly displaced from the original path SR for clarity. In fact, this
displacement can be achieved by a very slight misalignment of M1, so that
the paths RM1 and M1R are not quite identical.)

If instead the mirror R is stationary in position 2, the light passes over
a similar path via the fixed mirror M2. The distances RM1 and RM2 are
made equal. The path RM2 contains a tube IT with glass ends, which can be
filled with water. By proper adjustment, the image of the slit cast on the
observing scale by light passing along this path can be made to coincide with
the image produced by the light beam passing over the path which involves
reflection by M1. (Of course, the two images cannot be obtained
simultaneously, since they require different positions of the mirror R.)

The mirror R is now set into rapid rotation. Light striking it when it is
momentarily at the proper angle to reflect light toward M1 will pass over
the path RM1R. But by the time the light returns to R, the rotating mirror
will have turned through a small angle. As a result, the image formed at S0
when the mirror is stationary is now formed instead at a slightly displaced
position S1 on the observing scale. If there is no water in the tube IT, the
time delay over the path RM2R is the same as that over the path RM1R. So
the second slit image is displaced by the same amount as the first, and the
two images still appear to coincide.

But  if  the  tube  is  filled  with  water,  the  slit  image  produced  by  light 




passing over the second path is seen to be displaced to S2 and no longer
coincides with S1. Thus two images are seen where there was formerly one.
The smaller displacement of S1 implies that the time delay over RM1R is
smaller than that over RM2R. This must be because the speed of light is
greater in air than it is in water. Foucault obtained a ratio for the speeds of
light in air and in water of approximately

$$\frac{v_{\text{air}}}{v_{\text{water}}} = \frac{4}{3}$$

Foucault  assumed,  correctly,  that  the  speed  of  light  in  a material  with 
density  as  low  as  that  of  air  is  very  nearly  equal  to  the  speed  of  light  in  vac- 
uum. (Experiments  performed  after  Foucault's  show  that  light  travels 
through  air  with  a speed  lower  than  its  speed  in  vacuum  by  only  3 parts  in 
100,000.)  Thus,  to  the  accuracy  of  his  measurements,  Foucault  concluded 
that 


$$\frac{c}{v_{\text{water}}} = \frac{4}{3}$$
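
The reasoning behind the rotating-mirror comparison can be sketched numerically. The path length, water-column length, and rotation rate below are illustrative assumptions only, not Foucault's actual figures, and the function name `beam_deflection` is hypothetical. The sketch uses the fact that a reflected beam turns through twice the angle the mirror turns.

```python
import math

C_AIR = 3.00e8          # m/s, speed of light in air, taken equal to c
N_WATER = 4.0 / 3.0     # c / v_water, from Foucault's result

def beam_deflection(path_m, water_m, rev_per_s):
    """Angle in radians by which the returning beam is deflected.

    path_m    : one-way distance from R to the fixed mirror (assumed value)
    water_m   : part of that distance filled with water (0 for the air path)
    rev_per_s : rotation rate of mirror R (assumed value)
    """
    t_air = 2.0 * (path_m - water_m) / C_AIR         # round trip through air
    t_water = 2.0 * water_m / (C_AIR / N_WATER)      # round trip through water
    omega = 2.0 * math.pi * rev_per_s                # angular speed of R
    mirror_turn = omega * (t_air + t_water)          # angle R turns during the delay
    return 2.0 * mirror_turn                         # beam turns twice as far

air_only = beam_deflection(4.0, 0.0, 800.0)
with_water = beam_deflection(4.0, 3.0, 800.0)
print(f"air path:   {air_only:.3e} rad")
print(f"water path: {with_water:.3e} rad")
```

The beam that passes through water is deflected more, because its round trip takes longer. That is the larger displacement of the second slit image described above.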

There  are  several  ways  to  make  more  precise  measurements  of  the 
speed  of  light  in  transparent  materials.  One  is  to  employ  the  Michelson  in- 
terferometer, discussed  in  Sec.  14-2.  The  material  is  introduced  in  the  light 
path  of  one  arm  of  the  interferometer.  This  increases  the  time  required  for 
the  light  wave  in  the  arm  to  go  through  its  round  trip  and  thereby  changes 
its  phase  relationship  with  the  light  wave  traveling  in  the  other  arm,  when 
the  two  waves  recombine.  Hence  the  fringe  pattern  is  shifted.  By  counting 
the  fringe  shifts,  as  the  material  is  introduced  gradually  in  the  light  path, 
information  is  obtained  from  which  the  speed  of  light  in  the  material  can 
be  determined  with  great  accuracy.  (Michelson  said  that  his  invention  of 
the  interferometer  owed  much  to  the  design  of  Foucault’s  apparatus.  Can 
you  see  any  similarity  between  the  two?) 

Another  way  to  measure  the  speed  of  light  in  a transparent  material 
involves  the  phenomenon  of  refraction.  Example  28-2  will  show  you  how 
this  is  done. 


Results obtained from measurements of the speed of light in various
materials usually are expressed as they have been above, that is, in terms of
the speed c of light in vacuum divided by its speed v in the material. This
ratio c/v is called the index of refraction of the material, and it is
designated by the symbol n. Thus, by definition, we have

$$n = \frac{c}{v} \tag{28-3}$$


Every transparent substance has an index of refraction which is
determined directly by the speed with which light passes through it. The value of
n for light traveling through any transparent substance is greater than 1
since in all such substances its speed v is less than c.

For ideal gases, the quantity n - 1 is directly proportional to the
density of the gas and therefore depends on temperature and pressure. Values
usually are quoted at the standard conditions T = 273.15 K and p = 1 atm.
For most liquids, solids, and glasses (glasses are noncrystalline solids that,




Table 28-1

Index of Refraction of Selected Substances (λ = 589 nm)

Substance                                   n
------------------------------------------------------
I. Gases (T = 273.15 K, p = 1 atm)
   Air (dry)                                1.000293
   Carbon dioxide                           1.000450
   Helium                                   1.000036
II. Liquids
   Carbon disulfide                         1.6259
   Ethanol                                  1.3645
   Water                                    1.3369
III. Crystalline Solids
   Aluminum oxide (ruby, sapphire)          1.76
   Carbon (diamond)                         2.417
   Water ice                                1.309
IV. Glasses
   Crown glass                              1.5171
   Flint glass, medium                      1.6273
   Flint glass, dense                       1.6555
   Fused quartz                             1.4585
V. Plastics
   Polymethyl methacrylate (Lucite)         1.49
   Polycarbonate (optical quality)          1.6056

over long periods, flow like liquids) there is only a weak dependence of n on
temperature and pressure. But for all materials the index of refraction has
a more or less rapid dependence on the wavelength of the light passing
through the material. For the purposes of standard measurements, it is
often convenient to use the “D line” in the spectrum of sodium as the
standard wavelength. This wavelength is λ_D = 589 × 10^-9 m = 589 nm, where
nm stands for nanometer. That is,

$$1\ \text{nm} = 10^{-9}\ \text{m}$$

Table  28-1  lists  values  of  the  index  of  refraction  for  a variety  of  materials. 
In  Sec.  28-4  the  index  of  refraction  will  play  an  important  role  in  describ- 
ing the  way  in  which  light  passes  from  one  transparent  material  to  another. 
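
The definition n = c/v can be turned around to give the speed of light in each material. The short sketch below does this for a few entries copied from Table 28-1; the dictionary and function names are illustrative, and c is rounded to 3.00 × 10^8 m/s as elsewhere in this chapter.

```python
C = 3.00e8  # m/s, speed of light in vacuum (rounded)

# A few index-of-refraction values copied from Table 28-1 (589 nm)
index_of_refraction = {
    "air (dry)": 1.000293,
    "fused quartz": 1.4585,
    "crown glass": 1.5171,
    "carbon (diamond)": 2.417,
}

def speed_in_material(n, c=C):
    """Speed of light in a material of index n, from Eq. (28-3): v = c / n."""
    return c / n

speeds = {name: speed_in_material(n) for name, n in index_of_refraction.items()}
for name, v in speeds.items():
    print(f"{name:18s} v = {v:.4e} m/s")
```

In every case the speed is less than c, and the larger the index, the slower the light travels.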


28-4 REFRACTION AND TOTAL INTERNAL REFLECTION

When a beam of light traveling in air strikes the surface of a transparent
material, some of it is reflected at the surface, but much of it passes through
the surface into the material. The part of the beam entering the material
experiences an abrupt “bending” at the surface. An example is shown in
Fig. 28-15a, where a beam of light travels from air into glass. This bending
is called refraction, from the Latin word refractus, meaning “turned aside”
or “broken.” Note that the beam of light traveling from the less dense
material air into the more dense material glass is bent toward a line normal
to the surface. This statement can be made more explicit in terms of the




Fig. 28-15 (a) A beam of light traveling through air from the upper left is
incident on a block of glass. Some of the beam is reflected at the surface of
the glass, but here this process is not of interest to us. What is interesting
is that some of the light beam enters the glass, and it is bent abruptly upon
entering. The phenomenon is called refraction. The light beam also experiences
refraction as it leaves the glass. (Courtesy Education Development
Corporation.) (b) Definition of the angle of incidence θ and the angle of
refraction θ'.


quantities defined in Fig. 28-15b: the line normal to the surface, the angle
of incidence θ between the incident beam and the normal line, and the
angle of refraction θ' between the refracted beam and the normal line.
The statement is that θ' is less than θ.

The phenomenon of refraction is a direct consequence of the change
in speed of light waves as they pass from one material into another. We
show this by the Huygens construction of Fig. 28-16. The construction
represents a beam of light traveling through material of index of refraction n
to the surface separating it from material of larger index of refraction n'.
The beam arrives at the surface at an angle of incidence θ, and it leaves at
an angle of refraction θ'. The shape and position of consecutive wave
fronts are depicted in the construction, each being obtained from the
preceding one by a Huygens construction using the sourcelets indicated by
dots on the wave fronts. Every wavelet has a radius equal to the length of
the path that a light wave travels in a time interval T, moving at the speed of
light in the material in which the wavelet is located.

According to Eq. (28-3), in the material of index of refraction n the
speed of light has the value v = c/n. Thus in time interval T light will travel
a path of length l = vT, which has the value

$$l = \frac{cT}{n} \tag{28-4a}$$




Fig. 28-16 Huygens’ construction used to derive the law of refraction.
(In the figure, n' > n.)

For light traveling in the material of index of refraction n', the path length
l' in the same time interval is given by

$$l' = \frac{cT}{n'} \tag{28-4b}$$


Equations (28-4a) and (28-4b) explain why the wavelets in the material with
the larger index of refraction n' have smaller radii. It is simply because
light travels less rapidly in material of greater index of refraction. The
construction shows why this causes the wave fronts to bend. And since the
beam in each material travels in a direction normal to its wave fronts, as the
wave fronts bend, so must the beam bend. Thus refraction arises from the
change in the speed of light waves as they enter a material with an index of
refraction different from that of the material through which they have
been traveling. It should be noted that in this Huygens construction T is
not set equal to the oscillation period of the light wave. Instead, to make the
construction more convenient, T is chosen so that an integral number of
wavelet radii fit into the length of the path that light waves travel in going
between the points labeled A and B, and also in going between the points
labeled C and D. The particular integer used is 4, but any other could be
used.

The Huygens construction makes it easy to predict the relation
between the angles θ and θ', on the one hand, and the indices of refraction
n and n', on the other. First consider the angles ACB and ABN. Since the
line AC is perpendicular to the line AB and since the line CB is
perpendicular to the line BN, these angles are equal. In fact, they both equal the angle
of incidence θ, as shown in the figure. A similar argument verifies that the
angle CBD equals the angle of refraction θ', as shown.




Because the triangles ACB and DCB are both right triangles, the values
of θ and θ' can be expressed in terms of the lengths of the lines, as follows:

$$\sin\theta = \frac{AB}{CB} \qquad \text{and} \qquad \sin\theta' = \frac{CD}{CB}$$

Eliminating the common length CB by dividing the first equation by the
second, we obtain

$$\frac{\sin\theta}{\sin\theta'} = \frac{AB}{CD} \tag{28-5}$$

The figure shows that AB equals the product of an integer and l, and CD
equals the product of the same integer and l'. Hence

$$\frac{AB}{CD} = \frac{l}{l'}$$

Dividing Eq. (28-4a) by Eq. (28-4b) to evaluate l/l', we find

$$\frac{AB}{CD} = \frac{n'}{n}$$

Using this in Eq. (28-5) produces

$$\frac{\sin\theta}{\sin\theta'} = \frac{n'}{n}$$

or

$$n \sin\theta = n' \sin\theta' \tag{28-6}$$

This basic relation between the angle of incidence θ and the angle of
refraction θ' is called Snell’s law, or the law of refraction. The law applies to
refraction of a beam of light at any surface between two transparent
materials. It says that the product of the index of refraction and the sine of the angle
between the beam and the normal to the surface has a value for the incident beam
equal to its value for the refracted beam.
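
Snell's law is easy to apply numerically. The function below is a minimal sketch: the name `refraction_angle` and the choice to return `None` are illustrative, and the indices 1.00 and 1.52 are used only as sample values for air and a glass. It solves n sin θ = n' sin θ' for θ', and reports that no refracted beam exists when the required value of sin θ' would exceed 1.

```python
import math

def refraction_angle(theta_deg, n, n_prime):
    """Angle of refraction in degrees, from n sin(theta) = n' sin(theta').

    Returns None when n sin(theta) / n' exceeds 1, in which case no real
    angle of refraction satisfies Eq. (28-6).
    """
    s = n * math.sin(math.radians(theta_deg)) / n_prime
    if abs(s) > 1.0:
        return None
    return math.degrees(math.asin(s))

# Into the larger index: the beam bends toward the normal.
print(refraction_angle(40.0, 1.00, 1.52))
# Into the smaller index: the beam bends away from the normal.
print(refraction_angle(20.0, 1.52, 1.00))
# Steep incidence from the larger index: no refracted beam.
print(refraction_angle(60.0, 1.52, 1.00))
```

The last case, in which Eq. (28-6) has no solution, corresponds to the total internal reflection named in the title of this section.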

Figure  28-17  shows  the  refraction  of  a narrow  beam  of  light  traveling 
from  air  to  glass,  for  several  different  angles  of  incidence.  An  analysis  of 
the  photographs  in  this  figure  is  made  in  Example  28-2.  It  confirms  the 
predictions  obtained  from  the  Huygens  construction  for  refraction  that  are 
summarized  by  Eq.  (28-6).  Example  28-2  also  shows  how  refraction  mea- 
surements can  be  used  to  determine  the  speed  of  light  in  a transparent 
material. 


EXAMPLE 28-2

a. Measure the angle of incidence and the angle of refraction in Fig. 28-17b,
and use them in Snell’s law to determine the index of refraction of the glass.

■ Careful inspection of the photo shows that θ = 40.0° and θ' = 25.0°, with
uncertainties of perhaps 0.2°. Writing Snell’s law, Eq. (28-6), as

$$\frac{n'}{n} = \frac{\sin\theta}{\sin\theta'}$$



Fig. 28-17 Photographs showing refraction and reflection of a narrow beam of
light traveling through air from the upper left to strike the flat surface of a
half-cylinder of glass. Why is the light beam not refracted as it leaves the
curved surface? (From PSSC Physics, D. C. Heath, Boston, 1965. Courtesy
Education Development Corporation.)


you have

$$\frac{n'}{n} = \frac{\sin 40.0^\circ}{\sin 25.0^\circ} = 1.52$$

Since the incident light beam is traveling in air, to the accuracy of this work you can
certainly set n = 1 and obtain

n' = 1.52 ■

b.  Evaluate  the  speed  of  light  in  this  glass. 

■ Using Eq. (28-3), the definition of index of refraction, you have for the speed v'

$$v' = \frac{c}{n'} = \frac{3.00 \times 10^{8}\ \text{m/s}}{1.52} = 1.97 \times 10^{8}\ \text{m/s}$$ ■

c.  Identify  the  type  of  glass. 

■ Comparison  of  the  value  n'  = 1.52  with  the  entries  in  Table  28-1  indicates  that 

it  is  crown  glass.  ■ 

d. Measure the angle of incidence θ in Fig. 28-17c. Then use Snell’s law and
the value you have obtained for n' to predict the angle of refraction θ'. Compare
this with the value you measure from the figure, and comment on the significance
of the comparison.

■ Inspection of the photo shows that θ = 60.0°, within about 0.2°. Setting n =
1 in Snell’s law, you have




$$\sin\theta' = \frac{\sin\theta}{n'} = \frac{\sin 60.0^\circ}{1.52}$$

That is,

$$\theta' = \sin^{-1}\left(\frac{\sin 60.0^\circ}{1.52}\right) = 34.7^\circ$$

This value agrees, to 0.2° or so, with the value θ' = 35.0° measured to that accuracy
from the photograph. The agreement provides experimental confirmation of Snell’s
law.
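
The arithmetic of this example can be reproduced in a few lines. The sketch below also shows, as an addition not made in the example itself, how far the index moves when each measured angle is shifted by its quoted 0.2° uncertainty. The function and variable names are illustrative.

```python
import math

def index_from_angles(theta_deg, theta_prime_deg, n=1.0):
    """Index n' from Snell's law, n sin(theta) = n' sin(theta')."""
    return n * math.sin(math.radians(theta_deg)) / math.sin(math.radians(theta_prime_deg))

# Part a: measured angles from Fig. 28-17b
n_glass = index_from_angles(40.0, 25.0)

# Shift each angle by 0.2 degree in the direction that changes n' the most
n_low = index_from_angles(39.8, 25.2)
n_high = index_from_angles(40.2, 24.8)

# Part b: speed of light in the glass, with c rounded to 3.00e8 m/s
v_glass = 3.00e8 / n_glass

# Part d: predicted angle of refraction for 60.0 degree incidence
theta_prime_60 = math.degrees(math.asin(math.sin(math.radians(60.0)) / n_glass))

print(f"n' = {n_glass:.2f}  (range {n_low:.2f} to {n_high:.2f})")
print(f"v' = {v_glass:.2e} m/s")
print(f"theta' at 60.0 deg = {theta_prime_60:.1f} deg")
```

Under these assumptions the range of n' includes the crown glass entry of Table 28-1 but not either flint glass, consistent with the identification in part c.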


As is true of all the phenomena considered earlier in this chapter, light
waves (and other waves) exhibit the property of reversibility in the
phenomenon of refraction. For an example of what this means, consider the
refraction taking place in Fig. 28-17b. There a light beam traveling from air into
a particular type of glass at an angle of incidence θ = 40° is found to have
an angle of refraction of θ' = 25°. (The weak reflected beam seen in the
photograph makes it apparent that the beam travels this way, and not from
glass to air, even though the source of the beam is not visible.) The
property of reversibility tells us that if the direction of travel of the light beam is
reversed, so that it goes from the glass into air at an angle of incidence θ =
25°, then it will be found to have an angle of refraction of θ' = 40°. An
experiment will verify that this is, indeed, what happens. (In the
experiment the reversed direction of travel could be distinguished, even if the
source is not visible, by what happened to the weakly reflected beam.) You
should make a Huygens construction like the one in Fig. 28-16, but with
the direction of travel of the light beam reversed, and use it to demonstrate
the reversibility property for refraction.

Snell’s law is completely consistent with the property of reversibility.
That is, the equation n sin θ = n' sin θ' applies, no matter whether a light
beam goes from a region of small index of refraction to a region of large
index of refraction or from a region of large index of refraction to a region
of small index of refraction. The only thing you must remember is always
to let the unprimed quantities in the equation be the ones measured in the
region where the light beam is initially, and let the primed quantities be the
ones measured in the region where it is finally.
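
The bookkeeping rule for primed and unprimed quantities can be illustrated with a round trip. In the sketch below (the function name `refract` and the indices 1.00 and 1.52 are illustrative choices), a beam is first taken from air into glass, and the resulting angle is then used as the angle of incidence for a beam going from glass into air.

```python
import math

def refract(theta_deg, n, n_prime):
    """Solve n sin(theta) = n' sin(theta') for theta', in degrees.

    The unprimed quantities belong to the region the beam starts in,
    the primed ones to the region it enters.
    """
    return math.degrees(math.asin(n * math.sin(math.radians(theta_deg)) / n_prime))

n_air, n_glass = 1.00, 1.52

theta_in_glass = refract(40.0, n_air, n_glass)               # air -> glass
theta_back_in_air = refract(theta_in_glass, n_glass, n_air)  # glass -> air

print(f"air -> glass: 40.0 deg becomes {theta_in_glass:.1f} deg")
print(f"glass -> air: {theta_in_glass:.1f} deg becomes {theta_back_in_air:.1f} deg")
```

The reversed beam emerges at the original 40° because the same equation is used in both directions with the roles of the primed and unprimed quantities exchanged.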

The  following  statements  give  a qualitative  summary  of  the  phenome- 
non of  refraction:  When  a light  beam  crosses  a surface  into  a region  of 
larger  index  of  refraction,  it  is  bent  toward  the  normal  to  the  surface.  When 
a light  beam  crosses  into  a region  of  smaller  index  of  refraction,  it  is  bent 
away from the normal. See Fig. 28-15a.

Snell's  law  is  named  after  the  Dutch  mathematician,  astronomer,  and  map- 
maker  Willebrord  Snell  (1591-1626),  although  Snell  derived  the  law  in  a different 
mathematical  form,  using  different  (and  incorrect)  assumptions.  Snell's  work  was 
not  widely  known  until  after  his  death.  In  the  meantime,  the  law  had  been 
derived (probably on the basis of a perusal of Snell's unpublished manuscript)
by the French philosopher and mathematician René Descartes (1596-1650),
the inventor of analytical geometry. Descartes expressed the law of refraction
in the form given by Eq. (28-6). In France Snell’s law is known as Descartes’ law.

Descartes derived the law of refraction on the basis of the assumption that
light is composed of a stream of tiny particles rather than a train of waves. The
mathematical result is of exactly the same form as Eq. (28-6), but there is an
intriguing difference in its meaning. In order to obtain agreement with the observed
fact that light beams bend toward the normal on going from air into a dense,
transparent material, it is necessary to assume that the speed of light is greater in
the dense material, smaller in air, and smallest in vacuum. In Descartes’ derivation,
therefore, the index of refraction is defined as the reciprocal of the quantity
$n = c/v$.

Thus  the  theoretical  question — Does  light  move  like  a stream  of  particles  or  a 
train  of  waves? — hinged  on  the  experimental  question — Is  the  speed  of  light  in 
dense,  transparent  materials  greater  or  less  than  the  speed  of  light  in  vacuum? 
Since  there  was  no  way  to  make  the  speed-of-light  measurements  that  would 
answer  the  question  at  the  time,  the  matter  remained  a center  of  dispute  for  about 
200  years.  Foucault’s  experiment  of  1850,  described  at  the  beginning  of  this  sec- 
tion, was  a final  crucial  test  between  the  wave  and  particle  theories  of  light  mo- 
tion. 

But  in  the  actual  course  of  events,  Foucault’s  work  was  largely  a matter  of 
confirmation  of  what  was,  by  then,  almost  universally  acknowledged — that  light 
moves  like  waves.  The  dispute  had  been  settled  in  favor  of  the  wave  theory  by 
about  1820,  largely  on  the  basis  of  a series  of  extraordinary  experiments  carried 
out by the French physicist Augustin Fresnel (1788-1827). These experiments
investigated the phenomenon of diffraction, which is taken up later in this chapter.


The phenomenon of total internal reflection is closely related to
Snell’s  law.  Total  internal  reflection  is  illustrated  in  Fig.  28-18.  Several 
beams  of  light  pass  through  a glass  prism  and  strike  the  glass-to-air  surface 
from  inside  the  prism  at  different  angles  of  incidence.  The  uppermost 
beam  strikes  the  surface  at  a relatively  small  angle  of  incidence.  Most  of  the 
light  is  refracted  and  passes  into  air.  Only  a little  light  is  reflected.  The  next 
three  beams  of  light  strike  the  glass-to-air  surface  at  increasing  angles  of  in- 
cidence. The  amount  of  light  reflected  increases,  but  most  of  the  light  still 
passes  out  of  the  glass  prism  into  air,  with  the  angle  of  refraction  increasing 
as  the  angle  of  incidence  increases.  In  the  case  of  the  bottom  two  beams, 
however,  all  the  light  is  reflected  back  into  the  glass,  and  none  is  refracted 
into  the  air. 

Fig. 28-18  Photograph of six beams of light passing through a
glass prism in air. Total internal reflection occurs for the two
beams that are closest to the lower surface of the prism. (From
PSSC Physics, D. C. Heath, Boston, 1965. Courtesy Education Development
Corporation.)

To find the largest angle of incidence $\theta$ for which refraction occurs,
consider Snell’s law written in the form

$$\frac{n}{n'} \sin\theta = \sin\theta' \tag{28-7}$$


The quantity $n/n'$ is greater than 1 since here $n$ is the index of refraction of
glass and $n'$ is the index of refraction of air. Suppose $\theta$, and therefore $\sin\theta$,
increases from the value zero. When a certain critical value of $\theta$ is reached,
called the critical angle $\theta_c$, the left side of Eq. (28-7) reaches the value 1.
That is,

$$\frac{n}{n'} \sin\theta_c = 1$$

or

$$\sin\theta_c = \frac{n'}{n} \tag{28-8}$$

When this is true, Eq. (28-7) becomes

$$\sin\theta' = 1$$

or

$$\theta' = 90^\circ$$

In these circumstances the quantity $\sin\theta'$ has its maximum possible value 1,
the quantity $\theta'$ has its corresponding value 90°, and the refracted beam is
parallel to the glass-to-air surface. The refraction process described by
Snell’s law cannot take place for values of $\theta$ larger than $\theta_c$. For such values
of $\theta$ the law requires $\sin\theta'$ to be larger than 1, and this cannot be. Thus for
angles of incidence larger than the critical angle, refraction ceases abruptly,
as can be seen in the photograph of Fig. 28-18. Total internal reflection can
occur only when light strikes a surface on the other side of which is a material
of smaller index of refraction. Why?
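
As a rough numerical illustration of Eq. (28-8), the sketch below evaluates the critical angle for two pairs of materials. The index values are assumed typical figures chosen for illustration, not values taken from Table 28-1, and the function name `critical_angle` is hypothetical.

```python
import math

def critical_angle(n, n_prime):
    """Critical angle in degrees from sin(theta_c) = n'/n.

    n is the index on the side the light comes from, n' the index on
    the far side. A critical angle exists only when n' < n.
    """
    if n_prime >= n:
        raise ValueError("total internal reflection requires n' < n")
    return math.degrees(math.asin(n_prime / n))

# Assumed, typical index values (illustrative only).
N_AIR = 1.0003
N_DIAMOND = 2.417
N_CROWN_GLASS = 1.517

theta_c_diamond = critical_angle(N_DIAMOND, N_AIR)
theta_c_crown = critical_angle(N_CROWN_GLASS, N_AIR)

print(f"diamond in air:     {theta_c_diamond:.2f} deg")
print(f"crown glass in air: {theta_c_crown:.2f} deg")
```

Calling `critical_angle(N_AIR, N_CROWN_GLASS)` raises an error, which mirrors the point made in the text: when the far side has the larger index, $n'/n$ exceeds 1 and no angle satisfies Eq. (28-8).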

In  Example  28-3  the  critical  angle  for  total  internal  reflection  is  evalu- 
ated for  several  combinations  of  transparent  materials. 


EXAMPLE 28-3

Find the critical angle of incidence $\theta_c$ for total internal reflection for the cases of
diamond in air, crown glass in air, and crown glass in water.

■ Using Eq. (28-8) to write an explicit expression for $\theta_c$, you obtain

$$\theta_c = \sin^{-1}\left(\frac{n'}{n}\right)$$

Inserting numerical values from Table 28-1 gives you:

For diamond in air

$$\theta_c = \sin^{-1}\left(\frac{1.0003}{n_{\text{diamond}}}\right) = 24.45^\circ$$

For crown glass in air

$$\theta_c = \sin^{-1}\left(\frac{1.0003}{1.517}\right) = 41.25^\circ$$

For crown glass in water

$$\theta_c = \sin^{-1}\left(\frac{n_{\text{water}}}{1.517}\right) = 61.79^\circ$$


The  small  critical  angle  for  diamond  is  the  principal  reason  for  its  apparent 
brilliance.  The  surface  of  the  diamond  (or  other  jewel  having  a high  index  of  re- 
fraction) is  cut  into  many  “facets”  which  make  small  angles  with  one  another. 
Suppose  that  the  diamond  is  in  a relatively  dim  environment,  but  is  illuminated 
by a beam of relatively bright light coming from one direction. The light enters the
diamond. Because of the small critical angle, and because light is most likely to
strike several facets on the opposite side of the diamond at fairly large angles of
incidence, much of the light is likely to experience total internal reflection. Once
this happens, the small angles between facets make it likely that the light will be
internally reflected many times before it finally strikes a facet at a small enough
angle of incidence to reemerge. The diamond thus appears to cast light in all
directions. An illusion is often created that the light originates in the diamond
itself.

Fig. 28-19  A light pipe, showing the passage of light by
means of repeated total internal reflection.


At  even  the  shiniest  of  metal  mirror  surfaces,  a small  percentage  of  the 
light  is  absorbed  on  reflection.  In  total  internal  reflection,  no  light  is  ab- 
sorbed at  the  reflecting  surface.  This  fact  is  put  to  practical  use  in  optical  in- 
struments, in  which  there  may  be  many  reflecting  surfaces  and  loss  of  light 
cannot  be  tolerated. 

Total  internal  reflection  also  underlies  the  operation  of  the  light  pipe. 
Light  enters  the  end  of  a long,  thin  rod,  often  made  of  transparent  plastic. 
Because  of  the  geometry,  the  light  can  strike  the  surface  of  the  light  pipe 
only  at  large  angles  of  incidence,  as  shown  in  Fig.  28-19.  As  long  as  the  light 
pipe  itself  is  not  curved  too  sharply,  total  internal  reflection  is  ensured,  and 
light  entering  the  pipe  is  propagated  down  it  by  multiple  reflection.  A typi- 
cal propagation  path  is  shown.  Light  can  thus  be  guided  along  a circuitous 
route  from  a source  to  where  it  is  desired.  Light  pipes  are  beginning  to  play 
an  important  role  in  communication  systems. 
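
The geometric argument for the light pipe can be sketched numerically. The example below is an idealization, not a description of any particular device: it assumes a straight rod with a flat end face perpendicular to its axis, the same outside medium at the end and along the sides, and a ray traveling in a plane that contains the axis. The index 1.49 for the plastic is an assumed illustrative value, and both function names are hypothetical.

```python
import math

def wall_incidence_angle(alpha_deg, n_pipe, n_outside=1.0):
    """Angle of incidence (degrees) on the side wall of a straight rod.

    alpha_deg is the angle between the incoming ray and the rod axis,
    measured outside the rod. The ray is refracted at the flat end face
    to an angle beta from the axis, so it meets the side wall at an
    angle of incidence 90 - beta.
    """
    sin_beta = n_outside * math.sin(math.radians(alpha_deg)) / n_pipe
    return 90.0 - math.degrees(math.asin(sin_beta))

def is_guided(alpha_deg, n_pipe, n_outside=1.0):
    """True if the ray exceeds the critical angle at the side wall."""
    theta_c = math.degrees(math.asin(n_outside / n_pipe))
    return wall_incidence_angle(alpha_deg, n_pipe, n_outside) > theta_c

N_PLASTIC = 1.49  # assumed illustrative value

for alpha in (10.0, 45.0, 89.9):
    angle = wall_incidence_angle(alpha, N_PLASTIC)
    print(f"entry {alpha:5.1f} deg -> wall incidence {angle:.1f} deg, "
          f"guided: {is_guided(alpha, N_PLASTIC)}")
```

Under these assumptions even a ray entering at nearly grazing incidence meets the wall beyond the critical angle, consistent with the statement that the light strikes the surface only at large angles of incidence. A sharp bend in the rod changes the wall geometry and is not modeled here.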


Fig.  28-20  Endoscope  view  of  hemorrhagic 
tumor  on  lung.  ( Courtesy  Olympus  Corporation  of 
America. ) 


An  important  application  of  the  light  pipe  is  in  fiber  optic  bundles.  Glass  is 
drawn  out  into  extremely  thin  fibers,  usually  a few  micrometers  in  diameter.  Many 
such  fibers  are  gathered  into  bundles  embedded  in  a plastic  having  an  index  of  re- 
fraction lower  than  that  of  the  glass,  as  is  required  for  total  internal  reflection 
within  the  glass  fibers.  An  individual  fiber  which  happens  to  be  brightly  illumi- 
nated at  one  end  will  appear  bright  at  the  other,  while  a dimly  illuminated  fiber 
will  appear  dim  at  the  other  end.  Thus  the  bundle  of  quite  flexible  fibers  can  be 
used  to  produce  an  image,  as  shown  in  Fig.  28-20.  Fiber  optic  bundles  have  made 
possible  the  manufacture  of  the  medical  instruments  called  endoscopes.  These 
enable  the  physician  to  see  deeply  into  the  body,  either  through  natural  orifices  or 
through  very  small  incisions,  and  make  possible  diagnosis  and  surgery  which 
were  previously  more  difficult,  more  dangerous,  or  even  impossible. 


28-5  DISPERSION

In vacuum, all electromagnetic radiation travels at exactly the same
speed,  no  matter  what  its  frequency.  There  is  no  such  uniformity  of  speed, 
however,  for  electromagnetic  radiation  traveling  through  matter.  A mate- 
rial in  which  the  speed  of  radiation  traveling  through  it  depends  on  the 
frequency  (or  the  wavelength)  of  the  radiation  is  said  to  be  a dispersive 
material.  All  transparent  substances  are  more  or  less  dispersive  for  elec- 
tromagnetic radiation  in  that  part  of  the  spectrum  in  which  the  radiation 
is  called  light. 

The  variation  with  frequency  of  the  speed  at  which  light  travels  through  a 
transparent  material  is  a consequence  of  the  processes  responsible  for  the  speed 
being  smaller  than  the  speed  of  light  in  vacuum.  As  was  explained  at  the  begin- 
ning of  Sec.  28-3,  the  speed  of  light  in  a transparent  material  is  reduced  because 
there  is  a time  delay  between  the  light  waves  incident  on  electron  oscillators  of  the 
material  and  the  light  waves  emitted  by  these  oscillators.  The  amount  of  time 
delay  depends  on  the  relation  between  the  frequency  of  the  incident  waves  and 
the  resonant  frequencies  of  the  electron  oscillators.  Hence  the  amount  by  which 
the  speed  of  light  in  a transparent  material  is  less  than  the  speed  of  light  in  vac- 
uum varies  as  the  frequency  of  the  light  changes. 


Table 28-2  Approximate Wavelength Ranges Associated with Color Names

Color     Wavelength (in nm)
Violet    400 to 424
Blue      424 to 491
Green     491 to 575
Yellow    575 to 585
Orange    585 to 647
Red       647 to 700


The  index  of  refraction  of  a material  is  inversely  proportional  to  the 
speed  of  light  in  the  material.  Therefore,  if  the  speed  of  light  in  the  mate- 
rial varies,  its  index  of  refraction  must  also  vary.  Thus  the  index  of  refrac- 
tion of  a dispersive  material  depends  on  the  frequency  of  the  light  traveling 
through  it.  Since  frequency  and  wavelength  are  related,  it  can  just  as  well 
be  said  that  in  a dispersive  material  the  index  of  refraction  is  a function  of 
the  wavelength  of  the  light  traveling  through  it. 

If  a beam  of  light  consisting  of  a mixture  of  wavelengths  is  refracted 
on  entering  or  leaving  a dispersive  material,  the  amount  of  refraction  will 
be  different  for  different  wavelengths  because  of  the  variation  with  wave- 
length of  the  index  of  refraction.  As  a consequence,  a single  beam  of  many 
wavelengths incident on the surface of a dispersive material leaves the surface
as a fan-shaped set of many beams, each having a particular wavelength.
This is called dispersion.

The  dispersive  separation  of  sunlight  by  raindrops  underlies  the  spectacular 
phenomenon  of  the  rainbow.  While  the  geometry  of  rainbow  formation  is  very 
complex,  two  basic  facts  are  responsible  for  the  colors:  (1)  The  index  of  refraction 
of  water  depends  on  the  wavelength  of  the  light  traveling  through  it;  (2)  the  human 
eye-brain  system  perceives  differences  in  wavelength  as  differences  in  color.  (This 
is  analogous  to  the  fact  that  the  ear-brain  system  perceives  differences  in  sound 
wavelength,  and  therefore  in  frequency,  as  differences  in  pitch.)  The  relation 
between  wavelength  and  color  is  given  in  Table  28-2. 


Fig. 28-21  Dispersion curves (plots of index of refraction $n$ versus wavelength
$\lambda$ in nm) for medium flint glass and for crown glass.


Figure  28-21  is  a plot  of  the  index  of  refraction  as  a function  of  wave- 
length for  two  types  of  glass  commonly  used  in  optical  systems.  Such  a plot 
is  called  a dispersion  curve.  The  dispersion  curves  of  glasses  in  the  visible 
region  are  not  describable  by  simple  functions,  and  they  are  similar  to  one 
another  in  only  a general  way.  But  for  many  practical  purposes,  a 
straight-line  approximation  to  the  actual  dispersion  curve,  having  a slope 
equal  to  the  average  slope  of  the  actual  curve  in  the  visible  region,  is  ade- 
quate. Example  28-4  makes  use  of  the  data  in  Fig.  28-21. 
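
A straight-line approximation of this kind can be sketched as follows. This is only an illustration: it draws the line through the two medium-flint-glass readings quoted in Example 28-4 (1.646 at 450 nm and 1.625 at 625 nm), which is a simple chord and not necessarily the line of average slope over the whole visible region. The function name `linear_index` is hypothetical.

```python
def linear_index(wavelength_nm,
                 lam_a=450.0, n_a=1.646,
                 lam_b=625.0, n_b=1.625):
    """Index of refraction from a straight line through two readings.

    The default readings are the medium flint glass values quoted in
    Example 28-4; any other pair of points could be supplied instead.
    """
    slope = (n_b - n_a) / (lam_b - lam_a)   # change in n per nanometer
    return n_a + slope * (wavelength_nm - lam_a)

for lam in (450.0, 550.0, 625.0):
    print(f"{lam:.0f} nm -> n = {linear_index(lam):.4f}")
```

The slope is negative, reflecting the fact that the index of refraction of glass decreases with increasing wavelength in the visible region. Values well outside the interval spanned by the two readings should not be trusted.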

EXAMPLE 28-4

Sunlight is “white light,” that is, light composed of all colors. A beam of sunlight
strikes the flat surface of a piece of medium flint glass at an angle of incidence of 30°.
Find the angular separation between the paths in the glass of the blue light having
wavelength $\lambda_{\text{blue}} = 450$ nm and the orange light having wavelength
$\lambda_{\text{orange}} = 625$ nm.

■ You can read the indices of refraction from Fig. 28-21. They are $n_{\text{blue}} = 1.646$
and $n_{\text{orange}} = 1.625$. Using Snell’s law, Eq. (28-6), you make the following
calculations for the two colors, assuming the index of refraction of air to be 1:


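$$\sin 30^\circ = n_{\text{blue}} \sin\theta_{\text{blue}}$$

$$\theta_{\text{blue}} = \sin^{-1}\left(\frac{1}{1.646} \sin 30^\circ\right) = 17.68^\circ$$

$$\sin 30^\circ = n_{\text{orange}} \sin\theta_{\text{orange}}$$

$$\theta_{\text{orange}} = \sin^{-1}\left(\frac{1}{1.625} \sin 30^\circ\right) = 17.92^\circ$$

The angular separation is

$$\theta_{\text{orange}} - \theta_{\text{blue}} = 17.92^\circ - 17.68^\circ = 0.24^\circ$$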

Such  an  angular  separation  is  easily  measured,  if  the  proper  equipment  is  used. 
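
The arithmetic of this example can be reproduced with a few lines of code. This is a minimal sketch using the indices read from Fig. 28-21 and, as in the example, an index of 1 for air; the function name `refraction_angle_deg` is introduced only for illustration.

```python
import math

def refraction_angle_deg(theta_deg, n_material, n_incident=1.0):
    """Angle of refraction in degrees from n sin(theta) = n' sin(theta')."""
    s = n_incident * math.sin(math.radians(theta_deg)) / n_material
    return math.degrees(math.asin(s))

N_BLUE = 1.646     # medium flint glass at 450 nm, as read in the example
N_ORANGE = 1.625   # medium flint glass at 625 nm, as read in the example

theta_blue = refraction_angle_deg(30.0, N_BLUE)
theta_orange = refraction_angle_deg(30.0, N_ORANGE)
separation = theta_orange - theta_blue

print(f"theta_blue   = {theta_blue:.2f} deg")
print(f"theta_orange = {theta_orange:.2f} deg")
print(f"separation   = {separation:.2f} deg")
```

Blue light, with the larger index, is bent closer to the normal than orange light, so the separation comes out positive.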


The  dispersion  of  glass  and  other  transparent  materials  underlies  the 
operation  of  the  prism  spectrometer.  Figure  28-22  shows  such  a device. 
Light  from  a narrow,  slit-shaped  source  is  formed  into  a beam  by  the  small 
telescope  on  the  left,  called  a collimator.  The  beam  then  falls  on  the  prism 
and  is  refracted  twice — once  as  it  enters  and  again  as  it  leaves  the  prism. 
Each  refraction  is  dispersive,  and  the  result  is  a series  of  images  of  the  slit, 
each  one  of  a different  wavelength.  (The  series  of  images  may  appear  as  a 
continuous  smudge  or  as  an  irregular  sequence  of  discrete  images,  de- 
pending on  the  nature  of  the  light  source.)  The  telescope  on  the  right  is 
mounted  on  a precise  angle  scale.  The  telescope  is  turned  on  an  axis  con- 
centric with  the  scale  until  its  cross  hairs  are  lined  up  with  a particular 
image  of  the  slit.  If  the  dispersive  properties  of  the  prism  material  and  the 
geometry  of  the  spectrometer  are  known,  the  wavelength  of  the  slit  image 
can  be  determined  with  considerable  accuracy.  A good  prism  spectrometer 
can  typically  determine  wavelengths  within  0.1  nm  or  better.  In  modern 
practice,  however,  other  instruments  are  used  when  the  highest  possible 
precision  is  required. 

The detailed study of the wavelength composition of the light emitted
from various sources is called spectroscopy. (In the language of Sec. 13-7,
determining the wavelength composition of light amounts to determining
its Fourier spectrum.) The devices developed over the past 150 years for
such analysis are among the most exquisitely precise instruments ever built.
Measurements of wavelength within 1 part in $10^8$ are by no means unusual,
and much higher accuracy can be obtained under special circumstances.
Spectroscopy has been (and remains) the most important single tool for
studying the interaction of light with matter on the microscopic level. It has
proved indispensable to our understanding of the physical universe.

A principal  shortcoming  of  a prism  spectrometer  lies  in  the  fact  that  no 
material  is  transparent  over  an  unlimited  range  of  wavelengths.  Most 
glasses,  for  example,  absorb  strongly  in  the  ultraviolet  and  infrared  wave- 
length regions  and  cannot  be  used  to  study  electromagnetic  radiation  in 
these regions. This difficulty is overcome by the grating spectrometer,
discussed in Sec. 28-7. The grating spectrometer has superseded the prism
spectrometer  for  all  but  a few  special  applications.  But  before  we  can  ex- 
plain how  a grating  spectrometer  works,  we  must  first  investigate  the  phe- 
nomenon of  diffraction.  We  start  this  investigation  in  Sec.  28-6. 


28-6  TWO-SLIT DIFFRACTION

One of the most striking properties of waves is their ability to bend around
corners. Figure 28-23 shows water waves doing this as they enter a calm
area through an opening in a breakwater and penetrate into regions that
are “shadowed” by the breakwater. The penetration occurs because waves
spread after they pass a barrier. The phenomenon is called diffraction,
from the Latin diffractus, meaning “broken apart.”

The  simplest  example  of  diffraction  can  be  seen  in  the  ripple-tank 
photograph  of  Fig.  28-4.  It  shows  a train  of  waves  with  straight  wave  fronts 
arriving  at  a single  slit  in  a barrier  extending  parallel  to  these  wave  fronts. 
The  slit  is  very  narrow  — its  width  is  less  than  the  wavelength  of  the  waves. 
For  this  reason,  the  slit  acts  almost  like  a single  point  source.  That  is,  the  slit 
emits a train of waves with almost circular wave fronts that spread almost
uniformly in a broad range of directions in the region beyond the barrier.
We have made use of this example to explain the physical basis of Huygens’
construction. It is so simple that there is not much more to be said about it.

Fig. 28-23  An aerial view of a breakwater. (Courtesy John S. Shelton.)

Fig. 28-24  Ripple-tank photograph of two-slit diffraction. (From PSSC Physics,
D. C. Heath, Boston, 1965. Courtesy Education Development Corporation.)

When  two  narrow  slits  are  made  in  the  barrier  in  the  ripple  tank,  there 
are  two  point  sources  emitting  trains  of  circular  wave  fronts,  one  at  each 
slit.  Each  set  of  waves  spreads  as  it  travels  beyond  the  barrier.  So  each  set 
ultimately  spreads  into  the  other.  Where  the  two  sets  of  waves  overlap,  they 
superpose  to  form  a resultant  wave.  If  the  slits  are  very  far  apart,  the  waves 
reaching  the  overlap  region  will  be  very  weak,  and  the  superposition  effects 
will  be  minor.  But  significant  effects  of  superposition  arise  if  the  slits  are 
fairly  close  together.  An  example  is  shown  in  the  ripple-tank  photograph 
of Fig. 28-24. The complex pattern in the region beyond the two narrow
slits  results  from  the  superposition  of  waves  originating  at  one  slit  with 
waves  originating  at  the  other.  We  call  the  process  responsible  for  the  pro- 
duction of  the  pattern  two-slit  diffraction.  It  is  the  subject  of  this  section. 

A distinction  in  terminology  is  sometimes  made  between  the  process  in- 
volved when  waves  pass  through  different  slits  (the  slits  having  small  width)  and 
that  involved  when  waves  pass  through  different  parts  of  the  same  slit  (the  slit 
having  appreciable  width).  This  is  done  by  calling  the  first  interference  and  the 
second  diffraction.  That  is,  the  process  in  Fig.  28-24  is  sometimes  called  two-slit 
interference  to  make  a contrast  with  the  single-slit  diffraction  process  in  Fig. 
28-4.  We  do  not  follow  this  convention  because  it  obscures  the  fundamental  sim- 
ilarity of  the  two  processes.  They  are  both  consequences  of  the  way  waves  spread 
and  the  way  they  superpose. 

Figure 28-25 is a diagram that can be used to explain the production of
a two-slit diffraction pattern, such as the one seen in Fig. 28-24. It illustrates
the situation at an instant when a linear wave front of the incident
wave arrives at the two slits, with the wave front corresponding to a crest (a
region where the displacement has a maximum positive value). Other
linear wave fronts of the incident wave are shown. They also represent
crests since the separation between adjacent ones is $\lambda$, the wavelength of
the wave. Spreading from the two slits are two sets of wave fronts. Each set
is circular because it is assumed that each originates from a single point
source at the center of the slit from which it spreads. Since every incident
wave front arrives at both of these sources simultaneously, the sources oscillate
in phase. A consequence of this fact is that for both sets of circular wave
fronts the radius of the innermost one is $\lambda$, the radius of the next is $2\lambda$, and
so forth. Halfway between any two consecutive wave fronts representing
crests of the waves are the troughs of the waves (regions where the displacement
has a maximum negative value).

Fig. 28-25  Formation of a two-slit diffraction pattern.

At the locations marked by open dots in the figure, a crest from one
set of waves superposes constructively with a crest from the other set, to
produce a resultant crest higher (more positive) than either, or else a
trough from one set superposes constructively with a trough from the other,
to produce a resultant trough lower (more negative) than either. In
between, at the locations marked by crosses, troughs superpose destructively
with crests, resulting in very small crests or troughs.

With  the  passage  of  time,  the  waves  in  each  set  move  away  from  their 
sources  at  the  slits,  troughs  following  crests  and  vice  versa.  Thus  a location 
where  constructive  superposition  has  produced  a high  crest  will  soon  be 
host  to  a deep  trough.  That  is,  the  location  will  experience  a resultant  wave 
of  large  amplitude  because  it  is  composed  of  two  combining  waves  that 
always  arrive  in  phase  with  each  other.  At  the  intermediate  locations,  how- 
ever, the  waves  from  the  two  slits  always  superpose  destructively  since  they 
always  arrive  out  of  phase  with  each  other.  As  the  crest  of  one  gives  way  to 
the  following  trough,  the  trough  of  the  other  gives  way  to  the  following 
crest.  Thus  at  these  locations  the  wave  resulting  from  the  combination  of 
the  two  waves  has  zero  amplitude  (providing  the  two  combining  waves  have 
equal  amplitudes). 

In  Fig.  28-25,  curves  connecting  adjacent  open  dots  represent  paths 
along  which  the  strongest  resultant  waves  travel.  These  paths  are  called 
lines  of  antinodes.  Similar  curves  connecting  adjacent  crosses  trace  paths 
along  which  the  amplitude  of  the  resultant  wave  is  zero.  They  are  called 
lines  of  nodes.  If  you  look  back  at  Fig.  28-24,  you  can  see  lines  of  nodes  in 
the  ripple-tank  photograph  of  two-slit  diffraction  as  narrow  bands.  The 
lines  of  antinodes  are  the  broader  bands  in  between  the  narrower  ones. 

Suppose  a screen  is  placed  behind  the  barrier  containing  the  two  slits 
and  parallel  to  it,  as  indicated  in  Fig.  28-25.  The  screen  will  be  “striped”  al- 
ternately with  lines  of  antinodes  and  lines  of  nodes.  Wherever  a line  of  an- 
tinodes strikes  the  screen,  the  resultant  wave  will  have  a maximum  ampli- 
tude, and  the  wave  action  will  be  large.  Wherever  the  screen  is  intersected 
by  lines  of  nodes,  the  resultant  wave  will  have  zero  amplitude,  and  there 
will  be  no  wave  action.  Such  regions  on  the  screen  are  called,  respectively, 
diffraction  maxima  and  diffraction  minima.  The  entire  system  of  maxima 
and  minima  on  the  screen  is  called  a diffraction  pattern.  At  a diffraction 
maximum  the  amplitude  of  the  wave  incident  on  the  screen  is  at  a max- 
imum, and  so  a large  flux  of  energy  is  delivered  to  the  screen  by  the  wave. 
At  a diffraction  minimum  the  wave  amplitude  is  zero,  and  no  energy  is 
delivered. 


The  conclusions  we  have  obtained  apply  to  waves  of  any  nature — not 
just  to  water  waves  moving  over  the  surface  of  a ripple  tank — providing 
they  can  be  represented  by  the  diagrams  we  have  drawn.  This  is  possible 
for  light  waves  with  plane  wave  fronts  incident  on  a barrier  which  is  parallel 
to  these  planes  and  contains  two  long,  narrow  slits.  With  the  slits  and  the 
screen  extending  into  and  out  of  the  plane  of  the  page  in  Fig.  28-25,  the 
straight  lines  represent  intersections  with  the  page  of  the  incident  plane 
wave  fronts,  and  the  circular  arcs  represent  the  intersections  of  the  emitted 
wave  fronts,  which  are  parts  of  cylinders. 

The diagram in Fig. 28-26, which is constructed in the plane of the
page in the preceding figures, can be used to make a quantitative analysis
of two-slit diffraction. The analysis assumes that the width of each slit is small
compared to the spacing $D$ between them. Then the distance from any point on
slit 1 to a point P on the screen is approximated closely by the distance $l_1$
from the center of the slit to P, and the same is true for slit 2. Incident
plane wave fronts move to the right at speed $v$, each arriving simultaneously
at the two slits. Thus the slits become sources 1 and 2 that oscillate
in phase. This means that two “crests” emitted by the two sources start on
their trips to P at the same time. If P were at the point marked O, so that
the path lengths $l_1$ and $l_2$ were equal, then the time intervals $t_1 = l_1/v$ and
$t_2 = l_2/v$ required for these trips would be equal and the two “crests” would
reach P at the same time. In these circumstances they would superpose
constructively at P. The same would be true of two “troughs.”

Suppose instead that path length $l_2$ is just a half-wavelength longer
than path length $l_1$. That is, suppose

$$\Delta l = l_2 - l_1 = \tfrac{1}{2}\lambda$$

Then it takes an extra time interval

$$\Delta t = \frac{\Delta l}{v} = \frac{1}{2}\,\frac{\lambda}{v}$$

for one wave to travel to P beyond what is necessary for the other. Using
the relation $\lambda/v = T$, where $T$ is the period of the wave, we can write this as

$$\Delta t = \tfrac{1}{2}T$$

That is, it takes one wave a half-period longer to travel to P than it does the
other. As a consequence, a “crest” of one wave arrives when a “trough” of the
other arrives. The waves are out of phase at their arrival at P, and so they
superpose destructively.

Fig. 28-26  A diagram used in the quantitative analysis of two-slit diffraction.
(Labels in the figure: Slit 1, Slit 2, Barrier, Screen, O, P, y, L.)

Now let us generalize the argument. Destructive superposition occurs at
any point for which the difference in the path lengths from the two slits to
the point P is an odd half-integral number of wavelengths, that is, $\tfrac{1}{2}\lambda$ or
$\tfrac{3}{2}\lambda$ or $\tfrac{5}{2}\lambda$, and so forth. This is true because one wave takes an odd
half-integral number of periods longer than the other to travel to P, and so the two
waves superpose destructively on arriving there. The condition for
destructive superposition can be written

$$\Delta l = l_2 - l_1 = \left(j + \tfrac{1}{2}\right)\lambda \tag{28-9}$$

where $j$ is one of the integers $j = 0, \pm 1, \pm 2, \pm 3, \ldots$.

Constructive superposition takes place at any point P for which the path
lengths are different by an integral number of wavelengths, because the
waves arrive in phase. The condition for constructive superposition is

$$\Delta l = l_2 - l_1 = j\lambda \tag{28-10}$$

where $j = 0, \pm 1, \pm 2, \pm 3, \ldots$. The absolute value of $j$ in these two
equations is called the order of interference.
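
The two conditions can be illustrated with a short calculation. The sketch below relies on a standard trigonometric result that is not derived in the text: two sinusoidal waves of equal amplitude $A$ arriving with a phase difference $2\pi\,\Delta l/\lambda$ combine into a wave of amplitude $2A\,|\cos(\pi\,\Delta l/\lambda)|$. The function names are hypothetical.

```python
import math

def resultant_amplitude(delta_l, wavelength, amplitude=1.0):
    """Amplitude of the sum of two equal-amplitude waves from in-phase
    sources whose paths to the point P differ in length by delta_l."""
    phase_difference = 2.0 * math.pi * delta_l / wavelength
    return 2.0 * amplitude * abs(math.cos(phase_difference / 2.0))

def classify(delta_l, wavelength, tol=1e-9):
    """Label a path difference using the two conditions in the text."""
    ratio = delta_l / wavelength
    distance = abs(ratio - round(ratio))   # distance to nearest integer
    if distance < tol:
        return "constructive"              # delta_l = j * wavelength
    if abs(distance - 0.5) < tol:
        return "destructive"               # delta_l = (j + 1/2) * wavelength
    return "intermediate"

for ratio in (0.0, 0.25, 0.5, 1.0, 1.5, -2.0):
    amp = resultant_amplitude(ratio, 1.0)
    print(f"delta_l = {ratio:5.2f} wavelengths: "
          f"amplitude = {amp:.3f}, {classify(ratio, 1.0)}")
```

Integral path differences give twice the single-wave amplitude, and odd half-integral ones give zero, provided the two combining waves have equal amplitudes.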

Where will the diffraction maxima and minima fall on the screen? It is
convenient to describe the result in terms of the angle between the paths
and the normal to the barrier containing the slits. We assume the screen is
set up a long distance $L$ from the barrier, a much longer distance than is
shown in Fig. 28-26. Specifically, we assume that the distance $L$ from the slits to
the screen is large compared to the spacing $D$ between the slits. It follows that the
paths of lengths $l_1$ and $l_2$ are very nearly parallel, so that the angles $\theta_1$ and $\theta_2$
are very nearly equal. We can therefore drop the subscripts for the angles
and write $\theta_1 = \theta_2 = \theta$, as in Fig. 28-27.
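
How quickly the two angles approach each other as the screen is moved away can be seen in a short calculation. The sketch below assumes a coordinate $y$ measured along the screen from the point O opposite the midpoint between the slits, with slit 1 at $+D/2$ and slit 2 at $-D/2$; the slit spacing and screen distances are arbitrary illustrative numbers, and the function name is hypothetical.

```python
import math

def path_angles_deg(y, L, D):
    """Angles (degrees) between the normal to the barrier and the
    straight paths from slit 1 (at +D/2) and slit 2 (at -D/2) to the
    point P at height y on a screen a distance L away."""
    theta_1 = math.degrees(math.atan2(y - D / 2.0, L))
    theta_2 = math.degrees(math.atan2(y + D / 2.0, L))
    return theta_1, theta_2

D = 1.0e-4  # slit spacing in meters (illustrative)

differences = {}
for L in (1.0e-3, 1.0e-2, 1.0):
    y = 0.1 * L   # keep P in the same general direction as L grows
    theta_1, theta_2 = path_angles_deg(y, L, D)
    differences[L] = theta_2 - theta_1
    print(f"L/D = {L / D:8.0f}: theta_1 = {theta_1:.4f} deg, "
          f"theta_2 = {theta_2:.4f} deg")
```

When $L$ is only ten times $D$ the two angles differ by several degrees, while for $L/D$ of order $10^4$ the difference falls below a hundredth of a degree, which is why a single angle $\theta$ suffices in the far-field case.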

There is another reason for specifying the condition $L \gg D$. If the superposition
at the nodes is to be nearly completely destructive, so that the amplitude at
the nodes is close to zero, the amplitudes of the two waves must be nearly equal at
the nodes. This is possible only if the lengths $l_1$ and $l_2$ of the paths from the slits to
a node on the screen are nearly equal.

When the condition L ≫ D is satisfied, the diffraction is called Fraunhofer
diffraction, after the German scientist and lens maker Joseph Fraunhofer
(1787–1826). Otherwise it is called Fresnel diffraction. The mathematical analysis
of Fraunhofer diffraction is much simpler than that of Fresnel diffraction. In
this book we treat only the Fraunhofer case.

Consider the general point P, for which the path lengths are l1 and l2.
As they are drawn in Fig. 28-27, l2 is greater than l1. But what is important
is the difference in path length, Δl = l2 − l1. Starting at P, we mark off a
length equal to l1 along the path of length l2. Since the paths are almost
parallel, this is done by dropping a perpendicular ab to the path of length l2,
starting at slit 1.

The path-length difference Δl is one side of a right triangle of which D
and ab are the other sides. The angle at the apex of that triangle is equal to
θ. This is so since D is perpendicular to the dashed lines extending toward
O, and ab is perpendicular to the paths of length l1 and l2. Consequently,
the path-length difference can be expressed in terms of the slit separation
D and the angle θ as

$$\Delta l = D \sin\theta \tag{28-11}$$

Fig. 28-27 The geometry of two-slit diffraction in a situation where
the distance L from the barrier to the screen is very much larger than
the spacing D between the slits.


This equation can be combined with either the condition for destructive
superposition, Eq. (28-9), or the condition for constructive superposition, Eq.
(28-10), to give the angles at which diffraction minima or maxima are
observed. We obtain

$$\left(j + \tfrac{1}{2}\right)\lambda = D \sin\theta \quad \text{(two-slit minima)} \tag{28-12a}$$

and

$$j\lambda = D \sin\theta \quad \text{(two-slit maxima)} \tag{28-12b}$$

In these expressions j = 0, ±1, ±2, ±3, . . . .
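The approximation Δl = D sin θ can be checked numerically. The following is a minimal sketch, assuming the slits sit at heights +D/2 and −D/2 on the barrier and that θ is measured from the midpoint between them; the function names and the numerical values of D, L, and y are illustrative choices, not taken from the text.

```python
import math

def exact_path_difference(D, L, y):
    """Exact l2 - l1 for slits at heights +D/2 and -D/2 (assumed geometry)."""
    l1 = math.hypot(L, y - D / 2)   # path from the slit nearer to P (for y > 0)
    l2 = math.hypot(L, y + D / 2)   # path from the slit farther from P
    return l2 - l1

def approx_path_difference(D, L, y):
    """Eq. (28-11): D sin(theta), theta measured from the midpoint of the slits."""
    theta = math.atan2(y, L)
    return D * math.sin(theta)

# Illustrative values only: L is several thousand times larger than D.
D = 0.204e-3   # slit separation, m
L = 1.24       # barrier-to-screen distance, m
y = 0.03       # position of P on the screen, m

exact = exact_path_difference(D, L, y)
approx = approx_path_difference(D, L, y)
relative_error = abs(exact - approx) / approx

print(f"exact l2 - l1 = {exact:.6e} m")
print(f"D sin(theta)  = {approx:.6e} m")
print(f"relative error = {relative_error:.1e}")
```

With L ≫ D the two values agree to a very small fraction of the path difference, which is why the Fraunhofer treatment can use parallel paths.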

Let us give the name y to the distance OP in Fig. 28-26, between the
center point on the screen and the arbitrary point P. Equations (28-12a)
and (28-12b) can be written in terms of y and the barrier-to-screen distance
L. We consider only values of y which are very small compared to L. Then the
angle θ will be small. We therefore have, to a very good approximation,
sin θ ≈ tan θ. Now Fig. 28-26 shows that when y ≪ L and L ≫ D, then

$$\tan\theta = \frac{y}{L}$$

Thus we have y = L tan θ ≈ L sin θ. Using Eqs. (28-12a) and (28-12b), we
obtain

$$y = \frac{L\left(j + \tfrac{1}{2}\right)\lambda}{D} \quad \text{(two-slit minima)} \tag{28-13a}$$

and

$$y = \frac{Lj\lambda}{D} \quad \text{(two-slit maxima)} \tag{28-13b}$$

where j = 0, ±1, ±2, ±3, . . . .


The separation Δy between adjacent maxima provides a way of
characterizing a particular two-slit diffraction pattern. This quantity has the value
Δy = y_{n+1} − y_n, where n is any integer and where the value of y_{n+1}, or y_n, is
found from Eq. (28-13b) by setting j equal to n + 1, or to n. Doing this, we
have

$$\Delta y = \frac{L(n + 1)\lambda}{D} - \frac{Ln\lambda}{D}$$

Performing the subtraction, we obtain the result

$$\Delta y = \frac{L}{D}\,\lambda \tag{28-14}$$




The same result is obtained for the separation between adjacent minima by
using Eq. (28-13a).

Equation (28-14) shows that the separation Δy between adjacent
maxima, or minima, in a two-slit diffraction pattern is proportional to the
wavelength λ of the waves being diffracted. And it shows that the
proportionality constant is L/D. Since L ≫ D, this constant has a very large value. Hence, the
distance Δy that characterizes the diffraction pattern is proportional to, but very
much magnified compared to, the distance λ that characterizes the waves which
produced it. This fact is turned to great advantage in the study of light waves,
whose wavelengths are inconveniently short on the commonplace scale of
things.
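The magnification described here can be made concrete with a short calculation. This is a minimal sketch, assuming illustrative values for the wavelength, D, and L (chosen to be typical of a laboratory two-slit apparatus); the function names are hypothetical.

```python
def maxima_positions(wavelength, D, L, orders):
    """Eq. (28-13b): y = L j lambda / D for each order j."""
    return [L * j * wavelength / D for j in orders]

def minima_positions(wavelength, D, L, orders):
    """Eq. (28-13a): y = L (j + 1/2) lambda / D for each order j."""
    return [L * (j + 0.5) * wavelength / D for j in orders]

# Illustrative values only.
wavelength = 589e-9   # m
D = 0.204e-3          # slit separation, m
L = 1.24              # barrier-to-screen distance, m

orders = list(range(-2, 3))          # j = -2, -1, 0, 1, 2
maxima = maxima_positions(wavelength, D, L, orders)
minima = minima_positions(wavelength, D, L, orders)

# Separation between the j = 0 and j = 1 maxima, to compare with Eq. (28-14).
spacing = maxima[3] - maxima[2]
magnification = L / D

print("maxima (mm):", [round(y * 1e3, 3) for y in maxima])
print("minima (mm):", [round(y * 1e3, 3) for y in minima])
print(f"spacing = {spacing * 1e3:.3f} mm, L/D = {magnification:.0f}")
```

For these values a wavelength of well under a micrometer produces fringes a few millimeters apart, because L/D is several thousand.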



Experimental  investigation  of  two-slit  diffraction  of  light  was  first  car- 
ried out  in  about  1800  by  Thomas  Young  (the  man  after  whom  Young's 
modulus  is  named).  It  was  Young’s  work  which  revived  interest  in  the  wave 
theory  of  light  after  a century  of  neglect.  During  the  period  ending  about 
1820,  Fresnel  in  particular  extended  Young’s  experiments  and  developed  a 
sound  theoretical  basis  for  understanding  them. 

Figure 28-28a illustrates a version of Young's experiment which is
analyzed in Example 28-5. Note that the light striking the two slits in the
second barrier is not in the form of a beam with plane wave fronts. But it is not
necessary that this be the case in order to have the two slits act like sources
which emit in phase, as we have been assuming. The single slit in the first
barrier is so narrow that all the light emitted from the slit has the same
phase as it is emitted. This light spreads by diffraction, part of it arriving at
each of the two slits. If the two slits are equidistant from the single slit, the
light arriving at one of them is in phase with the light arriving at the other.
Then the two slits will emit in phase, which satisfies our assumption.

Fig. 28-28 (a) A two-slit diffraction apparatus considered
in Example 28-5. (b) The two-slit diffraction pattern obtained
with the apparatus. Proceeding away from the center of the
pattern, in either direction, the intensity of the maxima
gradually decreases. The reason is explained at the end of
Sec. 28-8. (From F. Jenkins and H. White, Fundamentals of
Optics, 4th ed., McGraw-Hill, New York, 1976.)

In practice, the two slits will not be equidistant from the single slit to such a
small fraction of a wavelength that they act like sources which emit in phase.
Instead, the phase of one will differ from the phase of the other by a constant. This
means that Eqs. (28-13a) and (28-13b) will not be complete. A constant term
should be added to their right sides, whose value is determined by the constant
phase difference. But this has no effect on Eq. (28-14) because the constant cancels
in the subtraction leading to that equation. Thus Eq. (28-14) applies to the
diffraction pattern produced by the apparatus in Fig. 28-28a.

In many modern diffraction experiments a laser is used as a light source.
When properly adjusted, it produces light having very nearly a single wavelength
and wave fronts that are very nearly plane wave fronts (except close to their
edges). Such a source replaces the lamp, filter, and single slit in Fig. 28-28a.


EXAMPLE 28-5

The apparatus in Fig. 28-28a is used to determine the wavelength of the yellow
light emitted by hot sodium vapor. A yellow filter (a sheet of yellow transparent
plastic) placed in front of a sodium vapor lamp absorbs all the light emitted by the
lamp except that which lies in the yellow-wavelength range. By measurement with a
traveling microscope you find the separation between the two slits to be D = 0.204
mm. You measure the distance from the second barrier to the photographic film
with a meter stick and obtain the value L = 1.24 m. After exposing the film to the
two-slit diffraction pattern, you develop it and photograph it adjacent to a
centimeter scale. Then you make a print like the one displayed in Fig. 28-28b. What is the
wavelength of the yellow light?

■ You can evaluate the wavelength λ from Eq. (28-14),

$$\Delta y = \frac{L}{D}\,\lambda$$

Since the individual maxima in the diffraction pattern are not sharp, you can obtain
an accurate value of their spacing Δy by measuring the distance between two widely
separated maxima and dividing by the number of spacings between them. Choosing
the two outermost maxima which are clearly discernible (marked with arrows in Fig.
28-28b), you find the distance between them to be 7.54 cm. Since there are 21
spacings between these two maxima, you have

$$\Delta y = \frac{7.54 \times 10^{-2}\ \text{m}}{21} = 3.59 \times 10^{-3}\ \text{m}$$

According to Eq. (28-14),

$$\lambda = \frac{D\,\Delta y}{L}$$

Using the value just obtained for Δy and the values quoted for D and L, you find

$$\lambda = \frac{0.204 \times 10^{-3}\ \text{m} \times 3.59 \times 10^{-3}\ \text{m}}{1.24\ \text{m}} = 5.91 \times 10^{-7}\ \text{m} = 591\ \text{nm}$$

Precise spectrometric measurements yield a value λ = 589 nm for the wavelength of
this light.
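The arithmetic of this example can be reproduced in a few lines. The sketch below uses only the measured values quoted in the example; the variable names are illustrative, and the comparison with 589 nm is simply the fractional discrepancy between the two quoted wavelengths.

```python
# Measured quantities quoted in Example 28-5.
distance_between_outer_maxima = 7.54e-2   # m
number_of_spacings = 21
D = 0.204e-3                              # slit separation, m
L = 1.24                                  # barrier-to-film distance, m

# Average spacing between adjacent maxima.
delta_y = distance_between_outer_maxima / number_of_spacings

# Eq. (28-14) solved for the wavelength: lambda = D * delta_y / L.
wavelength = D * delta_y / L

# Fractional difference from the spectrometric value of 589 nm.
discrepancy = abs(wavelength - 589e-9) / 589e-9

print(f"delta_y    = {delta_y * 1e3:.2f} mm")
print(f"wavelength = {wavelength * 1e9:.0f} nm")
print(f"discrepancy from 589 nm = {discrepancy * 100:.1f} percent")
```

The two-slit result differs from the precise value by a fraction of a percent, which is consistent with the three-significant-figure measurements of D, L, and the fringe spacing.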


28-7 MULTISLIT DIFFRACTION

From a practical point of view, the two-slit experiment considered in
Example 28-5 is not a very good one for measuring the wavelength of light.
The diffraction maxima produced by two slits are quite broad, as you have
seen in Fig. 28-28b. This makes it difficult to determine accurately the
spacing between the maxima and limits the precision of the measurement.
The difficulty can be overcome by using a multislit apparatus instead of a
two-slit apparatus. As you will see in this section, diffraction maxima
produced by a barrier containing a large number of equally spaced slits are
very narrow.

Fig. 28-29 A multislit diffraction apparatus. The light waves incident on
the barrier have plane wave fronts that are parallel to the barrier. The
separations between all adjacent pairs of the N slits have the same value, D.
In the analysis it is assumed that the separation L between the barrier and
screen is very large compared to ND, so that the path from any slit to a
point P on the screen is essentially parallel to all other such paths.

A multislit  arrangement  is  shown  in  Fig.  28-29.  Light  waves  (or  some 
other  form  of  electromagnetic  or  mechanical  waves)  with  plane  wave  fronts 
are  incident  on  a barrier  in  a plane  parallel  to  the  wave  fronts.  The  barrier 
contains  a large  number  N of  evenly  spaced  slits,  instead  of  just  two  slits. 
The  width  of  any  slit  is,  again,  much  less  than  the  spacing  D between  adja- 
cent slits.  And  the  distance  L to  the  screen  is  much  larger  than  D,  as  before. 
Furthermore,  L is  much  larger  than  ND,  the  width  of  the  entire  array  of  slits.  This 
means that the paths of length l1, l2, l3, . . . , lN drawn from the slits to the
arbitrary  point  P on  the  screen  are  nearly  parallel  to  one  another. 

Suppose that a diffraction maximum lies at point P. Then, just as in
the two-slit case, the path difference Δl1,2 = l2 − l1 must be an integral
number of wavelengths, jλ. That is, Δl1,2 = jλ. Since the triangles abc, cde,
efg, and so forth are essentially identical because the paths are nearly
parallel, the differences between the lengths of adjacent paths are essentially
equal. To be specific,

$$\Delta l_{1,2} = \Delta l_{2,3} = \cdots = \Delta l_{N-1,N}$$

As a consequence, when the condition for constructive superposition is
satisfied for the first two paths, it is satisfied for all the paths, and a diffraction
maximum is produced.

As far as the locations of the diffraction maxima are concerned, the
argument just concluded shows that they are just the same as in a two-slit
pattern with the same value of D, because they are determined by precisely
the same condition as the one given by Eq. (28-12b) or (28-13b) for two slits.
Hence, the condition for maxima in multislit diffraction is

$$j\lambda = D \sin\theta \quad \text{(multislit maxima)} \tag{28-15}$$

or

$$y = \frac{Lj\lambda}{D} \quad \text{(multislit maxima)} \tag{28-16}$$

where j = 0, ±1, ±2, ±3, . . . . In these expressions λ is the wavelength of
the light, D is the separation between adjacent slits, θ is the angle at which a
maximum is found, y is its position on the screen, and L is the distance from
the slit system to the screen.


But  there  is  an  important  difference  in  the  amplitude  of  the  resultant 
wave  at  the  diffraction  maxima,  compared  to  what  it  is  in  the  two-slit  case. 
Since  N waves  are  constructively  superposed  at  these  points  instead  of  only 
two,  the  amplitude  is  much  greater  for  the  multislit  case. 

Another  very  important  difference  between  the  two-slit  and  multislit 
arrangements  is  less  obvious.  The  maxima  produced  by  the  multiple  slits  are 
much  sharper.  In  the  two-slit  case  there  is  a gradual  variation  in  amplitude  as 
we  move  along  the  diffraction  pattern  from  a maximum  to  an  adjacent 
minimum.  For  as  we  do  so,  the  two  sets  of  combining  waves  go  gradually 
out  of  phase,  at  first  adding  less  and  less  effectively  and  then  subtracting 
more  and  more  effectively.  But  consider  a case  where  there  are  100  slits. 
Suppose  that  the  waves  following  all  the  paths  in  Fig.  28-29  combine  in 
phase at point P, so that P is a maximum in the diffraction pattern. Let P'
be a point just far enough away from P that a wave following the path from
slit 2 will be out of phase with a wave following the path from slit 1 by 1/100
of an oscillation cycle when the waves arrive at P'. As far as these two waves
are concerned, the effect of the phase difference will be hardly noticeable.
But a wave from slit 3 will be out of phase with the wave from slit 1 by
2/100 of a cycle at P', and the wave from slit 4 will be out of phase with the
wave from slit 1 by 3/100 of a cycle at that point, and so forth. Thus the
waves from slits 1 and 51 will be a half cycle out of phase when they arrive at
P' and will superpose destructively. The same will be true of those from slits 2
and  52,  3 and  53,  and  so  on  through  slits  50  and  100.  Hence  the  waves 
from  all  100  slits  combine  to  form  a resultant  wave  of  zero  amplitude,  and 
P'  is  a minimum  of  the  diffraction  pattern.  The  maximum  at  P in  the  mul- 
tislit case  is  much  sharper  than  a maximum  in  a two-slit  case  because  the 
distance  from  P to  P'  is  much  smaller  in  the  former  than  in  the  latter,  if  a 
fair  comparison  is  made  by  keeping  the  separation  between  adjacent  slits 
the  same  in  both  cases. 

To  see  this,  consider  what  happens  in  the  two-slit  case.  If  there  are 
only  slits  1 and  2,  then  for  P'  to  be  a minimum,  the  wave  arriving  at  that 
point  from  slit  2 must  be  out  of  phase  with  the  wave  arriving  there  from  slit 
1 by  1/2  of  an  oscillation  cycle,  instead  of  1/100  of  an  oscillation  cycle.  This 
means  that  when  there  are  two  slits,  the  distance  from  P to  P'  must  be  50 
times  larger  than  it  is  when  there  are  100  slits.  To  put  it  the  other  way, 
when  there  are  100  slits,  the  distance  from  a maximum  to  the  next  min- 
imum is  smaller  than  it  is  when  there  are  two  slits  by  the  factor  1/50  = 
2/100.  You  can  easily  extend  the  argument  to  show  that  if  there  are  N slits, 
this  factor  has  the  value  2/N. 
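
The pairing argument above can be checked by adding the waves directly. The following is a minimal sketch, assuming each slit contributes a wave of unit amplitude represented by a complex phasor, and that each successive slit lags the previous one by a fixed fraction of a cycle; the function name is hypothetical.

```python
import cmath
import math

def resultant_amplitude(num_slits, phase_step_cycles):
    """Magnitude of the sum of num_slits unit phasors.

    Each successive slit is out of phase with the previous one by
    phase_step_cycles oscillation cycles (an assumed, idealized model).
    """
    total = sum(
        cmath.exp(2j * math.pi * phase_step_cycles * k)
        for k in range(num_slits)
    )
    return abs(total)

# At P all 100 waves are in phase; at P' adjacent waves differ by 1/100 cycle.
at_P = resultant_amplitude(100, 0.0)
at_P_prime = resultant_amplitude(100, 1 / 100)

# With only two slits, the same point P' is barely different from a maximum,
# and the first minimum requires a phase difference of 1/2 cycle.
two_slits_at_P_prime = resultant_amplitude(2, 1 / 100)
two_slits_first_minimum = resultant_amplitude(2, 1 / 2)

print(f"100 slits at P : {at_P:.3f}")
print(f"100 slits at P': {at_P_prime:.3e}")
print(f"  2 slits at P': {two_slits_at_P_prime:.4f}")
print(f"  2 slits, half-cycle step: {two_slits_first_minimum:.3e}")
```

The first minimum falls at a phase step of 1/N cycle for N slits and at 1/2 cycle for two slits, so the distance from a maximum to the nearest minimum is smaller by the factor (1/N)/(1/2) = 2/N.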




A comparison between two-slit and multislit diffraction patterns can be
made most conveniently in terms of a quantity called intensity. The
intensity I in a light wave, or other form of electromagnetic wave, is a
proportional measure of the energy carried by the wave past any fixed point per
unit time. That is, the intensity is proportional to the energy flux in the
wave. The significance of this quantity can be appreciated by noting that
the brightness of the light delivered by the wave to a screen intercepting the
wave, being proportional to the energy delivered per unit time, is
proportional to the intensity. In Sec. 27-5 the Poynting vector S was introduced to
provide a quantitative measure of the energy flux in an electromagnetic
wave. The intensity I is just a quantity that is proportional to ⟨S⟩, the
average of the magnitude of the Poynting vector. The intensity is used in a
more qualitative way than the Poynting vector is used.

Equation (27-30b), ⟨S⟩ = ℰy0²/(2μ0c), shows that the intensity of an
electromagnetic wave, such as a light wave, is proportional to the square of its
amplitude. [The amplitude can be that of the electric field, ℰy0, as in Eq.
(27-30b), or, just as well, the amplitude of the magnetic field.] Since the
intensity is proportional to the Poynting vector, while the Poynting vector is
proportional to the square of the amplitude, we see that the intensity of a light
wave, or other form of electromagnetic wave, is proportional to the square of its
amplitude.

Now we compare the intensity in two-slit and multislit diffraction
patterns. Figure 28-30 presents the results of calculations similar to one which
is carried out near the end of Sec. 28-8. It is a plot versus position of the
intensity I in diffraction produced by barriers with equally spaced slits, whose
equal widths are very small compared to their separation. In one case the
barrier contains two slits, and in the other it contains six slits. The
separation D between adjacent slits, the screen distance L, and the wavelength λ of
the diffracted light have the same values in both cases. The horizontal axes,
which are both drawn to the same scale, represent either the angle θ of the
diffracted light measured from the direction of the incident light or the
position y at which the light hits the screen measured from a position in line
with the center of the system of slits. (It is assumed that θ is small enough
that θ is proportional to y.) The vertical axes, which are also drawn to the
same scale in both cases, represent the intensity I of the diffracted light.

Inspection of the diffraction patterns shown in the figure illustrates
the statement made earlier about the maxima in a multislit diffraction
pattern occurring just where the maxima occur in a two-slit pattern for the
same values of D, L, and λ. The figure also shows a set of very weak
“secondary maxima” located between the “principal maxima” of the multislit
pattern. There are always N − 2 such secondary maxima between each pair of
adjacent principal maxima, where N is the number of slits. Can you explain
their origin? With increasing N the secondary maxima rapidly become so weak
that they are undetectable.

The maxima in a multislit diffraction pattern are higher than they are
in a two-slit pattern. As we have said earlier, at the maxima of an N-slit
diffraction pattern the amplitude of the resultant wave is proportional to N
because there are N combining waves which superpose in phase. Since the
intensity of any wave is proportional to the square of its amplitude, we
conclude that the peak intensity in any of the maxima, labeled h in the figure,
should be proportional to N². The proportionality h ∝ N² can be seen by
comparing the two diffraction patterns. The ratio of the values of N is
6/2 = 3/1, and the ratio of the values of the peak intensity h is 9/1 = (3/1)².
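
Both features of the comparison, the N² scaling of the peak intensity and the N − 2 secondary maxima, can be reproduced numerically. This is a minimal sketch, assuming idealized slits of negligible width so that the intensity is simply the squared magnitude of a sum of N unit phasors; it is not the calculation of Sec. 28-8, and the function names and sampling density are illustrative.

```python
import cmath
import math

def intensity(num_slits, phase_step):
    """Squared magnitude of the sum of num_slits unit phasors.

    phase_step is the phase difference (radians) between waves from
    adjacent slits; narrow, identical slits are assumed.
    """
    amplitude = sum(cmath.exp(1j * phase_step * k) for k in range(num_slits))
    return abs(amplitude) ** 2

def count_secondary_maxima(num_slits, samples=5000):
    """Count local maxima strictly between two adjacent principal maxima."""
    values = [
        intensity(num_slits, 2 * math.pi * i / samples)
        for i in range(samples + 1)
    ]
    count = 0
    for i in range(1, samples):
        # A sample is a local maximum if it exceeds its left neighbor
        # and is at least as large as its right neighbor.
        if values[i] > values[i - 1] and values[i] >= values[i + 1]:
            count += 1
    return count

peak_ratio = intensity(6, 0.0) / intensity(2, 0.0)
secondary_two_slits = count_secondary_maxima(2)
secondary_six_slits = count_secondary_maxima(6)

print(f"peak intensity ratio (N = 6 to N = 2): {peak_ratio:.1f}")
print(f"secondary maxima for N = 2: {secondary_two_slits}")
print(f"secondary maxima for N = 6: {secondary_six_slits}")
```

The phase step runs from 0 to 2π, which spans exactly one interval between adjacent principal maxima, so the principal maxima themselves sit at the ends of the range and are not counted.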




Fig. 28-30 A comparison of the diffraction patterns produced by (a) two narrow slits and (b)
six uniformly spaced, narrow slits. For both systems the slit separation D, screen distance L,
and wavelength λ have the same values. The intensity I is plotted versus the diffraction angle
θ, or versus the position y on the screen. Also shown are the “heights” h of the maxima and
their “widths” w.


The  figure  also  illustrates  the  conclusion,  drawn  earlier,  that  the 
spacing  between  a maximum  and  the  nearest  minimum  will  be  smaller  in  a 
diffraction  pattern  for  N slits  than  in  a diffraction  pattern  for  two  slits  by 
the factor 2/N. In this case we can make such a comparison for N = 6. The
predicted  factor  has  the  value  2/6  = 1/3.  Inspection  of  the  figure  shows  that 
it  agrees  with  the  prediction. 

As  the  number  N of  slits  increases,  the  spacing  from  a maximum  to  the 
nearest  minimum  decreases  in  inverse  proportion.  Since  the  width  of  a 
maximum  is  determined  by  the  spacing  to  the  nearest  minimum,  the  width 
must  have  the  same  dependence  on  N as  the  spacing.  The  quantity  labeled 
w in  Fig.  28-30  frequently  is  used  to  specify  the  width  of  a maximum.  Its 
full  name  is  the  half-width  at  half-maximum  intensity,  but  often  it  is  called 
simply the width. The maxima in a multislit diffraction pattern become
sharper with increasing N because the width w of a multislit diffraction
maximum decreases in inverse proportion to the number N of slits:

$$w \propto \frac{1}{N} \tag{28-17}$$




A multislit  device  of  the  sort  just  discussed,  having  many  slits  and  thus 
a large  value  of  N,  is  called  a diffraction  grating.  The  spacing  D between  its 
slits is called the grating spacing. A crude diffraction grating was first
constructed and described in 1786 by the U.S. instrument maker and optician
David Rittenhouse. Beginning with the work of Rowland about 1880, a
tremendous amount of meticulous effort has been put into the manufacture
of  highly  uniform  diffraction  gratings  with  many  slits  and  small  grating 
spacing  for  use  in  spectroscopy. 

If light of mixed wavelengths strikes a diffraction grating, every
different wavelength produces a different diffraction pattern whose maxima
lie at angles θ to the incident beam satisfying Eq. (28-15), jλ = D sin θ.
Solving this equation for sin θ gives

$$\sin\theta = \frac{j\lambda}{D} \quad \text{(grating maxima)} \tag{28-18}$$

where j = 0, ±1, ±2, ±3, . . . . That is, the wavelengths comprising the light
are spread out into spectra (one spectrum for each value of j). (In the
terminology of Sec. 13-7, a diffraction grating acts as a Fourier analyzer for
complex waveforms of light. Do you see that this is so?)
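
The way a grating sorts wavelengths into spectra can be illustrated by evaluating Eq. (28-18) for two wavelengths. This is a minimal sketch; the grating spacing (5000 lines per centimeter) and the two wavelengths are hypothetical values chosen for illustration, and the function name is not from the text.

```python
import math

def grating_angles(wavelength, D, max_order):
    """Angles (radians) of the maxima from Eq. (28-18), keyed by order j.

    Orders for which |j * wavelength / D| exceeds 1 have no solution
    and are omitted.
    """
    angles = {}
    for j in range(-max_order, max_order + 1):
        sin_theta = j * wavelength / D
        if abs(sin_theta) <= 1.0:
            angles[j] = math.asin(sin_theta)
    return angles

# Hypothetical grating and wavelengths, for illustration only.
D = 2.0e-6                 # grating spacing, m (5000 lines per centimeter)
blue = grating_angles(450e-9, D, max_order=4)
red = grating_angles(650e-9, D, max_order=4)

for j in sorted(blue):
    blue_deg = math.degrees(blue[j])
    red_text = f"{math.degrees(red[j]):7.2f}" if j in red else "   none"
    print(f"j = {j:2d}: blue {blue_deg:7.2f} deg, red {red_text} deg")
```

In the zero order both wavelengths emerge at θ = 0 and are not separated; in every other order the longer wavelength is diffracted through the larger angle, and the highest orders exist only for the shorter wavelength.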

The  making  of  practical  diffraction  gratings  involves  ruling  a large 
number  of  parallel,  evenly  spaced  grooves,  called  lines,  on  a plate  of  glass 
or  other  backing  with  a scribing  tool.  In  a transmission  grating,  light  passes 
through  the  spaces  between  the  grooves,  but  less  well  or  not  at  all  through 
the  grooves  themselves.  The  clear  spaces  thus  become  the  slits.  In  the 
more  common  reflection  grating,  the  glass  plate  is  covered  with  a thin  layer 
of reflective metal, in which the lines are ruled. Light is then reflected
from  the  undisturbed  spaces  between  the  lines,  producing  a diffraction 
pattern  just  like  that  of  a transmission  grating.  One  advantage  of  the  reflec- 
tion grating  is  that  it  can  be  used  for  light  in  wavelength  ranges  which  are 
absorbed  by  most  materials.  Another  advantage  is  that  the  grating  can  be 
ruled  on  a spherical  or  paraboloidal  surface,  thus  eliminating  the  need  for 
lenses  for  focusing  purposes. 

Three  criteria  must  be  considered  in  making  a good  diffraction 
grating: 

1.  The  grating  should  have  as  many  lines  as  possible.  This  increases 
the  amount  of  light  that  can  pass  through  the  system.  More  important,  it 
sharpens  the  maxima,  thereby  increasing  the  resolving  power — the  ability 
of  the  system  to  separate  components  of  nearby  wavelength  in  a beam  of 
light. 

2.  The  lines  should  be  as  close  together  as  possible.  This  increases  the 
dispersion  — the  angle  through  which  light  of  a given  wavelength  is  dif- 
fracted. (Compare  this  use  of  the  word  “dispersion”  with  the  similar  use  in 
Sec.  28-5.) 

3.  The  spacing  of  the  lines  must  be  exceedingly  uniform.  Otherwise, 
the  angle  of  diffraction  for  a given  wavelength  of  light  will  be  different  for 
different  parts  of  the  grating,  and  the  spectrum  will  be  fuzzy  and  poorly 
defined. 

A good  modern  diffraction  grating  may  have  as  many  as  12,000 
exceedingly  uniform  lines  per  centimeter  of  width,  with  a total  of  as  many 
as 50,000 lines. The manufacture of such a master grating is a highly exacting
task, involving the most sophisticated tools and control systems at the
disposal  of  the  mechanical  engineer.  Once  such  a master  has  been  made, 
reasonably  good  (and  much  cheaper)  replica  gratings  can  be  made  by 
using  the  master  as  a mold  for  producing  castings  of  some  plastic  material. 

It  is  not  at  all  necessary  to  use  special  equipment  to  see  diffraction  effects  with 
light.  Fine  gauze  fabrics,  such  as  nylon  chiffon  curtain  material,  will  produce 
clear  diffraction  effects  if  you  look  through  them  at  a small  bright  object.  A 
mercury-vapor  streetlight  at  night  is  a particularly  good  light  source.  Since  the 
weave  pattern  of  the  fabric  is  square  or  rectangular,  the  cloth  acts  like  two  diffrac- 
tion gratings  crossed.  The  result  is  a cross-shaped  diffraction  pattern.  The  effec- 
tive grating  spacing  can  be  changed  easily  by  looking  through  the  cloth  obliquely. 
This  changes  the  spacing  of  the  diffraction  pattern  in  one  direction,  but  not  the 
other. 

The  resolving  power  of  a good  diffraction  grating  is  illustrated  in  Ex- 
ample 28-6. 


EXAMPLE  28-6 

a. A diffraction grating has exactly 12,000 lines in its 1-cm width. Yellow light
from a sodium-vapor lamp falls on the grating perpendicular to its surface. The
wavelength is 589 nm. (See Example 28-5.) At what angle will the first-order
diffraction maximum be found, that is, the diffraction maximum for which j = 1?

■ Setting j = 1 in Eq. (28-18), you have

$$\sin\theta = \frac{\lambda}{D}$$

Thus the angle is

$$\theta = \sin^{-1}\left(\frac{\lambda}{D}\right)$$

If you express the value of λ in meters and set D equal to the reciprocal of the
number of lines per meter, you obtain

$$\theta = \sin^{-1}\left[\frac{589 \times 10^{-9}\ \text{m}}{1/(1.2000 \times 10^{6}\ \text{m}^{-1})}\right] = 0.785\ \text{rad} = 45.0^\circ$$

b. The yellow sodium light actually consists of two components, whose
wavelengths are 588.995 nm and 589.592 nm. For the same diffraction grating, find the
angular spacing between the first-order maxima produced by the two components.

■ You have for one component

$$\theta_1 = \sin^{-1}\left[\frac{588.995 \times 10^{-9}\ \text{m}}{1/(1.2000 \times 10^{6}\ \text{m}^{-1})}\right] = 0.78496\ \text{rad}$$

For the other you have

$$\theta_2 = \sin^{-1}\left[\frac{589.592 \times 10^{-9}\ \text{m}}{1/(1.2000 \times 10^{6}\ \text{m}^{-1})}\right] = 0.78597\ \text{rad}$$

Subtracting gives you the angular separation

$$\Delta\theta = \theta_2 - \theta_1 = 0.00101\ \text{rad} = 0.0578^\circ = 3.47\ \text{minutes of arc}$$




Alternatively, you can differentiate λ = D sin θ with respect to θ, to obtain

$$\frac{d\lambda}{d\theta} = D \cos\theta$$

or

$$d\lambda = D \cos\theta \, d\theta$$

Solving for dθ gives

$$d\theta = \frac{d\lambda}{D \cos\theta}$$

Setting dθ and dλ equal to the small but finite differences Δθ and Δλ, you have, to a
good approximation,

$$\Delta\theta = \frac{\Delta\lambda}{D \cos\theta}$$

Using the value of θ found in part a, you calculate

$$\Delta\theta = \frac{589.592 \times 10^{-9}\ \text{m} - 588.995 \times 10^{-9}\ \text{m}}{[1/(1.2000 \times 10^{6}\ \text{m}^{-1})] \times \cos 45.0^\circ} = 1.01 \times 10^{-3}\ \text{rad}$$

This agrees with the preceding calculation of Δθ. ■
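
The two routes to Δθ in parts a and b can be compared directly. The sketch below uses only the numbers quoted in the example; the variable names are illustrative.

```python
import math

# Grating with 12,000 lines per centimeter, as in Example 28-6.
lines_per_meter = 1.2000e6
D = 1.0 / lines_per_meter          # grating spacing, m

# Part a: first-order angle for 589 nm, from sin(theta) = lambda / D.
theta_a = math.asin(589e-9 / D)

# Part b, direct method: evaluate each angle and subtract.
lambda_1 = 588.995e-9
lambda_2 = 589.592e-9
theta_1 = math.asin(lambda_1 / D)
theta_2 = math.asin(lambda_2 / D)
delta_direct = theta_2 - theta_1

# Part b, differential method: delta_theta = delta_lambda / (D cos(theta)).
delta_differential = (lambda_2 - lambda_1) / (D * math.cos(theta_a))

print(f"theta (part a)       = {theta_a:.3f} rad = {math.degrees(theta_a):.1f} deg")
print(f"delta theta, direct  = {delta_direct:.2e} rad")
print(f"delta theta, d/dtheta = {delta_differential:.2e} rad")
```

The two values of Δθ agree to three significant figures because Δλ is only about one part in a thousand of λ, so the linear approximation is very good.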

c.  Estimate  the  angular  widths  of  these  two  maxima.  Then  predict  whether 
they  will  be  resolved. 

■ Turning back to Fig. 28-30, you see that for a “grating” with N = 6 the
width (that is, its half-width at half-maximum intensity) of a maximum is 1/12 of the
spacing between adjacent maxima. This spacing does not depend on N, while Eq.
(28-17) shows that the width is inversely proportional to N. Therefore you can say
that for the grating with N = 12,000 the width of a maximum will be about
(1/12)(6/12,000) of the spacing, or about 1/24,000 of the spacing.

For the case at hand, you are dealing with the angular spacing between
adjacent maxima for the same wavelength. It is comparable to the angle θ = 0.785 rad
evaluated in part a. (Justify this to yourself.) Thus you can estimate the width of a
maximum to be about

$$\frac{1}{24{,}000} \times 0.8\ \text{rad} = 3 \times 10^{-5}\ \text{rad}$$

Since this angular width is nearly two orders of magnitude smaller than the angular
separation Δθ = 1 × 10⁻³ rad between the adjacent maxima for the two
wavelengths, the maxima will be very well resolved.

It should also be said that there is no difficulty in measuring angles with an
accuracy much better than 1 × 10⁻³ rad if precision equipment is available. ■
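
The scaling estimate of part c can be written out as a short calculation. This is a rough sketch that follows the example's own reasoning: it takes the N = 6 width as 1/12 of the spacing (the value read from Fig. 28-30) and scales it as 1/N according to Eq. (28-17). The function name and its default arguments are illustrative.

```python
def estimated_width(spacing, num_slits, reference_slits=6,
                    reference_fraction=1 / 12):
    """Width of a principal maximum, scaled from the N = 6 case as 1/N.

    reference_fraction is the width-to-spacing ratio for reference_slits
    slits, as read from Fig. 28-30 in the example.
    """
    return spacing * reference_fraction * reference_slits / num_slits

angular_spacing = 0.8        # rad, rough spacing between adjacent orders
separation = 1.01e-3         # rad, separation of the two sodium components

width = estimated_width(angular_spacing, 12000)
ratio = separation / width

print(f"estimated width = {width:.1e} rad")
print(f"separation / width = {ratio:.0f}")
```

The separation exceeds the estimated width by a factor of a few tens, so the two components appear as clearly distinct maxima.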


28-8 SINGLE-SLIT DIFFRACTION

When a beam of light strikes a barrier pierced by a slit which allows only
part of the beam to pass, the light beyond the barrier no longer propagates
only in the direction of incidence of the beam. Instead it fans out over a
range of directions. You can understand the principal feature of what
happens, in a qualitative way, by considering the Huygens construction of
Fig. 28-31. On the left of the barrier are wave fronts propagating to the
right, which describe the beam of light incident on the barrier. The beam is
so broad that only the central portions of the wave fronts can be indicated
in the figure. These portions are planes normal to their direction of
propagation. The barrier, which contains a single slit, restricts the passage
of the plane wave fronts by allowing only the parts incident on the slit to
pass into the region on the right of the barrier. The restricted wave fronts
continue propagating to the right, but not as parts of planes.

Although  it  is  possible  to  form  a wave  front  of  restricted  extent  which 
is  a plane  over  its  complete  extent,  the  wave  front  cannot  remain  thus  as  it 
propagates.  The  explanation  of  the  physical  basis  of  Huygens’  construction 
at  the  end  of  Sec.  28-1  shows  that  the  propagation  of  wave  fronts  is  deter- 
mined by  the  way  wavelets  tend  to  spread  and  by  the  way  they  superpose.  In 
the  central  part  of  a restricted  plane  wave  front,  both  factors  operate  and 
that  part  remains  a plane  as  it  propagates.  But  at  an  edge  superposition 
plays  no  role  since  there  are  no  wavelets  beyond  the  edge  with  which  the 
wavelets  at  the  edge  can  superpose.  Hence  the  behavior  of  the  wave  front 
at  its  edges  is  governed  by  the  tendency  of  wavelets  to  spread.  The  result  is 
that  a plane  wave  front  of  restricted  extent  begins  immediately  to  “curl”  at 
its  edges.  Before  the  wave  front  has  gone  very  far  from  the  restricting  bar- 
rier, a significant  part  of  it  is  no  longer  in  the  shape  of  a plane.  Then  the 
wave  front  describes  not  light  propagating  in  a particular  direction,  but 
light  propagating  in  a range  of  directions.  This  process  is  known  as 
single-slit  diffraction. 


Fig.  28-31  Huygens’  construction  for 
single-slit  diffraction  from  a slit  whose 
width  is  3 times  the  wavelength  of  the 
incident  waves. 


A single-slit  diffraction  apparatus  is  illustrated  schematically  in  Fig. 
28-32a.  Light  traveling  through  the  slit  is  diffracted  into  a range  of  direc- 
tions to  form  the  single-slit  diffraction  pattern  shown  in  Fig.  28-32c.  Most 
of  the  light  goes  into  the  broad  “central  maximum”  of  the  pattern.  But 
some  of  it  goes  into  the  weaker  “auxiliary  maxima”  lying  outside  the  central 
maximum. 

To  obtain  a complete  understanding  of  this  diffraction  pattern,  we 
must  do  more  than  give  qualitative  consideration  to  a Huygens  construc- 
tion. We  must  analyze  quantitatively  the  phase  relations  of  combining 
waves,  just  as  we  have  done  for  two-slit  and  multislit  diffraction.  But  in  con- 
trast to  our  earlier  diffraction  analyses,  where  we  could  treat  each  slit  as  if 
its  width  were  negligible  compared  to  the  spacing  between  slits,  here  we 
must  take  into  account  the  fact  that  the  slit  has  an  appreciable  width.  In 
single-slit  diffraction  there  is  no  slit  spacing  to  which  the  slit  width  can  be 
compared  and  deemed  negligible. 

The  slit  of  width  d can  be  divided  into  a large  number  of  equal  regions, 
say  the  100  regions  indicated  in  Fig.  28-32a.  Each  of  these  can  be  consid- 
ered a sourcelet  of  a train  of  wavelets  which  spreads  into  the  region  to  the 
right  of  the  slit  (as  in  the  first  step  of  a Huygens  construction).  Since  all 
parts  of  a wave  front  incident  on  the  slit  arrive  at  these  regions  simulta- 
neously, the  sourcelets  are  oscillating  in  phase.  Hence  the  wavelet  trains  the 
sourcelets  emit  are  in  phase  when  they  start  their  trip  to  the  point  on  the 
screen  labeled  P.  But,  in  general,  they  will  not  be  in  phase  when  they  arrive 
at  P since  the  paths  on  which  they  travel  from  their  sourcelets  to  P are  of 
different  length.  For  instance,  the  figure  shows  that  the  path  from  source- 
let  51  to  P has  a length  which  is  greater  than  that  of  the  path  from  source- 
let  1 to  P by  the  amount 

$$\Delta l = \frac{d}{2} \sin\theta$$


with angle $\theta$ defined as in the figure. We assume that the separation L between
the barrier and the screen is large compared to the width d of the slit so that all the
paths are very nearly parallel. Then the same $\Delta l$ will be the difference in
length of the paths to P from sourcelet 52 and sourcelet 2, from sourcelet
53 and sourcelet 3, and so forth.

Fig. 28-32 (a) A single-slit diffraction apparatus. The light waves incident on
the barrier have plane wave fronts that are parallel to the barrier. In the analysis
it is assumed that the separation L between the barrier and the screen is
very large compared to the slit width d. Therefore the path from any part of the
slit to a point P on the screen is essentially parallel to all other such paths.
This figure is used in the text to determine the condition for the first minimum
of the diffraction pattern. Here the convention of representing sourcelets
as grey dots cannot be used because the dots would be too small to be
visible. Instead, each white space between adjacent black dashes represents
a sourcelet. (b) A figure used to determine the condition for the second
minimum. (c) The single-slit diffraction pattern obtained with the apparatus.
(Courtesy Education Development Corporation.)

Now consider what happens if this path-length difference equals half a
wavelength, that is, if $\Delta l = \lambda/2$. Then the wavelet train from sourcelet 51
will be out of phase with the wavelet train from sourcelet 1 on arriving at P.
This means that the wavelets from the pair of sourcelets will superpose
destructively. Furthermore, the same will be true of the wavelets from
sourcelet 52 and sourcelet 2, from sourcelet 53 and sourcelet 3, and so on.
Thus the wavelets will cancel in pairs, and there will be no resultant wave
arriving at point P. We have found the condition for the first minimum in
the single-slit diffraction pattern. It is


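$$\Delta l = \frac{\lambda}{2}$$

$$\frac{d}{2} \sin\theta = \frac{\lambda}{2}$$

or

$$\sin\theta = \frac{\lambda}{d} \qquad \text{(single-slit first minimum)} \tag{28-19}$$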

If $\lambda$ is small compared to d, as is usually true, then $\sin\theta$ will be small
compared to 1 and approximately equal to $\theta$, measured in radians. In these
circumstances we have, to good approximation,

$$\theta \simeq \frac{\lambda}{d} \quad \text{for } \lambda \ll d \qquad \text{(single-slit first minimum)} \tag{28-20}$$

The "size" $\theta$ of a single-slit diffraction pattern is given approximately by the ratio of
the wavelength $\lambda$ of the waves that are being diffracted and the width d of the slit
which is diffracting them.
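
The quality of the small-angle form in Eq. (28-20) can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `first_minimum_angle` and the sample wavelength and slit widths are illustrative assumptions. It compares the exact angle from Eq. (28-19) with the ratio of wavelength to slit width.

```python
import math

def first_minimum_angle(wavelength, slit_width):
    """Exact angle (rad) of the first minimum, from sin(theta) = wavelength / slit_width."""
    ratio = wavelength / slit_width
    if ratio > 1.0:
        raise ValueError("no minimum exists when the wavelength exceeds the slit width")
    return math.asin(ratio)

wavelength = 500e-9  # m, illustrative value (assumption, not from the text)
for slit_width in (1e-6, 1e-5, 1e-4):  # m, illustrative values (assumption)
    exact = first_minimum_angle(wavelength, slit_width)
    approx = wavelength / slit_width
    print(f"d = {slit_width:.0e} m: exact = {exact:.6f} rad, lambda/d = {approx:.6f} rad")
```

As the slit becomes wide compared to the wavelength, the two columns should agree to more and more digits, which is the sense in which Eq. (28-20) holds "to good approximation."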

We have found the condition for the first minimum in the single-slit
diffraction pattern by considering the slit as composed of two halves. To
find the next minimum, we consider it to be composed of four quarters.
Figure 28-32b shows paths to P from sourcelets at the top of the first and
second quarters. The difference in the length of these paths has the value
$(d/4) \sin\theta$. If this path-length difference is half a wavelength, the wavelets
following the two paths will superpose destructively at P. The same will be
true of the wavelets following paths to P from the sourcelets at the top of
the third and fourth quarters of the slit. Furthermore, the wavelets from
the sourcelets just below the top of the first and second quarters will superpose
destructively at P, as will the wavelets from the sourcelets just below
the top of the third and fourth quarters, and so forth. Thus another condition
for a minimum is


$$\frac{d}{4} \sin\theta = \frac{\lambda}{2}$$

or

$$\sin\theta = 2\,\frac{\lambda}{d}$$

By repeating the argument with the slit considered to be composed of
sixths, eighths, tenths, and so on, we find that the general condition for an
intensity minimum in the single-slit diffraction pattern is

$$\sin\theta = j\,\frac{\lambda}{d} \qquad \text{(single-slit minima)} \tag{28-21}$$

where j = ±1, ±2, ±3, . . . . Negative values of the integer j have been
included in this expression so that it will describe the negative angles locating
the diffraction minima below the central point O as well as the positive
angles locating the minima above that point.

Fig. 28-33 (a) Phasor diagram for single-slit diffraction, treating the slit as
if it were composed of 10 equal sourcelets. As time passes, the pattern maintains
a fixed shape. But it rotates counterclockwise at the angular rate
$\omega = 2\pi\nu$, where $\nu$ is the frequency of the light, with the base of phasor 1
remaining at the coordinate origin. (b) The same as in part a, except treating
the slit as if it were composed of an infinite number of equal infinitesimal
sourcelets.
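
The minima condition of Eq. (28-21) lends itself to a short tabulation. The sketch below is an illustration only: the function name `minima_angles` and the numerical values are assumptions chosen for the example. It lists every angle for which the sine stays within its allowed range.

```python
import math

def minima_angles(wavelength, slit_width):
    """Angles (rad) of all minima allowed by sin(theta) = j * wavelength / slit_width."""
    angles = []
    j = 1
    # Stop as soon as j * lambda / d would exceed 1; no real angle exists beyond that.
    while j * wavelength / slit_width <= 1.0:
        theta = math.asin(j * wavelength / slit_width)
        angles.extend([-theta, theta])  # j negative and j positive
        j += 1
    return sorted(angles)

# Illustrative values (assumptions): 600 nm light, slit 2 micrometers wide.
angles = minima_angles(wavelength=600e-9, slit_width=2.0e-6)
print([round(math.degrees(a), 1) for a in angles])
```

With these sample values the ratio of wavelength to slit width is 0.3, so only j = ±1, ±2, ±3 give minima. A narrower slit produces fewer minima, spread over larger angles.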

Now,  for  the  first  time  in  our  study  of  diffraction,  we  will  derive  an  ex- 
pression giving  the  complete  “shape”  of  a diffraction  pattern.  To  be  specif- 
ic, we  will  find  the  intensity  I at  an  arbitrary  point  P on  a screen  located  at 
distance  L behind  a single  slit  of  width  d.  We  assume  L to  be  large  compared  to 
d so  that  all  paths  from  slit  to  screen  which  need  be  treated  are  nearly  par- 
allel and  of  nearly  the  same  length.  We  also  assume  the  light  striking  the  slit 
has  plane  wave  fronts  which  are  parallel  to  the  barrier  containing  the  slit. 
We  consider  the  amplitudes  of  all  the  wavelet  trains  emitted  from  the  equiva- 
lent sourcelets  to  be  equal  on  arriving  at  P.  Amplitude  is  not  a very  sensitive 
function  of  path  length,  and  all  the  path  lengths  are  nearly  the  same.  But 
phase  is  an  extremely  sensitive  function  of  path  length  since  it  goes 
through  a complete  cycle  of  change  in  one  wavelength.  So  we  must  take 
into  account  the  fact  that  the  phases  of  the  arriving  wavelet  trains  are  not 
equal.  We  do  this  by  using  phasors,  developed  in  Sec.  26-7,  to  represent  the 
wavelet  trains  arriving  at  P with  the  same  amplitudes  but  different  phases. 

As  a first  approximation,  let  us  divide  the  slit  into  10  equal  segments. 
Each  of  these  approximates  a sourcelet,  and  each  emits  in  phase  with  the 
others.  Consider  an  instant  at  which  the  wavelet  train  arriving  at  P from  the 
sourcelet  1 in  the  first  segment  can  be  represented  by  phasor  1 directed 
along  the  x axis  in  Fig.  28-33a. 

Since  the  path  to  point  P from  sourcelet  2 in  the  second  segment  is 
slightly  longer  than  that  from  sourcelet  1,  a wavelet  from  sourcelet  2 arriv- 
ing at  P at  the  same  instant  as  a wavelet  from  sourcelet  1 must  have  left  the 
slit  slightly  earlier.  That  is,  the  wavelet  train  from  sourcelet  2 replicates 
what  was  happening  at  the  slit  a little  earlier,  so  that  its  phase  relative  to  the 
wavelet  train  from  sourcelet  1 is  negative.  This  is  represented  by  phasor  2 
in  the  figure. 

The  wavelet  train  arriving  simultaneously  at  P from  sourcelet  3 left  the 
slit  still  earlier  and  is  represented  by  the  phasor  3,  and  so  on  for  the  10 
phasors. 

The amplitude of the total light wave at point P at this moment is given
by the magnitude A of the large phasor, shown in Fig. 28-33a, which extends
from the base of phasor 1 to the tip of phasor 10. While the directions
of small phasors 1 through 10 vary in time, their relative phases remain
constant. Thus the phasor which is their resultant rotates about the
origin at some angular rate $\omega$, but its magnitude A does not change.

It is only a rough approximation to divide the slit into 10 finite segments
and to assume that all parts of the wavelet trains emanating from
all parts of a given finite segment are in phase on arriving at P, so that a
single phasor can represent all these wavelet trains. But it is not necessary
to make this approximation. Instead we can divide the slit into an infinite
number of segments of infinitesimal width. Each of these segments is a true
sourcelet. With each there is associated a phasor. The vector sum of these
infinitesimal phasors is found just as in Fig. 28-33a. But now the phasors
form the arc of a circle rather than a part of a polygon. This is shown in
Fig. 28-33b. The radii of the circle which intersect the two ends of this arc
subtend an angle $\Delta$. The angle $\Delta$ is the phase difference between the
wavelet trains arriving at P from the extreme ends of the slit. (This is so
because one of the radii is perpendicular to the first phasor, and the other
is perpendicular to the last phasor.)

To find the magnitude A of the resultant phasor, which represents the
amplitude of the resultant light wave at P, we bisect the angle $\Delta$. From Fig.
28-33b, it can be seen that A/2 is given by the trigonometric relation

$$\frac{A}{2} = R \sin\frac{\Delta}{2}$$

If the length of the circular arc is C and its radius is R, the angle $\Delta$ is given
in radians by the expression $\Delta = C/R$. Thus

$$R = \frac{C}{\Delta}$$

We therefore have

$$A = C\,\frac{\sin(\Delta/2)}{\Delta/2} \tag{28-22}$$
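
The passage from a polygon of finite phasors to a circular arc can be illustrated numerically. The following sketch is not from the text: the function names `polygon_amplitude` and `arc_amplitude` are hypothetical, and the arc length C is set to 1 for convenience. It adds N equal phasors whose phases are spread evenly over a total phase difference $\Delta$ and compares the magnitude of the sum with Eq. (28-22).

```python
import cmath
import math

def polygon_amplitude(total_phase, n_segments, arc_length=1.0):
    """Magnitude of the sum of n equal phasors with phases spread evenly over total_phase."""
    step = total_phase / n_segments
    small = arc_length / n_segments  # length of each small phasor
    # Phases are taken negative (later sourcelets lag); the sign does not change the magnitude.
    total = sum(small * cmath.exp(-1j * (k + 0.5) * step) for k in range(n_segments))
    return abs(total)

def arc_amplitude(total_phase, arc_length=1.0):
    """Magnitude of A = C sin(Delta/2) / (Delta/2), with the limit A = C at Delta = 0."""
    half = total_phase / 2.0
    return arc_length if half == 0.0 else abs(arc_length * math.sin(half) / half)

delta = math.pi  # illustrative phase difference across the slit (assumption)
for n in (10, 100, 1000):
    print(n, polygon_amplitude(delta, n), arc_amplitude(delta))
```

The 10-segment polygon should already lie within about half a percent of the arc result for this choice of $\Delta$, and the agreement improves as the number of segments grows.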


As noted above, the quantity $\Delta$ in this equation is the difference in
phase of arriving wavelet trains which have traveled to P along paths from
the extreme ends of the slit. Its value equals the product of $2\pi/\lambda$ and the
difference in the lengths of these paths. This extreme path-length difference
is twice the difference in the lengths of the paths to P from one end
of the slit and from its center. That is, it is twice the quantity $(d/2) \sin\theta$
shown in Fig. 28-32a. Thus the phase difference $\Delta$ has the value

$$\Delta = \frac{2\pi}{\lambda}\, d \sin\theta \tag{28-23}$$

where $\lambda$ is the wavelength of the light, d is the width of the slit, and $\theta$ is the
angle defined in Fig. 28-32a to specify the location of P.

The quantity C in Eq. (28-22) does not depend on the phase difference
$\Delta$. (The value of C is determined entirely by the amount of light striking
the slit per unit time.) Hence we can express the dependence of the amplitude
A of the resultant wave at P on the phase difference $\Delta$ by the
proportionality

$$A \propto \frac{\sin(\Delta/2)}{\Delta/2}$$


The intensity I of the resultant wave, being proportional to the square
of its amplitude A, obeys the proportionality

$$I \propto \left[\frac{\sin(\Delta/2)}{\Delta/2}\right]^2$$

Let us use the symbol $I_0$ to represent the maximum value of the intensity I.
That is, $I_0$ is the value of I at the peak of the central maximum of the
diffraction pattern, where $\theta = 0$. Then the proportionality immediately above
can be written as the equality

$$I = I_0 \left[\frac{\sin(\Delta/2)}{\Delta/2}\right]^2 \tag{28-24a}$$

Substituting the value of $\Delta$ given in Eq. (28-23), we can write the intensity I
at point P as an explicit function of its angular location $\theta$. We have

$$I = I_0 \left\{\frac{\sin[(\pi d/\lambda) \sin\theta]}{(\pi d/\lambda) \sin\theta}\right\}^2$$

This  function  specifies  the  “shape”  of  the  single-slit  diffraction  pattern. 
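
The intensity function can be evaluated directly. The sketch below is a minimal illustration, with the function name `relative_intensity` and the sample wavelength and slit width chosen as assumptions. It returns the ratio of I to its central value for a given angle, handling the central point separately because the expression becomes 0/0 there.

```python
import math

def relative_intensity(theta, wavelength, slit_width):
    """I / I0 for a single slit: [sin(x)/x]^2 with x = (pi * d / lambda) * sin(theta)."""
    x = math.pi * slit_width / wavelength * math.sin(theta)
    if x == 0.0:
        return 1.0  # limiting value at the center of the pattern
    return (math.sin(x) / x) ** 2

wavelength = 500e-9  # m, illustrative value (assumption)
slit_width = 5e-6    # m, illustrative value (assumption)
for degrees in (0, 2, 4, 6, 8, 10):
    theta = math.radians(degrees)
    print(f"{degrees:2d} deg: I/I0 = {relative_intensity(theta, wavelength, slit_width):.4f}")
```

For these sample values the first minimum lies where $\sin\theta = 0.1$, a little less than 6°, so the printed intensities should fall steeply over the first few rows and remain small afterward.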

Equation (28-24a) is plotted in Fig. 28-34. You should compare the
plot to the photograph of a single-slit diffraction pattern in Fig. 28-32c.

Fig. 28-35 Phasor diagrams for critical features of the single-slit
diffraction pattern. Central maximum: $\Delta = 0$, or $\Delta/2 = 0$. First minimum:
$\Delta = 2\pi$, or $\Delta/2 = \pi$. First auxiliary maximum: $\Delta \simeq 3\pi$, or $\Delta/2 \simeq 3\pi/2$.

In Fig. 28-35 phasor diagrams are constructed for the central maximum
($\Delta/2 = 0$), the first minimum ($\Delta/2 = \pi$), and the first auxiliary maximum
($\Delta/2 \simeq 3\pi/2$). In all three cases the length of the curve extending
along the infinitesimal phasors has the same value, C, as explained above.
The diagrams make it apparent why the first auxiliary maximum is much
weaker than the central maximum. Why is the value of $\Delta/2$ a little less than
$3\pi/2$ for the first auxiliary maximum? What does the phasor diagram look
like for the second auxiliary maximum?
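
The position of the first auxiliary maximum can be located numerically. The following sketch is an illustration, not the book's method: it assumes the peak lies where the derivative of $[\sin x / x]^2$ vanishes, which for $x = \Delta/2$ between $\pi$ and $3\pi/2$ reduces to $x \cos x - \sin x = 0$, and it solves that equation by bisection. The function name `first_auxiliary_maximum` is hypothetical.

```python
import math

def first_auxiliary_maximum(tolerance=1e-12):
    """Bisect x*cos(x) - sin(x) = 0 between pi and 3*pi/2, where [sin(x)/x]^2 has its first side peak."""
    f = lambda x: x * math.cos(x) - math.sin(x)
    lo, hi = math.pi, 1.5 * math.pi  # f(lo) < 0 and f(hi) > 0, so a root is bracketed
    while hi - lo > tolerance:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

x_peak = first_auxiliary_maximum()
print(f"Delta/2 at peak = {x_peak / math.pi:.4f} pi")
print(f"I/I0 at peak    = {(math.sin(x_peak) / x_peak) ** 2:.4f}")
```

The root comes out near $1.43\pi$, slightly below $3\pi/2$, and the corresponding intensity is under 5 percent of the central value. The shift arises because the factor $1/x$ keeps decreasing while the sine passes through its extreme value.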

Example 28-7 makes use of the plot of I versus $\Delta/2$ displayed in Fig.
28-34.


EXAMPLE  28-7 

A beam of light of wavelength $\lambda = 591$ nm falls on a barrier, parallel to the plane
wave fronts, containing a single slit of width d = 0.250 mm.

a. Show that the first minimum in the diffraction pattern occurs at an angle
$\theta_{\text{first min}} = \lambda/d$, in agreement with Eq. (28-20). Then calculate the value of
$\theta_{\text{first min}}$.

b. Calculate the angular half-width at half-maximum intensity, $\theta_w$, of the central
maximum in the diffraction pattern. That is, find the angle $\theta_w$ at which the intensity
I of the light drops to a value equal to one-half its maximum value $I_0$. Note that $\theta_w$ is
the angular version of the measure of the width of a diffraction maximum introduced
in connection with multislit diffraction in Sec. 28-7 and illustrated in Fig.
28-30.

■ a. Figure 28-34 and Eq. (28-24a) show that the first minimum in the pattern
occurs where $\Delta/2 = \pi$, or $\Delta = 2\pi$. Substituting this value in Eq. (28-23),

$$\Delta = \frac{2\pi}{\lambda}\, d \sin\theta$$

you obtain

$$2\pi = \frac{2\pi}{\lambda}\, d \sin\theta$$

or

$$\sin\theta = \frac{\lambda}{d}$$

Now for this case $\lambda = 5.91 \times 10^{-7}$ m and $d = 2.50 \times 10^{-4}$ m. So $\lambda$ is very small
compared to d. Thus $\lambda/d$ is very small compared to 1, and the same is true of $\sin\theta$.
Therefore when $\theta$ is expressed in radians, to a good approximation $\sin\theta = \theta$, and
you have

$$\theta_{\text{first min}} \simeq \frac{\lambda}{d}$$


This agrees with Eq. (28-20), which was obtained from an argument different from
the one leading to Eq. (28-24a) and Fig. 28-34.

Substituting the numerical values of $\lambda$ and d into this expression for $\theta_{\text{first min}}$,
you find

$$\theta_{\text{first min}} = \frac{5.91 \times 10^{-7}\ \text{m}}{2.50 \times 10^{-4}\ \text{m}} = 2.36 \times 10^{-3}\ \text{rad}$$


b. Careful inspection of Fig. 28-34 shows you that I falls to the value $I_0/2$ (that
is, $I/I_0 = 1/2$) where $\Delta/2 = 0.430\pi$. Thus at half-maximum intensity the phase
difference $\Delta$ has the value

$$\Delta = 2(0.430\pi)$$

The corresponding value of the diffraction angle is found by again using the
equation

$$\Delta = \frac{2\pi}{\lambda}\, d \sin\theta$$




You have

$$2(0.430\pi) = \frac{2\pi}{\lambda}\, d \sin\theta$$

or

$$\sin\theta = 0.430\,\frac{\lambda}{d}$$

The angle specified by this relation will be even smaller than the angle calculated in
part a. So again you can equate the sine to the angle. You then obtain for the angular
half-width at half-maximum intensity

$$\theta_w = 0.430\,\frac{\lambda}{d} = 0.430\,\theta_{\text{first min}}$$

In this particular case, its numerical value is

$$\theta_w = 0.430 \times 2.36 \times 10^{-3}\ \text{rad} = 1.02 \times 10^{-3}\ \text{rad}$$
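
The numbers in this example can be cross-checked without reading values from a plot. The sketch below is an illustration under stated assumptions: the function name `half_maximum_x` is hypothetical, and the half-maximum point is found by bisection on $[\sin x / x]^2 = 1/2$ with $x = \Delta/2$, rather than by inspecting Fig. 28-34.

```python
import math

def half_maximum_x(tolerance=1e-12):
    """Bisect [sin(x)/x]^2 = 1/2 on (0, pi), where the function falls steadily from 1 to 0."""
    g = lambda x: (math.sin(x) / x) ** 2 - 0.5
    lo, hi = 1e-6, math.pi
    while hi - lo > tolerance:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid  # still above half maximum, move right
        else:
            hi = mid
    return 0.5 * (lo + hi)

wavelength = 591e-9    # m, from the example
slit_width = 0.250e-3  # m, from the example

theta_first_min = wavelength / slit_width       # small-angle form, Eq. (28-20)
x_half = half_maximum_x()                       # value of Delta/2 at half maximum
theta_w = (x_half / math.pi) * theta_first_min  # half-width at half maximum

print(f"theta_first_min = {theta_first_min:.3e} rad")
print(f"Delta/2 at half maximum = {x_half / math.pi:.3f} pi")
print(f"theta_w = {theta_w:.3e} rad")
```

The numerical root is close to $0.443\pi$, a few percent above the value $0.430\pi$ read from the figure, so the computed half-width comes out slightly larger than the graphical estimate. The difference is of the size one would expect from reading a plot by eye.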


Fig.  28-36  The  diffraction  pattern 
produced  when  light  with  plane  wave 
fronts  is  incident  on  a barrier  parallel  to 
the  wave  fronts  and  the  barrier  contains 
a single  circular  hole.  (Courtesy  Michel 
Cagnet.) 


We now apply what we have learned about single-slit diffraction to a
significant property of multislit diffraction patterns. Note that the maxima
in the two-slit diffraction pattern photographed in Fig. 28-28b gradually
become less intense in proceeding away from the center of the pattern, and
ultimately disappear. The same effect is seen in photos of multislit diffraction
patterns. It is not predicted by the treatments we have given for these
processes because the treatments assumed that the width of an individual
slit is always negligible compared to its distance from a neighboring slit.
This not-very-realistic assumption can be dropped by extending the treatment
given here for one slit of appreciable width to cases where there are
two or more such slits. The results lead to the conclusion that a diffraction
pattern like the one in Fig. 28-28b is described mathematically by the product
of two functions. The first function describes the diffraction pattern
that would be obtained for the system of slits if their widths really were
negligible. The second function is the diffraction pattern that would be
obtained for any one of these slits by taking into account its width. In other
words, the actual diffraction pattern consists of an ideal multislit pattern
that falls within a "modulation envelope" given by the single-slit pattern.
With this in mind, you should be able to use the simple equations specifying
the characteristics of single-slit and two-slit diffraction patterns to obtain a
good estimate of the ratio of the slit width to the slit separation in the apparatus
used to produce the photo in Fig. 28-28b.
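
The product of two functions described above can be sketched for the two-slit case. The following is an illustration under assumptions that go beyond this section: the ideal two-slit factor is taken to be $\cos^2[(\pi a/\lambda) \sin\theta]$ for slit separation a (the standard result for slits of negligible width), and the function name `two_slit_intensity` and all numerical values are hypothetical.

```python
import math

def two_slit_intensity(theta, wavelength, slit_width, slit_separation):
    """Relative intensity: ideal two-slit factor times the single-slit envelope."""
    s = math.sin(theta)
    beta = math.pi * slit_separation * s / wavelength  # half the phase difference between the slits
    x = math.pi * slit_width * s / wavelength          # Delta/2 for one slit
    ideal = math.cos(beta) ** 2                        # assumed ideal two-slit pattern
    envelope = 1.0 if x == 0.0 else (math.sin(x) / x) ** 2
    return ideal * envelope

wavelength = 500e-9      # m, illustrative value (assumption)
slit_width = 5e-6        # m, illustrative value (assumption)
slit_separation = 20e-6  # m, illustrative value (assumption): four times the slit width

for order in range(6):
    theta = math.asin(order * wavelength / slit_separation)  # ideal two-slit maximum
    value = two_slit_intensity(theta, wavelength, slit_width, slit_separation)
    print(f"order {order}: I/I0 = {value:.4f}")
```

With the separation set to four times the width, the fourth-order maximum falls on the first minimum of the envelope and should be essentially absent from the output. Counting how many bright fringes fit inside the central envelope of a photograph is one way to estimate the ratio of slit width to slit separation.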

We  close  this  section  by  describing  the  relation  between  the  diffraction 
pattern  of  a single  slit  of  appreciable  width  and  that  of  a single  circular  hole 
of  appreciable  width.  The  latter  can  be  calculated  by  using  phasors  in  a way 
completely  analogous  to  the  way  we  have  used  them  here.  But  the  geome- 
try is  three-dimensional  instead  of  two-dimensional,  which  considerably 
complicates  the  calculation.  On  the  other  hand,  the  results  are  very  similar 
to  those  we  have  obtained  for  a single  slit.  You  can  see  this  in  Fig.  28-36, 
which  is  a photograph  of  a diffraction  pattern  produced  by  a circular  hole. 
Except  for  the  fact  that  it  has  the  expected  circular  symmetry,  it  is  very 
much like the single-slit diffraction pattern of Fig. 28-32c. When the wavelength
$\lambda$ of the beam of light incident on the hole is small compared to its
diameter d, the first diffraction minimum occurs at an angle which is given,
to good approximation, by

$$\theta \simeq 1.22\,\frac{\lambda}{d} \quad \text{for } \lambda \ll d \qquad \text{(single-hole first minimum)} \tag{28-25}$$

Note the similarity of this equation to $\theta \simeq \lambda/d$, which locates the first minimum
in the diffraction pattern of a single slit of width d.
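
The comparison between a slit and a circular hole can be put into numbers. The sketch below is only an illustration: the function name `first_minimum_small_angle` is hypothetical, and reusing the wavelength and width of Example 28-7 for a hole of the same diameter is an assumption made for the sake of comparison.

```python
def first_minimum_small_angle(wavelength, aperture, circular=False):
    """Small-angle first-minimum angle (rad) for a slit of width `aperture`, or a hole of that diameter."""
    factor = 1.22 if circular else 1.0  # 1.22 for a circular hole, Eq. (28-25)
    return factor * wavelength / aperture

wavelength = 591e-9  # m, value borrowed from Example 28-7
aperture = 0.250e-3  # m, slit width or hole diameter (assumed equal for comparison)

slit_angle = first_minimum_small_angle(wavelength, aperture)
hole_angle = first_minimum_small_angle(wavelength, aperture, circular=True)
print(f"slit: {slit_angle:.3e} rad, hole: {hole_angle:.3e} rad")
```

For equal width and diameter, the central maximum of the hole is wider than that of the slit by the factor 1.22.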


28-9 POLARIZATION OF LIGHT

Fig. 28-37 A representation of the electric field in a light beam propagating
out of the page (the x direction) and polarized such that its electric field
vectors have only vertical (y) components.

The emission of light by oscillations in the electron charge distributions of a
set of atoms was described in Sec. 27-6. It was explained that the light
emitted by an individual atom is polarized. That is, the transverse electric field
in the light is always directed in a plane that contains both the line along
which the charge distribution can be considered to oscillate and the direction
along which the light travels. This is illustrated in Fig. 28-37 by indicating
schematically the oscillating electric field vectors in light emitted in
the x direction (out of the page) by an atom in which the charges are oscillating
along a line oriented in the y direction (vertical). The light is polarized
because the oscillating electric field vectors have only a y component.

It was explained also that (in most circumstances) the light emitted by a
collection of atoms is unpolarized. The reason is that there is no relation
between the orientation of the line of oscillation of the charges in one atom
and the orientation of this line in another atom. Hence there is no relation
between the direction of polarization of the light emitted by one atom and
the direction of polarization of the light emitted by another. All possible
polarizations occur, and so there is no polarization. A schematic illustration
of this situation is shown in Fig. 28-38a. The figure indicates the oscillating
electric field vectors in unpolarized light emitted in the x direction (out of
the page). There is a random distribution of these electric field vectors in
directions perpendicular to the x direction. Another way of describing the
unpolarized light emitted in the x direction is indicated in Fig. 28-38b. In
this description the oscillating electric field vectors are represented by two
components along perpendicular directions. These directions have been
chosen to be the y direction (vertical) and the z direction (horizontal) in the
figure; however, any pair of directions perpendicular to each other and to
the direction of emission will do. The two components oscillate with the
same amplitude. But their phases are unrelated; that is, the difference in
their phases changes randomly in time.


Polarized  light  can  be  produced  in  a number  of  different  ways  from 
unpolarized  light  emitted  by  an  ordinary  source  such  as  a light  bulb.  One 
method  exploits  the  property  of  certain  crystals  that  is  called  dichroism, 
which  we  now  explain.  A number  of  minerals  occur  naturally  in  the  form 
of  fairly  large  transparent  crystals,  and  even  larger  ones  can  be  produced 
by  the  art  of  artificial  crystal  growing.  Many  such  crystals  are  highly  aniso- 
tropic. That  is,  their  physical  properties  are  not  the  same  in  all  directions. 
Such  anisotropy  is  due  ultimately  to  the  anisotropic  form  of  the  bonds 
between  the  molecules  which  comprise  the  crystals.  Very  often  (but  not 
always)  this  anisotropy  is  reflected  in  the  gross  shape  of  the  crystals,  which 
are  long  and  thin. 

Certain anisotropic crystals are much less transparent for light whose
electric field vectors are in one plane containing the propagation direction
(for instance, the xz plane in Fig. 28-38b) than for light whose electric field
vectors are in the perpendicular plane (the xy plane in the same figure).
These circumstances arise when the crystal structure makes it easier for
oscillating electric field vectors in one plane to induce oscillations in the
crystal than it is for oscillating electric field vectors in the other plane. The
result is a significantly more rapid absorption of energy from light in one
case than in the other. Figure 28-39 illustrates what happens when unpolarized
light is incident on such a crystal, the unpolarized light being represented
as in Fig. 28-38b. When the light has passed through an appreciable
thickness of the crystal, the oscillating electric field vectors in the plane in
which the energy loss is greatest have been almost completely absorbed.
Thus the part of the unpolarized light wave incident on the crystal whose
polarization orientation is unfavorable has been absorbed by the crystal.
The part that remains as the light wave emerges from the crystal is polarized.

Fig. 28-38 (a) One representation of the electric field in an unpolarized light
beam propagating out of the page (the x direction). The electric field vectors
are supposed to be distributed randomly, but uniformly, in all directions
perpendicular to the direction of the beam. (b) Another representation of the
electric field in the unpolarized light beam. The electric field vectors have
components along two directions perpendicular to the beam and to each other.
The components are of equal amplitude, and the difference in their phases
changes in time so as to have a random, but uniform, distribution.

Fig. 28-39 An initially unpolarized light beam can be polarized by passing it
through a crystal if one component of its electric field is absorbed more rapidly
than the other.

Crystals which display this property are called dichroic. The natural
crystal most often exploited for its dichroism is tourmaline. However, large
single crystals are expensive, delicate, and bulky. It is desirable to employ
the phenomenon of dichroism in a cheaper, more convenient manner. The
artificial material called Polaroid is a good solution to this important technical
problem. The process of making sheets of Polaroid was invented by
Edwin H. Land in 1932. Tiny, needlelike crystals of the strongly dichroic
organic compound iodoquinine sulfate are suspended in a liquid that is
allowed to flow across a sheet of a transparent material (which serves as the
backing in the finished product). The crystals tend to align themselves with
the direction of flow, after which the liquid is allowed to evaporate. The result
is a convenient sheet of polarizing material, which can be cut to any desired
shape and size. (Since 1932, other methods have been devised for
achieving the same result.)

You may have tried the experiment of holding two sheets of Polaroid
up to the light and turning them with respect to each other, as in Fig.
28-40. When oriented properly with respect to each other, the sheets
allow much of the light to pass. If one of them is then turned through the
angle $\phi$ of 90°, nearly all the light is absorbed, and the pair appears opaque.
In these circumstances the light transmitted by the first Polaroid is polarized
with its oscillating electric field vectors oriented so that they are efficiently
absorbed by the second Polaroid. In such an experiment, the first
element is called the polarizer and the second is called the analyzer.

Fig. 28-40 Transmission of a light beam through two Polaroid
sheets. The incident light is unpolarized. After passing through the
first sheet (called the polarizer), the light is polarized with its electric
field oriented along the solid line drawn on that sheet, because
the structure of the tiny aligned crystals of the polarizer allows the
electric field component along this line to pass. When the second
sheet (called the analyzer) is rotated so that the solid line drawn on
it is aligned with the one on the polarizer, the transmitted beam has
maximum intensity. In these circumstances the crystals in the analyzer
are aligned with those in the polarizer. When the two solid
lines are perpendicular, the intensity of the transmitted beam is reduced
to zero because the crystals in the analyzer are perpendicular
to those in the polarizer. Hence neither component of the incident
beam can pass through both Polaroid sheets.


An unpolarized light beam of intensity $I$ is incident on the pair of Polaroid sheets in
Fig. 28-40. Assume each sheet is ideal in that it absorbs none of the oscillating
electric field vectors oriented along the solid line shown on the sheet and absorbs all the
oscillating electric field vectors oriented perpendicular to the line.

a. What is the intensity $I_1$ of the light just beyond the first Polaroid sheet?

b. What is the intensity $I_2$ of the light just beyond the second one?

a. In the incident unpolarized beam, the amplitude of the electric field vectors
oriented along the solid line on the first Polaroid sheet has the same value as the
amplitude of the electric field vectors oriented perpendicular to that line. A
consequence of this equality of amplitudes is that the part of the light absorbed by the
first sheet has an intensity equal to that of the part transmitted. Thus you can
conclude that half the total intensity $I$ of the incident beam is transmitted by the first
sheet, so that just beyond it the light has an intensity $I_1$ whose value is given by the
relation


$$I_1 = \frac{I}{2}$$

Fig. 28-41 The electric field vector $\mathcal{E}_1$
treated as the sum of a vector $\mathcal{E}_{1\parallel}$ along
the solid line drawn on the analyzer and
a vector $\mathcal{E}_{1\perp}$ along a perpendicular line.


This  light  is  polarized  with  its  electric  field  vectors  oriented  along  the  solid  line 
shown  on  the  first  sheet.  To  put  it  another  way,  the  electric  field  vectors  are 
oriented  along  the  dashed  line  on  the  second  sheet. 

b. You can determine what fraction of the polarized light incident on the
second sheet passes through it by considering this light to consist of two polarized
constituents. For one constituent the electric field vectors are oriented along the solid
line in the second sheet, and for the other they are oriented perpendicular to that
line. What you are doing in this step is to say that at any instant any of the electric
field vectors of the beam incident on the second sheet can be represented as the
sum of two constituent vectors. Figure 28-41 shows such a representation at an
instant when the electric field vector $\mathcal{E}_1$ of the beam incident on the second sheet is at
a maximum of its oscillation cycle, so that the magnitude $\mathcal{E}_1$ of the vector is the
amplitude of this electric field. One of the constituent vectors, $\mathcal{E}_{1\parallel}$, is directed parallel
to the solid line on the second sheet, and the other, $\mathcal{E}_{1\perp}$, is directed perpendicular to
that line. The two electric fields described by these two constituent vectors oscillate
at the same frequency and with the same phase (in contradistinction to the situation
discussed in connection with Fig. 28-38b). But they do not have the same
amplitudes, unless the angle $\phi$ between the solid and dashed lines on the second sheet
happens to have the value 45°. In fact, the amplitudes of the two electric fields equal




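the magnitudes $\mathcal{E}_{1\parallel}$ and $\mathcal{E}_{1\perp}$ of the two vectors illustrated in the figure, and the
figure shows that

$$\mathcal{E}_{1\parallel} = \mathcal{E}_1 \cos\phi$$

and

$$\mathcal{E}_{1\perp} = \mathcal{E}_1 \sin\phi$$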

All the electric field with field vectors directed parallel to the solid line on the
second sheet is transmitted by the sheet, and all the electric field with field vectors
directed perpendicular to that line is absorbed by the sheet. Hence $\mathcal{E}_{1\parallel}$ is the
amplitude of the electric field in the light just beyond the second sheet. Thus, writing this
amplitude as $\mathcal{E}_2$, you have $\mathcal{E}_{1\parallel} = \mathcal{E}_2$. Then using this equality in the first equation
displayed above, you obtain

$$\mathcal{E}_2 = \mathcal{E}_1 \cos\phi$$

The amplitude $\mathcal{E}_2$ of the electric field in the light just beyond the second sheet is
smaller by the factor $\cos\phi$ than the amplitude $\mathcal{E}_1$ of the electric field in the light
just beyond the first sheet. Since the intensity of light is proportional to the square
of the amplitude of its electric field, this means that the intensity $I_2$ is smaller than
the intensity $I_1$ by the factor $\cos^2\phi$. That is,

$$I_2 = I_1 \cos^2\phi \tag{28-26}$$

Using the result $I_1 = I/2$ found in part a, you finally conclude that

$$I_2 = \frac{I \cos^2\phi}{2}$$
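
The two results of this example can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `two_polaroid_intensities` and the sample angles are illustrative assumptions, and both sheets are taken to be ideal in the sense defined in the example.

```python
import math


def two_polaroid_intensities(incident_intensity, phi_deg):
    """Return (I1, I2) for unpolarized light sent through two ideal Polaroid sheets.

    incident_intensity: intensity I of the unpolarized incident beam
    phi_deg: angle phi, in degrees, between the solid lines on the
             polarizer and the analyzer
    """
    # Part a: an ideal polarizer transmits half of an unpolarized beam.
    i1 = incident_intensity / 2.0
    # Part b, Eq. (28-26): the amplitude falls by cos(phi), so the
    # intensity falls by cos^2(phi).
    i2 = i1 * math.cos(math.radians(phi_deg)) ** 2
    return i1, i2


# Illustrative angles only; I = 1.0 is an arbitrary unit of intensity.
for phi in (0.0, 30.0, 60.0, 90.0):
    i1, i2 = two_polaroid_intensities(1.0, phi)
    print(f"phi = {phi:5.1f} deg   I1 = {i1:.3f}   I2 = {i2:.3f}")
```

With the lines aligned the pair passes half the incident intensity, and with the lines perpendicular it passes essentially none, in agreement with the behavior described for Fig. 28-40.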


We  referred  earlier  to  Fresnel’s  long  series  of  experiments  designed  to  support 
the  wave  theory  of  light  motion.  The  one  experiment  which  seems  to  have  been 
most  convincing  to  his  contemporaries  involved  polarization.  Fresnel  repeated 
Young’s  experiment  and  produced  the  usual  two-slit  diffraction  pattern  within  a 
“modulation  envelope”  of  a one-slit  pattern.  Then  into  the  beams  of  light  passing 
through  the  two  slits  he  inserted  polarizing  crystals,  with  the  orientation  of  one 
perpendicular  to  that  of  the  other.  The  two-slit  diffraction  pattern  disappeared, 
and  no  amount  of  adjustment  could  restore  it  as  long  as  the  polarizers  remained  in 
place. But polarizers oriented parallel to one another did not affect the two-slit
diffraction pattern. Can you explain Fresnel’s observations? How do they support the
wave theory of light propagation?

Polarized light can be produced from unpolarized light in a manner
quite different from the one we have been considering. When unpolarized
light traveling through air strikes a flat surface at not too small an angle of
incidence, an appreciable amount of the light is reflected, no matter what
the nature of the underlying material. Etienne Malus discovered in 1809
(by looking through an analyzer at the light of the setting sun reflected in
the windows of a palace in Paris) that the reflected light is partially
polarized. That is, the reflected light is a mixture comprising partly
unpolarized light and partly light polarized in a specific orientation. The
degree of polarization depends on the indices of refraction on both sides of
the surface from which the reflection takes place, as well as on the angle of
incidence of the light. For any angle, more light is reflected whose electric
field vectors are in directions parallel to the surface than light whose
electric field vectors are in directions not parallel to the surface.

If the underlying material is transparent, there is a particular angle of
incidence $\theta_p$ at which the reflected light is completely polarized. At this angle




Fig. 28-42 Polarization by reflection. Light traveling through a
transparent material of index of refraction $n$ is incident at
Brewster’s angle $\theta_p$ on the surface of a transparent material of index of
refraction $n'$. The reflected light is completely polarized, its electric
field having a component only in a direction parallel to the surface.
The refracted light is partially polarized, the component of its
electric field in a direction parallel to the surface having the smallest
amplitude.


of incidence the electric field vectors of the reflected light are completely
oriented along lines parallel to the surface, as illustrated in Fig. 28-42. The
angle $\theta_p$ is called Brewster’s angle, and its value is given by the expression

$$\tan\theta_p = \frac{n'}{n} \tag{28-27}$$

Here  n is  the  index  of  refraction  on  the  side  of  the  surface  where  the  inci- 
dent and  reflected  beams  are  found,  and  n'  is  the  index  of  refraction  on  the 
other  side.  Equation  (28-27)  is  known  as  Brewster’s  law.  It  was  discovered 
experimentally  by  the  Scottish  physicist  Sir  David  Brewster  (1781-1868).  It 
can  be  derived  from  Maxwell's  equations,  but  the  derivation  is  beyond  the 
level  of  this  book. 
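
Brewster’s law is easy to evaluate numerically. The following is a minimal sketch, not part of the original text: the function name `brewster_angle_deg` is hypothetical, and the indices of refraction used (1.00 for air, 1.34 for water, 1.50 for a generic glass) are illustrative assumptions rather than values taken from this section.

```python
import math


def brewster_angle_deg(n_incident, n_other):
    """Brewster's angle in degrees from Eq. (28-27): tan(theta_p) = n'/n.

    n_incident: index n on the side of the incident and reflected beams
    n_other:    index n' on the other side of the surface
    """
    # For positive indices, atan2(n', n) equals arctan(n'/n).
    return math.degrees(math.atan2(n_other, n_incident))


# Assumed, illustrative indices of refraction (not from this section).
examples = {
    "air to water": (1.00, 1.34),
    "air to glass": (1.00, 1.50),
    "glass to air": (1.50, 1.00),
}
for label, (n, n_prime) in examples.items():
    print(f"{label}: theta_p = {brewster_angle_deg(n, n_prime):.1f} deg")
```

Note that interchanging $n$ and $n'$ gives the complementary angle, since the two tangents are reciprocals of each other.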

The phenomenon of polarization by reflection is the basis for the utility of
Polaroid sunglasses. On bright days, much of the discomfort arises from the
presence of glare, that is, high levels of light intensity coming from large,
uninteresting sources. These are often flat surfaces such as wet beaches, bodies of water,
or snow fields. In the presence of this intense background light, it is difficult to
detect the light coming from interesting sources (such as sunbathers).
Glare-producing surfaces are generally horizontal. If the light from the sun happened to
strike them at Brewster’s angle, all the reflected light would be polarized in a
horizontal direction. We cannot usually hope for such good luck. But in any case, a
major part of the reflected light is horizontally polarized (the oscillating electric
vectors having horizontal directions). By wearing sunglasses containing Polaroid
oriented so as to absorb horizontally polarized light, you do more than merely
reduce the total light intensity, as when you wear ordinary colored glasses. Rather,
you discriminate against the light coming from the horizontal, glare-producing
surfaces. Next time you are wearing Polaroid sunglasses outdoors, check that this
is so by turning your head so that one eye is over the other. You will see that while
the total brightness is still reduced by the glasses, they make the glare worse rather
than better.




EXERCISES 

Group  A 

28-1.  Huygens’  construction,  I.  Make  a Huygens  con- 
struction for  a contracting  spherical  wave  front. 

28-2. Double the angle. A beam of light, IO in Fig.
28E-2, is reflected from a plane mirror, AOB. The mirror
is rotated through angle $\theta$ to position A'OB' with the beam
fixed. Prove that the rotation of the mirror will rotate the
reflected beam through an angle equal to $2\theta$.

Fig.  28E-2 


28-3. Huygens’ construction, II. Make a Huygens
construction for reflection from a plane mirror like the one in
Fig.  28-7  but  with  the  incident  wave  front  approaching 
from  the  upper  right. 

28-4.  Huygens’  construction,  III.  Modify  the  Huygens 
construction  for  a parabolic  mirror  in  Fig.  28-10  so  that  it 
applies  to  spherical  wave  fronts  diverging  from  a light 
source  at  the  focus.  Then,  using  Fig.  28-11,  make  the 
same  modification  in  the  formal  proof  of  the  focusing 
properties  of  the  mirror. 

28-5. Waves in water. A source of light sends out
waves of wavelength $6.00 \times 10^{-7}$ m in air.

a.  What  is  the  speed  of  this  light  in  water,  which  has 
an  index  of  refraction  1.34? 

b.  What  is  the  frequency  of  the  light  in  air?  in  water? 

c.  What  is  the  wavelength  of  the  light  in  water? 

28-6. Huygens’ construction, IV. Make a Huygens
construction for refraction like the one in Fig. 28-16 but with
the incident light beam traveling through the material of
higher index of refraction.

28-7.  Verifying  Snell’s  law.  Use  the  index  of  refraction 
obtained  in  part  a of  Example  28-2  and  the  angle  of  inci- 
dence you  measure  from  part  d of  the  photograph  in  Fig. 
28-17  to  predict  the  angle  of  refraction.  Then  measure 
this  angle  from  the  photograph  and  compare  it  with  your 
prediction. 

28-8.  Refracting  pool. 

a.  A leaf  is  floating  on  the  surface  of  a pool  2.00  m 
deep.  The  sun  is  just  setting.  How  far  from  a point  on  the 
bottom  directly  below  the  leaf  is  its  shadow?  Assume  that 
the  sun  is  unobstructed  and  the  water  is  calm,  and  take 
n = 1.34  for  water. 


b.  What  is  the  half-apex  angle  of  the  cone  of  light 
formed  by  the  sky  as  seen  by  a fish  looking  up  from 
beneath  the  surface  of  the  pool? 

28-9. Measuring index of refraction. A thin layer of
liquid coats the horizontal surface of a half cylinder of glass
whose index of refraction is 1.60. See Fig. 28E-9. The
critical angle $\theta_c$ for total internal reflection is found to be 55°.
What is the measured index of refraction of the liquid?


28-10.  Separated  colors.  A beam  of  sunlight  strikes  the 
flat  surface  of  a piece  of  crown  glass  at  an  angle  of  45°. 
Find  the  angular  separation  between  the  paths  in  the  glass 
of green light having wavelength $\lambda_{\text{green}}$ = 525 nm and red
light having wavelength $\lambda_{\text{red}}$ = 675 nm.

28-11.  Two-slit  diffraction.  A two-slit  diffraction  appa- 
ratus has  for  a light  source  a mercury  vapor  lamp  filtered 
by  a sheet  of  green  plastic.  The  wavelength  of  this  light  is 
546  nm.  The  spacing  between  the  slits  is  0.150  mm,  and 
the  distance  from  the  barrier  containing  the  slits  to  the 
screen  on  which  the  diffraction  pattern  is  detected  is 
25.0  cm.  What  is  the  separation  between  adjacent  max- 
ima of  the  diffraction  pattern? 

28-12.  Two-slit  diffraction  in  a ripple  tank.  A two-slit  dif- 
fraction experiment  is  carried  out  in  a ripple  tank,  as  in 
Fig.  28-24.  The  spacing  between  the  slits  is  2.0  cm,  and  the 
frequency  of  the  vibrating  plunger  which  produces  the 
ripples  is  75  Hz.  On  a line  parallel  to  the  barrier  containing 
the  slits,  and  15  cm  from  it,  the  distance  from  the  center  of 
the  diffraction  pattern  to  the  nearest  maximum  on  either 
side  is  3.3  cm.  Determine  the  speed  at  which  the  rip- 
ples move  across  the  tank. 

28-13.  Diffraction  grating,  I.  A diffraction  grating  has 
5000  lines  to  the  centimeter.  What  is  the  angle  between 
the  second-order  ( j = 2)  maxima  formed  by  light  of  wave- 
length 656  nm  and  by  light  of  wavelength  410  nm?  The 
light  is  incident  normal  to  the  plane  of  the  grating. 

28-14.  Diffraction  grating,  II.  Using  sodium  light  of 
wavelength  589.3  nm,  it  is  found  that  the  angle  between 
the  left  and  right  second-order  maxima  (j  = 2)  is  120.0° 
when  the  light  is  incident  normal  to  the  plane  of  the  grat- 
ing. How  many  lines  are  there  per  centimeter  of  the  grat- 
ing? 




Fig.  28E-20 


28-15. Looking through a grating. In Fig. 28E-15, C
represents the cross section of a very slender mercury
discharge tube (light source), covered by a green filter. It is
1.000 m from a grating G which has 6000 lines per
centimeter. The lines are parallel to the axis of the discharge
tube.  AB  is  a meter  stick  next  to  C and  perpendicular  to 
CG.  Looking  through  the  grating,  an  observer  sees  the 
green  image  of  the  discharge  tube  in  first  order  (j  = 1)  at 
34.7  cm  from  C.  Calculate  the  wavelength  of  the  green 
light. 


Fig.  28E-15 


28-16.  Single-slit  diffraction.  What  is  the  angular  width 
of  the  central  maximum  (measured  from  the  minimum  on 
one  side  to  the  minimum  on  the  other  side)  in  the  diffrac- 
tion pattern  of  a slit  0.0100  mm  wide  which  is  illuminated 
by  light  whose  wavelength  is  600  nm? 

28-17.  Estimating  the  ratio  of  slit  width  to  slit  spacing.  Use 
the  procedure  suggested  in  the  paragraph  following  Ex- 
ample 28-7  to  estimate  the  ratio  of  the  slit  width  to  the  slit 
spacing  in  the  apparatus  used  to  make  the  two-slit  diffrac- 
tion pattern  photograph  in  Fig.  28-2S&. 

28-18.  Polarization  by  reflection.  A sheet  of  crown  glass 
is  being  used  to  completely  polarize  by  reflection  a beam 
of  light  of  wavelength  550  nm.  Use  Fig.  28-21  to  deter- 
mine the  necessary  value  of  the  angle  of  incidence. 

28-19. Three Polaroid sheets. A beam of unpolarized
light is sent through three ideal Polaroid sheets. The
orientation of the line along which the second sheet absorbs
no oscillating electric field is rotated in a certain sense by
an angle of 20° with respect to the orientation of that line
in the first sheet. The orientation of that line in the third
sheet is rotated in the same sense with respect to the
orientation in the first sheet by 75°. What fraction of the light
intensity incident on the system passes through it?

Group  B 

28-20.  Cat’s  eye  reflector.  A beam  of  light  falls  on  one 
of  a pair  of  mirrors  MM'  and  NM'  joined  at  right  angles 




as  in  Fig.  28E-20,  undergoing  reflection  at  each.  Prove  that 
in  all  circumstances  it  emerges  with  its  direction  of  travel 
exactly  reversed.  This  is  the  principle  of  operation  of  the 
“cat's  eye  reflectors,”  widely  used  to  define  the  lines  sep- 
arating highway  lanes.  Explain  why  cat's  eye  reflecting 
buttons  set  in  a highway  ahead  of  a driver  appear  to  the 
driver  to  be  particularly  bright  at  night. 

28-21. Geometry of a parabola. Use the equation of a
parabola, $y^2 = 4fx$, to prove that the distance from any
point  on  it  to  the  focus  equals  the  distance  from  that  point 
to  the  directrix.  Then  make  measurements  which  verify 
this  on  the  parabola  plotted  in  Fig.  28-9. 

28-22.  Optical  whispering  chamber.  The  ellipse  in  Fig. 
28E-22  satisfies  the  relation  FP  + PF'  = constant,  where 
P is  any  point  on  the  ellipse  and  F and  F'  are  two  fixed 
points  called  its  foci.  Consider  an  ellipsoid  of  revolution 
formed  by  rotating  the  ellipse  about  its  symmetry  axis 
through  F and  F',  with  the  interior  surface  being  a 
mirror. Give a Huygens construction similar to the one in
Fig. 28-10, or an argument similar to the one illustrated by
Fig. 28-11, showing that if a spherical wave front of light is
emitted from F, after reflection from the mirror it will be a
spherical wave front converging on F'. The optical system
is  the  analogue  of  an  acoustical  whispering  chamber. 


Fig.  28E-22 


28-23. Foucault’s measurement. In a measurement of
the speed of light in water like Foucault’s (see Fig. 28-14)
the following data are obtained: Angular speed of
rotating mirror, $9.00 \times 10^3$ rotations/min; distance from
rotating mirror to scale, 21.0 m; displacement of slit image
on scale when tube is filled with water, 0.41 mm; length of
tube, 10.0 m. Use these data to obtain a value for the
speed of light in water and a value for the index of
refraction of water. Hint: See Exercise 28-2.

28-24. Refraction by a glass slab.

a. Prove that the emergent light beam in Fig. 28E-24
is parallel to the incident light beam if the faces of the
glass slab are parallel.

b. If $\theta$ is the angle of incidence and $t$ the thickness of
the slab, show that the transverse displacement $d$ of the
beam is given by $d = t\theta(n - 1)/n$, for small $\theta$.


Fig.  28E-24 


28-25.  A canny  way  of  measuring  index  of  refraction.  A 
cylindrical  can  is  25  cm  in  diameter  and  25  cm  high.  An 
observer  places  herself  so  that  she  can  see  only  the  most 
distant  part  of  the  bottom,  the  point  A in  Fig.  28E-25. 
Then  a liquid  is  poured  into  the  can.  When  it  reaches  the 
brim,  the  observer  without  changing  her  position  can  just 
see  a small  coin  B actually  at  the  center  of  the  bottom. 
What  is  the  index  of  refraction  of  the  liquid? 

Fig.  28E-25 



28-26.  Short  on  short  wavelengths.  A beam  of  white 
light  is  sent  into  a block  of  medium  flint  glass  and  up  to 
the  flat  interface  between  the  block  and  the  surrounding 
air,  striking  the  interface  at  an  angle  of  incidence  of  38.0°. 
It is found that the part of the beam which passes out into
the air contains no wavelengths shorter than a certain
value $\lambda$. Explain what is happening, and use Fig. 28-21 to
find $\lambda$.


28-27. Where did it go? Yellow sodium light consists of
two components with wavelengths $\lambda_1$ = 589.0 nm and
$\lambda_2$ = 589.6 nm. Hence a two-slit diffraction pattern formed
from  it  consists  of  the  two  components.  At  one  location 
in  such  a pattern,  a maximum  of  one  component  coincides 
with  a maximum  of  the  other.  At  how  many  maxima  from 
this  coincidence  will  a minimum  of  one  coincide  with  a 
maximum  of  the  other?  At  the  locations  where  this 
happens  the  diffraction  pattern  will  disappear. 

28-28.  Measuring  the  thickness  by  diffraction.  A flake  of 
glass  of  index  of  refraction  1.5  is  placed  over  one  of  the 
openings  of  a double-slit  diffraction  apparatus.  There  is  a 
displacement  of  the  diffraction  pattern  through  seven 
successive  maxima  toward  the  side  where  the  flake  was 
placed. If the wavelength of the diffracted light is $\lambda$ =
$6.0 \times 10^{-5}$ cm, how thick is the flake?

28-29. Diffraction at oblique incidence. A beam of light
of wavelength $\lambda$ falls on a diffraction grating of line
spacing $D$ at an angle of incidence $\phi$ measured from the
normal to the plane of the grating. Show that the maxima
in the diffraction pattern occur at angles $\theta$ which are
determined by the equation $j\lambda = D(\sin\theta - \sin\phi)$, where
$j = 0, \pm 1, \pm 2, \pm 3, \ldots$

28-30.  Dispersion  and  resolving  power.  A diffraction 
grating  has  5000  lines  per  centimeter  and  is  2.00  cm 
wide.  It  is  being  used  to  investigate  the  spectrum  of  light 
emitted from a mercury-vapor lamp. What is its
dispersion and resolving power in second order (j = 2) near $\lambda$ =
546 nm, the wavelength of the intense green component
of  the  spectrum?  In  other  words,  near  this  wavelength, 
what  is  the  change  in  the  angular  location  of  a second- 
order  diffraction  maximum  per  nanometer  change  in 
wavelength,  and  what,  approximately,  is  the  difference  in 
wavelength  between  two  adjacent  second-order  maxima 
that  are  just  resolvable? 

28-31.  Phasor  diagram.  Construct  to  scale  accurate 
phasor  diagrams  for  the  central  maximum,  the  first  auxil- 
iary maximum,  and  the  second  auxiliary  maximum  of  a 
single-slit diffraction pattern. Measure from them the
relative values of the amplitude A at these points on the
diffraction pattern. Then determine the corresponding
relative values of the intensity I. Compare your results with
Fig. 28-34.

28-32.  Missing  maxima.  A transmission  grating  is  used 
with  light  incident  normal  to  its  plane.  The  width  of  each 
slit  is  one-third  the  spacing  between  slits.  By  considering 
single-slit  diffraction,  show  that  the  third-order  (j  = 3) 
multislit  diffraction  maxima  are  missing  from  the  diffrac- 
tion pattern  of  the  grating. 

28-33. Brewster’s angle.

a. Prove that when a beam of light is completely
polarized by reflection at Brewster’s angle $\theta_p$, the angle
between the reflected beam and the refracted beam is 90°.




b.  Suppose  a beam  of  polarized  light  is  incident  at 
Brewster’s  angle  on  the  surface  of  a transparent  material, 
with  its  electric  field  vectors  oscillating  in  the  plane  of  inci- 
dence (the  plane  containing  the  incident  beam  and  the 
normal  to  the  surface).  Electrons  at  the  surface  are  forced 
into  oscillation  in  the  plane  of  incidence.  These  acceler- 
ated charges  radiate,  and  as  a result  there  is  a refracted 
beam.  Construct  a simplified  version  of  a figure  like  Fig. 
28-42,  and  use  it  to  argue  that  basic  properties  of  the  radi- 
ation emitted  by  accelerated  charges  predict  there  will  be 
no  reflected  beam.  Then  extend  the  argument  to  the  case 
of  an  unpolarized  incident  beam,  and  explain  the  phe- 
nomenon of  complete  polarization  by  reflection  at  Brew- 
ster’s angle. 

28-34.  Four  Polaroid  sheets.  Four  ideal  Polaroid  sheets 
are  arranged  so  that  in  each  the  orientation  of  the  line 
along  which  no  oscillating  electric  field  is  absorbed  is  ro- 
tated through  30°  with  respect  to  the  orientation  of  that 
line  in  the  preceding  sheet,  with  all  the  rotations  being  in 
the  same  sense.  What  fraction  of  the  unpolarized  light  in- 
tensity incident  on  the  system  passes  through  it? 

Group  C 

28-35. Where will it come out? Figure 28E-35 shows a
square slab of glass ABCD of low index of refraction n =
1.30 lying on a table. A beam of light parallel to the table
top strikes side AB at its center.


a. For what range of angle of incidence $\theta$ will the
beam emerge from side CD?

b. For what range of $\theta$ will the beam be totally
internally reflected from side BC?

c. For what range of $\theta$ will the beam emerge from
side BC?

28-36. Deviation by a prism, I. A beam of light of a
single wavelength is refracted on entering and on leaving
a triangular prism of index of refraction $n$ and refracting
angle $\alpha$, as a result of which it undergoes angular
deviation $\delta$. See Fig. 28E-36. If the prism is rotated in the sense
of the arrow, keeping the incident beam I fixed, the
emergent beam E moves upward, decreasing $\delta$. If the
rotation of the prism is continued, the emergent beam stops
moving upward and moves downward, increasing $\delta$.
There is consequently only one position of the prism for
which $\delta$ is a minimum.


a. Use the property of optical reversibility to show
that when the deviation is a minimum, $\delta_m$, the path of the
beam is symmetric; that is, angle $\phi_I$ equals angle $\phi_E$.

b. Show that for this symmetric path, $n =
\sin[(\alpha + \delta_m)/2]/\sin(\alpha/2)$.

28-37. Deviation by a prism, II. A beam of light of a
single wavelength strikes a face of a triangular prism at
almost normal incidence. The index of refraction of the
prism is $n$, and its refracting angle $\alpha$ is very small. Show
that the angular deviation $\delta$ of the beam is given by $\delta =
(n - 1)\alpha$. The quantities $\alpha$ and $\delta$ are defined in Fig.
28E-36.

28-38. Designing a direct vision spectroscope. It is
possible to combine prisms of different kinds of glass to
produce appreciable dispersion with negligible angular
deviation. Such a combination makes a compact direct vision
spectroscope. The example shown in Fig. 28E-38 has a
crown-glass prism of small refracting angle $\alpha_C$ combined
in opposition with a flint-glass prism of small refracting
angle $\alpha_F$.


Fig.  28E-38 


The following table gives the indices of refraction of
the two materials for light of three different colors.

           Index of refraction n
Glass      Red      Yellow   Blue
Crown      1.514    1.517    1.523
Flint      1.643    1.650    1.664

a. Using the result of Exercise 28-37 for the angular
deviation $\delta$ of a single prism of small refracting angle, $\delta =
(n - 1)\alpha$, and the data in the table, calculate the ratio
$\alpha_C/\alpha_F$ if the combination is to give no angular deviation to
a beam of yellow light. If $\alpha_F$ = 10°, what is $\alpha_C$?

b. What is the angular deviation (i) of a red beam? (ii)
of a blue beam? Indicate the sense of each of these
angular deviations, as well as its magnitude.




28-39.  Michelson’s  stellar  interferometer.  Many  of  the 
stars  occur  as  doublets,  a doublet  being  a pair  of  stars  ro- 
tating about  their  common  center  of  mass.  However,  even 
the  best  telescopes  have  difficulty  resolving  the  two  stars  in 
a doublet  if  they  are  relatively  closely  spaced,  unless  they 
are  supplemented  by  an  apparatus  invented  by  A.  A.  Mi- 
chelson,  called  a stellar  interferometer.  Michelson  at- 
tached a beam  supporting  mirrors  M to  the  100-inch  Mt. 
Wilson  telescope,  as  indicated  schematically  in  Fig. 
28E-39.  The  outer  two  mirrors  could  be  moved  together 


Fig.  28E-39 


or apart. Suppose the telescope is directed at a pair of
stars of small angular separation $\phi$. Show that the two-slit
diffraction pattern observed by the telescope disappears
when $\phi = \lambda/2D$, where $\lambda$ is the wavelength of the starlight
and $D$ is the distance between the two outer mirrors.
Explain how the apparatus would be used.

28-40.  Preventing  smudge.  A single-filament  tubular 
electric  light,  such  as  those  used  in  store  showcases,  is  cov- 
ered with  red  cellophane.  It  is  examined  through  a barrier 
containing  two  slits  held  in  front  of  the  eye.  The  filament 
is  parallel  to  the  slits  and  1.00  m from  them.  The  slit 
spacing  is  1.00  mm.  What  is  the  maximum  allowable  width 
for  the  filament  if  the  diffraction  pattern  is  not  to  be 
smudged out? Take the wavelength of the light to be $\lambda$ =
$6.50 \times 10^{-5}$ cm.

28-41.  Light  filter.  A light  filter  designed  to  transmit  a 
narrow  range  of  wavelengths  of  white  light  is  made  by 


evaporating  a very  thin  semitransparent  metal  film  B on 
glass.  A layer  of  quartz,  n = 1.42,  is  evaporated  onto  B. 
Another  semitransparent  metal  film  A is  deposited  onto 
the  quartz.  See  Fig.  28E-41.  Some  of  the  white  light  inci- 
dent normally  on  the  filter  penetrates  A and  crosses  the 
quartz  to  B where  most  of  the  light  is  reflected,  although 
some  is  transmitted.  When  the  reflected  light  reaches  A, 
most  is  reflected.  The  reflected  light  returns  to  B,  where 
most  is  reflected,  although  some  is  transmitted.  The 
process  described  is  repeated  many  times.  If  the  thickness 
of  the  quartz  is  200  nm,  which  wavelength  emerges  with 
the  highest  intensity?  What  is  its  color? 

28-42.  Practical  applications  of  wave  optics.  Two  plates 
of  glass  are  placed  on  each  other.  A very  thin  wire  is 
placed  between  them  near  one  pair  of  edges,  forming  an 
air wedge. See Fig. 28E-42. The arrangement is
illuminated from above by sodium light of wavelength $\lambda$ =
589 nm and is viewed from above. A series of dark bands,
parallel  to  the  edges  in  contact,  is  seen.  There  are  100  of 
these  bands. 


Fig.  28E-42 


a.  Account  for  their  formation. 

b.  Determine  the  diameter  of  the  wire. 

c.  If  the  bands  are  not  straight  and  uniformly  spaced, 
what  does  this  tell  you  about  the  glass  plates? 

d.  Most  precision  machinists  own  a glass  plate  that  is 
certified  to  be  extremely  flat.  They  use  it  to  test  the 
flatness  of  a newly  machined  metal  surface.  How  do  they 
do  this? 

28-43.  Light  propagates  like  a wave! 

a.  Write  one  page  explaining  what  was  observed  in 
Fresnel’s  two-slit  diffraction  experiment  with  polarized 
light  (described  in  small  print  after  Example  28-8),  and 
explaining  how  these  observations  support  the  wave 
theory  of  light  propagation. 

b.  Derive  an  expression  for  the  ratio  of  the  intensity 
at  a maximum  in  the  diffraction  pattern  to  the  intensity  at 
an  adjacent  minimum,  as  a function  of  the  angle  between 
the  orientations  of  the  two  polarizers. 


Fig. 28E-41 [Figure labels: a broad range of wavelengths enters the filter; the layers are film A, quartz, film B; a narrow range of wavelengths emerges.]



Ray  Optics 


29-1 WAVE OPTICS AND RAY OPTICS

When a broad beam of light passes through a narrow hole in a barrier
oriented perpendicular to the beam, the light is diffracted. We can
characterize the amount of diffraction by the angle $\theta$ which locates
the first minimum in the diffraction pattern. According to Eq. (28-25), this
angle is related to the wavelength $\lambda$ of the light and to the
diameter $d$ of the hole by the expression

$$\theta = 1.22\,\frac{\lambda}{d} \tag{29-1}$$

providing $\lambda/d$ is small compared to 1. The diffraction process is indicated
qualitatively in Fig. 29-1a by showing the central parts of the plane wave
fronts of a broad beam moving up to a barrier, as well as the wave fronts
that pass through a hole in the barrier. As has been explained in connection
with Fig. 28-31, the wave fronts passing through the hole cannot continue
to propagate as parts of planes because their extent has been
restricted. Instead they curl at their edges and so describe light traveling not
in a particular direction, but in a range of directions. This diffraction
process provides a classic verification of wave optics since it can be given a
completely satisfactory explanation only in terms of wave motion.

However, if diffraction is to be observed, the ratio $\lambda/d$ must not be so
small that the angle $\theta$ is of negligible size. If $\lambda/d$ is very small, then $\theta$ will be
also, and the angular spread of the light passing through the hole will be
difficult to detect. In such circumstances, almost all the light that has passed
through the hole will appear to travel to the viewing screen along
straight-line paths that are undeviated extensions of the paths traveled by
the light in the beam incident on the hole, as illustrated in Fig. 29-1b. The
paths are called rays, and the study of their behavior in passing through




Fig. 29-1 (a) Wave fronts indicating the general behavior of a beam of light
passing through a hole of diameter somewhat larger than the wavelength of
the light. (b) Rays indicating the general behavior of a beam of light passing
through a hole of diameter very much larger than the wavelength of the light.
Actually, there is some diffraction of the light passing near the edge of even a
very wide hole. But in ray optics this effect is ignored since almost all the light
is not diffracted.


optical  systems  is  called  ray  optics.  The  fundamental  idea  in  ray  optics  is 
that  a beam  of  light  whose  wavelength-to-width  ratio  is  very  small  can  be 
considered  to  travel  through  any  uniform  material  along  rays  that  are 
everywhere  straight.  In  other  words,  ray  optics  is  the  approximation  in 
which  diffraction  is  ignored.  This  idea  is  voiced  in  the  basic  law  of  ray 
optics:  Light  travels  in  straight  rays,  providing  it  travels  through  material  which  is 
uniform. 

How wide must a hole be for ray optics to provide a good description of its
effect on an incident beam of light? Take a hole of width 1.0 mm = $1.0 \times 10^{-3}$ m.
According to ray optics (that is, in the absence of diffraction) the width of the
illuminated region on the screen is just 1.0 mm. But according to wave optics,
there is diffraction. Let us estimate it. Consider visible light of wavelength
$5.0 \times 10^{-7}$ m. The diffraction angle, measured from the central maximum to the first
minimum, is $\theta = 1.2\lambda/d = 1.2 \times 5.0 \times 10^{-7}\ \text{m}/(1.0 \times 10^{-3}\ \text{m}) = 6.0 \times 10^{-4}$ rad.
The total angular width of the diffracted beam, measured from the first minimum
on one side of the maximum to the first minimum on the other side, is twice this
value, or $1.2 \times 10^{-3}$ rad. If the distance from the hole to the screen is 1.0 m,
diffraction will spread light over the screen in a region of width equal to the product of
the total angular width of the diffracted beam and the distance from where the
diffraction occurs to the screen. That is, the width of the illuminated region will be
$1.2 \times 10^{-3} \times 1.0\ \text{m} = 1.2 \times 10^{-3}$ m = 1.2 mm. This is to be compared to the
1.0-mm width which the illuminated region would have in the absence of
diffraction. A detectable amount of spreading is produced by diffraction in this case.
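The arithmetic of this estimate can be reproduced with a short Python sketch. The function name `diffraction_spread` and its parameter names are illustrative choices, not notation from the text; the factor 1.2 is the two-figure rounding of the coefficient in Eq. (29-1) that the estimate itself uses.

```python
def diffraction_spread(wavelength_m, hole_width_m, screen_distance_m, factor=1.2):
    """Return (half_angle, full_angle, spread_width) for light through a hole.

    half_angle: angle from the central maximum to the first minimum (rad),
                small-angle form of Eq. (29-1) with the coefficient rounded to 1.2
    full_angle: first minimum to first minimum (rad)
    spread_width: width on the screen due to diffraction alone (m)
    """
    half_angle = factor * wavelength_m / hole_width_m
    full_angle = 2.0 * half_angle
    spread_width = full_angle * screen_distance_m
    return half_angle, full_angle, spread_width


# Values from the estimate: 5.0e-7 m light, 1.0-mm hole, screen 1.0 m away.
half_angle, full_angle, spread_width = diffraction_spread(5.0e-7, 1.0e-3, 1.0)
print(f"half-angle   = {half_angle:.1e} rad")
print(f"full angle   = {full_angle:.1e} rad")
print(f"spread width = {spread_width * 1e3:.1f} mm")
```

Changing the arguments shows how quickly the spread shrinks for wider holes or shorter distances to the screen.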

But as the width of the hole is increased from 1.0 mm, the diffraction
spreading soon becomes very difficult to detect. One reason is that there is a
decrease (in inverse proportion to the width of the hole) in the diffraction spreading.
The other reason is that there is an increase (in direct proportion to the width of
the hole) in the width of the illuminated region on the screen whose origin is
described by ray optics, that is, by saying light travels in straight rays. Thus
diffraction spreading becomes quite negligible when the hole is not much wider than
1.0 mm. If the distance from the hole to the screen is 0.1 m, instead of 1.0 m, there
is no significant diffraction spreading when the hole is somewhat wider than
0.1 mm.
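The two effects just named combine so that the ratio of diffraction spreading to the ray-optics width falls as the inverse square of the hole width. The following Python sketch tabulates that ratio; the helper `spread_ratio` and the particular hole widths are illustrative assumptions, and the coefficient 1.2 is the rounded value used in the estimate above.

```python
def spread_ratio(wavelength_m, hole_width_m, screen_distance_m, factor=1.2):
    """Diffraction spread on the screen divided by the ray-optics width.

    The spread is 2 * factor * (wavelength / width) * distance, and the
    ray-optics width of the illuminated region equals the hole width.
    """
    spread = 2.0 * factor * wavelength_m / hole_width_m * screen_distance_m
    return spread / hole_width_m


wavelength = 5.0e-7      # m, the visible wavelength used in the text
screen_distance = 1.0    # m

for width_mm in (1.0, 2.0, 5.0, 10.0):   # illustrative hole widths
    ratio = spread_ratio(wavelength, width_mm * 1e-3, screen_distance)
    print(f"hole width {width_mm:5.1f} mm -> spread / ray-optics width = {ratio:.3f}")
```

Doubling the hole width cuts the ratio by a factor of four, which is why the spreading so quickly becomes hard to detect.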

In many practical situations the slight spreading of the light beam traveling
from the hole to the screen can be ignored as a minor effect, compared to the
fundamental observation that it travels along a straight path. This is just what is done
in making the common statement: “Light travels in straight lines.”

The  same  situation  is  found  in  optical  systems  containing  mirrors  and  lenses. 
The  dimensions  of  optical  systems  are  such  that  diffraction  effects  can  be  ignored 
for  most  purposes  and  ray  optics  can  be  used  to  trace  the  paths  of  light  beams 
through  the  systems.  The  purpose  of  tracing  these  rays  is  to  obtain  the  information 
needed  to  design  a system  with  certain  required  characteristics.  As  an  example,  ray 
tracing  is  employed  to  determine  what  lenses  must  be  used  to  produce  a telescope 
with  a given  amount  of  magnification. 

In  Chap.  28  we  treated  plane  mirrors  and  parabolic  mirrors.  In  this 
chapter  we  are  concerned  almost  exclusively  with  lenses.  The  reason  is  that 
lenses  are  much  more  commonly  used  as  components  of  optical  systems 
than are mirrors. To treat lenses, the basic law of ray optics must be
supplemented by Snell’s law of refraction, which describes how a light ray is
bent when passing across the boundary between regions of different
indices of refraction.

The techniques used for dealing with ray optics are basically geometrical.
(For this reason, ray optics is frequently called geometrical optics.) The
properties of simple systems of lenses can be predicted by graphical
constructions, made to scale. But this method suffers from limited accuracy, so
we also develop algebraic equations that can be applied to simple optical
systems. Then applications are given for important optical instruments
such as telescopes and microscopes.

At the end of the chapter we present an algebraic method of handling
the ray optics of lens systems having any degree of complexity. The
method has great power, and it is very straightforward. These attributes
result from employing a mathematical technique called matrix multiplication.
If you have already encountered matrices in your study of mathematics,
you will enjoy seeing them used here for a practical purpose. But it is
not assumed that you have ever heard of a matrix. A self-contained
explanation of matrix multiplication is given at the appropriate point.

Fig. 29-2 The law of reflection.

Fig. 29-3 The law of refraction.

Fig. 29-4 The application of Fermat’s principle in a region of uniform index
of refraction. Light travels from A to B along the straight path because it is the
path of least travel time.

29-2 FERMAT’S PRINCIPLE


The subject of ray optics is built on three foundations. One is the basic law
that light travels in straight rays through a uniform region, a ray being simply
a directed straight line. Another foundation is the specification of how a
ray is reflected at the boundary between such a region and a mirror. This is
the law of reflection

$$\theta = \theta' \tag{29-2}$$

illustrated in Fig. 29-2. The third foundation tells how
a ray is bent when passing from a region where the value of the index of
refraction is $n$ to a region where its value is $n'$. This is Snell’s law of refraction,
depicted in Fig. 29-3:

$$n \sin\theta = n' \sin\theta' \tag{29-3}$$

The origins of all three of these experimentally based laws have been
explained in Chap. 28 in terms of wave optics. Here we show how they can be
obtained from a single comprehensive principle of ray optics, known as
Fermat’s principle: A ray of light follows the path between two points which
requires the least time.

In a material of index of refraction $n$, the speed $v$ of light has the value

$$v = \frac{c}{n} \tag{29-4}$$

where $c$ is the speed of light in vacuum. This is the definition of $n$. So in a
uniform region, that is, a region of constant index of refraction, light
travels at the same speed along any path. Therefore the shortest path between
any two points in such a region is also the path that requires the least travel
time. The well-known fact that the shortest path between any two points A
and B is the straight line connecting them is illustrated in Fig. 29-4. This
fact makes the basic law of ray optics a trivial consequence of Fermat’s
principle.

The argument yielding the law of reflection uses Fig. 29-5. Considerations
given in the figure caption show that the path of least time must be in
the plane that contains the two points A and B and is perpendicular to the
plane of the mirror. The length of the perpendicular from A to the mirror
is $a$, the length of the perpendicular from B to the mirror is $b$, and the
distance between the two perpendiculars is $l$. The as yet unknown point at
which the ray following the path of minimum time is reflected from the

Fig. 29-5 Fermat’s principle applied to reflection. The point at which the light
path of minimum travel time strikes the mirror is in the plane of the page, that
is, in the plane perpendicular to the plane of the mirror and passing
through A and B. This is so because moving the point into or out of the page
increases the travel time because it increases both path lengths. The black
barrier prevents light from traveling directly from A to B.

Fig. 29-6 Fermat’s principle applied to refraction. The point at which the
minimum-travel-time light path strikes the surface separating the regions of
different indices of refraction is in the plane of the page. The reason is the
same as it is for Fig. 29-5.


mirror is at a distance $x$ from the base of the perpendicular from A to the
mirror.

The figure and the pythagorean theorem show that the path length
from A to the point of reflection is $\sqrt{a^2 + x^2}$ and that the path length from
the point of reflection to B is $\sqrt{b^2 + (l - x)^2}$. For light traveling at speed $v$
along the path from A to B, the time $t$ required is the total path length
divided by $v$. Thus

$$t = \frac{\sqrt{a^2 + x^2} + \sqrt{b^2 + (l - x)^2}}{v}$$

Since the value of $t$ depends smoothly on the value of $x$, differential
calculus tells us that if there is a value of $x$ which minimizes $t$, then $dt/dx$ will be
zero for that value. So we evaluate the derivative, obtaining

$$\frac{dt}{dx} = \frac{1}{v}\left[\frac{x}{\sqrt{a^2 + x^2}} - \frac{l - x}{\sqrt{b^2 + (l - x)^2}}\right]$$

Equating the derivative to zero gives us

$$\frac{x}{\sqrt{a^2 + x^2}} = \frac{l - x}{\sqrt{b^2 + (l - x)^2}}$$

The left side of this equality is just $\sin\theta$, and the right side is just $\sin\theta'$. So
we have

$$\sin\theta = \sin\theta'$$

From this we immediately obtain the law of reflection,

$$\theta = \theta'$$

(The relation $dt/dx = 0$ is the criterion that $t$ be either a minimum or a
maximum for the value of $x$ found from the relation. You can convince yourself
that the value obtained for $x$ actually gives a minimum $t$ by calculating this $t$
and comparing it with the $t$ obtained for $x = 0$ and $x = l$.)
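The check suggested in the parentheses can also be carried out numerically. The Python sketch below minimizes the travel time directly by a ternary search, rather than by setting the derivative to zero, and then compares the two sines. The lengths chosen for $a$, $b$, and $l$ and the helper names `travel_time` and `ternary_minimum` are illustrative assumptions.

```python
import math


def travel_time(x, a, b, l, v=1.0):
    """Time from A to the mirror point at distance x, then on to B, at speed v."""
    return (math.hypot(a, x) + math.hypot(b, l - x)) / v


def ternary_minimum(f, lo, hi, tol=1e-9):
    """Locate the minimum of a function with a single minimum on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2   # the minimum cannot lie to the right of m2
        else:
            lo = m1   # the minimum cannot lie to the left of m1
    return 0.5 * (lo + hi)


a, b, l = 1.0, 2.0, 3.0   # illustrative lengths, arbitrary units

x_min = ternary_minimum(lambda x: travel_time(x, a, b, l), 0.0, l)
sin_theta = x_min / math.hypot(a, x_min)
sin_theta_prime = (l - x_min) / math.hypot(b, l - x_min)

print(f"x at minimum time = {x_min:.6f}")
print(f"sin(theta)  = {sin_theta:.6f}")
print(f"sin(theta') = {sin_theta_prime:.6f}")
print(f"t(x_min) = {travel_time(x_min, a, b, l):.4f}, "
      f"t(0) = {travel_time(0.0, a, b, l):.4f}, "
      f"t(l) = {travel_time(l, a, b, l):.4f}")
```

Within the tolerance of the search, the two sines agree, and the travel time at the located point is smaller than at either end of the interval.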


To derive the law of refraction from Fermat’s principle, we construct
Fig. 29-6, with the plane containing the light path perpendicular to the
plane separating the regions of indices of refraction $n$ and $n'$. The light
travels from point A in the first region to a point at unknown distance $x$
from the base of the perpendicular to the separation plane. The length of
the perpendicular is $a$. Then the light continues in the second region to B,
which is at a perpendicular distance $b$ from the plane. The distance
between the perpendiculars is $l$. According to Eq. (29-4), the speed of light
in the first region is

$$v = \frac{c}{n}$$

and its speed in the second region is

$$v' = \frac{c}{n'}$$

So the time $t$ required for light to travel along the path connecting A and B
is

$$t = \frac{\sqrt{a^2 + x^2}}{v} + \frac{\sqrt{b^2 + (l - x)^2}}{v'}
= \frac{1}{c}\left[n\sqrt{a^2 + x^2} + n'\sqrt{b^2 + (l - x)^2}\right]$$

Again evaluating $dt/dx$, we find

$$\frac{dt}{dx} = \frac{1}{c}\left[\frac{nx}{\sqrt{a^2 + x^2}} - \frac{n'(l - x)}{\sqrt{b^2 + (l - x)^2}}\right]$$

Setting the derivative equal to zero gives

$$n\,\frac{x}{\sqrt{a^2 + x^2}} = n'\,\frac{l - x}{\sqrt{b^2 + (l - x)^2}}$$

This equation is just

$$n \sin\theta = n' \sin\theta'$$

which is the law of refraction. (You can convince yourself that the relation
$dt/dx = 0$ actually gives a minimum $t$, instead of a maximum $t$, by going
through the procedure suggested in parentheses at the end of the
preceding paragraph.)
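As with reflection, the result can be checked numerically by minimizing the travel time directly. The Python sketch below does so with a ternary search and then compares $n \sin\theta$ with $n' \sin\theta'$. The indices, the lengths, the choice of units with $c = 1$, and the helper names are all illustrative assumptions.

```python
import math


def travel_time(x, a, b, l, n, n_prime, c=1.0):
    """Travel time from A to B through the boundary point at distance x."""
    return (n * math.hypot(a, x) + n_prime * math.hypot(b, l - x)) / c


def ternary_minimum(f, lo, hi, tol=1e-9):
    """Locate the minimum of a function with a single minimum on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


n, n_prime = 1.0, 1.5        # illustrative indices (roughly air and glass)
a, b, l = 1.0, 1.0, 2.0      # illustrative lengths, arbitrary units

x_min = ternary_minimum(lambda x: travel_time(x, a, b, l, n, n_prime), 0.0, l)
sin_theta = x_min / math.hypot(a, x_min)
sin_theta_prime = (l - x_min) / math.hypot(b, l - x_min)

print(f"x at minimum time  = {x_min:.6f}")
print(f"n  sin(theta)      = {n * sin_theta:.6f}")
print(f"n' sin(theta')     = {n_prime * sin_theta_prime:.6f}")
```

The crossing point lies beyond the midpoint of the interval, so the ray travels farther in the region of lower index, where it moves faster.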

Fermat’s  principle  can  be  taken  as  the  postulatory  basis  of  ray  optics.  In  the 
nineteenth  century  an  analogous  principle  was  found  that  can  be  used,  instead  of 
Newton’s  laws  of  motion,  as  a complete  foundation  for  particle  mechanics  in  the 
newtonian  domain.  It  can  be  given  several  different  expressions.  One  is  known  as 
the  principle  of  least  action  because  it  requires  a particle  to  move  between  initial 
and  final  conditions  in  such  a way  as  to  minimize  a quantity  called  “action.”  This 
quantity  is  the  integral  over  time  of  twice  the  kinetic  energy  of  the  particle.  If  you 
study  mechanics  at  an  advanced  level,  you  will  have  a chance  to  investigate  these 
matters. 


29-3 LENSES

A typical cylindrical lens consists of a piece of glass shaped so as to make
all the light rays of a beam of parallel rays incident on one side pass very
near a single line when they emerge from the other side. The flat plate and
double prism arrangement shown in the top of the photograph in Fig. 29-7
gives a crude approximation to this behavior. The rays that strike the flat
plate of glass pass straight through. Rays hitting the upper prism are bent
down by refraction at each of the two surfaces between the glass and the
surrounding air, and those hitting the lower prism are bent up. The
drawing in the top part of Fig. 29-8 shows the paths of several rays. This
arrangement does produce some convergence of the light rays, but it
would not make a very satisfactory lens.

A better  job  is  done  by  the  arrangement  shown  in  the  middle  of  the 
photo.  It  consists  of  a central,  flat  glass  plate  with  prisms  of  moderately 
nonparallel  sides  above  and  below,  and  prisms  that  have  more  markedly 
nonparallel  sides  above  and  below  these.  The  single  shaped  piece  of  glass  at 
the  bottom  of  the  photo  produces  a very  good  convergence — or,  as  it  is 
said,  focus — of  the  bundle  of  parallel  incident  light  rays.  All  these  rays  pass 
very  near  a single  line  (the  line  is  perpendicular  to  the  plane  of  the  photo) 
on  the  far  side  of  the  lens. 




Fig. 29-7 Photograph of light beams passing through two prism arrangements
approximating a cylindrical lens and through an actual cylindrical lens.
(From PSSC College Physics, D. C. Heath, Boston, 1968. Courtesy Raytheon
Education Co. and Education Development Corporation.)


The  cylindrical  lens  at  the  bottom  of  the  photo  focuses  the  incident 
rays,  to  a very  good  approximation,  because  there  is  just  the  right  change  in 
the  slope  of  its  surfaces  with  increasing  distance  from  the  center.  The  ray 
incident  on  the  center  of  the  lens  is  undeviated.  Rays  hitting  above  and 
below  the  center  are  bent  by  amounts  which  increase  as  their  distance  from 
the center increases. The surfaces of the lens in the photo are parts of
circular cylinders.

Careful measurements show that although the lens produces an essentially
perfect focus for the rays passing near its center, for the rays passing far from its
center the focus is only approximate. In theory, a perfect focus for all rays will be
produced by a lens in which the surface struck by the ray is part of an elliptical
cylinder and the other surface is part of a circular cylinder, with both surfaces
concave in the general direction of travel of the ray. A perfect focus will also be
produced by a lens in which the struck surface is a plane and the other is part of a
hyperbolic cylinder concave in the direction opposite to the general direction that
light travels through the system. But in practice, such lenses are rarely used
because they are very difficult to manufacture compared to lenses in which both
surfaces are parts of circular cylinders. In almost all practical situations, the latter
produce quite acceptable results.

The surfaces of most lenses are not parts of cylinders (circular or
otherwise). Instead they are parts of spheres. Such a lens, which is called a
spherical lens, has an axis of symmetry passing through the centers of the
two spheres. It will make the rays of a beam of parallel light rays (or parallel
beam for short) come to a focus at a point on that axis. The point is known as
the focal point.


Fig. 29-8 Ray diagrams describing the behavior of the prism arrangements
and of the cylindrical lens in the photograph of Fig. 29-7.


The focus is essentially perfect for rays passing near the axis, but only
approximate for rays that are far from the axis. A lens in which one surface is part of an
ellipsoid or of a hyperboloid would be even better than one in which both surfaces
are parts of spheres, because it would have perfect focusing properties even for
rays which are not near the axis. But it is not usually practical to grind such
surfaces in glass.

The focusing produced by a spherical lens is illustrated by the ray
drawing in Fig. 29-9a. In this drawing the symmetry axis of the lens lies in
the plane of the page, and so it looks just like the ray drawing for the
cylindrical lens in Fig. 29-8. The explanation of the focusing in terms of the
progressively increasing bending of the light rays is also just like the
explanation given for a cylindrical lens.

A wave optics explanation of focusing is presented in Fig. 29-9b. The wave
fronts are related to the light rays by the requirement that the tangent plane at each
point on a wave front be normal to a ray passing through that point. The
incident parallel beam is represented by a set of plane wave fronts. Wave fronts of
the emergent beam consist, approximately, of a set of spherical surfaces that
converge on the focal point of the lens. This happens because the part of the wave
front passing through the center of the lens experiences a maximum retardation. It
travels more slowly in glass than in air and is passing through the thickest part of
the glass. The farther from the symmetry axis a part of the wave front is, the less
glass it traverses, and so the less it is held back. A sinusoidal wave has the same
phase at all points on any particular wave front; that is precisely what
makes it a wave front. So all components of the light passing through the lens are
in phase when they converge at the focal point. Thus they superpose
constructively, and the light intensity has a maximum value at that point.




Fig. 29-9 (a) A ray diagram illustrating how a parallel beam of light
is brought to a focus by a lens. (b) A wave front diagram illustrating
how a parallel beam of light is brought to a focus by a lens.


The ray and wave diagrams in Fig. 29-9a and b can also be used to explain the
formation of a beam of light with parallel rays and plane wave fronts from a
diverging source of light. As explained in Sec. 28-4, light travels in a completely
reversible manner from a region where the index of refraction has one value to a
region where it has another value. Thus the effect of the lens in Fig. 29-9a and b is
exactly the same as shown in the figure if the direction of travel of the light is
reversed. In other words, if a point source of light is placed on one side of a lens at a
distance from it equal to the distance from the lens to its focal point, then
emerging from the other side is a beam of light with parallel rays and plane wave
fronts.

Now  we  will  use  Snell’s  law  of  refraction  to  trace  a light  ray  through  a 
lens.  This  calculation  will  prove  that  a lens  whose  surfaces  are  parts  of 
spheres  does  possess  the  focusing  properties  that  we  have  described.  It  will 
also  yield  some  simple  formulas  that  play  a crucial  role  in  the  ray  optics  of 
lenses. 

The lens in the top part of Fig. 29-10 has a symmetry axis indicated by
the dash-dot line. Its surfaces are parts of spheres of radii $\rho_1$ and $\rho_2$, and it
is made of material with index of refraction $n$. We assume the lens is in
vacuum or air, so that the index of refraction of the material surrounding
the lens is essentially equal to one. The ray incident on the lens is directed to
the right and strikes surface 1 first. For generality, we do not take the
incident ray to be parallel to the symmetry axis. Instead, it passes upward
through the axis at a distance $s$ in front of the lens. The ray emerging
from the lens travels downward and passes through the axis at distance $s'$
behind the lens.

The middle part of Fig. 29-10 shows the refraction of the ray at
surface 1, and the bottom part shows the refraction at surface 2. In each, the
dotted line is a radius of the surface passing through the point where the
refraction occurs. The quantities $y_1$ and $y_2$ give the distances from those
points to the axis. Several angles will enter in the calculation. They are
defined in the figure, using dashed lines that are parallel to the axis.

It is apparent from the figure that the angle of incidence $\theta_1$ of the ray
on surface 1 is given by

$$\theta_1 = \alpha_1 + \phi_1 \tag{29-5a}$$

and that its angle of refraction $\theta_1'$ on the same surface is given by

$$\theta_1' = \alpha_1 + \phi_1' \tag{29-5b}$$



Fig. 29-10 (a) The path of a light ray through a lens. (b) The refraction of
the ray as it passes through the first surface of the lens. (c) The refraction
of the ray as it passes through the second surface.


The angles $\theta_1$ and $\theta_1'$ are related by Snell’s law,

$$n_1 \sin\theta_1 = n_1' \sin\theta_1'$$

Here $n_1 = 1$, the index of refraction of the air to the left of the surface.
Furthermore, we have $n_1' = n$, the index of refraction of the glass to its
right. Thus we have

$$\sin\theta_1 = n \sin\theta_1'$$

To simplify this relation, we now restrict ourselves to treating only rays
for which both of the angles $\theta_1$ and $\theta_1'$ are small angles. If you consider the
figure for a moment, visualizing its appearance in a case when $y_1$ is small,
you will see that in such a case both $\theta_1$ and $\theta_1'$ are small. Thus restricting $\theta_1$
and $\theta_1'$ to be small means that we consider only rays that are near the axis. Such
rays are called paraxial rays. (Of course, the rays drawn in the figure are
not very near the axis because all the angles must be exaggerated for the
sake of clarity.) With this restriction, we are able to replace the sines of $\theta_1$
and $\theta_1'$ by the angles themselves, measured in radians. Then Snell’s law
reads

$$\theta_1 = n\theta_1' \tag{29-6}$$
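To get a feeling for how small the angles must be, the following Python sketch compares the angle of refraction given by the exact form of Snell’s law with the paraxial value from Eq. (29-6). The index 1.5 and the list of angles are illustrative assumptions, as are the function names.

```python
import math


def refraction_angles(theta1_rad, n):
    """Return (exact, paraxial) angles of refraction for light entering glass from air."""
    exact = math.asin(math.sin(theta1_rad) / n)   # sin(theta1) = n sin(theta1')
    paraxial = theta1_rad / n                     # Eq. (29-6): theta1 = n theta1'
    return exact, paraxial


def relative_error(theta1_rad, n):
    """Fractional error made by the paraxial form of Snell's law."""
    exact, paraxial = refraction_angles(theta1_rad, n)
    return abs(paraxial - exact) / exact


n_glass = 1.5   # assumed index of refraction, typical of glass

for degrees in (1, 5, 10, 20, 30):
    theta1 = math.radians(degrees)
    exact, paraxial = refraction_angles(theta1, n_glass)
    print(f"theta1 = {degrees:2d} deg: exact = {math.degrees(exact):7.4f} deg, "
          f"paraxial = {math.degrees(paraxial):7.4f} deg, "
          f"error = {100 * relative_error(theta1, n_glass):.3f} %")
```

The error grows roughly as the square of the angle of incidence, so the paraxial form is excellent for angles of a few degrees and noticeably off by a few tens of degrees.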




Using Eqs. (29-5a) and (29-5b), we obtain from Eq. (29-6) the expression

$$\alpha_1 + \phi_1 = n(\alpha_1 + \phi_1') \tag{29-7}$$

At surface 2 the angles of incidence and refraction are given by

$$\theta_2 = \alpha_2 - \phi_2 \tag{29-8a}$$

and

$$\theta_2' = \alpha_2 + \phi_2' \tag{29-8b}$$

Snell’s law for the refraction at this surface is

$$n_2 \sin\theta_2 = n_2' \sin\theta_2'$$

Setting $n_2 = n$ and $n_2' = 1$, and taking advantage of the fact that in
restricting ourselves to paraxial rays we ensure also that both $\theta_2$ and $\theta_2'$ are
small, we obtain

$$n\theta_2 = \theta_2' \tag{29-9}$$

Using Eqs. (29-8a) and (29-8b) in Eq. (29-9), we find

$$n(\alpha_2 - \phi_2) = \alpha_2 + \phi_2' \tag{29-10}$$

Now the figure shows that $\phi_2 = \phi_1'$. Thus we can eliminate this angle
that is common to the two parts of the calculation. This is done by writing
$\phi_1'$ for $\phi_2$ in Eq. (29-10) and then transposing so that it becomes

$$\alpha_2 + \phi_2' = n(\alpha_2 - \phi_1')$$

Adding this equation to Eq. (29-7) produces

$$\alpha_1 + \alpha_2 + \phi_1 + \phi_2' = n(\alpha_1 + \alpha_2)$$

or

$$\phi_1 + \phi_2' = (n - 1)(\alpha_1 + \alpha_2) \tag{29-11}$$


Next we find expressions for the angles appearing in Eq. (29-11). The
figure shows that we have

$$\sin\alpha_1 = \frac{y_1}{\rho_1} \qquad\text{and}\qquad \sin\alpha_2 = \frac{y_2}{\rho_2}$$

But the paraxial restriction allows us to write these as

$$\alpha_1 = \frac{y_1}{\rho_1} \qquad\text{and}\qquad \alpha_2 = \frac{y_2}{\rho_2} \tag{29-12}$$

To obtain similar expressions for $\phi_1$ and $\phi_2'$, we impose the restriction that
we deal only with a thin lens. Specifically, the thickness of the lens at its
symmetry axis is supposed to be small compared to the radii of curvature $\rho_1$ and $\rho_2$ of its
surfaces and also small compared to the distances $s$ and $s'$ from the lens to the points
where the rays cross the axis. Then it makes no difference from what location
in the lens the distances $s$ and $s'$ are measured, and we can write

$$\tan\phi_1 = \frac{y_1}{s} \qquad\text{and}\qquad \tan\phi_2' = \frac{y_2}{s'}$$

The paraxial restriction assures us that $\phi_1$ and $\phi_2'$ are small and thus allows
us to replace the tangent of each angle by the angle itself. So these
expressions simplify to

$$\phi_1 = \frac{y_1}{s} \qquad\text{and}\qquad \phi_2' = \frac{y_2}{s'} \tag{29-13}$$



Fig. 29-11 (a) A diagram illustrating the way a lens refracts divergent light
rays originating from an axial point on one side of the lens so that they
converge to an axial point on the other side. (b) A diagram illustrating the way a
lens retards diverging wave fronts originating from an
axial point on one side of the lens so that they converge to
an axial point on the other side. The behavior shown in
both parts of the figure is followed strictly only for the
rays, or for the parts of the wave fronts, which are near the
axis. The same is true of Fig. 29-9.


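Using Eqs. (29-12) and (29-13) in Eq. (29-11) gives us

$$\frac{y_1}{s} + \frac{y_2}{s'} = (n - 1)\left(\frac{y_1}{\rho_1} + \frac{y_2}{\rho_2}\right) \tag{29-14}$$

Furthermore, with the thin-lens restriction the quantities $y_1$ and $y_2$ must
have essentially the same value. Dividing Eq. (29-14) through by that value,
we obtain the result

$$\frac{1}{s} + \frac{1}{s'} = (n - 1)\left(\frac{1}{\rho_1} + \frac{1}{\rho_2}\right) \tag{29-15}$$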
Note that the angle $\phi_1$ does not appear in the result. This is crucial
because it means that all paraxial rays, independent of their angle of
inclination with respect to the axis, which start from a point on the axis at
distance $s$ to the left of the thin lens will be bent by it in such a way as to cross
the axis at a point at distance $s'$ to its right. This property is depicted in the
ray diagram of Fig. 29-11a. A light source is located at a point on the axis to
the left of a thin lens. In obtaining Eq. (29-15), we have proved that all
paraxial rays emanating from the source will pass through a point on the
axis to the right of the lens.

The corresponding wave front diagram is shown in Fig. 29-11b. Expanding away from
the light source at the axis on the left side of the lens is a set of spherical wave
fronts. The wave fronts are retarded by different amounts in passing through
different regions of the lens. And for the parts of the wave fronts which are
sufficiently near the axis to satisfy the paraxial restriction, the retardation
converts them into a set of spherical wave fronts that converge to a point at the
axis on the right side of the lens.

Since  a wave  front  is  a surface  on  which  a sinusoidal  wave  everywhere  has 
the  same  phase,  all  components  of  the  light  wave  are  in  phase  at  the  single  point  to 
which  it  converges.  They  were  also  in  phase  at  the  point  where  they  diverged  from 
the  source.  From  these  two  statements  it  is  not  difficult  to  conclude  that  the  same 
time  is  required  for  different  paraxial  parts  of  any  wave  front  to  travel  from  the 
point  of  divergence,  through  the  thin  lens,  to  the  point  of  convergence.  Expressed 
in  ray  language,  this  conclusion  tells  us  that  the  same  time  is  required  for  light  to 
travel  between  the  two  points  along  all  paraxial  rays  passing  through  the  lens. 
Fermat’s  principle  says  that  for  any  of  these  paths  the  travel  time  is  a minimum. 
Thus the paraxial rays passing between axial points on opposite sides of a thin
lens form a family of paths along which the light travel times all have the same
minimum value, provided the locations of these axial points are related as
specified in Eq. (29-15).
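The equal-travel-time statement can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes Eq. (29-15) to have the form 1/s + 1/s' = (n - 1)(1/ρ1 + 1/ρ2), models a double-convex lens by a parabolic (paraxial) thickness profile, and uses made-up values for n, ρ1, ρ2, s, and the center thickness. The function names are hypothetical. It compares optical path lengths, which are proportional to travel times, for rays crossing the lens plane at several small heights.

```python
import math

def optical_path_cm(y, s, s_prime, n, rho1, rho2, center_thickness=0.5):
    """Optical path length from the axial source to the axial image point
    for a ray crossing the lens plane at height y (all lengths in cm)."""
    # Assumed paraxial (parabolic) thickness profile of a double-convex lens.
    thickness = center_thickness - 0.5 * y**2 * (1.0 / rho1 + 1.0 / rho2)
    # Straight segments in air, plus the extra optical path (n - 1) * thickness in glass.
    return math.hypot(s, y) + math.hypot(s_prime, y) + (n - 1.0) * thickness

def path_spread_cm(s, s_prime, n, rho1, rho2, heights):
    """Largest minus smallest optical path length over the given ray heights."""
    paths = [optical_path_cm(y, s, s_prime, n, rho1, rho2) for y in heights]
    return max(paths) - min(paths)

n, rho1, rho2 = 1.50, 10.0, 15.0       # illustrative lens, not taken from the text
s = 30.0                                # illustrative object distance
# Image distance from the assumed form of Eq. (29-15).
s_image = 1.0 / ((n - 1.0) * (1.0 / rho1 + 1.0 / rho2) - 1.0 / s)
heights = [0.1 * k for k in range(6)]   # rays crossing 0 to 0.5 cm off axis

spread_at_image = path_spread_cm(s, s_image, n, rho1, rho2, heights)
spread_off_image = path_spread_cm(s, s_image + 5.0, n, rho1, rho2, heights)
print(f"s' = {s_image:.1f} cm")
print(f"path spread at s':        {spread_at_image:.2e} cm")
print(f"path spread 5 cm farther: {spread_off_image:.2e} cm")
```

With these numbers the spread in path length at the predicted image point is far smaller than the spread at a point 5 cm farther along the axis, which is what the equal-time argument leads one to expect for paraxial rays.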

The question of when rays may be considered as paraxial is a complicated one. We
can get some idea of the error in Eq. (29-15), resulting from the assumption that
sines and tangents of angles equal the angles themselves, by noting that for an
angle of, say, 0.2 rad = 11° the sine differs from the angle by 0.7 percent and the
tangent differs from the angle by 1.3 percent. But the relation between the
distance of a ray from some point on the axis and the error actually made depends
on the particular shape of the thin lens (and on other particular properties of the
geometry in a system containing more than one such lens). A method frequently used
to control how far from the axis a ray may be, and still be used in an optical
system, is to place opaque screens in one or more planes normal to the axis with
holes centered on the axis and of appropriate radii. Such devices are called stops.
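The size of the small-angle error is easy to tabulate. The following sketch is an illustration, not part of the original treatment; the function name is hypothetical, and it assumes each percent difference is taken relative to the sine or tangent itself, a choice that reproduces the 0.7 and 1.3 percent figures quoted for 0.2 rad.

```python
import math

def small_angle_errors(angle_rad):
    """Percent by which the angle differs from its sine and from its tangent.

    Assumption: each difference is expressed relative to the function value.
    """
    sine = math.sin(angle_rad)
    tangent = math.tan(angle_rad)
    sin_error = 100.0 * (angle_rad - sine) / sine
    tan_error = 100.0 * (tangent - angle_rad) / tangent
    return sin_error, tan_error

for angle in (0.05, 0.1, 0.2, 0.4):
    sin_err, tan_err = small_angle_errors(angle)
    print(f"{angle:4.2f} rad = {math.degrees(angle):5.1f} deg: "
          f"sine differs by {sin_err:.2f}%, tangent by {tan_err:.2f}%")
```

The errors grow roughly as the square of the angle, so halving the angular aperture admitted by a stop cuts them by about a factor of four.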

A separate question concerning the accuracy of Eq. (29-15) has to do with the
assumption that the thickness of a lens is negligible compared to the other
dimensions of importance. This is investigated later in the chapter by comparing
results obtained from Eq. (29-15) with those obtained in a treatment in which the
thin-lens assumption is not made.

We will use Eq. (29-15) for a variety of different types of lenses. In so
doing, it will be more convenient if we write it in terms of the quantities $r_1$
and $r_2$, which are defined to have the values

$$r_1 = \rho_1 \qquad\text{and}\qquad r_2 = -\rho_2 \tag{29-16}$$

In these terms, the equation assumes the form

$$\frac{1}{s} + \frac{1}{s'} = (n - 1)\left(\frac{1}{r_1} - \frac{1}{r_2}\right) \tag{29-17}$$

The quantity r associated with each surface of the lens has a magnitude
equal to its radius of curvature and a sign given by the following convention.
The sign of r is positive if the direction from the surface to its center of
curvature is the same as the general direction in which light travels through the
system, and is negative otherwise. Inspect Fig. 29-10a and convince yourself that
the signs found in Eq. (29-16) agree with this convention.

The left side of Eq. (29-17) is the sum of the reciprocals of the distances
from the lens to the points where the incident and emergent rays cross the axis.
To interpret the right side of the equation, consider a case in which the
incident ray is precisely parallel to the axis, as in Fig. 29-12. Then the
distance s becomes infinitely large, and the value of its reciprocal 1/s becomes
zero. In this case, we have

$$\frac{1}{s'} = (n - 1)\left(\frac{1}{r_1} - \frac{1}{r_2}\right) \qquad \text{for } s = \infty \tag{29-18}$$

Fig. 29-12 All paraxial rays parallel to the axis of a thin lens will cross the
axis at the same distance s' behind the plane of the lens. That distance is just
the focal length f of the lens.




Fig. 29-13 A thin lens, such as is used in reading glasses for a person with
farsighted vision. (Figure labels: incident light; $\rho$ = 7.0 cm; $\rho$ = 13.0 cm;
n = 1.50.)

Note that this result will be obtained for any ray passing through a thin
lens, no matter how far it is from the axis before it is incident on the lens,
providing it is not so far as to violate the paraxial restriction. When the
ray emerges from the lens, it will cross the axis at a distance s' beyond the
lens. And so will every other paraxial ray of a beam that is parallel to the
axis, since Eq. (29-18) applies to all of them. Thus the equation predicts
that a beam of such rays will be focused by the lens to a point whose distance
from the lens is the value of s' specified by the equation. This point is
the one that previously we have defined to be the focal point of the lens.
The distance from the lens to the focal point is called the focal length f of
the lens. So s' = f in Eq. (29-18), and we have

$$\frac{1}{f} = (n - 1)\left(\frac{1}{r_1} - \frac{1}{r_2}\right) \tag{29-19}$$

This is known as the lens maker's formula. It tells us how to predict the
focal length of a lens in terms of its physical characteristics. Calculations
very much like the one we made for the type of lens in Fig. 29-10a show
that Eq. (29-19) can be used for any type of thin lens with spherical surfaces,
if proper attention is paid to the sign convention for r. A plane surface
can be handled by using an infinitely large radius of curvature.

Examples  29-1  through  29-3  illustrate  the  use  of  the  lens  maker’s 
formula. 


EXAMPLE  29-1 

The lens shown in Fig. 29-13 is used in reading glasses for a farsighted person. Its
first and second surfaces have radii of curvature 7.0 cm and 13.0 cm, respectively,
and it is made of glass with an index of refraction equal to 1.50. Determine the
focal length of the lens.

■ Here both surfaces are concave in the direction of passage of the light through
the lens, so r is positive for both of them. Specifically, you have $r_1$ = +7.0 cm
and $r_2$ = +13.0 cm. If you set n = 1.50, the lens maker's formula gives

$$\frac{1}{f} = (1.50 - 1)\left(\frac{1}{+7.0\ \text{cm}} - \frac{1}{+13.0\ \text{cm}}\right) = 0.033\ \text{cm}^{-1}$$

or

$$f = 30\ \text{cm}$$
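The arithmetic of Example 29-1 can be reproduced with a few lines of code. This is a minimal sketch of the lens maker's formula, 1/f = (n - 1)(1/r1 - 1/r2), using the signed radii of the text's convention; the function name `focal_length` is a hypothetical choice, and a plane surface is represented by an infinite radius as the text suggests.

```python
import math

def focal_length(n, r1, r2):
    """Focal length from the lens maker's formula, 1/f = (n - 1)(1/r1 - 1/r2).

    r1 and r2 are signed radii of curvature; pass math.inf for a plane surface.
    """
    power = (n - 1.0) * (1.0 / r1 - 1.0 / r2)
    # A lens with zero power has no finite focal point.
    return math.inf if power == 0.0 else 1.0 / power

f_example = focal_length(1.50, 7.0, 13.0)   # Example 29-1, lengths in cm
print(f"1/f = {1.0 / f_example:.3f} cm^-1, f = {f_example:.0f} cm")
```

Because the radii enter only through their reciprocals, a plane surface contributes nothing, for instance `focal_length(1.52, 10.4, math.inf)`.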


EXAMPLE  29-2 

The average index of refraction of crown glass for white light is 1.52. A lens is to
be made from this glass with one surface spherical and the other a plane. It will be
used with light first striking the spherical surface, as in Fig. 29-14a.


Fig. 29-14 A thin lens with one surface spherical and the other a plane. (a) The
light passing through the lens strikes the spherical surface first. (b) The lens is
turned around so that the light strikes the plane surface first. Example 29-2
shows that the focal length of the thin lens is the same in both cases.


a. Determine the radius of curvature of the spherical surface which will make
the lens have a focal length of 20.0 cm.

b. If the lens is reversed, so that light first strikes the plane surface as in
Fig. 29-14b, what will the focal length of the lens be?

■ a. In Fig. 29-14a surface 1 is concave in the direction that light travels through
the lens. So in the lens maker's formula you should set $r_1 = +\rho$, where $\rho$ is
the radius of curvature to be evaluated. The radius of curvature of surface 2 is
infinitely large, so its reciprocal is zero and you set $1/r_2 = 0$. Then you have

$$\frac{1}{f} = (n - 1)\left(\frac{1}{\rho} - 0\right) = \frac{n - 1}{\rho}$$

Solving for $\rho$ gives

$$\rho = (n - 1)f$$

The numerical value is

$$\rho = (1.52 - 1) \times 20.0\ \text{cm} = 10.4\ \text{cm}$$

b. For light passing through the lens in the direction shown in Fig. 29-14b, you
have $1/r_1 = 0$ and $r_2 = -\rho$. Thus the lens maker's formula says the focal
length will have a value given by

$$\frac{1}{f} = (n - 1)\left(0 - \frac{1}{-\rho}\right) = \frac{n - 1}{\rho}$$

But this expression is identical to the one specifying the value of f that you
obtained in part a. Therefore you can conclude that the focal length of the thin
lens is the same, no matter in which direction light passes through it. This is
true for any thin lens. Prove it for the one considered in Example 29-1.

Lenses of the type considered in this example are very common because they
are the easiest to make. Obtain one and then verify the conclusion of part b
experimentally, using sunlight to provide a parallel incident beam.
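The reversal property can also be checked numerically. The sketch below is an illustration with hypothetical function names. It relies on one observation about the sign convention: turning a lens around makes the old second surface the first one, and moves each center of curvature to the opposite side, so the signed radii (r1, r2) become (-r2, -r1).

```python
import math

def focal_length(n, r1, r2):
    """Lens maker's formula, 1/f = (n - 1)(1/r1 - 1/r2), with signed radii."""
    power = (n - 1.0) * (1.0 / r1 - 1.0 / r2)
    return math.inf if power == 0.0 else 1.0 / power

def reversed_radii(r1, r2):
    """Signed radii met by the light after the lens is turned around."""
    # Old surface 2 is met first, and both centers of curvature switch sides.
    return -r2, -r1

lenses = {
    "Example 29-1": (1.50, 7.0, 13.0),
    "Example 29-2": (1.52, 10.4, math.inf),
}
for name, (n, r1, r2) in lenses.items():
    forward = focal_length(n, r1, r2)
    backward = focal_length(n, *reversed_radii(r1, r2))
    print(f"{name}: f = {forward:.1f} cm forward, {backward:.1f} cm reversed")
```

The two focal lengths agree because 1/(-r2) - 1/(-r1) equals 1/r1 - 1/r2 for any pair of radii.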


EXAMPLE  29-3 

Evaluate the focal length of the lens shown in Fig. 29-15. It is made of high-density
flint glass with index of refraction 1.66, and both radii of curvature are 15.0 cm.

■ Taking the direction of the incident light to be toward the right, for surface 1
you have $r_1$ = -15.0 cm since it is concave to the left. For surface 2 the value is
$r_2$ = +15.0 cm because that surface is concave to the right. So the lens maker's
formula says the focal length f of the lens is given by

$$\frac{1}{f} = (1.66 - 1)\left(\frac{1}{-15.0\ \text{cm}} - \frac{1}{+15.0\ \text{cm}}\right) = -0.088\ \text{cm}^{-1}$$

or

$$f = -11.4\ \text{cm}$$

Fig. 29-15 A diverging thin lens. (Figure labels: $\rho$ = 15.0 cm; n = 1.66.)


In predicting a negative value for the focal length f of the lens in Fig.
29-15, the lens maker's formula is telling us that it is a diverging lens. The
light rays seen in the photograph of Fig. 29-16 show the prediction is correct.
A diverging lens makes the rays passing through it appear to diverge
from a point on the side of the lens they strike first. Can you give a wave
optics explanation of the behavior of a diverging lens, analogous to the one
presented for a converging lens in Fig. 29-11b? One of the most important
uses for diverging lenses is in eyeglasses to correct nearsighted vision, as you
will find in Sec. 29-4.

A converging lens (f > 0) makes incident parallel rays actually converge
to a point on one side of the lens. A diverging lens (f < 0) makes the rays
appear to diverge from a point on the other side. Although their significances
are quite different, both points are called focal points. The distinction is
made amply clear by the sign of f and the following convention. The sign of f
is positive if the direction from the lens to the focal point is the same as the general
direction in which light travels through the system, and is negative otherwise.


Fig. 29-16 A photograph of light beams being refracted by a diverging thin lens.
The lens is cylindrical, but a similar spherical lens behaves in a similar way.
(There is also some reflection of the beams. But this is not of interest.)
(Courtesy Bausch and Lomb.)




The accuracy of the focusing properties of a spherical lens is limited by
how well the beam of light passing through it satisfies the paraxial restriction.
If the lens is thin, then the location of the approximate focus is given
by the formula in Eq. (29-19). A thick spherical lens will produce about as
good a focus as a thin lens for a given light beam, but the formula will not
be useful in calculating the location of its focal point. We show later how to
treat thick lenses theoretically. For any lens the focal-point location can be
determined experimentally, by using the procedure indicated in Fig. 29-7
or at the end of Example 29-2.


29-4 IMAGE FORMATION

If a point lies on the axis of a converging lens, and is not too close to the
lens, all paraxial light rays emitted from it that strike the lens will come
together at a point on the axis lying on the other side of the lens. We have
given the symbol s to the distance along the axis between the point where
the incident rays leave the axis and the thin lens. The distance along the
axis between the thin lens and the point where the rays emerging from it
return to the axis we have designated as s'. Given the focal length f of the
lens, we can immediately obtain the relation between s and s'. We take the
definition of the focal length of a thin lens, Eq. (29-19):

$$\frac{1}{f} = (n - 1)\left(\frac{1}{r_1} - \frac{1}{r_2}\right)$$

Then we substitute it into Eq. (29-17):

$$\frac{1}{s} + \frac{1}{s'} = (n - 1)\left(\frac{1}{r_1} - \frac{1}{r_2}\right)$$

We obtain

$$\frac{1}{s} + \frac{1}{s'} = \frac{1}{f} \tag{29-20}$$

This simple relation is very important in the ray optics of thin lenses. It lets
us determine the distance s' for any value of the distance s, given the focal
length f of the lens.
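As a quick illustration of how the relation is used, the sketch below solves 1/s + 1/s' = 1/f for s' over a range of object distances. The function name and the focal length of 10 cm are hypothetical choices made only for this example.

```python
import math

def image_distance(s, f):
    """Solve 1/s + 1/s' = 1/f for s'. Returns math.inf when s equals f."""
    inverse = 1.0 / f - 1.0 / s
    return math.inf if inverse == 0.0 else 1.0 / inverse

f = 10.0  # hypothetical focal length in cm
for s in (math.inf, 40.0, 20.0, 15.0, 10.0):
    print(f"s = {s:>5} cm  ->  s' = {image_distance(s, f):.1f} cm")
```

An infinitely distant source gives s' = f, as the definition of the focal length requires, and s' grows without limit as s approaches f.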

Furthermore, the relation can be used even if the light rays emanate
from a point which is not on the axis of the thin lens, providing it is close
enough that the rays are paraxial and all angles are small. Figure 29-17
shows two such rays. They are emitted from some point on a plane, called
the object plane, which is parallel to the plane containing the thin lens. The
distance s between the two planes is called the object distance. We will
prove that these two rays come together at a point on another parallel
plane, the image plane, lying at the image distance s' on the other side of
the plane of the converging lens. In these terms, the relation of Eq. (29-20)
states that the reciprocal of the object distance plus the reciprocal of the image
distance equals the reciprocal of the focal length.

Fig. 29-17 All paraxial rays emitted from a point on a plane parallel to the plane
of a converging lens will converge to a point on another parallel plane, no matter
where the point of emission is located, if the object distance s is greater than the
focal length f. Two special rays are shown here, but the same is true of all others.
Corresponding to each different point of emission there is a different point where
the rays converge. The object and image distances s and s' are related to the focal
length f of the lens by Eq. (29-20). This relation determines, in turn, the ratio of
the heights above and below the axis of the points of emission and convergence,
h and h', since the triangle whose sides are s and h is similar to the one whose
sides are s' and h'. (Figure label: object plane.)

In fact, we will prove that all the paraxial light rays emitted from a particular
point on the object plane converge at some particular point on the
image plane. Rays from a different point on that object plane converge at
some other point, but the point lies on the same image plane. Furthermore,
the height h' below the axis of any point on the image plane is proportional
to the height h above the axis of the point on the object plane that gave rise
to it. As a result of these properties, if an illuminated object is located in a
plane parallel to the plane of the lens, then the lens will cause the light
emanating from it to form a reproduction of the same shape (that is, an
image) in another parallel plane. The size of the image is generally different
from the size of the object, and its orientation may be reversed. For
the circumstances depicted in the figure, the image is enlarged and inverted
(that is, "upside down"). But both these features can be different in different
circumstances. The ability to form images, of adjustable size and orientation,
is what makes lenses really useful.

To prove that a thin lens has these image-forming properties, we consider
Fig. 29-18. This shows the two rays drawn in Fig. 29-17, as viewed in
the plane containing the rays. The object is an arrow of height h lying in a
plane parallel to that of the lens. Its base is at a point on the lens axis, and
we use the earlier symbolism to call the distance from the arrow to the lens
s. We know that any two paraxial rays emerging from the base of the arrow
come together again at the axis on the other side of the lens at distance s',
where s' is related to s and the focal length f of the lens by Eq. (29-20).
Where do any two paraxial rays emerging from the tip of the arrow come
together?


Fig. 29-18 A ray diagram used to demonstrate the image-forming properties of a thin
lens for paraxial rays. It is assumed that the size of the lens measured
perpendicular to the axis is large enough, compared to the size of the object, that
a ray from the tip of the object directed parallel to the axis passes through the
lens. This assumption allows the proof to be simplified by using the parallel ray.
The fact that the results obtained in the proof, Eqs. (29-25) and (29-28), do not
involve the size of the lens suggests that they are independent of the assumption.
The suggestion can be verified by a more complicated demonstration which leads to
the same equations, but does not use the parallel ray.


The  question  is  most  easily  answered  by  considering  the  two  particular 
rays  shown  in  the  figure.  One  is  a ray  from  the  arrow  tip  that  is  directed 
parallel  to  the  axis.  When  it  passes  through  the  lens,  we  know  it  will  be  bent 
toward the axis so that it passes through the focal point at distance f beyond
the  lens,  as  shown  in  the  figure.  This  follows  from  the  definition  of  focal 
length,  since  the  ray  in  question  could  also  be  a ray  of  a parallel  beam  of 
light  that  is  incident  on  the  lens.  The  second  ray  is  directed  so  that  it  hits 
the  center  of  the  lens.  It  will  not  be  bent  in  passing  through  the  thin  lens,  as 
the  figure  also  shows.  The  reason  is  that  the  ray  passes  through  the  lens  at 
the  point  where  its  two  surfaces  are  parallel.  So  the  angle  through  which  it 
is  refracted  on  entering  the  glass  is  the  opposite  of  the  angle  through  which 
it  is  refracted  on  leaving.  There  will  be  a small  lateral  displacement  of  the 
ray,  but  this  is  negligible  because  the  lens  is  thin. 

To find the point of intersection of the two rays emerging from the
lens, it is convenient to use the x and y axes defined in the figure. The
equation of the ray going through the center of the lens is

$$y = -\frac{h}{s}\,x \tag{29-21}$$

You can see that this is so by noting that it correctly predicts y = 0 for x = 0
and y = h for x = -s. The equation for the ray going through the focal
point is

$$y = h - \frac{h}{f}\,x \tag{29-22}$$

This can be verified by the observation that it gives the correct values y = h
for x = 0 and y = 0 for x = f.

The intersection point of the two rays is the point where both Eq.
(29-21) and Eq. (29-22) give the same value of y. So we find that point by
equating the right sides of the two, obtaining

$$-\frac{h}{s}\,x = h - \frac{h}{f}\,x$$

This gives us

$$\frac{h}{f}\,x - \frac{h}{s}\,x = h$$

or

$$\left(\frac{1}{f} - \frac{1}{s}\right)x = 1$$

Therefore the x coordinate of the point of intersection of the two rays from
the tip of the arrow satisfies the relation

$$\frac{1}{f} - \frac{1}{s} = \frac{1}{x} \tag{29-23}$$


The x coordinate of the intersection point of two rays from the base of
the arrow has a value s' specified by Eq. (29-20). If we transpose that equation,
it reads

$$\frac{1}{f} - \frac{1}{s} = \frac{1}{s'} \tag{29-24}$$

Comparing Eqs. (29-23) and (29-24), we see immediately that

$$x = s' \tag{29-25}$$


Thus the distance, measured parallel to the axis from the plane of the lens,
to the point of convergence of the rays from the tip of the arrow is the same
as the distance to the point of convergence of the rays from the base of the
arrow. This means that both convergence points lie in a plane parallel to
the plane of the lens. The same is obviously true for the points of convergence
of rays coming from any other point of the arrow. All these points
lie in the same image plane at distance s' from the parallel plane containing
the thin lens.
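The claim that every point of the arrow is imaged in the same plane can be illustrated numerically. The sketch below takes the two ray equations to be y = -(h/s)x for the ray through the center of the lens and y = h - (h/f)x for the ray through the focal point, the forms fixed by the checks quoted in the text. The function name and the values of s and f are hypothetical.

```python
def tip_image(h, s, f):
    """Intersect the central ray y = -(h/s) x with the focal ray y = h - (h/f) x."""
    # Equating the two right-hand sides gives x (1/f - 1/s) = 1.
    x = 1.0 / (1.0 / f - 1.0 / s)
    y = -(h / s) * x
    return x, y

s, f = 30.0, 10.0   # hypothetical object distance and focal length, in cm
for h in (0.5, 1.0, 2.0):
    x, y = tip_image(h, s, f)
    print(f"h = {h:.1f} cm -> rays meet at x = {x:.1f} cm, y = {y:.2f} cm")
```

The x coordinate of the meeting point is the same for every object height, while the y coordinate scales in proportion to h.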

To determine the height h' of the image formed by the arrow, we find
the y coordinate of the image of the arrow's tip. Substituting the value x = s'
given by Eq. (29-25) for its x coordinate into Eq. (29-21), we obtain

$$y = -\frac{h}{s}\,s'$$

Thus we have

$$h' = -\frac{s'}{s}\,h \tag{29-26}$$


The negative value of the height h' of the arrow's image tells us that the
image is inverted, as can also be seen from Fig. 29-18. But otherwise the
image has the same shape as the object that produced it, instead of being
distorted in some way. This is a consequence of the fact that h' is directly
proportional to h. Unless s' happens to equal s, the size of the image differs
from the size of the object. The relation between the two sizes is measured
by the lateral magnification l, defined as

$$l \equiv \frac{h'}{h} \tag{29-27}$$

According to Eq. (29-26), for an image formed from an object by a thin lens
l has the value

$$l = -\frac{s'}{s} \tag{29-28}$$

The absolute value of the lateral magnification can be larger than 1. Or it
can be smaller than 1, in which case there is actually a lateral contraction.
The situation that actually occurs depends on the ratio of s' to s.

For the case we have been considering both s and s' are positive. So l is
negative, implying an inverted image. But situations arise in which either
or both of the quantities s and s' are negative. Nevertheless, Eq. (29-28)
gives the right sign for l, as well as the right magnitude, if the following sign
conventions are followed. For the object distance: The sign of s is positive if the
direction from the object plane to the lens is the same as the general direction in which
light travels through the system, and is negative otherwise. For the image distance:
The sign of s' is positive if the direction from the lens to the image plane is the same as
the general direction in which light travels through the system, and is negative
otherwise. For the object height: The sign of h is positive if it measures a height above
the lens axis, and is negative otherwise. For the image height: The sign of h' is positive
if it measures a height above the lens axis, and is negative otherwise. (Note that in ray
optics the names object "distance" and image "distance" are used for the
signed scalars s and s', and the name focal "length" is used for the signed
scalar f, despite the fact that elsewhere the words "distance" and "length"
refer to quantities which are magnitudes only and so do not have signs.)
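The sign conventions can be exercised with a short calculation. The sketch below is illustrative only: the function names and the numerical inputs are hypothetical, and the wording of the printed summary is our own reading of what the signs of s' and l mean under the conventions just stated.

```python
def describe_image(s, f, h):
    """Return (s', l, h') from Eqs. (29-20), (29-28), and (29-27)."""
    s_prime = 1.0 / (1.0 / f - 1.0 / s)    # image distance
    magnification = -s_prime / s            # lateral magnification l
    return s_prime, magnification, magnification * h

def summarize(s, f, h):
    """Translate the signs of s' and l into words."""
    s_prime, magnification, h_prime = describe_image(s, f, h)
    side = "beyond the lens" if s_prime > 0 else "on the object side"
    orientation = "erect" if magnification > 0 else "inverted"
    return (f"s' = {s_prime:+.1f} cm ({side}), "
            f"h' = {h_prime:+.2f} cm ({orientation})")

print(summarize(s=30.0, f=10.0, h=2.0))   # object outside the focal length
print(summarize(s=5.0, f=10.0, h=2.0))    # object inside the focal length
```

In the first case both s and s' are positive and l is negative. In the second case s' comes out negative and l positive, so the same two equations cover both situations without any change.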

Example  29-4  shows  you  how  to  use  the  equations  developed  in  this 
section  to  analyze  the  operation  of  a simple  optical  system. 


EXAMPLE 29-4

A box camera has a single thin converging lens of focal length 15.0 cm. You are
using it to photograph a student who is standing 800 cm in front of the camera lens,
as sketched in Fig. 29-19.

a. What must be the distance from the lens to the film if the image of the
student is to be in focus on the film?

b. The student is 180 cm tall. What will be the vertical dimension of the image?

■ a. You use Eq. (29-20),

$$\frac{1}{s} + \frac{1}{s'} = \frac{1}{f}$$

to find the distance s' from the lens to the image plane, given that the distance s
from the object plane to the lens is +800 cm and that the focal length of the lens
has the value f = +15.0 cm. You find

$$\frac{1}{s'} = \frac{1}{f} - \frac{1}{s}$$

or

$$s' = \frac{1}{1/f - 1/s} = \frac{1}{1/(+15.0\ \text{cm}) - 1/(+800\ \text{cm})} = +15.3\ \text{cm}$$

The film should be at the image plane, so this is also the required distance from
the lens to the film. Note that it is 0.3 cm more than the distance from the lens to
the film required to take a properly focused photograph of a far-away object. Can
you explain why? In most cameras the adjustment is made by moving the lens away
from the film.

b. Assume that you aim the camera so that the axis of the lens is directed at the
midpoint of the student, as in the figure. Then the height above the axis of the top
of his head is half his vertical dimension and has the value h = +90 cm. To find h',
the height above the axis of the image of the top of the student's head, you first
use Eq. (29-28) to evaluate the lateral magnification l, obtaining

$$l = -\frac{s'}{s} = -\frac{+15.3\ \text{cm}}{+800\ \text{cm}} = -0.0191$$

Then you compute h' by using Eq. (29-27):

$$h' = lh = -0.0191 \times (+90\ \text{cm}) = -1.72\ \text{cm}$$

The negative sign tells you that the image of the top of the head lies below the
axis. In other words, the image of the student is inverted. Its vertical dimension
is twice the magnitude of h' and has the value

$$2|h'| = 2 \times 1.72\ \text{cm} = 3.44\ \text{cm}$$
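The numbers in Example 29-4 can be reproduced directly. The sketch below is a minimal check with hypothetical variable and function names. It also evaluates how far the lens must be moved out from its far-object setting, using the fact that a very distant object is focused at s' = f.

```python
def image_distance(s, f):
    """Solve 1/s + 1/s' = 1/f for the image distance s'."""
    return 1.0 / (1.0 / f - 1.0 / s)

f = 15.0             # focal length of the camera lens, cm
s = 800.0            # distance from the student to the lens, cm
half_height = 90.0   # height of the top of the head above the axis, cm

film_distance = image_distance(s, f)
extension = film_distance - f              # a far-away object is focused at s' = f
magnification = -film_distance / s
image_size = 2.0 * abs(magnification * half_height)

print(f"lens-to-film distance: {film_distance:.1f} cm")
print(f"extra extension over the far-object setting: {extension:.1f} cm")
print(f"lateral magnification: {magnification:.4f}")
print(f"vertical size of the image: {image_size:.2f} cm")
```

The unrounded image distance is slightly below 15.3 cm, and the image size still rounds to the 3.44 cm found in the example.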



Equation (29-20), written as 1/s' = 1/f - 1/s, implies that in a situation
where the object distance s is smaller than the focal length f, the image
distance s' will have a negative value. This is because 1/s will be larger than
1/f, and so the quantity on the right side of the equation will have a negative
value. Example 29-5 investigates such a situation.


EXAMPLE  29-5 

Figure 29-20 depicts a thin lens of focal length f = +6.67 cm and an object of
height +0.500 cm located 4.00 cm to the left of the lens. Predict the location, size,
and orientation of the image.

■ You employ Eq. (29-20)

$$\frac{1}{s} + \frac{1}{s'} = \frac{1}{f}$$

to determine the location of the image plane when s = +4.00 cm. Calculating 1/s
and 1/f, you have

$$\frac{1}{s'} = \frac{1}{f} - \frac{1}{s} = +0.150\ \text{cm}^{-1} - (+0.250\ \text{cm}^{-1}) = -0.100\ \text{cm}^{-1}$$

and

$$s' = -10.0\ \text{cm}$$

So the image lies in a plane that is 10.0 cm away from the thin lens. According to
the sign convention for s', the direction from the lens to the image plane is
opposite to the general direction in which light travels through the system. Thus
you can predict that the image lies 10.0 cm to the left of the lens.

To predict its size and orientation, you use Eq. (29-28) to evaluate the lateral
magnification

$$l = -\frac{s'}{s} = -\frac{-10.0\ \text{cm}}{+4.00\ \text{cm}} = +2.50$$

Then you use Eq. (29-27) to calculate

$$h' = lh = +2.50 \times (+0.500\ \text{cm}) = +1.25\ \text{cm}$$

The  sign  conventions  for  h and  h'  indicate  that  the  object  has  produced  an  erect 
image  (an  image  that  is  “right  side  up”). 


Figure 29-20 shows how you can make a graphical construction to find the
characteristics of the image, formed by the thin lens of known focal length, of the
object at known object distance. The accuracy may not be as good as that of the
algebraic procedure just used, unless you are very careful, but the graphical
construction can give additional insight into what is happening. Working carefully,
you lay out the location and height of the object, relative to the plane containing
the lens and to the axis. You also mark on the axis a point on the other side of the
plane whose distance from it equals the focal length. Then you construct a ray
beginning at the top of the object parallel to the axis and directed toward the
plane of the lens. From its point of intersection with the plane, you draw a
continuation of the ray that passes through the focal point. Why is this the
correct continuation of the ray? Next you draw a second ray from the top of the
object that passes through the intersection of the plane of the lens and its axis.
This ray continues straight through the lens plane into the region beyond it. Why?

Fig. 29-20 An accurately constructed ray diagram showing a converging lens forming
a virtual image of an object whose distance from the lens is less than its focal
length. The diagram is constructed by assuming the lens to be thin and the rays to
be paraxial. These assumptions play a significant role. Also, it is assumed that a
ray from the tip of the object directed parallel to the axis passes through the
lens. This assumption is not significant, as is explained in the caption to
Fig. 29-18.

In  this  case  you  find  that  the  two  rays  from  the  tip  of  the  object  diverge  in  the 
region  beyond  the  lens.  So  backward  extensions  of  the  parts  of  the  rays  in  the 
region  beyond  the  lens  must  converge  in  the  region  in  front  of  the  lens.  To  find 
their  point  of  intersection,  you  make  the  extensions  with  dashed  lines,  as  in  the  fig- 
ure. Then  a measurement  of  its  distance  from  the  lens  plane  will  give  a result  close 
to  the  magnitude  of  s'  obtained  algebraically.  And  the  measured  distance  from  the 
axis  to  the  intersection  point  will  be  close  to  the  algebraic  prediction  for  the  magni- 
tude of  h' . The  signs  of  s'  and  h'  correspond  to  the  directions  from  the  lens  plane 
and  lens  axis  to  the  intersection  point  in  just  the  way  specified  by  the  sign  conven- 
tions. 
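
The graphical construction can also be checked numerically. The following is a minimal sketch, assuming a thin lens in the plane x = 0, the axis along y = 0, and light traveling toward +x. The function name `trace_image` and the coordinate choices are illustrative assumptions, not part of the text. It intersects the two rays described above (or their backward extensions) and returns s' and h'.

```python
def trace_image(s, h, f):
    """Locate the image of an object tip by intersecting two construction rays.

    Assumed geometry (illustrative): thin lens in the plane x = 0, axis along
    y = 0, light traveling toward +x, object tip at (-s, h) with s > 0 and
    h != 0. Returns (s_prime, h_prime); s_prime > 0 means the image lies
    beyond the lens, s_prime < 0 means it lies in front of the lens.
    """
    if s == f:
        raise ValueError("object at the focal point: the rays emerge parallel")
    # Ray 1: parallel to the axis up to the lens, then along the line through
    # (0, h) and the focal point (f, 0):  y = h - (h / f) * x.
    slope1 = -h / f
    # Ray 2: through the center of the lens, undeviated:  y = -(h / s) * x.
    slope2 = -h / s
    # Solve h + slope1 * x = slope2 * x. A negative x means that only the
    # backward (dashed) extensions intersect, that is, a virtual image.
    x = h / (slope2 - slope1)
    y = slope2 * x
    return x, y


# Object inside the focal length: virtual, erect, magnified image.
print(trace_image(s=10.0, h=1.0, f=20.0))
# Object outside the focal length: real, inverted image.
print(trace_image(s=30.0, h=1.0, f=20.0))
```

Solving the intersection by hand gives 1/x = 1/f − 1/s, so the construction reproduces the thin-lens relation, and y = −h x/s is the image height with its sign.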


To  the  eye  of  an  observer  on  the  right  side  of  the  lens  in  Fig.  29-20,  the 
light  rays  have  every  appearance  of  coming  from  the  image  located  on  the 
left  side  of  the  lens.  The  image  is  called  a virtual  image  because  light  rays 
never  actually  converge  on  the  image — only  their  backward  extensions  do. 
In  contrast,  light  rays  really  do  converge  on  the  real  image  formed  by  the 
lens  in  Fig.  29-18.  If  you  place  a piece  of  paper  in  the  plane  of  the  image  in- 
dicated in  that  figure,  you  will  see  an  image  formed  on  the  paper.  If  you 
place  paper  in  the  image  plane  of  Fig.  29-20,  you  will  see  no  image  on  the 
paper.  Nevertheless,  if  you  look  from  the  right  side  of  the  lens,  the  virtual 
image  will  be  seen  because  your  brain  assumes  that  light  intercepted  by  your 
eye  has  reached  it  by  traveling  along  straight  rays. 

In both Figs. 29-18 and 29-20 the lens has a positive focal length f, and
the object distance s is positive. The reason why a real image is formed in


29-4  Image  Formation  1393 


the  first  figure  and  a virtual  one  in  the  second  has  to  do  with  how  close  the 
object  is  to  the  converging  lens.  Consideration  of  the  figures  will  soon  show 
that  what  counts  is  whether  the  object  distance  is  greater  or  less  than  the 
focal  length.  The  point  is  made  apparent  immediately  by  considering  the 
thin-lens  relation  between  the  image  distance  s'  and  the  other  two  quan- 
tities: 


$$\frac{1}{s'} = \frac{1}{f} - \frac{1}{s} \tag{29-29}$$

With both f and s positive, s' will be positive when 1/f is greater than 1/s or,
in other words, when s is greater than f. When s is less than f, then s' will be
negative.  So  a real  image  is  formed  by  a converging  lens  when  the  object  is 
on  the  side  of  the  lens  from  which  light  is  incident  on  the  lens  and  is  farther 
from  it  than  its  focal  length.  A virtual  image  is  formed  if  the  situation  is  the 
same,  except  that  the  object  is  closer  to  the  lens  than  its  focal  length. 
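
The sign rule can be tried out directly. The sketch below is only an illustration; the function names `image_distance` and `image_type` are assumptions introduced here. It evaluates the thin-lens relation 1/s' = 1/f − 1/s and labels the image by the sign of s'.

```python
import math


def image_distance(s, f):
    """Return s' from 1/s' = 1/f - 1/s (math.inf when s equals f)."""
    inverse = 1.0 / f - 1.0 / s
    return math.inf if inverse == 0 else 1.0 / inverse


def image_type(s, f):
    """Classify the image of a real object (s > 0) by the sign of s'."""
    s_prime = image_distance(s, f)
    if math.isinf(s_prime):
        return "at infinity"
    return "real" if s_prime > 0 else "virtual"


# Converging lens (f = 20 cm): object outside, then inside, the focal length.
print(image_type(30.0, 20.0))   # real
print(image_type(10.0, 20.0))   # virtual
# Diverging lens (f = -20 cm): virtual for any positive object distance.
print(image_type(50.0, -20.0))  # virtual
```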


In  certain  circumstances  an  object  can  be  on  the  side  of  the  lens  opposite  to 
that  from  which  light  is  incident  on  the  lens.  Then  the  object  distance  s is  nega- 
tive, and  it  is  said  to  be  a virtual  object.  This  cannot  happen  if  there  is  only  one 
lens  in  an  optical  system.  But  it  can  happen  in  multiple-lens  systems  if  a lens  in- 
tercepts rays  coming  from  a preceding  lens  before  they  have  converged.  An  ex- 
ample is  found  in  the  zoom-lens  system  considered  in  Sec.  29-7. 


A diverging  lens  has  a negative  focal  length.  If  a system  has  a single 
lens, so that s is necessarily positive, and if the lens is diverging, so that f is
negative,  then  the  right  side  of  Eq.  (29-29)  will  be  negative  under  all  cir- 
cumstances. Thus  s'  will  always  be  negative — the  image  formed  by  a single  di- 
verging lens  is  always  virtual.  Figure  29-21  illustrates  virtual-image  forma- 
tion by  a diverging  lens  with  a ray  diagram,  when  the  object  distance  is 
greater  than  the  magnitude  of  the  focal  length.  In  this  case  the  virtual 
image  appears  to  an  observer  on  the  far  side  of  the  lens  to  be  erect,  reduced 
in  size,  and  closer.  You  should  construct  a diagram  for  a case  when  the  ob- 
ject  lies  inside  the  focal  point  of  the  lens. 

A very  important  application  of  lenses  with  negative  focal  lengths  is 
considered  in  Example  29-6. 


EXAMPLE  29-6 

A person  with  severe  nearsightedness  can  see  objects  clearly  only  if  their  distance, 
measured  from  the  eye,  is  between  15  cm  and  40  cm.  Determine  the  focal  length  of 
lenses  for  eyeglasses  that  will  provide  clear  vision  of  an  object  whose  distance  from 
the  eye  is  very  large,  instead  of  only  40  cm.  Then  find  the  distance  from  the  eye  to 
the  closest  object  that  the  person  can  see  clearly  when  wearing  the  glasses. 

• The  general  idea  of  what  must  be  done  can  be  understood  by  considering  again 
Fig.  29-21.  Imagine  the  eye  of  the  nearsighted  person  to  be  located  immediately  to 


Fig.  29-21  A ray  drawing  showing  the  forma- 
tion of  a virtual  image  by  a diverging  thin  lens. 


1394  Ray  Optics 


the right of the lens shown in the figure, since eyeglass lenses are placed immediately
in front of the eyes. Also imagine that the object located to the left of the lens
in  the  figure  is  actually  at  a very  large  distance  to  the  left  of  the  lens.  You  want  to  de- 
termine the  value  of  the  focal  length  of  the  lens  for  which  the  image  it  forms  of  the 
object  lies  40  cm  to  the  left  of  the  lens.  If  this  is  the  case,  then  the  nearsighted 
person’s  eye  will  receive  light  rays  that  in  every  way  appear  to  come  from  something 
located  40  cm  in  front  of  it.  Since  the  eye  is  able  to  focus  on  something  at  that  dis- 
tance, the  nearsighted  person  will  be  able  to  see  it  clearly. 

To evaluate the required focal length, you use the thin-lens relation

$$\frac{1}{s} + \frac{1}{s'} = \frac{1}{f}$$

In it you set s = ∞ and s' = −40 cm. As in Fig. 29-21, the value of s' must be negative
because the direction from the lens to the image must be opposite to the general
direction that light travels through the lens. This gives

$$0 + \frac{1}{-40\ \text{cm}} = \frac{1}{f}$$

or

$$f = -40\ \text{cm}$$

The negative value of f tells you that diverging lenses must be used in eyeglasses
that  correct  for  nearsightedness. 

Having  determined  the  focal  length  of  the  eyeglass  lenses,  you  can  determine 
the  distance  to  the  closest  object  that  the  person  will  be  able  to  see  clearly  when 
wearing them by setting s' = −15 cm. This gives

$$\frac{1}{s} + \frac{1}{-15\ \text{cm}} = \frac{1}{-40\ \text{cm}}$$

or

$$s = \frac{1}{1/(-40\ \text{cm}) - 1/(-15\ \text{cm})} = 24\ \text{cm}$$
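
The arithmetic of this example can be reproduced with a short script. This is a sketch under the example's own assumptions (thin lens, eyeglass lens immediately in front of the eye); the function and variable names are illustrative.

```python
import math


def focal_length(s, s_prime):
    """Thin-lens relation solved for f: 1/f = 1/s + 1/s'."""
    return 1.0 / (1.0 / s + 1.0 / s_prime)


def object_distance(s_prime, f):
    """Thin-lens relation solved for s: 1/s = 1/f - 1/s'."""
    return 1.0 / (1.0 / f - 1.0 / s_prime)


far_point_cm = 40.0   # farthest distance of clear vision without glasses
near_point_cm = 15.0  # closest distance of clear vision without glasses

# A very distant object (s = infinity) must be imaged 40 cm in front of the lens.
f_cm = focal_length(math.inf, -far_point_cm)
# The closest usable object is the one imaged 15 cm in front of the lens.
closest_cm = object_distance(-near_point_cm, f_cm)

print(f"f = {f_cm:.0f} cm, closest object = {closest_cm:.0f} cm")
```

The script gives f = −40 cm and a closest object distance of about 24 cm. With the glasses on, the nearest point of clear vision moves out from 15 cm to about 24 cm.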

Farsightedness  is  the  condition  in  which  a person  is  not  able  to  see  an  object 
clearly  unless  it  is  at  an  abnormally  large  minimum  distance  from  the  eye.  Why  are 
lenses  with  positive  focal  lengths,  such  as  the  one  in  Fig.  29-3,  used  in  reading 
glasses  for  farsighted  persons? 


29-5 OPTICAL SYSTEMS

In this section we use our understanding of image formation to explain
briefly the operation of the most important optical systems: eyes, magnifying
glasses, telescopes, and microscopes.

A diagram  of  a human  eye  is  shown  in  Fig.  29-22.  Light  rays  enter 
through  a transparent  aperture  called  the  pupil,  behind  a nearly  spheri- 
cal, fibrous  outer  structure  called  the  cornea.  The  muscular  structure  called 
the  iris  regulates  the  size  of  the  pupil  and  thus  controls  the  amount  of 
light  admitted  to  the  eye.  The  light  rays  pass  through  a clear  liquid  known 
as  the  aqueous  humor  and  then  enter  the  crystalline  lens.  This  lens  is  a flexible 
capsule  of  transparent  material,  which  is  attached  by  ligaments  to  the  ciliary 
muscle.  Beyond  the  lens,  the  rays  pass  through  a clear  fluid  called  the  vitre- 
ous humor.  Finally,  they  strike  the  surface  of  the  retina,  on  which  are  located 
light-sensitive  receptors  that  communicate  through  nerves  to  the  brain. 

Fig. 29-22 A human eye, showing the iris, aqueous humor, pupil, cornea, ciliary muscle, and retina.

The aqueous humor and the vitreous humor are mostly water, and so
have indices of refraction quite close to 1.34, the value for water. The index
of refraction of the crystalline lens is about 1.44. Since the index of refraction
of air has the value 1, the light rays are refracted primarily when they
enter  the  curved  surface  of  the  cornea.  The  function  of  the  crystalline  lens 
is  to  produce  a small,  but  adjustable,  amount  of  additional  refraction.  Ad- 
justment is  accomplished  by  the  ciliary  muscle,  which  can  alter  the  shape  of 
the  lens.  When  the  muscle  is  relaxed,  parallel  rays  from  objects  at  infinity 
form  sharp  images  on  the  retina  of  a normal  eye.  To  focus  on  nearby  ob- 
jects, the  muscle  contracts  around  the  periphery  of  the  lens  and  deforms  it 
so  as  to  decrease  the  radius  of  curvature  of  its  surfaces.  This  increases  the 
amount  of  refraction  produced  and  makes  the  diverging  rays  from  the 
nearby  object  converge  on  the  retina.  There  is  a limit  to  the  amount  of  de- 
formation possible.  Hence  there  is  a minimum  distance  in  front  of  the  eye 
at  which  an  object  can  form  a clear  image  on  the  retina.  It  can  be  as  small  as 
10  cm  for  a person  with  normal  eyes  at  age  20.  But  it  increases  with  increas- 
ing age  because  of  loss  of  flexibility  of  the  lens.  A typical  figure  for  the  min- 
imum distance  at  which  an  eye  can  form  a clear  image  of  an  object  is  about 
25  cm. 

It  is  generally  believed  that  an  eye  can  focus  most  easily  on  a very  dis- 
tant object,  since  the  essentially  parallel  rays  from  such  an  object  can  be 
handled  with  a relaxed  ciliary  muscle.  But  recent  research  indicates  that 
there  may  be  less  strain  on  the  entire  eye,  over  prolonged  periods  of  use, 
when  it  focuses  on  a not-so-distant  object.  If  true,  it  could  be  an  adaptation 
to  the  fact  that  the  eye  is  most  frequently  used  in  such  circumstances.  Since 
the great majority of optical instruments have been designed to be employed
with a relaxed eye viewing an object that is very distant or is at infinity,
we assume this to be the case in our treatment of optical instruments.
The  assumption  simplifies  the  treatment  and  makes  little  practical  dif- 
ference in  results  obtained  from  it. 

Figure 29-23a illustrates the operation of a magnifying glass. An object
to be inspected, represented by an arrow of height h, is placed in front of a
converging lens at an object distance s slightly less than the focal length f of
the  lens.  The  figure  shows  two  rays  emanating  from  the  tip  and  directed  in 
the  general  direction  of  the  eye  located  behind  the  lens.  One  ray  passes 




Fig. 29-23 (a) The use of a magnifying glass. The observer adjusts the
position of the lens so that the distance s from the object viewed to the
plane of the lens is slightly less than the focal length f of the lens. The
ray from the tip of the object passing through the focal point of the lens
is inclined at an angle φ' with respect to the axis. The ray passing
through the center of the lens is inclined at almost exactly the same
angle because the two rays are very nearly parallel when they strike the
eye. This figure is an accurate graphical construction if two conditions
are satisfied. First, the lens shown schematically in the figure as having
appreciable thickness must actually be a thin lens located in the plane
indicated by the vertical line at its midpoint. Second, the rays shown for
the sake of clarity at appreciable distances from the axis must actually all
be paraxial rays. The same remarks apply to the remaining figures in
this section. (b) An observer seeking the most detailed view of an object,
without the aid of a magnifying glass, places it about 25 cm in front of
the eye. This is the minimum distance at which a typical eye can produce
a clear image on the retina. In these circumstances the ray from the tip
of the object is inclined at an angle φ with respect to the axis.


through  the  center  of  the  lens  without  bending.  The  other  ray,  initially 
maintaining  a constant  distance  from  the  axis,  passes  through  the  lens 
which  bends  it  so  that  the  ray  goes  through  the  focal  point  of  the  lens.  The 
construction  in  the  figure  indicates  that  the  two  rays  are  almost  parallel  to 
each  other  as  they  enter  the  eye.  Thus  they  appear  to  the  eye  to  be  di- 
verging from  an  object  which  is  a very  great  distance  in  front  of  the  lens. 
The  purpose  is  to  allow  the  observer  to  inspect  the  object  with  a relaxed 
eye. 

The geometrical construction in Fig. 29-23a indicates that the image
distance has a value s' approaching −∞. The same conclusion can be
reached from the thin-lens equation, 1/s' = 1/f − 1/s, by making s slightly
less than f so that 1/f − 1/s is negative and small. Then s' is negative and
large. If s is made equal to f, then s' becomes infinitely large.
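
To make the limit explicit, the thin-lens equation can be solved for s'. The small positive length ε below is a symbol introduced here only for illustration; it measures how far inside the focal point the object sits.

```latex
s' = \frac{s f}{s - f},
\qquad
s = f - \epsilon
\;\Longrightarrow\;
s' = -\frac{f\,(f - \epsilon)}{\epsilon}
\;\longrightarrow\; -\infty
\quad \text{as } \epsilon \to 0^{+}.
```

For instance, with f = 10 cm and ε = 0.1 cm this gives s' = −990 cm, an image almost 10 m in front of the lens.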

Because  the  image  formed  of  the  object  by  the  lens  is  so  far  in  front  of 
the lens, it cannot be shown in the figure. What can be shown is the angle φ'
between the axis and the rays from the tip of the image as these rays enter
the eye. The figure also shows that for both rays the value of this small
angle is approximately

$$\phi' \simeq \frac{h}{f}$$


Figure 29-23b illustrates the object of height h being inspected without
the  aid  of  the  magnifying  glass.  It  is  placed  as  close  to  the  eye  as  possible,  so 
that  a maximum  amount  of  detail  can  be  discerned.  But  it  cannot  be  placed 
closer  than  about  25  cm  since,  as  was  said  in  discussing  the  behavior  of  the 
eye, the minimum distance at which an eye can form a clear image is typically
about 25 cm. The figure shows that a ray entering the eye from the tip
of the image forms a small angle φ with the axis whose value is approximately

$$\phi \simeq \frac{h}{25\ \text{cm}}$$

The  magnifying  glass  makes  the  object  being  inspected  appear  larger 



because the angle φ' is larger than the angle φ, and so a larger image is
formed on the retina of the eye. The angular magnification a of the magnifying
glass is defined to be

$$a = \frac{\phi'}{\phi} \tag{29-30}$$


By  using  the  two  equations  displayed  above,  the  value  of  a for  a magnifying 
glass  is  seen  to  be  approximately 

$$a \simeq \frac{h/f}{h/(25\ \text{cm})}$$

or

$$a \simeq \frac{25\ \text{cm}}{f} \tag{29-31}$$


Since the converging lens has a positive focal length f, the value of a is positive.
This signifies that the image produced is erect.

The smaller the focal length f of the magnifying glass, the larger its
angular magnification a. Unfortunately, this does not mean that arbitrarily
large values of a can be obtained by using arbitrarily small values of f. The
reason is that the lens maker's equation shows that small values of f correspond
to a lens having surfaces with small radii of curvature. But when the
radii of curvature become very small, only the rays extremely close to the
axis of the lens satisfy the paraxial restriction and come to a sharp focus. In
practice, the smallest value of the focal length f that can be used in a magnifying
glass made from a single lens is about f = 7 cm. This produces an
angular magnification of about a = 3.5.
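
Equation (29-31) is simple enough to tabulate. The sketch below assumes the 25-cm minimum viewing distance quoted in the text; the function name and the sample focal lengths are illustrative choices.

```python
NEAR_POINT_CM = 25.0  # typical minimum distance of clear vision, from the text


def magnifier_angular_magnification(f_cm, near_point_cm=NEAR_POINT_CM):
    """Angular magnification a = (25 cm) / f of a simple magnifying glass."""
    if f_cm <= 0:
        raise ValueError("a magnifying glass needs a positive focal length")
    return near_point_cm / f_cm


for f_cm in (25.0, 12.5, 7.0):
    a = magnifier_angular_magnification(f_cm)
    print(f"f = {f_cm:5.1f} cm  ->  a = {a:.2f}")
```

For f = 7 cm the result is 25/7, or about 3.57, consistent with the figure of about 3.5 quoted above.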


Fig. 29-24 Rays from the tip of an object at infinity passing through a refracting astronomical
telescope. The inverted intermediate image is formed at the focal point of the objective
lens, whose focal length is f_o. The eyepiece lens, of focal length f_e, forms from it the inverted
virtual image at infinity, where it is viewed by the eye. Note that the ray which crosses the axis
at distance f_o in front of the objective passes between the lenses parallel to the axis. Why? This
ray crosses the axis again at distance f_e behind the eyepiece at an angle which determines the
angular magnification of the telescope. The eye is shown well to the right of the eyepiece in
order to clarify the role played by the focal point of the eyepiece. But actually the eye could
just as well be immediately to the right of the eyepiece, and usually it is.

The operation of an astronomical refracting telescope is indicated by
the ray diagram in Fig. 29-24. Essentially parallel light rays from an essentially
infinitely distant object enter the first lens, a converging lens of long
focal length f_o called the objective. This lens makes the rays converge to
form a real inverted image (indicated by the small arrow) at its focal point,
which is at a distance f_o beyond the lens. The rays then continue as diverging
rays until they strike the second lens, a converging lens of short
focal length f_e known as the eyepiece. In passing through this lens, the rays
are  selectively  bent  toward  the  axis.  With  the  eyepiece  located  at  a distance 
beyond  the  convergence  point  just  equal  to  the  focal  length  of  this  lens,  its 
convergence  properties  exactly  remove  the  divergence  of  the  rays.  So  the 
rays  are  again  parallel  when  they  leave  the  telescope  to  enter  the  observer’s 
eye.  As  a result,  the  observer  perceives  the  image  produced  by  the  tele- 
scope to  be  at  an  infinite  distance  in  front  of  the  eye  and  so  can  use  the  in- 
strument with  a relaxed  eye. 

Another  way  of  describing  the  telescope  is  to  say  that  the  image 
formed  by  the  objective  acts  as  an  object  for  the  eyepiece.  The  eyepiece  is 
located so that the distance s from the object it is viewing to the lens equals
its focal length f_e. Thus the eyepiece acts essentially like a magnifying glass,
and the distance s' to the image it produces has the value s' = −∞. This
follows from the relation 1/s' = 1/f_e − 1/s when s = f_e. (Note that there is
no meaningful distinction between the values s' = −∞ and s' = +∞, as far
as this relation is concerned. Is this the way it should be?) The rays
emerging  from  the  eyepiece  are  parallel  because  they  act  as  if  they  come 
from  a source  infinitely  far  away. 

The  observer  looking  through  the  eyepiece  sees  an  inverted  image  at 
infinity.  Its  lateral  magnification  is  indeterminate.  But  the  behavior  of  the 
telescope  is  specified  perfectly  well  in  terms  of  an  angular  magnification. 
For  an  astronomical  telescope  the  angular  magnification  a is  defined  as 


$$a = -\frac{\phi'}{\phi}$$


Here φ' is the angle the object appears to subtend at the observer's eye
when viewed through the telescope, φ is the angle it would subtend without
the aid of the telescope, and the minus sign is inserted to indicate that the
image seen by the observer is inverted. We can evaluate these angles by
considering the right triangle in the figure whose apex angle is labeled φ'
and whose opposite side has the same length as the length h of the arrow.
Its base has a length equal to f_e, so since the angles are small, we have

$$\phi' \simeq \frac{h}{f_e}$$


Then consider the right triangle with apex angle labeled φ, whose opposite
side is the arrow and whose base has a length f_o. Here we have

$$\phi \simeq \frac{h}{f_o}$$


Thus 


$$\frac{\phi'}{\phi} = \frac{h/f_e}{h/f_o} = \frac{f_o}{f_e}$$


Hence  the  angular  magnification  of  the  astronomical  telescope  is 


$$a = -\frac{f_o}{f_e} \tag{29-32}$$


The  negative  sign  means  that  the  image  is  inverted. 

The  angular  magnification  of  the  telescope  increases  as  the  focal 
length of the objective becomes longer. But this also makes the telescope itself
longer, causing it to be inconvenient to use (and expensive to build).
For  a given  objective,  the  angular  magnification  can  be  increased  simply  by 
using  an  eyepiece  whose  focal  length  is  shorter  (a  relatively  inexpensive 
thing  to  do). 
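
The trade-off between magnification and length can be seen in a few lines. This is a sketch only: the focal lengths are hypothetical values chosen for illustration, and the lens separation f_o + f_e follows from the arrangement described above, in which the eyepiece sits one eyepiece focal length beyond the focal point of the objective.

```python
def telescope_angular_magnification(f_objective, f_eyepiece):
    """Angular magnification a = -f_o / f_e of an astronomical telescope."""
    return -f_objective / f_eyepiece


def lens_separation(f_objective, f_eyepiece):
    """Distance between the lenses when the final image is at infinity."""
    return f_objective + f_eyepiece


f_o_cm = 100.0  # hypothetical objective focal length
for f_e_cm in (4.0, 2.0, 1.0):  # hypothetical eyepiece focal lengths
    a = telescope_angular_magnification(f_o_cm, f_e_cm)
    length = lens_separation(f_o_cm, f_e_cm)
    print(f"f_e = {f_e_cm:.0f} cm: a = {a:.0f}, lens separation = {length:.0f} cm")
```

Halving the eyepiece focal length doubles the magnitude of a while the lens separation hardly changes. The negative sign records that the image is inverted.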


Fig. 29-25 (a) The diffraction produced by a circular aperture. (b) Two
circular-aperture diffraction patterns which are overlapping just the amount
allowed by Rayleigh's criterion for resolution. (Courtesy Michel Cagnet.)


There  are  other  criteria  that  should  be  satisfied  by  an  astronomical 
telescope.  One  is  that  it  should  allow  the  observer  to  view  faint  stars.  To  do 
so,  the  objective  must  collect  as  much  incident  starlight  as  possible.  Since 
the  light-gathering  ability  is  proportional  to  the  cross-sectional  area  of  the 
lens  perpendicular  to  its  axis,  the  diameter  d of  the  objective  must  be  made 
as  large  as  possible. 
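
Because the light collected scales with the area of the aperture, it scales with the square of the diameter. The sketch below illustrates this; the 100-mm objective and the 5-mm eye pupil are hypothetical values assumed only for the comparison.

```python
def light_gathering_ratio(d_large, d_small):
    """Ratio of light collected by two circular apertures (same length units)."""
    return (d_large / d_small) ** 2


# Hypothetical comparison: a 100-mm objective versus a 5-mm eye pupil.
print(light_gathering_ratio(100.0, 5.0))  # 400.0
```

Under these assumed values, the objective collects 400 times as much starlight as the unaided eye.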

Another  criterion  is  that  the  telescope  should  be  able  to  resolve  two 
closely  spaced  stars.  That  is,  it  should  be  able  to  form  separate  images  of 
two  stars,  even  when  the  angular  separation  between  them  is  very  small. 
According  to  ray  optics,  this  would  not  be  a problem  — at  least  in 
principle — because  the  astronomer  could  always  choose  an  eyepiece  that 
would  give  the  telescope  enough  angular  magnification  to  do  the  job.  But 
in  fact  this  cannot  be  done.  Wave  optics  tells  us  that  light  traveling  through 
the  telescope  follows  straight  rays  only  to  an  angular  accuracy  given  by  Eq. 
(29-1): 


$$\theta = 1.22\,\frac{\lambda}{d}$$

Light of wavelength λ passing through the objective of diameter d is diffracted
through angles in a range of width θ because the lens allows only
part  of  the  total  incident  wave  fronts  into  the  telescope.  The  objective  acts  like 
a hole  in  a barrier  and  produces  a diffraction  pattern. 

The angular width θ of the diffraction pattern, shown in Fig. 29-25a, is
measured  from  the  center  of  the  bright  spot  to  the  darkest  part  (essentially 
the  middle)  of  the  first  dark  ring.  Unless  the  angular  separation  of  light 
rays  entering  a telescope  exceeds  this  angle  by  an  appreciable  amount,  the 
rays  will  not  form  well-separated  images.  For  instance,  if  two  stars  happen 
to  be  separated  by  just  the  angle  given  in  Eq.  (29-1),  the  images  they  will 
form have the appearance shown in Fig. 29-25b. The central spot in the
diffraction  pattern  of  each  image  falls  on  the  first  dark  ring  in  the  diffrac- 
tion pattern  of  the  other  image.  (Saying  this  is  essentially  equivalent  to 
saying  that  half-maximum  intensity  points  on  the  two  central  spots  coin- 
cide, so  that  the  two  spots  are  separated  by  an  amount  equal  to  the  sum  of 
their  half-widths  at  half-maximum  intensities.)  If  it  is  considered  to  be  just 
possible to discern from the observed pattern that it actually represents two
star images, Eq. (29-1) can be taken to specify the minimum angular separation
that a telescope will resolve. This is called Rayleigh's criterion for
resolution. Since we have no control over the wavelength λ of starlight, the
only way to improve the resolution of a telescope by making θ smaller is to
increase  the  diameter  d of  its  objective.  Thus  there  is  a second  important 
reason  why  high-grade  refracting  astronomical  telescopes  must  use  objec- 
tive lenses  of  large  diameter  and  cost. 


Fig. 29-26 Prism binoculars.

The fact that the image produced by astronomical telescopes is inverted
presents no real difficulty. But in terrestrial telescopes, such as binoculars,
it would be intolerable. In the most common type of binoculars, an
erect image is produced by including in each half a pair of prisms. A typical
design is shown in Fig. 29-26. Light rays experience four successive total internal
reflections from the slanted sides of the prisms. This has the effect of
reinverting  the  inverted  image  produced  by  the  objective  and  eyepiece,  so 
that  the  final  image  is  erect.  The  prisms  also  bend  the  light  rays  back  on 
themselves  in  such  a way  as  to  make  the  instrument  more  compact. 

Note  the  complicated  structure  of  the  lenses  that  are  actually  used  in  a 
typical  optical  system.  The  objective  “lens”  in  the  binoculars  of  Fig.  29-26 
really  consists  of  a pair  of  lenses,  one  converging  and  the  other  slightly  di- 
verging. The  eyepiece  “lens”  consists  of  three  converging-diverging  pairs. 

The  two  components  of  each  pair  are  made  of  different  materials,  for 
example,  flint  glass  and  crown  glass.  The  purpose  is  to  make  the  pair  act  as 
a single  lens  whose  focal  length  has  approximately  the  same  value  for  light 
of  different  colors.  Since  the  index  of  refraction  of  any  particular  type  of 
glass  depends  on  the  wavelength  of  the  light  passing  through  it,  the  focal 
length  of  a lens  made  from  it  depends  on  the  wavelength  too.  The  effect  is 
known  as  chromatic  aberration.  However,  the  dispersion — the  rate  of 
variation  of  index  of  refraction  with  wavelength — is  not  the  same  in  flint 
glass  and  crown  glass.  So  by  cleverly  designing  a converging-diverging  pair 
made  from  the  two  types  of  glass,  it  is  possible  to  obtain  a near  cancellation 
of  the  change  in  focal  length  of  the  pair  that  is  due  to  index-of-refraction 
variations  in  its  components. 

To  achieve  sufficient  angular  magnification,  eyepieces  in  binoculars 
must  have  quite  short  focal  lengths.  In  the  design  illustrated  in  Fig.  29-26, 
this  is  accomplished  by  making  the  eyepiece  from  three  converging- 
diverging  pairs.  Two  of  these  pairs  are  strongly  converging  in  net  effect, 
and  the  third  is  weakly  diverging.  Dividing  the  task  in  this  way  allows  the 
optical  engineer  to  correct  partially  for  several  types  of  distortions  that  are 
called  the  monochromatic  aberrations.  These  distortions  result  from  the 
fact  that  the  individual  lenses  in  the  eyepiece  do  not  have  perfect  focal 
properties  for  rays  which  are  not  paraxial  (as  is  true  for  any  lens  with 
spherical  surfaces),  and  from  the  fact  that  the  rays  going  through  their 
outer  regions  are  not  paraxial.  Note  finally  that  the  engineer  certainly 
cannot  use  the  thin-lens  relations.  In  Sec.  29-6  you  will  study  the  more  ver- 
satile relations  that  are  often  used. 


Astronomical  telescopes  with  the  best  light-gathering  ability  and  reso- 
lution— that  is,  with  the  largest-diameter  light  collectors — use  mirrors  in- 
stead of  objective  lenses.  It  is  much  less  costly  to  produce  a mirror  whose 




Fig.  29-27  A common  type  of  reflecting 
telescope,  called  a newtonian  telescope 
after  its  inventor. 


diameter  perpendicular  to  the  symmetry  axis  has  a certain  large  value  than 
it  is  to  make  a lens  with  the  same  diameter.  Yet  the  two  devices  will  have  the 
same  light-gathering  ability  and  resolution.  One  type  of  reflecting  telescope 
is  shown  in  Fig.  29-27.  Its  principal  feature  is  a large  parabolic  mirror. 

In  Sec.  28-2  we  proved  that  plane  wave  fronts  of  light  traveling  in  a 
direction  along  the  symmetry  axis  of  a parabolic  mirror  are  reflected  by  the 
mirror  into  spherical  wave  fronts  that  converge  to  the  focus  of  the  parabo- 
la. But  a ray  passing  through  any  point  on  a wave  front  is  normal  to 
the  tangent  plane  at  that  point.  The  relation  between  rays  and  wave  fronts 
makes  it  easy  to  see  that  the  proof  in  Sec.  28-2  is  also  a proof  that  when  light 
rays  directed  parallel  to  the  symmetry  axis  of  a parabolic  mirror  are  re- 
flected by  it,  then  they  converge  to  the  focus  of  the  parabola.  This  behavior 
is  indicated  in  Fig.  29-27.  But  before  the  reflected  rays  reach  the  focus, 
they  are  directed  to  the  side  of  the  telescope  by  a small  plane  mirror 
mounted  diagonally  to  the  symmetry  axis.  A real  image  is  formed.  It  is 
viewed  by  an  eyepiece  lens,  whose  focal  length  can  be  changed  to  vary  the 
magnification  of  the  telescope. 

The Hale reflecting telescope on Mt. Palomar (the world's largest) has a diameter
d = 5.08 m. For light of wavelength λ = 500 nm = 5 × 10⁻⁷ m, in principle
it can resolve two stars whose angular separation is θ = 1.22λ/d =
1.22 × 5 × 10⁻⁷ m/(5.08 m) = 1.2 × 10⁻⁷ rad. But even on a calm night, atmospheric
turbulence "blurs" the incoming rays considerably, so the telescope does
not reach this limit in practice.

Radio telescopes operate in the same general way as reflecting optical telescopes
and use paraboloidal metallic reflecting surfaces to collect and focus
electromagnetic radiation in the radio range of wavelengths. The Arecibo radio
telescope (the largest in the world) has a diameter d = 305 m. When detecting
microwaves of wavelength λ = 21 cm, its limit of resolution is θ =
1.22 × 0.21 m/(305 m) = 8.4 × 10⁻⁴ rad, much poorer than that of a high-class
optical reflecting telescope. Radio telescopes must be of the reflecting type
because there are no transparent lenses for radio waves.
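
The two resolution limits just quoted can be reproduced from Eq. (29-1). The following is a minimal sketch using the diameters and wavelengths given in the text; the function name `rayleigh_limit` is an illustrative choice.

```python
def rayleigh_limit(wavelength_m, diameter_m):
    """Minimum resolvable angle in radians: theta = 1.22 * lambda / d."""
    return 1.22 * wavelength_m / diameter_m


hale = rayleigh_limit(500e-9, 5.08)    # visible light, 5.08-m mirror
arecibo = rayleigh_limit(0.21, 305.0)  # 21-cm microwaves, 305-m dish

print(f"Hale:    {hale:.1e} rad")     # 1.2e-07 rad
print(f"Arecibo: {arecibo:.1e} rad")  # 8.4e-04 rad
print(f"ratio:   {arecibo / hale:.0f}")
```

Although the radio dish is 60 times wider, its wavelength is 420,000 times longer, so its diffraction limit is roughly 7000 times coarser.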

A reflecting  optical  telescope  has  additional  advantages  over  a refracting  one. 
The  focus  of  a parabolic  mirror  is  perfect  (to  the  extent  that  the  mirror  has  been 
perfectly  shaped)  for  an  object  on  the  axis  of  the  parabola.  A mirror  has  no  chro- 
matic aberration  at  all.  Also,  a mirror  can  be  supported  all  over  its  back  side,  so 
that  it  does  not  distort  so  badly  under  its  own  weight.  And  a mirror  need  be 
flawless  only  on  its  front  side,  while  a lens  must  be  flawless  throughout. 


The last optical system that we discuss is the microscope. A ray diagram is shown in Fig. 29-28. A microscope is reminiscent of an astronomical telescope in that the objective produces a magnified real image of the specimen being viewed, with that image acting as an object for the eyepiece. And as in a telescope, the eyepiece further magnifies the specimen and produces a virtual image. Typically, the virtual image is at infinity, so that the person using the microscope can inspect the image with a relaxed


1402  Ray  Optics 


Fig. 29-28 The inverted intermediate image produced by the objective is formed by the eyepiece into an inverted virtual image at infinity. Just as in Fig. 29-24, the ray which crosses the axis in front of the objective at a distance equal to its focal length $f_o$ passes parallel to the axis between the lenses and then crosses the axis again at a distance behind the eyepiece equal to the focal length $f_e$ of the eyepiece. The intersection with this ray of the ray passing through the center of the objective determines the location and size of the intermediate image. The eye is shown farther behind the eyepiece than where the eye would be placed in practice, so that the role played by the focal point of the eyepiece can be made clear. Another liberty that has been taken for the sake of clarity is to make the focal lengths of both lenses considerably longer, relative to the distance $x'$ from the focal point of the objective to the plane containing the intermediate image, than is the case in a practical microscope. The standard value of $x'$ in most microscopes is 16 cm. The focal length $f_o$ of the objective lens, in particular, usually is much shorter than this. For instance, an objective producing a high magnification has a typical focal length of 0.2 cm. The unrealistic aspect of the figure has the effect of making the distance from the objective to the intermediate image not appear to be as large, relative to the distance from the object to the objective, as is the case in a practical microscope.


eye. But in contrast to a telescope, the specimen being viewed by the objective is placed very close to it so as to get the most detailed view.

The figure depicts the specimen as an arrow of height $h$. To evaluate the height $h'$ of the magnified image produced by the objective, consider the two right triangles in the figure with apex angles labeled $\phi$. For one triangle the length of the side opposite the apex angle is the length of the arrow labeled $h$, and the length of the side adjacent to the apex angle is the focal length $f_o$ of the objective. For the other triangle the length of the side opposite the apex angle is the length of the arrow labeled $h'$, and the length of the side adjacent to the apex angle is the distance $x'$ from the focal point of the objective to the arrow labeled $h'$. Since both triangles are right triangles and have the same apex angle, they are similar triangles. Hence


$$\frac{h}{f_o} = -\frac{h'}{x'} \tag{29-33}$$


The minus sign compensates for the fact that, according to the sign convention, the height $h'$ is counted as negative. Solving Eq. (29-33) for $h'/h$, we obtain

$$\frac{h'}{h} = -\frac{x'}{f_o}$$

This is the lateral magnification $l$ produced by the objective. Thus

$$l = -\frac{x'}{f_o} \tag{29-34}$$

The eyepiece views the magnified image of the specimen. It is located in front of the eyepiece at a distance equal to its focal length $f_e$, so the eyepiece is acting like a magnifying glass. The eyepiece produces an angular


29-5  Optical  Systems  1403 


magnification $a$ with an approximate value given by Eq. (29-31),

$$a = \frac{25\ \text{cm}}{f_e}$$


The  overall  magnification  m of  the  microscope  is  defined  to  be  the 
product  of  the  lateral  magnification  produced  by  the  objective  and  the 
angular  magnification  produced  by  the  eyepiece.  That  is, 

$$m = l\,a$$


Using  Eqs.  (29-31)  and  (29-34),  we  have 


$$m = -\frac{x'}{f_o}\,\frac{25\ \text{cm}}{f_e} \tag{29-35}$$


Since all the quantities in this expression have positive values, the value of $m$ is negative. This sign signifies that the final image is inverted, as can be seen in the figure. Most microscopes are built with the distance $x'$ having the value 16 cm. If the instrument has this standard value of $x'$, its overall magnification is


$$m = -\frac{16\ \text{cm}}{f_o}\,\frac{25\ \text{cm}}{f_e} = -\frac{400\ \text{cm}^2}{f_o f_e}$$
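As a quick numerical illustration of Eq. (29-35), the sketch below evaluates the overall magnification in Python. The objective focal length of 0.2 cm is the typical high-magnification value quoted in the caption of Fig. 29-28; the eyepiece focal length of 2.5 cm and the function name are assumptions chosen only for the example.

```python
def microscope_magnification(f_objective_cm, f_eyepiece_cm,
                             x_prime_cm=16.0, near_point_cm=25.0):
    """Overall magnification m = -(x'/f_o) * (25 cm / f_e), Eq. (29-35)."""
    lateral = -x_prime_cm / f_objective_cm   # Eq. (29-34), objective
    angular = near_point_cm / f_eyepiece_cm  # Eq. (29-31), eyepiece
    return lateral * angular

# f_o = 0.2 cm is from the text; f_e = 2.5 cm is an assumed eyepiece.
m = microscope_magnification(0.2, 2.5)
print(f"m = {m:.0f}")  # negative sign: the final image is inverted
```

With the standard $x' = 16$ cm the same result follows from $-400\ \text{cm}^2/(f_o f_e)$.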


The overall magnification $m$ of a microscope can be made large in magnitude by using objective and eyepiece lenses of short focal lengths. But diffraction of the light entering the objective sets a limit to the useful magnification of a microscope. The expression for the resolution is different from the one applying to a telescope, because the object being viewed is very close to the objective of a microscope. If the angle subtended by the lens at the object is $\alpha$, then with light of wavelength $\lambda$ the size $\delta$ of the finest detail that can be resolved is given by the relation


$$\delta = \frac{1.22\,\lambda}{2\sin(\alpha/2)} \tag{29-36}$$


See Fig. 29-29 and the derivation given in its caption. Since the value of $2\sin(\alpha/2)$ is approximately 1 for a typical microscope, it is apparent that the size $\delta$ of the smallest structure that can be seen clearly by a microscope is about equal to the wavelength $\lambda$ of the light used to look at it. This means about $5 \times 10^{-7}$ m for visible light. There is no use in trying to do better by increasing the magnification of the instrument. At the point where the magnification becomes high enough to allow you to begin seeing detail finer than that specified by Eq. (29-36), you will find that the image you are examining begins to deteriorate into a set of overlapping diffraction patterns. That is, if you try to increase the magnification further, you will merely see larger blurs.
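The resolution limit can be evaluated numerically as a check on the estimate above. The sketch assumes the relation $2\delta\sin(\alpha/2) = 1.22\lambda$ derived in the caption of Fig. 29-29; the function name and the choice $\alpha = 60°$ (which makes $2\sin(\alpha/2) = 1$) are illustrative assumptions.

```python
import math

def smallest_detail(wavelength_m, alpha_rad):
    """Finest resolvable detail, delta = 1.22 * lambda / (2 * sin(alpha / 2))."""
    return 1.22 * wavelength_m / (2.0 * math.sin(alpha_rad / 2.0))

# Visible light, with the lens subtending 60 degrees at the object.
delta = smallest_detail(5e-7, math.radians(60.0))
print(f"delta = {delta:.2e} m")  # comparable to the wavelength itself
```

Widening the angle $\alpha$ helps only modestly, since $2\sin(\alpha/2)$ can never exceed 2.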


29-6 THE MATRIX METHOD

An optical system comprising more than two thin lenses can be analyzed by the same methods we applied in the figures of Sec. 29-5 to systems containing two thin lenses. The procedure is to treat the image formed by the first lens as the object for the second, the image formed by the second lens as the object for the third, and so on through the system. But when this is done algebraically, it can become quite complicated for systems containing a number of thin lenses. When it is done graphically, it can also be very involved and, furthermore, inaccurate to a degree that makes it completely unsatisfactory for the practical design of optical systems.






Fig. 29-29 A diagram used to derive Eq. (29-36). In a practical microscope the objective is a combination of several thick lenses designed particularly to minimize the monochromatic aberrations resulting from the use of highly nonparaxial rays. But here a single thin lens represents the objective, and the rays shown represent the most nonparaxial rays passing through the objective. The points $O_1$ and $O_2$ are in an object, with $O_1$ drawn on the axis for convenience. These points are separated by a distance $\delta$ which is at the microscope's limit of resolution. That is, their images $I_1$ and $I_2$ are located so that the wave diffracted from $O_2$ by the objective has zero intensity at $I_1$. The condition for this to occur is for the path length $O_2AI_1$ to differ from the path length $O_2BI_1$ by $1.22\lambda$, where $\lambda$ is the wavelength of the light used. [The factor 1.22 results from the three-dimensional geometry in diffraction by a hole. It is the same factor that appears in Eq. (29-1).] This figure is somewhat more realistic than Fig. 29-28 with regard to showing that the distance from the lens to the plane of the images is appreciably larger than the distance from the plane of the objects to the lens. But considerations of clarity still prevent the figure from showing that the image distance in a practical microscope is very much larger than the object distance. As a consequence of this fact, almost all the path-length difference arises from the difference in the path lengths $O_2A$ and $O_2B$. The insert shows that $O_2A$ is shorter than $O_1A$ by the length $\delta\sin(\alpha/2)$, where $\alpha$ is the angle subtended by the lens at the object. Also, it shows that $O_2B$ is longer than $O_1B$ by the same amount. Since $O_1A = O_1B$, the path-length difference is $O_2B - O_2A = 2\delta\sin(\alpha/2) = O_2BI_1 - O_2AI_1$. Equating this to $1.22\lambda$ yields Eq. (29-36).


If a system contains one or more lenses of thickness that is not negligible compared to other significant dimensions, then it is not valid to apply either of the methods, algebraic or graphical, that we have developed. Both assume that the thin-lens restriction is satisfied. But lenses in real optical systems are usually thick.

Fortunately, there is a simple, methodical way of determining how light rays pass through a system containing any number of thick or thin lenses. The method employs the mathematical tool called matrix multiplication. In this section we develop the method. We apply it to examples in Sec. 29-7. In this work we do not assume that you have prior acquaintance with matrices. It should be said that there are many other important applications of matrix multiplication in science and engineering, some being very similar to the one we study here.


Optical systems contain regions of different indices of refraction. The boundaries between them are parts of spherical surfaces whose centers are located on a common axis. Figure 29-30 shows light traveling to the right along a ray that is bent by refraction as it crosses the boundary from a region where the index of refraction is $n$ to a region where it is $n'$. The situation is the same as that illustrated in the middle of Fig. 29-10, except that here we allow the index of refraction to have values different from 1 on both sides of the boundary. Figure 29-30 shows immediately that


$$\theta = \alpha + \phi \tag{29-37a}$$

and

$$\theta' = \alpha + \phi' \tag{29-37b}$$

Fig. 29-30 The refraction of a light ray as it passes through a spherical surface separating regions of different indices of refraction.


29-6  The  Matrix  Method  1405 


The angles are defined in the figure in terms of the axis, the line parallel to the axis passing through the point where the ray intersects the surface, and the line which is the radius of the surface through that point. The angles of incidence and refraction, $\theta$ and $\theta'$, are also related by Snell's law

$$n\sin\theta = n'\sin\theta'$$

Although it is not a necessary limitation of the matrix method, for simplicity we continue to restrict our attention to paraxial rays only. In other words, we treat only rays which are always near the axis, so that all angles are small. Then we have $\sin\theta = \theta$, $\sin\theta' = \theta'$, and Snell's law takes the form

$$n\theta = n'\theta'$$

Applying Eqs. (29-37a) and (29-37b), we obtain

$$n(\alpha + \phi) = n'(\alpha + \phi') \tag{29-38}$$

The small angle $\alpha$ has the value

$$\alpha = \frac{y}{\rho}$$

with $y$ being the height above the axis of the point where the ray intersects the surface and $\rho$ being the surface's radius of curvature. We can also write this as

$$\alpha = \frac{y}{r} \tag{29-39}$$


where $r = +\rho$. That is, we use the previous convention in which the quantity $r$ describing a surface has a value whose magnitude equals the radius of curvature and whose sign is positive if (as in the case illustrated) the direction from the surface to the center of curvature is the same as the general direction of the ray. Next we substitute Eq. (29-39) into Eq. (29-38) to obtain

$$n\left(\frac{y}{r} + \phi\right) = n'\left(\frac{y}{r} + \phi'\right)$$

or

$$(n - n')\frac{y}{r} + n\phi = n'\phi'$$

or

$$\left(\frac{n}{n'} - 1\right)\frac{1}{r}\,y + \frac{n}{n'}\,\phi = \phi' \tag{29-40}$$


Finally, we write

$$y = y' \tag{29-41}$$

in which $y'$ is also the height above the axis of the point where the ray intersects the surface. Here we treat both the heights $y$ and $y'$ and the angles $\phi$ and $\phi'$ as signed scalars. The value of $y$ or $y'$ is positive if the ray lies above the axis, and the value of $\phi$ or $\phi'$ is positive if the slope of the ray is positive. (In the case illustrated all these quantities have positive values.) Calculations very similar to the one we have carried out for the situation illustrated in Fig. 29-30 show that Eqs. (29-40) and (29-41) are valid for any situation, providing the sign conventions for $r$, $y$, $y'$, $\phi$, and $\phi'$ are followed.


We have written Eq. (29-41), even though it seems to convey no new information, because it and Eq. (29-40) together constitute transformation equations which convert the pair of numbers $y, \phi$ into the pair of numbers $y', \phi'$. The unprimed pair describes the ray incident on the surface because it specifies the ray's height $y$ above the axis at the point of incidence and also the ray's slope $\phi$ at that point. Similarly, the pair of numbers $y', \phi'$ describes the ray emerging from the surface. Thus the two transformation equations tell what the surface does to the light ray passing through it.


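The pairs of numbers that we have expressed in the familiar notation $y, \phi$ and $y', \phi'$ can also be expressed in matrix notation as

$$\begin{bmatrix} y \\ \phi \end{bmatrix} \qquad \text{and} \qquad \begin{bmatrix} y' \\ \phi' \end{bmatrix}$$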


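In this notation the two equations that transform the unprimed pair into the primed pair, Eqs. (29-40) and (29-41), can be written as one matrix equation:

$$\begin{bmatrix} 1 & 0 \\ \left(\dfrac{n}{n'} - 1\right)\dfrac{1}{r} & \dfrac{n}{n'} \end{bmatrix} \begin{bmatrix} y \\ \phi \end{bmatrix} = \begin{bmatrix} y' \\ \phi' \end{bmatrix} \tag{29-42}$$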


The first quantity in brackets on the left side of the equals sign is a set of four separate terms arranged in two rows and two columns of what is called a 2 by 2 matrix. Each of these terms is known as an element of the matrix. The second quantity in brackets on the left side of the equation and the one on the right side are called 2 by 1 matrices because they each have two elements arranged in two rows and one column. The equation says that the matrix

$$\begin{bmatrix} 1 & 0 \\ \left(\dfrac{n}{n'} - 1\right)\dfrac{1}{r} & \dfrac{n}{n'} \end{bmatrix}$$

multiplied into the matrix

$$\begin{bmatrix} y \\ \phi \end{bmatrix}$$

yields the matrix

$$\begin{bmatrix} y' \\ \phi' \end{bmatrix}$$

just as if it were a standard algebraic equation involving scalars. But what does the word "multiplied" mean in this context?

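The matrix multiplication in Eq. (29-42) is an operation that is defined by the following matrix equation:

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} e \\ f \end{bmatrix} = \begin{bmatrix} ae + bf \\ ce + df \end{bmatrix} \tag{29-43}$$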


This constitutes a rule that tells us everything there is to know about what it means to multiply a 2 by 2 matrix into a 2 by 1 matrix. The rule specifies what kind of a matrix will be produced by the process and what values the elements of this matrix will have. In particular, the rule states that the matrix produced by the multiplication will be a 2 by 1 matrix whose element in the first row of its single column has the value $ae + bf$ and whose element in the second row has the value $ce + df$, where $a$, $b$, $c$, $d$, $e$, and $f$ are the indicated elements of the matrices that are being multiplied. Example 29-7 illustrates the use of the rule.


EXAMPLE  29-7 

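Multiply the matrix

$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$$

into the matrix

$$\begin{bmatrix} 5 \\ 6 \end{bmatrix}$$

■ First, you must realize that you are asked to evaluate

$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 5 \\ 6 \end{bmatrix}$$

and not

$$\begin{bmatrix} 5 \\ 6 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$$

The latter is undefined.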

An easy way to remember the rule for the multiplication that has been defined is indicated schematically below:

[Schematic: the first row $a\ b$ of the 2 by 2 matrix is swung alongside the column $e,\ f$ of the 2 by 1 matrix; adding the products of adjacent elements gives $ae + bf$.]

That is, you imagine taking the elements of the first row of the 2 by 2 matrix and carrying them out with a 90° clockwise twist so that they are adjacent to the elements of the 2 by 1 matrix. Next you multiply adjacent elements, then add the results, and enter the value obtained in the first row of the product matrix. Finally, you repeat the process, using the elements in the second row of the 2 by 2 matrix, to obtain the element in the second row of the product matrix:

[Schematic: the second row $c\ d$ of the 2 by 2 matrix is swung alongside the column $e,\ f$; adding the products of adjacent elements gives $ce + df$.]

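Applying this to the case at hand, you have

$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 5 \\ 6 \end{bmatrix} = \begin{bmatrix} 1 \times 5 + 2 \times 6 \\ 3 \times 5 + 4 \times 6 \end{bmatrix} = \begin{bmatrix} 17 \\ 39 \end{bmatrix}$$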
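The rule of Eq. (29-43) translates directly into a few lines of Python. The sketch below is only an illustration; the function name `mat_times_column` and the nested-list representation of a matrix are assumptions made for the example.

```python
def mat_times_column(matrix, column):
    """Multiply a 2 by 2 matrix into a 2 by 1 matrix, following Eq. (29-43)."""
    (a, b), (c, d) = matrix   # rows of the 2 by 2 matrix
    e, f = column             # elements of the 2 by 1 matrix
    return [a * e + b * f,    # first row of the product
            c * e + d * f]    # second row of the product

product = mat_times_column([[1, 2], [3, 4]], [5, 6])
print(product)  # [17, 39], as found in Example 29-7
```

Note that the function accepts the factors only in the order (2 by 2, 2 by 1), mirroring the remark that the reversed product is undefined.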

Let us use the matrix multiplication rule to show that Eq. (29-42) actually is equivalent to both Eqs. (29-40) and (29-41). We apply the rule, Eq. (29-43), to Eq. (29-42) by setting $a = 1$, $b = 0$, $c = (n/n' - 1)(1/r)$, $d = n/n'$, $e = y$, and $f = \phi$. Then we evaluate

$$ae + bf = 1y + 0\phi$$

and

$$ce + df = \left(\frac{n}{n'} - 1\right)\frac{1}{r}\,y + \frac{n}{n'}\,\phi$$

Next we set $ae + bf = y'$ and $ce + df = \phi'$, obtaining

$$y = y'$$

and

$$\left(\frac{n}{n'} - 1\right)\frac{1}{r}\,y + \frac{n}{n'}\,\phi = \phi'$$

in agreement with Eqs. (29-40) and (29-41).

Fig. 29-31 The transmission of a light ray between two planes separated by a region of constant index of refraction.


In an optical system a light ray is bent by refraction at each surface where there is a change in index of refraction. Light is transmitted along straight rays between these surfaces. We have just seen how to use matrices to describe the refraction at a surface. Let us see how to use matrices to describe the transmission between surfaces. Figure 29-31 indicates two planes normal to the axis of the system and separated by a distance $t$. Also shown is a light ray which leaves the first plane at height $y$ above the axis and arrives at the next plane at height $y'$ above the axis. The angle it makes with a line parallel to the axis is $\phi$ when it leaves the first plane and $\phi'$ when it arrives at the next one. And

$$\phi = \phi' \tag{29-44}$$

because the ray extends along a straight line with constant slope. The relation between $y$ and $y'$ is

$$y + t\tan\phi = y'$$

But since we restrict ourselves to paraxial rays, the angle $\phi$ is small and we can write

$$y + t\phi = y' \tag{29-45}$$

The sign conventions allow Eqs. (29-44) and (29-45) to apply to any case, not just the one shown in Fig. 29-31.

Each plane in the figure represents a surface separating regions of different index of refraction. It is true that these surfaces are actually spherical, not planar. However, a paraxial ray is always near the axis when traversing the system. So it always passes through each surface in a small region near the axis. This small region of the spherical surface is difficult to distinguish from a plane that is tangent to it at the point, called its vertex, where it intersects the axis. In fact, for the purpose of determining what happens to the height of the ray above the axis when the ray passes between one refracting surface and the next, the paraxial restriction allows us to treat each of these surfaces as if it were a plane normal to the axis and passing through its vertex. The planes are called vertex planes. In other words, we can use Eqs. (29-44) and (29-45) to describe the transmission of light along a ray between consecutive refracting surfaces if we take $t$ to be the distance between the vertex planes that represent the locations of those surfaces.

Again, the two equations can be thought of as transforming the unprimed pair of numbers $y, \phi$ into the primed pair $y', \phi'$. In matrix notation, they can be written as the single equation

$$\begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y \\ \phi \end{bmatrix} = \begin{bmatrix} y' \\ \phi' \end{bmatrix} \tag{29-46}$$

You  should  verify  that  this  is  so  by  carrying  out  the  matrix  multiplication 
and  comparing  the  results  with  Eqs.  (29-44)  and  (29-45). 


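In a symbolic version of matrix notation, Eq. (29-46) is written

$$\mathbf{T}\mathbf{C} = \mathbf{C}' \tag{29-47}$$

Here $\mathbf{T}$ stands for what we call the transmission matrix

$$\mathbf{T} \equiv \begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix} \tag{29-48}$$

(Some call it the translation matrix.) The matrices $\mathbf{C}$ and $\mathbf{C}'$ represent the light ray matrices

$$\mathbf{C} \equiv \begin{bmatrix} y \\ \phi \end{bmatrix} \qquad \text{and} \qquad \mathbf{C}' \equiv \begin{bmatrix} y' \\ \phi' \end{bmatrix} \tag{29-49}$$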


The symbols $\mathbf{C}$ and $\mathbf{C}'$ are used to remind you that both are column matrices, that is, 2 by 1 matrices, in contrast to all the others we deal with, which are 2 by 2 matrices. Using the same symbolic notation, we can write Eq. (29-42) as

$$\mathbf{R}\mathbf{C} = \mathbf{C}' \tag{29-50}$$

where $\mathbf{R}$ is the refraction matrix

$$\mathbf{R} = \begin{bmatrix} 1 & 0 \\ \left(\dfrac{n}{n'} - 1\right)\dfrac{1}{r} & \dfrac{n}{n'} \end{bmatrix} \tag{29-51}$$


We  can  trace  the  path  of  any  paraxial  light  ray  through  any  optical  system 
by  carrying  out  a sequence  of  matrix  multiplications  involving  transmission 
and  refraction  matrices. 
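The two building blocks, Eqs. (29-48) and (29-51), can be written as small Python helpers and applied to a ray. This is a sketch under stated assumptions: the function names, the nested-list matrix representation, and all numerical values (a ray 1 cm above the axis with slope 0.02 rad, a 50 cm gap, an air-to-glass surface with $n' = 1.5$ and $r = +20$ cm) are illustrative choices, not values from the text.

```python
def transmission(t):
    """Transmission matrix T of Eq. (29-48) for a distance t between planes."""
    return [[1.0, t], [0.0, 1.0]]

def refraction(n, n_prime, r):
    """Refraction matrix R of Eq. (29-51) for a surface with signed radius r."""
    return [[1.0, 0.0], [(n / n_prime - 1.0) / r, n / n_prime]]

def apply(matrix, ray):
    """Multiply a 2 by 2 matrix into the column matrix [y, phi], Eq. (29-43)."""
    (a, b), (c, d) = matrix
    y, phi = ray
    return [a * y + b * phi, c * y + d * phi]

ray = [0.010, 0.020]  # y in meters, phi in radians (assumed values)

after_gap = apply(transmission(0.50), ray)              # y' = y + t*phi, phi' = phi
after_surface = apply(refraction(1.0, 1.5, 0.20), ray)  # y' = y, slope changes

print(after_gap)
print(after_surface)
```

Transmission changes only the height and refraction changes only the slope, which is exactly the content of Eqs. (29-44), (29-45) and Eqs. (29-40), (29-41).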


For instance, consider the thick lens shown in Fig. 29-32. A paraxial ray is directed to the right from a point in the object plane to the first lens surface. It is refracted and then continues to the second lens surface. The ray is refracted again and then continues through a point on the image plane. The curvatures of the first and second surfaces are specified by the quantities $r_1 = +\rho_1$ and $r_2 = -\rho_2$, where $\rho_1$ and $\rho_2$ are their radii of curvature. The quantity $r_2$ is negative because the second lens surface is concave in the direction opposite to the direction of the light ray. Immediately to the left of the first lens surface the index of refraction has a value designated $n_1$. Immediately to the right of that surface its value is designated $n_1'$. The values immediately to the left and right of the second surface are designated $n_2$ and $n_2'$. Thus $n_1' = n_2$, and their common value is the index of refraction of the material from which the lens is made. Since the ray is paraxial, its intersections with the first and second lens surfaces occur at locations, measured along the axis, which are essentially the same as the locations of the planes labeled vertex plane 1 and vertex plane 2. The distance to vertex plane 1 from the object plane is indicated as $s$. The distance to vertex plane 2 from vertex plane 1 is indicated as $t_{21}$. And the distance to the image plane from vertex plane 2 is indicated as $s'$. Thus $s$ is the object distance measured to the first vertex plane, $t_{21}$ is the thickness of the lens measured between its vertex planes, and $s'$ is the image distance measured from the last vertex plane.

Fig. 29-32 A thick lens. Two rays are shown leaving a single point on the object plane. They arrive at a single point on the image plane.

If we specify the height above the axis and slope of the ray as it leaves the object plane by the column matrix $\mathbf{C}$, its specification as it arrives at the first vertex plane is given by the column matrix whose value is $\mathbf{T}\mathbf{C}$, where

$$\mathbf{T} = \begin{bmatrix} 1 & s \\ 0 & 1 \end{bmatrix} \tag{29-52}$$


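This is the transmission matrix taking the ray the distance $s$ to the first vertex plane from the object plane. Immediately after passing through vertex plane 1, the column matrix specifying the ray has the value given by $\mathbf{R}_1\mathbf{T}\mathbf{C}$. Here

$$\mathbf{R}_1 = \begin{bmatrix} 1 & 0 \\ \left(\dfrac{n_1}{n_1'} - 1\right)\dfrac{1}{r_1} & \dfrac{n_1}{n_1'} \end{bmatrix}$$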

is the refraction matrix describing the effect of vertex plane 1 on the ray. The specification of the ray immediately before arriving at vertex plane 2 is contained in the column matrix whose value is given by $\mathbf{T}_{21}\mathbf{R}_1\mathbf{T}\mathbf{C}$, with

$$\mathbf{T}_{21} = \begin{bmatrix} 1 & t_{21} \\ 0 & 1 \end{bmatrix}$$

This transmission matrix takes the ray the distance $t_{21}$ to vertex plane 2 from vertex plane 1. Then the ray is refracted at vertex plane 2, and on the far side is specified by the column matrix whose value is obtained by evaluating $\mathbf{R}_2\mathbf{T}_{21}\mathbf{R}_1\mathbf{T}\mathbf{C}$. The refraction matrix at vertex plane 2 is

$$\mathbf{R}_2 = \begin{bmatrix} 1 & 0 \\ \left(\dfrac{n_2}{n_2'} - 1\right)\dfrac{1}{r_2} & \dfrac{n_2}{n_2'} \end{bmatrix}$$

After leaving the last vertex plane, the ray continues to the image plane. It arrives at that plane with a height above the axis and a slope that are specified by a column matrix whose value is $\mathbf{T}'\mathbf{R}_2\mathbf{T}_{21}\mathbf{R}_1\mathbf{T}\mathbf{C}$. Here

$$\mathbf{T}' = \begin{bmatrix} 1 & s' \\ 0 & 1 \end{bmatrix} \tag{29-53}$$

is the transmission matrix taking the ray the distance $s'$ to the image plane from the last vertex plane. Writing the column matrix for the ray arriving at the image plane as $\mathbf{C}'$, we have

$$\mathbf{T}'\mathbf{R}_2\mathbf{T}_{21}\mathbf{R}_1\mathbf{T}\mathbf{C} = \mathbf{C}' \tag{29-54}$$

This method can be extended to a system containing any number of lenses. For every vertex plane separating regions of different index of refraction that is added to the system, two more matrices must be multiplied successively from the left into the matrix $\mathbf{R}_2$ of Eq. (29-54), before the matrix $\mathbf{T}'$ is multiplied from the left to terminate the string of matrices. The first of these is a $\mathbf{T}$ matrix taking the ray to the additional vertex plane. The next is an $\mathbf{R}$ matrix taking the ray through the plane.
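The step-by-step procedure leading to Eq. (29-54) can be sketched in Python by applying the five matrices to a ray one after another, in the order the ray meets them. Everything specific here is an assumption made for illustration: the helper names, a glass lens of index 1.5 in air, $r_1 = +0.10$ m, $r_2 = -0.10$ m, thickness 0.02 m, and the distances $s = 0.30$ m and $s' = 0.20$ m (which is not claimed to be the actual image distance).

```python
def transmission(t):
    """Transmission matrix, Eq. (29-48)."""
    return [[1.0, t], [0.0, 1.0]]

def refraction(n, n_prime, r):
    """Refraction matrix, Eq. (29-51)."""
    return [[1.0, 0.0], [(n / n_prime - 1.0) / r, n / n_prime]]

def apply(matrix, ray):
    """Multiply a 2 by 2 matrix into the column matrix [y, phi]."""
    (a, b), (c, d) = matrix
    y, phi = ray
    return [a * y + b * phi, c * y + d * phi]

def trace_thick_lens(ray, s, s_prime, t21, r1, r2, n_lens, n_outside=1.0):
    """Evaluate T' R2 T21 R1 T C by applying the matrices right to left."""
    steps = [
        transmission(s),                   # T: object plane to vertex plane 1
        refraction(n_outside, n_lens, r1), # R1: into the lens
        transmission(t21),                 # T21: across the lens
        refraction(n_lens, n_outside, r2), # R2: out of the lens
        transmission(s_prime),             # T': vertex plane 2 to final plane
    ]
    for m in steps:
        ray = apply(m, ray)
    return ray

# Assumed ray: leaves the object plane 5 mm above the axis with slope 0.01 rad.
final = trace_thick_lens([0.005, 0.010], s=0.30, s_prime=0.20,
                         t21=0.02, r1=0.10, r2=-0.10, n_lens=1.5)
print(final)
```

Adding another refracting surface amounts to inserting one more transmission and one more refraction into the `steps` list before the final transmission.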

The matrix method we have developed is capable of tracing any paraxial ray through any optical system. So it is capable of determining the image that any system forms of any object. But at the stage to which we have carried its development, it is not efficient to apply the matrix method to a specific numerical example. You can see this by considering what such an application would involve. Given an object at a particular location, we could use Eq. (29-54) to trace several rays from the object through a certain optical system. This would allow us to find the image. But since the elements of matrix $\mathbf{C}$ would have different values for each of the rays, for each we would have to perform all the numerical work involved in the matrix multiplications on the left side of Eq. (29-54). If we then wanted to determine what happens to the image when we change the location of the object, we would have to repeat the entire process because the values of elements in $\mathbf{T}$ and $\mathbf{T}'$ would change.

There is a much more efficient way to use the matrix method. For the optical system shown in Fig. 29-32 the central part of the string of matrices in Eq. (29-54), that is, the part reading $\mathbf{R}_2\mathbf{T}_{21}\mathbf{R}_1$, depends on only the properties of the optical system. It is the same, no matter where the object is located or which ray from the object is traced through the system. Hence if we evaluate the quantity $\mathbf{R}_2\mathbf{T}_{21}\mathbf{R}_1$ once, we can use the result in all calculations involving the system. This reduces considerably the effort required in using the method. The quantity under consideration is called the system matrix $\mathbf{S}$. That is, we define

$$\mathbf{R}_2\mathbf{T}_{21}\mathbf{R}_1 = \mathbf{S} \tag{29-55}$$

Introducing the system matrix allows us to write Eq. (29-54) as

$$\mathbf{T}'\mathbf{S}\mathbf{T}\mathbf{C} = \mathbf{C}' \tag{29-56}$$

The idea sounds like a good one, but it raises two questions. First, what is meant by the matrix products that enter in the system matrix, such as $\mathbf{T}_{21}\mathbf{R}_1$? Both factors in this product are 2 by 2 matrices, and up to this point we know only how to multiply a 2 by 2 matrix into a 2 by 1 matrix. Second, what is the justification for changing the sequence of carrying out the matrix multiplications in the left side of Eq. (29-54) from the sequence specified in the paragraph which closes with that equation? We must justify the legitimacy of performing first the multiplications in the central part, as indicated in Eq. (29-55), and then the remaining multiplications, as indicated in Eq. (29-56).

The answer to the first question is found in studying the definition of the product of two 2 by 2 matrices. Such a product is defined by the following equation:

$$\begin{bmatrix} a_2 & b_2 \\ c_2 & d_2 \end{bmatrix} \begin{bmatrix} a_1 & b_1 \\ c_1 & d_1 \end{bmatrix} = \begin{bmatrix} a_2a_1 + b_2c_1 & a_2b_1 + b_2d_1 \\ c_2a_1 + d_2c_1 & c_2b_1 + d_2d_1 \end{bmatrix} \tag{29-57}$$

According to this rule for multiplying a 2 by 2 matrix into a 2 by 2 matrix, the result is another 2 by 2 matrix whose elements have the indicated values. The procedure for calculating the elements of the product matrix from the elements of the two matrices being multiplied is just an extension of the procedure used in multiplying a 2 by 2 matrix into a 2 by 1 matrix. That is, you get the elements in the first column of the matrix produced by the multiplication by going through the same procedure you would in multiplying the matrix involving the subscript 2 into a column matrix comprising the elements of the first column of the matrix involving the subscript 1. The elements of the second column are obtained by doing the same thing, except that you use the elements of the second column of the subscript 1 matrix. Carrying out such a calculation is actually much easier than describing it, as you will see in Example 29-8.


EXAMPLE  29-8 

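Multiply the matrix

$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$$

into the matrix

$$\begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix}$$

■ Applying Eq. (29-57), you obtain

$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix} = \begin{bmatrix} 1 \times 5 + 2 \times 7 & 1 \times 6 + 2 \times 8 \\ 3 \times 5 + 4 \times 7 & 3 \times 6 + 4 \times 8 \end{bmatrix} = \begin{bmatrix} 19 & 22 \\ 43 & 50 \end{bmatrix}$$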
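Equation (29-57) can be coded in the same spirit as the earlier rule. The following is a minimal sketch; the name `mat_mul` and the nested-list layout are assumptions made for the example.

```python
def mat_mul(m2, m1):
    """Multiply the 2 by 2 matrix m2 into the 2 by 2 matrix m1, Eq. (29-57)."""
    (a2, b2), (c2, d2) = m2
    (a1, b1), (c1, d1) = m1
    return [[a2 * a1 + b2 * c1, a2 * b1 + b2 * d1],
            [c2 * a1 + d2 * c1, c2 * b1 + d2 * d1]]

product = mat_mul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product)  # [[19, 22], [43, 50]], as found in Example 29-8
```

Each column of the result is what the 2 by 2 by 2 by 1 rule would give for the corresponding column of the second factor.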

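A calculation which will give you some useful practice in applying both of the matrix multiplication rules is to employ them to prove that

$$\begin{bmatrix} a_2 & b_2 \\ c_2 & d_2 \end{bmatrix} \left( \begin{bmatrix} a_1 & b_1 \\ c_1 & d_1 \end{bmatrix} \begin{bmatrix} e \\ f \end{bmatrix} \right) = \left( \begin{bmatrix} a_2 & b_2 \\ c_2 & d_2 \end{bmatrix} \begin{bmatrix} a_1 & b_1 \\ c_1 & d_1 \end{bmatrix} \right) \begin{bmatrix} e \\ f \end{bmatrix}$$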

This is a particular case of what mathematicians call the associative law for
the multiplication of matrices. In general, the law says that in evaluating
the product of several matrices, they may be grouped to be multiplied two
at a time in any sequence you wish, just as in the multiplication of scalar
quantities. (But remember that this does not mean you can rearrange the
ordering of terms in a matrix product; for instance, $T_{21}R_1 \neq R_1T_{21}$.) The
associative law answers the second question posed above. That is, it justifies
first carrying out the matrix multiplications in Eq. (29-55) and then
carrying out the multiplications in Eq. (29-56), instead of carrying them out in
the sequence described in the paragraph leading to Eq. (29-54).
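
A quick numerical check can make the distinction between regrouping (allowed) and reordering (not allowed) concrete. The sketch below uses arbitrary integer matrices chosen only for illustration; the helper names `matmul2` and `apply2` are assumptions, not anything defined in the text.

```python
def matmul2(m2, m1):
    """Multiply the 2 by 2 matrix m2 into the 2 by 2 matrix m1."""
    (a2, b2), (c2, d2) = m2
    (a1, b1), (c1, d1) = m1
    return [
        [a2 * a1 + b2 * c1, a2 * b1 + b2 * d1],
        [c2 * a1 + d2 * c1, c2 * b1 + d2 * d1],
    ]


def apply2(m, column):
    """Multiply the 2 by 2 matrix m into the 2 by 1 column matrix [e, f]."""
    (a, b), (c, d) = m
    e, f = column
    return [a * e + b * f, c * e + d * f]


M2 = [[1, 2], [3, 4]]   # illustrative values only
M1 = [[5, 6], [7, 8]]
column = [1, -1]

# Associative law: the grouping of the factors does not matter.
one_at_a_time = apply2(M2, apply2(M1, column))
product_first = apply2(matmul2(M2, M1), column)
print(one_at_a_time, product_first)

# But the ordering does matter: M2 M1 is not M1 M2 in general.
print(matmul2(M2, M1), matmul2(M1, M2))
```

The first line printed shows the same column matrix twice; the second shows two different products.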

For  an  optical  system  more  complicated  than  the  one  illustrated  in  Fig. 
29-32,  the  only  change  we  must  make  is  to  write  its  system  matrix  S as  the 
more  general  expression 

\[
\cdots R_3 T_{32} R_2 T_{21} R_1 \equiv S \qquad \text{(29-58)}
\]

It keeps on going to the left, with the last term on the left being the
refraction matrix for the last vertex plane.
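
Because each new matrix is multiplied in from the left, a string of any length can be reduced by a simple loop. The following sketch assumes the matrices are supplied in the order in which the ray meets them (R1 first); the function names and the sample numbers are hypothetical and serve only to show the ordering.

```python
from functools import reduce


def matmul2(m2, m1):
    """Multiply the 2 by 2 matrix m2 into the 2 by 2 matrix m1."""
    (a2, b2), (c2, d2) = m2
    (a1, b1), (c1, d1) = m1
    return [
        [a2 * a1 + b2 * c1, a2 * b1 + b2 * d1],
        [c2 * a1 + d2 * c1, c2 * b1 + d2 * d1],
    ]


def system_matrix(matrices_in_ray_order):
    """Return ... R2 T21 R1 for matrices listed as [R1, T21, R2, ...]."""
    # Each later matrix is multiplied into the product accumulated so far,
    # so it ends up further to the left, as in Eq. (29-58).
    return reduce(lambda so_far, m: matmul2(m, so_far), matrices_in_ray_order)


# Illustrative integer matrices standing in for R1, T21, R2.
first = [[1, 0], [-1, 1]]
middle = [[1, 2], [0, 1]]
last = [[1, 0], [3, 1]]
S = system_matrix([first, middle, last])
print(S)  # [[-1, 2], [-4, 7]]
```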


Fig. 29-33 A ray diagram used to find the location, size, and orientation of an
image formed by an arbitrary set of lenses. (Labels in the figure: object plane,
first vertex plane, any system of thick or thin lenses, last vertex plane, image
plane; object distance s, image distance s'; ray heights y and y'.)

Now  we  will  perform  calculations  which  trace  rays  from  any  object, 
through  an  optical  system,  to  the  image  formed  of  the  object.  We  will  do  it 
in  such  a way  that  the  results  obtained  will  apply  to  any  optical  system.  For 
this  purpose,  we  write  the  system  matrix  in  the  completely  general  form 


\[
S = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \qquad \text{(29-59)}
\]


The results of the calculations will be two quite simple scalar equations.
One of them, s' = -(as + b)/(cs + d), will allow you to determine the value
of s', which locates the image, after you have carried out the matrix
multiplications in Eq. (29-58) to find the values of the elements a, b, c, and d of
the system matrix for a particular system. You will be able to substitute
these four values, and the value of the quantity s specifying the location of a
particular object, into the equation and immediately find the value of s'.
The other equation to be obtained will allow you then to evaluate the lateral
magnification l of the image, which specifies its size and orientation in
terms of the size and orientation of the object. The equation is l = a + s'c.
We will derive these two equations by using repeatedly the rule for
multiplying a 2 by 2 matrix into a 2 by 1 matrix.

Figure  29-33  indicates  the  general  optical  system,  and  the  planes  on 
which  are  located  the  object  sending  paraxial  light  rays  through  the  system 
and  the  image  that  the  system  produces  of  the  object.  As  a ray  leaves  the 
object  plane,  it  is  specified  by  the  column  matrix 

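\[
\begin{bmatrix} y \\ \phi \end{bmatrix}
\]

The transmission matrix

\[
\begin{bmatrix} 1 & s \\ 0 & 1 \end{bmatrix}
\]

takes the ray the distance s from the object plane to the first vertex plane
of the optical system. Thus, on arriving at the first vertex plane, the ray is
specified by the matrix

\[
\begin{bmatrix} 1 & s \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} y \\ \phi \end{bmatrix}
=
\begin{bmatrix} y + s\phi \\ \phi \end{bmatrix}
\]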

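The system matrix

\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
\]

takes the ray through the optical system, and as it leaves the last vertex
plane, it is specified by the matrix

\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
\begin{bmatrix} y + s\phi \\ \phi \end{bmatrix}
=
\begin{bmatrix} ay + as\phi + b\phi \\ cy + cs\phi + d\phi \end{bmatrix}
\]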

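Finally, the transmission matrix

\[
\begin{bmatrix} 1 & s' \\ 0 & 1 \end{bmatrix}
\]

takes the ray the distance s' from the last vertex plane to the image plane,
and it arrives there being specified by the matrix

\[
\begin{bmatrix} 1 & s' \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} ay + as\phi + b\phi \\ cy + cs\phi + d\phi \end{bmatrix}
=
\begin{bmatrix} ay + as\phi + b\phi + s'cy + s'cs\phi + s'd\phi \\ cy + cs\phi + d\phi \end{bmatrix}
\]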

We follow the symbolism established earlier by writing the column matrix
specifying the ray arriving at the image plane as

\[
\begin{bmatrix} y' \\ \phi' \end{bmatrix}
\]

where

\[
ay + as\phi + b\phi + s'cy + s'cs\phi + s'd\phi = y' \qquad \text{(29-60a)}
\]

and

\[
cy + cs\phi + d\phi = \phi' \qquad \text{(29-60b)}
\]


Consider the first of these scalar equations, and rewrite it, grouping
factors of y and $\phi$, to obtain

\[
(a + s'c)y + (as + b + s'cs + s'd)\phi = y' \qquad \text{(29-61)}
\]

Now y' gives the height above the axis on arriving at the image plane of a
ray leaving the object plane at height y above the axis and with slope $\phi$. But
all paraxial rays leaving a point in the object plane, no matter what their
slope, arrive at a certain point in the image plane. (See Fig. 29-32.)
Therefore the values of y' found from Eq. (29-61) should not depend on the values of
$\phi$. This means that the coefficient of $\phi$ in that equation must have the value zero.
In other words, the quantities s and s' must be related in such a way that

\[
as + b + s'cs + s'd = 0 \qquad \text{(29-62)}
\]

We can find the value of s', for a given value of s, by solving Eq. (29-62)
for s'. Factoring, transposing, and then dividing, we have

\[
s'(cs + d) = -(as + b)
\]

and

\[
s' = -\frac{as + b}{cs + d} \qquad \text{(29-63)}
\]


The  quantity  s is  the  distance  from  the  object  plane  to  the  first  vertex  plane 
of  the  optical  system,  which  we  call  the  object  distance.  The  quantity  s'  is 
the  distance  from  the  last  vertex  plane  of  the  optical  system  to  the  image 
plane,  which  we  call  the  image  distance.  As  before,  the  value  of  s'  is  positive 
when  the  direction  from  the  last  vertex  plane  to  the  image  plane  is  the  same 
as  the  general  direction  of  light  rays  through  the  system,  and  it  is  negative 
otherwise. Equation (29-63) gives us a very easy way of calculating the
image distance s' from the values of a, b, c, and d, the four elements of the
system matrix, and the value of the object distance s.
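
Equation (29-63) translates directly into a short function. The sketch below is illustrative only: the name `image_distance` is an assumption, and the numbers use the thin-lens form of the system matrix that is derived later in Sec. 29-7 (Eq. 29-69), with a hypothetical focal length of 10 cm and object distance of 30 cm.

```python
import math


def image_distance(a, b, c, d, s):
    """Image distance s' from Eq. (29-63), s' = -(as + b)/(cs + d).

    Returns math.inf when cs + d = 0, which is the case of an image at infinity.
    """
    denominator = c * s + d
    if denominator == 0:
        return math.inf
    return -(a * s + b) / denominator


# Hypothetical check: a single thin lens of focal length f has
# a = 1, b = 0, c = -1/f, d = 1 (see Eq. 29-69).
f = 10.0
s_prime = image_distance(1.0, 0.0, -1.0 / f, 1.0, s=30.0)
print(s_prime)  # close to 15, in agreement with 1/s' = 1/f - 1/s
```

The sign convention is the one stated above: a positive s' lies beyond the last vertex plane in the direction the light travels.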

We evaluate the lateral magnification of the image by using Eq. (29-62)
to reduce Eq. (29-61) to the expression

\[
(a + s'c)y = y' \qquad \text{(29-64)}
\]

All paraxial rays which leave the object plane at height y above the axis
arrive at the image plane at height y' above the axis. The ratio of these two
quantities is what we define to be the lateral magnification l. That is,

\[
l \equiv \frac{y'}{y}
\]

Employing Eq. (29-64), we have

\[
l = a + s'c \qquad \text{(29-65)}
\]

As usual, a positive value of l means an erect image, and a negative value of
l means an inverted image. Equation (29-65) makes it very easy for us to
calculate the lateral magnification l, after s' has been found, in terms of the
values of the two elements a and c of the system matrix.
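
The argument leading to Eqs. (29-63) and (29-65) can be checked numerically by tracing rays of several slopes from one object point and confirming that they all arrive at the same image height. The sketch below is a hedged illustration: the helper names are assumptions, and the system matrix is again the thin-lens form of Eq. (29-69) with hypothetical values f = 10 cm and s = 30 cm.

```python
def apply2(m, ray):
    """Multiply a 2 by 2 matrix into the column matrix [y, phi] of a ray."""
    (a, b), (c, d) = m
    y, phi = ray
    return [a * y + b * phi, c * y + d * phi]


def transmission(t):
    """Transmission matrix for a distance t."""
    return [[1.0, t], [0.0, 1.0]]


f, s = 10.0, 30.0                       # hypothetical values, in centimeters
S = [[1.0, 0.0], [-1.0 / f, 1.0]]       # thin-lens system matrix
(a, b), (c, d) = S

s_prime = -(a * s + b) / (c * s + d)    # Eq. (29-63)
l = a + s_prime * c                     # Eq. (29-65)

y = 1.0
heights = []
for phi in (-0.05, 0.0, 0.05):          # three different paraxial slopes
    ray = apply2(transmission(s), [y, phi])      # object plane to first vertex
    ray = apply2(S, ray)                          # through the system
    ray = apply2(transmission(s_prime), ray)      # last vertex to image plane
    heights.append(ray[0])

print(l, heights)  # every height is close to l * y
```

If s' is changed to any other value, the three heights no longer agree, which is the numerical counterpart of requiring the coefficient of the slope in Eq. (29-61) to vanish.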

Equations (29-63) and (29-65) completely determine the characteristics of the
image that a system produces of a given object for any case in which both the
image distance s' and the object distance s are finite. In Sec. 29-7 we consider a
case in which s is infinite. We will find that Eq. (29-63) reduces to an even simpler
expression for s'. Also we will find that Eq. (29-65) for l is not useful because its
value is zero when s is infinite (unless s' also is infinite, which makes l
indeterminate). But it will be simple to use Eqs. (29-63) and (29-64) to develop an
equation which plays the role of Eq. (29-65) in determining the characteristics of the
image produced from an object when the object distance s is infinite.

If the image distance s' is infinite, then the lateral magnification l is infinite
(unless s is also infinite, which makes l indeterminate as was just said). In such a
case, the magnification of a system must be specified in terms of angular
quantities. It can be evaluated by using Eq. (29-60b), $cy + cs\phi + d\phi = \phi'$, to find the
value of the angle $\phi'$ between any of the parallel light rays and the axis at any
position beyond the last vertex plane of the system. But as we saw in Sec. 27-5,
magnifications involving angular quantities are defined differently for different systems.
They are not as interesting as lateral magnification because they do not have the
generality of lateral magnification, which is always defined the same way. So we
will not apply the matrix method to evaluate magnifications involving angular
quantities. (However, one of the exercises at the end of this chapter leads you
through an evaluation of the angular magnification of a thick lens used as a
magnifying glass.)

Before  it  is  possible  to  use  the  equations  which  determine  the  location, 
size,  and  orientation  of  the  image  produced  by  an  optical  system  in  terms  of 
the  location,  size,  and  orientation  of  the  object,  it  is  necessary  to  evaluate 
the  elements  a,  b,  c,  and  d of  the  system  matrix.  This  is  done  by  employing 
Eqs.  (29-48),  (29-51),  and  (29-58)  to  write,  with  specific  numerical  values  for 
each  of  the  elements,  the  string  of  2 by  2 matrices  that  form  the  system 
matrix.  Then  repeated  matrix  multiplications  are  carried  out  to  reduce 
the  string  of  matrices  to  a single  2 by  2 matrix.  The  numerical  values  of  the 
four  elements  of  this  matrix  are  a,  b,  c,  and  d. 

If the system is a complicated one, there will be many 2 by 2 matrices
that must be multiplied together in evaluating the elements of the system
matrix. This can be tedious, and error-prone, when done manually. But
you will find a matrix multiplication program in the Numerical Calculation
Supplement that will convert a programmable calculator or computer into
a machine on which you can quickly and easily calculate the product of even
a quite long string of 2 by 2 matrices.

Examples  29-9  through  29-11  making  use  of  the  matrix  method  are 
presented  in  Sec.  29-7. 


29-7  APPLICATIONS 
OF  THE  MATRIX 
METHOD 


Example  29-9  uses  the  matrix  method  to  find  the  image  that  a thick  lens 
produces  of  an  object. 


EXAMPLE  29-9 

a. Figure 29-34 shows a thick lens and an object. Use the matrix method to
determine the location, size, and orientation of the image.

b.  For  the  sake  of  comparison,  determine  these  properties  of  the  image  by 
treating  the  lens  as  if  it  were  a thin  lens. 

■ a. The system matrix you must first evaluate is $R_2 T_{21} R_1 = S$, or

\[
\begin{bmatrix} 1 & 0 \\ \left(\dfrac{n_2}{n_2'} - 1\right)\dfrac{1}{r_2} & \dfrac{n_2}{n_2'} \end{bmatrix}
\begin{bmatrix} 1 & t_{21} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ \left(\dfrac{n_1}{n_1'} - 1\right)\dfrac{1}{r_1} & \dfrac{n_1}{n_1'} \end{bmatrix}
= S
\]

From the figure, you have $n_1 = 1$, $n_1' = 1.500$, $r_1 = +10.000$ cm, $t_{21} = 2.000$ cm,
$n_2 = 1.500$, $n_2' = 1$, and $r_2 = -5.000$ cm. First you make calculations producing the
values

\[
\frac{n_1}{n_1'} = 0.6667
\qquad
\left(\frac{n_1}{n_1'} - 1\right)\frac{1}{r_1} = -0.0333\ \text{cm}^{-1}
\]

\[
\frac{n_2}{n_2'} = 1.500
\qquad
\left(\frac{n_2}{n_2'} - 1\right)\frac{1}{r_2} = -0.1000\ \text{cm}^{-1}
\]

Fig. 29-34 A scale drawing of a thick lens and an object. Shown are the actual image
determined by the matrix method and the approximate image determined by treating the lens as if
it were thin. Also shown is the actual focal point of the lens and the approximate focal point
found, respectively, by the matrix method and by considering the lens to be thin.

Then you enter the values of all the matrix elements, expressing distances in units
of centimeters. You have

\[
\begin{bmatrix} 1 & 0 \\ -0.1000 & 1.500 \end{bmatrix}
\begin{bmatrix} 1 & 2.000 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -0.03333 & 0.6667 \end{bmatrix}
= S
\]

Performing the successive multiplications by using the matrix multiplication
program in an appropriate computing device, you obtain

\[
\begin{bmatrix} 0.9333 & 1.3334 \\ -0.1433 & 0.8667 \end{bmatrix}
= S =
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
\]

Now you can find the image distance s' when the object distance is s = 3.000 (in
units of centimeters), as shown in the figure. Using Eq. (29-63), you have

\[
s' = -\frac{as + b}{cs + d} = -\frac{0.9333 \times 3.000 + 1.3334}{-0.1433 \times 3.000 + 0.8667} = -9.463
\]

You then find the lateral magnification by using Eq. (29-65):

\[
l = a + s'c = 0.9333 + (-9.463) \times (-0.1433) = +2.289
\]

Rounding off these final results to three significant figures and inserting the units
of centimeters in the distance s', you obtain

s' = -9.46 cm and l = +2.29


The negative value of s' means the image is on the side of the last vertex plane
from which light is coming through the system; its distance from that plane is the
magnitude of s'. The positive value of l tells you that the image is erect, as shown in
the figure. Since the height of the object in the figure is h = +0.500 cm, the height
of the image is

\[
h' = lh = +2.29 \times (+0.500\ \text{cm}) = +1.15\ \text{cm}
\]

The image shown in the figure, which is accurately to scale in all regards, was
constructed by using the values of s' and h'.
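
The whole of part a can be reproduced with a short script. This is a sketch rather than the Numerical Calculation Supplement program: the function names are assumptions, and the form of the refraction and transmission matrices is taken from the numerical entries used above. Because the script carries full precision instead of four-figure rounded elements, its results can differ from the printed ones in the last figure.

```python
def matmul2(m2, m1):
    """Multiply the 2 by 2 matrix m2 into the 2 by 2 matrix m1."""
    (a2, b2), (c2, d2) = m2
    (a1, b1), (c1, d1) = m1
    return [
        [a2 * a1 + b2 * c1, a2 * b1 + b2 * d1],
        [c2 * a1 + d2 * c1, c2 * b1 + d2 * d1],
    ]


def refraction(n, n_prime, r):
    """Refraction matrix for a surface of radius r, from index n into n_prime."""
    ratio = n / n_prime
    return [[1.0, 0.0], [(ratio - 1.0) / r, ratio]]


def transmission(t):
    """Transmission matrix for a distance t."""
    return [[1.0, t], [0.0, 1.0]]


# Values read from Fig. 29-34, distances in centimeters.
R1 = refraction(1.0, 1.500, +10.000)
T21 = transmission(2.000)
R2 = refraction(1.500, 1.0, -5.000)

S = matmul2(R2, matmul2(T21, R1))       # S = R2 T21 R1
(a, b), (c, d) = S

s = 3.000
s_prime = -(a * s + b) / (c * s + d)    # Eq. (29-63)
l = a + s_prime * c                     # Eq. (29-65)
h_prime = l * 0.500                     # image height for h = +0.500 cm

print(S)
print(round(s_prime, 2), round(l, 2), round(h_prime, 2))
```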

b. To find the properties of the image by approximating the actual lens with a
thin lens, you must say where the plane of the thin lens is located. The most
reasonable location to choose is halfway between the vertex planes of the actual lens. Then
you calculate the focal length of the thin lens by using the lens maker's formula

\[
\frac{1}{f} = (1.500 - 1)\left(\frac{1}{+10.00\ \text{cm}} - \frac{1}{-5.00\ \text{cm}}\right) = +0.150\ \text{cm}^{-1}
\]

So the focal length of the thin lens is

\[
f = \frac{1}{+0.150\ \text{cm}^{-1}} = +6.67\ \text{cm}
\]


Since here the object distance s is measured from the object to the plane
halfway between the vertex planes of the actual lens, you have s = 3.00 cm +
1.00 cm = +4.00 cm. If you compare this value, the value just calculated for f, and
the height h = +0.500 cm of the object with the values of these quantities in the
thin-lens calculation of Example 29-5, you will find them to be identical. Hence you
can use the results of Example 29-5 to conclude immediately that the thin-lens
approximation predicts that the image distance s', measured from the plane
halfway between the vertex planes of the actual lens, has the value

s' = -10.0 cm

It also predicts that the height of the image is

h' = +1.25 cm

The image predicted by the thin-lens approximation is shown in Fig. 29-34.
Comparison of the location and size of this image with the location and size of the
image obtained in part a shows how much error is made in assuming the lens to be
thin.
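
For comparison, the thin-lens numbers of part b can be checked the same way. The sketch below assumes the thin-lens formula in the form 1/s' = 1/f - 1/s and the standard thin-lens magnification l = -s'/s; the latter is an assumption here, since Example 29-5 is not reproduced in this section. The function names are illustrative.

```python
def lens_maker_focal_length(n, r1, r2):
    """Focal length of a thin lens in air from the lens maker's formula."""
    return 1.0 / ((n - 1.0) * (1.0 / r1 - 1.0 / r2))


def thin_lens_image_distance(f, s):
    """Image distance from the thin-lens formula 1/s' = 1/f - 1/s."""
    return 1.0 / (1.0 / f - 1.0 / s)


f = lens_maker_focal_length(1.500, +10.00, -5.00)    # centimeters
s = 3.00 + 1.00                                       # measured from the midplane
s_prime = thin_lens_image_distance(f, s)
l = -s_prime / s                                      # assumed thin-lens relation
h_prime = l * 0.500

print(round(f, 2), round(s_prime, 1), round(h_prime, 2))
```

Note that the two image distances are measured from different planes (the last vertex plane in part a, the midplane here), so they must be referred to a common origin before being compared.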


Example 29-9 made two comparisons between accurate results
obtained by using the matrix method to treat paraxial rays passing through a
thick lens and approximate results obtained by considering the lens to be
thin, so that thin-lens formulas could be applied. Another comparison of
the same type can be made by investigating the case in which the object is
infinitely far from the lens. When the object distance s is large, Eq. (29-63),

\[
s' = -\frac{as + b}{cs + d}
\]

reduces to

\[
s' \simeq -\frac{as}{cs}
\]

This is so because when s is large, b is negligible compared to as and d is
negligible compared to cs. Dividing by s and then letting s approach infinity
to make the approximate equality become an exact equality, we obtain

\[
s' = -\frac{a}{c} \quad \text{for } s = \infty \qquad \text{(29-66)}
\]


This expression gives the value of the image distance s', measured from the
last vertex plane, when the object producing the image is at infinity. For the
thick lens considered in Example 29-9a, the value of s' obtained from Eq.
(29-66) is

\[
s' = -\frac{a}{c} = -\frac{0.9333}{-0.1433} = +6.513
\]

Writing the distance units explicitly gives

s' = +6.513 cm

The point on the axis at this distance behind the last vertex plane is the
focal point of the lens. It is shown in Fig. 29-34.
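
The limiting process behind Eq. (29-66) can be watched numerically by evaluating Eq. (29-63) for larger and larger object distances. The sketch below uses the four-figure matrix elements quoted for the thick lens of Example 29-9a; the function name is an assumption.

```python
# Elements of the system matrix of Example 29-9a (distances in centimeters).
a, b, c, d = 0.9333, 1.3334, -0.1433, 0.8667


def image_distance(s):
    """Image distance s' from Eq. (29-63) for object distance s."""
    return -(a * s + b) / (c * s + d)


for s in (1e1, 1e2, 1e4, 1e6):
    print(s, image_distance(s))      # approaches the focal-point value

focal_point = -a / c                 # Eq. (29-66), the limit for s = infinity
print(focal_point)
```

The printed image distances settle toward the value given by -a/c as b and d become negligible beside as and cs.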

Approximating the lens by a thin lens, as in Example 29-9b, we apply
the thin-lens formula

\[
\frac{1}{s'} = \frac{1}{f} - \frac{1}{s}
\]

and set s = ∞. We obtain immediately

\[
s' = f
\]

Using the focal length f = +6.67 cm found in Example 29-9, we have for
the image distance

s' = +6.67 cm

This image distance is measured from the plane of the thin lens, which is
halfway between the vertex planes of the actual lens. The point on the axis
at this distance is the focal point of the lens, according to the thin-lens
approximation. It, too, is shown in Fig. 29-34. Why does the thin-lens
approximation predict a focal point closer to the lens than the prediction of
the accurate calculation? Can you use the fact that it does to give an
explanation of why the thin-lens approximation leads, in Example 29-9b, to too
large a distance from lens to image and to an image which is too high?


As the object distance s approaches infinity, the size of an object also
must approach infinity if the object is not to become invisible as viewed by
the optical system. Yet the system will produce an image of finite size
(unless the image distance also is approaching infinity, as in a telescope,
which we assume not to be the case). An example of this situation is
provided by a camera taking a picture of a very distant object. In such a
situation, the lateral magnification l is not a useful quantity because its value
approaches zero. (You can see this by substituting the expression s' = -a/c
into the expression l = a + s'c, to obtain l = 0.)

To determine the size and orientation of the image that a system
produces of a very distant object, we go back to Eq. (29-64), writing it as

\[
y' = (a + s'c)y
\]

Then we use Eq. (29-63),

\[
s' = -\frac{as + b}{cs + d}
\]

to evaluate s', obtaining

\[
y' = \left(a - \frac{asc + bc}{cs + d}\right) y
   = \frac{acs + ad - asc - bc}{cs + d}\, y
   = \frac{ad - bc}{cs + d}\, y
\]

Letting the object distance s be very large makes the quantity d negligible
compared to the quantity cs and gives us

\[
y' \simeq \frac{ad - bc}{cs}\, y
\]

or

\[
y' \simeq \left(\frac{ad}{c} - b\right)\frac{y}{s}
\]

Applying this to a ray going from the tip of an object of height h to the tip
of its image of height h', we have

\[
h' \simeq \left(\frac{ad}{c} - b\right)\frac{h}{s}
\]

Fig. 29-35 A ray from the tip of a very distant, very large object that passes through
an optical system. The height of the object is h, the object distance is s, and the slope
angle of the ray is $\phi$. The slope is negative when the object height is positive, as
illustrated. Its value is given by the relation $\tan \phi = -h/s$.

Consider a ray which starts at the tip of a very distant object and passes
anywhere through the first refracting surface of an optical system. If the
slope angle $\phi$ of the ray as it enters the system is not essentially zero, then
the height h of the object above the axis must be very large. Assuming this
to be the case, we can see from Fig. 29-35 that $\phi$ is given by the relation
$\tan \phi = -h/s$, where s is the object distance and the minus sign occurs
because the slope is negative. To satisfy the paraxial approximation, we assume
also that the ratio of h to s is small enough to make the magnitude of $\phi$ small
compared to 1 rad. Then we may replace $\tan \phi$ by $\phi$ and write

\[
\phi = -\frac{h}{s} \qquad \text{(29-67)}
\]

Using this relation, we obtain

\[
h' \simeq \left(\frac{ad}{c} - b\right)(-\phi)
\]

Finally, we let the object distance approach infinity, so that the
approximate equality becomes an exact equality. Then we have

\[
h' = \left(\frac{ad}{c} - b\right)(-\phi) \quad \text{for } s = \infty \qquad \text{(29-68)}
\]
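
A numerical comparison shows how quickly the exact expression for the image height approaches the limiting form of Eq. (29-68). The sketch below uses the rounded system-matrix elements of the thick lens in Example 29-9a together with a hypothetical slope angle of -0.01 rad; the function names are assumptions.

```python
# Elements of the system matrix of Example 29-9a (distances in centimeters).
a, b, c, d = 0.9333, 1.3334, -0.1433, 0.8667


def exact_image_height(h, s):
    """y' = (ad - bc) y / (cs + d), the expression before s is made large."""
    return (a * d - b * c) / (c * s + d) * h


def limiting_image_height(phi):
    """Eq. (29-68): h' = (ad/c - b)(-phi) for an object at infinity."""
    return (a * d / c - b) * (-phi)


phi = -0.01                     # hypothetical slope angle, radians
limit = limiting_image_height(phi)

for s in (1e2, 1e4, 1e6):
    h = -phi * s                # Eq. (29-67) rearranged: h = -phi * s
    print(s, exact_image_height(h, s), limit)
```

As s grows with the slope angle held fixed, the exact height converges on the limiting value, and the negative sign of the result reflects an inverted image.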


In  Example  29-10  Eqs.  (29-66)  and  (29-68)  are  used  to  analyze  the 
characteristics  of  an  optical  system  containing  two  adjacent  thick  lenses. 


EXAMPLE  29-10 

a. Figure 29-36 shows a converging-diverging pair of thick lenses used in an
inexpensive camera to reduce chromatic aberration. Light enters the camera lens
through the curved surface on the left and leaves through the flat surface on the
right. What must be the distance from the flat surface to the plane of the camera
film when a photograph is being taken of a very distant object?

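■ The system matrix you must evaluate to obtain the answer is
$R_3 T_{32} R_2 T_{21} R_1 = S$, or

\[
\begin{bmatrix} 1 & 0 \\ \left(\dfrac{n_3}{n_3'} - 1\right)\dfrac{1}{r_3} & \dfrac{n_3}{n_3'} \end{bmatrix}
\begin{bmatrix} 1 & t_{32} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ \left(\dfrac{n_2}{n_2'} - 1\right)\dfrac{1}{r_2} & \dfrac{n_2}{n_2'} \end{bmatrix}
\begin{bmatrix} 1 & t_{21} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ \left(\dfrac{n_1}{n_1'} - 1\right)\dfrac{1}{r_1} & \dfrac{n_1}{n_1'} \end{bmatrix}
= S
\]

To do this, you first evaluate the quantities appearing in the three refraction
matrices. They are

\[
\frac{n_1}{n_1'} = \frac{1}{1.500} = 0.6667
\qquad
\left(\frac{n_1}{n_1'} - 1\right)\frac{1}{r_1} = (0.6667 - 1)\frac{1}{+2.000\ \text{cm}} = -0.1667\ \text{cm}^{-1}
\]

\[
\frac{n_2}{n_2'} = \frac{1.500}{1.630} = 0.9202
\qquad
\left(\frac{n_2}{n_2'} - 1\right)\frac{1}{r_2} = (0.9202 - 1)\frac{1}{-2.000\ \text{cm}} = +0.03988\ \text{cm}^{-1}
\]

\[
\frac{n_3}{n_3'} = \frac{1.630}{1} = 1.630
\qquad
\left(\frac{n_3}{n_3'} - 1\right)\frac{1}{r_3} = (1.630 - 1)\frac{1}{\infty} = 0
\]

Fig. 29-36 The lens of an inexpensive camera. It is made from two different
types of glass held together with transparent optical cement. The purpose is
to reduce chromatic aberration.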

The quantities appearing in the two transmission matrices have the values

t21 = 0.5000 cm
t32 = 0.4000 cm

If all distances are expressed in units of centimeters, the system matrix becomes

\[
\begin{bmatrix} 1 & 0 \\ 0 & 1.630 \end{bmatrix}
\begin{bmatrix} 1 & 0.4000 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0.03988 & 0.9202 \end{bmatrix}
\begin{bmatrix} 1 & 0.5000 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -0.1667 & 0.6667 \end{bmatrix}
= S
\]

Carrying out the matrix multiplications on your computing device, you obtain

\[
\begin{bmatrix} 0.8699 & 0.5841 \\ -0.1905 & 1.022 \end{bmatrix}
= S =
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
\]

Now all you have to do is to evaluate s' from Eq. (29-66), the equation that
pertains to the case where s = ∞. You have

\[
s' = -\frac{a}{c} = -\frac{0.8699}{-0.1905} = +4.566
\]

Rounding off to three significant figures and including the distance unit, you obtain
the result

s' = +4.57 cm ■

b. When the camera is focused on an object at infinity, the lens is at its
minimum distance from the plane of the film. The lens can be moved away from that
plane by sliding it in its mount through a distance equal to 0.300 cm. What is the
closest object that can be clearly photographed by the camera?

■ Working again in units of centimeters, you add 0.300 to the value s' =
+4.566 found in part a and find that here s' = +4.866. To find the corresponding
value of s, you must take Eq. (29-63),

\[
s' = -\frac{as + b}{cs + d}
\]

and solve it for s. Multiplying through by cs + d, you obtain

\[
css' + ds' = -(as + b)
\]

Transposing and factoring give you

\[
s(cs' + a) = -(ds' + b)
\]

or

\[
s = -\frac{ds' + b}{cs' + a}
\]

Inserting the values of the four elements of the system matrix found in part a and
the value of s', you have

\[
s = -\frac{1.022 \times 4.866 + 0.5841}{-0.1905 \times 4.866 + 0.8699} = +97.37
\]

Putting the distance units in explicitly and rounding off give you the result

s = +97.4 cm ■

c. The height from the axis to the tip of the image produced in the plane of the
film can have a magnitude no greater than 1.20 cm. (This is the distance from the
center of a strip of 35-mm film to the edge of the usable part of the emulsion.)
Evaluate the angle subtended at the camera from the axis to the tip of an object at
infinity whose image has this limiting height.

■ The applicable equation is Eq. (29-68),

\[
h' = \left(\frac{ad}{c} - b\right)(-\phi)
\]

Evaluating the first term in parentheses, you obtain

\[
\frac{ad}{c} - b = \frac{0.8699 \times 1.022}{-0.1905} - 0.5841 = -5.251
\]

Assuming the object producing the image has a positive height h and expressing
its value in centimeters, you write

h' = -1.20

The value is negative because the camera produces an inverted image. (See Fig.
29-19.) Thus you have

\[
-1.20 = (-5.251)(-\phi)
\]

or

\[
\phi = -\frac{1.20}{5.251} = -0.229\ \text{rad} = -13.1^\circ
\]

The magnitude of $\phi$ is the angle subtended at the camera from the axis to the tip of
the object. (See Fig. 29-35.)

This angle is large enough to cause concern about monochromatic aberrations
in the region near the tip of an image of limiting height, arising from the failure of
rays forming that part of the image to satisfy well the paraxial approximation. A
lens in a good camera must be more sophisticated than the one shown in Fig. 29-36,
in order to reduce these aberrations.
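
All three parts of Example 29-10 can be reproduced with a short script. This is a sketch under stated assumptions: the function names are illustrative, the matrix forms are those implied by the numerical entries above, and the flat last surface is represented by an infinite radius. The script carries full precision, so part b in particular (where the denominator is a small difference of nearly equal numbers) comes out near 97.3 rather than the 97.4 obtained from four-figure elements.

```python
import math


def matmul2(m2, m1):
    """Multiply the 2 by 2 matrix m2 into the 2 by 2 matrix m1."""
    (a2, b2), (c2, d2) = m2
    (a1, b1), (c1, d1) = m1
    return [
        [a2 * a1 + b2 * c1, a2 * b1 + b2 * d1],
        [c2 * a1 + d2 * c1, c2 * b1 + d2 * d1],
    ]


def refraction(n, n_prime, r):
    """Refraction matrix; r = math.inf represents a flat surface."""
    ratio = n / n_prime
    return [[1.0, 0.0], [(ratio - 1.0) / r, ratio]]


def transmission(t):
    """Transmission matrix for a distance t."""
    return [[1.0, t], [0.0, 1.0]]


# Matrices in the order the ray meets them (distances in centimeters).
ray_order = [
    refraction(1.0, 1.500, +2.000),      # R1: air into first glass
    transmission(0.5000),                # T21
    refraction(1.500, 1.630, -2.000),    # R2: first glass into second glass
    transmission(0.4000),                # T32
    refraction(1.630, 1.0, math.inf),    # R3: flat surface, glass into air
]
S = ray_order[0]
for m in ray_order[1:]:
    S = matmul2(m, S)                    # later matrices multiply from the left
(a, b), (c, d) = S

# Part a: image distance for an object at infinity, Eq. (29-66).
s_prime_infinity = -a / c

# Part b: lens moved 0.300 cm farther from the film; solve Eq. (29-63) for s.
s_prime = s_prime_infinity + 0.300
s_closest = -(d * s_prime + b) / (c * s_prime + a)

# Part c: slope angle giving h' = -1.20 cm, from Eq. (29-68).
phi = -(-1.20) / (a * d / c - b)

print(round(s_prime_infinity, 2), round(s_closest, 1), round(math.degrees(phi), 1))
```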


It takes little more time or effort to handle a set of several thick lenses,
provided a programmable pocket calculator or computer is available to
evaluate the system matrix S. In fact, the matrix method is so convenient that it
is also the preferable one to use for systems containing several thin lenses.
This is particularly so because for thin lenses S is extremely simple. The
reason is that the complete effect of a thin lens on a light ray can be
represented by a single matrix.

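To see this, consider the part of a system matrix associated with a
single lens, say the lens whose surfaces are numbered 1 and 2. We let the
index of refraction of the glass forming the lens be n and assume that it is in
air. Also, we put its thickness equal to zero. Then we have

\[
R_2 T_{21} R_1 =
\begin{bmatrix} 1 & 0 \\ (n - 1)\dfrac{1}{r_2} & n \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ \left(\dfrac{1}{n} - 1\right)\dfrac{1}{r_1} & \dfrac{1}{n} \end{bmatrix}
\]

Multiplying the second matrix to the right of the equals sign into the third
gives

\[
R_2 T_{21} R_1 =
\begin{bmatrix} 1 & 0 \\ (n - 1)\dfrac{1}{r_2} & n \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ \left(\dfrac{1}{n} - 1\right)\dfrac{1}{r_1} & \dfrac{1}{n} \end{bmatrix}
\]

Then multiplying the two remaining matrices produces

\[
R_2 T_{21} R_1 =
\begin{bmatrix} 1 & 0 \\ (n - 1)\dfrac{1}{r_2} + n\left(\dfrac{1}{n} - 1\right)\dfrac{1}{r_1} & 1 \end{bmatrix}
\]

This simplifies to

\[
R_2 T_{21} R_1 =
\begin{bmatrix} 1 & 0 \\ -(n - 1)\left(\dfrac{1}{r_1} - \dfrac{1}{r_2}\right) & 1 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{f} & 1 \end{bmatrix}
\]

In the last step we used the lens maker's formula, Eq. (29-19), to express
one element of the matrix in terms of the focal length f of the thin lens.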

The matrix we have obtained is known as the thin-lens matrix L. That
is, we define

\[
L = \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{f} & 1 \end{bmatrix} \qquad \text{(29-69)}
\]

In a system of three thin lenses labeled 1, 2, and 3, for instance, the system
matrix S taking a ray from just in front of the plane of lens 1 to just behind
the plane of lens 3 would be

\[
L_3 T_{32} L_2 T_{21} L_1 = S \qquad \text{(29-70)}
\]

Example 29-11 deals with such a system.
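
The reduction of two refraction matrices to the single thin-lens matrix can be confirmed numerically. The sketch below multiplies the two surface matrices for zero thickness and compares the result with Eq. (29-69); the function names are assumptions, and the test values (n = 1.500, radii of +10 cm and -5 cm) are borrowed from Example 29-9 purely for illustration.

```python
def matmul2(m2, m1):
    """Multiply the 2 by 2 matrix m2 into the 2 by 2 matrix m1."""
    (a2, b2), (c2, d2) = m2
    (a1, b1), (c1, d1) = m1
    return [
        [a2 * a1 + b2 * c1, a2 * b1 + b2 * d1],
        [c2 * a1 + d2 * c1, c2 * b1 + d2 * d1],
    ]


def thin_lens_from_surfaces(n, r1, r2):
    """R2 T21 R1 for a lens of index n in air with zero thickness."""
    R1 = [[1.0, 0.0], [(1.0 / n - 1.0) / r1, 1.0 / n]]   # air into glass
    R2 = [[1.0, 0.0], [(n - 1.0) / r2, n]]               # glass into air
    return matmul2(R2, R1)      # T21 is the unit matrix when the thickness is 0


def thin_lens_matrix(f):
    """The thin-lens matrix L of Eq. (29-69)."""
    return [[1.0, 0.0], [-1.0 / f, 1.0]]


n, r1, r2 = 1.500, 10.0, -5.0
f = 1.0 / ((n - 1.0) * (1.0 / r1 - 1.0 / r2))    # lens maker's formula

from_surfaces = thin_lens_from_surfaces(n, r1, r2)
from_focal_length = thin_lens_matrix(f)
print(from_surfaces)
print(from_focal_length)
```

The two printed matrices agree to within rounding, with the lower left element equal to -1/f.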


EXAMPLE  29-11 

A zoom lens is actually a system of lenses that has a variable lateral magnification.
Yet it produces an image at a fixed location behind the last component of the
system, if the object location is fixed. This allows the image to remain in focus on
film located at a fixed distance from the system. A simple system of variable
magnification, and only small variation in image location, is shown in Fig. 29-37. The
magnification is changed by changing the position of lens 2. All the lenses are thin and
their focal lengths are f1 = f3 = +10.00 cm, f2 = +20.00 cm.

a. For an object distance s = 100.0 cm, determine the image distance s' and the
lateral magnification l for t21 = 10.00 cm and t32 = 50.00 cm.

b. Do the same for t21 = t32 = 30.00 cm.

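■ a. Substituting the values given into Eqs. (29-69) and (29-70) and expressing all
distances in units of centimeters, you have

\[
\begin{bmatrix} 1 & 0 \\ -0.1000 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 50.00 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -0.05000 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 10.00 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -0.1000 & 1 \end{bmatrix}
= S
\]

After using the matrix multiplication program, you obtain

\[
\begin{bmatrix} -5.000 & 35.00 \\ 0.4000 & -3.000 \end{bmatrix}
= S =
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
\]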

Then using Eq. (29-63), you find the image distance to be

\[
s' = -\frac{as + b}{cs + d} = -\frac{(-5.000) \times 100.0 + 35.00}{0.4000 \times 100.0 + (-3.000)} = +12.57
\]

Fig. 29-37 A zoom lens. (Labels in the figure: object plane; lenses 1, 2, and 3,
all in air.)

Introducing the proper units and rounding off, you have the result

s' = +12.6 cm

The image is located this distance beyond lens 3.

According to Eq. (29-65), its lateral magnification is

\[
l = a + s'c = (-5.000) + 12.57 \times 0.4000 = +0.028
\]

The image is erect since the sign of l is positive.

Can you explain why the image is erect? The explanation involves the fact that
lens 2 intercepts the rays from lens 1 before they have come to a focus. In the
terminology introduced below Eq. (29-29), there is a virtual object for lens 2.
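b. Here you have

\[
\begin{bmatrix} 1 & 0 \\ -0.1000 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 30.00 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -0.05000 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 30.00 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -0.1000 & 1 \end{bmatrix}
= S
\]

Carrying out the matrix multiplications, you obtain

\[
\begin{bmatrix} -2.000 & 15.00 \\ 0.2000 & -2.000 \end{bmatrix}
= S =
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
\]

Now the image is at

\[
s' = -\frac{as + b}{cs + d} = -\frac{(-2.000) \times 100.0 + 15.00}{0.2000 \times 100.0 + (-2.000)} = +10.28
\]

or

s' = +10.3 cm

Its lateral magnification is

\[
l = a + s'c = (-2.000) + 10.28 \times 0.2000 = +0.056
\]

Why is the image erect in this case?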

Comparing  the  results  obtained  in  parts  a and  b,  you  will  see  that  there  is  only  a 
small  shift  in  the  image  distance  — little  more  than  2 cm  out  of  12  cm  — for  a 
factor-of-two  change  in  the  lateral  magnification.  The  shift  can  be  corrected  by 
giving  lens  3 a small  simultaneous  motion.  How  could  you  determine  the  change  in 
position  of  lens  3 with  respect  to  lens  1 which  keeps  the  image  at  the  same  distance 
from  lens  1 as  lens  2 moves  between  its  two  positions? 
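
Both configurations of the zoom lens can be evaluated with one small function, which also makes it easy to try other positions of lens 2. The sketch below is illustrative only; the function names and the default arguments are assumptions built from the values given in Example 29-11. Since it carries full precision, the part a magnification comes out as about +0.027 rather than the +0.028 obtained above from the rounded value s' = 12.57.

```python
def matmul2(m2, m1):
    """Multiply the 2 by 2 matrix m2 into the 2 by 2 matrix m1."""
    (a2, b2), (c2, d2) = m2
    (a1, b1), (c1, d1) = m1
    return [
        [a2 * a1 + b2 * c1, a2 * b1 + b2 * d1],
        [c2 * a1 + d2 * c1, c2 * b1 + d2 * d1],
    ]


def thin_lens(f):
    """Thin-lens matrix L of Eq. (29-69)."""
    return [[1.0, 0.0], [-1.0 / f, 1.0]]


def transmission(t):
    """Transmission matrix for a distance t."""
    return [[1.0, t], [0.0, 1.0]]


def zoom_image(t21, t32, s=100.0, f1=10.0, f2=20.0, f3=10.0):
    """Return (s', l) for the three-lens system of Eq. (29-70), in centimeters."""
    S = thin_lens(f1)
    for m in (transmission(t21), thin_lens(f2), transmission(t32), thin_lens(f3)):
        S = matmul2(m, S)                    # build L3 T32 L2 T21 L1
    (a, b), (c, d) = S
    s_prime = -(a * s + b) / (c * s + d)     # Eq. (29-63)
    return s_prime, a + s_prime * c          # Eq. (29-65)


s_prime_a, l_a = zoom_image(t21=10.00, t32=50.00)    # part a
s_prime_b, l_b = zoom_image(t21=30.00, t32=30.00)    # part b
print(round(s_prime_a, 2), round(l_a, 3))
print(round(s_prime_b, 2), round(l_b, 3))
```

Calling `zoom_image` with intermediate values of t21 and t32 shows how the image distance drifts as lens 2 moves, which is one way to start on the question posed at the end of the example.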


EXERCISES 

Group  A 

29-1.  Reversing  a thin  lens.  Imagine  turning  the  thin 
converging  lens  in  Fig.  29-13  so  that  the  incident  light  first 
strikes  its  concave  surface  instead  of  its  convex  surface. 
Use  the  lens  maker's  formula  to  calculate  its  focal  length  in 
this  position,  and  compare  your  results  with  those  ob- 
tained in  Example  29-1. 


29-2. Find the radii of curvature. The focal length of a
thin converging lens of the type shown in Fig. 29-13 is
24.0 cm. One radius of curvature is twice the other. The
index of refraction of the glass is 1.50. What are the radii
of curvature?

29-3. A very symmetrical thin lens. Both surfaces of a
thin converging lens have the same radius of curvature.
The index of refraction of the glass is precisely 3/2. Prove
that the focal length equals either radius of curvature.

29-4. f-number. The light-gathering ability of a lens is
proportional to the cross-sectional area of the lens
perpendicular to its axis. For a camera lens this property is
specified, indirectly, by a quantity called the f-number.
For example, an f/8 lens has a diameter equal to one-eighth
its focal length. If such a lens is stopped down by an
adjustable diaphragm to f/16 (so that its diameter is
effectively reduced to one-sixteenth its focal length), by what
factor must the exposure time (to take a particular picture
using a particular type of film) at f/8 be multiplied to give
the new correct exposure time?

29-5.  Find  the  focal  length.  A thin  converging  lens 
forms  an  image  50.0  cm  from  the  lens.  The  object  is  on 
the  other  side  of  the  lens  and  30.0  cm  from  it.  What  is  the 
focal  length  of  the  lens? 

29-6. Slide projector, I. The focal length of a slide projector
lens is 25.0 cm. The screen is 5.00 m from the lens.
What  is  the  lateral  magnification?  Assume  the  lens  is  thin. 

29-7.  Slide  projector,  II.  A slide  25  mm  by  35  mm  is  to 
be  projected  on  a screen  10.0  m from  a thin  lens  as  a 
50  cm  by  70  cm  image. 

a.  What  is  the  distance  of  the  slide  from  the  lens? 

b.  What  focal  length  lens  is  needed? 

29-8.  Enlarger.  The  thin  lens  in  an  enlarger  has  a 
10.0-cm  focal  length.  It  is  desired  to  make  a 10  cm  by 
14  cm  print  from  a 25  mm  by  35  mm  negative.  What  is 
the  distance  from  negative  to  lens  and  from  print  to  lens? 

29-9.  Ray  diagram  construction.  Construct  an  accurate 
ray  diagram  involving  the  diverging  thin  lens  in  Fig. 
29-21,  for  a situation  in  which  the  object  lies  closer  to  the 
lens than does its focal point. Then do the same for a situation
in which the object lies at the focal point.

29-10. How many diopters? I. The “strength,” or
“power,” of an eyeglass lens is proportional to the reciprocal
of its focal length and is expressed in diopters. By definition,
the strength of a lens in diopters is the reciprocal
of its focal length in meters, so that a lens of exactly 1/2 m
focal length has a strength of exactly 2 diopters. The
more converging a lens, the shorter its focal length and
the greater its strength in diopters. The direct proportionality
between strength and diopters makes the unit convenient.
For a diverging lens, the relations are the same except
that the quantities involved have negative values. If a
certain far-sighted person cannot see objects clearly when
they are closer than 50 cm, what should an optometrist
prescribe for the strength in diopters of a thin eyeglass
lens that will enable the person to read a book held at a
distance of 25 cm?

29-11. How many diopters? II. If a certain near-sighted
person cannot see objects clearly when they are
farther than 50 cm, what should an optometrist prescribe
for the strength in diopters (see Exercise 29-10) of a thin
eyeglass lens that will enable the person to see very distant
objects clearly?

29-12. Mirror, mirror, on the wall. In Fig. 29E-12, O is a
point source which radiates light, some of which falls on a
plane mirror M mounted on a wall. An observer sees a virtual
image of O at I, behind the mirror. The virtual image
can be located by extending backward any two reflected
rays, such as PQ and RS. Prove that the mirror is the perpendicular
bisector of IO.

Fig. 29E-12


29-13. What about mermaids? Explain why things appear
blurry to someone swimming underwater unless the
person  wears  watertight  goggles. 

29-14. Viewing the moon. The distance from the objective
to the eyepiece of a telescope used to view the moon is
100 cm. If the angular magnification is -9.00, what is the
focal  length  of  each  thin  lens? 

29-15. Photographing the moon. In order to take a photograph
of the moon with a telescope, the eyepiece can be
removed  and  a photographic  plate  inserted  at  the  focal 
point  of  the  thin  objective  lens.  If  the  focal  length  of  the 
objective  is  300  cm,  what  is  the  diameter  of  the  moon’s 
image  on  the  plate?  The  moon  subtends  an  angle  of  0.50°. 

29-16.  Matrix  multiplication,  I.  Let 


$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$


and  B = 


0“ 

1_ 


a.  Calculate  AB. 

b.  Calculate  BA. 

c. What property of matrix multiplication is illustrated
by parts a and b?


29-17.  Matrix  multiplication,  II.  Let 


$A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$


and  C = 


1 

1 


a.  Calculate  (AB)C. 

b.  Calculate  A(BC). 

c. What property of matrix multiplication is illustrated
by parts a and b?




Group  B 

29-18. The time is a minimum, not a maximum, I. Make
the calculation suggested in Sec. 29-2 to verify that the
light path found from the relation dt/dx = 0, and used to
derive the law of reflection, is a path on which the travel
time of light is a minimum instead of a maximum. Then
plot t versus x for the three pairs of values about which
you have quantitative information, and sketch in a qualitative
way the dependence of t on x in between these values.

29-19.  The  time  is  a minimum,  not  a maximum,  II.  Do  the 
same  as  in  Exercise  29-18  for  the  light  path  used  to  derive 
the  law  of  refraction. 

29-20.  Generality  of  the  lens  maker’s  formula.  Starting 
from  Eqs.  (29-5),  repeat  the  derivation  leading  to  the  lens 
maker’s formula, Eq. (29-19), for a thin lens with both surfaces
concave in the direction that light travels through
the system. Thereby confirm that Eq. (29-19) can be applied
to a lens of a type different from that of the one used
to  derive  it  in  the  text,  if  proper  attention  is  paid  to  the 
sign  convention  for  r. 

29-21.  Thin  lens  in  a liquid.  A converging  thin  lens  has 
a focal  length  of  20  cm. 

a.  What  is  its  focal  length  when  immersed  in  water? 
Take  the  index  of  refraction  for  glass  to  be  n = f and  for 
water  take  it  to  be  n = f. 

b.  What  is  the  focal  length  of  the  same  lens  when  it  is 
immersed  in  carbon  disulfide,  taking  its  index  of  refrac- 
tion to  be  n = t?  Comment  on  your  results. 

29-22. Focal length of a spherical mirror. The angles of
incidence and refraction in Snell’s law, $\theta$ and $\theta'$, are both
positive, since they both can be considered to be a result of
rotation from the normal in the counterclockwise sense.
Applying this convention to reflection instead of refraction,
you have a positive angle of incidence $\theta$ and a negative
angle of reflection $\theta'$, with $\theta' = -\theta$. In these terms,
Snell’s law can be extended to include reflection, providing
that $\sin\theta/\sin\theta' = \sin\theta/\sin(-\theta) = -\sin\theta/\sin\theta = -1$.
Thus you must take n = -1 for reflection. Use
this result in the lens maker’s formula of Eq. (29-19),
$1/f = (n - 1)(1/r_1 - 1/r_2)$, to calculate the focal length f of the
spherical mirror M shown in Fig. 29E-22. Its center of
curvature is at C and its focal point is at F. Take $r_1 = -\rho$,
where $\rho$ is the radius of curvature of the mirror, and delete
the term involving $r_2$ since only a “first” surface is involved.
The positive value you will obtain for f indicates
that the convention for the sign of the focal length for a
mirror differs from the convention for a lens. Describe
briefly what you would do to derive the result you obtain
from basic principles.

29-23.  Sharpening  your  image.  A thin  converging  lens 
forms an image on a screen which is 120 cm from the object.
Without changing the position of screen or object, it
is  found  that  a sharp,  smaller  image  is  formed  when  the 
lens  is  moved  20.0  cm  toward  the  screen.  What  is  the  focal 
length  of  the  lens? 

29-24. The nearest image. Prove that the shortest distance
between a real object and its real image is equal to
four times the focal length of the thin converging lens
producing the image. What are s and s' for this distance?

29-25.  Two  possible  object  distances.  An  object  and  a 
screen are 1.000 m apart. There are two different locations
of a thin converging lens of focal length 21.0 cm
which  allow  it  to  form  a sharp  image  on  the  screen.  For 
what  two  distances  between  the  lens  and  object  does  this 
occur?  Account  for  the  fact  that  two  object  distances  are 
possible. 

29-26. Sizing up an object. An object and screen are
separated by a constant distance. A converging lens forms
a sharp image of height $h_1'$ on the screen. The lens is
moved until a sharp image of height $h_2'$ is formed. (See Exercise
29-25.) Prove that the height of the object is equal to
$\sqrt{h_1' h_2'}$. This provides an optical method for measuring the
size of a nearby object, the exact distance to the object
being unknown.

29-27.  Eye  ring.  The  image  formed  by  a telescope’s 
eyepiece of the rim of its objective is called the eye ring.
Prove  that  the  telescope’s  angular  magnification  is  equal  to 
the  ratio  of  the  diameter  of  its  objective  to  the  diameter  of 
its  eye  ring. 

29-28.  Oil  immersion  microscope.  An  oil  immersion 
microscope  can  resolve  structures  smaller  than  the  more 
common  type.  The  specimen  is  covered  with  a drop  of  oil 
of  the  same  index  n as  the  objective,  and  the  objective  is 
immersed  in  the  oil. 

a. Explain why the size $\delta$ of the smallest resolvable
structure is given by the following modification of Eq.
(29-36): $\delta = 1.22\,\lambda/[2n \sin(\alpha/2)]$.

b. To make $\delta$ as small as possible, the half angle $\alpha/2$
subtended by the lens is made equal to 90°. Ultraviolet
radiation of short wavelength $\lambda$ is used. This requires photographing
the image. If $\lambda$ = 300 nm and n = 1.6, what is
the smallest resolvable structure?

29-29. Mind the signs. Starting from Eqs. (29-37), repeat
the derivation leading to the transformation equations
of Eqs. (29-40) and (29-41) for a surface which is
concave in the direction opposite to that which light travels
through the system. Thereby confirm that Eqs. (29-40)
and (29-41) can be applied to a situation different from
that used to derive them in the text, providing the sign
conventions for r, y, y', $\phi$, and $\phi'$ are followed.




29-30. Matrix multiplication, III. Carry out a matrix
multiplication verifying that Eq. (29-6), the single matrix
equation describing the transmission of a ray between two
vertex planes, is equivalent to Eqs. (29-4) and (29-5), the
two scalar equations describing the transmission.

29-31.  Using  the  thin-lens  matrix.  Consider  an  optical 
system  containing  a single  thin  lens. 

a. Use the thin-lens matrix of Eq. (29-69), and Eq.
(29-63), which relates s' to s and the elements of the
system matrix, to obtain the equation $1/s + 1/s' = 1/f$.
Note that if the relation $(n - 1)(1/r_1 - 1/r_2) = 1/f$ is considered
a definition of the quantity 1/f, then your calculation
can be considered to be an independent derivation of
Eq. (29-17), the basic equation of optics for thin lenses.

b. Use the thin-lens matrix to obtain the expression
$-s'/s$ for the lateral magnification produced by a thin lens.

29-32. Thin lenses in contact. One thin lens whose focal
length is $f_1$ is in contact with another of focal length $f_2$.

a. What is the system matrix for the combination?

b. Use this system matrix to prove that $1/f = 1/f_1 + 1/f_2$,
where f is the focal length of the combination.

29-33. Measuring the focal length of a diverging lens.
The measurement of the focal length of a thin converging
lens is simply a matter of obtaining a real image of a distant
object and measuring the image distance. A diverging
lens does not form a real image of a real object, so it is necessary
to proceed otherwise to obtain its focal length. This
exercise illustrates one method. A thin converging lens of
focal length 20.0 cm is placed in contact with a thin diverging
lens of unknown focal length. An image of the
sun is formed 50.0 cm from the combination. Use the results
of Exercise 29-32b to determine the focal length of
the diverging lens.

29-34. Ray diagram for a zoom lens. Using as much
accuracy as you can, make geometrical constructions
which trace rays from the object through the three thin
lenses of the zoom lens in the configuration considered in
part a of Example 29-11. Then do it again for the configuration
considered in part b. Use your ray diagrams to explain
why the image is erect in each case. Then compare
your results, and their accuracy, with the results obtained
in the example by employing the matrix method.

Group  C 

29-35. Fermat’s principle without calculus. Fermat’s
principle for reflection was known in classical times (150
B.C.) and was called Heron’s theorem. This exercise
shows how the principle was proved for reflection without
the aid of differential calculus. In Fig. 29E-35, M is the
mirror, A is the source of light, and B is the point at which
it is to be received after reflection. Let C be the point at
which AC and CB make equal angles with the mirror. Let
C' be any other point on the mirror. Prove that AC +
CB < AC' + C'B. Hint: Draw AD perpendicular to the
mirror and extend it until it meets BC extended at A'. Also
draw A'C'.


Fig.  29E-35 


29-36.  Wave  optics  derivation  of  lens  maker’s  formula. 
a. The curvature of a sphere is defined as $1/\rho$, where
$\rho$ is its radius. Show that $1/\rho = 2d/y^2$ when the quantities y
and d defined in Fig. 29E-36a are both very small compared
to $\rho$.


Fig. 29E-36

b. In Fig. 29E-36b, parallel paraxial rays arrive at a
thin lens. The wave front, when it first makes contact with
the lens, is ADL. Light travels more slowly in glass than in
air. Thus, while the wave front moves a distance DEG in
the lens, it moves a larger distance ABH in the air. Thus,
the wave front leaving the lens is HGM. Assume that this
is spherical so that it converges at F, the focal point, and
use the fact that the rays are paraxial to justify this assumption.
Let c be the speed of light in air and n the index
of refraction of the glass. Show that

$$\frac{AB + BH}{c} = \frac{DE + EG}{c/n}$$




c. Show that

$$\frac{DE}{c} + \frac{EG}{c} + \frac{GK}{c} = \frac{DE}{c/n} + \frac{EG}{c/n}$$

d. Multiply the equation in part c throughout by $2/y^2$
and arrive at the lens maker’s formula expressed in terms
of the radii of curvature of the two surfaces.

29-37. Measuring the curvature of a lens surface. The
radius of curvature of one surface of a thin lens, whose
focal length is known, can be measured by floating it in
mercury as in Fig. 29E-37. A position is found where
the image I is at the same distance from the lens as the
object O. This distance is measured. Call it s. Show that s',
given by $1/s + 1/s' = 1/f$, equals the radius of curvature
of the surface in contact with the mercury. The center of
curvature of this surface is at C in the figure, and the dotted
lines intersecting at C are extensions of the rays in the
lens, which are normal to the surface in contact with the
mercury.

Fig. 29E-37

29-38. Achromatic doublet. The index of refraction of
glass depends on the color of the light passing through it.
Hence the focal length of a lens depends on the color of
the light. Since white light is a mixture of many colors,
there will be many images of an object illuminated by
white light, and these will be in slightly different places,
resulting in a blurred image. The blurring can be minimized
by placing two lenses made of different glass in contact
with one another. Let $n_{rC}$, $n_{yC}$, and $n_{bC}$ be the indices
of refraction of crown glass for red, yellow, and blue light
respectively, and let $n_{rF}$, $n_{yF}$, and $n_{bF}$ be the corresponding
indices of refraction for flint glass. Let $K_C = (1/r_{1C} - 1/r_{2C})$
for the crown glass lens and $K_F = (1/r_{1F} - 1/r_{2F})$ for the flint
glass lens, where the quantities r give the radii of curvature
and directions of curvature of the lens surfaces just as
in Eq. (29-17).

a. Use the results of Exercise 29-32b, $1/f = 1/f_C + 1/f_F$,
for the combined focal length f of a thin crown glass
lens of focal length $f_C$ in contact with a thin flint glass lens
of focal length $f_F$ to find the condition that f have the same
value for red and blue light. Such a combination is sometimes
called an achromatic doublet.

b. Write the expressions for the focal length $f_C$ of the
crown glass lens for yellow light and for $f_F$, the corresponding
focal length of the flint glass lens.

c. Use the results of part b to eliminate $K_C$ and $K_F$
from the condition found in part a.

d. The quantity $(n_r - n_b)/(n_y - 1) = p$ is called the
dispersive power of the variety of glass. Express $f_C/f_F$ in
terms of $p_C$ and $p_F$.

e. For crown glass: $n_{rC}$ = 1.515, $n_{yC}$ = 1.517, and
$n_{bC}$ = 1.523. For flint glass: $n_{rF}$ = 1.644, $n_{yF}$ = 1.650, and
$n_{bF}$ = 1.664. Calculate $p_C$, $p_F$, and $f_C/f_F$.

f.  If  the  focal  length  of  the  crown  glass  lens  is 
+ 20.0  cm,  what  is  the  focal  length  of  the  flint  glass  lens 
that  makes  the  combination  achromatic?  Why  is  such  a 
combination  called  a converging-diverging  pair  in  the 
text? 

g.  What  is  the  focal  length  of  the  combination? 

29-39. A virtual object. A converging lens forms a real
image 50 cm to the right of the lens. Then a diverging lens
is introduced in the system 20 cm to the right of the converging
lens. It is found that this displaces the real image
formed by the system to the right of its previous location
through a distance of 15 cm. What is the focal length of
the diverging lens? Hint: The object viewed by the diverging
lens is a virtual object.

29-40.  Galilean  telescope.  A Galilean  telescope  consists 
of a converging objective lens of long focal length $f_o > 0$
and a diverging eyepiece lens of short focal length $f_e < 0$.
The two are separated by distance $f_o - |f_e|$, as in Fig.




29E-40. Parallel paraxial rays from a distant object would
be brought to a focus to form the image h at distance $f_o$
from the objective, but the eyepiece intervenes and h becomes
the virtual object for the eyepiece. This arrangement
is used in opera glasses because it is compact and
gives an erect image.

a. Show that the image formed by the eyepiece is at
infinity.

b. Show that the angular magnification of the telescope
is $-f_o/f_e$.

29-41. Strong focusing analogue, I. A set of four thin
lenses, 1, 2, 3, and 4, are arranged along a common axis,
with a 5.00-cm separation from the plane of one lens to
the plane of the next. Their focal lengths are $f_1$ = +20.00 cm,
$f_2$ = -20.00 cm, $f_3$ = +20.00 cm, and $f_4$ = -20.00 cm.
Use repeated applications of the thin-lens
formula, $1/s + 1/s' = 1/f$, to show that the net effect
of this set of two equal converging lenses and two
equally diverging lenses is to cause a beam of parallel
light incident upon it to converge, and find the point of
convergence. Can you explain from simple considerations
why the net effect is converging? This system is an optical
analogue illustrating a principle called strong focusing,
widely used in high-energy particle accelerators. Hint: In
nearly all the applications of the formula, the object
viewed by a lens is a virtual object.

29-42. Angular magnification by the matrix method. The
thick lens in Example 29-9 is used as a magnifying glass by
adjusting the object distance s so that the image distance s'
becomes infinite. Use Eq. (29-63) to find an expression for
the required value of s. Then substitute this value into Eq.
(29-60b), and find an expression for the slope $\phi'$ of rays
emitted from the tip of an object of height h above the axis
as these rays enter an eye located behind the lens. Compare
this to the slope of a ray emitted from the tip of an
object of height h as it enters an eye located 25 cm from
the object, with no lens between the object and the eye, obtained
from Eq. (29-67) with s = 25 cm. Thereby show
that the angular magnification $\phi'/\phi$ of the thick lens used
as a magnifying glass is given by the expression
$\phi'/\phi = -25c$, where all distances are measured in units of cm.
Use the value of the element c of the thick-lens system matrix
found in Example 29-9 to evaluate the angular magnification.
Then make the approximation that the thickness
of the lens can be neglected, so that its system matrix is
given by the thin-lens matrix of Eq. (29-69). Use this in the
expression you have derived for the angular magnification
to show that in the thin-lens approximation it becomes
$\phi'/\phi = 25/f$, where f is the focal length of the lens,
and all distances are measured in cm. Compare this with
the expression for the angular magnification obtained by
combining Eqs. (29-30) and (29-31). Use the value of f found in Example
29-9 to find the thin-lens approximation value of the
angular magnification, and compare it with the accurate
value you found earlier. Explain the qualitative relation
between these values from simple considerations.


Numerical 

29-43. Single thick lens. The values of r for the surfaces
of a lens are $r_1$ = +8.000 cm and $r_2$ = -8.000 cm,
the thickness of the lens at its axis is $t_{21}$ = 1.800 cm, and
the glass used to make the lens has an index of refraction
n = 1.500. The lens is in air. An object of height h =
+0.300 cm is located at an object distance s = 6.000 cm
(measured from the first surface of the lens). Use the matrix
method to determine the location, size, and orientation
of the image.

29-44.  Crystal  ball. 

a.  The  filament  of  a light  bulb  is  1.000  m from  the 
surface  of  a sphere  of  diameter  10.00  cm  made  from  glass 
of  index  of  refraction  1.500.  A stop  is  used  so  that  only 
paraxial  rays  pass  through  the  sphere.  Use  the  matrix 
method to determine the location of the image of the filament,
and its lateral magnification.

b.  The  glass  sphere  is  illuminated  with  sun  rays. 
Where  will  they  be  brought  to  a focus? 

29-45. Reversing a thick lens. The radii of curvature of
the surfaces of a converging lens are 5.000 cm and ∞. The
thickness of the lens from vertex to vertex is 1.000 cm.
The lens is made from glass with an index of refraction
equal to 1.630 and is in air. A parallel beam of paraxial
light strikes its curved surface first. Use the matrix
method to find the distance from the vertex of the flat surface
to the point at which the light rays come to a focus.
Then repeat the calculation, but with the incident beam
striking the flat surface first, and find the distance from
the vertex of the curved surface to the focal point. Compare
the results you obtain with those obtained in Example
29-2, where similar calculations were made for a
similar thin lens. Can you explain from first principles the
difference between the behaviors of the thin and thick
lenses when the direction of travel of light through the
lenses is reversed?

29-46.  Separated  thick  lenses.  Find  the  focal  point  for 
the  converging-diverging  pair  in  Fig.  29-36  when  the  two 
lenses  are  separated  by  0.3000  cm.  Compare  your  results 
with  those  obtained  in  Example  29-10,  where  the  two 
lenses  were  in  contact,  and  explain  the  difference. 

29-47.  Telephoto  lens.  Two  thin  lenses  of  focal  lengths 
+20.00 cm and -20.00 cm are separated by 10.00 cm. An
object  is  100.0  m to  the  left  of  the  converging  lens. 

a.  Use  the  matrix  method  to  determine  the  distance 
to the right of the diverging lens at which the image is located
and to determine its lateral magnification.

b.  Use  thin-lens  formulas  to  determine  the  distance 
to  the  right  of  a single  converging  thin  lens  at  which  an 
image  of  the  same  distant  object  would  be  located  if  the 
image  has  the  same  lateral  magnification  as  in  part  a. 

The  two-lens  arrangement  constitutes  a telephoto 
lens.  Since  the  distance  from  the  second  lens  to  the  film  is 
appreciably  less  than  it  would  be  if  a single  lens  is  used  to 
provide  the  same  lateral  magnification,  the  telephoto  lens 
has  the  advantage  of  compactness. 




29-48. Zoom lens. Find the image distance and lateral
magnification for the zoom lens in Fig. 29-37 when $t_{21}$ =
20.00 cm and $t_{32}$ = 40 cm.

29-49.  Perfecting  a zoom  lens.  Determine  the  change  in 
position  of  lens  3 with  respect  to  lens  1 in  the  zoom  lens  of 
Fig.  29-37,  which  keeps  the  image  at  the  same  distance 
from  lens  1 as  lens  2 moves  between  its  two  positions. 
Hint:  There  are  two  ways  you  can  do  this.  One  is  a purely 
numerical  trial-and-error  procedure.  The  other  is  to  write 
the  system  matrix  in  terms  of  a variable  which  specifies 
the position of lens 3 and then develop an algebraic equation
that determines the value of this variable for which
the  distance  from  lens  1 to  the  image  does  not  change 
when  lens  2 moves. 

29-50.  Strong  focusing  analogue,  II.  Use  the  matrix 
method  to  carry  out  the  analysis  in  Exercise  29-41  of  the 
set  of  two  equal  converging  and  two  equally  diverging  thin 
lenses. If you have done Exercise 29-41, compare the results
obtained there by making repeated applications of
the  thin-lens  formulas  with  those  obtained  here  by  using 
the  matrix  method.  Also  compare  the  effort  expended  in 
the  two  procedures. 




Particle-Wave Duality


30-1 THE QUANTUM DOMAIN

We have seen how the speed of light $c = 2.998 \times 10^{8}$ m/s is the basis for a
criterion which separates physics into domains. If the characteristic speed
in a system is comparable to c (a value that is very large by everyday standards),
then the system is in the relativistic domain. Otherwise, it is in the
nonrelativistic domain. There is another physical constant, just as fundamental
as c, that underlies the separation of physics into domains in another
way. This is Planck’s constant, $h = 6.626 \times 10^{-34}$ J·s (a value that is very
small by everyday standards). The separation to which it leads usually is determined
by the characteristic size of a system. On the scale of a system of
molecular or atomic size, or smaller, crucial quantities proportional to h are
of appreciable magnitude. Such microscopic systems are in what is called
the quantum domain. For systems of the size we deal with in the normal routine
of our lives, or larger, quantities in which Planck’s constant is a factor
are completely negligible in most circumstances. These macroscopic
systems are in the nonquantum domain. Briefly put, the quantum domain
is the one in which Planck’s constant is significant. In this chapter and the
next we complete our study of physics by entering the quantum domain to
investigate some of its most important and fascinating aspects. The purpose
of this section is to give an overview of them.

For macroscopic systems it is convenient to make a distinction between
particle motion and wave motion. Newtonian mechanics originated in the
study of the motion of particles (or of rigid bodies comprising sets of particles).
Later newtonian mechanics was found to be equally useful in
treating the motion of mechanical waves, even though waves are very different
from particles. A particle is localized, occupying a particular location
at each instant. A wave is not localized, since at each instant it necessarily
extends over a range of locations. Moreover, since waves tend to spread,
the range usually broadens with the passage of time. Because of this significant
difference, the first step in analyzing a macroscopic mechanical system
is to decide whether its key feature involves particle motion or wave motion.
In the first case, the system should be treated by direct application of
Newton’s laws. In the second case, Newton’s laws should be applied indirectly
by treating the system through the mechanical wave equation. It is
generally easy to make this decision for a macroscopic mechanical system.

After the laws of electromagnetism were developed, the separation
between particle motion and wave motion was carried over to electromagnetic
systems. An electron in J. J. Thomson’s charge-to-mass ratio measurement
was considered to be a particle moving in a combination of electric
and magnetic fields under the influence of the Lorentz force, and its motion
was treated by applying Newton’s laws to this force. Heinrich Hertz
showed, by means of diffraction experiments, that the radiation produced
by his spark-gap apparatus propagates like a wave. This radiation was
therefore described by the electromagnetic wave equation. Around 1900 it
was agreed that an electron is a particle because it moves like a particle. And
it was agreed that electromagnetic radiation is a wave because it moves like a
wave.

But  then  experimental  and  theoretical  physicists  began  very  carefully 
to  investigate  properties  of  electromagnetic  radiation  that  are  just  as  im- 
portant as  the  way  it  moves  from  where  it  is  emitted  to  where  it  is  absorbed. 
In  1900  Max  Planck  analyzed  measurements  of  the  emission  of  radiation 
by  a hot  surface  in  the  infrared  and  visible  ranges  of  the  electromagnetic 
spectrum ($\nu \sim 10^{14}$ Hz). Planck’s analysis was later reinterpreted by Albert
Einstein,  and  others,  who  found  that  the  measurements  could  be  ac- 
counted for  by  assuming  that  the  radiation  exhibits  particlelike  aspects  in 
the  emission  process.  According  to  our  present  view,  radiation  is  not 
emitted  in  a continuous  stream,  like  water  flowing  from  a large-diameter 
pipe.  Instead  it  is  emitted  in  discrete  packets,  like  individual  droplets  of 
water  spraying  from  the  nozzle  of  a garden  hose.  The  packets  of  radiation 
have subsequently come to be called photons. For radiation of frequency $\nu$,
the amount of radiant energy contained in one photon is $h\nu$, where h is
Planck’s constant.

In  1905  Einstein  concluded  that  this  “granularity”  in  electromagnetic 
radiation is also present in the absorption process. He did this by interpreting
experiments involving the absorption on metal surfaces of radiation
in the form of ultraviolet light ($\nu \sim 10^{15}$ Hz). Thus the work of Planck,
Einstein,  and  more  recent  investigators  leads  to  the  conclusion  that  both 
when  it  is  emitted  and  when  it  is  absorbed,  electromagnetic  radiation  is  lumped 
into  localized  and  therefore  particlelike  objects,  the  photons.  Individual 
photons  are  difficult  to  detect  unless  they  are  few  and  far  between,  just  as 
individual  droplets  of  water  in  a spray  are  difficult  to  detect  unless  their 
number  is  small  enough  and  their  separation  large  enough.  But  even  when 
photons  are  not  directly  detected,  they  can  make  their  presence  evident,  if 
their  energies  hv  are  comparable  to  the  other  energies  of  importance  in  a 
system.  This  is  the  case  for  systems  in  the  quantum  domain. 
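
To get a feeling for the size of the photon energies involved, the relation E = hν can be evaluated for the two frequency scales quoted above. The following is a minimal sketch, not part of the original discussion; the numerical values of Planck's constant and the electron charge are assumed modern values, and the function names are purely illustrative.

```python
# Photon energy E = h*nu for the two frequency scales quoted in the text.
H_PLANCK = 6.626e-34   # Planck's constant in J*s (assumed modern value)
E_CHARGE = 1.602e-19   # magnitude of the electron charge in C, so 1 eV = 1.602e-19 J

def photon_energy_joules(nu_hz):
    """Energy of one photon of frequency nu_hz, from E = h*nu."""
    return H_PLANCK * nu_hz

def joules_to_ev(energy_j):
    """Convert an energy in joules to electron volts."""
    return energy_j / E_CHARGE

for label, nu in [("infrared/visible", 1e14), ("ultraviolet", 1e15)]:
    e_j = photon_energy_joules(nu)
    print(f"{label}: nu = {nu:.0e} Hz, E = {e_j:.2e} J = {joules_to_ev(e_j):.2f} eV")
```

Both energies come out at the scale of an electron volt or less, which is why photons make their presence evident only in systems whose other important energies are comparably small.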

In  1923  Arthur  H.  Compton  performed  and  analyzed  experiments  in- 
volving X rays  which  led  him  to  the  conclusion  that  electromagnetic  radia- 
tion is  also  particlelike  when  it  is  scattered  by  matter.  Thus  it  appears  that 
radiation  is  particlelike  when  it  interacts  with  matter  in  any  way. 

Yet all the diffraction experiments show conclusively that electromag-
netic  radiation  is  wavelike  as  it  moves  from  one  location  to  another — say, 
from  its  place  of  emission  at  a light  source,  through  a diffraction  apparatus, 
to  the  place  where  it  is  absorbed  on  a photographic  film  used  to  obtain  the 
diffraction  pattern.  Radiation  is  not  purely  particlelike  because  its  propaga- 
tion is  wavelike.  Nor  is  it  purely  wavelike  because  its  interaction  with  matter  is 
particlelike.  Electromagnetic  radiation  simply  refuses  to  fit  neatly  into  either 
one  of  these  artificial  categories.  Instead,  radiation  has  the  dual  character- 
istics of  a particle-wave. 


In  1924  Louis  de  Broglie  postulated  that  particle-wave  duality  applies  to 
matter  as  well  as  radiation,  so  that  it  embraces  both  of  the  two  fundamental 
constituents  of  the  universe.  His  argument  was  simple:  Radiation  had  been 
thought  for  many  years  to  be  purely  wavelike,  but  had  recently  been  found 
also  to  possess  particlelike  properties.  Perhaps  material  entities,  which  had 
been  thought  to  be  purely  particlelike,  could  also  have  attributes  that  are 
wavelike.  Several  years  later  Davisson  and  Germer  discovered  experi- 
mentally that  de  Broglie’s  idea  was  correct.  They  found  striking  diffraction 
effects  in  the  motion  of  electrons  through  a system  containing  a micro- 
scopic diffraction  grating.  Electrons  can  be  diffracted,  and  diffraction  is  a phe- 
nomenon of  waves! 

This  crucial  experimental  result  is  not  in  conflict  with  earlier  measure- 
ments indicating  that  electrons  move  through  macroscopic  systems  like 
particles.  Wavelike  aspects  in  the  motion  of  electrons  are  observable  only 
when  the  circumstances  are  favorable,  that  is,  when  the  electrons  move 
through  a system  whose  characteristic  dimension  is  microscopic.  Further- 
more, the  same  is  true  for  electromagnetic  radiation.  Wavelike  motion  of 
radiation  is  seen  only  in  favorable  circumstances.  Light  appears  to  travel 
along  rays  (in  other  words,  along  well-defined  paths  like  billiard  balls) 
obeying  the  laws  of  ray  optics,  unless  a diffraction  apparatus  with  suffi- 
ciently small  slits  is  used  to  investigate  its  motion.  Only  if  such  an  apparatus 
is used will it become apparent that the motion of light really obeys the laws
of  wave  optics  and  that  ray  optics  is  just  a special  case  which  applies  to  large 
systems. 

In  this  chapter  we  study  all  these  basic  facts  of  the  quantum  domain. 
The  chapter  concludes  by  considering  the  connection  between  them  and 
the famous uncertainty principles set forth by Werner Heisenberg. The uncertainty
principles place fundamental limits on the ability of physicists to
measure the very quantities with which physics deals. Some of the far-reaching
consequences of the uncertainty principles are also discussed.


The quantum domain takes its name from the phenomenon of quantization.
There are several types of quantization, but the most important is energy
quantization, the topic of Chap. 31. For an example, consider a box
filled with electromagnetic radiation, say a red-hot oven. The energy contained
in the radiation of a particular frequency $\nu$ can be 0 if no photons
are present, or $h\nu$ if one is present, or $2h\nu$ if two are, and so forth. But it
cannot have values other than some integer times the energy $h\nu$ of a photon
of frequency $\nu$ (that is, a photon of radiation having frequency $\nu$). The
reason is that there is no such thing as part of a photon any more than
there can be a fraction of an electron. It is said that the energy in the radiation
of frequency $\nu$ is quantized to have only the discrete values just quoted,
instead of having a continuous range of values.
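
The rule that the energy at a given frequency must be a whole number of photon energies can be expressed in a few lines. The sketch below is illustrative only; the frequency 4 × 10^14 Hz (roughly that of red light) and the value of Planck's constant are assumptions, and the helper names are hypothetical.

```python
H_PLANCK = 6.626e-34  # Planck's constant in J*s (assumed modern value)

def allowed_energies(nu_hz, n_max):
    """Energies 0, h*nu, 2*h*nu, ... for 0 through n_max photons of frequency nu_hz."""
    return [n * H_PLANCK * nu_hz for n in range(n_max + 1)]

def is_allowed(energy_j, nu_hz, rel_tol=1e-9):
    """True if energy_j is, within rounding, a whole number of photon energies h*nu."""
    quantum = H_PLANCK * nu_hz
    n = round(energy_j / quantum)
    return n >= 0 and abs(energy_j - n * quantum) <= rel_tol * quantum

nu_red = 4e14  # Hz, assumed frequency for illustration
print(allowed_energies(nu_red, 3))
print(is_allowed(3 * H_PLANCK * nu_red, nu_red))    # three photons: allowed
print(is_allowed(2.5 * H_PLANCK * nu_red, nu_red))  # two and a half photons: not allowed
```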




A box  containing  radiation  is  not  the  simplest  example  of  energy  quan- 
tization, since  photons  of  certain  frequencies  are  constantly  absorbed  on  its 
walls  and  then  reemitted  as  photons  of  different  frequencies.  So  photons  of 
a wide  range  of  different  frequencies  are  in  the  box,  and  the  number  of 
each  frequency  present  fluctuates  from  instant  to  instant. 

Less  complicated  examples  of  energy  quantization  are  found  in 
systems  containing  matter  where  the  same  number  of  particles  is  always 
present.  An  example  is  an  isolated  hydrogen  atom  where  there  is  always 
one  electron  bound  to  one  proton.  Such  systems  are  treated  in  Chap.  31. 
We  will  find  that  their  energy  is  quantized  because  the  wave  associated  with 
a material  entity  bound  in  a system  must  be  a standing  wave  satisfying  cer- 
tain conditions  at  the  boundary  of  the  system.  These  boundary  conditions 
are  reminiscent  of  the  ones  imposed  by  the  rigid  supports  at  both  ends  of  a 
stretched  string.  They  cause  the  string  to  have  a certain  set  of  possible 
modes  of  motion,  with  a different  value  of  oscillation  frequency  v in  each 
mode. 

We  will  analyze  energy  quantization  in  three  microscopic  systems.  Two 
are  special  cases  in  which  an  analysis  can  be  based  directly  on  de  Broglie’s 
idea.  After  treating  these  systems,  we  use  the  de  Broglie  idea  to  obtain  the 
Schrödinger equation. This equation makes it possible to study energy quantization,
and other key aspects of the behavior of matter, in any microscopic
material system. It is the basic equation in quantum mechanics, the mechanics
of the quantum domain. We will obtain numerical solutions to the
Schrödinger equation for the harmonic oscillator, a system of fundamental
importance  in  the  quantum  domain. 


30-2 THE EMISSION AND ABSORPTION OF PHOTONS

Exploration of the quantum domain began in earnest with Planck’s theory
of the emission by hot bodies of electromagnetic radiation in the thermal
range of the spectrum (that is, in the infrared and visible ranges). For an
example  of  this  thermal  radiation,  imagine  that  you  heat  the  end  of  an  ini- 
tially cold  iron  poker  to  successively  higher  temperatures  in  a fire,  periodi- 
cally withdrawing  it  long  enough  to  note  its  properties.  While  the  poker  is 
at  a relatively  low  temperature,  you  can  feel,  but  not  see,  the  thermal  radia- 
tion it  emits.  As  its  temperature  increases,  the  amount  of  thermal  radiation 
emitted  by  the  poker  increases  very  rapidly.  Furthermore,  it  becomes 
visible, with the poker first emitting light of a dull red color, then of a bright
red  color,  and  finally  of  a blue-white  color.  When  the  heated  end  of  the 
poker  is  at  a particular  temperature,  it  emits  thermal  radiation  with  a par- 
ticular spectrum  of  frequencies.  Accurate  measurements  made  in  a con- 
trolled situation  show  that  the  total  amount  of  energy  radiated  over  all  fre- 
quencies of  the  thermal  spectrum  is  proportional  to  the  fourth  power  of 
the  absolute  temperature  of  the  body  emitting  the  radiation.  The  measure- 
ments show  also  that  the  frequency  at  which  the  thermal  radiation  is  most 
intense is proportional to the first power of the absolute temperature of the
body. 
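
The two proportionalities just stated can be turned into simple ratios without knowing any constants. The following sketch assumes only the stated fourth-power and first-power dependences; the temperatures 900 K and 1800 K are hypothetical values chosen for illustration.

```python
def total_power_ratio(t_new_k, t_old_k):
    """Ratio of total radiated energy per unit time, using the fourth-power law."""
    return (t_new_k / t_old_k) ** 4

def peak_frequency_ratio(t_new_k, t_old_k):
    """Ratio of the frequencies of maximum intensity, using the first-power law."""
    return t_new_k / t_old_k

t_cool, t_hot = 900.0, 1800.0  # absolute temperatures in K (hypothetical)
print("total radiated power increases by a factor of", total_power_ratio(t_hot, t_cool))
print("frequency of maximum intensity increases by a factor of", peak_frequency_ratio(t_hot, t_cool))
```

Doubling the absolute temperature thus raises the radiated power sixteenfold while only doubling the frequency of maximum intensity, which is consistent with the poker glowing much more strongly as its color shifts from red toward blue-white.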

In  his  theory  of  the  emission  of  thermal  radiation,  Planck  pictured  the 
surface  of  a substance  as  containing  a collection  of  microscopic  charged 
harmonic  oscillators.  Specifically,  he  said  that  the  surface  contains  electrons 
bound  to  fixed  points  by  forces  obeying  Hooke’s  law.  (This  is  the  same 
model  as  the  one  pictured  in  Fig.  27-15  and  used  to  calculate  the  mo- 
mentum delivered  to  the  surface  of  a substance  when  it  absorbs  electro- 
magnetic radiation.) In their thermal motion, the electrons execute
harmonic  oscillation  about  the  points  to  which  they  are  bound — the  ampli- 
tude of  the  oscillation  increasing  with  increasing  temperature.  These  accel- 
erating charged  particles  emit  electromagnetic  radiation  with  the  same  fre- 
quency as  their  oscillation  frequency,  just  as  if  each  were  in  a microscopic 
dipole  transmitting  antenna.  The  properties  of  this  thermal  radiation 
emitted  by  a hot  surface  depend  on  the  properties  of  the  harmonic  oscil- 
lators that  emit  it.  Planck  found  that  in  order  to  obtain  agreement  between 
the  theoretical  predictions  and  the  experimental  measurements,  he  had  to 
assume that the total energy E of a harmonic oscillator whose oscillation
frequency is $\nu$ can have only one of the values $E = 0, h\nu, 2h\nu, 3h\nu, \ldots$,
where  h is  the  constant  now  called  Planck’s  constant.  In  other  words,  Planck 
found  that  the  total  energy  of  a harmonic  oscillator  is  quantized. 

The  physical  and  mathematical  arguments  entering  Planck’s  theory  of 
the  emission  of  electromagnetic  radiation  in  the  thermal  range  of  the  spec- 
trum are  too  complicated  to  allow  us  to  do  more  than  give  the  very  brief 
description  contained  in  the  preceding  paragraph.  So  instead  of  using 
Planck's  theory  to  introduce  the  quantum  domain,  we  deviate  from  the  his- 
torical ordering  and  use  Einstein’s  theory  of  the  absorption  of  electromag- 
netic radiation  in  the  ultraviolet  range  of  the  spectrum.  It  will  lead  us 
where  we  want  to  go  by  a route  that  is  easier  to  follow. 


Fig. 30-1 A photocell and associated
electric circuit. The battery, reversing
switch, and adjustable-resistance "voltage
divider" arrangement allow an
electric potential difference of variable
sign and magnitude to be applied across
the insulated terminals C and D carrying
the leads from electrodes A and B
through the body of the cell. The blades
of the reversing switch are shown in the
position which makes the potential of C
with respect to D be positive. If A and B
are made of the same metal, the potential
V of B with respect to A is just the
potential of C with respect to D, and it
may be read by connecting a voltmeter
between C and D. Otherwise, V is the
voltmeter reading plus a correction for
what is called the contact potential
acting between the two metals. The correction
is explained more fully later in
this section.


During  the  course  of  his  work  with  spark-gap  radio  transmitters,  Hertz 
discovered  that  shining  ultraviolet  light  on  the  electrodes  of  a spark  gap 
facilitates  the  production  of  a spark.  Other  investigators  tried  to  find  out 
why  this  was  so  and  soon  were  able  to  show  that  it  is  due  to  electrons  liber- 
ated from  the  electrode  surfaces  when  they  absorb  the  light.  Once  free, 
these  electrons  can  initiate  the  electric  discharge  that  comprises  a spark. 
The  electrons  emitted  from  a surface  when  it  absorbs  light  are  called  pho- 
toelectrons, and  the  phenomenon  is  called  the  photoelectric  effect. 

The  type  of  apparatus  used  to  investigate  the  photoelectric  effect  is 
shown  in  Fig.  30-1.  Ultraviolet  light  passes  through  a quartz  plate  covering 
one  end  of  an  evacuated  photocell  and  enters  the  cell.  (Quartz  is  used  be- 
cause ordinary  glass  absorbs  strongly  at  ultraviolet  frequencies.)  The  light 
goes  through  a hole  in  electrode  B to  strike  electrode  A,  which  is  made  of 
the  material  under  study.  The  photoelectrons  ejected  from  A can  be  at- 
tracted to  electrode  B by  putting  B at  a positive  electric  potential  V with 
respect  to  A.  The  passage  of  these  electrons  completes  the  circuit  ABCDA, 
and  a current  i is  read  by  the  ammeter.  By  using  light  of  known  frequency 
v,  measurements  are  made  of  i versus  V.  Typical  results  are  indicated  in 
Fig.  30-2  for  a certain  intensity  of  the  incident  light  (curve  1)  and  for  twice 
that  intensity  (curve  2). 

Even  when  V is  zero,  some  photoelectrons  emitted  from  electrode  A 
are  collected  by  electrode  B and  produce  a small  current  i.  As  the  potential 
V becomes  positive,  more  photoelectrons  are  collected  and  i increases.  At 
the  value  of  V where  the  curve  of  i versus  V becomes  flat,  the  potential  is 
large  enough  that  all  the  electrons  emitted  from  A are  collected  by  B.  The 
corresponding  value  of  i is  called  the  saturation  current.  Doubling  the  in- 
tensity of  the  incident  light  causes  the  saturation  current  to  double  from 
$i_1$ to $i_2$.

Making  the  potential  V of  the  electrode  B negative  relative  to  electrode 
A reduces  the  current,  because  then  B repels  the  negatively  charged  elec- 
trons. But  the  current  i does  not  drop  to  zero  until  V reaches  a negative 
value whose magnitude is called the stopping potential $V_{\text{stop}}$. The evidence
is interpreted to mean that photoelectrons are emitted from the illuminated
electrode with kinetic energies covering a certain range of values.

Fig. 30-2 Results obtained by measuring the current i reaching the
electron-collecting electrode B in a photocell for different values of V,
the potential of that electrode with respect to the electron-emitting
electrode A. Curve 1 is obtained by shining light of a certain intensity
on the emitting electrode; curve 2 is obtained when the light intensity
is doubled.

Fig. 30-3 The stopping potential versus the frequency of the incident
light, for a photocell with an emitting
electrode made of sodium. The data are
those obtained by Millikan, except that
the additive correction to $V_{\text{stop}}$ has been
recalculated by using a more recently
measured value of the contact potential.
[Horizontal axis: $\nu$ in units of $10^{14}$ Hz, from 0 to 14.]

The maximum kinetic energy $K_{\max}$ is

$$K_{\max} = eV_{\text{stop}} \tag{30-1}$$


where e is the magnitude of the electron charge. When the stopping potential
is applied between electrodes A and B, an electron of charge $-e$ moves
through a potential difference $-V_{\text{stop}}$ in traveling from one electrode to the
other, and the system thereby gains potential energy $eV_{\text{stop}}$. This potential
energy  gain  must  be  at  the  expense  of  the  kinetic  energy  which  the  elec- 
tron has  when  leaving  A.  So  if  the  most  energetic  electron  emitted  from 
electrode  A has  the  kinetic  energy  specified  by  Eq.  (30-1),  that  electron  can 
just  make  it  to  electrode  B.  As  the  potential  difference  between  the  elec- 
trodes is  made  less  and  less  negative — that  is,  approaches  zero  — more  and 
more  electrons  have  enough  kinetic  energy  on  leaving  A to  allow  them  to 
arrive  at  B,  since  the  gain  in  potential  energy  associated  with  moving  from 
A to  B is  being  reduced. 
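
The energy bookkeeping in this argument can be illustrated with a short sketch. It assumes, for simplicity, that every electron heads straight for electrode B, so that only its kinetic energy matters; the list of emission energies is hypothetical and is chosen so that the largest one corresponds to a stopping potential of 2.0 V.

```python
E_CHARGE = 1.602e-19  # magnitude of the electron charge in C (assumed modern value)

def k_max_joules(v_stop_volts):
    """Maximum photoelectron kinetic energy from Eq. (30-1), K_max = e * V_stop."""
    return E_CHARGE * v_stop_volts

def reaches_collector(kinetic_ev, retarding_volts):
    """True if an electron leaving A with kinetic_ev (in eV) can reach B.

    Crossing a retarding potential of magnitude V costs e*V of kinetic energy,
    which is numerically V when the energy is expressed in electron volts.
    """
    return kinetic_ev >= retarding_volts

# Hypothetical emission energies in eV; the largest one sets V_stop = 2.0 V.
sample_energies_ev = [0.2, 0.8, 1.5, 2.0]

def collected_count(retarding_volts):
    """Number of sample electrons energetic enough to arrive at B."""
    return sum(reaches_collector(k, retarding_volts) for k in sample_energies_ev)

for v in (0.0, 1.0, 2.0, 2.5):
    print(f"retarding potential {v:.1f} V: {collected_count(v)} of "
          f"{len(sample_energies_ev)} electrons reach B")
```

As the retarding potential is reduced from the stopping value toward zero, more of the sample electrons arrive at B, which mirrors the rise of the current in Fig. 30-2.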

The role of the light frequency $\nu$ in the photoelectric effect was studied
by shining ultraviolet light of various frequencies into a photocell. It
was found that when the value of $\nu$ is lower than a certain cutoff frequency
$\nu_{\text{cutoff}}$, no photoelectrons at all are liberated from the material. That is, the
photoelectric effect commences at $\nu = \nu_{\text{cutoff}}$, and there the value of the
stopping potential is zero. With increasing $\nu$ the value of $V_{\text{stop}}$ increases in
proportion to the difference $\nu - \nu_{\text{cutoff}}$. Also, the measurements indicated
that $\nu_{\text{cutoff}}$ depends on the nature of the surface emitting the photoelectrons.
Minute quantities of such surface contaminants as oxides cause great
difficulties  in  the  experiments.  Clearly  reproducible  measurements  of  Vstop 
versus  v for  a relatively  clean  surface  were  not  available  until  R.  A.  Mil- 
likan’s work  in  1914.  Figure  30-3  is  adapted  from  a figure  he  published  re- 
porting photoelectric  measurements  on  a sodium  surface. 

The  observed  properties  of  the  photoelectric  effect  are  in  serious  con- 
flict with  the  predictions  of  a theory  in  which  electromagnetic  radiation  acts 
like  a wave  in  the  process  of  ejecting  photoelectrons  from  a surface  that  it 
strikes.  In  such  a wave  theory  of  the  photoelectric  effect,  the  electric  field  in 
the  wave  must  increase  with  increasing  intensity  of  the  incident  radiation. 
The force exerted on the electrons in the surface, being proportional to
this  electric  field,  should  also  increase.  This  would  accelerate  them  more, 
and  so  it  follows  that  the  kinetic  energy  acquired  by  the  electrons  from  the 
radiation  striking  them  should  increase  as  the  radiation  intensity  increases. 
But you can see from Fig. 30-2 that $K_{\max} = eV_{\text{stop}}$ has the same value in both
the  measurements  of  curve  1 and  those  of  curve  2 in  which  the  intensity 
was  doubled. 

A second  conflict  with  a wave  theory  is  found  in  the  very  existence  of 
an observed cutoff frequency $\nu_{\text{cutoff}}$. In the wave theory, incident radiation
of any frequency $\nu$ should be able to supply enough energy to electrons at
the surface of the material absorbing the radiation to make it possible for
them to escape. It should only be necessary that the radiation be sufficiently
intense. But the measurements show that for $\nu$ less than $\nu_{\text{cutoff}}$ no
electrons  at  all  are  ejected  from  the  surface,  no  matter  how  intense  the 
radiation. 

A third  conflict  is  found  in  the  following  considerations.  In  a wave 
theory,  the  energy  flux  in  the  incident  electromagnetic  radiation  is  uni- 
formly spread  over  the  wave  fronts.  An  atom  in  the  surface  exposed  to  the 
radiation  can  be  expected  to  absorb  no  more  of  this  energy  flux  than  the 
amount  the  atom  intercepts.  In  the  most  favorable  case,  all  the  energy  ab- 
sorbed by  an  atom  might  be  deposited  on  a single  electron.  After  the  elec- 
tron had  accumulated  the  required  amount  of  energy,  it  could  escape.  But 
the  energy  flowing  onto  an  atom  would  be  proportional  to  the  surface  area 
of  the  atom,  and  this  is  extremely  small.  So  the  rate  at  which  it  can  absorb 
energy  is  very  low  if  the  energy  is  distributed  uniformly  over  the  incident 
wave  fronts,  and  an  appreciable  amount  of  time  would  be  required  for  an 
electron  to  accumulate  enough  energy  to  be  able  to  escape  the  surface.  An 
estimate  shows  the  time  to  be  about  1 s for  the  case  of  a 10-W  light  source 
placed  1 m from  a sodium  surface.  Thus  the  wave  theory  predicts  that  for 
the  first  second  after  the  source  is  turned  on  there  would  be  no  photoelec- 
trons emitted  from  the  surface,  and  then  there  would  be  a flood  of  photo- 
electrons that  had  all  accumulated  the  required  energy  at  about  the  same 
time.  This  is  not  what  is  observed  in  the  experiments.  Instead,  it  appears 
that  some  photoelectrons  are  ejected  by  the  incident  radiation  as  soon  as  it 
begins  to  shine  on  the  surface.  Experiments  with  time  resolution  better 
than $10^{-9}$ s have been performed to show that there is no measurable time
lag  between  the  instant  the  radiation  begins  to  illuminate  the  surface  and 
the  instant  the  first  photoelectron  is  ejected. 
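
The size of the time lag predicted by a wave theory can be estimated numerically. The sketch below is only an order-of-magnitude illustration under assumptions that are not taken from the text: the source radiates isotropically, all the energy falling on an effective circular area of radius 3 × 10^-10 m is handed to one electron, and the energy needed for escape is 2.3 eV. The result scales inversely with the assumed area, so a different choice of radius changes it considerably.

```python
import math

E_CHARGE = 1.602e-19  # J per eV (magnitude of the electron charge in C)

def classical_time_lag(source_power_w, distance_m, atom_radius_m, escape_energy_ev):
    """Time for one atom to collect escape_energy_ev from a uniformly spread wave."""
    # Energy flux at the surface for an isotropic source, in W/m^2.
    intensity = source_power_w / (4.0 * math.pi * distance_m ** 2)
    # Area over which a single atom is assumed to intercept the wave, in m^2.
    absorbing_area = math.pi * atom_radius_m ** 2
    absorbed_power = intensity * absorbing_area  # W
    return escape_energy_ev * E_CHARGE / absorbed_power

# 10-W source at 1 m, as in the text; radius and escape energy are assumptions.
t_lag = classical_time_lag(10.0, 1.0, 3e-10, 2.3)
print(f"classical time lag: about {t_lag:.1f} s")
print(f"ratio to a 1e-9 s time resolution: {t_lag / 1e-9:.1e}")
```

With these assumed values the estimate is of the order of a second, many orders of magnitude longer than the experimental upper limit on any delay.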


In  1905  Einstein  proposed  a theory  of  the  photoelectric  effect  which 
agreed  with  the  results  of  the  few  crude  measurements  then  available  and 
subsequently  proved  to  be  in  agreement  with  the  accurate  measurements 
that  Millikan  and  others  made  later.  His  key  idea  was  suggested  by  Planck’s 
conclusion  concerning  the  quantization  of  the  total  energy  of  a charged 
harmonic oscillator oscillating at frequency $\nu$ in the surface of a body and
emitting electromagnetic radiation at the same frequency. According to
Planck, the total energy of the oscillator can have only one of the values
$E = 0, h\nu, 2h\nu, 3h\nu, \ldots$, where h is a constant. This implies that when an
oscillator changes its total energy from one value to the next lower value,
the radiation emitted carries energy $h\nu$, equal to the energy lost by the oscillator.
Einstein assumed that the process occurs abruptly, so that a burst of
electromagnetic radiation is emitted by the oscillator. Einstein also assumed
that the burst of radiation, having frequency $\nu$ and energy $h\nu$, is initially
localized  in  a small  region  of  space.  Furthermore,  he  assumed  that  this 
bundle  of  radiation  remains  localized  as  it  moves  away  from  its  source,  in- 
stead of  spreading  out  in  the  manner  characteristic  of  mechanical  waves 
moving in three dimensions. Finally, Einstein assumed that in the photo-
electric  effect  each  bundle  of  radiation  striking  the  absorbing  surface 
deposits  its  entire  energy  content  hv  on  a single  electron  of  the  surface. 

To  summarize  the  key  points,  Einstein  said  that  when  radiation  in- 
teracts with  matter  as  it  is  being  absorbed,  its  energy  is  not  uniformly 
spread  over  wave  fronts.  Instead  the  radiant  energy  being  absorbed  is  lumped 
into small bundles, now called photons. Each photon has total energy

$$E = h\nu \tag{30-2}$$

where h is a constant, now called Planck's constant, and $\nu$ is the frequency of the
radiation. Einstein also said that in the photoelectric effect each photon deposits its
entire energy on some electron in the absorbing surface.


Einstein’s  theory  eliminates  immediately  the  problem  of  the  predicted 
but  unobserved  time  lag.  At  the  instant  that  light  from  even  a very  weak 
source  first  arrives  at  an  absorbing  surface,  a photon  will  strike  the  surface 
somewhere  and  will  be  completely  absorbed  by  an  electron  at  that  location. 
Providing  the  energy  E of  the  photon  exceeds  the  energy  W which  the  elec- 
tron must  expend  in  work  against  the  force  binding  it  to  the  surface,  the 
electron  will  have  enough  energy  to  escape  and  contribute  immediately  to 
the  measured  photoelectric  current. 

After  an  electron  escapes  the  surface,  its  kinetic  energy  K will  be 

K = E - W 


where  E is  the  energy  received  from  the  photon  and  W is  the  energy  ex- 
pended in  escaping.  The  energy  W is  called  the  binding  energy  of  the  elec- 
tron. Its  value  is  not  the  same  for  all  electrons  since  some  are  more  tightly 
bound  to  the  material  comprising  the  surface  than  others.  But  there  is  a 
minimum value of W, called the work function $W_{\min}$, which represents the
smallest  energy  required  to  remove  an  electron  from  the  surface  of  a given 
material.  The  value  of  the  work  function  depends  on  the  nature  of  the 
material  (and  on  the  cleanliness  of  its  surface).  Thus  there  is  a maximum 
kinetic  energy  for  the  electrons  that  have  escaped  from  the  surface.  It  is 

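$$K_{\max} = E - W_{\min}$$

If we use Einstein’s relation $E = h\nu$, the expression for $K_{\max}$ becomes

$$K_{\max} = h\nu - W_{\min} \tag{30-3}$$

This can be written in terms of the stopping potential by using the relation
$K_{\max} = eV_{\text{stop}}$. The result is

$$V_{\text{stop}} = \frac{h\nu}{e} - \frac{W_{\min}}{e} \tag{30-4}$$

or

$$V_{\text{stop}} = \frac{h}{e}\left(\nu - \frac{W_{\min}}{h}\right)$$

Finally, this can be used to express $V_{\text{stop}}$ as

$$V_{\text{stop}} = \frac{h}{e}\,(\nu - \nu_{\text{cutoff}}) \tag{30-5a}$$

where

$$W_{\min} = h\nu_{\text{cutoff}} \tag{30-5b}$$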
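
Equations (30-5a) and (30-5b) are easy to evaluate numerically. The following is a minimal sketch, assuming modern values of h and e and a work function of 2.3 eV (an illustrative value, not one quoted in the text up to this point); the function names are hypothetical.

```python
H_PLANCK = 6.626e-34  # Planck's constant in J*s (assumed modern value)
E_CHARGE = 1.602e-19  # magnitude of the electron charge in C (assumed modern value)

def cutoff_frequency(w_min_ev):
    """Cutoff frequency from Eq. (30-5b), W_min = h * nu_cutoff, with W_min in eV."""
    return w_min_ev * E_CHARGE / H_PLANCK

def stopping_potential(nu_hz, w_min_ev):
    """Stopping potential in volts from Eq. (30-5a).

    Below the cutoff frequency no photoelectrons are emitted, so the
    function returns 0.0 there rather than a negative number.
    """
    v_stop = (H_PLANCK / E_CHARGE) * (nu_hz - cutoff_frequency(w_min_ev))
    return max(v_stop, 0.0)

w_min = 2.3  # eV, assumed for illustration
print(f"cutoff frequency: {cutoff_frequency(w_min):.3e} Hz")
for nu in (4e14, 8e14, 1e15):
    print(f"nu = {nu:.1e} Hz -> V_stop = {stopping_potential(nu, w_min):.2f} V")
```

The printed values rise linearly with frequency above the cutoff, with slope h/e, and vanish below it.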



As mentioned in the caption to Fig. 30-1, if the electron-emitting electrode A
in a photocell is made of a different metal than that of the electron-collecting electrode
B, then the potential V of B with respect to A is not equal to the voltage
$V_{\text{meter}}$ that is read on a voltmeter connected between terminals C and D leading to
the two electrodes. Instead, the meter reading must be corrected by adding to it
what is called the contact potential acting between the two metals. That is,
$V = V_{\text{meter}} + V_{\text{contact}}$.

The contact potential can be evaluated by imagining that a wire, of the same
metal as is used in the leads from A to D and from B to C, is connected from C to D
and that no light strikes A. Next an electron, considered as a test charge, is
imagined to be carried across the gap from A to B and then through the conductors
from B to C to D, back to A. We require that there be no net loss or gain of energy in
the process. Then we find that there must be a potential difference between A and
the lead connected to it at the point of contact between the two, and similarly for B
and the lead connected to it. The two potential differences are such that the potential
of B with respect to A has the value $V_{\text{contact}} = (W_{\min,B} - W_{\min,A})/e$, where $W_{\min,A}$
and $W_{\min,B}$ are the work functions of metals A and B.

When we take the contact potential into account, the relation between the
actual value $V_{\text{stop}}$ of the stopping potential and the value $V_{\text{stop, meter}}$ read by a voltmeter
connected across the terminals C and D is

$$V_{\text{stop}} = V_{\text{stop, meter}} + \frac{W_{\min,B} - W_{\min,A}}{e}$$

By introducing the appropriate subscript, Eq. (30-4) is written as

$$V_{\text{stop}} = \frac{h\nu}{e} - \frac{W_{\min,A}}{e}$$

Expressed in terms of the voltmeter reading, this is

$$V_{\text{stop, meter}} + \frac{W_{\min,B} - W_{\min,A}}{e} = \frac{h\nu}{e} - \frac{W_{\min,A}}{e}$$

or

$$V_{\text{stop, meter}} = \frac{h\nu}{e} - \frac{W_{\min,B}}{e}$$

Hence the meter reading itself gives direct information about the work function
$W_{\min,B}$ of the electron-collecting electrode of the photocell, if it is different
from that of the electron-emitting electrode. But this is due to a technicality of the
measuring technique. What is really of fundamental importance in the photoelectric
effect is clearly the work function $W_{\min,A}$ of the electron-emitting electrode.
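
The relation between the voltmeter reading and the actual stopping potential can be checked with a few lines of arithmetic. In the sketch below all energies are expressed in electron volts, so dividing by e simply converts them to volts; the photon energy and the two work functions are hypothetical numbers chosen only to illustrate the bookkeeping.

```python
def contact_potential(w_a_ev, w_b_ev):
    """V_contact = (W_min,B - W_min,A)/e, in volts for work functions in eV."""
    return w_b_ev - w_a_ev

def true_stopping_potential(photon_energy_ev, w_a_ev):
    """V_stop = h*nu/e - W_min,A/e, set by the emitting electrode A."""
    return photon_energy_ev - w_a_ev

def meter_stopping_reading(photon_energy_ev, w_b_ev):
    """V_stop,meter = h*nu/e - W_min,B/e, set by the collecting electrode B."""
    return photon_energy_ev - w_b_ev

# Hypothetical values in eV: photon energy and work functions of A and B.
photon_ev, w_a, w_b = 5.0, 2.3, 4.5

v_meter = meter_stopping_reading(photon_ev, w_b)
v_true = true_stopping_potential(photon_ev, w_a)
print(f"meter reading at cutoff of current: {v_meter:.2f} V")
print(f"contact potential correction:      {contact_potential(w_a, w_b):.2f} V")
print(f"actual stopping potential:         {v_true:.2f} V")
```

Adding the contact potential to the meter reading reproduces the actual stopping potential, as the equations above require.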

The results of Einstein’s theory of the photoelectric effect contained in
Eqs. (30-5a) and (30-5b) are in complete agreement with experiment. The
first equation says that $V_{\text{stop}}$ should increase with the frequency $\nu$ of the
light used in proportion to $\nu - \nu_{\text{cutoff}}$, where $\nu_{\text{cutoff}}$ is a constant. The proportionality
constant is predicted to be the ratio of the constant h, appearing
in Einstein’s relation $E = h\nu$, to the magnitude e of the electron
charge. In other words, Eq. (30-5a) says that a plot of the measured values
of $V_{\text{stop}}$ versus $\nu$ should define a straight line intercepting the $\nu$ axis at $\nu_{\text{cutoff}}$
and having a slope with the value h/e. The data in Fig. 30-3 exhibit just this
behavior.

The second equation describing the results of Einstein’s theory, Eq.
(30-5b), shows that the value of $\nu_{\text{cutoff}}$ depends on $W_{\min}$, which in turn depends
on the nature of the surface. Increasing the intensity of the incident
light simply increases the rate at which photons hit the absorbing surface.
Since each photon is absorbed by an electron, this increases the number of
electrons emitted and therefore the saturation current. But increasing the
light intensity has no effect on the distribution of energies of the electrons
or on $V_{\text{stop}}$. Also, the reason for a cutoff frequency is apparent. Unless $\nu$ is
greater than $\nu_{\text{cutoff}}$, the photon energy $h\nu$ will be less than $h\nu_{\text{cutoff}}$ and no
electron can receive the necessary minimum energy for escape, $W_{\min} = h\nu_{\text{cutoff}}$.
(The probability that two photons are absorbed simultaneously by
the same electron is completely negligible.)

Example 30-1 provides a value for the constant $h$ as well as for the
work function $W_{\min}$ of sodium.


EXAMPLE  30-1 

Use  the  data  of  Fig.  30-3  for  the  photoelectric  effect  on  sodium  to  determine  the 
experimental  value  of  the  constant  h.  Then  use  the  value  you  obtain  and  the  data  to 
determine  the  work  function  for  sodium.  Express  Wmin  in  electron  volts. 

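■ Equation (30-4) says that the data should fall on a straight line whose slope has
the value $h/e$. So you can find that value by evaluating the slope of the line. Inspecting
Fig. 30-3 shows that the line passes through the points $V_{\text{stop}} \approx 0.1$ V at
$\nu = 6.0 \times 10^{14}$ Hz and $V_{\text{stop}} \approx 2.1$ V at $\nu = 11.0 \times 10^{14}$ Hz. Thus its slope is

$$\frac{h}{e} = \frac{\Delta V_{\text{stop}}}{\Delta\nu} \approx \frac{2.1\ \text{V} - 0.1\ \text{V}}{11.0 \times 10^{14}\ \text{s}^{-1} - 6.0 \times 10^{14}\ \text{s}^{-1}} = \frac{2.0\ \text{V}}{5.0 \times 10^{14}\ \text{s}^{-1}} = 4.0 \times 10^{-15}\ \text{V}\cdot\text{s}$$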

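Consequently

$$h = e \times 4.0 \times 10^{-15}\ \text{V}\cdot\text{s}$$

Inserting the known value $e = 1.60 \times 10^{-19}$ C gives you

$$h = 1.60 \times 10^{-19}\ \text{C} \times 4.0 \times 10^{-15}\ \text{V}\cdot\text{s} = 6.4 \times 10^{-34}\ \text{C}\cdot\text{V}\cdot\text{s}$$

or

$$h = 6.4 \times 10^{-34}\ \text{J}\cdot\text{s}$$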

Now you determine the work function by evaluating the cutoff frequency for
sodium from the intercept of the straight line on the $\nu$ axis. You will find
$\nu_{\text{cutoff}} \approx 5.6 \times 10^{14}$ Hz. Using this and the value of $h$ just determined in Eq. (30-5b), you obtain

$$W_{\min} = h\nu_{\text{cutoff}} \approx 6.4 \times 10^{-34}\ \text{J}\cdot\text{s} \times 5.6 \times 10^{14}\ \text{s}^{-1} \approx 3.6 \times 10^{-19}\ \text{J}$$

Since $1.60 \times 10^{-19}$ J = 1 eV, you can write the result as

$$W_{\min} \approx 3.6 \times 10^{-19}\ \text{J} \times \frac{1\ \text{eV}}{1.60 \times 10^{-19}\ \text{J}} \approx 2.2\ \text{eV}$$
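The arithmetic of Example 30-1 can be retraced in a few lines of Python. This is only a sketch: the function names are illustrative, and the two points and the cutoff frequency are the values read from Fig. 30-3 in the example, not independent data.

```python
# Sketch: recover h and the sodium work function from the values that
# Example 30-1 reads off the stopping-potential line of Fig. 30-3.
E_CHARGE = 1.60e-19  # magnitude of the electron charge, in C


def planck_from_line(nu1, v1, nu2, v2, e=E_CHARGE):
    """Return h (in J*s) from the slope h/e of the V_stop-versus-frequency line."""
    slope = (v2 - v1) / (nu2 - nu1)  # units of V*s
    return e * slope


def work_function_ev(h, nu_cutoff, e=E_CHARGE):
    """Return W_min = h * nu_cutoff, converted from joules to electron volts."""
    return h * nu_cutoff / e


h = planck_from_line(6.0e14, 0.1, 11.0e14, 2.1)
w_min = work_function_ev(h, 5.6e14)
print(f"h = {h:.2e} J*s, W_min = {w_min:.2f} eV")
```

Because the code carries unrounded intermediate values, the work function it prints can differ in the last digit from the rounded result quoted in the example.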


Work functions for pure metals have values comparable to the value
2.2 eV for sodium, obtained in Example 30-1. The values range from about
2 eV to about 5 eV. A few materials have work functions as small as about
1.6 eV. They are employed in photocells designed to detect visible light.
Elevators use a light beam passing across their doorways and detected by a
photocell to sense the presence of a person standing in the doorway and interrupting
the beam. An associated control system then prevents the
doors from closing.

Example  30-2  will  give  you  a feeling  for  the  rate  at  which  photons 
strike  a photocell  in  a typical  situation. 




EXAMPLE  30-2 


A source that emits 10 W of light uniformly in all directions is located 1.0 m from
the absorbing surface of a photocell. Calculate the number of photons hitting
$1.0\ \text{cm}^2$ of the surface in 1.0 s, assuming that the frequency of the light is $1.0 \times 10^{15}$ Hz.

■ First you calculate the power striking $1.0\ \text{cm}^2 = 1.0 \times 10^{-4}\ \text{m}^2$ of the surface.
To do this, you multiply the total power 10 W by the ratio of this area to the total
area of a sphere of radius 1.0 m. The result is

$$10\ \text{W} \times \frac{1.0 \times 10^{-4}\ \text{m}^2}{4\pi \times (1.0\ \text{m})^2} = 8.0 \times 10^{-5}\ \text{W} = 8.0 \times 10^{-5}\ \text{J/s}$$

Next calculate the energy of a photon of frequency $1.0 \times 10^{15}$ Hz. You get

$$h\nu = 6.4 \times 10^{-34}\ \text{J}\cdot\text{s} \times 1.0 \times 10^{15}\ \text{s}^{-1} = 6.4 \times 10^{-19}\ \text{J}$$

Dividing the energy striking the surface per second by the energy per photon, you
obtain

$$\frac{8.0 \times 10^{-5}\ \text{J/s}}{6.4 \times 10^{-19}\ \text{J}} = 1.3 \times 10^{14}\ \text{s}^{-1}$$


This  is  the  number  of  photons  striking  the  one-square-centimeter  surface  per 
second. 
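The same estimate can be written as a short Python sketch. The function name and parameter names are illustrative, and the default value of $h$ is the one obtained in Example 30-1; the source is assumed to radiate uniformly in all directions and the small area is assumed to face it squarely.

```python
import math


def photon_rate(power_w, distance_m, area_m2, frequency_hz, h=6.4e-34):
    """Photons per second striking a small area that faces an isotropic source."""
    # Fraction of the total power intercepted: area / (surface of a sphere).
    power_on_area = power_w * area_m2 / (4.0 * math.pi * distance_m ** 2)
    # Each photon carries energy h * nu.
    return power_on_area / (h * frequency_hz)


rate = photon_rate(10.0, 1.0, 1.0e-4, 1.0e15)
print(f"{rate:.1e} photons per second")
```

The code keeps all digits of the intercepted power instead of rounding it to two figures first, so its result can differ slightly in the second significant figure from the hand calculation.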


The  rate  at  which  photons  hit  the  absorbing  surface  of  Example  30-2  is 
so  high  that  it  would  be  impossible  to  detect  the  arrival  of  individual  ones. 
They  are  much  more  numerous  and  closely  spaced  than  the  water  droplets 
spraying  from  a nozzle,  which  appear  to  form  a continuous  stream  despite 
the  fact  that  they  are  actually  discrete.  The  rate  of  arrival  of  photons  is 
high,  even  at  an  appreciable  distance  from  a not  very  intense  light  source, 
because the energy of each photon is extremely low in everyday terms. Furthermore,
a single light photon is difficult to detect reliably just because its
energy  is  so  low.  The  dark-adapted  human  eye  is  a more  sensitive  detector 
of  light  photons  than  anything  else  that  was  available  in  the  early  1900s. 
But  it  takes  a pulse  of  light  containing  about  five  photons  of  the  frequency 
to  which  it  is  most  sensitive  to  produce  reliably  a response  in  the  eye. 
Despite  the  inability  of  the  early  experimenters  to  detect  individual 
photons,  the  explanatory  power  of  Einstein’s  theory  for  those  properties  of 
the  photoelectric  effect  which  could  be  measured  certainly  argued  strongly 
for  the  correctness  of  the  photon  concept. 


But  many  physicists  were  initially  reluctant  to  accept  the  existence  of 
photons and the complete uprooting of established physical concepts which it implied.
Prior to Millikan’s detailed experimental confirmation of Einstein’s theory
in  1914,  Planck  and  others  recommended  Einstein  for  membership  in  the  Prussian 
Academy  of  Sciences.  In  certifying  his  qualifications,  they  wrote:  “Summing  up, 
we  may  say  that  there  is  hardly  one  among  the  great  problems,  in  which  modern 
physics  is  so  rich,  to  which  Einstein  has  not  made  an  important  contribution.  That 
he  may  have  sometimes  missed  the  target  in  his  speculations,  as,  for  example,  in 
his  hypothesis  of  light  quanta  [photons],  cannot  really  be  held  too  much  against 
him,  for  it  is  not  possible  to  introduce  fundamentally  new  ideas,  even  in  the  most 
exact sciences, without occasionally taking a risk.” Subsequent events amply vindicated
Einstein. He received the Nobel Prize in physics in 1921 for his theory of
the  photoelectric  effect.  In  1923  Millikan  was  awarded  the  Nobel  Prize  for  his 
experiments  on  the  photoelectric  effect  as  well  as  for  those  in  which  he  measured 
the  charge  of  the  electron. 




Fig. 30-4 The electromagnetic spectrum. [Chart with three parallel logarithmic scales: wavelength (in m), frequency (in Hz), and photon energy (in eV). Named regions, from short to long wavelength: γ ray, X ray, ultraviolet, visible light, infrared, EHF microwaves, radar, UHF television, VHF television, FM radio, citizen bands, short-wave radio, AM radio, radio direction finding, VLF radio, AC power.]


In the years which have passed since the photon concept was introduced
to explain the absorption of ultraviolet light, a great body of experimental
work in many different regions of the electromagnetic spectrum
has confirmed that radiation acts like a stream of photons when it interacts
with matter in absorption, as well as when it interacts with matter in any
other process. Figure 30-4 relates the photon energy $E$ to the frequency $\nu$
and wavelength $\lambda$ in various regions of the electromagnetic spectrum. Since
$E = h\nu$ and $\nu = c/\lambda$, the photon energy increases as the frequency increases
and as the wavelength decreases. For the X-ray and γ-ray range of
the spectrum, photon energies are large enough that it is easy to detect with
complete reliability the arrival of a single photon at the material with which
it interacts. Nuclear and elementary-particle physicists do this every day.
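The relations $E = h\nu$ and $\nu = c/\lambda$ combine to give $E = hc/\lambda$, which is what connects the three scales of the spectrum chart. The following Python sketch evaluates it for two wavelengths; the function name is illustrative, and the constants are four-figure values of $h$, $c$, and the joule-to-electron-volt conversion.

```python
H = 6.626e-34   # Planck's constant, in J*s
C = 2.998e8     # speed of light, in m/s
EV = 1.602e-19  # joules per electron volt


def photon_energy_ev(wavelength_m):
    """Photon energy E = h*c/lambda, expressed in electron volts."""
    return H * C / wavelength_m / EV


for name, wavelength in [("visible light", 500e-9), ("X ray", 0.0711e-9)]:
    print(f"{name}: lambda = {wavelength:.3e} m, E = {photon_energy_ev(wavelength):.3g} eV")
```

A photon of visible light carries a few electron volts, whereas an X-ray photon of wavelength 0.0711 nm carries more than ten thousand, which is why single X-ray photons are so much easier to detect.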

Example 30-1 used Millikan’s data concerning the absorption of electromagnetic
radiation on a surface to calculate the value of the constant $h$ in
Einstein’s relation $E = h\nu$. The value obtained was $h \approx 6.4 \times 10^{-34}$ J·s. In
1900 Max Planck (1858-1947) had carried out his theoretical investigation
of the emission of electromagnetic radiation by a hot surface in thermal
equilibrium with its surroundings, described briefly at the beginning of
this section. It did not employ the photon concept, requiring only that a
harmonic oscillator of frequency $\nu$ have its total energy limited to one of
the values $E = 0, h\nu, 2h\nu, 3h\nu, \ldots$. Planck determined the numerical
value of $h$ by fitting the results of the theory to the measured properties of
thermal radiation. The value that emerged was $h \approx 6.6 \times 10^{-34}$ J·s. Thus
both Planck and Einstein found essentially the same value for the constant
$h$, using quite different theories to analyze quite different experiments.
The constant $h$ is called Planck’s constant, in honor of the physicist who introduced
it. The currently accepted value is, to four significant figures,

$$h = 6.626 \times 10^{-34}\ \text{J}\cdot\text{s} \tag{30-6}$$


30-3 THE SCATTERING OF PHOTONS

Confirmation of the existence of photons was provided in 1923 by the
experimental and theoretical work of Arthur H. Compton (1892-1962) on
the scattering of X rays by matter. The apparatus he used is sketched in
Fig. 30-5. X rays of wavelength $\lambda = 7.11 \times 10^{-11}$ m = 0.0711 nm are produced
by an X-ray tube using a molybdenum target. They are rendered (by
elimination) into a beam by a set of lead diaphragms, and the beam is allowed
to strike a graphite plate. X rays emerging from the plate at an
adjustable angle $\phi$ pass through lead diaphragms to impinge on a crystal
used to measure their wavelength.

The wavelength-measuring procedure employs Bragg’s law,

$$j\lambda = 2d \sin\theta \tag{30-7}$$

which is derived in Fig. 30-6 and its caption. In this relation each value of
the integer $j$ corresponds to a different peak in the intensity pattern of
X rays of wavelength $\lambda$ reflected from the atomic planes of the crystal, $d$ is
the spacing between these planes, and $\theta$ is the angle between them and the
incident or reflected beams. The value of $\lambda$ for the X rays striking the
crystal is obtained by measuring $\theta$ for the $j = 1$ peak, with a crystal of
known $d$.
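As a sketch of how Bragg’s law turns a measured angle into a wavelength, the Python functions below solve $j\lambda = 2d\sin\theta$ in both directions. The function names are illustrative, and the plane spacing used here is a hypothetical round number chosen only for the demonstration; it is not the spacing of the crystal Compton used.

```python
import math


def bragg_wavelength(d, theta_deg, j=1):
    """Wavelength from Bragg's law, j * lambda = 2 * d * sin(theta)."""
    return 2.0 * d * math.sin(math.radians(theta_deg)) / j


def bragg_angle_deg(d, wavelength, j=1):
    """Angle theta (in degrees, measured from the atomic planes) of the j-th peak."""
    s = j * wavelength / (2.0 * d)
    if s > 1.0:
        raise ValueError("no reflection: j * lambda exceeds 2 * d")
    return math.degrees(math.asin(s))


D_SPACING = 0.30e-9  # hypothetical plane spacing in m, for illustration only
theta = bragg_angle_deg(D_SPACING, 0.0711e-9)
recovered = bragg_wavelength(D_SPACING, theta)
print(f"theta = {theta:.2f} degrees, recovered wavelength = {recovered:.4e} m")
```

With a plane spacing of a few tenths of a nanometer, X rays of wavelength 0.0711 nm give a first-order peak only a few degrees from the atomic planes, so small changes in wavelength appear as small but measurable changes in angle.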

Compton found that the X rays emerging from the graphite plate contain
two wavelengths. One is the same wavelength $\lambda$ as that of the incident
radiation, and the other is a wavelength $\lambda'$ which is longer than $\lambda$. He measured
$\lambda'$ for various values of $\phi$, obtaining the data shown in Fig. 30-7. He
also made measurements on materials other than graphite and found that
the results obtained for $\lambda'$ versus $\phi$ do not depend on the material. It is the
component of longer wavelength $\lambda'$ that provides evidence for the existence
of photons.


Fig.  30-5  A schematic  diagram  of  Compton’s  X-ray  scattering  apparatus. 
In  the  X-ray  tube  a beam  of  electrons  which  have  kinetic  energy  of  several 
tens  of  thousands  of  electron  volts  is  produced  by  accelerating  electrons 
emitted  from  a heated  wire  filament  through  a system  of  electrodes  across 
which the corresponding voltage is applied. The electron beam then impinges
on a target made of the metal molybdenum. Several processes occur
as the electrons come to rest in the target. The one leading to the production
of the X rays used by Compton involves electrons of the beam colliding
with  electrons  in  the  interior  of  atoms  of  the  target  and  ejecting  these 
atomic  electrons.  Because  an  atomic  electron  with  a large  negative  binding 
energy  has  been  removed  from  each  atom  in  which  the  process  has  taken 
place,  these  atoms  have  a large  positive  energy.  They  ultimately  get  rid  of 
this  energy  by  emitting  the  high-energy  photons  that  constitute  X rays. 




Fig. 30-6 Bragg reflection. (a) A beam of X rays incident on the atomic
planes of a crystal and the beam reflected from the planes at an angle of
reflection equal to the angle of incidence. Reflection takes place principally
at the atomic planes because the density of electric charge is
highest there. But the intensity of the reflected beam will be large only
for certain values of the angle $\theta$, which is measured from the atomic
planes to the incident and reflected beams. (b) Determination of the values
of $\theta$ that lead to a large reflected intensity. Indicated are two atomic
planes reflecting two rays and also two wave fronts drawn perpendicular
to the rays. If an integral number of wavelengths $j\lambda$ just fits into the distance
$2l$ from the incident to the reflected wave fronts along the upper
ray, then waves following the two rays of the reflected beam will be in
phase when they combine at some distant point. In fact, a moment’s consideration
of part a shows that in such a case the waves following all the
rays will be in phase upon combination and so will superpose constructively
to produce a maximum intensity. Now $l/d = \cos(90° - \theta) = \sin\theta$.
So $2l = 2d \sin\theta$, and the condition $j\lambda = 2l$ for maximum intensity is
$j\lambda = 2d \sin\theta$, as stated in Eq. (30-7). In most cases the maximum obtained
when the integer $j$ has the value $j = 1$ is the most intense. (Note
that the angle $\theta$ used here, and conventionally in X-ray work, is the complement
of the angle measured from the normal used in optics.)


Since it made no difference with what material the incident X rays interacted,
Compton realized that the interaction producing the X rays of
wavelength $\lambda'$ did not involve atoms of the material. Instead, he argued
that the incident X rays must be interacting with individual electrons, the
common constituent of all atoms. Specifically, Compton assumed that what
goes on involves a single photon interacting with a single electron. The interaction
can be described as the absorption by an electron of an incident
photon of a certain frequency and wavelength, followed very quickly by the
emission from the electron of another photon of different frequency and
wavelength in a different direction. A simpler description, which Compton
used, is to say that a photon is scattered by an electron. In this picture the
photon changes its direction of motion, its frequency, and its wavelength,
but is not destroyed in the process. (Since in the experiment under discussion
there is no way to distinguish between the two descriptions of the
photon-electron interaction, a distinction between them is without meaning
from an operational point of view, and whichever is most convenient can be
adopted.)

In Compton’s picture, a photon is treated quite literally as a particle. In
scattering from an electron, a photon transfers energy to the electron just
as a moving billiard ball transfers energy in scattering from an initially stationary
billiard ball. Hence after scattering a photon’s energy is reduced
from its initial value $E = h\nu$ to a lower value $E' = h\nu'$. Thus the frequency
of the radiation with which the photon is associated will be reduced from $\nu$
to $\nu'$. Using Eq. (27-21), $\lambda = c/\nu$, to determine the corresponding wavelengths,
we see that this means the wavelength of the scattered radiation
will be increased from $\lambda = c/\nu$ to $\lambda' = c/\nu'$. Just as in a billiard-ball collision,
the larger the angle between the initial and final directions of motion
of the incident particle, the more energy is transferred in the collision from
it to the struck particle. So there should be a greater decrease in frequency
and a greater increase in wavelength for larger angles. Hence this description
of the process yields qualitative agreement with Compton’s experimental
results. Quantitative agreement can be obtained by analyzing in detail
the collision between a photon and an electron, treating the momentum
transferred between the two particles as well as the energy. But first we
must obtain an expression for the momentum carried by a photon.

Fig. 30-7 The wavelength distribution of X rays emerging from graphite
on which X rays of wavelength 0.0711 nm are incident. Distributions
are shown for four values of the angle $\phi$ between the directions of propagation
of the incident and emergent X rays.


Since a collection of photons comprises a beam of electromagnetic
radiation and since such a beam travels at the speed of light $c$, the photons
themselves must move with speed $v$ exactly equal to $c$. A photon is a completely
relativistic particle! In determining its momentum, it is therefore
necessary to use relativistic relations. Consider the relations among the
total energy content $E$ of a particle of speed $v$, its relativistic mass $m$, and its
rest mass $m_0$, given by Eqs. (15-7) and (15-15):


$$E = mc^2 = \frac{m_0 c^2}{\sqrt{1 - v^2/c^2}}$$


Since the total energy content of a photon has the finite value $E = h\nu$, the
fraction $m_0 c^2/\sqrt{1 - v^2/c^2}$ must have a finite value. But $v = c$ for a photon,
so its denominator is zero. Thus its numerator must also be zero in order
that the fraction be the indeterminate quantity 0/0 which can have any
finite value, including its actual value $mc^2 = E = h\nu$. The numerator can be
zero only if $m_0 = 0$ for a photon. So we can conclude that a photon must have
zero rest mass.

Knowing that the rest mass of a photon is zero, we can evaluate its momentum
$p$ in terms of its relativistic mass $m$ by setting $m_0 = 0$ in the general
relation among $m$, $p$, and $m_0$ of Eq. (15-21a):

$$(mc^2)^2 = (cp)^2 + (m_0 c^2)^2$$

We obtain

$$mc^2 = cp$$

or

$$p = mc$$

Since $E = mc^2$, this is

$$E = cp \tag{30-8}$$

or

$$p = \frac{E}{c} \tag{30-9}$$

Using $E = h\nu$, we have

$$p = \frac{h\nu}{c}$$

Then employing $\nu = c/\lambda$, we find the expression we have been seeking:

$$p = \frac{h}{\lambda} \tag{30-10}$$

The  magnitude  of  the  momentum  of  a photon  of  electromagnetic  radiation  is  given  by 
Planck's  constant  divided  by  the  wavelength  of  the  radiation. 
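To get a feeling for the size of a photon’s momentum, the following Python sketch evaluates $p = h/\lambda$ for X rays of wavelength 0.0711 nm and checks that the corresponding energy $cp$ agrees with $h\nu$. The function name is illustrative, and the constants are four-figure values.

```python
import math

H = 6.626e-34  # Planck's constant, in J*s
C = 2.998e8    # speed of light, in m/s


def photon_momentum(wavelength_m):
    """Magnitude of a photon's momentum, p = h / lambda, in kg*m/s."""
    return H / wavelength_m


wavelength = 0.0711e-9            # X-ray wavelength, in m
p = photon_momentum(wavelength)
energy_from_p = C * p             # E = c * p
energy_from_nu = H * (C / wavelength)  # E = h * nu with nu = c / lambda
print(f"p = {p:.3e} kg*m/s, E = {energy_from_p:.3e} J")
```

The two ways of computing the energy agree because $p = h/\lambda$ was obtained from $E = cp$ together with $E = h\nu$ and $\nu = c/\lambda$.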

You  have  seen  Eq.  (30-8)  before.  It  is  the  same  as  Eq.  (15-24),  which  pertains  to 
the  zero-rest-mass  particle  called  a neutrino.  But  do  not  confuse  a neutrino  with  a 
photon just because both have zero rest mass. Neutrinos are particles that are involved
in the weak nuclear force, whereas photons are very different particles that
have  to  do  with  the  very  different  electromagnetic  force. 

You  have  also  encountered  before  an  equation  very  closely  related  to  Eq. 
(30-9).  Compare  it  to  Eq.  (27-32): 

$$\rho_{\text{momentum}} = \frac{\rho_{\text{energy}}}{c}$$

This  says  that  the  total  momentum  content  in  a unit  volume  of  electromagnetic 
radiation  equals  its  total  energy  content  divided  by  c.  It  is  clearly  in  agreement 
with  Eq.  (30-9),  which  says  that  the  momentum  of  each  photon  in  the  unit  volume 
equals  its  energy  divided  by  c.  The  only  difference  between  the  two  is  that  Eq. 
(30-9) takes into account the idea that the energy in the radiation and also the momentum
are lumped into photons.

Although  you  have  not  seen  Eq.  (30-10)  before,  you  certainly  will  see  it  again. 
It is just as important as $E = h\nu$ in treating the particlelike behavior of electromagnetic
radiation. De Broglie carried both of these equations over to his treatment of
the  wavelike  behavior  of  matter. 

Now we are ready to continue with Compton’s analysis of a collision
between a photon in the incident beam of X rays and an electron in the
graphite plate irradiated by the beam. In the analysis he assumed that a
photon is scattered by a free electron. The electrons are more or less tightly
bound to carbon atoms. But the X rays Compton used had a wavelength of
about $10^{-10}$ m = $10^{-1}$ nm, which corresponds to a photon energy of about
$10^4$ eV (see Fig. 30-4). In contrast, even the most tightly bound electron in a
carbon atom is known to be bound by an energy of only about $5 \times 10^2$ eV.
So in the context of X-ray scattering such an electron can be treated
approximately as if it is free (and the approximation is even better for the
less tightly bound electrons). For similar reasons, Compton could ignore
the initial motion of the struck electron and assume it to be stationary before
the collision.

The collision between a photon of initial momentum $\mathbf{p}_{1i}$ and a free
electron of initial momentum $\mathbf{p}_{2i} = 0$ is illustrated schematically in Fig.
30-8. After the collision the photon moves away with final momentum $\mathbf{p}_{1f}$,
and the electron recoils with final momentum $\mathbf{p}_{2f}$. The collision is analyzed
by applying the laws of momentum conservation and energy conservation
to the isolated system as viewed from an inertial reference frame. Energy
conservation is applicable because an electron has no internal structure that
can absorb energy, so the collision is elastic. The analysis is quite similar to
those carried out in Sec. 8-3, except that here relativistic expressions for momentum
and energy must be used.

Fig. 30-8 The scattering of a photon by a free and initially stationary
electron. The symbols are defined in the text.

Since the electron has no initial momentum, the momentum conservation
equation reads

$$\mathbf{p}_{1i} = \mathbf{p}_{1f} + \mathbf{p}_{2f}$$

The scattering angle $\phi$ is measured in the experiment, so it should be introduced
into the equations. This can be done by first transposing, to yield

$$\mathbf{p}_{2f} = \mathbf{p}_{1i} - \mathbf{p}_{1f}$$

Then taking the dot product of each side of this equality with itself, we obtain

$$\mathbf{p}_{2f} \cdot \mathbf{p}_{2f} = (\mathbf{p}_{1i} - \mathbf{p}_{1f}) \cdot (\mathbf{p}_{1i} - \mathbf{p}_{1f})$$

or

$$p_{2f}^2 = p_{1i}^2 + p_{1f}^2 - 2p_{1i}p_{1f} \cos\phi \tag{30-11}$$

Conservation of total relativistic energy, Eq. (15-20b), tells us that

$$m_{1i}c^2 + m_{2i}c^2 = m_{1f}c^2 + m_{2f}c^2 \tag{30-12}$$

where the symbol $m$ stands for relativistic mass. According to Eq. (30-8), the
initial and final total relativistic energy of the photon are

$$m_{1i}c^2 = cp_{1i}$$

and

$$m_{1f}c^2 = cp_{1f}$$

The initial total relativistic energy of the electron is just its rest-mass energy,
since it is initially at rest. So

$$m_{2i}c^2 = m_0c^2$$

where $m_0$ represents the rest mass of the electron. Its final total relativistic
energy is given by Eq. (15-13),

$$m_{2f}c^2 = K_{2f} + m_0c^2$$

where $K_{2f}$ is the final kinetic energy of the electron. Using the last four
equations in Eq. (30-12), we obtain

$$cp_{1i} + m_0c^2 = cp_{1f} + K_{2f} + m_0c^2$$

or

$$c(p_{1i} - p_{1f}) = K_{2f} \tag{30-13}$$

Since neither of the related quantities $K_{2f}$ and $p_{2f}$ is measured in the
experiment, they should be eliminated from the equations. For this purpose,
we take Eq. (15-21a):

$$(m_{2f}c^2)^2 = (cp_{2f})^2 + (m_0c^2)^2$$




Then we write $m_{2f}c^2$ as the sum of $K_{2f}$ and $m_0c^2$ from Eq. (15-13), obtaining

$$(K_{2f} + m_0c^2)^2 = (cp_{2f})^2 + (m_0c^2)^2$$

Expanding the square of the sum and then canceling give

$$K_{2f}^2 + 2K_{2f}m_0c^2 = (cp_{2f})^2$$

or

$$\frac{K_{2f}^2}{c^2} + 2K_{2f}m_0 = p_{2f}^2$$

Evaluating $K_{2f}$ from Eq. (30-13) and $p_{2f}^2$ from Eq. (30-11), we obtain

$$(p_{1i} - p_{1f})^2 + 2c(p_{1i} - p_{1f})m_0 = p_{1i}^2 + p_{1f}^2 - 2p_{1i}p_{1f} \cos\phi$$

Next we expand the square of the difference and cancel, to find

$$-2p_{1i}p_{1f} + 2c(p_{1i} - p_{1f})m_0 = -2p_{1i}p_{1f} \cos\phi$$

or

$$c(p_{1i} - p_{1f})m_0 = p_{1i}p_{1f}(1 - \cos\phi)$$

Finally we divide by $m_0cp_{1i}p_{1f}/h$ and get

$$\frac{h}{p_{1f}} - \frac{h}{p_{1i}} = \frac{h}{m_0c}(1 - \cos\phi)$$

Now $h/p_{1i}$ is the wavelength $\lambda$ of the incident radiation of which the
photon of momentum $p_{1i}$ is a part. And $h/p_{1f}$ is $\lambda'$, the wavelength of the
scattered radiation of which the photon of momentum $p_{1f}$ is a part. So we
can write our result as

$$\lambda' - \lambda = \frac{h}{m_0c}(1 - \cos\phi) \tag{30-14}$$

This is the Compton equation. The factor $h/m_0c$ is a combination of universal
constants having the dimensions of length. It is called the Compton
wavelength,

$$\lambda_C \equiv \frac{h}{m_0c} \tag{30-15a}$$

Its numerical value is

$$\lambda_C = \frac{6.626 \times 10^{-34}\ \text{J}\cdot\text{s}}{9.109 \times 10^{-31}\ \text{kg} \times 2.998 \times 10^{8}\ \text{m/s}}$$

or

$$\lambda_C = 2.426 \times 10^{-12}\ \text{m} = 0.002426\ \text{nm} \tag{30-15b}$$

In terms of the Compton wavelength, the Compton equation is

$$\lambda' - \lambda = \lambda_C(1 - \cos\phi) \tag{30-16}$$

These results of Compton’s analysis are in complete agreement with the results
of his experiment shown in Fig. 30-7 and also with other data that he
obtained. He was awarded a Nobel Prize for his work in 1927.
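The algebra leading to the Compton equation can be checked numerically. The Python sketch below takes the scattered wavelength from the Compton equation, forms the electron’s kinetic energy from Eq. (30-13) and its squared momentum from Eq. (30-11), and then tests whether they satisfy the relativistic relation $K_{2f}^2 + 2K_{2f}m_0c^2 = (cp_{2f})^2$. The function names are illustrative, and the incident wavelength is the 0.0711 nm of Compton’s experiment.

```python
import math

H = 6.626e-34   # Planck's constant, in J*s
M0 = 9.109e-31  # electron rest mass, in kg
C = 2.998e8     # speed of light, in m/s


def compton_scattered_wavelength(wavelength, phi_deg):
    """Scattered wavelength from lambda' - lambda = (h / m0 c)(1 - cos phi)."""
    return wavelength + H / (M0 * C) * (1.0 - math.cos(math.radians(phi_deg)))


def energy_momentum_residual(wavelength, phi_deg):
    """Relative mismatch between K^2 + 2 K m0 c^2 and (c p2f)^2 for the electron."""
    scattered = compton_scattered_wavelength(wavelength, phi_deg)
    p1i, p1f = H / wavelength, H / scattered
    k2f = C * (p1i - p1f)  # Eq. (30-13): energy conservation
    # Eq. (30-11): momentum conservation
    p2f_sq = p1i ** 2 + p1f ** 2 - 2.0 * p1i * p1f * math.cos(math.radians(phi_deg))
    lhs = k2f ** 2 + 2.0 * k2f * M0 * C ** 2
    rhs = C ** 2 * p2f_sq
    return abs(lhs - rhs) / rhs


for phi in (45.0, 90.0, 135.0, 180.0):
    print(f"phi = {phi:5.1f} deg, relative mismatch = {energy_momentum_residual(0.0711e-9, phi):.1e}")
```

The mismatch should be at the level of floating-point round-off for every nonzero angle, which is a useful confirmation that no sign or factor was lost in the derivation. (The check is skipped at $\phi = 0$, where the electron receives no momentum and the relative mismatch is undefined.)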


Note that the wavelength increase $\lambda' - \lambda$, called the Compton shift,
depends only on the scattering angle $\phi$. It has a very small value if the
photon makes what can be thought of as a grazing collision with the electron
and is only slightly deflected, so that $\phi \approx 0$. If there is something like a
head-on collision in which the photon is scattered back on itself at $\phi = 180°$,
the Compton shift has its maximum value $2\lambda_C$. Note also that at any scattering
angle $\phi$ the shift in the wavelength of the scattered radiation does not
depend on the value of the wavelength of the incident radiation. (But the
change in energy of the scattered radiation does depend on the energy of
the incident radiation.)

Example  30-3  evaluates  quantities  characterizing  the  scattering 
process  for  two  types  of  incident  radiation. 


EXAMPLE  30-3 

a. A beam of X rays of wavelength 0.10000 nm is incident on a target in a
Compton scattering experiment. The X rays scattered by the target through a
right angle are detected. Calculate the Compton shift, the wavelength of the scattered
radiation, and the fractional change in wavelength in the scattering.

■ You have $\lambda' - \lambda = \lambda_C(1 - \cos\phi)$ with $\phi = 90°$. So the Compton shift is

$$\lambda' - \lambda = \lambda_C = 0.00243\ \text{nm}$$

The wavelength of the scattered radiation is

$$\lambda' = \lambda + \lambda_C = 0.10000\ \text{nm} + 0.00243\ \text{nm} = 0.10243\ \text{nm}$$

The fractional change in wavelength is

$$\frac{\lambda' - \lambda}{\lambda} = \frac{0.00243\ \text{nm}}{0.10000\ \text{nm}} = 0.0243$$

Such a fractional change is large enough to be easily measurable. ■

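b. Repeat the calculations for the right-angle scattering from free electrons of
γ rays with photon energies of 0.492 MeV = $0.492 \times 10^6$ eV. Such γ rays are
emitted in the decay of the nuclei of the radioactive substance bismuth-212.

■ Since the Compton shift does not depend on the wavelength of the incident
radiation, you again have

$$\lambda' - \lambda = 0.00243\ \text{nm}$$

From this difference you can find the value of $\lambda'$ by evaluating $\lambda$. First you calculate

$$\nu = \frac{E}{h} = \frac{0.492 \times 10^{6}\ \text{eV}}{6.63 \times 10^{-34}\ \text{J}\cdot\text{s}} \times \frac{1.602 \times 10^{-19}\ \text{J}}{1\ \text{eV}} = 1.19 \times 10^{20}\ \text{Hz}$$

Then you calculate

$$\lambda = \frac{c}{\nu} = \frac{3.00 \times 10^{8}\ \text{m/s}}{1.19 \times 10^{20}\ \text{s}^{-1}} = 2.52 \times 10^{-12}\ \text{m} = 0.00252\ \text{nm}$$

So you obtain

$$\lambda' = \lambda + \lambda_C = 0.00252\ \text{nm} + 0.00243\ \text{nm} = 0.00495\ \text{nm}$$

The fractional change in wavelength is, in this case,

$$\frac{\lambda' - \lambda}{\lambda} = \frac{0.00243\ \text{nm}}{0.00252\ \text{nm}} = 0.964$$

The scattered wavelength is $\lambda'/\lambda = 0.00495\ \text{nm}/0.00252\ \text{nm} = 1.96$ times the incident wavelength.
The Compton shift almost doubles the wavelength of the scattered electromagnetic
radiation comprising these γ rays.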
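Both parts of Example 30-3 follow the same recipe, which the Python sketch below collects into two small functions. The function names are illustrative, and the constants are the rounded values used in the example. For each case the code reports the shift, the scattered wavelength, the fractional change $(\lambda' - \lambda)/\lambda$, and the ratio $\lambda'/\lambda$.

```python
import math

LAMBDA_C_NM = 0.00243  # Compton wavelength in nm, rounded as in Example 30-3


def compton_scatter(wavelength_nm, phi_deg):
    """Return (shift, scattered wavelength, fractional change, ratio lambda'/lambda)."""
    shift = LAMBDA_C_NM * (1.0 - math.cos(math.radians(phi_deg)))
    scattered = wavelength_nm + shift
    return shift, scattered, shift / wavelength_nm, scattered / wavelength_nm


def wavelength_nm_from_energy_ev(energy_ev, h=6.63e-34, c=3.00e8, ev=1.602e-19):
    """Wavelength in nm of a photon of the given energy, via nu = E/h and lambda = c/nu."""
    nu = energy_ev * ev / h
    return c / nu * 1e9


cases = [
    ("a. X rays", 0.10000),
    ("b. gamma rays", wavelength_nm_from_energy_ev(0.492e6)),
]
for label, wavelength in cases:
    shift, scattered, fractional, ratio = compton_scatter(wavelength, 90.0)
    print(f"{label}: lambda = {wavelength:.5f} nm, lambda' = {scattered:.5f} nm, "
          f"fractional change = {fractional:.3g}, lambda'/lambda = {ratio:.3g}")
```

The shift is the same in both cases, but it is a small correction for the X rays and comparable to the incident wavelength itself for the γ rays.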




If  a calculation  like  those  in  Example  30-3  is  made  to  predict  the  frac- 
tional change  in  wavelength  for  the  scattering  of  radiation  with  a wave- 
length typical  of  visible  light,  A — 500  nm,  the  result  obtained  is 
(A'  — A)/A  — 5 x 10~6.  This  small  shift  is  within  the  detection  capabilities 
of  the  very  high-resolution  spectrometers  available  in  the  visible  range  of 
the  electromagnetic  spectrum.  However,  no  measurable  wavelength  shift  is 
actually  observed  when  light  is  scattered  from  graphite  or  any  comparable 
material.  The  explanation  is  that  it  is  unreasonable  to  apply  Compton’s 
equation,  Eq.  (30-14),  to  the  scattering  of  visible  light.  Even  the  most 
loosely  bound  electron  in  any  atom  is  bound  by  an  energy  of  about  1 eV, 
whereas  the  energy  of  a light  photon  is  about  2 eV  (see  Fig.  30-4).  Thus  the 
binding  of  an  electron  to  an  atom  cannot  be  ignored  in  light  scattering.  In 
fact,  an  electron  remains  bound  to  its  atom  when  it  scatters  a photon  of  visi- 
ble light,  and  so  the  entire  atom  recoils  (or  even  the  entire  crystal  in  which 
the  atom  may  be  bound).  In  such  a situation,  the  mass  M of  the  atom  (or  of 
the  entile  crystal)  must  replace  the  electron  rest  mass  m0  in  the  equation 
A'  — A = (h/m0c)(  1 — cos  <fi).  Since  the  mass  of  an  atom  is  many  thousands 
of  times  larger  than  the  rest  mass  of  an  electron  (and  the  mass  of  a crystal  is 
very  much  larger  still),  the  resulting  Compton  shift  is  so  small  as  to  be  com- 
pletely unmeasurable.  Thus  there  is  no  significant  wavelength  shift  when 
light  is  scattered  in  this  way. 
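
To see how strongly the recoiling mass suppresses the shift, the estimate for visible light can be repeated with the electron rest mass replaced by the mass of an atom. This is only a sketch under stated assumptions: a scattering angle of 90° and a carbon atom of 12 atomic mass units (as in graphite) are assumed here, the constants are rounded, and the function name is illustrative.

```python
import math

H = 6.626e-34      # Planck's constant, J*s (rounded)
C = 2.998e8        # speed of light, m/s (rounded)
M_E = 9.109e-31    # electron rest mass, kg (rounded)
AMU = 1.661e-27    # atomic mass unit, kg (rounded)


def fractional_shift(lam, recoil_mass, phi_deg=90.0):
    """(lam' - lam)/lam from Compton's equation, with the mass of the
    recoiling object (electron, atom, ...) inserted in place of m0."""
    phi = math.radians(phi_deg)
    return H / (recoil_mass * C) * (1.0 - math.cos(phi)) / lam


lam_visible = 500e-9                      # typical visible wavelength, m
free_electron = fractional_shift(lam_visible, M_E)
carbon_atom = fractional_shift(lam_visible, 12 * AMU)   # assumed recoiling atom

print(f"recoiling free electron: {free_electron:.1e}")
print(f"recoiling carbon atom:   {carbon_atom:.1e}")
print(f"suppression factor:      {free_electron / carbon_atom:.0f}")
```

With these inputs the free-electron value is of order $10^{-6}$, while the whole-atom value is smaller by the ratio of the atomic mass to the electron mass, a factor of more than ten thousand for carbon.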

But  even  with  radiation  of  shorter  wavelength,  and  therefore  of 
higher photon energy, there can be scattering with no wavelength shift.
The presence of the unshifted peaks in Fig. 30-7 shows this to be so. Its origin
is explained by essentially the same argument as the one in the preceding
paragraph. Even if the energy of a photon is higher than the binding
energy  of  the  most  tightly  bound  atomic  electron,  there  is  a chance  that  the 
electron  will  not  be  liberated  from  the  atom  when  a photon  interacts  with  it. 
If  this  happens,  the  atom  as  a whole  will  recoil,  and  the  wavelength  shift 
will  be  negligibly  small.  The  process  is  called  Rayleigh  scattering.  The 
dependence  on  photon  energy  of  the  probability  for  scattering  with  signifi- 
cant wavelength  shift  is  not  the  same  as  the  dependence  on  photon  energy 
of  the  probability  of  Rayleigh  scattering.  At  sufficiently  high  photon  en- 
ergies the  wavelength-shifting  process  is  much  more  likely  to  occur  than 
Rayleigh  scattering,  and  so  the  Compton-shifted  peak  dominates  the  spec- 
trum of  the  scattered  radiation.  Thus  the  scattering  process  becomes  what 
is  called  pure  Compton  scattering  at  sufficiently  high  photon  energies,  that 
is,  at  sufficiently  short  wavelengths. 

The  question  of  whether  an  electron  is  bound  to  an  atom  when  a photon  in- 
teracts with  it  is  also  central  to  understanding  the  difference  between  the  photo- 
electric effect  and  Compton  scattering.  It  is  easy  to  show  that  total  relativistic 
energy  and  momentum  cannot  both  be  conserved  if  a free  electron  completely  ab- 
sorbs a photon  in  a photoelectric  process.  You  should  do  this.  For  a photon  to  be 
absorbed,  the  electron  must  be  bound  to  an  atom.  The  binding  forces  make  it  pos- 
sible to  transfer  to  the  atom  (and  then  to  the  crystal  if  the  atom  is  bound  in  a 
crystal)  whatever  momentum  is  required  to  satisfy  momentum  conservation  in  the 
photoelectric  process.  The  momentum  transfer  was  ignored  in  the  energy-transfer 
considerations  that  led  to  the  equation  obtained  in  Einstein’s  analysis  of  the  pho- 
toelectric effect,  Eq.  (30-4).  But  this  is  justifiable  because  the  recoiling  atom  (or 
crystal)  has  such  a large  mass  compared  to  the  mass  of  an  electron.  The  kinetic  en- 
ergy of  the  nonrelativistic  recoiling  motion  is 

$$K = \frac{Mv^2}{2} = \frac{(Mv)^2}{2M} = \frac{p^2}{2M}$$


where $p = Mv$ is momentum. So the amount of energy $K$ transferred to the atom is
small, even when an appreciable amount of momentum $p$ is transferred, because
the mass $M$ of the atom receiving the momentum is large. The probability that an
atomic electron will remain bound decreases when the energy of a photon
interacting with it is increased. Consequently, the probability of the photoelectric
effect decreases in comparison to that of Compton scattering when the frequency of
the incident radiation becomes higher and its wavelength becomes shorter.


30-4 RECENT EVIDENCE FOR THE EXISTENCE OF PHOTONS


The  basic  concept  in  Einstein’s  and  Compton’s  theories  of  the  interaction 
of  electromagnetic  radiation  with  matter  is  that  a beam  of  radiation  com- 
prises localized  bundles  of  radiation,  the  photons.  Although  initially 
many  physicists  were  reluctant  to  accept  Einstein’s  use  of  the  photon  con- 
cept to  explain  the  photoelectric  effect,  the  reluctance  evaporated  after 
Compton  showed  that  it  also  explained  Compton  scattering.  As  we  will 
soon  describe,  de  Broglie’s  acceptance  of  the  existence  of  photons  was  cen- 
tral in  his  work  leading  to  the  development  of  quantum  mechanics. 

But  recently  John  Clauser,  Edward  Jaynes,  and  others  have  called  at- 
tention to  the  fact  that  neither  the  photoelectric  effect  nor  Compton  scat- 
tering can  be  used  to  prove  unambiguously  that  energy  and  momentum  in 
electromagnetic  radiation  are  lumped  into  localized  photons,  in  contrast  to 
being  spread  over  broad  wave  fronts.  This  is  so  because  in  the  years  that 
have  passed  since  the  work  of  Einstein  and  Compton,  adequate  explana- 
tions of  the  observed  properties  of  the  photoelectric  effect  and  Compton 
scattering  have  been  developed  which  do  not  involve  assuming  that  photons 
exist.  The  photoelectric  effect  can  be  explained  on  the  basis  of  Maxwell’s 
nonphoton  description  of  electromagnetic  radiation,  provided  account  is 
taken  of  energy  quantization  in  the  material  whose  surface  absorbs  the 
radiation.  Compton  scattering  can  be  explained  without  assuming  that 
electromagnetic  radiation  is  composed  of  photons,  if  use  is  made  of  the 
uncertainty  principles.  We  cannot  go  into  these  alternative  explanations 
here  because  we  have  not  yet  studied  either  energy  quantization  in  matter 
or  the  uncertainty  principles.  But  there  is  no  difficulty  in  sketching  the  es- 
sential idea  of  an  experiment  which  proves  that  Einstein  and  Compton 
were  right,  after  all.  Their  explanations  of  the  photoelectric  effect  and 
Compton  scattering  are  the  correct  explanations  because  photons  really  do 
exist! 

There  are  several  modern  experiments  which  show  unambiguously 
that  radiation  is  composed  of  localized  photons.  They  all  have  the  feature 
that, in one way or another, they are concerned with the simultaneous detection
of two photons. The most straightforward of these so-called photon
correlation  experiments  was  reported  by  Clauser  in  1974.  Figure  30-9  is  a 
considerably  simplified  schematic  of  Clauser’s  apparatus.  Light  from  a 
low-intensity  source  passes  through  a lens,  which  forms  it  into  a beam.  The 
beam  is  incident  on  a “beam  splitter,”  consisting  of  a lightly  silvered  glass 
plate  inclined  to  the  beam  at  a 45°  angle  (just  as  in  a Michelson  interfer- 
ometer). Half  of  the  intensity  of  the  beam  is  reflected  into  the  light  detector 
shown  below  the  beam  splitter,  and  half  is  transmitted  through  it  to  the  de- 
tector shown  to  its  right.  Each  detector  is  a “photomultiplier,”  comprising  a 
photoelectric  surface  followed  by  a system  of  electrodes  which  amplifies  by 
a very  large  factor  any  electron  current  emitted  from  the  surface.  The  de- 
tectors are  so  sensitive  that  they  have  some  chance  of  responding  to  the 
arrival of a single photon. Measurements are made of the “coincidence
rate.” This is the number of times per second that the two detectors
respond simultaneously, within the time resolution of the electronic
circuitry used to judge simultaneity.

Fig. 30-9 Schematic drawing of Clauser's apparatus. The open dots represent
photons. One has been reflected by the beam splitter, and is about to
strike the lower photomultiplier. The other photon has a 50 percent chance
of being reflected by the beam splitter into the lower photomultiplier and a
50 percent chance of being transmitted through it into the photomultiplier
on the right. Not indicated are features of the apparatus used to verify
experimentally that if the two photomultipliers were to respond simultaneously,
the so-called coincidence would be properly recorded by the coincidence
recorder. Actually, a low coincidence rate is observed. But it is accounted
for completely by accidental events. The experiment demonstrates
the existence of photons localized in regions of space which are at least
small enough that any single photon cannot actuate both of the separated
photon detectors.

If light has the photon structure that Einstein and Compton believed it
to have, then the coincidence rate should be very low. The reason is that
each photon incident on the beam splitter will go into one detector or into
the other. The photon cannot go into both detectors because it occupies a
small region of space. So the two detectors should not respond in
coincidence, except in an accidental event in which two photons are incident
on the beam splitter within a time interval less than the experimental
time resolution, one being transmitted and the other reflected, and both
detectors responding. If light has the continuous structure described by Maxwell,
then light is always arriving simultaneously at both detectors and the
coincidence rate should be much higher. Detailed analysis of the very low
coincidence rate found in the measurements showed conclusively that light
is composed of photons.


30-5 THE WAVELIKE MOTION OF PHOTONS


In  discussing  the  interaction  of  radiation  with  matter,  we  have  fallen  into 
the  habit  of  speaking  as  if  radiation  were  composed  of  photons,  not  just 
when  it  is  emitted,  scattered,  or  absorbed,  but  all  the  time.  This  is  the  pre- 
valent point  of  view.  After  all,  to  determine  the  constituents  of  radiation, 
we  must  let  it  transfer  energy  and  momentum  to  some  sort  of  radiation  de- 
tector. When  this  happens,  we  always  find  that  the  radiation  acts  like  a 
stream  of  photons.  Thus,  whenever  we  investigate  the  composition  of  radi- 
ation, we  study  how  it  interacts  with  a detector.  In  a suitably  designed 
experiment  we  find  that  radiation  has  particlelike  properties. 

On the other hand, we can also investigate where the radiation is when
we detect it. When we use suitable apparatus, we find that radiation has
wavelike  properties.  That  is,  radiation  traveling  through  a system  exhibits  the 
superposition  phenomena  that  are  the  hallmarks  of  waves,  if  the  character- 
istic dimension  of  the  system  is  comparable  to  the  wavelength  of  the  radia- 
tion. This  is  evidenced  by  the  fact  that  the  radiation  arrives  at  the  region 
where  it  is  detected  in  a diffraction  pattern.  Thus  the  particles  of  electro- 
magnetic radiation  do  not  travel  like  billiard  balls,  as  if  they  were  guided  by 
the laws of motion of the macroscopic particles of newtonian mechanics.
Instead,  they  travel  as  if  they  were  guided  by  the  laws  of  propagation  of 
waves. 




A fruitful  way  to  describe  this  particle-wave  duality  is  to  say  that  a 
photon  is  a particle  whose  motion  is  “guided”  by  an  associated  electromagnetic  wave. 
The associated electromagnetic wave for a photon of energy $E$ and momentum
$p$ has frequency $\nu$ and wavelength $\lambda$, which satisfy the relations
$E = h\nu$ and $p = h/\lambda$ of Eqs. (30-2) and (30-10). The wave guides the
photon  in  a sense  that  can  be  best  explained  operationally  by  describing  a 
procedure  for  predicting  the  passage  of  photons  through  a system.  The 
description  follows. 

After the wavelength $\lambda$ of the electromagnetic wave associated with a
photon  is  evaluated,  a diffraction  calculation  is  performed  to  determine 
how  the  wave  propagates  through  the  system.  In  some  cases  this  need  be 
only  a partial  calculation  leading  to  a determination  of  where  the  maxima 
and  minima  in  the  diffraction  pattern  will  occur,  as  we  did  for  single-slit 
diffraction  at  the  beginning  of  Sec.  28-8.  But  to  obtain  complete  informa- 
tion about  the  motion  of  the  photon  requires  a complete  calculation  of  the 
full  form  of  the  diffraction  pattern  in  the  region  where  the  photon  is  to  be 
detected,  as  we  did  for  single-slit  diffraction  at  the  end  of  Sec.  28-8.  You 
will  recall  that  a complete  diffraction  calculation  amounts  to  finding  the 
amplitude  of  the  total  wave  at  various  locations  in  the  detection  region  by 
summing  the  amplitudes  of  its  component  waves  and  then  squaring  this 
amplitude  to  determine  the  intensity  at  each  of  these  locations. 
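
The complete diffraction calculation just described can be imitated numerically. The sketch below is an illustration under stated assumptions, not the calculation of Sec. 28-8 itself: it assumes the far-field limit, replaces the slit by a large number of equally spaced point sources, and uses illustrative values for the wavelength, slit width, and number of sources. It sums the component amplitudes as complex numbers, squares the magnitude of the sum, and compares the result with the standard closed-form single-slit expression $(\sin\beta/\beta)^2$, where $\beta = \pi d \sin\theta/\lambda$ and $d$ is the slit width (the symbols are chosen here).

```python
import cmath
import math

WAVELENGTH = 500e-9   # illustrative wavelength, m
SLIT_WIDTH = 5e-6     # illustrative slit width, m


def slit_intensity(theta, wavelength, slit_width, n_sources=2000):
    """Relative far-field intensity at angle theta (radians).

    The slit is modeled as n_sources point sources. Their wave
    amplitudes are summed with the phase each acquires from its
    extra path length, and the summed amplitude is then squared.
    """
    k = 2.0 * math.pi / wavelength
    total = 0j
    for j in range(n_sources):
        # position of source j, measured from the center of the slit
        y = (j + 0.5) / n_sources * slit_width - slit_width / 2.0
        total += cmath.exp(1j * k * y * math.sin(theta))
    amplitude = abs(total) / n_sources    # normalized so the center is 1
    return amplitude ** 2


def slit_intensity_closed_form(theta, wavelength, slit_width):
    """Standard single-slit result (sin(beta)/beta)^2."""
    beta = math.pi * slit_width * math.sin(theta) / wavelength
    return 1.0 if beta == 0.0 else (math.sin(beta) / beta) ** 2


for theta in (0.0, 0.05, 0.10, 0.15, 0.20):
    summed = slit_intensity(theta, WAVELENGTH, SLIT_WIDTH)
    exact = slit_intensity_closed_form(theta, WAVELENGTH, SLIT_WIDTH)
    print(f"theta = {theta:.2f} rad  summed = {summed:.5f}  closed form = {exact:.5f}")
```

The partial calculation mentioned above corresponds to locating only the zeros of this pattern, the first of which falls where $\sin\theta = \lambda/d$.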

The intensity pattern obtained from the diffraction calculation is interpreted
as giving the probability that at any instant a photon would be found
in the immediate neighborhood of each location. That is, the probability of
finding a photon near a point is proportional to the squared amplitude, or intensity,
of the associated electromagnetic wave near that point. Justification can be found
in our study of the energy content of electromagnetic waves in Sec. 27-5.
We saw there that the energy in a small volume surrounding some point in
a region containing an electromagnetic wave is proportional to the square
of its electric field or its magnetic field. In other words, the energy in the
volume is proportional to the intensity of the wave in that volume. But if
there are photons in the volume, the energy it contains is also proportional
to the number of photons present. So the probability of finding a photon
in the volume must be proportional to the intensity of the associated
wave in the volume.

This relation leads to the prediction that where the intensity of the diffraction
pattern is high, there will be a high probability of finding a photon,
and where the intensity is low, there will be a low probability of finding a
photon. That is, photons will tend to arrive at the film (or other detector
used  to  observe  the  diffraction  pattern)  at  locations  where  the  intensity  of 
the  diffraction  pattern  is  high.  Note  that  the  exact  location  at  which  the 
photon  will  arrive  is  not  predicted.  What  is  predicted  is  that  it  will  be  detected 
at  some  location — acting  as  a localized  particle  when  it  interacts  with  the 
detector — and  that  this  location  is  likely  near  where  the  associated  wave  has  a 
high  intensity. 
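
The probabilistic interpretation can be made concrete with a small simulation. The sketch below is an assumption-laden illustration, not a description of any particular experiment: it takes the standard far-field single-slit intensity as the probability distribution, uses illustrative values for the wavelength, slit width, and slit-to-detector distance, and draws arrival positions by rejection sampling (a sampling technique chosen here for simplicity). All function names are hypothetical.

```python
import math
import random

WAVELENGTH = 500e-9   # illustrative wavelength, m
SLIT_WIDTH = 5e-6     # illustrative slit width, m
DISTANCE = 1.0        # assumed slit-to-detector distance, m
HALF_WIDTH = 0.3      # half-width of the detector region considered, m


def relative_intensity(x):
    """Far-field single-slit intensity at detector position x (max = 1)."""
    sin_theta = x / math.hypot(x, DISTANCE)
    beta = math.pi * SLIT_WIDTH * sin_theta / WAVELENGTH
    return 1.0 if beta == 0.0 else (math.sin(beta) / beta) ** 2


def sample_arrivals(n_photons, rng):
    """Draw arrival positions with probability proportional to intensity.

    Rejection sampling: propose x uniformly across the detector and
    accept it with probability equal to the relative intensity at x.
    """
    arrivals = []
    while len(arrivals) < n_photons:
        x = rng.uniform(-HALF_WIDTH, HALF_WIDTH)
        if rng.random() < relative_intensity(x):
            arrivals.append(x)
    return arrivals


def histogram(arrivals, n_bins=15):
    """Count arrivals in n_bins equal bins spanning the detector."""
    counts = [0] * n_bins
    for x in arrivals:
        i = int((x + HALF_WIDTH) / (2.0 * HALF_WIDTH) * n_bins)
        counts[min(i, n_bins - 1)] += 1
    return counts


rng = random.Random(0)   # fixed seed so the run is repeatable
for n in (1, 10, 100):
    print(n, histogram(sample_arrivals(n, rng)))
```

A single simulated photon lands in just one bin and reveals nothing of the pattern, whereas the histogram for many photons tends to pile up in the central bins where the wave intensity is highest. No individual arrival position is predicted, only the distribution.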

If  only  one  photon  travels  through  the  system,  its  particular  point  of 
arrival  cannot  have  the  appearance  of  a diffraction  pattern.  But  after  a 
number  of  photons  have  traversed  the  system,  their  arrival  locations  form  a 
distribution  that  is  the  same  as  the  intensity  distribution  of  the  wave  asso- 
ciated with  each  photon.  This  is  illustrated  in  Fig.  30-10.  Thus  this  proce- 
dure for  treating  photon  motion  predicts  that  the  energy  arriving  at  the  de- 
tector is  distributed  in  space  in  a way  that  is  in  complete  agreement  with  the 
results of the diffraction experiments, all of which involve the arrival of
many photons at the detector. Also, the procedure predicts that the energy
arrives in localized bundles, in agreement with the experiments demonstrating
the existence of photons.

Fig. 30-10 The development of a single-slit diffraction pattern.
(a) The intensity pattern at the detection plane is predicted
by a diffraction calculation for the single-slit system
shown in the figure. Simulated bar graphs showing the
arrival locations of (b) the first photon, (c) the first 10
photons, and (d) the first 100 photons arriving at the detection
plane.

An  important  feature  of  the  procedure  is  that  each  photon  is  treated  indepen- 
dently. This  is  in  agreement  with  an  experiment  performed  early  in  this  century 
by  G.  I.  Taylor.  He  set  up  a standard  light  diffraction  apparatus  and  obtained  a dif- 
fraction pattern  on  the  photographic  film  used  to  detect  the  light.  Then  he  reduced 
the  light  intensity  to  such  a low  value  that  photons  were  passing  through  the 
apparatus  one  at  a time.  After  a very  long  exposure,  he  obtained  exactly  the  same 
diffraction  pattern  on  the  film.  In  this  way,  he  proved  that  the  diffraction  pattern 
governing  the  motion  of  some  photon  arises  from  a superposition  of  different  parts 
of  the  wave  associated  with  that  photon.  The  diffraction  pattern  is  not  due  to  a su- 
perposition of  the  wave  associated  with  one  photon  and  the  wave  associated  with 
another  photon. 

An  even  more  important  feature  of  the  procedure  is  that  it  deals  with  proba- 
bilities. It  attempts  to  predict  not  the  exact  number  of  photons  that  will  arrive  in  a 
certain  region  where  they  are  detected,  only  the  probable  number.  If  a diffraction 
experiment  is  repeated  many  times  under  conditions  that  are  as  identical  as  pos- 
sible, the  actual  number  of  photons  arriving  in  a particular  region  each  time  will 
fluctuate about an average value. It is the average value only which is predicted by
the  procedure  for  treating  the  motion  of  photons.  The  situation  is  similar  to  that 
found  in  the  kinetic  theory  of  gases,  where  the  number  of  molecules  per  unit  time 
striking  a certain  region  of  the  wall  of  a vessel  containing  the  gas  fluctuates  about 


30-5  The  Wavelike  Motion  of  Photons  1455 


some  average  value.  Kinetic  theory  is  concerned  with  that  average  value  only,  but 
this  is  enough  to  determine  the  pressure  on  the  wall  and  in  the  gas.  Just  as  is  the 
case  for  the  gas  molecules  arriving  in  a certain  region,  the  fluctuations  about  its 
average  value  in  the  number  of  photons  arriving  become  more  significant  the 
smaller  that  value  is,  and  less  significant  the  larger  it  is.  So  if  the  total  number  of 
photons  involved  in  a diffraction  experiment  is  small,  the  diffraction  pattern  will 
give  only  an  approximation  to  the  actual  observed  distribution  of  arrival  locations. 
Also,  the  granularity  in  the  pattern  of  energy  deposition  on  the  detector  will  be 
very  apparent.  But  if  the  total  number  of  photons  is  very  large,  any  region  of  the 
detector  big  enough  to  be  resolvable  will  have  been  struck  by  such  a large  number 
of  photons  that  the  fluctuations  in  that  number  will  be  negligible.  Then  energy 
will  be  deposited  on  the  detector  in  exactly  the  same  pattern  that  Maxwell  would 
have  predicted. 
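
The way the fluctuations shrink as the number of photons grows can be illustrated with a simple counting experiment. This is a sketch under assumptions made here, not a model of a real detector: each photon is taken to land independently in a chosen region with a fixed probability of 0.1, the experiment is repeated 200 times for each total number of photons, and the function names are hypothetical.

```python
import random
import statistics

P_REGION = 0.1     # assumed probability that a photon lands in the region
N_REPEATS = 200    # number of identical repetitions of the experiment


def count_in_region(n_photons, rng):
    """Number of photons, out of n_photons, that land in the region."""
    return sum(1 for _ in range(n_photons) if rng.random() < P_REGION)


def relative_fluctuation(n_photons, rng):
    """Standard deviation of the count divided by its average value,
    estimated from N_REPEATS repetitions."""
    counts = [count_in_region(n_photons, rng) for _ in range(N_REPEATS)]
    return statistics.pstdev(counts) / statistics.mean(counts)


rng = random.Random(0)   # fixed seed so the run is repeatable
results = {n: relative_fluctuation(n, rng) for n in (50, 500, 5000)}
for n, value in results.items():
    print(f"{n:5d} photons: relative fluctuation = {value:.3f}")
```

In this model the relative fluctuation falls roughly as one over the square root of the average count, so it is large when only a few photons reach the region and small when many do.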

Finally,  it  should  be  emphasized  that  the  procedure  for  predicting  the  motion 
of  a photon  is  based  on  the  view  that  it  is  the  photon  which  has  a physical  exis- 
tence, in  the  sense  of  being  detectable.  The  electromagnetic  wave  is  looked  on  as 
having  a mathematical  existence,  in  that  it  is  used  as  a tool  to  calculate  the  pre- 
dicted motion  of  the  photon. 


30-6 MATTER WAVES

In his 1924 doctoral thesis Louis de Broglie first expressed the idea that nature
treats all particles in the same way with regard to particle-wave duality, whether
their  rest  mass  is  zero  or  not.  He  said  that  the  essential  aspects  of  the  procedure 
just  described  for  predicting  the  motion  of  the  particle  of  zero  rest  mass, 
called  the  photon,  can  be  used  to  predict  the  motion  of  a particle  of  non- 
zero rest  mass,  such  as  an  electron.  Specifically,  de  Broglie  postulated  that  a 
particle  of  nonzero  rest  mass  moves  as  if  it  were  guided  by  an  associated  matter  wave. 
Furthermore,  he  postulated  that  the  energy  E and  momentum  p of  the  particle  are 
related to the frequency $\nu$ and wavelength $\lambda$ of the wave by the equations

$$E = h\nu \tag{30-17}$$

and

$$p = \frac{h}{\lambda} \tag{30-18}$$

These  equations  are  identical  to  Eqs.  (30-2)  and  (30-10)  relating  the  charac- 
teristics of  a photon  and  its  associated  electromagnetic  wave.  Thus  they  are 
used  for  both  the  zero-rest-mass  particles  of  radiation  and  the  nonzero- 
rest-mass  particles  of  matter.  They  are  now  called  the  Einstein-de  Broglie 
relations.  In  many  cases  only  Eq.  (30-18)  is  needed  to  predict  the  wavelike 
motion  of  particles  since  diffraction  patterns  and  similar  phenomena  are 
governed  by  the  wavelength  of  a wave,  not  its  frequency.  If  only  that  equa- 
tion is  being  employed,  it  is  usually  called  the  de  Broglie  relation,  particu- 
larly if  it  is  being  applied  to  a material  particle. 

Now,  de  Broglie  said  that  the  role  of  matter  waves  in  guiding  the  mo- 
tion of  material  particles  is  analogous  to  that  of  electromagnetic  waves  in 
guiding  the  motion  of  photons.  In  1926  Max  Born  took  the  next  step, 
saying  that  the  relation  between  a matter  wave  and  its  associated  particle  is 
exactly  the  same  as  that  between  an  electromagnetic  wave  and  its  associated 
photon.  According  to  Born’s  postulate,  the  probability  of  finding  a material 
particle near a point is proportional to the squared amplitude, or intensity, of the
associated matter wave near that point. Thus a matter wave is a mathematical
construct that can be used in calculations of diffraction patterns and other
wave propagation phenomena, for the purpose of predicting the motion of
a particle  of  matter  in  the  same  way  that  an  electromagnetic  wave  can  be 
used  in  such  calculations  to  predict  the  motion  of  a particle  of  radiation. 

As  for  the  other  aspect  of  particle-wave  duality,  it  goes  without  saying 
that  material  particles  act  like  particles  when  they  interact  with  anything.  In 
other  words,  when  interacting,  they  behave  like  localized  bundles  of  energy 
and  momentum,  just  as  is  the  case  for  interacting  particles  of  radiation.  But 
in  both  cases  if  there  are  very  many  particles  interacting,  the  presence  of 
individual  ones  can  be  difficult  to  detect. 

Stress  has  been  put  on  the  close  relation  between  matter  waves  and  electro- 
magnetic waves.  But  this  should  not  lead  you  to  conclude  that  all  the  properties  of 
the  two  types  of  guiding  waves  are  the  same.  It  is  not  so,  any  more  than  it  is  so  that 
all  the  properties  of  the  nonzero-  and  zero-rest-mass  particles  whose  motion  they 
govern  are  the  same.  In  more  advanced  treatments  of  the  properties  of  matter  waves 
than  that  in  this  book,  a number  of  features  are  developed  which  are  distinctly  dif- 
ferent from  those  of  electromagnetic  waves.  For  instance,  the  latter  always  travel 
at speed $c$ while the former never do.

Convincing  experiments  confirming  de  Broglie's  revolutionary  idea 
were  performed  in  1925  and  1927.  Before  they  are  described,  it  should  be 
explained  why  the  wavelike  motion  of  material  particles  had  not  been  ob- 
served in  all  the  experimental  work  involving  electrons,  and  other  material 
particles,  done  in  the  preceding  years.  The  main  point  is  the  same  one  that 
was  discussed  at  length  in  Sec.  29-1,  in  connection  with  what  we  would  now 
call the wavelike motion of photons of visible light. The wavelike properties
of  the  motion  of  anything  traveling  through  a system  can  be  observed 
experimentally  only  if  the  associated  wave  is  diffracted  by  an  observable 
amount. This means that the angle $\theta$ characterizing the diffraction pattern
produced when the wave travels through the system cannot be too small.
According to Eqs. (28-21) and (28-25), the value of $\theta$ is related to the wavelength
$\lambda$ of the wave and the characteristic dimension $d$ of the system as
follows:

$$\theta \simeq \frac{\lambda}{d}$$

A specific  example  of  a characteristic  dimension  d would  be  the  width  of  a 
slit in a single-slit diffraction apparatus. For diffraction to be detectable, $\theta$
cannot be too small. Therefore $\lambda$ cannot be too small compared to $d$ if the
wave  is,  so  to  speak,  to  move  like  a wave. 

In fact, if the wavelength $\lambda$ of waves passing through a system is very
small  compared  to  the  characteristic  dimension  d of  the  system,  the  wave 
appears  to  travel  along  well-defined  paths.  These  are  what  we  have  called 
rays  in  the  case  of  light.  For  such  a case  the  associated  particles  also  appear 
to  follow  well-defined  paths.  That  is,  the  light  photons  traveling  through 
such  a system  seem  to  move  along  trajectories  in  essentially  the  same  way 
that  macroscopic  particles  like  billiard  balls  follow  trajectories.  The  wave- 
like motion  of  photons  will  be  observed  only  if  their  wavelength  is  compa- 
rable to  the  characteristic  dimension  of  the  apparatus  used  to  demonstrate 
their  wavelike  motion.  The  same  is  true  of  material  particles.  So  the  key  to 
understanding  under  what  circumstances  de  Broglie's  wavelike  motion  of 
material particles can be observed lies in evaluating the wavelength $\lambda$ from
his Eq. (30-18), written in the form

$$\lambda = \frac{h}{p} \tag{30-19}$$

This wavelength is known as the de Broglie wavelength. A numerical value
of  a de  Broglie  wavelength  is  worked  out  in  Example  30-4. 


EXAMPLE  30-4 

Evaluate  the  de  Broglie  wavelength  of  a 1.0-kg  billiard  ball  moving  at  5.0  m/s. 
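■ Using Eq. (30-19), you have

$$\lambda = \frac{h}{p} = \frac{h}{mv} = \frac{6.6 \times 10^{-34}\ \text{J}\cdot\text{s}}{1.0\ \text{kg} \times 5.0\ \text{m/s}}$$

or

$$\lambda = 1.3 \times 10^{-34}\ \text{m}$$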

Can  you  estimate  how  much  smaller  than  this  value  is  the  de  Broglie  wave- 
length of  the  earth  moving  around  the  sun? 
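
The result of Example 30-4, and one possible approach to the estimate suggested above, can be sketched in a few lines. The mass and orbital speed of the earth used below are rounded values assumed here, not taken from the text, and the function name is illustrative.

```python
H = 6.626e-34   # Planck's constant, J*s (rounded)


def de_broglie_wavelength(mass_kg, speed_m_per_s):
    """Nonrelativistic de Broglie wavelength, lambda = h / (m v)."""
    return H / (mass_kg * speed_m_per_s)


# Billiard ball of Example 30-4
ball = de_broglie_wavelength(1.0, 5.0)

# Assumed rounded values for the earth (not from the text)
EARTH_MASS = 5.97e24            # kg
EARTH_ORBITAL_SPEED = 2.98e4    # m/s
earth = de_broglie_wavelength(EARTH_MASS, EARTH_ORBITAL_SPEED)

print(f"billiard ball: {ball:.1e} m")
print(f"earth:         {earth:.1e} m")
print(f"ratio:         {ball / earth:.1e}")
```

Because the wavelength is inversely proportional to the momentum, the ratio of the two wavelengths is simply the ratio of the two momenta.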


Because  Planck's  constant  is  extremely  small  on  the  macroscopic  scale, 
the de Broglie wavelength $\lambda \simeq 10^{-34}$ m found in Example 30-4 for a typical
moving macroscopic particle is extremely small compared to the dimensions
of the smallest systems known in nature. For instance, the diameter of
a typical atomic nucleus is $d \simeq 10^{-14}$ m. Even if a diffraction apparatus
could be obtained with a slit width $d$ on the nuclear scale (and if some way
could be found to get the billiard ball through the slit), the angle $\theta$
characterizing the diffraction pattern that would be produced in an attempt to
use it to demonstrate the wavelike properties of a billiard ball would be
$\theta \simeq \lambda/d \simeq 10^{-34}\ \text{m}/10^{-14}\ \text{m} = 10^{-20}$ rad. This is so small that there is absolutely
no  way  of  telling  that  any  diffraction  at  all  has  taken  place.  Thus  de 
Broglie's  postulate  predicts  no  observable  diffraction  effects  in  the  motion 
of  macroscopic  particles,  and  there  is  no  conflict  between  the  postulate  and 
any  of  the  standard  macroscopic  experiments  of  the  nonquantum  domain. 

Example  30-5  works  out  the  value  of  another  de  Broglie  wavelength. 


EXAMPLE 30-5

Evaluate  the  de  Broglie  wavelength  of  an  electron  whose  kinetic  energy  is  54.0  eV. 

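■ Since the electron is nonrelativistic, you can use the relation $K = p^2/2m$ and
obtain

$$\lambda = \frac{h}{p} = \frac{h}{\sqrt{2mK}} = \frac{6.63 \times 10^{-34}\ \text{J}\cdot\text{s}}{(2 \times 9.11 \times 10^{-31}\ \text{kg} \times 54.0\ \text{eV} \times 1.60 \times 10^{-19}\ \text{J/eV})^{1/2}}$$

$$= 1.67 \times 10^{-10}\ \text{m} = 0.167\ \text{nm}$$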
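
The same calculation can be repeated for other kinetic energies. The sketch below assumes nonrelativistic electrons, which is reasonable only for kinetic energies far below the electron rest energy of about 0.5 MeV; the constants are rounded, the function name is illustrative, and the energies other than 54.0 eV are arbitrary values chosen for comparison.

```python
import math

H = 6.626e-34     # Planck's constant, J*s (rounded)
M_E = 9.109e-31   # electron rest mass, kg (rounded)
EV = 1.602e-19    # joules per electron volt (rounded)


def electron_wavelength_nm(kinetic_energy_ev):
    """Nonrelativistic de Broglie wavelength in nm:
    lambda = h / sqrt(2 m K)."""
    momentum = math.sqrt(2.0 * M_E * kinetic_energy_ev * EV)
    return H / momentum * 1e9


for energy in (13.5, 54.0, 216.0):   # eV; 54.0 eV is the case of Example 30-5
    print(f"K = {energy:6.1f} eV  lambda = {electron_wavelength_nm(energy):.3f} nm")
```

Since the wavelength varies as $1/\sqrt{K}$, quadrupling the kinetic energy halves the wavelength.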


Because the mass of an electron is very small, the de Broglie wavelength
$\lambda \simeq 10^{-10}$ m of the moderate-energy electron considered in Example
30-5 is much larger than that of the billiard ball. Although it is still
small on the macroscopic scale of sizes, it is not small on the microscopic
scale. For example, the diameter of a typical atom is also about $10^{-10}$ m.
This  suggests  that  it  should  be  possible  to  use  the  regular  arrangement  of 
atoms  in  a crystal  as  a diffraction  grating  of  the  appropriate  slit  spacing, 
since  the  atoms  are  separated  by  distances  comparable  to  the  atomic  di- 
ameter. 
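
The contrast between the two examples can be summarized with the order-of-magnitude estimate $\theta \simeq \lambda/d$. The sketch below uses the wavelengths found in Examples 30-4 and 30-5 together with the characteristic dimensions quoted in the text; the function name is illustrative, and the estimate is meant only as an order of magnitude, since it is not an accurate angle once it approaches 1 rad.

```python
def diffraction_angle(wavelength_m, dimension_m):
    """Order-of-magnitude diffraction angle in radians, theta ~ lambda/d."""
    return wavelength_m / dimension_m


# Billiard ball (Example 30-4) and a slit of nuclear size
ball_angle = diffraction_angle(1.3e-34, 1e-14)

# 54.0-eV electron (Example 30-5) and an atomic spacing of about 1e-10 m
electron_angle = diffraction_angle(1.67e-10, 1e-10)

print(f"billiard ball: theta ~ {ball_angle:.0e} rad")
print(f"electron:      theta ~ {electron_angle:.0e} rad")
```

For the billiard ball the angle is immeasurably small, whereas for the electron it is of order 1 rad, which is why a crystal is a plausible diffraction grating for electrons.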


Fig. 30-11 (a) A schematic drawing of the Davisson-Germer electron
diffraction experiment. The entire apparatus is in vacuum. (b) A magnified
schematic drawing of the nickel crystal. (From F. Richtmyer, E. Kennard,
and J. Cooper, Introduction to Modern Physics, 6th ed., McGraw-Hill,
New York, 1969.)


In  1925  Clinton  Davisson  and  Lester  Germer  were  investigating  an  un- 
expected effect  they  had  found  in  the  scattering  of  electrons  from  nickel. 
Upon  trying  a nickel  crystal,  they  saw  a pattern  in  the  scattered  electrons 
with  features  reminiscent  of  a diffraction  pattern.  They  had  not  heard  of 
de  Broglie’s  postulate  at  the  time.  In  due  course  they  realized  that  it  was  a 
diffraction  pattern  and  that  it  agreed  quantitatively  with  the  predictions  of 
de Broglie. Their apparatus is shown in Fig. 30-11. Electrons emitted from
a heated  cathode  C into  an  evacuated  region  are  accelerated  toward  an 
anode  A by  the  positive  potential  difference  V applied  between  A and  C. 
Those  passing  through  a hole  in  the  anode  form  a beam  of  electrons  that  is 
incident  normally  on  the  surface  of  the  nickel  crystal.  Some  of  these  elec- 
trons are  scattered  into  an  electron  collector.  The  collector  consists  of  a 
plate  P located  behind  a diaphragm  D.  A negative  retarding  voltage, 
slightly  lower  in  magnitude  than  the  accelerating  voltage  V,  is  applied 
between  P and  D.  Thus  only  electrons  scattered  into  the  collector  with  en- 
ergy almost  equal  to  the  energy  of  the  electrons  incident  on  the  crystal  are 
collected  and  contribute  to  the  current  i read  by  the  meter.  Since  low- 
energy  electrons  lose  energy  very  rapidly  when  traveling  through  a solid, 
the  collected  electrons  must  have  been  scattered  from  the  crystal  surface  or 
from  very  near  the  surface. 

At  each  of  several  different  accelerating  voltages  V,  the  current  i was 
measured as a function of the angle $\phi$ between the direction of the scattered
electrons and the normal to the surface. Data were taken with the
crystal  oriented  so  that  the  rows  of  atoms  in  the  surface  were  at  right  angles 
to  the  plane  containing  the  incident  and  scattered  electron  directions.  The 
collector-current  versus  angle  data  are  presented  as  polar  plots  in  Fig. 
30-12. 

The data indicate that for all values of V most of the electrons are scattered
straight back, with φ = 0°, as if they had undergone mirror reflection
from the surface of the crystal. This information does not help tell whether
an electron moves like a particle or like a wave, since mirror reflection is
found in both cases. But when the accelerating voltage is around 50 V, a
strong peak develops in the scattering pattern at a direction inclined to the
normal. The peak reaches its maximum at 54 V, and at that value of V the
direction of the peak is φ = 50°.




Fig. 30-12 Polar plots of the detected
current versus the detector angle φ,
measured by Davisson and Germer
for the labeled values of electron
accelerating voltage. For each plot, the
distance along any direction from the
origin to the curve is proportional to
the current detected in that direction.
(From F. Richtmyer, E. Kennard, and J.
Cooper, Introduction to Modern Physics,
6th ed., McGraw-Hill, New York, 1969.)


The  peak  provides  a convincing  qualitative  verification  of  de  Broglie’s 
postulate  that  an  electron  moves  like  a wave.  It  can  be  explained  only  in  terms 
of  a constructive  superposition  of  parts  of  a wave  that  are  diffracted  from  the  uni- 
formly spaced  atoms  in  the  crystal  surface.  The  peak  is  explained  by  saying  that 
the  matter  wave  associated  with  an  incident  electron  is  diffracted  most 
strongly  along  lines  passing  through  each  row  of  surface  atoms,  since  that  is 
where  the  atomic  density  is  highest.  Thus  each  row  acts  like  a source  of 
waves  emitted  in  all  directions.  The  geometry  is  identical  to  that  of  a dif- 
fraction grating  used  in  the  reflection  mode;  see  Sec.  28-7.  In  a certain 
direction  the  many  scattered  waves  superpose  constructively  to  form  a total 
scattered  wave  of  maximum  intensity.  This  forms  a peak  in  the  scattering 
pattern  of  the  matter  wave  for  each  incident  electron  and  therefore  a peak 
in  the  overall  electron  scattering  pattern. 

Davisson  and  Germer  were  able  to  give  a quantitative  verification  of  de 
Broglie’s  postulate  by  using  their  matter-wave  reflection  grating  to  measure 
the  de  Broglie  wavelength  of  the  electrons  incident  on  it.  The  behavior  of 
the  grating  is  described  by  Eq.  (28-15),  which  here  we  write  as 

$$ j\lambda = D \sin\phi $$

In this expression the integer j identifies a peak in the diffraction pattern
(j = 1 for the most intense peak), λ is the wavelength of the diffracted
waves, D is the separation between the rows of atoms that act like lines on a
grating, and φ is the angle of the diffraction peak. The value of D is known
to be 0.215 nm. The value comes from applying Bragg’s law, Eq. (30-7), to
the diffraction of X rays of known wavelength and thus measuring the
spacing d between the inclined planes of atoms terminating on the surface
rows of the nickel crystal. The value of D is used in Eq. (28-15), assuming
j = 1. The resulting relation specifies the wavelength of the waves producing
the peak at φ = 50° to be

$$ \lambda = D \sin\phi = 0.215\ \text{nm} \times \sin 50^\circ = 0.165\ \text{nm} $$

Since  the  accelerating  voltage  used  was  54  V,  this  result  represents  the  mea- 
sured wavelength  of  the  waves  associated  with  electrons  of  kinetic  energy 
54  eV. 

The analysis above does not account completely for the observations of the
Davisson-Germer experiment. If all the diffraction took place at the surface, a
diffraction peak could be observed at any desired angle simply by adjusting the
energy of the incident electrons. In fact, however, peaks are observed only for certain
combinations of incident electron energies and collector angles. These critical
combinations depend, in turn, on the crystal orientation.

The  reason  is  that  not  all  the  electrons  are  reflected  from  the  surface  plane  of 
atoms.  At  the  energies  involved  (about  50  eV)  electrons  can  penetrate  the  crystal 
to  a depth  of  several  atomic  layers,  be  reflected,  and  then  emerge  with  sufficient 
energy  to  be  detected.  In  general,  however,  the  distance  between  the  crystal 
planes,  each  of  which  acts  like  a diffraction  grating,  is  such  that  the  electron 
waves  diffracted  from  each  will  superpose  destructively  with  the  waves  diffracted 
from the others, with the result that the individual diffraction maxima will be
more  or  less  wiped  out. 

But the superposition from the several layers will be constructive, and the
maximum therefore strong, if the combination of electron energy (and thus
wavelength) and collector angle used satisfies the condition for Bragg reflection. This
requires two things: (1) There must be a set of atomic planes in the crystal with
reasonably high electron density oriented at approximately one-half the collector
angle so that the angles between the atomic planes and the incident and reflected
beams have the same value θ. (2) The electron wavelength inside the crystal must
satisfy Bragg’s law, jλ = 2d sin θ, where d is the distance between that set of
planes.

Thus the Davisson-Germer experiment is a sort of hybrid between diffraction
by a stack of gratings and Bragg reflection. But this complication has no effect on
the evaluation λ = 0.165 nm of the electron wavelength outside the crystal,
obtained by considering only diffraction from the grating at the crystal surface.

In  Example  30-5  the  de  Broglie  wavelength  of  the  matter  wave  for  an 
electron  of  kinetic  energy  54  eV  was  predicted  from  the  de  Broglie  relation 
to  be  0.167  nm.  Davisson  and  Germer  measured  the  wavelength  to  be  0.165 
nm.  The  agreement  was  well  within  the  accuracy  of  the  measurement.  It 
constitutes  an  impressive  quantitative  verification  of  the  de  Broglie  postulate.  There 
really  are  wavelike  aspects  in  the  motion  of  electrons! 
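
The comparison can be reproduced numerically. The following is a minimal sketch, not part of the original text: the function names are hypothetical, the constants are rounded standard values, and the nonrelativistic relation p = sqrt(2mK) is assumed to be adequate at 54 eV.

```python
import math

H = 6.626e-34      # Planck's constant, J*s (rounded)
M_E = 9.109e-31    # electron rest mass, kg (rounded)
EV = 1.602e-19     # joules per electron volt (rounded)

def de_broglie_nm(kinetic_energy_ev):
    """Nonrelativistic de Broglie wavelength (nm) of an electron."""
    p = math.sqrt(2.0 * M_E * kinetic_energy_ev * EV)  # p = sqrt(2 m K)
    return H / p * 1e9

def grating_wavelength_nm(row_spacing_nm, angle_deg, order=1):
    """Wavelength (nm) from the grating relation j*lambda = D*sin(phi)."""
    return row_spacing_nm * math.sin(math.radians(angle_deg)) / order

predicted = de_broglie_nm(54.0)                 # from the de Broglie relation
measured = grating_wavelength_nm(0.215, 50.0)   # from D and the peak angle
print(f"predicted {predicted:.3f} nm, measured {measured:.3f} nm")
```

The two numbers differ by roughly one percent, which is the agreement described above.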

The  diffraction  of  electrons  transmitted  through  very  thin  crystals  was 
observed  in  1927  by  G.  P.  Thomson  (the  son  of  J.  J.  Thomson).  He  sent  a 
beam  of  electrons  through  a polycrystalline  metal  foil.  Its  thickness  was 
only  about  100  nm,  but  even  so  the  initial  kinetic  energy  of  the  electrons 
had to be about 10⁴ eV in order to keep small the fractional energy loss of
the  electrons  passing  through  the  foil.  Some  of  the  electrons  were  reflected 
from  the  atomic  planes  of  properly  oriented  crystals  of  the  foil  at  diffrac- 
tion angles  satisfying  Bragg’s  law,  Eq.  (30-7): 

$$ j\lambda = 2d \sin\theta $$

Several different values of atomic plane spacing d are found in the geometry
of even the simplest crystal (depending on the orientation of the crystal
with respect to the beam), and the integer j can assume different values.
Thus there will be a set of particular values of the angle θ at which diffraction
will occur. For each of these there is a corresponding value of the angle
2θ between the directions of the incident and diffracted electrons. (See Fig.
30-6.)

For a 1 × 10⁴ eV electron the de Broglie wavelength λ is about 1 ×
10⁻² nm, whereas the typical spacing d between the planes of atoms in a
crystal is about 1 × 10⁻¹ nm. Thus the values of λ/d in the experiment are
small, and so the deflection angles 2θ ≈ jλ/d are also small. But by placing
the  photographic  film  used  to  detect  the  electrons  a considerable  distance 
behind  the  foil,  Thomson  was  able  to  observe  a series  of  concentric  rings. 
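
The small-angle estimate can be checked with a short calculation. This is only a sketch under stated assumptions: the function names are hypothetical, the nonrelativistic wavelength formula is used (the relativistic correction at this energy is well under one percent), and the film distance of 0.30 m is an invented illustrative value, not a figure from Thomson's experiment.

```python
import math

H = 6.626e-34      # Planck's constant, J*s (rounded)
M_E = 9.109e-31    # electron rest mass, kg (rounded)
EV = 1.602e-19     # joules per electron volt (rounded)

def electron_wavelength_nm(kinetic_energy_ev):
    """Nonrelativistic de Broglie wavelength (nm) of an electron."""
    return H / math.sqrt(2.0 * M_E * kinetic_energy_ev * EV) * 1e9

def deflection_angle_rad(wavelength_nm, d_nm, order=1):
    """Exact deflection angle 2*theta from Bragg's law j*lambda = 2*d*sin(theta)."""
    return 2.0 * math.asin(order * wavelength_nm / (2.0 * d_nm))

lam = electron_wavelength_nm(1.0e4)      # wavelength at 1 x 10^4 eV
d = 0.1                                  # typical plane spacing, nm
exact = deflection_angle_rad(lam, d)     # exact 2*theta, rad
approx = lam / d                         # small-angle estimate j*lambda/d with j = 1

film_distance_m = 0.30                   # hypothetical foil-to-film distance
ring_radius_mm = film_distance_m * math.tan(exact) * 1e3

print(f"lambda = {lam:.4f} nm, 2*theta = {exact:.4f} rad (small-angle {approx:.4f} rad)")
print(f"first-order ring radius at {film_distance_m} m: {ring_radius_mm:.1f} mm")
```

With these inputs the exact and small-angle deflections agree to better than one percent, and the ring is a few centimeters in radius at the assumed film distance, which is why a distant film makes the pattern easy to record.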




Fig. 30-13 (a) Diffraction rings obtained by passing electrons of a certain wavelength
through an aluminum foil. (b) Diffraction rings obtained by passing X rays of the same
wavelength through an aluminum foil. (c) Diffraction rings obtained by passing neutrons through a
copper foil. (From PSSC Physics, 2d ed., D. C. Heath, Boston, 1965. Courtesy Education Development
Corporation.)


The rings were centered on the axis of the incident electron beam, and
each one corresponded to a particular value of 2θ. A typical set of diffraction
rings is illustrated in the photograph of Fig. 30-13a. It was obtained by
passing electrons of a certain wavelength through an aluminum foil. An
essentially similar set of rings, shown in Fig. 30-13b, was obtained by passing
X rays of the same wavelength through a foil of the same material. Both
electrons and photons move like waves.

G.  P.  Thomson  shared  the  1937  Nobel  Prize  in  physics  with  Davisson  for  their 
experimental  verification  of  de  Broglie’s  postulate.  The  historian  of  science  Max 
Jammer  has  written,  “One  may  feel  inclined  to  say  that  Thomson,  the  father,  was 
awarded  the  Nobel  Prize  [in  1906]  for  having  shown  that  the  electron  is  a particle, 
and  Thomson,  the  son,  for  having  shown  that  the  electron  is  a wave.” 

There  are  also  wavelike  aspects  in  the  motion  of  neutrons.  The  effect 
is  very  apparent  if  they  traverse  a system  with  a characteristic  dimension 
that is comparable to their de Broglie wavelength. A diffraction ring pattern
obtained by sending a beam of neutrons through a polycrystalline
copper foil is shown in Fig. 30-13c.

Even  entire  atoms  have  been  observed  to  move  in  a wavelike  manner. 
Several  years  after  the  Davisson-Germer  experiment,  Estermann,  Frisch, 
and  Stern  diffracted  a beam  of  helium  atoms  from  the  surface  of  a single 
crystal  of  lithium  fluoride.  In  this  experiment,  a helium  beam  is  obtained 
from  a small  hole  leading  out  of  an  oven  filled  with  helium  gas  at  400  K. 
The  beam  contains  atoms  with  a Maxwell-Boltzmann  distribution  of 
speeds.  But  an  arrangement  of  two  rotating  shutters  spaced  along  the 
beam is used to select from the distribution atoms having a particular speed,
1.64 × 10³ m/s. The first shutter opens briefly, allowing a burst of atoms
with  the  distribution  of  speeds  to  start  toward  the  second  shutter.  Those 
with  the  particular  speed  selected  reach  it  at  just  the  time  it  opens  briefly 
to  let  them  pass  and  strike  the  crystal  surface.  They  scatter  from  the  surface 
in  the  same  way  as  the  electrons  in  the  Davisson-Germer  experiment,  and 
they are detected by a collecting chamber connected to an extremely sensitive
pressure gauge. The diffraction pattern observed in the experiment is
analyzed in the same way as in the Davisson-Germer experiment, by using
the spacing between rows of atoms on the lithium fluoride surface previously
determined from X-ray diffraction. The helium atom de Broglie
wavelength measured by Estermann, Frisch, and Stern was λ = 0.0600 nm.


EXAMPLE  30-6 

Evaluate the de Broglie wavelength of a helium atom, whose speed is 1.64 × 10³
m/s, from de Broglie’s relation.

■ First, you must find the mass M of the common species of helium atom,
helium-4. A quick, yet sufficiently accurate, procedure is to write M = 4 u, where
the atomic mass unit (u) has the value given in Eq. (15-27), u = 1.66 × 10⁻²⁷ kg. The
result is

$$ M = 4 \times 1.66 \times 10^{-27}\ \text{kg} = 6.64 \times 10^{-27}\ \text{kg} $$

Then you evaluate

$$ \lambda = \frac{h}{p} = \frac{h}{Mv} = \frac{6.63 \times 10^{-34}\ \text{J}\cdot\text{s}}{6.64 \times 10^{-27}\ \text{kg} \times 1.64 \times 10^{3}\ \text{m/s}} = 6.09 \times 10^{-11}\ \text{m} = 0.0609\ \text{nm} $$

This result agrees with the value λ = 0.0600 nm measured by the diffraction
experiment of Estermann, Frisch, and Stern, to within the estimated accuracy of the
measurement.
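
The arithmetic of this example can be repeated in a few lines. The sketch below uses the same rounded constants as the example; the function name is a hypothetical choice for illustration.

```python
H = 6.63e-34   # Planck's constant, J*s, as rounded in the example
U = 1.66e-27   # atomic mass unit, kg, as rounded in the example

def de_broglie_wavelength_m(mass_kg, speed_m_s):
    """de Broglie wavelength lambda = h / (M v), in meters."""
    return H / (mass_kg * speed_m_s)

mass_he = 4 * U                                        # helium-4, approximately 4 u
lam_nm = de_broglie_wavelength_m(mass_he, 1.64e3) * 1e9
measured_nm = 0.0600                                   # value quoted for the experiment
percent_diff = 100.0 * (lam_nm - measured_nm) / measured_nm
print(f"calculated {lam_nm:.4f} nm, {percent_diff:.1f} percent above the measured value")
```

The calculated and measured wavelengths differ by between one and two percent.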


Countless experiments have confirmed the validity of de Broglie’s postulate
and also of Born’s closely related postulate. All material particles
move as if they were guided by associated matter waves. But for wavelike
motion to be evident, it is essential that the important dimension of the
system used to study the motion of a particle be comparable to the de
Broglie wavelength λ of the particle. Since Planck’s constant h is extremely
small, the large values of momentum p characterizing the motion of macroscopic
particles lead to values of λ = h/p much too small to meet this condition
in any system. So there is never any conflict between de Broglie’s postulate
and the fact that wavelike aspects are not observed in the motion of
the macroscopic particles of the newtonian domain. For the microscopic
particles of the quantum domain, p is small because their masses are small.
Thus for such particles λ = h/p is large enough to be comparable to the
minute dimensions of the microscopic systems encountered in the
quantum domain. As a result, wavelike motion of material particles is very
apparent and very important in the quantum domain. To summarize,
Planck’s constant is extremely small on the scale of everyday affairs, but it is
not zero. Consequently, wave effects in the motion of material particles
usually cannot be seen in macroscopic systems and usually cannot be overlooked
in microscopic systems.
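
To put rough numbers on this contrast, the sketch below compares the ratio of de Broglie wavelength to system size for one macroscopic and one microscopic case. The baseball's mass, speed, and size are invented illustrative values; the electron speed is approximately that of a 54 eV electron, and its "system size" is taken to be the 0.215 nm row spacing of the nickel crystal. The function name is hypothetical.

```python
H = 6.626e-34  # Planck's constant, J*s (rounded)

def wavelength_to_size_ratio(mass_kg, speed_m_s, size_m):
    """Ratio of the de Broglie wavelength h/(m v) to a characteristic dimension."""
    return H / (mass_kg * speed_m_s) / size_m

# Hypothetical macroscopic case: a 0.15 kg ball at 30 m/s, about 7 cm across.
ratio_ball = wavelength_to_size_ratio(0.15, 30.0, 0.07)

# Microscopic case: an electron at about 4.36e6 m/s and a 0.215 nm row spacing.
ratio_electron = wavelength_to_size_ratio(9.109e-31, 4.36e6, 0.215e-9)

print(f"ball:     lambda/size = {ratio_ball:.1e}")
print(f"electron: lambda/size = {ratio_electron:.2f}")
```

The ratio for the ball is smaller than that for the electron by more than thirty orders of magnitude, so no macroscopic apparatus comes close to the condition for observable wave effects.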




There  are  some  systems  in  which  both  wavelike  and  particlelike  motion  must 
be  taken  into  account.  An  electron  microscope  provides  an  example.  It  is  analyzed 
by  treating  the  motion  of  electrons  traveling  through  it  in  terms  of  both  particle 
motion  and  wave  motion.  A diagram  of  an  electron  microscope  is  shown  in  Fig. 
30-14. Electrons of kinetic energy around 10⁵ eV are formed into a beam, of the
width  required  to  “illuminate”  the  specimen,  by  a “lens”  consisting  of  a magnetic 
field  produced  by  a solenoid.  Those  electrons  which  pass  through  the  specimen 
then  pass  through  additional  magnetic  lenses,  which  produce  a greatly  magnified 
image  on  a fluorescent  viewing  screen.  To  determine  the  magnification,  the  elec- 
trons moving  through  the  system  are  treated  as  newtonian  particles  that  follow  the 
trajectories  indicated  in  the  figure.  This  amounts  to  treating  the  system  by  “ray 
optics.”  The  resolution  of  the  microscope  is  analyzed  by  treating  the  diffraction  of 
electrons  passing  through  the  specimen.  In  this  “wave  optics”  treatment,  the  same 
conclusion  is  obtained  as  for  the  optical  microscope  considered  in  Sec.  29-5.  That 
is,  the  wavelength  of  the  electrons  used  limits  the  size  of  the  smallest  object  that 
can  be  resolved  in  the  specimen. 

Since λ ≈ 5 × 10⁻³ nm for a 1 × 10⁵ eV electron, while λ ≈ 5 × 10² nm for
an optical photon, the diffraction-limited resolution for an electron microscope is
very much better than it is for an optical microscope. However, a reasonably
well-made optical microscope actually has a resolution almost as good as its
diffraction limit, while this is not the case even for the best currently available
electron microscope. Because of imperfections in the magnetic lenses, the best
resolution yet achieved by an electron microscope is something like 0.5 nm. This does
make it possible to see individual atoms in a specimen under very favorable
circumstances. And improved design of the instrument should make the resolution
approach more closely the ultimate limit dictated by the de Broglie wavelength of
the electron.


Fig. 30-14 An electron microscope. (Courtesy RCA Corp.)




30-7 THE UNCERTAINTY PRINCIPLES


In  studying  the  motion  of  a macroscopic  particle,  we  usually  deal  with  cer- 
tainties. If  we  know  exactly  the  initial  values  of  the  position  and  velocity  of  a 
billiard  ball  and  the  net  force  acting  on  it,  we  can  use  newtonian  mechanics 
to  predict  exactly  what  its  final  position  and  velocity  will  be.  But  when  we 
are  concerned  with  the  motion  of  a microscopic  particle,  we  always  deal 
with  probabilities,  never  with  certainties.  The  procedure  described  in  Sec. 
30-5  for  predicting  the  motion  of  a photon  through  a single-slit  system 
does  not  yield  the  exact  position  of  the  photon  after  traversing  the  system. 
Instead  it  predicts  the  probability  of  finding  the  photon  at  various  posi- 
tions. The  same  situation  occurs  when  we  treat  the  motion  of  a micro- 
scopic particle  of  matter.  We  predict  not  the  exact  position  of  an  electron 
after  it  has  traversed  the  Davisson-Germer  apparatus,  only  the  probability 
of  finding  it  at  various  positions.  There  is  also  a procedure  for  predicting 
the  probabilities  that  a microscopic  particle  will  have  various  values  of  final 
velocity,  but  not  for  predicting  its  exact  final  velocity.  It  is  similar  to  the  one 
used  to  predict  the  final-position  probabilities. 

Analysis  of  the  motion  of  a particle  through  a microscopic  system  pro- 
vides less  information  about  its  final  state  than  such  an  analysis  for  a macro- 
scopic system.  There  is  a very  good  reason  for  this — less  information  is 
available  about  the  initial  state  of  the  particle  in  the  microscopic  system. 
After  an  astronomer  has  carefully  measured  the  position  and  velocity  of  a 
comet  moving  in  the  solar  system,  she  can  predict  with  great  accuracy  what 
its  position  and  velocity  will  be  many  years  later.  But  an  atomic  physicist 
cannot  do  this  for  an  electron  moving  in  an  atom  because  he  cannot  mea- 
sure both  its  position  and  velocity  at  the  same  time  with  sufficient  accuracy. 
His  inability  to  make  accurate  measurements  is  not  due  to  a technical  limi- 
tation of  currently  available  experimental  equipment.  Instead,  there  is  a fun- 
damental limitation  to  the  accuracy  with  which  the  position  and  velocity  of  a 
microscopic  particle  can  be  known  at  the  same  instant.  The  limitation  was  first  ex- 
pressed in  1927  by  Werner  Heisenberg  in  terms  of  the  accuracy  with  which 
the  position  and  momentum  of  a particle  can  be  known  simultaneously. 
(The  momentum  accuracy  limitation  leads  immediately  to  a velocity  accu- 
racy limitation  since  momentum  equals  mass  times  velocity.) 

To be specific, we consider a particle moving along the x axis. Let Δx
represent the uncertainty in our knowledge of its position at some instant.
In other words, the position x is known only to the accuracy Δx, so its value
is quoted as x ± Δx. Let Δp represent the similarly defined uncertainty in
the particle’s momentum at the same instant. That is, the momentum is
quoted as p ± Δp at that instant. According to Heisenberg, the values of
these uncertainties must satisfy the relation

$$ \Delta x\, \Delta p \ge \frac{h}{4\pi} \tag{30-20} $$

where h is Planck’s constant. This is the position-momentum uncertainty
principle. It states that the product of the simultaneous uncertainties in the
position and momentum of a particle cannot be smaller than the value of
Planck’s constant divided by 4π. Unless enough care is taken in the measurements
that determine a particle’s position and momentum, the product
Δx Δp of the uncertainties in these quantities at some instant can be very
large. Thus Δx Δp may well be much larger than h/4π. Improving the
accuracy of the measurement technique will reduce the value of Δx Δp. But
the uncertainty principle says that no matter how ideal the experimental
equipment and no matter how expert the experimenter, it is impossible in
principle to reduce the position-momentum uncertainty product to less than
h/4π.

Note that the uncertainty principle does not limit either Δx or Δp individually,
but only their product. The uncertainty Δx in the position of a particle
may be reduced by suitably modifying the measurement, but only at
the expense of increasing the uncertainty Δp in the value of its momentum.
Or Δp can be made smaller, but this will result in a larger Δx. Also note that
the uncertainty principle sets no limit on the size of Δx at one instant and
the size of Δp at some other instant. Instead, it limits the product of their
sizes at the same instant. If a particle moves in more than one dimension,
there is more than one position-momentum uncertainty principle. For each
coordinate used to specify the particle’s position, the product of its uncertainty
and the simultaneous uncertainty in the particle’s momentum component
along that coordinate axis must be at least as large as h/4π.

This  limitation  is  always  present.  But  it  is  of  no  practical  significance  at 
all  in  systems  of  macroscopic  size  since  the  value  of  Planck’s  constant  h is 
completely  negligible  on  the  macroscopic  scale.  For  microscopic  systems, 
on  the  other  hand,  the  value  of  h is  not  at  all  negligible.  So  the  limit  set  by 
the  position-momentum  uncertainty  principle  is  important  in  systems  of 
microscopic  size.  Indeed,  it  is  crucial. 
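
A rough numerical illustration of this contrast follows. It is only a sketch: the function names are hypothetical, and the masses and position uncertainties (a 0.17 kg ball located to within 10⁻⁶ m, an electron located to within an atomic dimension of 10⁻¹⁰ m) are invented illustrative values rather than data from the text.

```python
import math

H = 6.626e-34  # Planck's constant, J*s (rounded)

def min_momentum_uncertainty(delta_x_m):
    """Smallest delta-p (kg*m/s) allowed by delta-x * delta-p >= h / (4*pi)."""
    return H / (4.0 * math.pi * delta_x_m)

def min_velocity_uncertainty(mass_kg, delta_x_m):
    """Corresponding smallest velocity uncertainty, delta-v = delta-p / m."""
    return min_momentum_uncertainty(delta_x_m) / mass_kg

# Hypothetical macroscopic case: billiard ball located to within one micrometer.
dv_ball = min_velocity_uncertainty(0.17, 1.0e-6)

# Hypothetical microscopic case: electron located to within 0.1 nm.
dv_electron = min_velocity_uncertainty(9.109e-31, 1.0e-10)

print(f"ball:     delta-v >= {dv_ball:.1e} m/s")
print(f"electron: delta-v >= {dv_electron:.1e} m/s")
```

For the ball the limit on the velocity uncertainty is far below anything measurable, while for the electron it is several hundred kilometers per second.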


We  will  see  soon  that  the  position-momentum  uncertainty  principle 
can  be  interpreted  as  arising  from  two  basic  facts.  One  has  to  do  with  the 
kinematics  of  waves:  The  smaller  the  width  of  a localized  wave  formed  by 
superposing  sinusoids  whose  wavelengths  span  a certain  range,  the  larger 
that  range  must  be.  The  other  fact  is  the  particle-wave  duality  connecting 
the  position  of  a particle  to  the  intensity  of  its  associated  wave  and  relating 
the  momentum  of  the  particle  to  the  wavelength  of  the  wave. 

Then we will see a different interpretation of the origin of the position-momentum
uncertainty principle. This alternative interpretation considers
the measurement process itself, assuming an ideal microscope is
used to measure a particle’s position, and shows that the light used to illuminate
the particle transfers an unknown amount of momentum to it in the
very act of making the position measurement possible. As a consequence of
particle-wave duality and of the resolution properties of a microscope, the
smaller the uncertainty Δx in the position of the particle after the measurement,
the larger the uncertainty Δp in the momentum transferred to it.
The momentum-transfer uncertainty is also the uncertainty in the particle’s
momentum after the experiment. Calculation leads to a value for
Δx Δp in good agreement with the limit given by the position-momentum
uncertainty principle.

There is a second uncertainty principle that we will consider. It involves
the measured value of the energy of a particle and the time interval
consumed in making the energy measurement. If the results of the measurement
determine the energy to be in the range E ± ΔE and the measurement
is carried out during the interval in which the time is in the range
t ± Δt, the time-energy uncertainty principle requires that




$$ \Delta t\, \Delta E \ge \frac{h}{4\pi} \tag{30-21} $$


The product of Δt and ΔE cannot be smaller than Planck’s constant h divided
by 4π. To put the matter in a slightly different way, we can say that if
Δt specifies the time available to measure the energy of a particle, then its
energy cannot be measured with an uncertainty less than that specified by
ΔE, where the product Δt ΔE is given by h/4π. We will find that there is a
very close relation between the origin of this uncertainty principle and the
origin of the position-momentum uncertainty principle.
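
As a rough illustration of Eq. (30-21), the sketch below evaluates the smallest energy uncertainty for a few measurement times. The function name and the particular time intervals are hypothetical choices made only for illustration.

```python
import math

H = 6.626e-34   # Planck's constant, J*s (rounded)
EV = 1.602e-19  # joules per electron volt (rounded)

def min_energy_uncertainty_ev(delta_t_s):
    """Smallest delta-E (in eV) allowed by delta-t * delta-E >= h / (4*pi)."""
    return H / (4.0 * math.pi * delta_t_s) / EV

# Hypothetical measurement times, from a millisecond down to a femtosecond.
for delta_t in (1.0e-3, 1.0e-8, 1.0e-15):
    print(f"delta-t = {delta_t:.0e} s  ->  delta-E >= {min_energy_uncertainty_ev(delta_t):.1e} eV")
```

The shorter the time available, the larger the minimum energy uncertainty: each factor of ten taken from Δt adds a factor of ten to the smallest possible ΔE.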

Now we will derive the uncertainty principles quoted above by investigating
the properties of a wave that extends over only a finite distance and
then using the de Broglie and Born postulates to relate these properties to
the behavior of a particle associated with the wave. Consider first the
position-momentum uncertainty principle for a particle moving along the x
axis. The position of the particle at some instant is known to be likely somewhere
within a range x ± Δx. Since Born’s postulate says the intensity of
the associated wave measures the probability of finding the particle in
various positions, this wave must have a significant intensity only in the
same range. Let ψ (the Greek letter psi) represent the amplitude of the wave.
At a particular instant of time, the amplitude depends on only the coordinate
x in the one-dimensional situations we consider. This dependence can
be expressed by the function ψ(x). The x dependence of the intensity of the
wave, at the fixed value of time t, is given by the square of the function specifying
the x dependence of its amplitude. So the intensity is the square of
ψ(x), which we write as ψ²(x). Since we are dealing with a material particle,
the wave whose amplitude is given by the function ψ(x) is what we have
called a matter wave. In the terminology of Chaps. 12 and 13, ψ(x) would
be called a time-independent wave function. But here we generally abbreviate
and call the function ψ(x) simply a wave function.

Since the wave function ψ(x) has appreciable values only in a certain
range of x, its mathematical form cannot be expressed by a single sinusoid
of a certain wavelength λ or wave number k = 2π/λ. That is, it cannot be
written in a form like

$$ \psi(x) = \cos(kx) \tag{30-22} $$


Such  a wave  extends  uniformly  over  the  unlimited  length  of  the  x axis  and 
cannot  represent  a matter  wave  associated  with  a freely  moving  particle 
known to be in the range x ± Δx. But a wave localized in a certain range,
called  a wave  group,  can  be  obtained  by  superposing  a set  of  sinusoidal 
waves  like  Eq.  (30-22),  provided  each  member  of  the  set  has  a different 
wavelength  or  wave  number.  That  is,  a group  can  be  represented  by  the 
sum 


$$ \psi(x) = \sum_{k_j} \cos(k_j x) \tag{30-23} $$


The wave numbers k_j of the component sinusoids cos(k_j x) cover the range of
values k ± Δk. We want to see just how such a group can be formed and
then to determine the relation between Δx, the width of the group, and Δk,
the range of wave numbers required to produce it. (Note that what we are
doing amounts to making a Fourier synthesis of a time-independent wave
function, as discussed in Sec. 13-7. The only difference is that here it is
more convenient to deal with cosines than with sines, and here all their amplitudes
are given the convenient value 1. We must use sinusoids in the synthesis
instead of other types of oscillatory functions because, as is shown in
Sec. 31-3, matter waves are solutions to the Schrödinger equation and for a
freely moving particle these solutions are sinusoids.)

Our task is simply a matter of adding cosines and plotting the results.
The work is facilitated by using a programmable calculator or small computer
and the cosine superposition program found in the Numerical Calculation
Supplement. The program makes the device evaluate the sum in
Eq. (30-23) for n uniformly spaced values of k_j in the range centered on the
value k and extending from k − Δk to k + Δk. For convenience in comparison,
it divides each term in the sum by n, so that in all cases the value of the
sum is 1 at x = 0, where cos(k_j x) = 1 for any k_j. The sum is calculated at
values of x starting at x = 0 and spaced at distances δx apart. Since
cos(−k_j x) = cos(k_j x), all the components and their sum are symmetric
about x = 0 and so are plotted only for positive x.
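
The procedure just described can be sketched in a few lines of Python. This is not the program from the Numerical Calculation Supplement, only a minimal stand-in written from the description above; the function names, the argument names, and the choice of x_max are all assumptions made for illustration.

```python
import math

def wave_numbers(k, dk, n):
    """n uniformly spaced wave numbers from k - dk to k + dk (requires n >= 2)."""
    step = 2.0 * dk / (n - 1)
    return [k - dk + j * step for j in range(n)]

def psi(x, k, dk, n):
    """Sum of cos(k_j * x) over the n components, divided by n so that psi(0) = 1."""
    return sum(math.cos(kj * x) for kj in wave_numbers(k, dk, n)) / n

def tabulate(k, dk, n, step_x, x_max):
    """List of (x, psi) pairs for x = 0, step_x, 2*step_x, ... up to about x_max."""
    count = int(round(x_max / step_x)) + 1
    return [(i * step_x, psi(i * step_x, k, dk, n)) for i in range(count)]

# Parameters of the two-component case: k = 12, dk = 2 (in 1/m), n = 2, step 0.04 m.
table = tabulate(k=12.0, dk=2.0, n=2, step_x=0.04, x_max=2.0)
for x, value in table[:5]:
    print(f"{x:5.2f}  {value:+.4f}")
```

The returned list of (x, ψ) pairs can be passed to any plotting tool; only positive x is tabulated because the sum is symmetric about x = 0.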


EXAMPLE 30-7

Run  the  cosine  superposition  program  with  the  sets  of  parameters  given. 

a. k = 12 (in m⁻¹); Δk = 2 (in m⁻¹); n = 2; δx = 0.04 (in m).

■ Here there are only two component sinusoids that form the sum

$$ \psi(x) = \tfrac{1}{2}\cos(10x) + \tfrac{1}{2}\cos(14x) $$

The components are plotted in Fig. 30-15a. Their sum, produced by the calculation,
is plotted in Fig. 30-15b. The sum consists of a group centered on x = 0 plus a
set of repeated groups spaced along the x axis. (The situation is completely similar
to the phenomenon of musical beats, discussed in Sec. 13-8; see Fig. 13-29.) Figure
30-15 should make what goes on apparent to you. At x = 0 the two component
waves are in phase, so their sum has maximum amplitude. As x increases, they begin
to get out of phase because their wave numbers k (or wavelengths λ = 2π/k) differ.
In going from x = 0 m to x = 0.78 m, the component with k = 10 m⁻¹ has traced
1.25 cycles of oscillation, and the component with k = 14 m⁻¹ has traced 1.75 cycles.
So at that point the components are out of phase, and their sum has minimum
amplitude. But as x continues to increase, the components begin to get back in phase.
At x = 1.56 m the component with the smaller wave number has gone through 2.5
cycles while the one with the larger wave number has gone through 3.5 cycles. Thus
they are in phase at that location and produce a repeat group in that general
vicinity. As x continues to increase, the continued cyclic change in the phase relation
results in a series of uniformly spaced identical groups (just like beats). ■

b. k = 12 (in m⁻¹); Δk = 2 (in m⁻¹); n = 4; δx = 0.04 (in m).

■ The four component sinusoids used here are plotted in Fig. 30-16a, and
their sum is plotted in Fig. 30-16b. You can see that doubling the number of
component sinusoids, while keeping constant the range Δk that their wave numbers
cover, does not have a strong effect on the central group in the sum. But it
pushes the first repeat group out a very considerable distance along the x axis.
The reason is that now there are more component sinusoids that must all get back
in phase before the first repeat group can be formed. This happens at
x = 4.72 m, where the component with k = 10 m⁻¹ has gone through 7.5 oscillation
cycles, the component with k = 11.33 m⁻¹ through 8.5, the component with
k = 12.67 m⁻¹ through 9.5, and the component with k = 14 m⁻¹ through 10.5. ■

c. k = 12 (in m⁻¹); Δk = 2 (in m⁻¹); n = 8; δx = 0.04 (in m).

■ Only the sum is plotted in Fig. 30-17. It is enough, however, to allow you to
conclude that doubling again the number of component sinusoids that are used has
only a small effect on the central group, if the range Δk of wave numbers used
is not changed. But doubling the number of component sinusoids does push the
first repeat group out even farther along the x axis. Why?
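The superposition program itself is not reproduced in this section. A minimal Python sketch of such a calculation is given here. It assumes n cosines of equal amplitude 1/n (so that the sum is 1 at x = 0) whose wave numbers are spaced uniformly from k − Δk to k + Δk; the function names and arguments are illustrative and are not taken from the book's program.

```python
import math

def cosine_superposition(k, dk, n, x):
    """Sum of n equal-amplitude cosines evaluated at position x.

    Assumptions: the wave numbers k_j are spaced uniformly from k - dk
    to k + dk, and each component has amplitude 1/n, so the sum is 1 at x = 0.
    Requires n >= 2.
    """
    step = 2.0 * dk / (n - 1)                       # spacing between adjacent k_j
    wave_numbers = [k - dk + j * step for j in range(n)]
    return sum(math.cos(kj * x) for kj in wave_numbers) / n

def tabulate(k, dk, n, delta_x, x_max):
    """Return (x, psi) pairs for x = 0, delta_x, 2*delta_x, ... up to x_max."""
    count = int(round(x_max / delta_x)) + 1
    return [(i * delta_x, cosine_superposition(k, dk, n, i * delta_x))
            for i in range(count)]

if __name__ == "__main__":
    # Parameters of part a: k = 12, dk = 2, n = 2, delta_x = 0.04
    for x, psi in tabulate(12.0, 2.0, 2, 0.04, 1.6):
        print(f"{x:5.2f}  {psi:+.3f}")
```

With n = 2 the sum vanishes at x = π/4 ≈ 0.785 m and returns to magnitude 1 at x = π/2 ≈ 1.57 m, which is the beat pattern described in part a. With n = 4 the first full repeat appears at x = 3π/2 ≈ 4.71 m, as in part b.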


Fig. 30-15 (a) Two cosine waves. One has wave number k = 10 m⁻¹, and the other
has wave number k = 14 m⁻¹. Both have amplitude 1/2. (The plots show the cosine
waves for only positive values of the position x. Actually, they extend to
negative values, being symmetrical about x = 0.) (b) The sum of the two cosine
waves forms a group centered on x = 0. (Only the half for positive x is shown.)
Also formed are a set of repeat groups at successively more positive values
of x. (There are repeat groups at successively more negative values of x, too,
but they are not shown.)


As more and more sinusoidal waves are superposed to produce a wave
function ψ(x), very little happens to the central group if the range Δk
spanned by the wave numbers is kept constant. But the repeat groups are
displaced farther and farther away from the central group. In the limit
where an infinite number of component sinusoids are distributed over a
certain range Δk, each with infinitesimally different wave numbers, the
repeat groups are displaced infinitely far away, and only the central group
remains. This is so because then there is no length of the x axis in which an
exactly half-integral number of cycles fits for every one of the infinite
number of component sinusoids. Born's postulated relation between the
probability of finding a particle and the squared amplitudes of the
associated wave implies that such a wave function, describing a single group,
can be associated with a particle whose position is known to be somewhere
in a region centered on x = 0. However, the width of the wave group must
be the same as the uncertainty in the position of the particle.

Fig. 30-16 (a) Four cosine waves with wave numbers ranging uniformly from
k = 10 m⁻¹ to k = 14 m⁻¹ and with amplitudes 1/4. (b) The sum of the four cosine
waves forms a group centered on x = 0 which is very similar to the one formed by
two cosine waves with k = 10 m⁻¹ and k = 14 m⁻¹. Repeat groups are also
produced. But they are displaced along the x axis away from the central group.

Precisely what is meant by the “width” of a wave group like the one in
Fig. 30-17 can be specified by definition. There are several ways to do this.
The one that we use defines the width as the separation between the point
where the dashed envelope encompassing the oscillations in the amplitude
ψ(x) has its maximum value 1 and the point where the envelope curve has
the value 1/√2. At the latter point the corresponding envelope for the
intensity ψ²(x) has the value (1/√2)² = 1/2. So the separation is between the
full-intensity and half-intensity points of the group. Since the group has a
symmetrical part for negative values of x (not shown in the figure), the
separation between the two points is actually the group’s half-width at
half-maximum intensity. But we will usually call it just the width Δx of the
group. The symbol is the same as that used for the position uncertainty of the
associated particle, since we will soon equate the two quantities. For the
group in Fig. 30-17, the half-maximum intensity point (where the amplitude
envelope shown with dashed lines has fallen to the value 1/√2 = 0.71) is at
x ≃ 0.6 m. Hence the width of that group is Δx ≃ 0.6 m.

Fig. 30-17 The sum of eight cosine waves with wave numbers ranging uniformly
from k = 10 m⁻¹ to k = 14 m⁻¹, each having amplitude 1/8, forms a central group.
It is very similar to the one formed by summing two, or four, cosine waves with
wave numbers in the same range. Repeat groups are also produced. But they are
displaced along the x axis even farther away from the central group. The dashed
"envelope" curve, which envelops the central group, is used to describe the
shape of the group. Its width is, by definition, the distance Δx measured from
the point where the envelope curve has its central value 1 to the point where it
has fallen to the value 1/√2 = 0.71.

The width Δx of a group can be adjusted to equal the position
uncertainty of the particle with which it is to be associated by changing the
range Δk of the wave numbers of the group's component sinusoids. The relation
between Δx and Δk is demonstrated in Examples 30-8 and 30-9.


EXAMPLE 30-8

Run the cosine superposition program with the following set of parameters:
k = 24 (in m⁻¹); Δk = 2 (in m⁻¹); n = 8; δx = 0.04 (in m).

■ This wave group, obtained by doubling the value of k, is plotted in Fig. 30-18.
The figure shows that the group still has the same width Δx = 0.6 m as the group
in Fig. 30-17, for which k = 12 m⁻¹ and all the other quantities are the same.
Hence, you can conclude that Δx does not depend significantly on k.

Fig. 30-18 A wave group formed by summing eight cosine waves whose wave numbers
range uniformly from k = 22 m⁻¹ to k = 26 m⁻¹. Its width is the same as that of
the group shown in Fig. 30-17, which was formed by summing cosine waves with
wave numbers ranging from k = 10 m⁻¹ to k = 14 m⁻¹.


EXAMPLE 30-9

Run the cosine superposition program with the following set of parameters:
k = 12 (in m⁻¹); Δk = 4 (in m⁻¹); n = 8; δx = 0.04 (in m).

■ This wave group is plotted in Fig. 30-19. Its width is Δx ≃ 0.3 m. The
reduction in the width of the group from Δx ≃ 0.6 m to Δx ≃ 0.3 m is a result of
the increase in the range of wave numbers in the sinusoids that form the group
from Δk = 2 m⁻¹ to Δk = 4 m⁻¹. Since these wave numbers (and the corresponding
wavelengths) cover twice as large a range of values, the sinusoids get out of
phase twice as rapidly as x increases from x = 0. This makes the group half as
wide. It is apparent that Δx is proportional to 1/Δk.

Fig. 30-19 A wave group formed by summing eight cosine waves whose wave numbers
range uniformly from k = 8 m⁻¹ to k = 16 m⁻¹. It is half as wide as the group
shown in Fig. 30-17, which was formed by summing cosine waves with wave numbers
ranging from k = 10 m⁻¹ to k = 14 m⁻¹.


Examples 30-7c and 30-9 show that the width Δx of a group is inversely
proportional to the range Δk of wave numbers found in the group. Thus a
narrow wave group necessarily contains a broad range of wave numbers. The
inverse proportionality means the product Δx Δk is a constant. We can
determine the value of this constant by using the values of Δx and Δk from any
of the examples. Let us do it with the pair Δx ≃ 0.6 m, Δk = 2 m⁻¹ of
Example 30-7c. The result is the dimensionless number Δx Δk ≃ 1.2. (Note
that this result does not depend on the units used to measure x and k; they
could just as well be nanometers and reciprocal nanometers as meters and
reciprocal meters.) The same result is obtained for any group composed of
a mixture of cosines of equal amplitude, or of sines of equal amplitude,
with uniformly distributed wave numbers.
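The product can also be checked without reading values off a plot. A standard trigonometric identity, which is not derived in the text, collapses the sum of n equal-amplitude cosines with uniformly spaced wave numbers into cos(kx) multiplied by the envelope sin(n d x/2) / [n sin(d x/2)], where d = 2Δk/(n − 1) is the spacing between adjacent wave numbers. The Python sketch below, whose function names are illustrative, uses bisection to locate the point where this envelope falls to 1/√2. It assumes component amplitudes of 1/n.

```python
import math

def envelope(x, dk, n):
    """Envelope of n equal-amplitude (1/n) cosines spanning k - dk .. k + dk.

    Based on the identity
        sum_j cos((k0 + j*d) x) = cos(k x) * sin(n d x / 2) / sin(d x / 2),
    where k is the central wave number. Note that k does not appear here.
    """
    d = 2.0 * dk / (n - 1)                 # spacing between adjacent wave numbers
    half = 0.5 * d * x
    if abs(half) < 1e-12:                  # the ratio tends to 1 as x -> 0
        return 1.0
    return math.sin(n * half) / (n * math.sin(half))

def group_width(dk, n, target=1.0 / math.sqrt(2.0)):
    """Half-width at half-maximum intensity: solve envelope(x) = 1/sqrt(2)."""
    d = 2.0 * dk / (n - 1)
    lo = 0.0
    hi = 2.0 * math.pi / (n * d)           # first zero of the envelope
    for _ in range(60):                    # envelope decreases steadily on [lo, hi]
        mid = 0.5 * (lo + hi)
        if envelope(mid, dk, n) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for dk in (2.0, 4.0):
        dx = group_width(dk, 8)
        print(f"dk = {dk}: dx = {dx:.2f} m, dx*dk = {dx * dk:.2f}")
```

For n = 8 this gives a width of about 0.61 m when Δk = 2 m⁻¹ and exactly half that when Δk = 4 m⁻¹, so the product stays near 1.2. Because the central wave number never enters the envelope, the sketch also reflects the conclusion of Example 30-8.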

A somewhat smaller product Δx Δk results if the amplitudes of the
component sinusoids are adjusted so that the one with the central value of
k_j has the largest amplitude and the amplitudes of the others decrease
smoothly as k_j departs from this value. In this case, Δk is defined as the
range in k_j from the central value to the value for the component sinusoids
whose intensity is half that of the central component. A rather complicated
mathematical analysis shows that the optimal adjustment reduces the
product Δx Δk to a minimum value. This minimum value is quite close to the
value Δx Δk = 1/2. On the other hand, an arbitrarily large value of Δx Δk can
be obtained by adjusting the relative phases of the sinusoids so that they are
not all in phase at x = 0. In conclusion, we can say that sinusoids whose
wave numbers cover a range Δk superpose to form a wave group of width
Δx, where in all circumstances the condition

$$\Delta x\,\Delta k \ge \tfrac{1}{2} \tag{30-24}$$

is satisfied.

A wave group moving through space takes a certain time to pass a
given position. We will designate by the symbol Δt the interval from the
time of arrival of the half-maximum intensity point on its leading edge to
the time of arrival of the maximum intensity point. This is actually the
half-duration of the group, just as Δx is actually its half-width. But we
usually call it simply the duration Δt of the group.

Since the sinusoids forming the group have wave numbers in a range
k ± Δk, they also have angular frequencies in a range ω ± Δω. And the
duration Δt of the group is related to its angular frequency range Δω by a
limit that looks just like the one relating Δx and Δk. That is,

$$\Delta t\,\Delta\omega \ge \tfrac{1}{2} \tag{30-25}$$

The similarity between Eqs. (30-24) and (30-25) is no accident. They
can both pertain to the same wave group if it is being viewed by two
different observers (see Sec. 12-2). For one observer the group is fixed in
location, and so she can describe it mathematically by means of the function in
Eq. (30-23),

$$\psi(x) = \sum_j \cos(k_j x)$$

giving its x dependence at a particular time. The second observer is moving
toward the first, so that from his point of view the wave group passes by. He
describes it by the function

$$\psi(t) = \sum_j \cos(\omega_j t) \tag{30-26}$$

specifying its t dependence at his particular location in his reference frame.
The similarity between the two descriptions of the wave group will allow
you to rephrase all the arguments following Eq. (30-23), including the plots
in the figures, so that they apply to Eq. (30-26). Simply replace x by t and k_j
by ω_j everywhere. This will lead to Eq. (30-25) in the same way that the
arguments led to Eq. (30-24).

The relations we have obtained between the width in space of a group composed
of sinusoids and its wave number content, and between the duration in time of the
group and its angular frequency content, are universal properties of waves. They
apply whether the group represents a wave in a stretched string, or an
electromagnetic wave, or a matter wave. The reason is that the relations are
essentially kinematical. Nothing in the calculations leading to them was really
specific to a particular type of wave.

An interesting situation involving Eq. (30-25) and electromagnetic waves
arises in connection with the transmission of television signals. A television
picture is formed from a pattern containing about 2 × 10⁵ illuminated dots. The
brightness of each is governed by the equivalent of a single wave group. New
pictures are formed at a rate of about 30 per second. So the television signal
being received contains the equivalent of approximately 6 × 10⁶ electromagnetic
wave groups per second. For adequate separation, the half-duration Δt of each
can be no more than about one-quarter of their separation in time, or about
1/(24 × 10⁶ s⁻¹). According to Eq. (30-25), the range of angular frequencies
contained in such a group is at least Δω = 1/(2Δt). Thus the angular frequency
range is at least Δω = 12 × 10⁶ s⁻¹. Expressed in terms of frequency ν = ω/2π,
this is a frequency range of Δν = Δω/2π ≃ 2 × 10⁶ s⁻¹ = 2 × 10⁶ Hz. Hence the
signal from a single television channel contains frequencies in the range
ν ± Δν, where Δν ≃ 2 × 10⁶ Hz. The full width of this range is about
4 × 10⁶ Hz. This is called the bandwidth. The AM broadcast band extends from
ν ≃ 0.5 × 10⁶ Hz to ν ≃ 1.5 × 10⁶ Hz. It is not wide enough to accommodate even
one television channel. This is why frequencies around 10⁸ Hz must be used for
television (see Example 27-4). At such frequencies each channel takes up a
reasonably small part of the spectrum. Can you explain why many different AM
radio signals fit into the AM broadcast band?
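The chain of arithmetic in the television estimate can be traced with a few lines of Python. This is only a sketch of the steps stated above; the variable names are illustrative, and the input figures are the approximate ones quoted in the text.

```python
import math

dots_per_picture = 2e5        # illuminated dots per picture (from the text)
pictures_per_second = 30      # pictures formed each second (from the text)

groups_per_second = dots_per_picture * pictures_per_second  # about 6e6 groups/s
separation = 1.0 / groups_per_second     # time between successive groups, s
delta_t = separation / 4.0               # half-duration: one-quarter of separation

delta_omega = 1.0 / (2.0 * delta_t)      # minimum allowed by delta_t*delta_omega >= 1/2
delta_nu = delta_omega / (2.0 * math.pi) # angular frequency range converted to Hz
bandwidth = 2.0 * delta_nu               # full width of the range nu +/- delta_nu

print(f"delta_omega = {delta_omega:.2e} rad/s")
print(f"delta_nu    = {delta_nu:.2e} Hz")
print(f"bandwidth   = {bandwidth:.2e} Hz")
```

The computed bandwidth comes out slightly under 4 × 10⁶ Hz, which is several times the 1 × 10⁶ Hz width of the entire AM broadcast band.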

As  has  been  said,  Eqs.  (30-24)  and  (30-25)  are  satisfied  by  wave  groups 
of  any  type.  Let  us  apply  them  to  a matter  wave  group  and  its  associated 
material  particle  or,  equally  well,  to  an  electromagnetic  wave  group  and  its 
associated  particle  of  radiation.  We  do  this  by  combining  the  equations  with 
the  Einstein-de  Broglie  relations  and  then  using  Born’s  postulate  to  help 
us  interpret  the  results.  First,  we  express  Eq.  (30-24), 

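$$\Delta x\,\Delta k \ge \tfrac{1}{2}$$

in terms of wavelength. This is done by employing the relation k = 2π/λ to
write Δk = Δ(2π/λ) = 2π Δ(1/λ). We obtain

$$\Delta x \; 2\pi\,\Delta\!\left(\frac{1}{\lambda}\right) \ge \frac{1}{2}$$

or

$$\Delta x\,\Delta\!\left(\frac{1}{\lambda}\right) \ge \frac{1}{4\pi}$$

Now we multiply by Planck’s constant h, to produce

$$\Delta x\,\Delta\!\left(\frac{h}{\lambda}\right) \ge \frac{h}{4\pi}$$

Applying Eq. (30-18), the Einstein-de Broglie relation p = h/λ, we get

$$\Delta x\,\Delta p \ge \frac{h}{4\pi}$$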

Using Born’s postulate, we next equate the width of the group to the
uncertainty in the position of the associated particle. So the width Δx in this
expression now stands for a particle’s position uncertainty. The particle also
has a momentum uncertainty Δp because its associated wave does not have
a unique de Broglie wave number, but rather a distribution of wave
numbers covering the range k ± Δk. The corresponding momentum range
is p ± Δp. Having identified the quantities on the left side of the
expression, we recognize it to be the position-momentum uncertainty principle.

Deriving  the  time-energy  uncertainty  principle  involves  combining  the 
property  of  wave  groups  given  by  Eq.  (30-25), 

$$\Delta t\,\Delta\omega \ge \tfrac{1}{2}$$

and the Einstein-de Broglie relation of Eq. (30-17):

$$E = h\nu$$

We use the definition ω = 2πν to express the property of wave groups as

$$\Delta t\,\Delta\nu \ge \frac{1}{4\pi}$$

Next we employ the Einstein-de Broglie relation to write

$$\Delta E = h\,\Delta\nu$$

The two together give us

$$\Delta t\,\Delta E \ge \frac{h}{4\pi}$$


Then we invoke Born’s postulate to equate the duration of the wave group
to the time interval during which the associated particle can be found in the
vicinity of an energy-measuring apparatus past which it is moving. That
time interval is the time available to make the energy measurement, and it
is therefore identified as the Δt in this expression. The particle’s energy is
uncertain since its associated wave has a distribution of angular frequencies
in the range ω ± Δω. The corresponding energy range is E ± ΔE. So ΔE is
its energy uncertainty, and the expression is the time-energy uncertainty
principle.



Fig.  30-20  The  Bohr  microscope 
thought  experiment. 


Niels Bohr devised a derivation of the position-momentum
uncertainty principle which is much more physical than the one presented above.
He argued that an observer actually measuring the x position of a particle
would use a microscope, as in Fig. 30-20. The uncertainty Δx in the
measurement is determined by the size of the diffraction pattern image that the
microscope forms of a point object. The quantity δ in Eq. (29-36) is the
half-width of the image measured to the zero-intensity value. This is about
twice the half-width measured to the half-maximum intensity value. If we
continue to define Δx in terms of half-width at half-maximum intensity by
setting Δx = δ/2, the equation gives

$$\Delta x = \frac{0.3\lambda}{\sin(\alpha/2)} \tag{30-27}$$

Here λ is the wavelength of the light from the particle that enters the
objective lens of the microscope, and α is the angle subtended by that lens at
the particle.

In the process of measuring the particle’s position along the x axis, an
uncertainty is introduced in its momentum along that direction. This
comes about because the photons of illuminating light carry momentum.
Part of the momentum is transferred to the particle when they are
scattered by it, as they must be if they are to produce an image of it. The exact
amount of momentum transfer depends on the scattering angle. But that
angle is not known since the photons can leave the particle anywhere within
the angular range ±α/2 and still end up in the observer’s eye. In an
idealized measurement, the observer minimizes this uncertainty by reducing
the level of illumination and by using a light detector more sensitive than
the eye, so that the position can be observed when only one photon is
scattered from the particle. But there is still some uncertainty in the particle’s
momentum along the x axis after the measurement.

If the wavelength of the light entering the objective lens is λ, the
Einstein-de Broglie relation, Eq. (30-18), says that the momentum of the
photon after scattering from the particle has magnitude h/λ. But although
the magnitude of the scattered photon’s momentum is known, its
component along the x axis is uncertain. In fact, this component can be anywhere
in the range −(h/λ) sin(α/2) to +(h/λ) sin(α/2). Since the photon had no
momentum along the x axis before being scattered by the particle,
momentum conservation requires that there be momentum along that axis
transferred to the particle. And the amount transferred is just as uncertain.
Hence the uncertainty in the particle’s momentum along the x axis is

$$\Delta p = \frac{h}{\lambda}\,\sin(\alpha/2) \tag{30-28}$$

after its position has been measured by the photon being scattered from it.
The particle's position-momentum uncertainty product immediately after
the measurement is

$$\Delta x\,\Delta p = \frac{0.3\lambda}{\sin(\alpha/2)}\,\frac{h}{\lambda}\,\sin(\alpha/2)$$

or

$$\Delta x\,\Delta p = 0.3h \tag{30-29}$$

This value of the position-momentum uncertainty product is in reasonable
agreement with the limiting value Δx Δp = h/4π obtained in the wave
group derivation.



Bohr’s  derivation  attributes  the  origin  of  the  position-momentum  un- 
certainty principle  to  particle-wave  duality  of  the  radiation  that  must  in- 
teract with  an  object  if  its  position  is  to  be  observed.  Specifically,  it  is  a di- 
rect consequence  of  the  fact  that  the  radiation  is  composed  of  photons.  If 
photons  did  not  exist,  so  that  there  was  no  granularity  in  electromagnetic 
radiation,  the  observer  could  reduce  the  level  of  illumination  to  the  point  at 
which  the  momentum  transferred  to  an  object  would  be  completely  neg- 
ligible. This  could  be  done  while  maintaining  the  resolution  of  the  micro- 
scope, if  a sufficiently  sensitive  light  detector  were  available.  Hence  there 
would  be  no  lower  limit  to  the  product  of  the  uncertainties  in  the  position 
and  momentum  of  the  object  after  the  observation.  But,  in  fact,  photons  do 
exist.  So  either  a photon  is  scattered  from  the  object,  leading  to  the 
position-momentum  uncertainty  product  obtained  in  the  derivation,  or 
else  absolutely  no  electromagnetic  radiation  at  all  is  scattered  from  it,  and 
no  measurement  of  its  position  is  made. 

You  may  wish  to  argue  that  the  object  did  have  a precise  position  and  mo- 
mentum before  the  measurement  and  that  the  measurement  merely  disturbed 
them.  But  how  can  you  know  these  precise  values,  since  you  have  not  measured 
them?  Thus  the  physicist  will  argue  that  quantities  which  cannot  be  measured  can 
have  no  physical  meaning  and  must  not  take  any  part  in  physical  theory. 

What is the significance of the fact that both the wavelength λ of the
radiation used by the observer and the angle α subtended by the objective
lens of the microscope cancel in the derivation? The assumption made in
the derivation that the photon has no momentum along the x axis before
being scattered from the particle is equivalent to the assumption that its
momentum in the x direction is then known, with essentially no
uncertainty, to be zero. Why does this not contradict the position-momentum
uncertainty principle applied to the photon?

You  now  have  seen  two  different  derivations  of  the  position- 
momentum  uncertainty  principle.  In  Example  30-10  you  will  see  it  applied 
in  two  different  cases. 


EXAMPLE  30-10 

a. The horizontal position of a grain of sand of mass 1 × 10⁻⁵ kg, falling
vertically through a vacuum, is measured by a microscope that uses a single
light photon of wavelength 5 × 10⁻⁷ m and an objective lens subtending an angle
of 30°. Evaluate the uncertainty in the horizontal position that is measured for
the particle and the uncertainty in its horizontal momentum after the
measurement. Comment on the results.

■ The position uncertainty is evaluated from Eq. (30-27), which gives you

$$\Delta x = \frac{0.3\lambda}{\sin(\alpha/2)} = \frac{0.3 \times 5 \times 10^{-7}\ \text{m}}{\sin(30^\circ/2)} = 6 \times 10^{-7}\ \text{m}$$

Then you can evaluate the momentum uncertainty from Eq. (30-29), obtaining

$$\Delta p = \frac{0.3h}{\Delta x} = \frac{0.3 \times 6.6 \times 10^{-34}\ \text{J}\cdot\text{s}}{6 \times 10^{-7}\ \text{m}} = 3 \times 10^{-28}\ \text{kg}\cdot\text{m/s}$$

Although the position measurement is quite accurate on the macroscopic scale,
the momentum uncertainty introduced by the measurement is completely
negligible on that scale. Even as small an object as a grain of sand is so far
from the quantum domain that the position-momentum uncertainty principle is of
no consequence at all to it. ■

b.  An  electron  of  kinetic  energy  1 eV  is  one  of  many  comprising  a vertical 
beam.  Its  horizontal  position  is  measured  by  the  same  microscope.  Determine  the 
uncertainties  in  its  horizontal  position  and  momentum  after  the  measurement. 
Then  compare  the  magnitudes  of  the  horizontal  momentum  uncertainty  and  the 
initial  vertical  momentum,  and  comment. 

■ Since Eq. (30-27) involves only the characteristics of the microscope, and Eq.
(30-29) only the value of Planck's constant, you obtain the same results for the
position uncertainty Δx and the momentum uncertainty Δp:

$$\Delta x = 6 \times 10^{-7}\ \text{m}$$

$$\Delta p = 3 \times 10^{-28}\ \text{kg}\cdot\text{m/s}$$

The initial vertical momentum of the electron is

$$p = \sqrt{2mK} = \sqrt{2 \times 9.1 \times 10^{-31}\ \text{kg} \times 1\ \text{eV} \times 1.6 \times 10^{-19}\ \text{J/eV}}$$

or

$$p = 6 \times 10^{-25}\ \text{kg}\cdot\text{m/s}$$

Compared to this, the horizontal momentum uncertainty introduced by the
horizontal position measurement is not so negligible. The horizontal momentum
uncertainty will cause the direction of motion of the electron to change by an
angle whose magnitude can be as large as about Δp/p = 5 × 10⁻⁴ rad. An electron
deflected through such an angle and continuing to travel 1 m would move through
a horizontal distance of almost 1 mm. This could be enough to take it out of the
vertical beam. Furthermore, on the microscopic scale the uncertainty
Δx = 6 × 10⁻⁷ m in the position measurement is quite large, since it amounts to
something like 1000 atomic diameters. If a more accurate measurement were made
of the electron’s horizontal position by using radiation of shorter wavelength,
it would necessarily lead to an increase in the uncertainty of its horizontal
momentum because a shorter-wavelength photon carries more momentum.
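The numbers in Example 30-10 can be reproduced with a short Python sketch of Eqs. (30-27) and (30-29). The function and variable names are illustrative, and the constants are the rounded values used in the example. Because the sketch carries more digits than the one-significant-figure arithmetic of the example, its results should be expected to match the quoted values only to within that rounding.

```python
import math

H = 6.6e-34      # Planck's constant in J*s (value used in the example)
M_E = 9.1e-31    # electron mass in kg
EV = 1.6e-19     # joules per electron volt

def microscope_uncertainties(wavelength, alpha_deg):
    """Return (dx, dp) for the Bohr microscope.

    dx follows Eq. (30-27), dx = 0.3*lambda / sin(alpha/2);
    dp follows Eq. (30-29), dx*dp = 0.3*h.
    """
    dx = 0.3 * wavelength / math.sin(math.radians(alpha_deg) / 2.0)
    dp = 0.3 * H / dx
    return dx, dp

dx, dp = microscope_uncertainties(5e-7, 30.0)   # same microscope in parts a and b
p_electron = math.sqrt(2.0 * M_E * 1.0 * EV)    # momentum of a 1-eV electron
angle = dp / p_electron                         # largest deflection angle, rad

print(f"dx = {dx:.1e} m, dp = {dp:.1e} kg*m/s")
print(f"p  = {p_electron:.1e} kg*m/s, dp/p = {angle:.1e} rad")
```

The deflection angle comes out at several times 10⁻⁴ rad, the same order of magnitude as the value quoted in part b, which is what makes the uncertainty significant for the electron but not for the grain of sand.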


A striking  conclusion  can  be  drawn  from  the  discussion  closing  Ex- 
ample 30-10.  An  increase  in  the  electron’s  horizontal  momentum  means  an 
increase  in  the  uncertainty  in  the  horizontal  distance  it  will  move  as  time 
passes.  So  the  more  accurately  the  horizontal  position  of  the  electron  is 
known  immediately  after  the  measurement,  the  less  accurately  it  will  be 
known  later.  Thus  the  position-momentum  uncertainty  principle  makes  it  impos- 
sible to  predict  the  exact  motion  of  an  electron  or  of  any  other  particle  in  the  quantum 
domain.  This  is  why  the  mechanics  of  the  quantum  domain  must  deal  with 
probabilities  and  why  the  behavior  of  systems  in  that  domain  seems  to  have 
an  essentially  random  character.  (The  implications  of  this  conclusion  even 
enter  philosophy.  It  is — or  should  be — central  to  the  arguments  con- 
cerning determinism.  The  randomness  of  microscopic  systems  would  seem 
to  make  it  more  difficult  to  accept  the  idea  that  the  behavior  of  the  universe 
is  foreordained.  Do  you  agree?) 

The position-momentum uncertainty principle leads to many other
important conclusions concerning the quantum domain, some of which are
considered in Chap. 31. Many physicists believe that an example from the
domain of macroscopic systems is found in the way this uncertainty
principle provides an answer to a basic question discussed in Sec. 18-7. The
question is: How does nature contrive to define the direction of the “arrow of
time” by the behavior of macroscopic systems, even though the equations
governing the individual interactions of their microscopic constituents
have the same form if the direction of flow of time is reversed by changing t
into −t? These equations predict that if you set up an isolated system in
which the initial momentum of every constituent were exactly the reverse
of the momentum of a constituent in a natural system, but with the initial
positions being exactly the same, then your system would subsequently
evolve to a state of increased order and lower entropy. But you cannot do
this, even in principle, because you cannot measure the exact position and
momentum of even one of the constituents of the natural system. So you
cannot produce an isolated system which violates the natural tendency to
increase its disorder as it grows older. No one can.

Example 30-11 applies the time-energy uncertainty principle to a
typical case.


EXAMPLE  30-11 


As discussed in Chap. 31, the energy of a nucleus is quantized so that it is
restricted to a set of certain well-separated values. Iridium-191 nuclei in a
sample are taken from their normal state to their state of next higher energy by
bombarding them with γ rays. A nucleus entering this "excited state" remains
there for a time and then returns to its normal state by emitting a γ ray.
Because the nucleus spends only a finite time in the excited state, only a
finite time is available for measuring its energy in that state. Thus the
uncertainty principle Δt ΔE ≥ h/4π puts limitations on the accuracy with which
the measurement can be made. The actual time spent in the excited state varies
from nucleus to nucleus. But, averaged over the sample, the time-energy
uncertainty principle can be employed by using the value Δt = 1.4 × 10⁻¹⁰ s.
How accurately can the energy of the excited state be determined?

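■ The time-energy uncertainty principle says that the uncertainty in the energy
measurement cannot be less than

$$\Delta E = \frac{h}{4\pi\,\Delta t}$$

Evaluating h and Δt, you find

$$\Delta E = \frac{6.6 \times 10^{-34}\ \text{J}\cdot\text{s}}{4\pi \times 1.4 \times 10^{-10}\ \text{s}} = 3.8 \times 10^{-25}\ \text{J} = 2.3 \times 10^{-6}\ \text{eV}$$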


Another way to express the statement made by the time-energy uncertainty
principle is to say it predicts that in a collection of nuclei excited to the
energy state in question, each will have a slightly different excitation
energy in the range E ± ΔE, where ΔE = 2.3 × 10⁻⁶ eV. Consequently, when they
return to their normal state, the energies carried off by the γ ray that each
emits will form a distribution extending over that range. The central energy
E of the distribution is measured to be E = 0.129 MeV = 0.129 × 10⁶ eV. So
the fractional full width of the distribution, 2ΔE/E = 3.6 × 10⁻¹¹, is
extremely small. Nevertheless, it has been measured by using the so-called
Mössbauer effect. That experiment and similar ones provide direct
confirmation of the time-energy uncertainty principle.
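The arithmetic of Example 30-11 can be checked numerically. The short Python sketch below is not part of the original text: the function name `min_energy_uncertainty` and the constant values (h = 6.626 × 10⁻³⁴ J·s and 1 eV = 1.602 × 10⁻¹⁹ J) are assumptions chosen for illustration, so the last digits may differ slightly from the rounded values quoted in the example.

```python
import math

H = 6.626e-34        # Planck's constant in J*s (assumed value)
EV = 1.602e-19       # joules per electron volt (assumed value)


def min_energy_uncertainty(delta_t):
    """Smallest energy uncertainty (in J) allowed by dt * dE >= h / (4*pi)."""
    return H / (4.0 * math.pi * delta_t)


delta_t = 1.4e-10                    # average time in the excited state, s
central_energy_ev = 0.129e6          # measured central gamma-ray energy, eV

delta_e_joule = min_energy_uncertainty(delta_t)
delta_e_ev = delta_e_joule / EV
fractional_width = 2.0 * delta_e_ev / central_energy_ev

print(f"Delta E = {delta_e_joule:.2e} J = {delta_e_ev:.2e} eV")
print(f"fractional full width 2*DeltaE/E = {fractional_width:.2e}")
```

Changing `delta_t` shows the trade-off directly: a state that lives ten times longer has an energy that can be defined ten times more sharply.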




EXERCISES 

Group  A 

30-1. The longest wavelength. Evaluate the longest wavelength at which
light is able to produce the photoelectric effect on sodium, taking its work
function to be 2.2 eV.

30-2. Photoelectric effect on aluminum. Light of frequency 1.5 × 10¹⁵ Hz
is incident on an aluminum surface, which has a work function of 4.2 eV.

a. What is the maximum kinetic energy of the photoelectrons?

b. What is the stopping potential?

c. What is the cutoff frequency?

30-3. Find the frequency. Photoelectrons are emitted from a certain
material illuminated by light of frequency 7.5 × 10¹⁴ Hz, and the stopping
potential is measured to be 0.70 V. Then the frequency of the light is
changed, and the stopping potential is measured to be 1.45 V. Find the
frequency used in the second measurement.

30-4. Comparing photon and molecular energies.

a. Compare the energy of a photon of visible light of wavelength 500 nm
with the average kinetic energy of a helium molecule at room temperature,
300 K. Hint: See Eq. (18-18).

b.  Compare  their  momenta. 

30-5.  Hope  it  doesn't  rust. 

a.  Photons  of  wavelength  0.030000  nm  are  incident 
on  an  iron  foil.  Calculate  the  wavelength  of  the 
Compton-scattered  photons  emerging  from  the  foil  at  a 
scattering  angle  of  45°. 

b.  Repeat  part  a for  a scattering  angle  of  135°. 

30-6. A property of the Compton wavelength. Show that the energy of a
photon whose wavelength equals λ_C, the Compton wavelength, is equal to the
rest mass energy of an electron.

30-7. Compton scattering from protons. A beam of photons passes through a
region containing hydrogen gas. Use a modification of Eq. (30-15a) to
evaluate numerically the maximum wavelength shift when a photon is scattered
from a free proton in the gas.

30-8. The de Broglie wavelength of what you breathe. Calculate the de
Broglie wavelength of a nitrogen molecule moving with the average energy in
air at room temperature, 300 K. Hint: See Eq. (18-18). Compare it with the
diameter of the molecule, which is about 2 × 10⁻¹⁰ m.

30-9. The de Broglie wavelength of an alpha particle. An alpha particle is
ejected from a uranium nucleus with 4.18 MeV of kinetic energy. Assume that
the alpha particle exists as such within the uranium nucleus and has the same
kinetic energy. What is its de Broglie wavelength? Compare that wavelength
with the diameter of the uranium nucleus, which is about 2 × 10⁻¹⁴ m.

30-10. Comparing electron and proton de Broglie wavelengths. The rest mass
energy of an electron is 0.511 MeV and the rest mass energy of a proton is
938 MeV. Find the de Broglie wavelengths of an electron and a proton, each
with kinetic energy 10.0 MeV.

30-11. The shy electron. Assume that an extremely high resolution
microscope (or the equivalent) is used to determine the x coordinate of an
electron in an atom to within the accuracy Δx = 2 × 10⁻¹¹ m, about 10 percent
of the atomic diameter. What is the smallest possible uncertainty in the x
component of velocity of the electron just after its x coordinate is
measured?

30-12. A property of the de Broglie wavelength. The position of a particle
moving along the x axis is determined to within an uncertainty about equal to
its de Broglie wavelength. Prove that the minimum uncertainty in its velocity
must then be about equal to its velocity.

30-13. Neutron decay. A neutron not in a nucleus is an unstable particle.
In free space a neutron has a lifetime, on the average, of 930 s before
decaying into other particles.

a. Make a calculation similar to the one in Example 30-11 to determine the
uncertainty in your ability to know the total energy of a neutron in free
space.

b. Calculate the corresponding uncertainty in the rest mass energy of a
neutron having negligible kinetic energy.

c. Calculate the uncertainty in the rest mass of the neutron, expressing
your results as a fraction of the rest mass.


Group  B 

30-14. Time lag for the “classical” photoelectric effect. A 10-W light
source is placed 1.0 m from a clean sodium surface. The radius of a sodium
atom is about 1.0 × 10⁻¹⁰ m.

a. Assuming that an atom absorbs all the light energy incident on it and
that the energy is not lumped into photons, find the energy absorbed per
second by a single atom.

b. It takes 2.2 eV to remove an electron from sodium metal. How long does
it take an atom to absorb this much energy?

30-15. Analyzing a photoelectric experiment. A student obtains the
following data for a photoelectric effect experiment:

Wavelength (in nm)    Stopping potential (in V)
100                   10.0
150                   6.0
200                   3.8
300                   1.9
400                   0.5




From  the  data,  find  values  for 

a.  Planck’s  constant 

b.  the  work  function  of  the  target 

c.  the  cutoff  frequency  and  wavelength 

d.  the  stopping  potential  of  the  photoelectrons 
ejected  from  the  target  by  electromagnetic  radiation  of 
wavelength  50  nm. 

30-16.  The  tininess  of  photon  energies,  I.  The  photons 
used  to  produce  medical  and  dental  X-ray  images  have 
energies  of  several  thousand  electron  volts  and  may 
rightly  be  regarded  as  highly  energetic. 

a. What is the diameter of a solid particle which would require the
investment of 5000 eV of energy to elevate it by its own diameter against the
gravitational attraction of the earth? Assume a spherical particle of density
3.0 g/cm³.

b.  How  does  the  result  obtained  in  part  a compare 
with  the  0.5  mm  diameter  of  typical  sand  grains  on  a 
beach? 


30-17. The tininess of photon energies, II. A typical candle flame emits
visible light energy at a rate of about 1.0 × 10⁻⁴ W.

a. How many visible-light photons does the candle flame emit per second?
Assume that the average energy per photon is 2.0 eV.

b. If a person views the candle from a distance of 10 m, how many
visible-light photons per second enter each eye? Assume a pupil diameter of
0.60 cm.

c. In dark surroundings, the candle described is barely visible when
viewed from a distance of 1.5 km. How many visible-light photons per second
enter each eye from an object at the limit of visibility?

30-18. Comparing photon and electron energies. What is the energy of a
photon that has the same momentum as a 100-keV electron?

30-19. Energetics of Compton scattering, I.

a. Show that Eq. (30-14) is equivalent to


where E and E′ are the initial and final energies of the photon.

b. If a photon scatters at 180° from a free electron, what was its
original energy if it gives up half of its energy to the electron?

30-20. An advantageous Compton scattering geometry.

a. In Fig. 30E-20, the circle represents a spherical shell of scattering
material. S is a source of γ-ray photons and D is a detector of these
photons, the two being located at diametrically opposite points on the shell.
Prove that all photons detected at D have scattered through the same angle,
90°. What experimental advantage does this geometry have over the one
illustrated in Fig. 30-5?

Fig. 30E-20

b. If the photons from S have an energy of 1.0 MeV, what is the energy of
the Compton-scattered photons detected at D?

30-21. Energetics of Compton scattering, II. Show that the kinetic energy
of the recoil electron in Compton scattering is

$$K = \frac{(E^2/m_0c^2)(1 - \cos\phi)}{1 + (E/m_0c^2)(1 - \cos\phi)}$$

where E is the energy of the incoming photon.

30-22. Diffraction at low light levels. An experimenter wishes to perform
a double-slit diffraction experiment using individual photons. Specifically,
she wants to establish experimental conditions under which more than
ninety-nine percent of the photons pass through the system at times when no
other photon is “in flight.” The typical path length through the apparatus
from entrance slits to the photographic film being used as a detector is
1.0 m. In order to obtain a good image of the pattern, the experimenter plans
to send 1.0 × 10⁸ photons through the apparatus.

a. What is the flight time t through the apparatus for an individual
photon?

b. What is the maximum number per unit time R at which photons can be sent
through the apparatus? Hint: Assume that the photons enter the apparatus at a
definite rate on a long-term average, but that the entrance times are
otherwise entirely random. Then, of a large number of photons used, the
fraction f whose flights overlap, even slightly, with the flights of other
photons is given by f = 2Rt, provided that f ≪ 1.

c. Based on your result for part b, what minimum exposure time is
necessary to photograph the “individual-photon” diffraction pattern?

30-23. De Broglie and a falling object.

a. Find the diameter d of a spherical object of density ρ such that its de
Broglie wavelength λ after it falls through vacuum a distance d in the
gravitational field of the earth has the value λ = d.

b. Suppose that the object is a water droplet, so that ρ = 1.00 g/cm³.
Compare the computed diameter to the wavelengths of visible light. Could you
expect to see such a droplet?




30-24. Probing atomic nuclei. One important means of determining the size
of an atomic nucleus is the observation of the way a nucleus “scatters” the
electrons in an incident beam. These and other experiments indicate that
nuclei are a few fermis in diameter (1 fermi = 10⁻¹⁵ m). Suppose that a
researcher wishes to perform an electron-scattering experiment capable of
resolving details of nuclear structure down to 1 fermi. To do so, he must
employ electrons whose de Broglie wavelengths are no greater than 1 fermi.
What minimum kinetic energy is needed for each of the incident electrons?
Express your result in joules and in electron volts. Hint: Such electrons are
moving with speeds comparable to c, so you will need to use the relativistic
expression that relates energy, momentum, and rest mass. However, the de
Broglie relation retains the simple form λ = h/p.


30-25. Another feature of the Davisson-Germer experiment. In analyzing the
Davisson-Germer experiment, refraction of electrons entering or leaving the
crystal was ignored to keep the analysis from being too complicated. If the
speed of an electron is v_a in air and v_c in the crystal, with v_c > v_a
because of the attraction of the crystal for the electron, show that

$$v_a \sin\theta_a = v_c \sin\theta_c$$

Here θ_a and θ_c are the angles measured from the normal of the air-crystal
surface to the electron paths in air and in the crystal. Compare this to
Snell's law, Eq. (28-6), and explain the difference between the two.

30-26. Music and the properties of wave groups. A reasonable criterion for
preserving the musicality of an audible tone is that the frequency spread Δν
must be less than or equal to one percent of the nominal frequency ν₀. Under
the most favorable circumstances, the frequency spread in a tone of duration
Δt is given by Δt Δν = 1/4π, the minimum spread implied by Eq. (30-25).

a. Find the duration Δt of the briefest possible musical tone at each of
the following frequencies:

(i) ν₀ = 4100 Hz (approximately the highest C on a piano keyboard)

(ii) ν₀ = 440 Hz (concert A)

(iii) ν₀ = 33 Hz (approximately the lowest C on a piano keyboard)

b. How might the results of part a help to account for the paucity of
examples in the musical literature of passages featuring a rapid succession
of very low pitches?

30-27. Yukawa’s theory of the strong nuclear force. Yukawa’s meson theory
states that the strong force which binds nucleons (protons and neutrons)
together to form nuclei arises from the exchange of particles, now called
pions, between the nucleons. Since the rest mass energy of a pion is about
140 MeV, it is impossible for a nucleon to emit or absorb a pion without
violating the law of conservation of energy. However, the time-energy
uncertainty principle permits violations of the law of conservation of energy
if they do not last long enough to be observable. That is, a violation of the
law of energy conservation of magnitude ΔE can take place during a time Δt
which does not exceed h/4πΔE. Such temporary events which seem to violate a
conservation law are called virtual events.

Now assume that a nucleon emits a virtual pion of rest mass m. The
magnitude of the violation of the law of conservation of energy is then about
equal to the pion’s rest mass energy.

a. How long can the pion remain free before it must be reabsorbed?

b. What is the maximum distance the pion can travel in this time? This
distance provides an estimate of the range of the strong nuclear force.

c. The observed range is about 2 × 10⁻¹⁵ m. Use results of part b to
estimate the rest mass energy of the pion. Compare it with 140 MeV.


30-28. If you can’t beat it, join it. A very broad electron beam contains
electrons of precisely defined momentum, p_x = 0, p_y = p₀. A naive
experimenter attempts to use this beam to beat the uncertainty principle by
passing it through a slit of width a in the x direction. He hopes that this
will produce a collimated beam of width Δx ≈ a/2 consisting of particles with
the same precisely defined momentum as the original beam. See Fig. 30E-28.
Show that this attempt must fail, as follows:


Fig. 30E-28 (axes: x and y)


a. The electron beam will spread out in a single-slit diffraction pattern.
Find the angular width of the pattern's central maximum.

b. Find the range of values of p_x that an electron with p_y = p₀ can have
and still be in the central maximum. Use this to estimate the uncertainty
Δp_x.

c. Evaluate Δx Δp_x, and show that the uncertainty principle is satisfied.


Group  C 

30-29. Energy density and photon density in thermal radiation. Within an
enclosure whose walls are held at temperature T, there is a unique amount and
spectral distribution of electromagnetic radiation which will be maintained
in thermal equilibrium with the walls. A correct combined application of
quantum mechanics and statistical mechanics shows that the enclosure contains
an energy density ε(ν) dν of radiation in the frequency range ν to ν + dν,
given by

$$\epsilon(\nu)\,d\nu = \frac{8\pi h\nu^3}{c^3}\,\frac{1}{e^{h\nu/kT} - 1}\,d\nu$$

where h is Planck's constant, c is the speed of light, and k is Boltzmann's
constant of Eq. (17-11).

a. Show that the total energy density ε_T, that is, ε integrated over all
ν, can be written as

$$\epsilon_T = \frac{8\pi (kT)^4}{h^3c^3}\int_0^\infty \frac{x^3\,dx}{e^x - 1}$$

b. Show that the energy density is given by ε_T = 7.57 × 10⁻¹⁶ T⁴, when
ε_T is measured in J/m³ and T is measured in K. Hint: The integral in part a
can be found in a table of definite integrals, or it can be evaluated
numerically.

c. Use the equation for ε(ν) dν and the equation E = hν to obtain an
expression for n(ν) dν, the number of photons per unit volume with energies
in the range hν to hν + h dν.

d. Integrate the expression obtained in part c to show that the total
number of photons per unit volume, n_T, in “thermal radiation” at temperature
T is given by

$$n_T = \frac{8\pi (kT)^3}{h^3c^3}\int_0^\infty \frac{x^2\,dx}{e^x - 1}$$

e. Show that the photon density n_T is given by n_T = 2.03 × 10⁷ T³, when
n_T is measured in photons/m³ and T is measured in K. Hint: As before, the
integral can be found in a table, or it can be evaluated numerically.

30-30. Average photon energy in thermal radiation. Use the expressions
given in parts b and e of Exercise 30-29 to show that the average energy ⟨E⟩
per photon in thermal radiation at temperature T is given by
⟨E⟩ = 3.73 × 10⁻²³ T, when ⟨E⟩ is measured in J/photon and T is measured in
K. Rewrite ⟨E⟩ as a multiple of kT. How does your result compare with 3kT/2,
which is the average kinetic energy of a molecule in a monatomic gas at
temperature T?

30-31. Not on free electrons! Using relativistic mechanics, show that the
photoelectric effect cannot occur with free electrons. That is, show that a
free electron cannot absorb a photon, with both momentum and energy being
conserved.

30-32. More energy is useless. Show that regardless of how large the
initial energy of the photon in Compton scattering, the energy of the
scattered photon is less than 2m₀c² if the scattering angle is greater than
60°.

30-33. Energetics of Compton scattering, III. In a Compton scattering
experiment, the photon recoils backward. The electron enters a cloud chamber
where there is a magnetic field of 0.150 T. The trajectory of the electron is
a circle of radius 3.00 cm.

a. What is the energy of the electron?

b. What was the initial energy of the photon?

30-34. Bragg reflection, I. A beam containing electrons of kinetic
energies between 50 and 200 eV is incident on a cubic crystal, as shown in
Fig. 30E-34. The spacing between adjacent atoms in the crystal is
3.00 × 10⁻¹⁰ m.

a. What are the kinetic energies of all the beam electrons which will be
reflected through a right angle by the crystal?

b. What are the kinetic energies of the beam electrons which will be
reflected straight back?

Fig. 30E-34 (labels: incident beam; atomic spacings of 3.00 × 10⁻¹⁰ m)

30-35. Bragg reflection, II. The cubic crystal shown in Fig. 30E-35 has,
among many others, the set of Bragg planes shown.

a. If the distance between adjacent atoms is 4.00 × 10⁻¹⁰ m, find the
spacing d between the planes and the angle β.

b. What is the speed of the slowest electron which will be reflected from
these planes?

c. How should the crystal be oriented so that these planes will reflect
electrons traveling at 2.00 × 10⁸ m/s?

30-36. The Fourier integral. Equations (30-23) and (30-26) can be
generalized to include cosine waves of different amplitudes in the sums. In
addition, a still more general case replaces the summations by the so-called
Fourier integral, with the results:

$$\psi(x) = \int_0^\infty A(k)\cos(kx)\,dk$$

and

$$\psi(t) = \int_0^\infty B(\omega)\cos(\omega t)\,d\omega$$

In this exercise you will investigate the relationship between the spread
of wave numbers or frequencies of such wave groups and their spatial width or
temporal duration.

a. In the first equation, let

$$A(k) = \begin{cases} 0 & \text{for } k < k_0 - \Delta k \\ 1 & \text{for } k_0 - \Delta k < k < k_0 + \Delta k \\ 0 & \text{for } k > k_0 + \Delta k \end{cases}$$

where Δk ≪ k₀. Do the integration to find ψ(x). Plot ψ(x) and find its
width Δx, defined as in two paragraphs above Example 30-8. What is the
relation between Δx and Δk? Compare it with Eq. (30-24).

b. In the second equation, let B(ω) = e^(−ωτ), where τ is a constant. Do
the integration to find ψ(t). Plot ψ(t) and B(ω). Find the duration Δt and
the frequency spread Δω of the wave pulse, defined analogously to the
definition of Δx. What is the relation between Δt and Δω? Compare it with
Eq. (30-25).

30-37. Inexpensive electron gun. An engineer is designing an inexpensive
“electron gun” for a TV tube. It uses only a heated electron-emitting cathode
followed by two well-spaced metal plates, each pierced by a small hole on the
axis of the tube, to produce the electron beam. A potential difference
applied between the cathode and the first plate accelerates the electrons.
Those which pass through the hole in that plate continue moving at a constant
high speed up to the second plate. The electrons passing through the hole in
the second plate then continue on to strike the fluorescent screen of the TV
tube. To maximize the sharpness of the TV picture, the engineer wants to
minimize the width of the spot made by the beam on the face of the tube.

a. Find an expression for the diameter of the hole in the second metal
plate that will minimize the width of the spot.

b. Estimate the minimum spot width by using reasonable values for the
quantities on which it depends.

Hint: If the second hole is very narrow the uncertainty principle will be
important, and it will cause the spot to be wide. If that hole is very wide
the uncertainty principle will be unimportant, but a wide spot will be
produced just because the hole is wide. Justify, and then use, the assumption
that the spot will have a minimum width when these two sources make equal
contributions to the width.

30-38. Hit the crack? A student on the third floor of the physics building
is at elevation l above the sidewalk. She is dropping billiard balls of mass
m out of the window, attempting to hit a narrow crack in the sidewalk.

a. Prove that no matter how refined the equipment she uses to aim and
release the balls, on the average the centers of the balls will miss the
crack by a distance having the order of magnitude (h/m)^(1/2) (l/g)^(1/4),
where h is Planck's constant and g is gravitational acceleration.

b. Estimate this distance by using reasonable values for l and m.

Hint: See the hint for Exercise 30-37.

Numerical 

30-39. Cosine superposition. Run the cosine superposition program with the
following set of parameters: k = 12 (in m⁻¹), Δk = 1 (in m⁻¹), n = 8,
δx = 0.04 (in m). Plot your results, and compare them with Figs. 30-17 and
30-19. Then use your results to estimate the value of the product Δx Δk, and
compare it with the value Δx Δk ≈ 1.2 obtained from Fig. 30-17.

30-40. Modified cosine superposition. Modify the cosine superposition
program so that the amplitude of one component sinusoid has the value 1,
while the amplitudes of the others decrease uniformly to the value 0 as their
wave numbers become either larger or smaller than that of the one with
amplitude 1. Run the program, letting the component sinusoid with amplitude 1
have wave number k = 12 (in m⁻¹), letting the amplitudes reach 0 at k = 10
(in m⁻¹) and at k = 14 (in m⁻¹), with n = 8 and δx = 0.04 (in m). Defining Δk
in the manner indicated in the paragraph ending with Eq. (30-24) and then
measuring Δx from the plot of your results, determine the value of the
product Δx Δk. How does it compare with the value Δx Δk ≈ 1.2 obtained with
all the component sinusoids having the same amplitude?




Energy Quantization in Matter


31-1 THE PARTICLE IN A BOX

This chapter is devoted to energy quantization in matter, one of the most
important consequences of particle-wave duality. The total energy of a
material system is quantized (that is, restricted to a set of discrete,
separated values) if the particles of matter comprising the system are
confined to a certain region of space. The smaller the region, the larger the
separations between the allowed energies. For instance, the energy of a
nucleus like the one considered in Example 30-11 does not have a continuous
range of possible values, but only those values corresponding to its
quantized energy states. This is so because the neutrons and protons in the
nucleus are bound within the nucleus and thus are confined to a certain
region of space. The region of confinement of the nucleons in a nucleus is
very small, so the energy states are separated by relatively large amounts of
energy. Countless other examples of this phenomenon of energy quantization
occur in material systems of the microscopic size found in the quantum
domain.

Energy quantization is also present in material systems of macroscopic
size. But, except for very special circumstances like those occurring in
liquid helium, it is not observable in macroscopic systems. The reason is
that the quantized energy states of large systems are so closely spaced in
energy as to be indistinguishable from a continuum.

In this section we consider the simplest system in which energy
quantization arises. It consists of a single material particle which can move
completely freely in a certain region, but cannot escape from that region.
The system models very well a gas molecule confined in a box with rigid and
impenetrable walls. The particle in a box moves until it strikes a wall,
where it bounces elastically, reversing the direction of its momentum
component perpendicular to the wall but not changing the magnitude of its
momentum. A conduction electron confined to a block of metal moves in a
similar way, as does a neutron or a proton confined to a nucleus. So the
particle in a box can be used to describe, at least approximately, the
behavior of a variety of physical systems. In fact, the essential features of
energy quantization emerge from a study of a version of this model in which
the motion is one-dimensional. We consider the one-dimensional version
because its analysis is particularly easy.


Fig. 31-1 A particle confined to a one-dimensional box, extending from
x = 0 to x = L.

Fig. 31-2 The simplest wave function for a particle in a one-dimensional
box, extending from x = 0 to x = L.


Figure 31-1 illustrates schematically a particle confined to a
one-dimensional box of length L. The particle moves freely between perfectly
rigid walls at x = 0 and x = L. But when it collides with either wall, the
sign of its momentum is reversed, and it bounces back. We assume the
collisions are perfectly elastic. Hence its speed remains constant, and
therefore the magnitude of its momentum remains constant. The particle
bounces back and forth between the limits of the region 0 ≤ x ≤ L, with
momentum of a fixed and precisely defined magnitude p. This does not
contradict the position-momentum uncertainty principle since the actual value
of the momentum is not known; it can be either +p or −p.

Associated with the particle is a matter wave. According to Born’s
postulate, the probability of finding the particle near a point x is
proportional to the square of the matter-wave amplitude near that point. This
implies that since the particle cannot penetrate the walls at x = 0 and
x = L, neither can its matter wave. Thus the matter wave is reflected back
and forth between the walls, just as the particle is. The wave reflected from
a wall has the same de Broglie wavelength λ = h/p as the wave incident on the
wall, since the magnitude p of the particle’s momentum is not changed on
reflection. Also, reflection cannot decrease the amplitude of the matter
wave. Otherwise there would be a steady decrease in its square. This is not
possible because it would lead to a continued reduction in the total
probability of finding the particle somewhere between the walls, in
contradiction to the fact that the total probability always equals 1. In
summary, each time the particle bounces from a wall, its matter wave is
reflected with no change in wavelength or amplitude. So the box contains a
matter wave with one component moving in the positive x direction and the
other component moving in the negative x direction, both components having
the same wavelength and amplitude. The two components form a standing wave,
just as traveling waves reflected back and forth from the constrained ends of
a stretched string at x = 0 and x = L form a standing wave (see Sec. 13-3).
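The way two oppositely directed components of equal wavelength and amplitude combine into a standing wave can be checked numerically. The sketch below is an illustration only, not part of the original text: it models the components as real sinusoids, as in the stretched-string analogy, and the function names, the box length, and the angular frequency are all assumed for the example.

```python
import math


def traveling_pair(x, t, k, omega, amplitude=1.0):
    """Sum of two equal-amplitude waves moving in the +x and -x directions."""
    forward = amplitude * math.sin(k * x - omega * t)
    backward = amplitude * math.sin(k * x + omega * t)
    return forward + backward


def standing_wave(x, t, k, omega, amplitude=1.0):
    """Standing-wave form: a fixed spatial pattern whose size oscillates."""
    return 2.0 * amplitude * math.sin(k * x) * math.cos(omega * t)


box_length = 1.0               # arbitrary units (assumed)
k = math.pi / box_length       # one half wavelength fits between the walls
omega = 2.0                    # arbitrary angular frequency (assumed)

xs = [i * box_length / 20 for i in range(21)]
ts = [0.0, 0.3, 0.7, 1.1]

# Largest disagreement between the two descriptions over the sample grid.
max_difference = max(
    abs(traveling_pair(x, t, k, omega) - standing_wave(x, t, k, omega))
    for x in xs
    for t in ts
)
print(f"largest difference between the two forms: {max_difference:.1e}")
print(f"value at the wall x = L: {traveling_pair(box_length, 0.3, k, omega):.1e}")
```

The two forms agree to rounding error at every sampled point, and the sum vanishes at both walls at all times, which is the node condition used in the discussion that follows.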

The simplest possible matter-wave amplitude pattern meeting the imposed
conditions is shown in Fig. 31-2. Inside the range 0 ≤ x ≤ L, this standing
wave is represented by the function ψ(x), which has the form of a sinusoid
with the particular wavelength λ = h/p that corresponds to the particular
magnitude p of the associated particle’s momentum. In the ranges x < 0 and
x > L, the amplitude of the standing wave has the value ψ(x) = 0. We assume
that the function ψ(x) is everywhere continuous, just like the functions
describing waves in a stretched string. Then the conditions at the boundaries
of the range 0 ≤ x ≤ L require that the standing wave have nodes at both its
boundaries. Note that exactly the same figure represents the amplitude
pattern for the simplest standing wave in a stretched string. The nodes of
the wave in the string are imposed by the rigid supports at its ends, while
the nodes of the matter wave are imposed by the rigid walls at the boundaries
of the region. The way in which these


1486  Energy  Quantization  in  Matter 


Fig.  31-3  The  first  three  wave  func- 
tions for  a particle  in  a one-dimensional 
box. 


constraints  lead  to  the  production  of  boundary  nodes  is  completely  dif- 
ferent in  the  two  cases.  But  the  results  produced  are  the  same  as  far  as  the 
amplitude  patterns  are  concerned. 

There are other possible forms for the mathematical function describing
the amplitude pattern for a sinusoidal standing wave with nodes at $x = 0$
and $x = L$. The first few of these wave functions are shown in Fig. 31-3,
along with the one discussed in the preceding paragraph. Each corresponds
to a distinctly different state of motion for the associated particle.
To distinguish between them, an integer $n = 1, 2, 3, \ldots$ is used as a
labeling subscript. All the $\psi_n(x)$ have nodes at $x = 0$ and $x = L$. They must.
The function already discussed, $\psi_1(x)$, has no others. But $\psi_2(x)$ has an
additional node which splits the range 0 to $L$ into two equal parts, $\psi_3(x)$ has
another so that the range is split into three equal parts, and so forth.
Comparison with Fig. 13-13 will show that the $\psi_n(x)$ have the same form as the
set of functions describing the possible standing waves in a stretched string
of length $L$.


Let us evaluate the total energy $E$ of the particle in the box when it is in
the state of motion associated with the wave function $\psi_n(x)$. You can see
from the figure that for each value of $n$ there are exactly $n$ de Broglie
half-wavelengths fitting into the length of the box. That is,

$$n\,\frac{\lambda}{2} = L \qquad \text{for } n = 1, 2, 3, \ldots$$

or

$$\lambda = \frac{2L}{n}$$

The de Broglie relation says the momentum of the particle has the magnitude

$$p = \frac{h}{\lambda} = \frac{nh}{2L} \tag{31-1}$$

If we assume that it is nonrelativistic, the particle’s kinetic energy is

$$K = \frac{p^2}{2m} = \frac{n^2h^2}{8mL^2}$$

Since the particle moves freely between the walls of the box, it feels no
force there. So the potential energy $U$ must be constant in that region. We
define the reference value of potential energy so that $U = 0$ inside the box.
Then the total energy of the particle will just equal its kinetic energy, and
we have for the total energy

$$E_n = \frac{n^2h^2}{8mL^2} \qquad \text{for } n = 1, 2, 3, \ldots \tag{31-2}$$


The  subscript  n has  been  added  to  aid  in  distinguishing  among  the  various 
possible  values  of  total  energy.  This  subscript  is  called  the  quantum 
number,  and  the  total  energy  is  said  to  be  quantized.  In  other  words,  the 
energy  is  restricted  to  any  of  the  values  given  by  the  equation  we  have  just 
obtained.  The  energy  can  have  no  other  values.  Each  of  the  allowed  energy 
values $E_n$ corresponds to one of the physically possible wave functions
$\psi_n(x)$.
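As a rough numerical illustration of Eq. (31-2), the following Python sketch evaluates the first few allowed energies. The function name `box_energy`, the rounded values of the constants, and the sample box length of 1e-10 m (chosen only because it is of atomic size) are assumptions made for this illustration; they do not come from the text.

```python
# Particle-in-a-box energy levels, Eq. (31-2): E_n = n^2 h^2 / (8 m L^2).
H = 6.626e-34   # Planck's constant, J*s (rounded)
EV = 1.602e-19  # joules per electron volt (rounded)


def box_energy(n, mass, box_length):
    """Return E_n in joules for quantum number n, mass in kg, box length in m."""
    if n < 1:
        raise ValueError("the quantum number n must be 1, 2, 3, ...")
    return n**2 * H**2 / (8.0 * mass * box_length**2)


if __name__ == "__main__":
    electron_mass = 9.11e-31  # kg
    box_length = 1.0e-10      # m, illustrative assumption (atomic size)
    for n in (1, 2, 3):
        energy_ev = box_energy(n, electron_mass, box_length) / EV
        print(f"n = {n}: E_n = {energy_ev:.1f} eV")
```

Because $E_n$ grows as $n^2$, the printed values stand in the ratio 1 : 4 : 9, and no energy between them is allowed.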


The particle-in-a-box energy quantization formula, Eq. (31-2), is represented
graphically in Fig. 31-4. The horizontal line at $E = 0$ represents
the reference value chosen for the potential energy. Possible total energies
$E_n$ are each represented by a horizontal line whose distance above the
reference line is proportional to the value of $E_n$. These lines, and also the
energies $E_n$ they represent, are called energy levels. When the particle has
energy $E_n$ and associated wave function $\psi_n(x)$, it is said to be in the nth
quantum state. The first quantum state is the one in which a particle will be
found if it has its lowest possible energy. In most circumstances the particle
will be in its state of lowest possible energy. Hence this state has a special
significance and is given a special name, the ground state. The states at
higher energies are called excited states.

Fig. 31-4 Energy-level diagram for a particle in a one-dimensional box.

The  energy  levels  of  the  particle  in  a box  are  discrete  and  separated. 
This  conclusion  is  in  sharp  contrast  to  the  predictions  of  newtonian  me- 
chanics! According  to  newtonian  mechanics,  the  particle  in  a box  could 
have  any  value  of  total  energy,  so  that  its  energy-level  diagram  would  be  a 
continuum  of  densely  packed,  horizontal  lines.  But  this  prediction  is  in 
error  because  it  ignores  the  wavelike  motion  of  the  particle.  The  conditions 
imposed  on  the  wave  at  the  boundaries  of  the  box  limit  the  possible  values 
of  its  wavelength.  Acting  through  the  de  Broglie  relation,  the  wavelength 
limits  lead  to  limits  on  the  possible  values  of  the  particle’s  momentum  and 
therefore  on  its  total  energy.  We  will  soon  make  a more  detailed  compari- 
son between  our  predictions  and  newtonian  mechanics  for  the  higher  en- 
ergy levels  of  the  particle  in  a box.  But  first  we  consider  only  the  lowest  en- 
ergy level. 

Equation  (31-2)  says  that  there  is  a smallest  possible  total  energy  for 
the  particle  in  a box.  It  is 

$$E_1 = \frac{h^2}{8mL^2} \tag{31-3a}$$

This  is  not  expected  on  the  basis  of  newtonian  mechanics.  According  to 
that  theory,  there  is  no  reason  at  all  why  a particle  cannot  be  motionless  in 
the  box  and  therefore  have  zero  energy.  However,  newtonian  mechanics 
ignores  the  position-momentum  uncertainty  principle.  Let  us  take  it  into 
account. 

If the particle is known to be in a box of length $L$, its position is known
to be in the range $L/2 \pm L/2$. Thus its position uncertainty is
$\Delta x = L/2$. For the lowest energy level, the magnitude of its momentum is
$p = \sqrt{2mK} = \sqrt{2mE_1} = \sqrt{2mh^2/8mL^2}$, or $p = h/2L$. Since the particle
can be moving in either direction, the momentum itself can have either
sign and is written $\pm h/2L$. Thus the momentum uncertainty has the
value $\Delta p = h/2L$. The position-momentum uncertainty product for the
particle is $\Delta x\,\Delta p = (L/2)(h/2L) = h/4$, in reasonable agreement with the
lower limit $h/4\pi$ set by the uncertainty principle. The particle cannot be at
rest if it is confined to the box, for then its momentum uncertainty would be
$\Delta p = 0$ while its position uncertainty would still be $\Delta x = L/2$. So the
position-momentum uncertainty product would be $\Delta x\,\Delta p = 0$, in violation
of the position-momentum uncertainty principle. Even for the ground
state, a particle in a box (or in any other system of limited size) must possess
what is called zero-point motion. Thus the ground-state energy $E_1$ is also
known as the zero-point energy. Example 31-1 concerns zero-point
energy.
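The comparison between the ground-state uncertainty product and the lower limit can be checked in a few lines. The sketch below uses the estimates $\Delta x = L/2$ and $\Delta p = h/2L$ from the paragraph above; the function names and the sample box lengths are assumptions made for this illustration.

```python
import math

H = 6.626e-34  # Planck's constant, J*s (rounded)


def ground_state_uncertainty_product(box_length):
    """Return dx * dp for the ground state, using dx = L/2 and dp = h/(2L)."""
    dx = box_length / 2.0
    dp = H / (2.0 * box_length)
    return dx * dp


def uncertainty_lower_limit():
    """Lower limit h / (4 pi) set by the uncertainty principle."""
    return H / (4.0 * math.pi)


if __name__ == "__main__":
    # Sample box lengths in meters (illustrative assumptions).
    for box_length in (1.0e-14, 1.0e-10, 1.0):
        ratio = ground_state_uncertainty_product(box_length) / uncertainty_lower_limit()
        print(f"L = {box_length:g} m: product / limit = {ratio:.4f}")
```

The product equals $h/4$ whatever the box length, so it exceeds the lower limit by the same factor of $\pi$ for a nucleus-sized box and for a billiard table.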




EXAMPLE  31-1 


Prior  to  the  discovery  of  the  neutron  in  1932,  a nucleus  generally  was  pictured  as 
containing $A$ protons and $A - Z$ electrons. This could explain the observed atomic
number  Z and  approximate  atomic  weight  A of  an  atom  with  that  nucleus.  But 
among  other  problems  with  the  picture,  it  was  difficult  to  understand  how  a particle 
with  as  small  a mass  as  an  electron  could  be  bound  in  a system  with  as  small  a size  as 
a nucleus.  This  is  because  the  zero-point  energy  would  be  very  high.  Estimate  E1, 
assuming  that  an  electron  in  a nucleus  moves  like  a particle  in  a one-dimensional 
box  of  length  equal  to  the  nuclear  diameter. 

■ Since the prediction of Eq. (31-3a) for the zero-point energy,

$$E_1 = \frac{h^2}{8mL^2}$$

was obtained by applying the nonrelativistic expression $K = p^2/2m$ to Eq. (31-1),
there may be difficulty in using it if $E_1$ is too high. You can find out by seeing what
value of $E_1$ it predicts and then comparing this value with the electron
rest-mass energy:

$$m_0c^2 = 9 \times 10^{-31}\ \text{kg} \times (3 \times 10^{8}\ \text{m/s})^2 = 8 \times 10^{-14}\ \text{J} \times \frac{1\ \text{eV}}{1.6 \times 10^{-19}\ \text{J}} = 5 \times 10^{5}\ \text{eV}$$

Setting $L = 1 \times 10^{-14}$ m, a typical nuclear diameter, and $m = m_0$, you obtain

$$E_1 = \frac{(6.6 \times 10^{-34}\ \text{J}\cdot\text{s})^2}{8 \times 9 \times 10^{-31}\ \text{kg} \times (1 \times 10^{-14}\ \text{m})^2} = 6 \times 10^{-10}\ \text{J} = 4 \times 10^{9}\ \text{eV}$$


The  conclusion  is  that  it  would  be  inconsistent  to  apply  Eq.  (31-3a)  in  this  calcula- 
tion because  it  predicts  a zero-point  energy  for  the  electron  much  larger  than  its 
rest-mass  energy,  and  so  the  electron  is  highly  relativistic. 

Therefore you must go back to Eq. (31-1). For $n = 1$ it gives an electron
momentum of

$$p_1 = \frac{h}{2L}$$

It is easy to obtain the corresponding value of total energy $E_1$ from the general
relativistic relation $E_1^2 = (cp_1)^2 + (m_0c^2)^2$ since the quantity $(m_0c^2)^2$ will be
negligible compared to the quantity $(cp_1)^2$ and therefore $E_1 \simeq cp_1$. Thus

$$E_1 \simeq \frac{ch}{2L} \tag{31-3b}$$

Inserting the numbers, you find

$$E_1 = \frac{3 \times 10^{8}\ \text{m/s} \times 6.6 \times 10^{-34}\ \text{J}\cdot\text{s}}{2 \times 1 \times 10^{-14}\ \text{m}} = 1 \times 10^{-11}\ \text{J} = 6 \times 10^{7}\ \text{eV} = 60\ \text{MeV}$$


as  an  estimate  of  the  zero-point  energy  of  an  electron  in  a nucleus.  Calculation 
shows  that  this  is  nearly  an  order  of  magnitude  larger  than  the  maximum  energy  an 
electron  could  have  and  still  be  bound  to  a nucleus  by  the  electric  attraction  exerted 
on  it  by  the  positive  nuclear  charge. 

When  the  neutron  was  discovered  by  James  Chadwick,  it  became  clear  that  a 
nucleus contains $Z$ protons and $A - Z$ neutrons. A neutron has about 2000 times
more mass than an electron. So its zero-point energy is smaller by a factor of about
2000 than the value $4 \times 10^{9}$ eV obtained by using the electron mass in Eq. (31-3a).
That is, $E_1 \simeq 2 \times 10^{6}$ eV = 2 MeV for a neutron in a nucleus. While the neutron is
not  affected  by  the  electric  force,  the  attractive  nuclear  force  exerted  on  it  is  more 
than  enough  to  bind  it  with  this  much  zero-point  energy.  Why  is  it  proper  to  use  Eq. 
(31-3a)  to  estimate  the  neutron’s  zero-point  energy? 
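The estimates in Example 31-1 can be reproduced with a short Python sketch. It evaluates Eq. (31-3a) and, for comparison, the full relativistic relation $E_1^2 = (cp_1)^2 + (m_0c^2)^2$ with $p_1 = h/2L$. The function names and the rounded constants are assumptions of this illustration, and the relativistic function returns the total energy, including the rest-mass energy that the example treats as negligible for the electron.

```python
import math

H = 6.626e-34           # Planck's constant, J*s (rounded)
C = 2.998e8             # speed of light, m/s (rounded)
EV = 1.602e-19          # joules per electron volt (rounded)
M_ELECTRON = 9.109e-31  # kg
M_NEUTRON = 1.675e-27   # kg


def zero_point_nonrelativistic(mass, box_length):
    """Eq. (31-3a): E_1 = h^2 / (8 m L^2), in joules."""
    return H**2 / (8.0 * mass * box_length**2)


def zero_point_relativistic(mass, box_length):
    """Total energy from E^2 = (c p)^2 + (m c^2)^2 with p = h / (2L), in joules."""
    momentum = H / (2.0 * box_length)
    return math.sqrt((C * momentum) ** 2 + (mass * C**2) ** 2)


if __name__ == "__main__":
    nuclear_diameter = 1.0e-14  # m, the value used in the example
    for name, mass in (("electron", M_ELECTRON), ("neutron", M_NEUTRON)):
        rest_ev = mass * C**2 / EV
        nonrel_ev = zero_point_nonrelativistic(mass, nuclear_diameter) / EV
        print(f"{name}: rest energy {rest_ev:.2e} eV, Eq. (31-3a) gives {nonrel_ev:.2e} eV")
    electron_total_ev = zero_point_relativistic(M_ELECTRON, nuclear_diameter) / EV
    print(f"electron, relativistic total energy: {electron_total_ev:.2e} eV")
```

Comparing each printed nonrelativistic estimate with the corresponding rest-mass energy shows at a glance whether Eq. (31-3a) is self-consistent for that particle.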


Fig. 31-5 The first three probability densities $\psi_n^2(x)$ for a particle in a
one-dimensional box, plotted against $x$ from 0 to $L$. Also shown is the
probability density $P_{\text{newt}}(x)$ predicted by newtonian mechanics for the system.


Born’s postulate states that the probability of finding a particle near a
certain location is proportional to the intensity of its associated wave
function near that location. The statement can be made more specific by
defining the probability density $P(x)$ as the probability per unit length of the
x axis of finding the particle at coordinate $x$, and then saying this quantity is
equal to the intensity $\psi^2(x)$. In other words, Born’s postulate can be
expressed by saying this: If a measurement is made to locate a particle associated
with the wave function $\psi(x)$, the probability $P(x)\,dx$ that it will be found at a location
between $x$ and $x + dx$ equals $\psi^2(x)\,dx$. The particle-in-a-box probability
densities $\psi_n^2(x)$ are plotted as solid curves in Fig. 31-5 for the first few values of
the quantum number $n$.

Also plotted as a dashed curve in Fig. 31-5 is the probability density
$P_{\text{newt}}(x)$ predicted by newtonian mechanics for a particle which is known to
be in a box, but concerning which nothing else is known. The form of
$P_{\text{newt}}(x)$ is obtained from the following considerations. The particle is either
sitting at some unknown location between $x = 0$ and $x = L$, or else bouncing
back and forth between those points. In the first case any location is equally
likely. The same is true in the second case since the particle spends equal
time intervals in each equal interval of the x axis, so there would be the
same likelihood of finding it in every such interval if a measurement of its
location were made. Consequently, the value of $P_{\text{newt}}(x)$ is constant inside
the range 0 to $L$. It drops abruptly to zero outside this range because the
particle can never be found there.

The interior value of $P_{\text{newt}}(x)$ is determined by applying the normalization
condition

$$\int_{-\infty}^{\infty} P(x)\,dx = 1 \tag{31-4}$$

The integral evaluates the probability that the particle is found somewhere,
and that probability clearly must equal 1. Since $P_{\text{newt}}(x)$ is nonzero only
from 0 to $L$ and is constant within that range, for it the normalization
condition reads

$$\int_{-\infty}^{\infty} P_{\text{newt}}(x)\,dx = \int_0^L P_{\text{newt}}(x)\,dx = P_{\text{newt}} \int_0^L dx = P_{\text{newt}}\,L = 1$$

To satisfy the last equality, $P_{\text{newt}}(x)$ must have the value $1/L$ in the range
from 0 to $L$, as is indicated in Fig. 31-5. The probability densities $\psi_n^2(x)$ must
also satisfy the normalization condition, and the ones plotted in Fig. 31-5
do so. This is achieved by incorporating in each sinusoid $\psi_n(x)$ a multiplicative
constant whose value is adjusted so that the integral over $x$ from 0 to $L$
of $\psi_n^2(x)$ yields 1.


For  small  values  of  the  quantum  number  n,  the  fluctuating  probability 
densities  predicted  by  the  quantum  method  are  decidedly  different  from 
the  flat  probability  density  predicted  by  the  newtonian  method.  However, 
as  n increases,  the  fluctuations  become  more  numerous  as  the  wavelengths 
of  the  standing  waves  become  shorter,  and  the  individual  fluctuations  be- 
come more  compressed.  At  sufficiently  large  n,  no  experimental  position 
measurement  can  have  the  resolution  to  distinguish  individual  fluctuations 
in  the  probability  density;  all  that  could  be  observed  is  its  average  over  a 
number  of  fluctuations.  Such  an  average  is  exactly  the  flat  newtonian 
probability density. Thus the quantum predictions for the probability
density come into essential agreement with the newtonian ones in the limit that
the quantum number becomes large.
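Both the normalization of the probability densities and their approach to the flat newtonian value can be checked numerically. The sketch below assumes the normalized form $\psi_n^2(x) = (2/L)\sin^2(n\pi x/L)$ inside the box, which is the standard result of adjusting the multiplicative constant but is not written out in the text; the function names, the midpoint-rule integration, and the sample window from 0.25 to 0.35 are likewise assumptions of this illustration.

```python
import math


def probability_density(n, x, box_length):
    """psi_n^2(x) for the box, assuming the normalized form (2/L) sin^2(n pi x / L)."""
    if x < 0.0 or x > box_length:
        return 0.0
    return (2.0 / box_length) * math.sin(n * math.pi * x / box_length) ** 2


def average_density(n, x_low, x_high, box_length, steps=20000):
    """Average of psi_n^2 over the interval [x_low, x_high], by the midpoint rule."""
    dx = (x_high - x_low) / steps
    total = sum(
        probability_density(n, x_low + (i + 0.5) * dx, box_length)
        for i in range(steps)
    )
    return total * dx / (x_high - x_low)


if __name__ == "__main__":
    box_length = 1.0  # m, illustrative assumption
    # Total probability: the average over the whole box times its length.
    for n in (1, 2, 3):
        print(n, average_density(n, 0.0, box_length, box_length) * box_length)
    # Coarse-grained average over a window a measurement might resolve,
    # compared with the flat newtonian value 1/L.
    for n in (1, 1000):
        print(n, average_density(n, 0.25, 0.35, box_length), 1.0 / box_length)
```

For small $n$ the windowed average depends strongly on where the window sits; for large $n$ it settles at $1/L$, which is the averaging argument made above.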

The same is true of the energy levels predicted by the quantum
method. These discretely separated levels, specified by Eq. (31-2),

$$E_n = \frac{n^2h^2}{8mL^2}$$

and plotted in Fig. 31-4, are certainly different from the continuum of
closely packed levels predicted by newtonian mechanics. But let us calculate
the fractional separation between adjacent levels, $\Delta E_n/E_n$, for large values
of the quantum number $n$. This quantity specifies the resolution required
of an energy-measuring apparatus if it is to be able to resolve the
discreteness of the energy levels. We have

$$\Delta E_n \simeq \frac{dE_n}{dn}\,\Delta n$$

Setting $\Delta n = 1$, to specify that the levels are adjacent, we obtain

$$\Delta E_n \simeq \frac{dE_n}{dn}$$

Evaluating the derivative gives

$$\Delta E_n \simeq \frac{2nh^2}{8mL^2} = \frac{2}{n}\,\frac{n^2h^2}{8mL^2} = \frac{2}{n}\,E_n$$

Thus the fractional separation is

$$\frac{\Delta E_n}{E_n} \simeq \frac{2}{n}$$

As $n$ becomes larger, $\Delta E_n/E_n$ becomes smaller, and better resolution is
needed to observe that the levels actually are discrete. When $n$ is large
enough, no experimental apparatus is able to tell that the levels do not
actually form a continuum. Again, we find quantum predictions
corresponding to newtonian predictions for large quantum numbers.
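The derivative used above is an approximation to the true difference between adjacent levels, so it is worth seeing how good it is. From Eq. (31-2) the exact fractional separation is $(E_{n+1} - E_n)/E_n = (2n + 1)/n^2$. The sketch below compares this with $2/n$; the function names are assumptions made for this illustration.

```python
def fractional_gap_exact(n):
    """(E_{n+1} - E_n) / E_n for E_n proportional to n^2, i.e. (2n + 1) / n^2."""
    return ((n + 1) ** 2 - n**2) / n**2


def fractional_gap_approx(n):
    """The large-n estimate 2 / n obtained from dE_n/dn."""
    return 2.0 / n


if __name__ == "__main__":
    for n in (1, 10, 1000, 10**6):
        exact = fractional_gap_exact(n)
        approx = fractional_gap_approx(n)
        print(f"n = {n}: exact = {exact:.6g}, approx = {approx:.6g}")
```

The two expressions differ by the relative amount $1/2n$, so the estimate is poor for the lowest levels and essentially exact for the large quantum numbers to which the argument is applied.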

When  investigating  a system  like  a neutron  in  a nucleus,  we  are  dealing 
with  values  of  the  quantum  number  n that  are  quite  small.  For  such  micro- 
scopic systems,  quantum  effects  are  very  pronounced  and  lead  to  behavior 
which  can  be  completely  different  from  what  would  be  expected  on  the 
basis  of  newtonian  mechanics.  But  as  we  move  into  the  domain  of  macro- 
scopic systems,  the  energies  of  the  systems  become  larger,  and  the  quantum 
numbers  we  deal  with  do  the  same. 

Consider a system in the newtonian domain, say the proverbial billiard
ball bouncing back and forth along a path perpendicular to opposing
cushions of a billiard table, like a macroscopic particle in a macroscopic
one-dimensional box. In a typical situation, the energy of the ball is 1 J. If
you use Eq. (31-2) to determine the value of the quantum number $n$, taking
$L = 1$ m and $m = 1$ kg, you will find $n = 5 \times 10^{33}$. Hence adjacent
energy levels are separated by $4 \times 10^{-34}$ J, and adjacent fluctuations in the
probability density are separated by $2 \times 10^{-34}$ m. Thus there is no
measurable difference at all between quantum and newtonian predictions.
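The billiard-ball numbers follow from inverting Eq. (31-2) for $n$ and then using $\Delta E_n \simeq 2E_n/n$ and a fluctuation spacing of $L/n$. The sketch below carries this out; the function names and the rounded value of $h$ are assumptions of this illustration. With this value of $h$, direct evaluation gives $n$ of roughly $4 \times 10^{33}$, the same order of magnitude as the figure quoted above, so the conclusion is unaffected.

```python
import math

H = 6.626e-34  # Planck's constant, J*s (rounded)


def quantum_number(energy, mass, box_length):
    """Invert E_n = n^2 h^2 / (8 m L^2) for n, treated as a continuous variable."""
    return math.sqrt(8.0 * mass * energy) * box_length / H


def level_spacing(energy, n):
    """Separation of adjacent levels near E_n, from Delta E_n ~ 2 E_n / n."""
    return 2.0 * energy / n


def fluctuation_spacing(box_length, n):
    """Distance between adjacent maxima of psi_n^2, which has n maxima in length L."""
    return box_length / n


if __name__ == "__main__":
    energy, mass, box_length = 1.0, 1.0, 1.0  # J, kg, m, the values used in the text
    n = quantum_number(energy, mass, box_length)
    print(f"n is about {n:.1e}")
    print(f"level spacing is about {level_spacing(energy, n):.1e} J")
    print(f"fluctuation spacing is about {fluctuation_spacing(box_length, n):.1e} m")
```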


You can see that the methods we are beginning to develop for handling
the mechanics of systems in the quantum domain bear a relationship
to newtonian mechanics reminiscent of that which relativistic mechanics
bears  to  newtonian  mechanics.  Relativistic  mechanics  yields  correct  predic- 
tions for  systems  whose  characteristic  speed  v has  any  value  from  zero  to 
the  limiting  speed  c.  And  as  v/c  becomes  small,  the  relativistic  predictions 
merge  smoothly  into  those  of  the  simpler,  but  less  general,  theory  of  new- 
tonian mechanics.  Similarly,  predictions  obtained  by  using  the  quantum 
methods  are  valid  over  the  entire  range  of  the  quantum  number  n.  When 
n is  large,  they  come  into  correspondence  with  those  of  the  less  general 
theory  of  Newton.  When  n is  small,  there  are  striking  differences  between 
the  quantum  and  newtonian  predictions.  In  such  situations,  experiment 
shows  the  quantum  predictions  are  right  and  the  newtonian  ones  wrong. 


31-2 THE HYDROGEN ATOM

We have obtained a correct expression for the quantized energies of a
particle in a box by making direct application of de Broglie’s relation $\lambda = h/p$
to the conditions that must be satisfied at the boundaries of the box. As is
discussed in detail in Sec. 31-3, our success is due to the fact that the
magnitude $p$ of the momentum of the particle in a box is constant, so that its de
Broglie wavelength $\lambda$ is constant. There is one, and only one, other case
in which a particle bound in a system of restricted size can be expected to
have a momentum of constant magnitude. This is the case of a particle
acted on by a force always directed to a central point and moving in a
circular orbit about that point like a planet moving about the sun. In 1913,
two years after Rutherford’s discovery of the atomic nucleus, Niels Bohr
(1885-1962) proposed a model of the hydrogen atom patterned after a
planetary system. According to the Bohr model, the single electron in this atom
moves with momentum of constant magnitude in a circular orbit around
the proton, which is the nucleus. Because a proton is some 2000 times more
massive than an electron, its location is essentially fixed and its motion is
slight. To begin the treatment, the proton is considered to have only the
function of acting as the source of the electric central force binding the
electron in a circular orbit.

We  will  derive  a formula  for  the  quantized  values  of  the  total  energy  of 
an  electron  in  a Bohr  atom.  This  will  be  done  by  using  an  argument  due  to 
de  Broglie,  rather  than  by  following  Bohr’s  original  derivation.  Bohr's 
work  was  done  about  a decade  before  it  was  learned  that  particle-wave  du- 
ality applies  to  an  electron,  so  that  there  is  a wavelength  associated  with  its 
momentum.  He  obtained  energy  quantization  by  a subtle  argument  based 
on  the  idea  that  results  of  calculations  carried  out  on  a system  in  the 
quantum  domain  should  correspond  in  the  limit  of  large  quantum 
numbers  to  those  obtained  from  calculations  that  would  be  appropriate  if 
the  system  were  in  the  nonquantum  domain.  De  Broglie’s  simpler  deriva- 
tion leads  to  the  same  energy  quantization  formula  as  Bohr's. 

The  energy  quantization  formula  obtained  from  the  Bohr  model  is  in 
impressively  good  agreement  with  the  measured  values  of  the  energy  of  an 
electron  in  a hydrogen  atom.  Nevertheless,  there  are  fundamental  diffi- 
culties with  the  Bohr  model  of  a hydrogen  atom.  They  relate  to  the 
position-momentum  uncertainty  principle  and  so  were  found  only  after 
both Bohr and de Broglie had done their work with the model. These
difficulties cause the Bohr model to yield certain predictions which are in
complete disagreement with measurement. We will consider both the
agreements and disagreements. But first let us go through the de Broglie
argument.




Fig.  31-6  The  first  Bohr  orbit.  A dot 
represents  the  hydrogen  atom  nucleus, 
and  the  solid  circle  centered  on  it  repre- 
sents the  electron  orbit.  Oscillating 
about  the  solid  circle  is  a dashed  curve 
which  represents  the  amplitude  pattern 
of  the  electron  matter  wave,  with  one 
complete  oscillation  exactly  fitting  into 
the  periphery  of  the  orbit. 


Fig.  31-7  The  first  three  Bohr  orbits, 
with  the  solid  circles  showing  the  orbits 
drawn  to  scale.  The  dashed  curves  rep- 
resent the  electron  matter  wave  in  each 
orbit. 


Visualize an electron moving uniformly around a circular orbit of
radius $r$ and its associated matter wave moving with the electron. The
electron has momentum of constant magnitude $p$, so its matter wave has
wavelength of constant magnitude $\lambda = h/p$. Since the electron travels repeatedly
around the orbit, the matter wave does too. In order that the wave
associated with a particular traversal superpose constructively with the wave
associated with the next traversal, it is necessary that an integral number of
wavelengths $\lambda$ just fit into the distance $2\pi r$ around the orbit. That is,
constructive superposition is obtained if the condition

$$n\lambda = 2\pi r \qquad \text{for } n = 1, 2, 3, \ldots \tag{31-5}$$

is satisfied.

If  an  orbit  does  not  satisfy  this  condition,  the  matter  waves  associated 
with  subsequent  traversals  of  the  electron  will  not  be  in  phase,  and  so  there 
will  be  cancellations  yielding  a total  wave  of  zero  intensity.  Since  the  inten- 
sity of  the  total  wave  tells  where  the  electron  will  be  found  in  a position 
measurement,  de  Broglie  interpreted  this  to  mean  that  an  electron  cannot 
be  found  in  such  an  orbit. 

The first orbit allowed by Eq. (31-5) is shown in Fig. 31-6 by the circle
centered on a dot representing the nucleus, and the matter wave wrapped
around it is indicated by the dashed curve representing its amplitude
pattern at some instant. For this orbit $n = 1$, and exactly one complete
amplitude oscillation fits into its periphery. The dashed curve is just like one that
would picture, at a particular instant, the simplest possible transverse wave
traveling around a circular string. If there are such waves traveling around
the string in both directions, the two will combine to form a standing wave.
Its amplitude pattern can also be represented by Fig. 31-6, if the fixed
nodes happen to lie at the locations illustrated in the figure. Similarly, if no
measurement has been made of the direction in which the electron moves
through its orbit, it could just as well be moving in either direction, and
there must be associated matter waves traveling in both directions. These
combine to form a standing wave. But whether the matter wave associated
with the particle is considered to be in the form of a traveling wave or a
standing wave is not important for the Bohr atom. For both types the
wavelength is governed by Eq. (31-5).

That  equation  can  also  be  satisfied  for  n = 2,  3,  ...  by  larger  orbits 
into  which  more  oscillations  of  the  matter-wave  amplitude  pattern  just  fit. 
The  first  few  are  shown  in  Fig.  31-7. 


By applying the de Broglie relation $p = h/\lambda$, we can convert the
condition on the wavelength $\lambda$ given by Eq. (31-5) to a condition on the
magnitude $p$ of the electron momentum. We find

$$p = \frac{h}{\lambda} = \frac{h}{2\pi r/n} = \frac{nh}{2\pi r} \qquad \text{for } n = 1, 2, 3, \ldots \tag{31-6}$$


Now  we  evaluate  the  total  energy  of  a hydrogen  atom  when  the  elec- 
tron is  in  the  nth  allowed  orbit.  We  will  obtain  the  total  energy  of  the  atom, 
and  not  just  that  of  the  electron,  since  the  calculations  also  take  into  ac- 
count the  kinetic  energy  of  the  proton  nucleus.  The  proton  will  actually 
move  in  a very  small  circle  about  the  fixed  center  of  mass  of  the  atom, 
keeping  always  on  the  side  opposite  to  that  of  the  electron.  This  is  the  case 
since  its  mass  M is  not  infinitely  large  compared  to  the  electron  mass  m.  It  is 
very easy to include nuclear motion in the calculations. This is done by
pretending that the proton mass is infinite, so that its location is fixed, and
replacing the actual electron mass by the reduced mass $\mu$, given by Eq.
(11-20a):

$$\mu = \frac{mM}{m + M}$$


No  change  is  made  in  the  electron-proton  separation  r.  The  result  obtained 
for  the  total  energy  of  the  reduced-mass  electron  is  the  same  as  that  for  the 
total  energy  in  the  actual  atom. 

We satisfy Newton’s second law by equating the electric force, which
Coulomb’s law says is acting on the reduced-mass electron, to the product
of its mass $\mu$ and its centripetal acceleration $v^2/r$. This gives

$$\frac{e^2}{4\pi\epsilon_0 r^2} = \frac{\mu v^2}{r} \tag{31-7}$$

Here $e$ is the magnitude of both the proton and electron charges, and $r$ is
the radius of the orbit of the electron. Multiplying through by $r$ and
expressing the speed $v$ in terms of the magnitude of the momentum $p$ by
writing $v = p/\mu$, we have

$$\frac{e^2}{4\pi\epsilon_0 r} = \frac{p^2}{\mu}$$

Inserting Eq. (31-6), we get

$$\frac{e^2}{4\pi\epsilon_0 r} = \frac{n^2h^2}{4\pi^2 r^2 \mu} \qquad \text{for } n = 1, 2, 3, \ldots \tag{31-8}$$

Solving for $r$ and then using the quantum number $n$ to label these allowed
values of the Bohr-atom orbital radii, we have

$$r_n = \frac{n^2h^2\epsilon_0}{\pi\mu e^2} \qquad \text{for } n = 1, 2, 3, \ldots \tag{31-9}$$

Each orbit contains $n$ de Broglie wavelengths in its periphery. Those shown
in Fig. 31-7 are to scale since their radii are proportional to $n^2$.
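Equation (31-9) is easy to evaluate numerically. The sketch below computes the first few Bohr radii, once with the reduced mass and once with the bare electron mass, so that the size of the reduced-mass correction can be seen. The function names and the rounded constants are assumptions made for this illustration.

```python
import math

H = 6.626e-34           # Planck's constant, J*s (rounded)
EPSILON_0 = 8.854e-12   # permittivity constant, C^2/(N*m^2) (rounded)
E_CHARGE = 1.602e-19    # magnitude of the electron charge, C (rounded)
M_ELECTRON = 9.109e-31  # kg
M_PROTON = 1.673e-27    # kg


def reduced_mass(m, big_m):
    """Reduced mass mu = m M / (m + M)."""
    return m * big_m / (m + big_m)


def bohr_radius(n, mu):
    """Eq. (31-9): r_n = n^2 h^2 eps0 / (pi mu e^2), in meters."""
    if n < 1:
        raise ValueError("the quantum number n must be 1, 2, 3, ...")
    return n**2 * H**2 * EPSILON_0 / (math.pi * mu * E_CHARGE**2)


if __name__ == "__main__":
    mu = reduced_mass(M_ELECTRON, M_PROTON)
    for n in (1, 2, 3):
        with_mu = bohr_radius(n, mu) * 1e9
        with_m = bohr_radius(n, M_ELECTRON) * 1e9
        print(f"n = {n}: r_n = {with_mu:.4f} nm (reduced mass), {with_m:.4f} nm (electron mass)")
```

The radii stand in the ratio 1 : 4 : 9, and replacing $\mu$ by $m$ changes them only in the fourth significant figure.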


These radii determine the allowed values of the atom’s total energy.
The total energy is found by calculating first the kinetic energy

$$K = \frac{p^2}{2\mu} = \frac{e^2}{8\pi\epsilon_0 r}$$

where Eq. (31-8) has been used to evaluate $p^2/\mu$. Then we calculate the
electric potential energy

$$U = -\frac{e^2}{4\pi\epsilon_0 r}$$

Summing the two gives the total energy

$$E = K + U = \frac{e^2}{8\pi\epsilon_0 r} - \frac{e^2}{4\pi\epsilon_0 r}$$

or

$$E = -\frac{e^2}{8\pi\epsilon_0 r}$$


Fig. 31-8 Energy-level diagram for a hydrogen atom. Each arrow represents
a possible transition of the atom from a state of higher energy to a state
of lower energy. These transitions are discussed later. Levels labeled in the
figure include $E = 0$, $-0.85$ eV, and $E_1 = -13.6$ eV.


Setting $r = r_n$ and using Eq. (31-9), we have

$$E = -\frac{e^2\,\pi\mu e^2}{8\pi\epsilon_0\,n^2h^2\epsilon_0}$$

Canceling and labeling $E$ with the subscript $n$, we finally obtain

$$E_n = -\frac{\mu e^4}{8\epsilon_0^2 h^2 n^2} \qquad \text{for } n = 1, 2, 3, \ldots \tag{31-10}$$

This is the energy quantization formula for the hydrogen atom, according
to the Bohr model.

An energy-level diagram for the hydrogen atom is constructed in Fig.
31-8 from the predictions of Eq. (31-10). The diagram is interpreted in the
same way as the particle-in-a-box energy-level diagram of Fig. 31-4, except
that here all the total energies $E_n$ are negative, as measured from the
energy $E = 0$ that is the reference value chosen for the potential energy.
Thus all the energy levels lie below the line drawn at $E = 0$. The most
negative energy, $E_1$, is obtained for the smallest value of the quantum number,
$n = 1$. This lowest quantum state, or ground state, is the most stable one.
So it is the state in which a hydrogen atom will normally be found. The
atom can have higher energies, but only one of the values $E_2$, $E_3$, and so
forth. These values approach the limit zero as $n$ approaches infinity.
Consequently, the corresponding energy levels bunch together as $n$ becomes
larger. If the total energy of the system is positive, the electron is not bound
to the proton, and the system will not be a hydrogen atom.
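The bunching of the levels toward $E = 0$ can be seen by tabulating Eq. (31-10). The sketch below does so in electron volts; the function name, the rounded constants, and the choice to use the electron mass $m$ in place of $\mu$ by default are assumptions made for this illustration.

```python
H = 6.626e-34           # Planck's constant, J*s (rounded)
EPSILON_0 = 8.854e-12   # permittivity constant, C^2/(N*m^2) (rounded)
E_CHARGE = 1.602e-19    # magnitude of the electron charge, C (rounded)
M_ELECTRON = 9.109e-31  # kg
EV = 1.602e-19          # joules per electron volt (rounded)


def hydrogen_energy_ev(n, mu=M_ELECTRON):
    """Eq. (31-10): E_n = -mu e^4 / (8 eps0^2 h^2 n^2), returned in eV."""
    if n < 1:
        raise ValueError("the quantum number n must be 1, 2, 3, ...")
    joules = -mu * E_CHARGE**4 / (8.0 * EPSILON_0**2 * H**2 * n**2)
    return joules / EV


if __name__ == "__main__":
    previous = None
    for n in range(1, 8):
        energy = hydrogen_energy_ev(n)
        gap = "" if previous is None else f", gap from level below = {energy - previous:.3f} eV"
        print(f"n = {n}: E_n = {energy:.3f} eV{gap}")
        previous = energy
```

Each successive gap is smaller than the one before, and every level remains below zero, as the energy-level diagram indicates.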

Example  31-2  concerns  the  ground  state  of  a hydrogen  atom. 


EXAMPLE  31-2 

Evaluate  the  orbital  radius  and  total  energy  for  the  ground  state  of  the  Bohr  hy- 
drogen atom,  keeping  three  significant  figures. 

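■ Since the reduced electron mass $\mu = mM/(m + M)$ differs from the actual
mass $m$ by a factor $M/(m + M)$ that is smaller than 1 by only about 1 part in 2000,
the required accuracy can be obtained if you use $m$ instead of $\mu$ in these evaluations.
Employing Eq. (31-9) with $n = 1$, you have

$$r_1 = \frac{h^2\epsilon_0}{\pi m e^2} = \frac{(6.63 \times 10^{-34}\ \text{J}\cdot\text{s})^2 \times 8.85 \times 10^{-12}\ \text{C}^2/(\text{N}\cdot\text{m}^2)}{\pi \times 9.11 \times 10^{-31}\ \text{kg} \times (1.60 \times 10^{-19}\ \text{C})^2} = 5.29 \times 10^{-11}\ \text{m} = 0.0529\ \text{nm}$$

To evaluate the energy, you set $n = 1$ in Eq. (31-10) and obtain

$$E_1 = -\frac{me^4}{8\epsilon_0^2h^2} = -\frac{9.11 \times 10^{-31}\ \text{kg} \times (1.60 \times 10^{-19}\ \text{C})^4}{8 \times [8.85 \times 10^{-12}\ \text{C}^2/(\text{N}\cdot\text{m}^2)]^2 \times (6.63 \times 10^{-34}\ \text{J}\cdot\text{s})^2} = -2.17 \times 10^{-18}\ \text{J} = -13.6\ \text{eV}$$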


The value of $E_1$ obtained in Example 31-2 is shown in Fig. 31-8, along
with the values $E_2$, $E_3$, and $E_4$ obtained in a similar way. The magnitude of
$E_1$, 13.6 eV, is the binding energy of the hydrogen atom. It is the minimum
energy  that  must  be  supplied  to  a hydrogen  atom  in  its  ground  state  in 
order  to  ionize  it,  that  is,  to  raise  its  energy  to  the  level  at  which  the  electron 
will  no  longer  be  bound  to  the  proton.  Direct  measurements  of  the  binding 
energy  are  in  excellent  agreement  with  the  value  predicted  by  the  Bohr 
model.  This  value,  13.6  eV,  characterizes  the  scale  of  energies  encountered 
in  atoms  and  molecules.  The  radius  0.0529  nm  is  also  in  good  agreement 
with  measurements  of  the  radius  of  the  hydrogen  atom  in  its  ground  state. 
Its  value  characterizes  the  size  scale  of  atoms  and  molecules. 


By far the most precise experimental test of the predictions of the
Bohr model is obtained by evaluating the wavelengths of photons emitted
when hydrogen atoms in higher energy levels drop to lower energy levels.
Before doing this, let us describe qualitatively what goes on in the process.

Normally,  a hydrogen  atom  is  in  its  ground  state  with  energy  E1.  If  an 
electron or some other atom collides with it, say in an electric discharge, the
energy of the atom can be raised to one of its excited states of energy E2, or
E3, or .... Following the tendency of physical systems to “seek” their
state  of  lowest  energy,  the  atom  will  return  to  its  ground  state  by  emitting 
energy  in  the  form  of  electromagnetic  radiation.  It  may  do  this  in  a single 
transition.  Typically,  though,  it  does  so  by  a sequence  of  transitions  taking 
the  atom  through  excited  states  of  successively  lower  energy  until  it  finally 
reaches the ground state. The energy decrease in each transition from an
initial state of higher energy Eni to a final one of lower energy Enf is
accomplished by the emission of a photon carrying away the required amount of
energy. In other words, the energy content of the photon is hν = Eni − Enf,
where ν is the frequency of the photon. Its wavelength is λ = c/ν =
hc/(Eni − Enf).

As  an  example,  say  a hydrogen  atom  is  excited  by  an  electron  colliding 
with it to the state of energy E6. It might drop first to the state of energy E4,
then to the one of energy E3, and finally to the ground state of energy E1.
In the process three photons are emitted, the first of wavelength λ =
hc/(E6 − E4), the next of wavelength λ = hc/(E4 − E3), and the last of
wavelength λ = hc/(E3 − E1). These transitions are illustrated schematically
in Fig. 31-8.
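The three wavelengths in this cascade can be evaluated directly from λ = hc/(Eni − Enf). The sketch below assumes level energies of −13.6 eV/n² and the standard combination hc ≈ 1239.84 eV·nm, which is not quoted in the text; the names are illustrative.

```python
HC_EV_NM = 1239.84  # h*c in eV*nm (assumed standard value)
E1_EV = -13.6       # ground-state energy from Example 31-2, in eV


def level_ev(n: int) -> float:
    """Energy of the n-th Bohr level in eV."""
    return E1_EV / n**2


def photon_wavelength_nm(n_i: int, n_f: int) -> float:
    """Wavelength of the photon emitted in the transition n_i -> n_f."""
    if n_i <= n_f:
        raise ValueError("emission requires n_i > n_f")
    return HC_EV_NM / (level_ev(n_i) - level_ev(n_f))


cascade = [(6, 4), (4, 3), (3, 1)]
wavelengths = [photon_wavelength_nm(n_i, n_f) for n_i, n_f in cascade]
photon_energies = [HC_EV_NM / lam for lam in wavelengths]

for (n_i, n_f), lam in zip(cascade, wavelengths):
    print(f"{n_i} -> {n_f}: wavelength = {lam:8.1f} nm")
# The photon energies must add up to the total energy released, E6 - E1.
print(f"sum of photon energies = {sum(photon_energies):.4f} eV")
```

The first two photons fall in the infrared and the last in the ultraviolet, so none of the three lies in the visible range.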

If  a large  number  of  hydrogen  atoms  are  excited  by  running  an  elec- 
tric discharge  through  hydrogen  gas,  or  by  a flame,  transitions  will  occur 
between  all  pairs  of  higher  and  lower  energy  levels.  For  each  transition 
there  is  a corresponding  photon  wavelength.  This  set  of  discrete  wave- 
lengths constitutes  the  hydrogen  spectrum  — that  is,  the  spectrum  of  elec- 
tromagnetic radiation  emitted  by  hydrogen  atoms  releasing  their  excitation 
energy. 


Fig. 31-9 Schematic diagram of an apparatus used to measure the
spectrum of radiation emitted by excited hydrogen atoms. [Figure label: discharge tube.]


Figure  31-9  indicates  the  apparatus  used  to  measure  the  spectrum  of 
radiation  emitted  by  hydrogen  (or  other)  atoms.  Excited  atoms  are  pro- 
duced by  passing  an  electric  discharge  through  a region  filled  with  hy- 
drogen gas.  That  part  of  the  radiation  emitted  when  atoms  deexcite  which 
forms  a parallel  beam  is  selected  by  a slit,  and  then  falls  on  a prism  or  dif- 
fraction grating.  Because  the  angle  of  refraction  or  diffraction  depends  on 
the  wavelength  of  the  radiation,  the  incident  beam  is  broken  into  a set  of 
beams  emerging  from  the  prism  or  diffraction  grating  at  various 
angles — one  for  each  wavelength  component  of  the  spectrum  of  radiation 
emitted by the atoms. These beams are recorded on a photographic film.

Figure  31-10  is  a photograph  of  the  part  of  the  hydrogen  spectrum 
that  is  in  the  visible  range  of  wavelengths.  It  is  called  a line  spectrum  be- 
cause of  its  appearance.  Every  line  is  formed  by  a beam  of  a particular 
wavelength.  In  contrast  to  the  continuous  spectra  emitted  by  interacting 
atoms  in  the  surface  of  a heated  solid,  free  atoms  (or  molecules)  of  any 
species  emit  line  spectra.  Each  such  spectrum  is  different  from  those  of 
other  species  and  is  characteristic  of  the  atoms  (or  molecules)  producing  it. 
This  fact  is  used  frequently  by  chemists  in  analyzing  the  composition  of  an 
unknown  sample. 

The hydrogen line spectrum is the simplest of them all, in that there is
the  greatest  degree  of  regularity  in  the  wavelengths  of  its  lines.  In  1885  the 
Swiss  high  school  teacher  J.  J.  Balmer  (1825-1890)  found,  by  trial  and 
error,  an  empirical  formula  which  gave  an  extremely  accurate  fit  to  the 
measured  wavelengths  of  lines  in  the  visible  part  of  the  spectrum.  Ex- 
pressed in  modern  terms,  the  Balmer  formula  is 


$$\frac{1}{\lambda} = R_H\left(\frac{1}{2^2} - \frac{1}{n^2}\right) \qquad \text{for } n = 3, 4, 5, \ldots \tag{31-11}$$


Fig. 31-10 The hydrogen spectrum in the visible range of wavelengths. The
symbol H∞ is used to indicate the series limit. That is, H∞ is the
shortest-wavelength line in this series of lines. [Figure labels: a wavelength
axis (in nm); line designations beginning with Hα; colors running from red
through blue and violet to the near ultraviolet.] (Spectrum from W. Finkelnburg,
Structure of Matter, Springer-Verlag, Heidelberg, 1964.)

The quantity RH is the Rydberg constant for hydrogen (named after a
spectroscopist who modified the Balmer formula to fit the spectra of alkali
metal atoms). For n = 3 the formula yields the wavelength of the line
designated Hα, for n = 4 that of Hβ, and so forth. Spectroscopic wavelength
measurements are very precise, so the experimental value of RH is known
to a high degree of accuracy. A recently quoted value is

$$R_H = 10{,}967{,}757.6 \pm 1.2\ \text{m}^{-1} \tag{31-12}$$
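With this value of RH the visible lines can be computed at once. The sketch below assumes the Balmer formula in its standard modern form, 1/λ = RH(1/2² − 1/n²) for n = 3, 4, 5, ..., and takes RH from Eq. (31-12); the function name is illustrative.

```python
R_H = 10_967_757.6  # Rydberg constant for hydrogen, m^-1, from Eq. (31-12)


def balmer_wavelength_nm(n: int) -> float:
    """Wavelength in nm of the Balmer line with upper quantum number n."""
    if n < 3:
        raise ValueError("the Balmer formula applies for n = 3, 4, 5, ...")
    inverse_wavelength = R_H * (1 / 2**2 - 1 / n**2)  # in m^-1
    return 1e9 / inverse_wavelength


# As n grows without limit, 1/lambda approaches R_H / 4: the series limit.
series_limit_nm = 1e9 * 4 / R_H

for n in range(3, 8):
    print(f"n = {n}: wavelength = {balmer_wavelength_nm(n):7.2f} nm")
print(f"series limit: {series_limit_nm:7.2f} nm")
```

The n = 3 line comes out near 656 nm (red) and the n = 4 line near 486 nm, while the series limit lies near 365 nm in the near ultraviolet.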


The experimental results summarized by Eqs. (31-11) and (31-12)
were used by Bohr to test the predictions of his model of the hydrogen
atom. This was done by invoking energy conservation to equate the energy
hν of the photon emitted when the atom makes a transition between states
of higher energy Eni and lower energy Enf to the difference between these
energies:

$$h\nu = E_{n_i} - E_{n_f}$$

Then the relation ν = c/λ was used to determine 1/λ, the reciprocal of the
photon’s wavelength, from its frequency ν. This gave

$$\frac{1}{\lambda} = \frac{\nu}{c} = \frac{E_{n_i} - E_{n_f}}{hc}$$


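Employing his energy quantization formula, Eq. (31-10),

$$E_n = -\frac{\mu e^4}{8\epsilon_0^2 h^2}\,\frac{1}{n^2}$$

to evaluate the energies, Bohr then obtained

$$\frac{1}{\lambda} = \frac{\mu e^4}{8\epsilon_0^2 h^3 c}\left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right) \tag{31-13}$$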


Finally, he fixed nf to the value nf = 2 so that the possible values of ni
satisfying the condition that Eni be a higher energy than Enf become ni = 3, 4,
5, .... This allowed the results to be written as

$$\frac{1}{\lambda} = \frac{\mu e^4}{8\epsilon_0^2 h^3 c}\left(\frac{1}{2^2} - \frac{1}{n^2}\right) \qquad \text{for } n = 3, 4, 5, \ldots$$


Thus Bohr derived a formula that is identical to Balmer’s empirical formula,
provided that

$$R_H = \frac{\mu e^4}{8\epsilon_0^2 h^3 c} \tag{31-14}$$


When we calculate the reduced electron mass μ = mM/(m + M) from the
best available values of the electron and proton masses m and M, and we
use the most accurate values available for e, ε0, h, and c, then Eq. (31-14)
yields RH = 10,968,100 m⁻¹. Comparing this predicted value with the measured
value of RH quoted in Eq. (31-12), you can see that the Bohr-model
predictions agree with experiment to within 3 parts in 100,000!
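The comparison can be repeated with a short calculation of RH = μe⁴/(8ε0²h³c). The sketch below is only illustrative: it assumes present-day values of the constants, not the ones available when the text was written, so the predicted figure differs slightly from the 10,968,100 m⁻¹ quoted above and should agree with Eq. (31-12) even more closely.

```python
# Assumed constants (modern values, not the book's).
H = 6.62607015e-34          # Planck's constant, J*s
C = 299792458.0             # speed of light, m/s
EPS0 = 8.8541878128e-12     # permittivity of free space, C^2/(N*m^2)
E_CHARGE = 1.602176634e-19  # electron charge magnitude, C
M_E = 9.1093837015e-31      # electron mass m, kg
M_P = 1.67262192369e-27     # proton mass M, kg

R_H_MEASURED = 10_967_757.6  # m^-1, from Eq. (31-12)


def reduced_mass(m: float, big_m: float) -> float:
    """Reduced mass mu = m M / (m + M)."""
    return m * big_m / (m + big_m)


def rydberg_constant(mu: float) -> float:
    """Predicted Rydberg constant mu e^4 / (8 eps0^2 h^3 c), in m^-1."""
    return mu * E_CHARGE**4 / (8 * EPS0**2 * H**3 * C)


mu = reduced_mass(M_E, M_P)
r_h_predicted = rydberg_constant(mu)
relative_difference = abs(r_h_predicted - R_H_MEASURED) / R_H_MEASURED

print(f"predicted R_H = {r_h_predicted:,.1f} m^-1")
print(f"relative difference from Eq. (31-12) = {relative_difference:.1e}")
# Using the bare electron mass instead of mu shifts the result by
# about 1 part in 2000, far more than the experimental uncertainty.
print(f"with m in place of mu: {rydberg_constant(M_E):,.1f} m^-1")
```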

Example  31-3  makes  use  of  Bohr’s  results  in  the  form  of  Eq.  (31-13). 


EXAMPLE  31-3 

For  nf  = 1,  ni  = 2,  3,  4,  ...  , Eq.  (31-13)  of  the  Bohr  model  yields  the  wavelengths 
of  a series  of  hydrogen  lines  different  from  the  series  shown  in  Fig.  31-10.  Deter- 
mine the  longest  wavelength  of  this  series,  keeping  three  significant  figures. 

■ To  the  required  accuracy, 

$$R_H = 1.10\times10^{7}\ \text{m}^{-1}$$




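Using this value and the values given for nf and ni, you can write Eq. (31-13) as

$$\frac{1}{\lambda} = 1.10\times10^{7}\ \text{m}^{-1} \times \left(\frac{1}{1^2} - \frac{1}{n^2}\right) \qquad \text{for } n = 2, 3, 4, \ldots$$

The longest wavelength will be obtained for n = 2. Its reciprocal is

$$\frac{1}{\lambda} = 1.10\times10^{7}\ \text{m}^{-1} \times \frac{3}{4}$$

So

$$\lambda = \frac{4}{3 \times 1.10\times10^{7}\ \text{m}^{-1}} = 1.21\times10^{-7}\ \text{m} = 121\ \text{nm}$$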


This  wavelength  is  in  the  ultraviolet,  as  are  the  wavelengths  of  all  the  other 
lines  of  the  nf  = 1 series.  What  is  the  shortest  wavelength  of  the  series? 
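The evaluation in Example 31-3 can be written as a small function of the two quantum numbers. This is a sketch using the rounded value RH = 1.10 × 10⁷ m⁻¹ from the example; the function name is hypothetical, and other values of ni or nf can be substituted to explore the rest of the series.

```python
R_H = 1.10e7  # Rydberg constant to three significant figures, m^-1


def inverse_wavelength(n_i: int, n_f: int, rydberg: float = R_H) -> float:
    """Right side of Eq. (31-13): R_H * (1/n_f^2 - 1/n_i^2), in m^-1."""
    if n_i <= n_f:
        raise ValueError("emission requires n_i > n_f")
    return rydberg * (1 / n_f**2 - 1 / n_i**2)


# Longest wavelength of the nf = 1 series: the smallest energy step, ni = 2.
longest_m = 1 / inverse_wavelength(2, 1)
print(f"longest wavelength = {longest_m:.3e} m = {longest_m * 1e9:.0f} nm")
```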


Figure  31-11  shows  the  hydrogen  energy-level  diagram  and  sets  of 
arrows  representing  the  transitions  that  produce  the  lines  of  the  series  for 
nf = 1, nf = 2, and nf = 3. The series in the visible part of the spectrum is
known as the Balmer series. The ultraviolet and infrared series are called
the Lyman series and the Paschen series, respectively, after the experimenters
who discovered them. Two other hydrogen series are known in
the infrared. They are for nf = 4 and nf = 5. In all cases the measured
wavelengths  are  in  accurate  agreement  with  the  predictions  of  the  Bohr 
model. 


Fig. 31-11 The hydrogen atom energy-level diagram, showing the transitions which produce
the three most prominent series of lines in the hydrogen spectrum. [Figure axes: quantum
number n and energy E (in eV) for the levels; wavelength λ (in nm) and frequency ν
(in 10¹² Hz) scales for the spectral lines.] (From R. Eisberg and R.
Resnick, Quantum Physics of Atoms, Molecules, Solids, Nuclei and Particles, John Wiley, New York,
1974.)


In  fact,  the  Bohr  model  predicts  with  great  accuracy  the  wavelengths 
of the spectral lines emitted in the deexcitation of any system containing
two particles of mass m and M which are bound together by the electric attraction
of their opposite charges. If they are singly charged, then Eq.
(31-13) can be employed immediately by using the proper value of reduced
mass μ = mM/(m + M). If one has a charge of magnitude Z times a single
electron  charge,  then  a factor  of  Z2  must  also  be  inserted  in  the  numerator 
on  the  right  side  of  the  equation.  Can  you  justify  this  statement?  Examples 
of  the  systems  referred  to  include  the  singly  ionized  helium  atom  that  con- 
tains only  one  electron,  the  muonic  atom  comprised  of  a proton  and  a 
muon,  and  the  system  consisting  of  a positron  and  an  electron  called  posi- 
tronium.  The  Bohr  model  also  predicts  approximate  values  for  the 
longer-wavelength  part  of  the  spectra  emitted  by  alkali  metal  atoms.  The 
reason  is  that  these  atoms  contain  a single  electron  outside  a spherically 
symmetrical  core  of  net  charge  +e  that  is  not  disturbed  in  the  low-energy 
processes  that  excite  the  production  of  relatively  long-wavelength  radia- 
tion. 
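The scaling with reduced mass and with Z² can be illustrated numerically. The sketch below assumes that the ground-state energy scales as Z²μ, as the modified Eq. (31-13) implies, and it uses assumed modern mass ratios and a Rydberg energy of 13.6057 eV for an infinitely heavy nucleus; none of these numbers are taken from the text.

```python
RYDBERG_EV = 13.605693  # assumed: hydrogen-like binding energy for mu = m_e, Z = 1

# Assumed particle masses in units of the electron mass.
ELECTRON = 1.0
PROTON = 1836.15267
ALPHA = 7294.29954      # helium-4 nucleus
MUON = 206.768283


def reduced_mass(m: float, big_m: float) -> float:
    """Reduced mass mu = m M / (m + M), in electron masses."""
    return m * big_m / (m + big_m)


def ground_state_ev(z: int, m: float, big_m: float) -> float:
    """Ground-state energy -Z^2 (mu / m_e) * 13.6 eV of a two-particle system."""
    return -RYDBERG_EV * z**2 * reduced_mass(m, big_m)


systems = {
    "hydrogen": ground_state_ev(1, ELECTRON, PROTON),
    "singly ionized helium": ground_state_ev(2, ELECTRON, ALPHA),
    "positronium": ground_state_ev(1, ELECTRON, ELECTRON),
    "muonic atom (proton + muon)": ground_state_ev(1, MUON, PROTON),
}
for name, energy in systems.items():
    print(f"{name:28s} E1 = {energy:10.2f} eV")
```

Positronium, with μ = m/2, is bound half as tightly as hydrogen, while the heavy muon makes the muonic atom roughly 186 times more tightly bound.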


How  much  time  does  a hydrogen  atom  spend  in  a particular  excited 
state  before  it  drops  to  a state  of  lower  energy?  Quite  enough  for  the  time 
to  be  measured.  For  instance,  experiment  shows  that  the  average  time 
during which a hydrogen atom lives in the state of energy E2 before leaving
it to return to the ground state is about 10⁻⁸ s. This average lifetime of an
excited state varies from state to state, but generally is of the same order of
magnitude as that of the state of energy E2. If you refer to the time-energy
uncertainty principle calculation of Example 30-11, you will see that this
means the excited-state energy levels are not completely “sharp,” but instead
extend over an energy range ΔE which is of the order of magnitude
ΔE ≈ h/(4π × 10⁻⁸ s) ≈ 10⁻⁸ eV. The only energy level of the atom that
has no “width” is the ground-state level. Can you explain why ΔE = 0 for
the  ground  state? 
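The order-of-magnitude estimate of the level width is a one-line calculation. The sketch below evaluates h/(4πΔt) for a lifetime of 10⁻⁸ s and converts the result to electron volts; the rounded constants and the function name are assumptions for illustration.

```python
import math

H = 6.626e-34          # Planck's constant, J*s
EV_IN_JOULES = 1.602e-19  # 1 eV expressed in joules


def energy_width_ev(lifetime_s: float) -> float:
    """Energy spread h / (4 pi dt) of a state with average lifetime dt, in eV."""
    if lifetime_s <= 0:
        raise ValueError("lifetime must be positive")
    return H / (4 * math.pi * lifetime_s) / EV_IN_JOULES


width = energy_width_ev(1e-8)
print(f"energy width = {width:.1e} eV")
```

The result is a few times 10⁻⁸ eV, which is tiny compared with the electron-volt spacing of the levels themselves. A longer lifetime gives a proportionally narrower level.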

It  is  not  possible  to  calculate  the  lifetimes  of  hydrogen  atom  excited 
states  on  the  basis  of  the  Bohr  model.  The  reason  is  that  the  model  pro- 
vides no  picture  at  all  of  what  is  happening  when  a photon  is  emitted  in  a 
transition  from  an  excited  state.  The  inability  to  calculate  lifetimes  repre- 
sents a serious  deficiency  of  the  model — as  Bohr  was  well  aware  from  the 
very  beginning. 

There  is  another  difficulty  with  the  Bohr  model.  Although  it  says  there 
is  no  state  of  lower  energy  than  the  ground  state  to  which  the  ground-state 
atom  can  make  a transition,  the  model  gives  no  explanation  of  why  the  con- 
tinuing centripetal  acceleration  of  the  electron  in  that  state  does  not  cause  it 
to  radiate  away  its  energy,  spiraling  toward  the  nucleus  while  it  does  so.  A 
calculation,  based  on  Eq.  (27-41)  for  the  rate  of  radiation  by  an  accelerated 
charge, predicts that it would take only about 10⁻¹¹ s for an electron initially
in  the  ground  state  of  a hydrogen  atom  to  radiate  away  so  much  of  its  en- 
ergy that  it  spiraled  right  into  the  nucleus.  Indeed,  it  was  for  this  reason 
that  many  predecessors  of  Bohr,  J.  J.  Thomson  among  them,  had  consid- 
ered and  rejected  a planetary  atom.  However,  experiment  shows  it  does 
not  happen.  A hydrogen  atom  in  its  ground  state  is  completely  stable  and 
has an indefinite lifetime, instead of a lifetime of only 10⁻¹¹ s.
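The size of this classical collapse time can be checked with a rough calculation. The sketch below rests on several assumptions that are not spelled out in the text: that Eq. (27-41) is the usual nonrelativistic radiation rate P = e²a²/(6πε0c³), that the electron stays on a nearly circular orbit as it spirals in, and that the orbit starts at the Bohr radius. Under those assumptions the radiated power equals the rate of loss of orbital energy, which integrates to t = r³/(4rₑ²c), where rₑ = e²/(4πε0mc²).

```python
import math

# Assumed constants (modern values).
E_CHARGE = 1.602176634e-19  # C
EPS0 = 8.8541878128e-12     # C^2/(N*m^2)
M_E = 9.1093837015e-31      # kg
C = 299792458.0             # m/s
BOHR_RADIUS = 5.29e-11      # m, ground-state radius from Example 31-2


def classical_electron_radius() -> float:
    """Length scale r_e = e^2 / (4 pi eps0 m c^2) that appears in the result."""
    return E_CHARGE**2 / (4 * math.pi * EPS0 * M_E * C**2)


def spiral_in_time(r_start: float) -> float:
    """Time for a radiating classical electron to spiral from r_start to r = 0.

    Obtained by integrating dr/dt = -(4/3) r_e^2 c / r^2, which follows from
    equating the radiated power to the rate of decrease of orbital energy.
    """
    r_e = classical_electron_radius()
    return r_start**3 / (4 * r_e**2 * C)


collapse_time = spiral_in_time(BOHR_RADIUS)
print(f"classical collapse time = {collapse_time:.1e} s")
```

The estimate comes out at a few times 10⁻¹¹ s at most, consistent with the order of magnitude quoted above.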

Yet another problem with the Bohr model is that it predicts that an electron
in the ground state of hydrogen has an angular momentum of magnitude
L = rp, because it is moving in an orbit of radius r with momentum of
magnitude p. If you set n = 1 in Eq. (31-6) and write it as rp = h/2π, you
will see that the model predicts L = h/2π in the ground state. But experiments
indicate that the magnitude of the angular momentum arising from
the orbital motion of an electron in the ground state of hydrogen is L = 0.

These  difficulties  with  the  Bohr  model  stem  from  the  fact  that  the  pre- 
cise orbits  envisaged  in  the  model  conflict  with  the  position-momentum  un- 
certainty principle.  If  the  radial  coordinate  of  an  electron  had  the  precise 
value  given  by  Eq.  (31-9),  then  its  momentum  in  the  radial  direction  would 
be  completely  uncertain.  This  momentum  component  could  be  measured  to 
be  anything  from  zero  to  infinity,  and  consequently  there  would  be  an  infi- 
nite uncertainty  in  the  kinetic  energy  of  the  electron  because  it  contains  a 
term  proportional  to  the  square  of  the  radial  momentum  component.  It 
follows that there would be an infinite uncertainty in the electron’s total energy.
Thus the position-momentum uncertainty principle does not allow a
hydrogen atom both to have a precise total energy E1 in its ground state, as
experiment shows it does, and to move in an orbit of precise radius r1, as
Bohr  assumed  it  does. 

In  fact,  since  the  measurements  find  the  electron  has  no  orbital  angu- 
lar momentum  in  the  hydrogen  atom  ground  state,  its  motion  in  that  state 
can  have  little  resemblance  to  any  kind  of  planetary  orbit.  Instead,  the  mo- 
tion has  more  resemblance  to  a microscopic  billiard  ball  performing  some 
sort  of  oscillation  about  the  nucleus  along  a diameter  of  the  atom.  A sche- 
matic illustration  of  the  motion  is  given  in  Fig.  31-12  by  taking  the  diameter 
to  be  along  the  x axis  of  some  coordinate  system.  But  it  must  be  appreciated 
that  any  orientation  of  the  oscillation  axis  would  be  equally  probable.  (That  such  an 
oscillation  would  appear  to  take  the  electron  through  the  nucleus  at  the  ori- 
gin need  cause  no  concern,  since  we  learned  in  Example  31-1  that  the  elec- 
tron could  not  be  captured  by  the  nucleus.  But  more  basic  is  the  fact  that 
electrons  are  not  simply  microscopic  billiard  balls.) 

This  is  a much  more  correct  picture  of  what  an  electron  does  in  the 
ground  state  of  a hydrogen  atom  than  the  one  provided  by  the  Bohr 
model.  And,  in  combination  with  the  position-momentum  uncertainty 
principle,  it  yields  a very  acceptable  estimate  of  the  energy  and  size  of  the 
atom  in  its  ground  state.  We  start  the  estimate  by  saying  that  the  x coordi- 
nate of  the  electron  whose  motion  is  depicted  in  Fig.  31-12  is  known  only  to 
be  in  the  range  ±r.  The  quantity  r is  the  radius  of  the  ground-state  hy- 
drogen atom.  (That  is,  r is  the  radius  of  a sphere  obtained  by  allowing  the 
oscillation  axis  to  assume  all  possible  orientations.)  Thus  the  uncertainty  in 
the x coordinate of the electron is Δx = r.

Fig. 31-12 Schematic representation of an electron in the ground state of a
hydrogen atom. The electron moves back and forth along a diameter of the
atom, whose radius is r. The diameter can have any orientation in space. But
whatever the orientation, the x axis can be chosen to lie along it.
[Figure axis: x running from −r through 0 to +r.]

As we learned in applying the uncertainty principle to the ground state
of the particle in a box, we cannot expect the momentum uncertainty Δp to
be at the extreme lower limit set by the position-momentum uncertainty
principle in the ground state of a system. So we make an educated guess
and take the momentum uncertainty to be twice that value, setting

$$\Delta p \simeq \frac{h}{2\pi\,\Delta x} = \frac{h}{2\pi r}$$

If the maximum magnitude of the electron momentum is p, then through
an oscillation cycle the actual value is in the range ±p, and the uncertainty
in the momentum is Δp = p. Thus

$$p = \frac{h}{2\pi r}$$

The maximum kinetic energy of the electron of reduced mass μ will be

$$K = \frac{p^2}{2\mu} = \frac{h^2}{8\mu\pi^2 r^2}$$

The maximum potential energy will be

$$U = -\frac{e^2}{4\pi\epsilon_0 r}$$


The average over an oscillation cycle of each of these quantities is one-half
their maximum value, and the total energy of the electron is the sum of the
two average energies. But since this is only an estimate, we will ignore the
factor 1/2 and simply take the total energy E = K + U to be

$$E = \frac{h^2}{8\mu\pi^2 r^2} - \frac{e^2}{4\pi\epsilon_0 r} \tag{31-15}$$


Now  we  will  find  the  value  of  the  ground-state  radius  r which  mini- 
mizes the  total  energy  E.  Physically,  we  assume  the  atom  adjusts  itself  so  as 
to  find  its  state  of  lowest  possible  total  energy.  All  physical  systems  tend  to 
do  just  this.  Mathematically,  we  treat  r as  a variable,  compute  dE/dr,  and 
then  invoke  the  condition  for  a minimum  by  setting  the  derivative  to  zero. 
We  have 


$$\frac{dE}{dr} = \frac{-2h^2}{8\mu\pi^2 r^3} - \frac{-e^2}{4\pi\epsilon_0 r^2} = 0$$


Transposing,  canceling,  and  then  solving  for  r to  determine  the  value 
giving  the  minimum,  we  obtain 


$$r = \frac{h^2\epsilon_0}{\pi\mu e^2} \tag{31-16}$$


Substituting  this  value  into  Eq.  (31-15)  to  find  the  corresponding  value  of 
the  total  energy,  we  get 

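$$E = \frac{h^2}{8\mu\pi^2}\,\frac{\pi^2\mu^2 e^4}{h^4\epsilon_0^2} - \frac{e^2}{4\pi\epsilon_0}\,\frac{\pi\mu e^2}{h^2\epsilon_0}$$

or

$$E = -\frac{\mu e^4}{8\epsilon_0^2 h^2} \tag{31-17}$$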
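The minimization can also be carried out numerically as a check on the algebra. The sketch below takes the estimated total energy to be E(r) = h²/(8μπ²r²) − e²/(4πε0r), locates its minimum with a golden-section search, and compares the result with the closed forms r = h²ε0/(πμe²) and E = −μe⁴/(8ε0²h²). The constants are assumed modern values, the electron mass stands in for μ, and the search routine is an illustrative helper, not part of the text.

```python
import math

# Assumed constants (modern values); the electron mass is used for mu.
H = 6.62607015e-34
EPS0 = 8.8541878128e-12
E_CHARGE = 1.602176634e-19
MU = 9.1093837015e-31


def total_energy(r: float) -> float:
    """Estimated energy: kinetic term h^2/(8 mu pi^2 r^2) plus potential term."""
    kinetic = H**2 / (8 * MU * math.pi**2 * r**2)
    potential = -E_CHARGE**2 / (4 * math.pi * EPS0 * r)
    return kinetic + potential


def golden_section_minimum(f, lo: float, hi: float, tol: float) -> float:
    """Locate the minimum of a single-minimum function f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return (a + b) / 2


r_numeric = golden_section_minimum(total_energy, 1e-12, 1e-9, 1e-15)
r_analytic = H**2 * EPS0 / (math.pi * MU * E_CHARGE**2)
e_analytic = -MU * E_CHARGE**4 / (8 * EPS0**2 * H**2)

print(f"numerical minimum at r = {r_numeric:.4e} m")
print(f"closed-form radius     = {r_analytic:.4e} m")
print(f"E at the minimum       = {total_energy(r_numeric) / E_CHARGE:.3f} eV")
print(f"closed-form energy     = {e_analytic / E_CHARGE:.3f} eV")
```

Moving r away from the minimum in either direction raises the estimated energy: the kinetic term dominates at small r and the potential term at large r.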


On  comparison  with  Eqs.  (31-9)  and  (31-10),  it  will  be  seen  that  this  es- 
timate gives  the  same  ground-state  radius  and  energy  as  does  the  Bohr 
model. Of course, we have manipulated the uncertainty principle estimate
by twice inserting factors of 2. The Bohr model led directly to these results
that  agree  with  experiment.  But  the  Bohr-model  agreement  is  fortuitous 
because  the  motion  specified  in  the  model  is  far  from  the  actual  motion, 
since  it  is  inconsistent  with  the  uncertainty  principles.  The  radial-motion 
model  of  the  hydrogen  atom  ground  state  is  much  closer  to  reality,  and  it  is 
no  accident  that  the  uncertainty  principle  estimate  based  on  this  model 
comes  out  very  well.  In  this  model  the  wave  function  associated  with  the 
electron  in  the  ground  state  is  a standing  wave  along  an  axis  of  any  orienta- 
tion passing  through  the  center  of  the  atom.  Its  amplitude  is  maximum  at 
the  center  and  approaches  zero  at  both  ends  of  the  axis.  In  other  words,  the 
ground-state  wave  function  is  a spherically  symmetrical,  three-dimensional 
standing  wave  that  has  a node  everywhere  on  a sphere  (called  a nodal 
sphere)  at  the  outer  limits  of  the  atom.  The  intensity  pattern  obtained  by 
squaring  the  wave  function  is  indicated  in  the  upper  left  corner  of  Fig. 
31-13.  Shading  is  used  to  represent  the  three-dimensional  appearance  of 
the  intensity.  Since  this  quantity  is  proportional  to  the  probability  density, 
or  probability  of  finding  the  electron  in  various  locations,  the  shading  also 
indicates  the  “appearance”  of  the  hydrogen  atom  “electron  cloud”  when 
n = 1,  so  that  the  atom  is  in  its  lowest  energy  level. 

Calculations  involving  the  soon-to-be  discussed  Schrodinger  equation 
show that when the hydrogen atom is in the n = 2 energy level, there are
several  different  possible  forms  for  its  wave  function.  A spherically  sym- 
metrical one  has  an  additional  nodal  sphere  inside  the  considerably  ex- 
panded nodal  sphere  at  the  outer  surface  of  the  atom.  Its  probability  den- 
sity, shown  in  Fig.  31-13  immediately  below  the  ground-state  probability 
density,  consists  of  a central  blob  surrounded  by  a spherical  shell.  The  two 
other  wave  functions  for  the  n = 2 energy  level  have  nodal  surfaces  in  the 
form  of  cones  whose  apexes  are  at  the  center  of  the  atom,  as  well  as  the 
spherical  nodal  surface  at  its  outer  surface.  This  gives  an  angular  depen- 
dence to  them  and  to  their  probability  densities  displayed  to  the  right  of 
the  spherically  symmetrical  one.  Measurements  show  that  when  a hydrogen 
atom  is  in  the  n = 2 quantum  state  with  the  spherically  symmetrical  proba- 
bility density,  it  has  no  orbital  angular  momentum,  just  as  when  it  is  in  its 
ground  state.  The  measurements  also  show  that  the  atom  does  have  orbital 
angular  momentum  when  it  is  in  any  of  the  other  n = 2 quantum  states. 
For the two states whose probability density looks like a barrel, the component
of orbital angular momentum along the vertical symmetry axis is measured
to be either +h/2π or −h/2π. In the state whose probability density
has the appearance of an upper and lower cap, the orbital angular
momentum  vector  has  the  same  magnitude  as  in  the  barrel  states,  but  it  lies 
somewhere  in  the  plane  perpendicular  to  the  symmetry  axis  so  that  its  com- 
ponent along  the  axis  is  0.  The  point  of  significance  is  that  there  is  orbital 
motion  of  the  electron  in  these  excited  states,  in  addition  to  its  radial  mo- 
tion. 


The wave functions whose probability densities are illustrated in Fig.
31-13 can be used to predict a variety of properties of the hydrogen atom,
including orbital angular momentum and lifetime. The predictions are in
complete agreement with experiment. Orbital angular momentum predictions
are made by integrating over space the product of the wave function
times a quantity involving certain angular derivatives of the wave function.
For a spherically symmetrical wave function these derivatives are zero, and
therefore the predicted orbital angular momentum is zero. Because of
their complexity, we cannot give more than this very brief sketch of the orbital
angular momentum calculations. The lifetime calculations are
described in a little more detail at the end of Sec. 31-3.

Fig. 31-13 Artist’s impression of the electron
probability density patterns for various
quantum states of the hydrogen atom. Each of
these patterns is supposed to be symmetric
about the vertical axis passing through its
center. They are labeled by a set of three
quantum numbers: n, l, m. Three quantum
numbers arise because the system is three-dimensional.
But for the hydrogen atom, the
total energy depends on only the single
quantum number n. The quantum number n
also plays the principal role in determining the
maximum radius at which the probability density
has an appreciable value, although there is
a small dependence of this “size” on the other
two quantum numbers. The “shape” of the
probability density pattern does depend
strongly on the quantum numbers l and m. Detailed
calculations show that the shape of the
probability density pattern is related to the orbital
angular momentum of the atom. (From R.
Eisberg and R. Resnick, Quantum Physics of Atoms,
Molecules, Solids, Nuclei, and Particles, John
Wiley, New York, 1974.) [Panel labels legible in this copy:
n = 3, l = 1, m = 0 and n = 3, l = 2, m = 0.]

Calculating  the  probability  densities,  or  orbital  angular  momenta,  or 
lifetimes,  requires  an  exact  knowledge  of  the  hydrogen  atom  wave  func- 
tions. But  even  the  simplest  of  them  is  too  complicated  to  be  found  without 
the  aid  of  an  equation  that  specifies  the  form  completely.  Such  an  equation 
was developed in 1925 by Erwin Schrödinger (1887-1961). We obtain the
Schrödinger equation for one-dimensional wave functions ψ(x) in Sec. 31-3,
using an argument that is much less sophisticated than the one he used. In
Sec.  31-4  it  is  applied  to  find  some  of  the  wave  functions,  probability  den- 
sities, and  energy  levels  of  a harmonic  oscillator. 


31-3 SCHRÖDINGER’S EQUATION


The  systematic  study  of  the  mechanics  of  particles  in  the  quantum  domain 
is  called  quantum  mechanics.  Schrodinger’s  equation  is  to  quantum  me- 
chanics what  Newton’s  laws  of  motion  are  to  newtonian  mechanics.  Like 
Newton’s  laws,  Schrodinger’s  equation  was  constructed  on  the  basis  of  pos- 
tulates designed  to  ensure  its  agreement  with  a few  key  phenomena,  and  it 
was  subsequently  found  capable  of  explaining  a very  large  number  of  other 
phenomena.  These  explanations  are  obtained  by  solving  the  equation  in 
the  particular  form  that  it  assumes  in  a given  system,  thereby  finding  the 
wave  functions  associated  with  the  particle  of  the  system  in  its  various 
quantum  states.  Then  information  about  the  behavior  of  the  particle  is  ex- 
tracted from  the  wave  functions.  For  instance,  by  squaring  the  wave  func- 
tion the  probability  density  is  obtained,  and  this  quantity  determines  where 
the  particle  is  likely  to  be  found  in  a measurement  of  its  position.  The  en- 
ergy levels  of  the  system  are  also  found  in  the  process  of  solving  its 
Schrodinger  equation. 


Before  solving  Schrodinger’s  equation,  you  must  know  what  the  equa- 
tion is.  This  section  takes  you  through  an  argument  that  starts  with  three 
postulated  properties  of  the  equation,  combines  them  by  means  of  a simple 
calculation, and then makes a final postulate. The result is a time-independent
form of Schrödinger’s equation that can be used to determine
the possible time-independent wave functions ψ(x) for a particle in any
one-dimensional system, as well as the allowed values of its total energy E.
The argument is not a derivation, but a justification; Schrödinger’s equation
cannot be derived from something more basic, any more than
Newton’s  laws  can  be  so  derived.  But  the  argument  can  help  make  the 
equation  seem  more  plausible  to  you  than  it  would  be  if  it  were  simply 
quoted  without  any  prior  consideration. 

Schrodinger’s  equation  is  a generalization  of  the  de  Broglie  relation 

$$p = \frac{h}{\lambda} \tag{31-18}$$

among the magnitude p of a particle’s momentum, the wavelength λ of its
associated wave, and Planck’s constant h. The first postulate is that
Schrodinger’s  equation  is  consistent  with  Eq.  (31-18). 

The  second  postulate  is  related  to  the  first.  It  is  that  the  wave  function 
describing  the  mathematical  form  of  the  standing  wave  associated  with  a 
particle whose de Broglie wavelength is λ must be a sinusoid, say a sine,
with that wavelength. The justification is that a sinusoid is the simplest oscillatory
function for which a unique wavelength can be defined. Thus for a
particle moving freely along the x axis with constant de Broglie wavelength
λ, the wave function is taken to be

$$\psi(x) = \sin\frac{2\pi x}{\lambda} \tag{31-19}$$

This is precisely the form of the particle-in-a-box wave functions in the
region between walls of the box. See Fig. 31-3. Note that Eq. (31-19) is
meaningful only for the case of a particle that has momentum of constant
magnitude p at all positions x, like the particle in a box, and therefore has
the same de Broglie wavelength λ = h/p at each position. As is illustrated in
Fig. 31-14, it is not consistent to speak of a wavelength that varies significantly
with position, because such a concept is not well defined. The second
postulate is that Schrödinger’s equation must be consistent with Eq.
(31-19). That is, it must have Eq. (31-19) as a solution for the case of a particle
whose momentum has a constant magnitude.

Fig. 31-14 A function which is oscillatory, but not sinusoidal. Because the
oscillations spread more and more as x increases, the separation between any
adjacent pair of maxima differs from the separation between the closest pair of
adjacent minima. Consequently, it is not possible to define the wavelength of
even a single oscillation of this function.

The third postulate is that Schrodinger’s equation is consistent with
the law of energy conservation in the form it assumes for a nonrelativistic
particle. The kinetic energy of such a particle is K = mv²/2 = (mv)²/2m =
p²/2m, where m is its mass. So the Schrodinger equation is postulated to be
consistent with the relation

$$\frac{p^2}{2m} + U = E \tag{31-20}$$

where U and E are the system’s potential and total energies. Note that because
Schrodinger’s equation is based in part on Eq. (31-20), its applicability is
restricted to the nonrelativistic domain.

The first step in the calculation leading to Schrodinger’s equation is to
combine Eqs. (31-18) and (31-20). This produces

$$\frac{h^2}{2m\lambda^2} + U = E$$

or

$$\frac{1}{\lambda^2} = \frac{2m}{h^2}(E - U) \tag{31-21}$$
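As a rough numerical illustration of Eq. (31-21), the sketch below solves it for the de Broglie wavelength of an electron. The function name `de_broglie_wavelength` and the choice E − U = 100 eV are assumptions made only for this example, and the physical constants are rounded.

```python
import math

H = 6.626e-34           # Planck's constant, J*s (rounded)
M_ELECTRON = 9.109e-31  # electron mass, kg (rounded)
EV = 1.602e-19          # one electron volt in joules (rounded)

def de_broglie_wavelength(total_energy, potential_energy, mass):
    """Solve Eq. (31-21), 1/lambda^2 = (2m/h^2)(E - U), for lambda.

    Meaningful only where E > U and for nonrelativistic speeds.
    """
    kinetic = total_energy - potential_energy
    if kinetic <= 0:
        raise ValueError("Eq. (31-21) needs E > U")
    return H / math.sqrt(2.0 * mass * kinetic)

# Hypothetical case: an electron with E - U = 100 eV.
lam = de_broglie_wavelength(100 * EV, 0.0, M_ELECTRON)
print(f"lambda = {lam:.3e} m")
```

The result is of the order of 10⁻¹⁰ m, comparable to atomic dimensions, which is why the wavelike behavior matters for electrons in atoms.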

Since the solutions to Schrodinger’s equation are functions, such as the
one given in Eq. (31-19) for a particle in a box, it is not an algebraic
equation, but a differential equation. So it must contain derivatives. We can
generate some derivatives by differentiating Eq. (31-19). Doing this twice yields
first

$$\frac{d\psi(x)}{dx} = \frac{2\pi}{\lambda}\cos\frac{2\pi x}{\lambda}$$

and next

$$\frac{d^2\psi(x)}{dx^2} = -\left(\frac{2\pi}{\lambda}\right)^2 \sin\frac{2\pi x}{\lambda}$$

By using Eq. (31-19) on the right side of the second derivative, it simplifies
to

$$\frac{d^2\psi(x)}{dx^2} = -\frac{4\pi^2}{\lambda^2}\,\psi(x)$$

Now we substitute the value of 1/λ² from Eq. (31-21) into this equality.
We obtain

$$\frac{d^2\psi(x)}{dx^2} = -\frac{8\pi^2 m}{h^2}(E - U)\,\psi(x) \tag{31-22}$$


This equation is consistent with Eqs. (31-18) and (31-20) and certainly has
Eq. (31-19) for a solution since it was generated from that function. In fact,
it is Schrodinger’s equation in a special case, special because it was
obtained from the wave function of Eq. (31-19), which pertains to a particle
whose momentum has the constant magnitude p associated with the
well-defined wavelength λ. This means that the kinetic energy p²/2m of the
particle is constant, and so, according to Eq. (31-20), the potential energy U
must also be constant. Thus in Eq. (31-22) the quantity U is a constant.
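The claim that the sinusoid of Eq. (31-19) solves Eq. (31-22) when U is constant can be checked numerically. The sketch below is only an illustration: it uses units in which h = m = 1, picks arbitrary values for U and λ, and estimates the second derivative by a central difference. The helper names `psi`, `second_derivative`, and `residual` are assumptions of this example.

```python
import math

def psi(x, lam):
    """Free-particle wave function of Eq. (31-19)."""
    return math.sin(2.0 * math.pi * x / lam)

def second_derivative(f, x, dx=1e-4):
    """Central-difference estimate of f''(x)."""
    return (f(x + dx) - 2.0 * f(x) + f(x - dx)) / dx**2

# Illustrative units in which h = m = 1; U is constant, as the text requires.
h, m, U = 1.0, 1.0, 0.3
lam = 0.8
# Eq. (31-21) fixes E once lambda is chosen: E = U + h^2 / (2 m lambda^2).
E = U + h**2 / (2.0 * m * lam**2)

def residual(x):
    """Left side minus right side of Eq. (31-22) at position x."""
    lhs = second_derivative(lambda s: psi(s, lam), x)
    rhs = -(8.0 * math.pi**2 * m / h**2) * (E - U) * psi(x, lam)
    return lhs - rhs

worst = max(abs(residual(0.05 * j)) for j in range(1, 40))
print("largest residual:", worst)
```

The residual is tiny compared with the size of either side of the equation (about (2π/λ)² ≈ 60 here), and what remains comes from the finite-difference approximation.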

Now comes the final postulate. It is simply to take Eq. (31-22) as valid
even for the case where the potential energy U of the system is not
constant but, instead, is a function of position U(x). Doing this, we have the
Schrodinger equation

$$\frac{d^2\psi(x)}{dx^2} = -\frac{8\pi^2 m}{h^2}\,[E - U(x)]\,\psi(x) \tag{31-23}$$


A summary is in order. The de Broglie relation p = h/λ was known
experimentally to be a quantitatively correct statement concerning the
wavelike behavior of a particle with momentum of constant magnitude p.
Schrodinger wanted to be able to treat also the case of a particle whose
momentum varies in magnitude, since this is what generally happens. He
could not do so by continuing to use the algebraic equation p = h/λ
because, as explained earlier, it becomes meaningless if p and therefore the
wavelength λ are not constant. But there is no p or λ in the equivalent
differential equation, Eq. (31-22); there is only the related quantity U, the
potential energy. Therefore, it was consistent for him to generalize Eq.
(31-22) by postulate to obtain Eq. (31-23), which is supposed to pertain to
the case of a particle moving with momentum of varying magnitude
because the potential energy U(x) varies with position. Whether or not it was
correct was a matter that experiment alone could decide. In the intervening
years countless experiments have shown that it was.

For the sake of giving a better view of quantum mechanics and connecting it
more closely to material presented in earlier chapters, it must be said that Eq.
(31-23) is not a wave equation even though its solutions ψ(x) specify the
amplitude patterns of standing waves. To be a wave equation in the strict sense, a
differential equation must involve partial derivatives with respect to both the
position variable x and the time variable t. An example is the wave equation for
transverse displacements on a string, Eq. (12-26):

$$\frac{\partial^2 f(x, t)}{\partial x^2} = \frac{\mu}{F}\,\frac{\partial^2 f(x, t)}{\partial t^2}$$

There is an analogous Schrodinger equation, which can be obtained by arguments
similar to the one leading to Eq. (31-23). It is the time-dependent Schrodinger
equation:

$$\frac{\partial^2 \Psi(x, t)}{\partial x^2} = -\frac{8\pi^2 m}{h^2}\left[\frac{ih}{2\pi}\,\frac{\partial \Psi(x, t)}{\partial t} - U(x)\Psi(x, t)\right] \tag{31-24}$$

Here Ψ(x, t) is a function giving both the space and time dependences of the
matter wave associated with a particle of mass m moving in a region where the
potential energy is U(x). The quantity i is the imaginary number i = √(−1). The
functions Ψ(x, t) are the time-dependent wave functions.

When Eqs. (31-23) and (31-24) are both being dealt with, the former is
designated as the time-independent Schrodinger equation. Its solutions, the
time-independent wave functions ψ(x), are often called eigenfunctions, eigen being a
German word meaning “characteristic” or “proper.” But we continue to call the
ψ(x) simply wave functions.




Solutions to the time-dependent Schrodinger equation are found by
employing the same technique of separation of variables used in Sec. 13-4 to solve
the wave equation for a string. A solution of the form

$$\Psi(x, t) = \psi(x)\,\phi(t)$$

is assumed and substituted into the partial differential equation. It then splits into
two ordinary differential equations. One is exactly Eq. (31-23), except that E is
replaced by the separation constant C arising in the separation-of-variables
procedure. The other equation is

$$\frac{d\phi(t)}{dt} = -\frac{2\pi i C}{h}\,\phi(t)$$

It has the solution

$$\phi(t) = e^{-2\pi i C t/h}$$


as can be verified immediately by differentiation. This solution is compactly
written as a complex exponential. But it can also be written in terms of sinusoids
by using “Euler’s formula,”

$$e^{iz} = \cos z + i \sin z \tag{31-25}$$

to convert it to the form

$$\phi(t) = \cos\left(\frac{2\pi C t}{h}\right) - i \sin\left(\frac{2\pi C t}{h}\right)$$


This makes it apparent that φ(t) is an oscillatory function of t with frequency ν =
C/h. Comparison with the Einstein-de Broglie relation ν = E/h, connecting the
total energy E of a particle with the frequency ν of its matter wave, identifies the
separation constant C as the particle’s total energy E. Then the differential
equation for ψ(x) becomes identical to Eq. (31-23), and also it is seen that

$$\phi(t) = e^{-2\pi i E t/h}$$
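The statements about the time factor can be checked with a few lines of complex arithmetic. The sketch below is illustrative only: it works in units where h = 1, uses an arbitrary value of the separation constant C, and the function names are assumptions of this example. It compares a numerical derivative of φ(t) with −(2πiC/h)φ(t), compares the exponential with its sinusoidal form from Eq. (31-25), and confirms that φ(t) repeats after a time h/C.

```python
import cmath
import math

def phi(t, C, h=1.0):
    """Time factor phi(t) = exp(-2*pi*i*C*t/h)."""
    return cmath.exp(-2j * math.pi * C * t / h)

def phi_from_sinusoids(t, C, h=1.0):
    """The same function written with Euler's formula, Eq. (31-25)."""
    angle = 2.0 * math.pi * C * t / h
    return complex(math.cos(angle), -math.sin(angle))

C, h = 1.7, 1.0      # illustrative values in units where h = 1
t, dt = 0.37, 1e-6

# Central-difference derivative compared with -(2*pi*i*C/h) * phi(t).
derivative = (phi(t + dt, C, h) - phi(t - dt, C, h)) / (2.0 * dt)
expected = -(2j * math.pi * C / h) * phi(t, C, h)
print("derivative error:", abs(derivative - expected))
print("Euler form error:", abs(phi(t, C, h) - phi_from_sinusoids(t, C, h)))

# phi(t) repeats after one period h/C, that is, its frequency is C/h.
period = h / C
print("periodicity error:", abs(phi(t + period, C, h) - phi(t, C, h)))
```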

The net result is that solutions to the time-dependent Schrodinger equation
are shown to be of the general form

$$\Psi(x, t) = \psi(x)\,e^{-2\pi i E t/h} \tag{31-26}$$

where the ψ(x) are solutions to Eq. (31-23):

$$\frac{d^2\psi(x)}{dx^2} = -\frac{8\pi^2 m}{h^2}\,[E - U(x)]\,\psi(x)$$


Much can be done with the time-dependent Schrodinger equation. But all
that is done in this book is to use Eq. (31-26) to discuss the lifetimes of hydrogen
atom energy levels, treating the atom as if it were one-dimensional so that the
equation applies. Let the subscripts 1 and 2 represent the quantum numbers for its
ground state and the single excited state that will be considered. When the atom is
in its ground state, then

$$\Psi(x, t) = \psi_1(x)\,e^{-2\pi i E_1 t/h}$$

The corresponding probability density of the electron has been given in Born’s
postulate by the expression ψ₁²(x). A more general expression for the probability
density is Ψ*(x, t)Ψ(x, t). The asterisk stands for the operation of taking the
complex conjugate, that is, of changing i to −i. For the ground state the two
expressions are equivalent. This is seen by calculating

$$\Psi^*(x, t)\Psi(x, t) = \psi_1(x)\,e^{-2\pi(-i)E_1 t/h}\,\psi_1(x)\,e^{-2\pi i E_1 t/h} = \psi_1^2(x)\,e^{-2\pi(-i+i)E_1 t/h}$$

and thus obtaining

$$\Psi^*(x, t)\Psi(x, t) = \psi_1^2(x) \tag{31-27}$$

Experiment shows that when the electron is excited to state 2, it does not stay
there, but returns to state 1 in due course. Thus at any time after the excitation, its
matter wave Ψ(x, t) is partly

$$\psi_2(x)\,e^{-2\pi i E_2 t/h}$$

and partly

$$\psi_1(x)\,e^{-2\pi i E_1 t/h}$$

because the electron is making a transition from state 2 to state 1. So Ψ(x, t) is a
combination:

$$\Psi(x, t) = a_1\psi_1(x)\,e^{-2\pi i E_1 t/h} + a_2\psi_2(x)\,e^{-2\pi i E_2 t/h}$$

The quantities a₁ and a₂ specify how much of each state is mixed into Ψ(x, t). At
the instant of excitation, a₁ = 0 and a₂ = 1. But as time passes, the magnitude of a₁
increases and the magnitude of a₂ decreases. Assume for simplicity that a₁ and a₂
are real quantities, just as ψ₁(x) and ψ₂(x) are. Then by evaluating the Ψ*(x, t)
corresponding to this Ψ(x, t), multiplying the two, and using Eq. (31-25), it is easy to
show that the probability density is

$$\Psi^*(x, t)\Psi(x, t) = a_1^2\psi_1^2(x) + a_2^2\psi_2^2(x) + 2a_1a_2\,\psi_1(x)\psi_2(x)\cos[2\pi(E_2 - E_1)t/h] \tag{31-28}$$
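Equation (31-28) can be confirmed numerically by forming the mixture, multiplying it by its complex conjugate, and comparing with the closed-form expression. The sketch below does this at a single position x, so ψ₁ and ψ₂ are just two real numbers. All numerical values, the units with h = 1, and the function names are illustrative assumptions.

```python
import cmath
import math

def density_direct(psi1, psi2, a1, a2, E1, E2, t, h=1.0):
    """Psi* Psi computed directly from the two-state mixture."""
    Psi = (a1 * psi1 * cmath.exp(-2j * math.pi * E1 * t / h)
           + a2 * psi2 * cmath.exp(-2j * math.pi * E2 * t / h))
    return (Psi.conjugate() * Psi).real

def density_formula(psi1, psi2, a1, a2, E1, E2, t, h=1.0):
    """Right side of Eq. (31-28)."""
    beat = math.cos(2.0 * math.pi * (E2 - E1) * t / h)
    return (a1 * psi1)**2 + (a2 * psi2)**2 + 2.0 * a1 * a2 * psi1 * psi2 * beat

# Illustrative numbers: values of psi1 and psi2 at one fixed x, with h = 1.
psi1, psi2 = 0.8, -0.5
a1, a2 = 0.6, 0.8
E1, E2 = 1.0, 4.0

for t in (0.0, 0.1, 0.25):
    d = density_direct(psi1, psi2, a1, a2, E1, E2, t)
    f = density_formula(psi1, psi2, a1, a2, E1, E2, t)
    print(t, d, f)
```

Setting a₂ = 0 in either function removes the cosine term, so the density no longer depends on t. That is the pure ground-state case of Eq. (31-27).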

Now Ψ*(x, t)Ψ(x, t) gives the probability of finding the electron, and the
charge it carries, at various positions in the hydrogen atom. Since this probability
density is spread over the region occupied by the atom, in a certain sense the
negative charge of the atom is contained in a charge distribution given by
−eΨ*(x, t)Ψ(x, t), where −e is the electron charge. Equation (31-28) shows that
when the atom is making the transition from state 2 to state 1, this charge
distribution has a component that oscillates in time at frequency ν = (E₂ − E₁)/h. From
the treatment of Sec. 27-6, it would be expected that such an oscillating charge
distribution emits electromagnetic radiation at the same frequency. This is just
what is observed experimentally. A photon is emitted in the transition whose
frequency ν is given by hν = E₂ − E₁. In fact, the lifetime of the state 2 energy level
can be calculated from an equation which is very much like a combination of Eq.
(31-28) and the equation giving the rate at which oscillating charges emit
radiation, Eq. (27-42). When the actual three-dimensional functions for a hydrogen
atom are used instead of their symbolic one-dimensional representations ψ₁(x) and
ψ₂(x), the calculated lifetimes agree very well with the lifetimes measured for the
various excited states of the atom.

The  success  of  these  calculations  makes  it  apparent  that  what  counts  in  deter- 
mining how  the  atom  radiates  is  the  motion  of  the  charge  distribution  specified 
by  the  probability  density  for  the  electron,  and  not  the  “motion  of  the  electron.” 
This  is  a very  satisfactory  situation  because  the  probability  density  can  be  deter- 
mined theoretically  and  experimentally.  But  the  motion  of  an  electron  in  an  atom 
cannot  be  determined  by  either  method  because  the  position-momentum  uncer- 
tainty principle  makes  it  impossible  to  follow  its  changing  position  by  calculation 
or  by  measurement.  Indeed,  the  phrase  in  quotation  marks  has  no  operational 
meaning  when  the  quantum  number  n is  small.  Analysis  shows  that  as  n becomes 
large  it  does  become  possible  to  determine  the  motion  of  the  electron,  and  when 
this  happens  the  motion  of  the  electron  becomes  indistinguishable  from  the  mo- 
tion of  the  charge  distribution.  So  calculations  made  by  the  quantum  procedure  of 
connecting  radiation  to  oscillations  in  the  charge  distribution  lead  to  results  corre- 
sponding to  those  obtained  by  the  nonquantum  procedure  of  connecting  it  to  os- 
cillations in  the  electron  position,  for  large  n.  But  for  small  n only  the  quantum 
procedure  correctly  describes  how  nature  works. 


What about the fact that a hydrogen atom emits no radiation when it is in its
ground state? Equation (31-27) shows the reason is that when the atom is
completely in the ground state, the charge distribution specified by its electron
probability density does not change in time. Since it does not oscillate, it does not
radiate. This is a unique property of the ground state of the atom, because it is the only
state where there is no lower state to which a transition can be made. The ground
state is completely stable against losing its energy by radiation because there is
just no way for its matter wave Ψ(x, t) to be the mixture required to produce an
oscillating probability density and charge distribution that would cause it to radiate.
The conclusion is beautifully consistent with the earlier one that the
position-momentum uncertainty principle requires the electron bound in the atom to
maintain a certain minimum energy, its zero-point energy.


31-4 THE HARMONIC OSCILLATOR

Fig. 31-15 The potential energy U(x) of a harmonic oscillator, plotted qualitatively
as a function of the coordinate specifying the displacement of the oscillating
particle from its equilibrium position.

An analysis of the harmonic oscillator by means of newtonian mechanics
provides the foundation for studying oscillatory motion in all macroscopic
mechanical and electrical systems, as you have seen. The harmonic
oscillator is an equally important prototype for microscopic oscillatory systems.
To give just one example, the vibrational motion of two atoms in a diatomic
molecule is well represented by a harmonic oscillator. Analyzing the
harmonic oscillator in quantum mechanics involves finding solutions to the
Schrodinger equation for a particle of mass m and coordinate x moving in a
region where the potential energy U(x) has the harmonic oscillator form of
Eq. (8-26):

$$U(x) = \frac{kx^2}{2}$$

This harmonic oscillator potential is plotted in Fig. 31-15. The constant k
specifies the stiffness of the spring in a macroscopic oscillator. In a
microscopic one the “spring” may involve electric or nuclear forces, whose
“stiffness” can be expressed by the value of k. But since the Schrodinger
equation involves the potential energy of a system, not the force acting on
its particle, it is best to think of k as a constant that describes how abruptly
the system’s potential energy increases from its reference value U = 0 at
the equilibrium position x = 0, as the particle moves away from that
position.

Schrodinger’s equation for the harmonic oscillator is obtained by
substituting Eq. (8-26) in Eq. (31-23), to produce

$$\frac{d^2\psi(x)}{dx^2} = -\frac{8\pi^2 m}{h^2}\left(E - \frac{kx^2}{2}\right)\psi(x) \tag{31-29}$$

This is the differential equation which quantum mechanics says governs
the behavior of the same system that newtonian mechanics says is governed
by d²x/dt² = −kx/m. What a difference! Yet you will see that the two
equations lead to corresponding predictions when they are applied to
macroscopic oscillators. For microscopic oscillators the predictions of one
equation are very different from those of the other, and experiment shows that
only those made by the Schrodinger equation are correct.


The first step in finding solutions to Eq. (31-29) is to manipulate it into
a simpler form. Carrying the factor 8π²m/h² inside the parentheses makes
the equation look like this:

$$\frac{d^2\psi(x)}{dx^2} = -\left(\frac{8\pi^2 mE}{h^2} - \frac{4\pi^2 mk}{h^2}\,x^2\right)\psi(x)$$

Defining the constants

$$\alpha = \frac{8\pi^2 mE}{h^2} \tag{31-30}$$

and

$$\beta = \frac{2\pi\sqrt{mk}}{h} \tag{31-31}$$

simplifies the form of the equation to

$$\frac{d^2\psi(x)}{dx^2} = -(\alpha - \beta^2 x^2)\,\psi(x) \tag{31-32}$$


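Further simplification can be achieved by changing from the variable x to
the variable u. The new variable is proportional to the old one, and the
proportionality constant is √β. That is,

$$u = \sqrt{\beta}\,x \tag{31-33}$$

The relation between the derivatives in x and u is found by calculating

$$\frac{d\psi}{dx} = \frac{d\psi}{du}\,\frac{du}{dx} = \sqrt{\beta}\,\frac{d\psi}{du}$$

and

$$\frac{d^2\psi}{dx^2} = \frac{d}{du}\left(\frac{d\psi}{dx}\right)\frac{du}{dx} = \beta\,\frac{d}{du}\left(\frac{d\psi}{du}\right) = \beta\,\frac{d^2\psi}{du^2}$$

So, converted to the new variable, Eq. (31-32) becomes

$$\beta\,\frac{d^2\psi(u)}{du^2} = -\left(\alpha - \beta^2\,\frac{u^2}{\beta}\right)\psi(u)$$

or

$$\frac{d^2\psi(u)}{du^2} = -\left(\frac{\alpha}{\beta} - u^2\right)\psi(u)$$

Finally, writing

$$\epsilon = \frac{\alpha}{\beta} \tag{31-34}$$

reduces the harmonic oscillator Schrodinger equation to the form

$$\frac{d^2\psi(u)}{du^2} = -(\epsilon - u^2)\,\psi(u) \tag{31-35}$$

To express ε in terms of the quantities appearing in the original form
of the differential equation, Eqs. (31-30) and (31-31) are used in Eq.
(31-34), yielding

$$\epsilon = \frac{8\pi^2 mE}{h^2}\,\frac{h}{2\pi\sqrt{mk}}$$

or

$$\epsilon = \frac{4\pi E}{h\sqrt{k/m}} \tag{31-36}$$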




Now the oscillation frequency of a macroscopic oscillator is (1/2π)√(k/m).
Consequently, the units for √(k/m) must be reciprocal seconds. Since the
units for h are joule-seconds and those for E are joules, it can be seen from
Eq. (31-36) that ε is a dimensionless quantity. Since both terms in the factor
ε − u² of Eq. (31-35) must have the same dimensions, it is apparent that u
must also be dimensionless. This is also true of ψ(u) since its square ψ²(u) is a
probability, which is always dimensionless, per unit length of the
dimensionless u axis. Thus the form of the harmonic oscillator Schrodinger
equation that was obtained in Eq. (31-35) by seeking to simplify it turns out to
also be a completely dimensionless form.
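To make the dimensionless energy concrete, the sketch below converts between E and the parameter of Eq. (31-36). Because the oscillation frequency is ν = (1/2π)√(k/m), Eq. (31-36) can be rearranged to E = (parameter) × hν/2, and the code checks that identity numerically. The function names and the values chosen for k and m are hypothetical, picked only to be of a microscopic order of magnitude.

```python
import math

H = 6.626e-34  # Planck's constant, J*s (rounded)

def classical_frequency(k, m):
    """Oscillation frequency (1/2pi)*sqrt(k/m) of the macroscopic oscillator."""
    return math.sqrt(k / m) / (2.0 * math.pi)

def epsilon_from_energy(E, k, m):
    """Dimensionless energy parameter of Eq. (31-36)."""
    return 4.0 * math.pi * E / (H * math.sqrt(k / m))

def energy_from_epsilon(eps, k, m):
    """Eq. (31-36) solved for E; algebraically equal to eps * h * nu / 2."""
    return eps * H * math.sqrt(k / m) / (4.0 * math.pi)

# Hypothetical microscopic oscillator (values chosen only for illustration).
k = 500.0     # N/m
m = 1.6e-27   # kg
nu = classical_frequency(k, m)
E = energy_from_epsilon(1.0, k, m)
print(f"nu = {nu:.3e} Hz, E at parameter 1 = {E:.3e} J, h*nu/2 = {H * nu / 2:.3e} J")
```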

Solutions to Eq. (31-35) can be found by an analytical method. But the
method is extremely complicated. Here you will see how solutions can be
found by applying the numerical method that has been used to solve
differential equations on many previous occasions. It is particularly
appropriate to use it on the Schrodinger equation because the procedure will give
you a clear physical insight into the way the equation predicts that the
energy of a bound particle is quantized. What is done in finding numerical
solutions for Eq. (31-35) is completely analogous to what was done in
finding them for Eq. (13-41), the differential equation whose solutions describe
symmetrical standing waves on a circular drumhead. It is sensible to use the
numerical method to find solutions to the drumhead equation because it
need be done only once. Since the equation is dimensionless, its solutions
can be applied to any particular drumhead by adjusting the constants that
are defined in the process of making it dimensionless. The same situation
obtains for Eq. (31-35).

Numerical solutions are found for Eq. (13-41) by starting from values
chosen for the dependent variable and its first derivative at some initial
value of the independent variable. The same is true for any other
second-order ordinary differential equation, including Eq. (31-35). How
can you make an intelligent choice of an initial value of u and the
corresponding values of ψ(u) and dψ(u)/du? The harmonic oscillator potential
U(x) = kx²/2 is an even function of x and therefore also of u. That is, its
value for a particular uᵢ is exactly equal to its value at −uᵢ. The behavior of
the particle moving under the influence of the potential can be expected to
show the same symmetry as the potential itself. (This is certainly true for a
macroscopic harmonic oscillator.) Since the behavior is governed by the
value of ψ²(u), according to Born’s postulate, you can see that ψ²(u) should
be an even function of u. This means that ψ(u) itself must be either even in u
or odd in u. See Fig. 31-16. The point is that if either

$$\psi(-u_i) = +\psi(u_i) \tag{31-37a}$$

or

$$\psi(-u_i) = -\psi(u_i) \tag{31-37b}$$

then ψ²(u) will have the required property

$$\psi^2(-u_i) = +\psi^2(u_i)$$

Because of the symmetry of ψ²(u), it is necessary to carry out the
numerical solution of the equation for only the positive half of the u axis. And
at u = 0 the wave function ψ(u) must satisfy either dψ(u)/du = 0 or ψ(u) =
0. The first condition gives the even-ψ(u) case, while the second gives the
odd-ψ(u) case, as is demonstrated in Fig. 31-16.


Fig. 31-16 (a) An even ψ(u). Its value at any point −uᵢ equals its value
at the point +uᵢ. Because of the symmetry about u = 0, its slope
dψ(u)/du must be zero at u = 0 if dψ(u)/du is to be continuous there.
Continuity of dψ(u)/du at u = 0 is required since otherwise d²ψ(u)/du²
would be infinite at the sharp corner, in disagreement with the finite
value given by Schrodinger’s equation. (b) An odd ψ(u). Its value at any
point −uᵢ equals the negative of its value at the point +uᵢ. For this case
the symmetry requires its value ψ(u) to be zero at u = 0, if ψ(u) is to be
continuous there. A discontinuity in ψ(u) at u = 0 can be ruled out on
physical grounds since it would lead to a sharp corner in the probability
density ψ²(u) describing the behavior of the particle at a point where the
potential energy function determining its behavior is perfectly smooth.


The values used for ψ(u) at u = 0 in the even case, or for dψ(u)/du at
u = 0 in the odd case, do not really matter. Because the differential
equation is linear in ψ(u), increasing or decreasing these values by some factor
will only scale up or down all the values of any solution ψ(u), without
affecting its overall shape or any other vital properties. Example 31-4
considers an even solution by taking ψ(u) = 1 and dψ(u)/du = 0 at u = 0.
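The scaling property that follows from linearity is easy to see numerically. The sketch below integrates the dimensionless equation d²ψ/du² = −(ε − u²)ψ twice, once starting from ψ(0) = 1 and once from ψ(0) = 2.5, and compares the two solutions point by point. The stepping scheme (a velocity-Verlet update) and the name `integrate_psi` are assumptions of this example and are not the program of the Numerical Calculation Supplement.

```python
def integrate_psi(eps, psi0, dpsi0, du=0.01, u_max=3.0):
    """Step d2psi/du2 = -(eps - u**2)*psi from u = 0 to u_max.

    Uses a simple velocity-Verlet update (an assumption of this sketch).
    Returns the list of psi values at u = 0, du, 2*du, ...
    """
    steps = round(u_max / du)
    u, psi, dpsi = 0.0, psi0, dpsi0
    curv = -(eps - u * u) * psi          # second derivative at the current u
    values = [psi]
    for n in range(1, steps + 1):
        psi = psi + dpsi * du + 0.5 * curv * du * du
        u = n * du
        new_curv = -(eps - u * u) * psi
        dpsi = dpsi + 0.5 * (curv + new_curv) * du
        curv = new_curv
        values.append(psi)
    return values

base = integrate_psi(eps=1.0, psi0=1.0, dpsi0=0.0)
scaled = integrate_psi(eps=1.0, psi0=2.5, dpsi0=0.0)
ratios = [s / b for s, b in zip(scaled, base)]
print(min(ratios), max(ratios))
```

Every ratio equals the factor 2.5 to within roundoff, so the two curves have the same shape and differ only in overall scale.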

Before numerical calculations can begin, it is also essential to fix the
value of the quantity ε appearing in the differential equation. You can see
from Eq. (31-36) that ε is a dimensionless measure of the total energy E of
the harmonic oscillator. In newtonian mechanics any value of E, or ε,
would be a possible one for the oscillator. But you know already that this is
not true in quantum mechanics. In fact, the procedure will be to find which
values of ε are allowed, and which are not, by trial and error. This is just the
procedure used with Eq. (13-41) for the drumhead. But you are still left
with the question of how to choose a value of ε to start studying the
solutions of Eq. (31-35). Since that equation contains no numerical constants,
except the 1 that is the unwritten coefficient of each term, ε = 1 seems as
good a choice as any. It is the one made in Example 31-4.

Writing Eq. (31-35) as

$$\frac{d^2\psi(u)}{du^2} = Q \tag{31-38a}$$

where

$$Q = -(\epsilon - u^2)\,\psi(u) \tag{31-38b}$$

makes manifest that it is just another case of Eq. (6-14), the standard form
used in all the differential equation solving programs of the Numerical
Calculation Supplement. However, the harmonic oscillator Schrodinger
equation program listed there does have one new feature. There is a provision
in it for making the calculator or computer display ψ(u) only after every lth
calculational cycle. This is to allow you to attain a high degree of numerical
accuracy in the solution by reducing Δu, without the necessity of
inspecting or plotting too many closely spaced values of ψ(u).




EXAMPLE  31-4 


Run the harmonic oscillator Schrodinger equation program with the sets of
parameters given.

a. ψ₀ = 1; (dψ/du)₀ = 0; u₀ = 0; Δu = 0.05; ε = 1; l = 2.

■ The results are shown by the crosses labeled ε = 1.000 in Fig. 31-17. Note
that the curve traced by the crosses starts off from u = 0 with a concave-downward
curvature. This reflects the fact that d²ψ(u)/du², which is a measure of the curvature
of ψ(u), is negative for small u where the term ε − u² in the equation d²ψ(u)/du² =
−(ε − u²)ψ(u) is positive, since ψ(u) itself is positive. At u = 1, you have ε − u² = 0
since ε = 1. So d²ψ(u)/du² is zero at that point. This is why the curve is locally
straight at u = 1. When u exceeds this value, d²ψ(u)/du² changes sign since ε − u²
does, and so the curvature becomes concave upward, at least while ψ(u) remains
positive. But since the slope dψ(u)/du is negative at u = 1, the positive curvature
d²ψ(u)/du² does not prevent ψ(u) from crossing the u axis at about u = 3.2. When
this happens, d²ψ(u)/du² changes sign again because ψ(u) becomes negative, and ψ(u)
becomes concave downward. So its negative slope starts to increase, and ψ(u) bends
rapidly away from the u axis. When this behavior gets started, nothing can prevent
it from continuing with ever-increasing vigor, and ψ(u) diverges to negative infinity.
The reason is that the more negative the value of ψ(u), the more negative is the
value of d²ψ(u)/du² because of the relation between them that is imposed by the
differential equation d²ψ(u)/du² = −(ε − u²)ψ(u).

These results constitute a solution to the differential equation which is, within
the accuracy of the numerical method, a completely correct one for the initial
conditions used. It is an acceptable solution to the differential equation from a
mathematical point of view, but it is not an acceptable wave function from a physical point
of view. Born’s postulate says that ψ²(u) gives the probability of finding the
associated particle near various locations on the u axis. The behavior of the ψ(u) that has
been found will cause the corresponding ψ²(u) to increase without limit when u
becomes more positive than a not very large number (and when it becomes more
negative than minus that number). Thus you must reject this solution because it
describes a particle that has an infinite probability of being everywhere except where it
is supposed to be: that is, near where the potential U(u) has its minimum at u = 0.
Since the force F = −dU(u)/du acting on the particle is always directed toward that
point, the particle must be somewhere in its vicinity. More formally, the solution
ψ(u) must be rejected since it cannot be “normalized” to satisfy Eq. (31-4). ■

b. ψ₀ = 1; (dψ/du)₀ = 0; u₀ = 0; Δu = 0.05; ε = 1.01; l = 2.

■ Here an attempt is made to find an acceptable wave function by increasing
the value of ε. Because of the obvious sensitivity of the differential equation, ε is
increased by only 1 percent. The results are indicated by the crosses labeled ε = 1.010
in Fig. 31-17. They show that increasing ε is the wrong thing to do. ■

c. ψ₀ = 1; (dψ/du)₀ = 0; u₀ = 0; Δu = 0.05; ε = 0.99; l = 2.

■ In this calculation ε is decreased, from the value guessed at first, by 1
percent. The resulting behavior of ψ(u) is shown in Fig. 31-17 by the dots labeled ε =
0.990, which merge into the crosses for u smaller than about u = 1.8. For this ε the
concave-upward curvature of ψ(u) to the right of u = 1 is a little more pronounced
than for ε = 1.000, since in this region where u² > ε the magnitude of the term
ε − u² is larger when ε is smaller. But although the curvature of ψ(u) is only slightly
larger, its cumulative effect is to succeed in making the slope of ψ(u) go through
zero and become positive before ψ(u) itself cuts the axis and becomes negative.
When this happens, ψ(u) again starts to diverge to infinity, this time to positive
infinity. The farther it gets from the u axis, the larger its rate of change of slope is, so
the more rapidly it increases. ■

d. ψ₀ = 1; (dψ/du)₀ = 0; u₀ = 0; Δu = 0.05; ε = 0.999; l = 2.

■ Finally, ε is made 0.1 percent smaller than in the first calculation. The
results are plotted as dots labeled ε = 0.999 in Fig. 31-17. They are similar in
character to the ones of the preceding calculation, but the divergence to positive infinity
does not occur until u assumes larger values.
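The runs of Example 31-4 can be imitated with a short program. The sketch below is not the program of the Numerical Calculation Supplement: it assumes a velocity-Verlet stepping scheme and a step Δu = 0.01, and the function name `shoot` is hypothetical. It integrates Eq. (31-35) outward from u = 0 with ψ = 1 and dψ/du = 0 and reports the sign of ψ at u = 5, by which point the divergence is well under way for the three trial energies shown.

```python
def shoot(eps, du=0.01, u_max=5.0, psi0=1.0, dpsi0=0.0):
    """Integrate d2psi/du2 = -(eps - u**2)*psi from u = 0; return psi(u_max)."""
    steps = round(u_max / du)
    u, psi, dpsi = 0.0, psi0, dpsi0
    curv = -(eps - u * u) * psi          # Q of Eq. (31-38b) at the current u
    for n in range(1, steps + 1):
        psi = psi + dpsi * du + 0.5 * curv * du * du
        u = n * du
        new_curv = -(eps - u * u) * psi
        dpsi = dpsi + 0.5 * (curv + new_curv) * du
        curv = new_curv
    return psi

for eps in (1.01, 0.99, 0.999):
    end = shoot(eps)
    trend = "positive" if end > 0 else "negative"
    print(f"trial energy {eps}: psi(5) = {end:.3g}, heading toward {trend} infinity")
```

The trial energy 1.01 sends ψ toward negative infinity, while 0.99 and 0.999 send it toward positive infinity, in line with parts b, c, and d. A trial value of exactly 1 is deliberately left out, because its outcome depends on the step size and on the details of the numerical method.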




By now you can see how it goes. If calculations like those in Example
31-4 are made with ε = 1.0000 and ε = 0.9999, while using an appreciably
smaller value of Δu to reduce the inaccuracy of the numerical method to
the lower value dictated by these narrower limits, the ψ(u) found in the
former will diverge to negative infinity at a somewhat larger u than for the
case treated in the first calculation, and the ψ(u) found in the latter will
diverge to positive infinity at about the same larger u. If this is not apparent, put
the program on a calculator or computer and try it.

Will you ever succeed in obtaining, from a numerical solution of the
Schrödinger equation, a ψ(u) that never diverges? Not until you have access
to a computer of infinite speed which allows you to use an infinitesimally
small Δu, and which also carries an infinite number of digits so as to have
zero roundoff error. But is it necessary? What would happen if the ultimate
machine were put to work on this problem is apparent anyway. The
results would look like the dots or crosses of the e = 0.999 or e = 1.000
curves to about u = 3 and then continue slowly approaching, but never
reaching, the u axis. The closer ψ(u) got to zero, the smaller would be the
value of its second derivative d²ψ(u)/du² = −(e − u²)ψ(u), and so the smaller
would be its curvature. Thus the differential equation is consistent with an
ever-closer approach to a straight line lying along the u axis, if the value of
e is such as to allow ψ(u) to approach the axis in just the right way. [Such an
asymptotic approach to the u axis is seen in the ψ(u) found for e = 1 by
solving the Schrödinger equation analytically. An analytical solution
automatically goes to the limit of infinitesimal Δu and has no roundoff error.]
If you understand this, you will understand that there is no practical
need to go further than the first and fourth calculations of Example 31-4.
They define the shape of the allowed wave function ψ(u) quite accurately
and determine the allowed value of the energy parameter to be e = 1,
within one digit in the third decimal place.
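
The switch in the sign of the divergence can be reproduced with a short script. What follows is a minimal sketch, not the program used in the Examples: it assumes a velocity-Verlet (second-order) stepping rule, and the function name `integrate` and its arguments are hypothetical. It steps d²ψ/du² = −(e − u²)ψ outward from u = 0 and returns ψ at the end of the range.

```python
def integrate(e, psi0, dpsi0, du=0.001, u_max=6.0):
    """Step d2psi/du2 = -(e - u**2) * psi from u = 0 to u_max; return psi(u_max).

    Assumption: a velocity-Verlet update is used here for illustration; the
    stepping rule of the program referred to in the text may differ.
    """
    steps = int(round(u_max / du))
    psi, dpsi = psi0, dpsi0
    for i in range(steps):
        u = i * du
        curv = -(e - u * u) * psi                 # second derivative at u
        psi_next = psi + dpsi * du + 0.5 * curv * du * du
        u_next = (i + 1) * du
        curv_next = -(e - u_next * u_next) * psi_next
        dpsi += 0.5 * (curv + curv_next) * du     # average of the two curvatures
        psi = psi_next
    return psi


if __name__ == "__main__":
    # Even solution: psi = 1 and dpsi/du = 0 at u = 0.
    for e in (0.999, 1.001):
        print(f"e = {e}: psi(6) = {integrate(e, 1.0, 0.0):+.3e}")
```

With these settings the value at u = 6 should come out positive for e = 0.999 and negative for e = 1.001, so the allowed value is bracketed between them.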

This is not the only value of the dimensionless energy parameter e
allowed by quantum mechanics for a harmonic oscillator. Use a programmable
calculator or small computer to see if you can find another allowed e
in the range from 0 to below 1. Do not bother to plot ψ(u); just watch the
numerical display when the device is running. You will soon conclude that
there are no values of e in this range for which the behavior of ψ(u), as u
becomes sufficiently large, is analogous to the behavior displayed in Fig.
31-17. Specifically, you will not find a value of e in this range at which there
is the striking switch in the sign of the divergence of ψ(u) that is found at
e = 1. This will be your conclusion whether you search for a ψ(u) which is
an even function of u, as in Fig. 31-17, or an odd function of u. To search
for an odd ψ(u), use the initial conditions ψ = 0, dψ/du = 1 at u = 0.

Fig. 31-17 Harmonic oscillator Schrödinger equation solutions obtained in a
search for the first allowed solution. The parameter e being varied is a
dimensionless measure of the oscillator’s total energy.

Next continue the search to values of e greater than 1. You can speed it
up by using a relatively large Δu, reducing the increment for accuracy
when you get into a promising range of e. Soon you will find that the next
allowed value occurs at e ≈ 3 and represents a case where ψ(u) is odd in u.
This is illustrated in Example 31-5.


EXAMPLE 31-5

Run the harmonic oscillator Schrödinger equation program with each of the
following sets of parameters:

ψ₀ = 0; (dψ/du)₀ = 1.65; u₀ = 0; Δu = 0.02; e = 2.999; l = 4.

ψ₀ = 0; (dψ/du)₀ = 1.65; u₀ = 0; Δu = 0.02; e = 3; l = 4.

■ The initial value of dψ(u)/du was adjusted after the critical values of e were
found by watching the displayed values of ψ(u), and before plotting, so as to make
the peak height of ψ(u) for the results plotted in Fig. 31-18 be the same as in Fig.
31-17. This facilitates comparison without affecting the shape of the allowed ψ(u),
or the value of the allowed e, since the Schrödinger equation is linear in ψ(u).

In this case, ψ(u) starts with a positive slope from a zero value at u = 0. But
since its curvature is much larger at small u than for the ψ(u) in Fig. 31-17, it is
nevertheless able to bend over before the sign of e − u² switches, that is, before
3 − u² = 0 or u = √3 = 1.73. For larger u it behaves much as before. The allowed
value of e found in this Example is e = 3, to within one digit in the third decimal
place.


Fig. 31-18 The second allowed solution to the harmonic oscillator
Schrödinger equation.


Run the program to find the next-higher allowed value of e and the
corresponding wave function. Since the first one is even and the second is
odd, what do you think the symmetry of the next will be? Use the appropriate
initial conditions. Since the first value of e is 1 and the second is 3, you
may also be able to make a good guess about the next value of e. Since the
curvature of ψ(u) in the region of u where e − u² is positive will be even
larger for a higher value of e, you can expect that the ψ(u) you seek will
oscillate even more rapidly in that region.

Example 31-6 illustrates the ninth allowed solution to the Schrödinger
equation for the harmonic oscillator.


EXAMPLE 31-6

Run the harmonic oscillator Schrödinger equation program with each of the
following sets of parameters:

ψ₀ = 1; (dψ/du)₀ = 0; u₀ = 0; Δu = 0.01; e = 16.999; l = 5 for u < 5 and
l = 10 for u ≥ 5.

ψ₀ = 1; (dψ/du)₀ = 0; u₀ = 0; Δu = 0.01; e = 17; l = 5 for u < 5 and
l = 10 for u ≥ 5.

■ The results, plotted in Fig. 31-19, make it evident that the allowed value of the
energy parameter is e = 17. The oscillations in ψ(u) build up until they reach a final
peak just before e − u² = 0. Can you explain why by means of a curvature
argument?


Fig. 31-19 The ninth allowed solution to the harmonic oscillator
Schrödinger equation.

Fig. 31-20 The probability density for the ninth allowed solution of the
harmonic oscillator Schrödinger equation (e = 17) is the curve formed by the dots.
The probability density predicted by newtonian mechanics for the same oscillator
with the same value of total energy is shown by the dashed curve.

The probability density ψ²(u) for the ninth wave function is plotted in
Fig. 31-20, along with a dashed curve representing the probability density
P_newt(u) predicted by newtonian mechanics for the same particle mass,
potential energy function, and total energy. The newtonian probability
density is proportional to the time which the particle in the oscillator spends in
each interval of the u axis, just as for the particle in the box, and so is
inversely proportional to its speed v. Using the energy-conservation equation
mv²/2 + kx²/2 = E to find v and then changing to the dimensionless
coordinate u, you can easily show that P_newt(u) = A/√(e − u²), where A is a
proportionality constant. The newtonian probability density is a minimum at
the midpoint of the oscillation, u = 0, where the particle moves most
rapidly. It goes to infinity at the endpoint of the oscillation, u = √e, where the
particle is instantaneously at rest. Beyond that point it abruptly drops to
zero. The value of A has been set so that the integral over all u is the same
for both P_newt(u) and ψ²(u). Neither of these total probability integrals has
been adjusted to have the value 1 required by Eq. (31-4). But this can
readily be done.
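
One way to carry out the step from the energy-conservation equation to P_newt(u) is sketched here. It assumes the dimensionless variables have the forms u = √β x with β = 2π√(mk)/h, and e = 4πE/(h√(k/m)), as they appear elsewhere in this chapter; the symbol ω = √(k/m) is shorthand introduced only for this sketch.

```latex
\begin{align*}
E - \tfrac12 k x^2 &= \frac{h\omega}{4\pi}\,(e - u^2), \qquad \omega \equiv \sqrt{k/m},\\
v &= \sqrt{\frac{2}{m}\Bigl(E - \tfrac12 k x^2\Bigr)}
   = \sqrt{\frac{h\omega}{2\pi m}}\;\sqrt{e - u^2},\\
P_{\text{newt}}(u) &\propto \frac{1}{v}
   \quad\Longrightarrow\quad
   P_{\text{newt}}(u) = \frac{A}{\sqrt{e - u^2}}, \qquad |u| < \sqrt{e}.
\end{align*}
```

Integrating over the allowed range gives πA, so the choice A = 1/π would normalize P_newt(u) to unit total probability.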

The  figure  makes  it  clear  that  the  probability  density  predicted  by 
quantum  mechanics  for  a harmonic  oscillator  with  the  ninth  allowed  en- 
ergy fluctuates  about  an  average  that  is  well  represented  by  the  probability 
density  predicted  by  newtonian  mechanics  for  the  same  oscillator  and  the 
same  energy.  The  higher  the  energy,  the  more  numerous  and  closely 
spaced  are  the  fluctuations  in  the  probability  density.  Ultimately,  the  fluc- 
tuations will  not  be  resolvable  experimentally,  and  only  an  average  over 
many  fluctuations  can  be  measured.  In  this  way  the  predictions  of  quantum 
mechanics  concerning  the  location  of  a particle  in  a harmonic  oscillator 
come  into  correspondence  with  those  of  newtonian  mechanics. 


There are two principal qualitative conclusions to be drawn from the
numerical solutions that have been obtained to the harmonic oscillator
Schrödinger equation, d²ψ(u)/du² = −(e − u²)ψ(u). One is that its solutions
ψ(u) are oscillatory functions of u in the region where u < √e, so that
e − u² > 0. The reason is that there the sign of d²ψ(u)/du², which
determines the curvature, is always opposite to the sign of ψ(u) itself. So ψ(u) is
concave toward the u axis everywhere in this region.


The second qualitative conclusion is that there is a marked tendency
for ψ(u) to diverge to infinity in the region where u > √e, so that e − u² <
0. There the sign of d²ψ(u)/du² is always the same as the sign of ψ(u), so the
function is concave away from the axis. As a result, if its magnitude starts to
increase, it continues to increase without limit. Such divergent behavior of
ψ(u) means an even more divergent behavior of ψ²(u), and this is not
allowed by Born’s postulate. However, there are certain special values of e at
which the divergence of ψ(u) is suppressed to a degree limited only by the
accuracy of the numerical calculations. These are the allowed values of the
energy parameter e, and the ψ(u) obtained for them are the acceptable
wave functions.


A quantitative  conclusion  obtained  from  the  numerical  work  is  that  the 
first,  second,  and  ninth  allowed  values  of  e are  1,  3,  and  17.  The  pattern  in 
these  results  is  matched  by  the  formula 

$$ e = 2n - 1 \qquad \text{for } n = 1, 2, 3, \ldots \tag{31-39} $$


Its  validity  was  verified  by  the  calculations  for  the  quantum  numbers  n = 1, 
2,  and  9 to  better  than  three  decimal  places.  Similar  calculations  will  verify 
it  for  other  values  of  n and/or  for  more  decimal  places. 
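
The search for allowed values can also be automated. The sketch below is one possible approach rather than the procedure of the Examples: it assumes a velocity-Verlet stepping rule and uses bisection on the sign of ψ at a large u (u = 7 here) to locate the value of e at which the divergence switches sign. The names `psi_at` and `find_allowed_e` are hypothetical, and the search brackets 2n − 1 ± 0.5 are chosen with Eq. (31-39) already in mind.

```python
def psi_at(e, even, u_max=7.0, du=0.002):
    """Integrate d2psi/du2 = -(e - u**2) * psi from u = 0; return psi(u_max).

    even=True starts from psi = 1, dpsi/du = 0; even=False starts from
    psi = 0, dpsi/du = 1. Assumption: velocity-Verlet stepping, chosen
    here for illustration only.
    """
    psi, dpsi = (1.0, 0.0) if even else (0.0, 1.0)
    for i in range(int(round(u_max / du))):
        u, u_next = i * du, (i + 1) * du
        curv = -(e - u * u) * psi
        psi_next = psi + dpsi * du + 0.5 * curv * du * du
        curv_next = -(e - u_next * u_next) * psi_next
        dpsi += 0.5 * (curv + curv_next) * du
        psi = psi_next
    return psi


def find_allowed_e(lo, hi, even, tol=1e-6):
    """Bisect on the sign of psi(u_max) for e between lo and hi."""
    sign_lo = psi_at(lo, even) > 0
    if sign_lo == (psi_at(hi, even) > 0):
        raise ValueError("no sign switch of the divergence in this bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (psi_at(mid, even) > 0) == sign_lo:
            lo = mid          # switch lies above mid
        else:
            hi = mid          # switch lies below mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for n in (1, 2, 3, 9):
        guess = 2 * n - 1
        # Odd n uses the even initial conditions, even n the odd ones.
        e = find_allowed_e(guess - 0.5, guess + 0.5, even=(n % 2 == 1))
        print(f"n = {n}: allowed e = {e:.4f}")
```

Bisection is used because the only information needed from each run is the sign of the divergence, which is exactly what the text asks you to watch for on the display.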

Let us use Eq. (31-36),

$$ e = \frac{4\pi}{h\sqrt{k/m}}\,E $$

to write Eq. (31-39) in terms of the actual energy E of the harmonic
oscillator. We have

$$ E = (2n - 1)\,\frac{h\sqrt{k/m}}{4\pi} $$

The result is customarily expressed not directly in terms of the constants k
and m, but indirectly through the frequency related to these constants by
Eq. (6-28a),

$$ \nu = \frac{1}{2\pi}\sqrt{\frac{k}{m}} $$

This is the frequency at which a particle of mass m and potential energy
U(x) = kx²/2 would oscillate if its behavior were governed by newtonian
mechanics. But unless the quantum number n is very large, newtonian
mechanics does not govern its behavior, and ν should be looked on as nothing
more than a convenient symbol defined in terms of k and m by Eq. (6-28a).
Using it and labeling the allowed energies with the quantum number n, we
write the energy quantization formula for a harmonic oscillator as

$$ E_n = \left(n - \tfrac{1}{2}\right) h\nu \qquad \text{for } n = 1, 2, 3, \ldots \tag{31-40} $$


As was mentioned in Sec. 30-2, Planck’s theory of the emission of thermal
radiation led him to an energy quantization formula for a harmonic oscillator long
before Eq. (31-40) was obtained by solving Schrödinger’s equation for the
harmonic oscillator. Expressed in the form of Eq. (31-40), Planck’s formula was
Eₙ = (n − 1)hν, for n = 1, 2, 3, . . . . The essential difference between the two is
that according to Planck, the lowest possible value of the total energy of a harmonic
oscillator is E = 0, whereas according to Schrödinger, the lowest possible value is
E = hν/2. Planck’s formula is wrong because it does not contain a zero-point
energy and is therefore inconsistent with the uncertainty principles. But this error
caused no difficulty in Planck’s theory because it involved only the differences
ΔEₙ between the energies of adjacent energy levels. Both Planck’s formula and the
correct Eq. (31-40) predict the same value for these differences, namely, ΔEₙ = hν.

Fig. 31-21 (a) The potential energy curve and energy-level diagram for a
harmonic oscillator. (b) The potential energy curve and energy-level diagram
for a hydrogen atom. (c) The potential energy curve and energy-level diagram
for a particle in a one-dimensional box.

Equation (31-40) shows that the energy levels of a harmonic oscillator
are discrete. This is not in disagreement with the observation that the levels
of a macroscopic harmonic oscillator appear to form a continuum. The
fractional separation between adjacent levels, ΔEₙ/Eₙ = hν/Eₙ ≈
hν/nhν = 1/n, is not large enough to be measurable for the very large
values of n characteristic of macroscopic oscillators.
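
To get a feeling for the size of n in a macroscopic case, here is a small numerical sketch. The oscillator values (1 J of energy at 1 Hz) are illustrative assumptions, not data from the text, and the function name is hypothetical. It uses the large-n approximation Eₙ ≈ nhν.

```python
PLANCK_H = 6.62607015e-34  # Planck's constant in J*s (exact SI value)


def quantum_number(energy_joules, frequency_hz):
    """Approximate n from E_n ~ n*h*nu, adequate when n is very large."""
    return energy_joules / (PLANCK_H * frequency_hz)


if __name__ == "__main__":
    n = quantum_number(1.0, 1.0)   # hypothetical 1 J oscillator at 1 Hz
    print(f"n is about {n:.2e}")
    print(f"fractional level spacing 1/n is about {1.0 / n:.2e}")
```

For these assumed values n is of order 10³³, so the fractional spacing 1/n is far below anything a laboratory energy measurement could resolve.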

Figure 31-21a is an energy-level diagram constructed from Eq. (31-40)
and a superimposed diagram of the harmonic oscillator potential energy
function that leads to these levels. For comparison, similar diagrams are
shown in Fig. 31-21b for the hydrogen atom, with the three-dimensional
electric potential energy in the system represented along a diameter of the
atom. In Fig. 31-21c the energy levels for the particle in a box are
illustrated, along with the potential energy function producing these levels
when used in the Schrödinger equation.

The particle-in-a-box potential energy function requires some explanation. It
becomes infinitely large at the impenetrable walls of the box for reasons that can
be seen by looking at Fig. 31-3. The wave functions ψ(x) in that figure were
obtained without using the Schrödinger equation, but they are the same as those
obtained by solving the equation when the potential energy U(x) has the form shown
in Fig. 31-21c. Consider the discontinuities in the slopes dψ(x)/dx of the wave
functions at the walls of the box. These discontinuities make the rate of change of
the slope, d²ψ(x)/dx², infinite at the walls. According to Schrödinger’s equation,
d²ψ(x)/dx² = −(8π²m/h²)[E − U(x)]ψ(x) everywhere. So the infinite value of
d²ψ(x)/dx² at each wall requires that U(x) be infinite at the walls. Between the
walls U(x) has a constant value, which is defined to be zero, because the force on
the particle, F = −dU(x)/dx, is zero there. You should be able to argue from
inspection of the Schrödinger equation that the particle-in-a-box wave functions
are, in fact, sinusoids in the region between the walls since U(x) = 0 there.
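
The argument suggested above can be written out briefly. This sketch assumes the walls are at x = 0 and x = L, and it introduces the symbol κ only as shorthand.

```latex
\begin{align*}
\frac{d^2\psi}{dx^2} &= -\frac{8\pi^2 m E}{h^2}\,\psi \equiv -\kappa^2\psi
  \qquad (U = 0 \text{ between the walls}),\\
\psi(x) &= A\sin\kappa x + B\cos\kappa x,\\
\psi(0) = 0 &\;\Rightarrow\; B = 0, \qquad
\psi(L) = 0 \;\Rightarrow\; \kappa L = n\pi,\quad n = 1, 2, 3, \ldots,\\
E_n &= \frac{h^2\kappa^2}{8\pi^2 m} = \frac{n^2 h^2}{8 m L^2}.
\end{align*}
```

The two conditions at the walls are what turn the continuous family of sinusoids into a discrete set of allowed energies.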


For every energy level Eₙ in Fig. 31-21, its two intersections with the
potential energy curve U determine the range of the coordinate axis within
which a particle of that energy would be confined (in other words,
bound) according to newtonian mechanics. Inside this region, Eₙ > U so
the particle’s kinetic energy is K = E − U > 0. The particle would never be
found outside where Eₙ < U since there K = E − U < 0. This is
impossible in newtonian mechanics because K = mv²/2, and K < 0 therefore
means that the speed v is an imaginary number. But this is not a strict
limitation on the location of the particle in quantum mechanics. For instance,
the values of u at which E − U = 0 for the harmonic oscillator are those
for which e − u² = 0, or u = √e. You can see from Fig. 31-20 that the
harmonic oscillator probability density extends somewhat into the
newtonian-forbidden region.


It is easier to accept this if you realize that a position-momentum uncertainty
principle calculation shows it is not possible to detect the particle in
circumstances where it is known that E − U < 0. Because the probability density is
appreciable only for a short distance into the forbidden region, detecting the particle
there would constitute an accurate position measurement. This would necessarily
introduce an inaccuracy in its momentum after the measurement and a
corresponding inaccuracy in its kinetic energy. The resulting inaccuracy in its total
energy is seen in the calculation to be larger than the amount by which E is smaller
than U before the measurement. So it could not be said that the particle was
detected with E − U < 0.




An  essential  factor  underlying  the  discrete  separation  of  the  allowed 
energies  of  a bound  particle  is  that  there  are  conditions  imposed  on  its 
wave  function  at  both  boundaries  of  the  system.  For  the  particle  in  a box  the 
two  boundary  conditions  are  that  the  wave  function  must  have  zero  value  at 
each wall of the box. (Completely analogous are the conditions that there
be zero transverse displacement of a stretched string at the two boundaries
at its fixed ends. You will remember that these boundary conditions lead to
discretely separated, possible vibration frequencies for the string. The
frequencies for symmetrical vibrations of a circular drumhead are also
discretely separated because there are conditions on the transverse
displacement at two boundaries. As you will recall, the coordinate used in treating
the drumhead extends along a radius from the center to the rim. The
boundary condition at the center is that the first derivative of the
displacement must be zero, and the one at the rim is that the displacement itself
must be zero.)

For  the  harmonic  oscillator  there  are  also  two  boundary  conditions 
which  cause  its  energies  to  be  discretely  separated.  At  the  center  of  the  os- 
cillator there  is  the  condition  that  either  the  wave  function  must  be  zero  or 
its  first  derivative  must  be  zero.  The  other  “boundary”  is  in  the  region 
where  E — U < 0,  and  the  condition  is  that  the  wave  function  must  ap- 
proach zero. 

For  the  hydrogen  atom  there  are  two  boundary  conditions  and  dis- 
cretely separated  energy  levels,  as  long  as  the  total  energy  E is  less  than  the 
maximum  value  of  the  potential  energy  U.  But  if  E is  greater  than  the  max- 
imum U,  then  E — U > 0 everywhere.  The  electron  is  no  longer  bound, 
there  are  no  boundary  conditions  at  all,  and  the  energy  levels  form  the 
dense continuum indicated in Fig. 31-21b.

The  potential  energy  of  a diatomic  molecule  like  NaCl  is  shown  in  Fig. 
31-22  as  a function  of  the  separation  of  its  two  constituents,  along  with  the 
corresponding  energy-level  diagram.  When  E is  less  than  the  maximum 
value  assumed  by  U at  large  separations,  there  are  two  boundary  conditions 
and  discrete  energy  levels.  When  E is  greater  than  this  value,  there  is  one 
boundary  condition,  but  not  two.  A result  is  that  in  this  energy  range  the 
constituents  of  the  molecule  are  not  bound — although  there  is  a limit  to 
how  close  they  can  approach,  they  can  move  apart  without  limit.  Another 
result  is  that  the  energy  levels  of  the  system  form  a continuum. 

The  situation  can  be  described  concisely  as  follows:  When  the  relation 
between  the  potential  energy  U and  the  total  energy  E is  such  that  in  newtonian  me- 
chanics a particle  would  be  bound  to  a limited  region  of  space,  then  in  quantum  me- 
chanics E is  quantized.  Otherwise,  it  is  not  quantized. 


Fig. 31-22 The potential energy curve and energy-level diagram for
a diatomic molecule. The first few energy levels show a
harmonic-oscillator-like structure because the potential energy curve in this energy
region  is  approximately  parabolic.  The  higher  energy  levels  begin  to 
bunch  because  of  the  increase  in  the  “width”  of  the  binding  region 
with  increasing  energy.  The  energy-level  diagram  can  be  obtained 
both  by  solving  the  Schrodinger  equation  for  a given  potential  energy 
curve  and  by  analyzing  the  measured  spectrum  of  radiation  emitted  by 
a molecule.  For  a potential  energy  curve  having  the  proper  form, 
agreement  between  theory  and  experiment  is  excellent. 




EXERCISES 

Group  A 

31-1. Recognizing wave functions. When the wave
function for a particle in a box has the form in Fig.
31E-1a, its total energy is 6.0 eV.

Fig. 31E-1

a. What is its total energy when the wave function has
the form in Fig. 31E-1b?

b.  What  is  its  lowest  possible  total  energy? 

31-2.  A reasonable  assumption. 

a. Find the mathematical expression for the wave
functions ψₙ(x) of a particle of mass m in a box extending
from x = 0 to x = L, assuming that they are sinusoids.

b.  If  the  particle  is  observed  never  to  be  in  the  vicinity 
of  the  point  x = L/3,  what  are  the  smallest  three  energies 
it  could  have? 

31-3. Balmer series limit. What is the shortest
wavelength, called the series limit, in the Balmer series of
hydrogen? Express your result to three-significant-figure
accuracy.

31-4.  Positronium. 

a. The Rydberg constant for an infinitely massive
nucleus is 1.09737 × 10⁷ m⁻¹. What is it for positronium,
which in the Bohr model consists of an electron and a
positron revolving about their center of mass?

b. How much energy is released when a positron and
an electron come together to form positronium?

31-5. Ionized helium, I. The Bohr formula for the
spectrum of singly ionized helium can be obtained from
that for hydrogen by replacing e⁴ with (Ze²)² in Eq. (31-14),
using Z = 2 for helium, if the slight difference between
the hydrogen and helium reduced mass is ignored. Show
that in this approximation the lines of the helium series
with n_f = 2 and even values of n_i have the same
wavelengths as those of the Lyman series for hydrogen with
n_f = 1.

31-6. A relation between Schrödinger and Bohr. Solution
of the Schrödinger equation for the hydrogen atom shows
that the wave function ψ(r) for the ground state is
proportional to e^(−r/r₁), where the constant r₁ is the ground-state
orbital radius in the Bohr model. (See Exercise 31-25.) The
probability P(r) dr of finding the electron with radial
coordinate between r and r + dr is the product of the
probability per unit volume ψ²(r) and the volume of the
shell-shaped region extending from r to r + dr. Since this
volume is proportional to r², it follows that P(r) ∝ r²e^(−2r/r₁).
Show that P(r) has its maximum value at r = r₁.

31-7.  Microscopic  zero-point  energy.  The  center-to- 
center  separation  of  the  atoms  of  a certain  diatomic  mole- 
cule is  determined  by  the  influence  of  a Hooke’s  law  force 
with a force constant having the value k = 2.0 × 10³ J/m².
Each atom has mass m = 14 u. Predict the zero-point
energy of the vibration in their center-to-center separation.

31-8.  Macroscopic  zero-point  energy?  Passengers  on 
large  jet  aircraft  occasionally  see  the  tips  of  the  wings  oscil- 
lating vertically,  with  amplitude  of  roughly  0.1  m and  fre- 
quency of  roughly  1 Hz.  Prove  that  this  phenomenon  is 
not  a zero-point  motion,  as  follows.  Make  an  order-of- 
magnitude  estimate  of  the  amount  of  mass  in  oscillation, 
and  use  it  to  estimate  the  total  energy  of  the  oscillation. 
Then  evaluate  the  zero-point  energy,  and  compare. 

31-9.  Why  Newton  did  not  detect  energy  quantization.  The 
bob  of  a pendulum  has  a mass  of  2.00  kg,  the  length  of  the 
string  supporting  the  bob  is  0.500  m,  and  the  amplitude  of 
the  oscillation  of  the  bob  is  0.0600  rad.  Calculate  the  fol- 
lowing: 

a. The oscillation frequency ν.

b. The total energy Eₙ.

c. The approximate value of the quantum number n.

d. The separation in energy Eₙ₊₁ − Eₙ between
adjacent energy levels.

e. The oscillation amplitude at which energy
quantization could conceivably be detected experimentally,
assuming it requires that (Eₙ₊₁ − Eₙ)/Eₙ be at least 1 part in
10⁶. Compare this amplitude with the size of an atom.
Could energy quantization of the pendulum actually be
detected experimentally?

31-10. Quantized or not quantized. Figure 31E-10 shows
the potential energy U(x) for a system. Is the total energy
E quantized (allowed values discretely separated) or not
quantized (allowed values continuously distributed) for
each of the following ranges?

Fig. 31E-10

a. 0 < E < U₁

b. U₁ < E < U₂

c. U₂ < E < U₃

d. U₃ < E




Group  B 

31-11. Normalization. Each of the functions ψₙ(x)
plotted in Fig. 31-3 is a sinusoid, which can be written
ψₙ(x) = Aₙ sin(nπx/L). Prove that when the normalization
condition of Eq. (31-4) is satisfied, the multiplicative
constant Aₙ, which determines the maximum value of these
functions, must equal √(2/L). In other words, prove that
the maximum value of ψ²(x) is 2/L when the normalization
condition is satisfied, as indicated in Fig. 31-5.

31-12.  Correspondence  between  quantum  and  nonquantum 
physics,  I.  An  electron  in  a box  of  length  L makes  a transi- 
tion from  its  quantum  state  n + 1 to  its  quantum  state  n. 

a.  Find  the  frequency  of  the  photon  which  the  elec- 
tron emits  to  conserve  energy  while  making  this  transi- 
tion. 

b.  Find  the  frequency  of  vibration  (number  of  round 
trips  per  second)  of  the  electron  when  it  is  in  quantum 
state  n + 1. 

c.  According  to  Maxwell's  electromagnetic  theory,  a 
vibrating  charged  particle  emits  electromagnetic  waves 
whose  frequency  is  the  frequency  of  vibration  of  the  par- 
ticle. We  would  expect  this  prediction  to  correspond  to  the 
quantum  prediction  for  large  quantum  numbers  but  not 
for  small  ones.  Show  that  in  the  limit  of  large  n,  the  fre- 
quencies found  in  parts  a and  b agree.  Also,  find  the  frac- 
tional difference  between  them  when  n = 1. 

31-13.  Relativistic  particle  in  a box.  Equation  (31-2) 
for  the  energy  levels  of  a particle  in  a one-dimensional  box 
was  derived  assuming  that  the  particle  was  nonrelativistic. 

a.  Derive  an  equation  for  the  energy  levels  of  a par- 
ticle in  a box  using  relativistic  mechanics. 

b. How big does the quantum number n have to be
before relativistic effects are significant (say the kinetic
energy is 50 percent of the rest mass energy)? Evaluate this
critical quantum number for the following three cases:

(i) a proton in a box of length 1.0 × 10⁻¹² m.

(ii) an electron in a box of length 1.0 × 10⁻¹⁰ m.

(iii) an electron in a box of length 1.0 × 10⁻¹² m.

31-14. Discovery of deuterium.

a. R_∞, the Rydberg constant for an infinitely massive
nucleus, is equal to 1.09737 × 10⁷ m⁻¹. Use this value and
the value m/M_H = 1/1836, where m is the mass of an
electron and M_H is the mass of a hydrogen nucleus, to
calculate R_H, the Rydberg constant for hydrogen.

b. Deuterium has a nucleus consisting of a proton
and a neutron. This nucleus is almost exactly twice as
massive as a hydrogen nucleus. Calculate the Rydberg
constant for deuterium, R_D.

c. Naturally occurring hydrogen contains about 1
part in 5000 of deuterium. The latter was originally
discovered by the small difference in the wavelength of the
lines in the Balmer series of hydrogen and deuterium.
The longest wavelength of the Balmer series of hydrogen
is 656.280 nm. By how much would the corresponding
line of deuterium differ from this?


31-15.  Ionized  helium,  II.  Lines  in  the  spectrum  of  sin- 
gly ionized  helium  satisfy  the  relation 


The experimental value of R_He is 10,972,226 m⁻¹. Lines in
the spectrum of neutral hydrogen satisfy the relation


where the experimentally determined value of R_H is
10,967,758 m⁻¹. The ratio of the mass of the helium
nucleus to the mass of the hydrogen nucleus is M_He/M_H =
3.9715. Using these data, show that the ratio of the mass of
an electron to the mass of the hydrogen nucleus is
m/M_H = 1/1836.

31-16.  Correspondence  between  quantum  and  nonquantum 
physics,  II.  An  electron  in  a Bohr-model  hydrogen  atom 
makes  a transition  from  its  quantum  state  n + 1 to  its 
quantum  state  n. 

a.  Find  the  frequency  of  the  photon  which  the  elec- 
tron emits  to  conserve  energy. 

b.  Find  the  frequency  of  revolution  (number  of 
orbits  per  second)  of  the  electron  when  it  is  in  the 
quantum  state  n + 1. 

c. According to Maxwell’s electromagnetic theory, an
orbiting charged particle emits electromagnetic waves
whose frequency is the same as the orbital frequency of
the particle. We would expect this prediction to
correspond to the quantum prediction for large quantum
numbers, but not for small ones. Show that in the limit of
large n, the frequencies found in parts a and b agree. Also
find the fractional difference between them when n = 1.

31-17. A relation between Planck and Bohr. A particle of
mass m is in a circular “Bohr-like” orbit of radius r about a
fixed force center. The force exerted on it is attractive,
and its magnitude F is proportional to r. That is, F = kr,
where k is a constant.

a.  Find  the  potential  energy  U for  the  system,  if  U = 
0 for  r = 0. 

b.  Find  the  speed  v of  the  particle. 

c. Apply the quantization condition of Eq. (31-5) to
find the radii and speeds for the allowed orbits.

d.  Find  the  kinetic  energy  K for  the  particle  in  al- 
lowed orbits. 

e.  Find  the  energy  quantization  formula  for  this 
system,  expressing  the  total  energy  En  in  terms  of  the  par- 
ticle’s frequency  of  revolution  v (number  of  orbits  per  sec- 
ond). 

f.  Compare  the  results  of  part  e with  those  of  Planck, 
described  in  small  print  below  Eq.  (31-40).  Can  you  ex- 
plain why  they  are  the  same? 

31-18.  Separating variables. Substitute the function
$\Psi(x, t) = \psi(x)e^{-2\pi iEt/h}$ into Eq. (31-24), the time-dependent
Schrödinger equation. Then show that it is a solution to
the equation, providing $\psi(x)$ is a solution to Eq. (31-23),
the time-independent Schrödinger equation.

31-19.  A time-dependent  probability  density.  By  following 
the  procedure  outlined  in  the  text,  verify  that  the  proba- 
bility density  for  an  electron  making  a transition  from 
state  2 to  state  1 can  be  written  in  the  form  of  Eq.  (31-28). 

31-20.  Ground state of the harmonic oscillator, I. The
function $\psi(x) = Ae^{-\beta x^2/2}$, where A is an arbitrary constant
and $\beta = 2\pi\sqrt{mk}/h$ is the parameter defined in Eq.
(31-31), is the analytical form of an acceptable solution to
Schrödinger's equation for the harmonic oscillator, providing
the total energy E has a suitable value. In fact, it is
the solution for n = 1. By substitution into the differential
equation, verify that this is true, and find the value of E.
Compare the value you obtain with Eq. (31-40).

31-21.  Ground state of the harmonic oscillator, II. The
wave function for the ground state of the harmonic oscillator
is $\psi(x) = Ae^{-\beta x^2/2}$, where A is a constant and
$\beta = 2\pi\sqrt{mk}/h$. (See Exercise 31-20.)

a.  Draw a graph of the probability density $\psi^2(x)$ as a
function of x.

b.  What are the endpoints of the region allowed by
newtonian mechanics when the oscillator has its ground
state energy $E = \tfrac{1}{2}h\nu = \tfrac{1}{2}h(1/2\pi)\sqrt{k/m}$? Indicate them on
your graph.

c.  Use  your  graph  to  estimate  the  fractional  probabil- 
ity of  finding  the  harmonic  oscillator  outside  the 
newtonian-allowed  region  when  it  is  in  its  ground  state. 
That  is,  if  a set  of  independent  measurements  are  made  to 
locate  the  oscillating  body,  in  what  fraction  of  these  will  it 
be  found  in  the  newtonian-forbidden  region? 
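
The graphical estimate in part c can be cross-checked numerically. The sketch below assumes the result of part b in scaled form: setting $\tfrac{1}{2}kx^2 = \tfrac{1}{2}h\nu$ with $\beta = 2\pi\sqrt{mk}/h$ gives turning points at $x = \pm 1/\sqrt{\beta}$, so in the scaled variable $s = \sqrt{\beta}\,x$ the density is proportional to $e^{-s^2}$ and the turning points sit at $s = \pm 1$. The function names are hypothetical.

```python
import math


def prob_outside_exact(s_turn=1.0):
    """Probability of |s| > s_turn for a density proportional to exp(-s^2).

    This is the complementary error function erfc(s_turn).
    """
    return math.erfc(s_turn)


def prob_outside_numeric(s_turn=1.0, s_max=8.0, n_steps=20000):
    """Same probability by trapezoidal integration of the normalized density.

    The normalized density is exp(-s^2)/sqrt(pi); the tail beyond s_max
    is assumed negligible.
    """
    h = (s_max - s_turn) / n_steps
    total = 0.5 * (math.exp(-s_turn**2) + math.exp(-s_max**2))
    for i in range(1, n_steps):
        s = s_turn + i * h
        total += math.exp(-s * s)
    # Factor 2 accounts for both forbidden regions, s > 1 and s < -1.
    return 2.0 * total * h / math.sqrt(math.pi)


if __name__ == "__main__":
    print(prob_outside_exact(), prob_outside_numeric())
```

Both routes give a probability of roughly 0.16, which a careful reading of the graph should reproduce.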

31-22.  First excited state of the harmonic oscillator. The
function $\psi(x) = Axe^{-\beta x^2/2}$, where A is an arbitrary constant
and $\beta = 2\pi\sqrt{mk}/h$ is the parameter defined in Eq.
(31-31), is the analytical form of an acceptable solution to
Schrödinger's equation for the harmonic oscillator, providing
the total energy E has a suitable value. In fact, it is
the solution for n = 2. By substitution into the differential
equation, verify that this is true, and find the value of E.
Compare the value you obtain with Eq. (31-40).

31-23.  Determining a microscopic force constant. Experimentally
determined values of the energy levels for vibrational
motion of a hydrogen molecule are quoted in Example
18-6. Use them, and the harmonic oscillator energy
quantization formula, to calculate the vibrational frequency
$\nu$ of the molecule. The vibrational frequency is related
to the force constant k specifying the "stiffness" of
the "spring" binding the two atoms together, and to the
reduced mass $\mu$ of either of these atoms, by the equation
$\nu = (1/2\pi)\sqrt{k/\mu}$. Explain the relation of this equation to
Eq. (6-28o), and explain why $\mu = m/2$ with m being the
actual mass of either atom. Calculate the value of $k/\mu$
from the value of $\nu$. Then use the known value of m to calculate
the value of $\mu$. Finally, determine the value of k.
Compare it with the value of k for a typical macroscopic
spring.
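
The arithmetic in the last steps can be sketched as follows. The frequency used here is the value listed for this exercise in the Answers section, not one derived from Example 18-6, and the hydrogen-atom mass is an assumed round value; the variable names are hypothetical.

```python
import math

# Vibrational frequency of H2 in Hz, as listed in the Answers section.
nu = 1.32e14

# Approximate mass of one hydrogen atom in kg (assumed value).
m_hydrogen = 1.67e-27

# Reduced mass of two equal masses: mu = m/2.
mu = m_hydrogen / 2.0

# From nu = (1/2 pi) sqrt(k/mu):  k/mu = (2 pi nu)^2.
k_over_mu = (2.0 * math.pi * nu) ** 2

# Force constant in N/m.
k = mu * k_over_mu

if __name__ == "__main__":
    print(f"k/mu = {k_over_mu:.3e} s^-2")
    print(f"mu   = {mu:.3e} kg")
    print(f"k    = {k:.0f} N/m")
```

A force constant of several hundred newtons per meter is comparable to that of a fairly stiff laboratory spring, which is the comparison the exercise invites.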


Group  C 

31-24.  Dimensional analysis and the Bohr model. The
Bohr model of the hydrogen atom is based on newtonian
mechanics, Coulomb's law for the electric force, and a
quantization principle which selects the allowed orbits. As
a result, the dynamical quantities such as size, speed, and
energy describing the electron's motion can depend only
on (1) the reduced mass $\mu$ of the electron-proton system,
(2) the constant $e^2/4\pi\epsilon_0$ which appears in Coulomb's law,
(3) Planck's constant h, and (4) dimensionless numerical
constants, or quantum numbers.

Any model of the hydrogen atom based on newtonian
mechanics, Coulomb's law, and a quantization principle
will yield orbital radii and energy levels of the same order
of magnitude as the Bohr model. Show this by using dimensional
analysis to show that the only distance which
can be constructed from items 1, 2, and 3 must be a numerical
multiple of the first Bohr orbit radius $h^2\epsilon_0/\pi\mu e^2$
and that the only energy which can be constructed from
the same items must be a numerical multiple of the corresponding
total energy $-\mu e^4/8\epsilon_0^2h^2$.
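
A small sketch of the bookkeeping behind this exercise. It checks that particular exponent choices give a pure length and a pure energy, then evaluates the two Bohr expressions numerically. The constants are approximate SI values supplied here as assumptions, and the electron mass is used in place of the reduced mass (a difference of about 0.05 percent). The helper name `dimensions` is hypothetical.

```python
import math


def dimensions(a, b, c):
    """Return (M, L, T) exponents of mu^a * (e^2/4 pi eps0)^b * h^c.

    mu has dimensions M; e^2/(4 pi eps0) has M L^3 T^-2; h has M L^2 T^-1.
    """
    mass = a + b + c
    length = 3 * b + 2 * c
    time = -2 * b - c
    return (mass, length, time)


# A length requires (M, L, T) = (0, 1, 0); an energy requires (1, 2, -2).
length_exponents = (-1, -1, 2)   # h^2 / (mu * e^2/4 pi eps0)
energy_exponents = (1, 2, -2)    # mu * (e^2/4 pi eps0)^2 / h^2

# Approximate SI constants (assumed values).
h = 6.626e-34        # J s
eps0 = 8.854e-12     # C^2 / (N m^2)
e = 1.602e-19        # C
m_e = 9.109e-31      # kg, used in place of the reduced mass

bohr_radius = h**2 * eps0 / (math.pi * m_e * e**2)       # meters
ground_energy_J = -m_e * e**4 / (8 * eps0**2 * h**2)      # joules
ground_energy_eV = ground_energy_J / e                    # electron volts

if __name__ == "__main__":
    print(dimensions(*length_exponents), dimensions(*energy_exponents))
    print(bohr_radius, ground_energy_eV)
```

Because the three dimensional equations in (a, b, c) are linearly independent, each target dimension fixes the exponents uniquely, which is why only the numerical multiple is left undetermined.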

31-25.  Ground state of the hydrogen atom. In order to
generalize the Schrödinger equation to three dimensions
and two particles, the ordinary derivative $d^2\psi/dx^2$ is replaced
by the combination of partial derivatives

$$\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2}$$

and the mass m is replaced by the reduced mass $\mu$. The
equation then becomes

$$\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2} = -\frac{8\pi^2\mu}{h^2}\,[E - U(x, y, z)]\,\psi$$

For the hydrogen atom, $U(x, y, z) = -e^2/4\pi\epsilon_0 r$, where
$r = (x^2 + y^2 + z^2)^{1/2}$.

Show by substitution into the differential equation
that it has the acceptable solution $\psi = Ae^{-r/r_1}$, where A is
an arbitrary constant and $r_1 = h^2\epsilon_0/\pi\mu e^2$ is the ground-state
Bohr orbit radius. In the process, you will evaluate
the energy E of the atom. Compare it with the Bohr value
given by Eq. (31-10) to identify this $\psi$ as the ground state
wave function. If you have done Exercise 31-6, you will
already have made a comparison between its form and the
ground-state Bohr orbit radius. Hint: Evaluation of the
three partial derivatives requires repeated use of the
"chain rule."


31-26.  This one really oscillates!

a.  Show that

$$\Psi(x, t) = A\left(e^{-i\pi\nu t} + \sqrt{\beta}\,x\,e^{-3i\pi\nu t}\right)e^{-\beta x^2/2}$$

is a solution of Schrödinger's time-dependent equation
for a harmonic oscillator. Here A is an arbitrary constant,
$\nu = (1/2\pi)\sqrt{k/m}$, and $\beta = 2\pi\sqrt{mk}/h$.

b.  Find the probability density and sketch
graphs of it for times $t = 0$, $t = 1/4\nu$, and $t = 1/2\nu$. Comment
on your graphs. Note: If you compare this $\Psi(x, t)$
with the $\psi(x)$ considered in Exercise 31-20 and the one
considered in Exercise 31-22, you will see that the harmonic
oscillator is partly in its ground state and partly in
its first excited state. Compare also Eq. (31-28).

31-27.  Newtonian probability density. Use the procedures
of newtonian mechanics to prove that for a harmonic
oscillator the probability density $P_{newt}(u)$ has the
form $P_{newt}(u) = A/\sqrt{\epsilon - u^2}$ shown in Fig. 31-20.

31-28.  Three-dimensional harmonic oscillator. Schrödinger's
equation for a three-dimensional harmonic oscillator
of potential energy

$$U(x, y, z) = \tfrac{1}{2}k(x^2 + y^2 + z^2)$$

is

$$\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2} = -\frac{8\pi^2 m}{h^2}\left[E - \tfrac{1}{2}k(x^2 + y^2 + z^2)\right]\psi$$

Let $\psi_{n_x}(x)$ be a solution of the one-dimensional
Schrödinger equation,

$$\frac{d^2\psi_{n_x}(x)}{dx^2} = -\frac{8\pi^2 m}{h^2}\left(E_{n_x} - \tfrac{1}{2}kx^2\right)\psi_{n_x}(x)$$

where $E_{n_x} = (n_x - \tfrac{1}{2})h\nu$, and let $\psi_{n_y}(y)$ and $\psi_{n_z}(z)$ satisfy
similar one-dimensional Schrödinger equations. Show
that

$$\psi(x, y, z) = \psi_{n_x}(x)\,\psi_{n_y}(y)\,\psi_{n_z}(z)$$

is a solution of Schrödinger's equation for the three-dimensional
oscillator if the energy E has the correct
value, and find that value of E.


Numerical 

31-29.  Harmonic oscillator, I. Run the harmonic oscillator
Schrödinger equation program with $\psi_0 = 1$;
$(d\psi/du)_0 = 0$; $u_0 = 0$; $\Delta u = 0.01$; $\epsilon = 1$; l = 10. Then repeat,
except with $\epsilon = 0.9999$. Plot your results for each
run, compare them with those plotted in Fig. 31-17, and
then comment on the relation between the two sets of results.
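
The text's own program is not reproduced in this section, so the following is only a stand-in sketch. It assumes the dimensionless harmonic oscillator equation $d^2\psi/du^2 = -(\epsilon - u^2)\psi$ (the $\delta = 0$ case of the form given in Exercise 31-32) and integrates it with a classical fourth-order Runge-Kutta step, which may differ from the method used in the book. The function name and its parameters are hypothetical, and the parameter l of the exercise is not used.

```python
import math


def integrate_oscillator(eps, u_max, du=0.01, psi0=1.0, dpsi0=0.0):
    """Integrate psi'' = -(eps - u^2) psi from u = 0 to u_max; return psi(u_max).

    Uses fixed-step RK4 on the first-order system (psi, dpsi/du).
    """
    def second_derivative(u, psi):
        return -(eps - u * u) * psi

    n_steps = int(round(u_max / du))
    u, psi, dpsi = 0.0, psi0, dpsi0
    for _ in range(n_steps):
        k1_psi = dpsi
        k1_dpsi = second_derivative(u, psi)
        k2_psi = dpsi + 0.5 * du * k1_dpsi
        k2_dpsi = second_derivative(u + 0.5 * du, psi + 0.5 * du * k1_psi)
        k3_psi = dpsi + 0.5 * du * k2_dpsi
        k3_dpsi = second_derivative(u + 0.5 * du, psi + 0.5 * du * k2_psi)
        k4_psi = dpsi + du * k3_dpsi
        k4_dpsi = second_derivative(u + du, psi + du * k3_psi)
        psi += du * (k1_psi + 2 * k2_psi + 2 * k3_psi + k4_psi) / 6.0
        dpsi += du * (k1_dpsi + 2 * k2_dpsi + 2 * k3_dpsi + k4_dpsi) / 6.0
        u += du
    return psi


if __name__ == "__main__":
    for eps in (1.0, 0.9999, 0.9):
        print(eps, [integrate_oscillator(eps, u) for u in (1.0, 2.0, 3.0, 4.0)])
```

For $\epsilon = 1$ the numerical solution should track $e^{-u^2/2}$ over the range where rounding errors have not yet grown, while a slightly different $\epsilon$ eventually turns away from the axis and diverges; that contrast is what the exercise asks you to comment on.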

31-30.  Harmonic oscillator, II. Run the harmonic oscillator
Schrödinger equation program to find the third allowed
value of $\epsilon$. Plot the results obtained in one run to illustrate
the form of $\psi(u)$.

31-31.  Harmonic oscillator, III. Modify the harmonic
oscillator Schrödinger equation program so that the values
of $\psi^2(u)$ are calculated and displayed. Then plot this
quantity for the third allowed value of $\epsilon$ obtained in Exercise
31-30. Compare with the plot in Fig. 31-20.


31-32.  Anharmonic  oscillator.  Write  a program  to  solve 
the  Schrodinger  equation  for  an  anharmonic  oscillator, 
with  potential  energy 


U(x) 


First reexpress the Schrödinger equation in the dimensionless
form

$$\frac{d^2\psi}{du^2} = -(\epsilon - u^2 - \delta u^4)\,\psi$$

and find the relation between $\delta$ and p. Incorporate in your
program provision for displaying $\psi^2$. Then find the ninth
allowed energy, and compare it with the ninth allowed energy
found for the harmonic oscillator in Example 31-6.
Plot $\psi^2$, compare the plot with Fig. 31-20, and explain the
difference. There is no analytical solution to this
Schrödinger equation.

31-33.  Finite square well, I. Quantum mechanics
makes interesting predictions about the behavior of a particle
of mass m in a system with potential energy U(x) that
is capable of binding the particle only if its total energy E
is less than a certain value. The simplest example is found
in the finite square-well potential:

$$U(x) = \begin{cases} U_0 & \text{for } x < -a/2 \text{ or } x > +a/2 \\ U_0/2 & \text{for } x = \pm a/2 \\ 0 & \text{for } -a/2 < x < +a/2 \end{cases}$$

The constants $U_0$ and a are the "depth" and "width" of the
"potential well."

a.  Show that a dimensionless form of the
Schrödinger equation for the system is

$$\frac{d^2\psi(u)}{du^2} = \begin{cases} -\beta(\epsilon - 1)\,\psi(u) & \text{for } u < -1/2 \text{ or } u > +1/2 \\ -\beta(\epsilon - 1/2)\,\psi(u) & \text{for } u = \pm 1/2 \\ -\beta\epsilon\,\psi(u) & \text{for } -1/2 < u < +1/2 \end{cases}$$

where $u = x/a$, $\beta = 8\pi^2ma^2U_0/h^2$, and $\epsilon = E/U_0$.

b.  Write a program to find numerical solutions to the
differential equation in part a. Hint: For accuracy, it is important
that one of the values of $u_i$ used in the calculation
be precisely 0.5, the location of the discontinuity in U(u).
(Can you see why?) Hence be sure to use only values of $\Delta u$
that divide evenly into 0.5.

c.  Let $\beta$ have the representative value $\beta = 64$. Then
run the program to find the lowest energy level $E_1$ of the
system in terms of $U_0$, and also the corresponding wave
function $\psi_1(x)$ in terms of a. Plot the wave function. Explain
why the divergence of $\psi$ to positive or negative infinity,
found when E is not quite right, is less pronounced
than for the case of the harmonic oscillator potential.

31-34.  Finite square well, II.

a.  Use the finite square well program of Exercise
31-33 to find the second and third bound energy levels $E_2$
and $E_3$ of the system, with $\beta = 64$, as well as the corresponding
wave functions $\psi_2(x)$ and $\psi_3(x)$. Plot the wave
functions.

b.  Use the program to show that for $\beta = 64$ the finite
square well does not have a fourth bound energy level,
that is, a fourth discrete level at some energy $E < U_0$.

c.  Make a sketch, like those in Fig. 31-21, showing to
scale U(x) and its three bound energy levels $E_1$, $E_2$, and $E_3$.
Label each energy level as to whether the corresponding
wave function is even or odd.

31-35.  Finite  square  well,  III. 

a.  Use the finite square well program of Exercise
31-33 to show that there is a continuum of energy levels
starting at $E = U_0$. In other words, show that there is an
acceptable solution $\psi(x)$ to the Schrödinger equation for
any energy $E > U_0$. Plot one of these wave functions for a
typical energy $E > U_0$.

b.  Work done in part a demonstrates that for a typical
energy $E > U_0$ the amplitude of the oscillatory wave
function $\psi(x)$ in the region $-a/2 < x < +a/2$ is appreciably
less than the amplitude in the regions $x < -a/2$ or
$x > +a/2$. This means that the particle will typically have
an appreciably smaller probability of being found within
the potential well than of being found in a region of equal
extent outside the potential well. But there are certain energies
$E > U_0$ for which the interior and exterior amplitudes
are equal, so that the particle has a relatively large
probability of being found within the potential well. These
are sometimes called the energies of the virtual levels of
the potential well. Use the program with $\beta = 64$ to find
the lowest of these energies for which the wave function is
even, and also the lowest for which it is odd. Plot the two
wave functions.

c.  Find the simple analytical expression for the condition
that E be the energy of a virtual level of a particular
finite square-well potential.

d.  If you have done Exercise 31-34, then add the two
lowest virtual levels (labeled even or odd) to the sketch in
part c showing the three bound levels. Comment on any
regularities you see.




Answers 




1.  Most  answers  are  given  to  three  significant  figures.  The  departures  from  this  convention  occur  where 
appropriate,  based  either  on  the  problem  statement  or  on  the  numerical  details. 

2.  Unless otherwise indicated, g has been assigned the value 9.80 m/s$^2$.

3.  For  problems  which  require  estimates,  the  numerical  answers  are  preceded  by  “Est.” 


CHAPTER  16 

1.  (a) The upper part: at and near the upper surface;
the lower part: at and near the lower surface.
(b) At the surface between the region
under tension and the region under compression.

3.  3.17  mm 

5.  ( b ) 10.3  m 

7.  2.99  x 105  Pa  = 2.96  atm 

9.  (a)  0.87  mm/s;  ( b ) approx.  4 min;  (c)  approx. 
10  km 

11.  (a) 0.150 m/s; (b) $4.71 \times 10^{-5}$ m$^3$/s = 2.83
liters/min

13.  100 cm$^2$ = $10^{-2}$ m$^2$

15.  (a) $F_\perp = F\sin\theta$, $F_\parallel = F\cos\theta$;
(b) $\sigma_t = (F/A)\sin^2\theta$, $\sigma_s = (F/A)\sin\theta\cos\theta = (F/2A)\sin 2\theta$;
(c) $\pi/2 = 90°$, $\pi/4 = 45°$

21.  18.1 cm, 87.7 cm

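23.  (a) $6.27 \times 10^7$ Pa. (b) The equilibrium is unstable.
(c) no

25.  (b) $v_f = 2r_s^2(\rho_s - \rho)g/9\eta$;
(c) $R = 4r_s^3(\rho_s - \rho)\rho g/9\eta^2$,
so $R < R_c = 10$ for $0 < r_s < r_{max} = \left[\dfrac{45\eta^2}{2(\rho_s - \rho)\rho g}\right]^{1/3}$;
(d) $v_f = (1.5 \times 10^7\ \mathrm{m^{-1}\,s^{-1}})\,r_s^2$,
$R = (3.0 \times 10^{13}\ \mathrm{m^{-3}})\,r_s^3$, $r_{max} = 6.9 \times 10^{-5}$ m =
0.069 mm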

31.  0.69  cm2 

33.  (c) $dT = 2\pi r^3\,dr\,G\theta/L$; (d) $T = \pi R^4G\theta/2L$, yes

37.  (a) $p'_A = p'_B = \dfrac{p_AV_A + p_BV_B}{V_A + V_B}$,
$V'_A = \dfrac{p_AV_A(V_A + V_B)}{p_AV_A + p_BV_B}$,
$V'_B = \dfrac{p_BV_B(V_A + V_B)}{p_AV_A + p_BV_B}$;
(b) $p'_A/p_A = 7/3$, $V'_A/V_A = 3/7$

41.  (i) $v(r) = \Delta p(R^2 - r^2)/4\eta L$

43.  (a) $F = \rho gha$

45.  (a) $x = 2\sqrt{y(h - y)}$; (b) $h - y' = y$; (c) x is maximized
for $y = h/2$


CHAPTER  17 

1.  (a) $C = \tfrac{5}{9}(F - 32)$, where C and F are equivalent
readings on the Celsius and Fahrenheit
scales; (b) −40°F = −40°C

3.  610  K 

5.  1.89  g 

7.  0.524  kg/m3 

9.  191°C

11.  0.37  kg 

13.  4.8  x 103  J/kcal 

15.  (a) 5.09 cm; (b) 7.09 kg


17.  5.00  m3 
19.  760  mm  Hg 
21.  0.14 
23.  0.49  m 
25.  946°C 
27.  35.3°C 

31.  (b) $500 \times 10^{-6}$/°C
35.  10.2 cm

37.  0.771 cal = $7.71 \times 10^{-4}$ kcal


CHAPTER  18 


1.  (In this problem, we assume that T = 293 K.)
(a) $6.07 \times 10^{-21}$ J for both; (b) He: $v_{rms}$ =
$1.35 \times 10^3$ m/s, Ar: $v_{rms}$ = 428 m/s; (c) $2.43 \times 10^5$ Pa

3.  (a) $3.86 \times 10^{25}$ molecules; (b) $4.83 \times 10^{24}$/m$^3$;
(c) $2.40 \times 10^5$ J; (d) EgJ{KE)B = 100;
(e) $6.21 \times 10^{-21}$ J

7.  (a) The two results are equally probable;
(b) $(1/6)^3 = 4.63 \times 10^{-3}$; (c) $(1/6)^2 = 2.78 \times 10^{-2}$

9.  (a) 0.561; (b) 0.749
11.  1250  rotations/min 

13.  (a) $2.68 \times 10^{25}$ m$^{-3}$; (b) $3.34 \times 10^{-9}$ m;
(c) $3.72 \times 10^{-6}$ m; (d) $3.72 \times 10^{-6}$ atm

17.  (a) $\sqrt{3kT/M}$; (b) $5.0 \times 10^{-3}$ m/s; (c) y = 3/2;
(d) $3kT$

19.  (a) 1; (b) 3; (c) 6

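21.  (Here $\beta = 1/kT$.) (b) $P(\epsilon = -\epsilon_1) = e^{\beta\epsilon_1}/(e^{\beta\epsilon_1} + e^{-\beta\epsilon_1})$,
$P(\epsilon = +\epsilon_1) = e^{-\beta\epsilon_1}/(e^{\beta\epsilon_1} + e^{-\beta\epsilon_1})$,
$\langle\epsilon\rangle = -\epsilon_1\left(\dfrac{e^{\beta\epsilon_1} - e^{-\beta\epsilon_1}}{e^{\beta\epsilon_1} + e^{-\beta\epsilon_1}}\right) = -\epsilon_1\tanh(\beta\epsilon_1)$,
$E = N\langle\epsilon\rangle = -N\epsilon_1\tanh(\beta\epsilon_1)$;
(c) $c' = 4k(\beta\epsilon_1)^2/(e^{\beta\epsilon_1} + e^{-\beta\epsilon_1})^2 = k(\beta\epsilon_1)^2\,\mathrm{sech}^2(\beta\epsilon_1)$

23.  (a) $G(v) = K_2v$, $n(v) = K_2\,v\,e^{-mv^2/2kT}$; (b) $\sqrt{kT/m}$;
(c) $G(v) = K_Dv^{D-1}$, $n(v) = K_D\,v^{D-1}e^{-mv^2/2kT}$,
$v_{mp} = \sqrt{(D - 1)kT/m}$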

29.  pf  = pi  only  if  ra  = rB\  otherwise,  pf  < pi 

31.  (a)  and  (ft)  <ef)  = |f,  Pi(e)de  = for  i = 

1,2;  (c)  m(e ) = NiPde)  = Tc/<c,>  for  i = 1,2; 

(d)  £ = £4  + £2,  <e)  = (£1  + E2)/(Nl  + N2Y, 

(e)  P(e)de  = e~e,{£' , n(e)  = ^ e~e/U) 

41.  The bar graph will be randomly distributed
about the function $n(\epsilon) = (80/6)\,e^{-\epsilon/6}$.

45.  0.01k, 0.09k, 0.29k, 0.48k, 0.71k, 0.82k, 0.91k,
0.95k

49.  The  average  distance  will  be  proportional  to  the 
square  root  of  the  number  of  steps. 




CHAPTER  19 


1.  (a) The work done on the water is $-1.69 \times 10^5$ J;
the water does positive work in pushing
back the surrounding atmosphere. (b) $20.9 \times 10^5$ J;
(c) potential energy

3.  (a) irreversible, $\Delta H = 0$, $\Delta W = 0$, $\Delta E = 0$;
(b) If the independent variables are taken to be
T and V, then E depends only upon T, and not
on V.

7.  98.2  K 

9.  Monatomic ($\gamma = 1.61$)

11.  (a) $4.00 \times 10^3$ J; (b) $3.00 \times 10^3$ J

13.  For $T_{lo}$ = −10°C, $\epsilon_{hp}$ = 9.77; for $T_{lo}$ = 10°C,
$\epsilon_{hp}$ = 29.3.

15.  (a) 4.92 m$^3$; (b) $1.22 \times 10^5$ Pa = 1.20 atm;
(c) $1.09 \times 10^5$ J


23.  (b) $\Delta S = R\ln 2$

27.  (a)  pAVA;  (b)  A Hm  = 8 pAVA,Jf  = 1/8; 

(c)  A SAB  = SB  - SA=i  nR  In  2,  A SBC  = I nR 
In  (3/2)  + nR  In  3,  ASca  = l nR  In  (1/3) 

31.  (a)  Pi  = nRTi/Vi,  pf  = nRT/Vf  ; (b)  p(V)  = 

nRTjV ; (c)  nRTi  ln(P,/V»;  (d)  - nRT, 

In  ( Vi/Vf);  ( e ) (i)  AW/E  = 0.462,  (ii)  AW/E  = 
0.732,  (lii)  AW/E  = 1.073 


33.  (a)  y = 


/i7i(72 

fl(j2 


1)  +/272(Tl  - 1) 

i)  +My  1 - 1) 


(b)  y - 1 + 2 UkKjk  ~ 1)] 


k= l 


35.  29.7  km 

37.  (d)  Isometric  doubling  of  the  absolute  tempera- 
ture 


17.  $\Delta W = -nRT\ln(V_B/V_A) > 0$; $\Delta H = nRT\ln(V_B/V_A) < 0$

19.  (a)  AH  AW  A E AT  AS 
AB  + - 0 0 + 

BC  - 0 

CD  - + 00 

DA  + 0 + + + 

(b) $\Delta H > 0$, $\Delta W < 0$, $\Delta E = \Delta T = \Delta S = 0$
21.  $\Delta S = nR\ln(V_B/V_A)$


39.  (b)  4.74  x 10^2 
41.  (a)  AT  = 


1 In  (Tp/Ti) 


{ I + [2T1/3(T1  - r0)]  In  {VJV,)} 

^2  „-e/*r 


43.  (a)  C(T) 


kT'1  ( 1 + e~€lkTy- 


; (b)  k In  2 (c)  S(T)  = 


ax') 


T 


dT  = 


6“ 


n-elkT' 


o k(T')3  (1  + e-€lkT')2 


dT' 


e 


-elkT 


kT3  ( 1 + e-dkT)2 


CHAPTER  20 

1.  0.010  N 

3.  (a) $2.31 \times 10^{-24}$ N; (b) $1.86 \times 10^{-60}$ N;
(c) $1.24 \times 10^{36}$

5.  The bismuth nucleus must have one more unit
of positive charge than the lead nucleus has.

7.  $3.52 \times 10^{15}$ m/s$^2$ in the direction opposite to
the electric field

13.  (b) $5.29 \times 10^{-9}$ C

17.  (a)  Along  the  line  through  the  charges,  at  the 
point  which  is  1.0  m from  the  negative  charge, 
and  2.0  m from  the  positive  charge 


I9.  (a)  (V2  + i);  (b)  0;  (c)  q'  = (V2  + i) 

/o"  2 

27.  (a)  0;  (b)~. ^7,  directed  outward  along  the 

47 T€qCI 

perpendicular  bisector  of  the  line  connecting 
the  other  two  charges 

29.  Est: $\theta_{max} \approx 3 \times 10^{-4}$ radians = 0.02°

33.  (c) $6.55 \times 10^{15}$ Hz
39.  $7 \times 10^{-15}$ m


CHAPTER  21 




1.  Est: $8 \times 10^{-19}$ J $\approx$ 5 eV

V Z7re0mr 
5.  $3q^2/4\pi\epsilon_0a$

7.  (a) $3.3 \times 10^{-3}$ C; (b) $3.0 \times 10^6$ V

9.  $3.5 \times 10^{-3}$ V

11.  (a) 1.33 $\mu$F; (b) 1.00 $\mu$F

13.  (1) Two in parallel, in series with a third;
(2) three in series, in parallel with another three
in series

15.  (a) $10^3$ V; (b) $1.77 \times 10^{-2}$ $\mu$F; (c) $8.85 \times 10^{-3}$ J
17.  (a) 160 V; (b) The charge is unchanged; (c) 2.5


35.  (b) $R = \dfrac{3e^2}{20\pi\epsilon_0mc^2} = 1.69 \times 10^{-15}$ m


37.  (o)  0;  ( b ) perpendicular  to  the  plane; 

f,2 

( C ) 


167 T€0d 


<T  m ~W 

— 2;  (d)  a = 


27rr3 


39.  The  dipole  is  attracted  to  the  point  charge  if  the 
dipole  moment  p is  parallel  to  the  radius  vector 
r from  q to  the  dipole.  If  p and  r are  antiparallel, 
the  dipole  is  repelled. 

41.  (a)  p%z;  (b)  sin  0\ 

(d)  Hima2)  (~  j - p%  cos  6 = Hima2)  (~  j 
— p%  cos  d0 


21.  (a) (i) $-1 \times 10^{-8}$ C, (ii) $3 \times 10^{-8}$ C; (b) (i) and
(ii) 9000 V, (iii) 13,500 V; (c) 4500 V, with the
sphere at the higher potential

33.  4/3 


43.  (a)  10  V;  (b)  regular  tetrahedron  and  spherical 
shell 

45.  $\sigma = 2\epsilon_0\mathcal{E}_0\cos kx$


CHAPTER  22 


29.  The proper circuit has the two 55-W bulbs in
parallel, and has that combination in series with
the 110-W bulb.


1.  600  W 

3.  (a)  3.03  x 103  C;  (b)  3.40  g 
5.  4.91  m 

7.  $2.67 \times 10^{-3}$ °C$^{-1}$
9.  $5.52 \times 10^{-4}$ m/s

11.  (a) 240 $\Omega$; (b) 144 $\Omega$; (c) The 60-W bulb will be
brighter.

13.  (a) 90.0 W; (b) 22.5 W
15.  0.10 $\Omega$
17.  RAg/R Cu  = 3.40 
19.  100  nr,  104 

23.  Relative to copper, the number densities are:
0.17 for iron, $1.7 \times 10^{-10}$ for silicon, and $8.5 \times 10^{-20}$
for glass. Assuming that copper has one
conduction electron per molecule, the others
have 0.17, $2.8 \times 10^{-10}$, and $3.1 \times 10^{-19}$ conduction
electrons per molecule, respectively.

25.  (a) 1.8 $\Omega$; (b) 180 W; (c) 40 W; (d) 880 W

27.  $i_1$ = 3.00 A, $i_2$ = 2.00 A; V = 19 V

31.  (a) $-ikl_x + V_x = 0$; (b) $-ikl_s + V_s = 0$;
(c) $V_x/V_s = l_x/l_s$

33.  (b) 796 $\Omega$

35.  (a)  i/2vrL\  ( b ) i/27Tr<rL\  (d)  surface 

ZttctL 

charge  = e0t'/27rr1crL,  = e0i/cr,  Q2  = 
— e0i/(j,  S2  = — e0i/2TTr2crL 

37‘  ia)  + cr2(r|  - r?)]’ 

ill]  h {1  + (cr2/cr1)[(rl/rf)  - 1]}' 

• _ dq~2/o'i)[(rl/>1)  - 11 

2 {1  + (0-2/0-dOf/r?)  - 1]} 

39.  (b)  50  percent 

41.  (a)  (i)  119.2  A,  (ii)  19.2  A;  ( b ) 514  V; 

(c)  (i)  71.2  A,  (ii)  28.8  A;  (d)  493  V;  (e)  470  V 

43.  7r/5  = 1.40r 




CHAPTER  23 

3.  0.447 

5.  V3/2  = 0.866 
7.  Positive 
9.  4 x 10“3 * 5  T 


11.  0.796 A
13.  Northward

15.  (a) The fields are equal. (b) Solenoid B consumes
less power.

17.  (a) Negative; (b) $6.64 \times 10^{-6}$ m; (c) $3.57 \times 10^{-10}$ s

19.  (a) Magnetic field must be halved; (b) Frequency
must be doubled; (c) If the change is
made as described in part (a), the maximum
proton kinetic energy will equal half the deuteron
kinetic energy. If the change is made as
described in part (b), the maximum proton
kinetic energy will equal twice the deuteron
kinetic energy.

21.  (a) $3.05 \times 10^7$ Hz; (c) (i) 1.92 MeV,
(ii) 175 MeV (using relativistic mechanics);
(d) $2.57 \times 10^7$ Hz

23.  (b) $2.34 \times 10^{-3}$ radians = 0.134°

27.  (a) The smaller coil; (b) 2

29.  (a) B is parallel to +y for x > 0; B is parallel to
−y for x < 0


35.  ( b ) We  denote  the  solenoid’s  radius  by  k and  its 
length  by  L.  (i)  SftA  = 

5 7 2Vl  + (k/L)2 


(ii)  C = 


fJL0ni ...  <Mq_  _ 2V 1 + (k/L): 

Vl  + (2 k/Lf’  111  ®A  ~ Vl  + (2k /L) 


37.  (a)  b = a/2 

39.  (b) $\mathcal{B} = 0$ for $r < r_1$, $\mathcal{B} = \mu_0i/2\pi r$ for $r > r_2$.




CHAPTER  24 


1.  (a) $2.5 \times 10^{-2}$ N/m toward the west; (b) 0

3.  $1.96 \times 10^{-3}$ T directed out of the page

5.  $1.33 \times 10^{-5}$ N attracting the loop to the wire

7.  Since the relative speed is much less than c, the
surface charge density is $1.00 \times 10^{-6}$ C/m$^2$. A
current is observed to be flowing in the direction
opposite to the jet's motion.

9.  6 = N^ai/k

11.  17.5 A/m in the same direction as $\mathcal{B}_0$

15.  27 sin 9 upward

17.  (a) Thin wire: $5.15 \times 10^{-4}$ m/s, middle wire:
$1.29 \times 10^{-4}$ m/s, thick wire: $5.72 \times 10^{-5}$ m/s;
(b) thin wire: $i_-$ = 1.18 A and $i_+$ = −0.18 A,
middle wire: $i_-$ = 1.73 A and $i_+$ = −0.73 A,
thick wire: $i_-$ = 2.64 A and $i_+$ = −1.64 A

19.  (a) The force per unit length has magnitude
$\mu_0iNev/2\pi r$, where r is the distance from the
wire to the beam. The force is directed toward
the beam and is purely magnetic.

25.  $1.00 \times 10^{-2}$ Am

27.  63  K 

29.  (e)  19.5  A 

31.  (a) $F_e = 5.76 \times 10^{-19}$ N away from beam, $F_m = 1.60 \times 10^{-20}$ N
toward the beam; (b) $F'_e = 5.68 \times 10^{-19}$ N away from beam, $F'_m = 0$;
(c) $F'_e = 5.76 \times 10^{-19}$ N away from beam, $F'_m = 1.80 \times 10^{-20}$ N
toward the beam

35.  (a) $R_E$ = 900 $\Omega$, $R_F$ = 9000 $\Omega$, $R_G$ = 90,000 $\Omega$;
(b)  Rb  = if  H,  Rq  = t r H,  RD  = 9 V H 

37.  (a) $1.8 \times 10^{11}$ C/kg, which is twice as large as
$e/2m_e$; (b) $5.2 \times 10^{10}$ m/s, which is greater than
the speed of light.

39.  (a) Est: $8 \times 10^{-23}$ J; (b) $kT_c = 1.4 \times 10^{-20}$ J


CHAPTER  25 


3.  (a)  1.00  x 10  2 Wb  through  each  turn; 

(b)  1.00  V 

5.  Clockwise 


9.  2.83  x 103  V 
11.  9.33  x 10“2  V 




13.  5.00  x 10  3 Wb  through  each  turn 
15.  (a)  1.35  J;  (b)  1.59  x 103  J/m3 
17.  (a)  From  D to  C;  (c)  0.314  V 
19.  $1.13 \times 10^{-6}$ J
21.  6.28  C 

23.  (a)  -/tioiiy;  (b)  %x(z)  = (z  - D/2) 

for  0 ^ z ^ D\  %x(z)  = for  z > D; 

% (Z)  = for  < o 

xW  z (It 

25.  (b) 50 percent; (c) by making Re much greater
than  R 

27.  M=«l„(l+^ 


31.  (a)  i = — w3ftv/R ; (b)  v(t)  = v0e  tlT  and  x(t)  = 

x0  + v0T(  1 - e~tlT),  with  T = ^r|^;  (c)  v{t)  = 

(10  m/s)e~f/2-5s,  x{t)  = 3.0  m + 25  m(l  — e~rl2-5s), 
and  i(t)  = (—2.0  A)e~tl2'5Sm,  the  rod  slides  25 

7/i2  OJ&d 

meters,  (d)  ^ ^ — MJR  = 0.0408; 

(e)  %c/dft  = 6.7  x 10-4;  yes 

33.  (a)  For  a < r < b,  (B  = ~~~  dp,  where  dp  = z x r. 

Itty 

For  r > c,  (B  = 0. 

35.  (a) 213 V; (b) 298 W; (c) 202 V; (d) 3.6 A;
(e) 727 W

37.  $\mu_0i^2/16\pi$


CHAPTER  26 

1.  (a) $V_0/L$; (b) $L/R$; (c) It is identically equal to $\tau_L$.
5.  159  Hz 
7.  239  H 


9.  (a) (i) 377 $\Omega$, (ii) $2.65 \times 10^3$ $\Omega$; (b) 159 Hz

11.  (a)  141  V;  (b)  The  current  leads  the  voltage  by 
45°. 

13.  1.60  A 

15.  (a) (i) 377 $\Omega$, (ii) 531 $\Omega$; (b) 183 $\Omega$; (c) 0.655 A;
(d) (i) 247 V, (ii) 347 V, (iii) 65.5 V; (e) 56.9°
(The current leads the voltage.); (f) 0.546;
(g) 42.9 W

19.  $L = 1.59 \times 10^{-3}$ H, $C = 15.9$ $\mu$F

yn  1 

21.  (a)  i(t ) = — cos  a >0t,  where  w0  = — j=\ 
P VLC 


(b)  q(t)  = sin  <oof,  (c)  E = LVI/2R2 
(o0K 

\VA 

31.  (a)  i — —pr1  (1  — e tlr<-),  where  rL  = L/R: 

K 

(b) i(t)  = ^(1  - e~TA)e-^A  -J^i 

[1  - e-{t~T)lT'-] 

(c)  t'  = T + tl  In  [1  + (|T^|/|Tb|)(1  - e-T^)] 

35.  (a)  (i)  0,  (ii)  and  (iii)  q0/RC\  ( b ) 0; 

1_  _ /J 1_  \1/2 

(d)  - RC • s - [LC  - 4R2Cz  ) ; 

(e)  A = q0.  B = -q0/2RC(op 

39.  (a)  (i)  Vg/^R2  + (oj L)2,  where  Vg  is  the  genera- 
tor voltage;  (ii)  VgojL/\/R/  + (ojL)2;  (iii) 

Va/\/Rl  + ( 1/wC)2;  (iv)  VgR2/\/m  + (1/wC)2; 

(. b ) (i)  RjtoL,  (ii)  1 /ojR2C 


CHAPTER  27 

1.  $4.43 \times 10^{-2}$ A
7.  305 m

11.  (b) $3.33 \times 10^{-6}$ N

13.  (a)  {m1/m2)2',  (b)  Pe/Pap  — 3.39  x 106 

15.  48 $\Omega$ = 0.657 $R_{ideal}$


17.  (a)  q0  = 


CVp 

V I + (c oRC )2’ 


8 - 


— tan  hw/^C); 


(b)  S5  = P-oeot  Voo,  sin(cat  + 8)  in  the  azi- 

2dV  1 + (toRC)2 

muthal  direction,  where  d is  the  distance 
between  the  plates  and  r is  the  axial  distance; 




21. 


(c)  2.78  x l(r9  F;  (d)  2.78  x 10~7 
cos(1207rf  — 1.05  x 10-4)  coulombs; 
(e)  -2.09  x 1011  (r/1.00  m) 
sin(1207rt  — 1.05  x 10-4)  tesla; 
if)  ® max/  ® earth  ~ 4 X 10 

d2%z  d2%  , a2a„  d2M 

-&F  = f‘°e°  IF and  ~£F  = '4°e“  H? 


u. 


25.  (B  = — z y;  cos3[(27t/X.)(x  + ct)] 


31.  (a)^^;  0)^;  (c)  p„ 


d 


Um  ~ lutr’  {d)  8^;  (g) 


(t/m) 


lxQdorql  f a rR2 
32t t ’ 8C2 


2 '2 
I 

8ttR 4’ 

= . ^ 
47reo/?2  ’ 


CHAPTER  28 


5.  (a) $2.24 \times 10^8$ m/s; (b) $5.00 \times 10^{14}$ Hz in both
media; (c) $4.47 \times 10^{-7}$ m

7.  40.4° 

9.  1.31 

11.  0.910  mm 
13.  16.8° 

15.  546.4  nm 

17.  Est:  width/spacing  =1/11 
19.  0.145 


23.  $2.29 \times 10^8$ m/s, 1.31
25.  1.58 
27.  491 

35.  (a) $0° \le \theta < 35.5°$; (b) $35.5° \le \theta < 56.2°$;
(c) $56.2° \le \theta < 90°$

41.  568 nm, green

43.  $(1 + \cos\theta)/(1 - \cos\theta)$, where $\theta$ is the angle
between the orientations of the polarizers


CHAPTER  29 

1.  30  cm 
5.  18.8  cm 

7.  (a)  0.25  m;  (b)  0.476  m 
11.  —2  diopters 
15.  2.62  cm 


17.  (a)  and  ( b ) 


0 

- 1 


(c)  Matrix  multiplication 


obeys  the  associative  law. 


21.  (a)  80  cm;  (b)  — 160  cm:  When  immersed  in 
carbon  disulfide,  the  lens  becomes  a diverging 
lens. 

23.  29.2  cm 

25.  30.0  cm,  70.0  cm 

33.  -33.3  cm 

39.  — 90  cm 

41.  18.7  cm  from  lens  4 


CHAPTER  30 

1.  564  nm 
3.  9.31  x 1014  Hz 

5.  (a)  0.030711  nm;  ( b ) 0.034141  nm 
7.  2.65  x 10“15  m 
9.  7.01  x 10~15  m 
11.  2.9  x 106  m/s 


13.  (a) and (b) $5.7 \times 10^{-38}$ J = $3.5 \times 10^{-19}$ eV;
(c) $\Delta m/m = 3.4 \times 10^{-28}$

15.  (a) $6.67 \times 10^{-34}$ J·s; (b) 2.44 eV; (c) $5.86 \times 10^{14}$ Hz,
512 nm; (d) 22.5 V

17.  (a)  3.1  x 1014  photons  per  second;  (b)  7.0  x 106 
photons  per  second;  (c)  310  photons  per  second 

19.  (b)  0.50  mec2  = 0.256  MeV 




23.  (a) 


6 h 


\ 2/9 


(. b ) 7.57  nm,  which  is  much 


77 -p  \/2g' 

less  than  the  wavelengths  of  visible  light;  one 
cannot  expect  to  see  the  droplet. 


27.  (a) $h/4\pi mc^2$; (b) $R_{max} = c\,\Delta t = h/4\pi mc$;
(c) 49.4 MeV, which is of the right order of
magnitude for the pion rest mass


29.  (c) $n(\nu)\,d\nu = \dfrac{8\pi\nu^2}{c^3}\,\dfrac{1}{e^{h\nu/kT} - 1}\,d\nu$


33.  (a)  Total  energy  1.44  MeV,  kinetic  energy 
0.93  MeV;  (b)  1.14  MeV 

35.  (a) $\theta = 26.57°$, $d = 1.79 \times 10^{-10}$ m; (b) $2.03 \times 10^6$ m/s;
(c) The possible values of $\theta$ are given
by $\sin\theta = j(7.57 \times 10^{-3})$. The first three values
are 0.434°, 0.868°, and 1.301°.


CHAPTER  31 


1.  (a)  13.5  eV;  (b)  1.5  eV 

3.  365  nm 

7.  $2.19 \times 10^{-20}$ J = 0.137 eV

9.  (a) 0.705 Hz; (b) $1.76 \times 10^{-2}$ J; (c) $3.78 \times 10^{31}$;
{d)  4.67  x 1 0— 34  J ; (e)  Amplitude  — 5 x 10_5m, 
which  is  four  orders  of  magnitude  smaller  than 
the  size  of  an  atom.  Experimental  detection  of 
energy  quantization  is  not  possible  in  such  a 
system. 

13.  (a)  En  = + 6«o f2)2;  (b)  En  = 1.5  m0c2 

for  n = 2.24  (^y)  L:  (i)  1700,  (ii)  92,  (iii)  1 


17.  (a)  U = Mr2;  (b)  v = r\fkjm\ 
(c)  rn  = 


nh 


W)  K = £ 


2rr\/mk 
k 


nh\fk 


:ttm 


3/2 


= nh  1 


m 


21.  {b)  x = ± 


; (c)  16  percent 


23.  $\nu = 1.32 \times 10^{14}$ Hz, $k/\mu = 6.87 \times 10^{29}$ s$^{-2}$, $\mu = 8.35 \times 10^{-28}$ kg, $k = 574$ N/m

25.  E = -ne4/8e20h2 




Index 


Aberration: 

chromatic,  1401 
monochromatic, 1401
Absolute  temperature,  748,  786 
Absolute  zero,  748 
Ac  circuit,  1211 
parallel,  1245 
series,  1235 
Ac  source,  1235 
Achromatic  doublet,  1429 
Action,  principle  of  least,  1377 
Adiabatic,  860 
Adiabatic  process,  856 
Airspeed,  735 
Airspeed  indicator,  734 
Alkali metal atom, 1500
Alpha  particle,  907 
Alternating current (see Ac)
Ammeter, 1053, 1146, 1148
Ampere,  903,  1019 
Amperean  current,  induced,  1153 
Amperean curve, 1102
Amperean surface current, 1150
Ampere's conjecture, 1150
Ampere’s  law,  1101,  1106 

Maxwell's  generalization  of,  1269 
Analyzer,  1365 
Angle: 

Brewster’s,  1367 
critical,  1335 
of  incidence,  1318,  1329 
phase,  1232 
phase  difference,  1232 
of  reflection,  1318 
of  refraction,  1329 
scattering,  913 
Angular  frequency,  1223 
Angular  magnification: 

of  astronomical  telescope,  1399 
of  magnifying  glass,  1398 


Anisotropic,  698 
Anode,  1014 
Antenna: 
dipole,  1287 
half-wave  dipole,  1288 
transmitting, 1305
Antinodes,  line  of,  1342 
Antiparticle,  906 
Antiproton,  906 

Astronomical refracting telescope, 1398
Atmosphere,  709 
Atomic  number,  916 
Atomic  weight,  799 
Avogadro's  number,  755 


Back emf, 1194
Ballistic  galvanometer,  1035 
Balmer  formula,  1497 
Balmer  series,  1499 
Bandwidth,  1474 
Bar,  709 

Bar  magnet,  1061 
Barometer, 712
Beam,  parallel,  1378 
Bernoulli’s  theorem,  730,  731 
Binding  energy,  1439,  1496 
Binoculars,  1400 
Biot-Savart  law,  1090 
Bohr magneton, 1158
Bohr  model,  1492 
Boltzmann  factor,  815 
Boltzmann’s  constant,  753 
Born's postulate, 1456, 1490
Boyle’s  law,  714 
Box  camera,  1391 
Bragg’s  law,  1444 
Branch  of  circuit,  1051 
Brewster’s  angle,  1367 
Brewster's law, 1367




Bridge,  Wheatstone,  1053 

Brownian  motion,  837 

Bulk  expansion  coefficient,  759 

Bulk  modulus,  715 

Bulk  modulus,  isothermal,  714 


Calorimeter,  767 
Calorimetry,  764 
Camera,  box,  1391 
Capacitance,  989 
Capacitive  circuit,  1242 
Capacitive reactance, 1242
Capacitor,  989 

plane-parallel,  989 
Carnot  cycle,  871 
Carnot  efficiency,  873 
Carnot  engine,  871 
Carnot’s  theorem,  873 
Cathode,  1014 
Cathode-ray tube, 1077
Cavitation,  735 
Celsius  temperature,  746 
Change  of  phase,  765 
Charge: 

electric,  894 
electron,  905 
negative,  897 
point,  902 
positive,  897 
source,  917 
test,  917 

Charge  carrier,  1034 
for  metals,  1036 
Charge  density,  mobile,  1020 
Charged  negatively,  897 
Charged  positively,  897 
Charles'  law,  747,  749 
Chromatic  aberration,  1401 
Circuit,  1018 
ac,  1211 

capacitive,  1242 
electric,  1018 
inductive,  1243 
LC,  1221 
LRC,  1226 
parallel  ac,  1245 
primary,  1258 
RC,  1217 
resistive,  1242 
RL, 1212
secondary, 1258
series ac, 1235
Circuit analysis, dc, 1048
Circuit  diagram,  1049 
Circuit  element,  1045 
Circularly  polarized,  1312 
Circulation, 1102
electric, 1102, 1185
magnetic, 1102
Clausius  equation  of  state,  791 

Clausius  statement  of  second  law  of  thermodynamics,  876 
Coefficient: 

of  bulk  expansion,  759 
of  drag,  723 
Hall,  1083 

of  linear  expansion,  757 
of  performance,  heat  pump,  878 
of  performance,  refrigerator,  878 
of  viscosity,  710 
of  volume  expansion,  759 
of  volume  expansion,  liquids,  759 
Coils: 

field,  1193 
Helmholtz,  1121 
Collimator,  1339 
Collision cross section, 836
Color,  wavelength  associated  with,  1337 
Commutator,  1148 
Complex  conjugate,  1508 
Compressible,  defined,  710 


Compressibility,  715,  890 
Compression  ratio,  884 
Compressive  stress,  701 
Compton  equation,  1449 
Compton  scattering,  1451 
Compton  shift,  1449 
Compton  wavelength,  1449 
Concentration,  1082 
Conductance,  1024 
Conductivity,  electrical,  1026 
for  metals,  1029 
Conductor, 897
ohmic,  1028 
Contact  potential,  1440 
Continental drift, 1165
Continuity  equation,  727 
Continuous  medium,  697 
Converging  lens,  1386 
Coulomb,  903 
Coulomb’s  law,  902 
Couple, 1168
Critical angle, 1335
Critical Reynolds number, 724, 725
Cross section, 1313
collision, 836
Crossed field, 1079
Curie temperature, 1164

of ferromagnetic materials, 1164
Curie-Weiss law, 1164
Curie’s constant, 1160
Curie's law, 1161
Current: 

alternating,  1235 
amperean surface, 1150
electric, 903, 1019
induced, 1173
induced amperean, 1153
magnetizing,  1258 
primary,  1258 
pulsating  direct,  1192 
saturation,  1436 
Current  density,  1027 
Current  lines,  1020 
Cutoff  frequency,  1437 
Cycle,  Carnot,  871 
Cyclotron,  1077 
Cyclotron  frequency,  1072 
Cyclotron  period,  1072 
Cyclotron  resonance,  1073 
Cylindrical  lens,  1377 


Dc, pulsating, 1192
Dc circuit analysis, 1048
Dc source, 1047
De  Broglie  relation,  1456 
De  Broglie  wavelength,  1458 
Decay  curve,  1217 
Degree, 745

Demagnetizing field, 1153
Density,  710 
current,  1027 

Density-of-states  factor,  820 
Depolarization  field,  1004 
Diamagnetism,  1152 
Dichroic,  1364 
Dichroism,  1363 
Dielectric,  1002 
Dielectric constant, 1005
Dielectric strength, 1005
Difference equation, 985
Diffraction,  1340,  1341 
Fraunhofer,  1344 
Fresnel,  1344 
multislit, 1349
single-slit,  1355 
two-slit,  1341 
Diffraction  grating,  1352 
Diffraction  maxima,  1342 
Diffraction  minima,  1342 
Diffraction  pattern,  1342 




Diopter, 1426
Dipole: 

electric,  965 
electric,  induced,  1002 
magnetic, 1063, 1140
Dipole antenna, 1287
half-wave, 1288
Dipole  moment: 
electric,  971 
induced, 1153
magnetic, 1095
magnetic, of bar magnet, 1141
magnetic, of circular loop, 1140
magnetic, of planar loop, 1145
Dipole moment magnitude, electric, 967
Dipole transmitting antenna, 1305
Direct  current  (see  Dc) 

Directrix, 1320
Dispersion,  1337,  1352 
Dispersion  curve,  1338 
Dispersive,  1337 
Dispersive  power,  1429 
Displacement  current,  1269 
Distribution  function,  800 
Diverging  lens,  1386 
Domain, ferromagnetic, 1162
Domain structure, 1162
Doublet, achromatic, 1429
Drag, coefficient of, 723
Drag, viscous, 718
Drift  velocity,  1021 
Driving  frequency,  1235 
Drude-Lorentz  theory,  1043 
Dulong-Petit  law,  799 


Effective  mass,  1074 
Efficiency: 

Carnot,  873 
thermodynamic,  869 
Eigenfunction,  1570 
Einstein-de  Broglie  relations,  1456 
Electric  charge,  894 
Electric  circuit,  1018 
Electric circulation, 1102, 1185
Electric  current,  903,  1019 
Electric  dipole,  965 
induced,  1002 
Electric dipole moment, 971
magnitude, 967
Electric field, 917
induced, 1185

Electric field energy density, 998
Electric  field  lines,  924 
Electric  flux,  927 
Electric  flux  element,  927 
Electric  force,  895 
Electric motor, 1148
Electric potential, 948
Electric potential difference, 988
Electric potential energy, 946
Electric power input, 1045
Electric resistance, 1025
Electrical  conductivity,  1026 
for  metals,  1029 

Electrically  neutral,  defined,  895 
Electrode,  975 
Electromagnet,  1163 
Electromagnetic energy flux vector, 1290
Electromagnetic field, 1266
Electromagnetic force, 894
Electromagnetic induction, 1173
Electromagnetic  radiation,  1286 
Electromagnetic  spectrum,  1286,  1443 
Electromagnetic  wave  equations,  1280 
Electromotive  force  (see  Emf) 

Electron, 904
Electron charge, 905
Electron microscope, 1464
Electron-spin magnetic moment, 1158
Electron-volt,  950 


Element  of  a matrix,  1407 
Emf,  1014 
back, 1194
source of, 1014
Energy:

binding, 1439, 1496
electric  potential,  946 
internal,  843 

orientational  potential,  971 
zero  point,  1488 
Energy  density,  784,  1294 
electric  field,  998 
magnetic  field,  1203 
total,  1289 
Energy  flux,  1290 

Energy  flux  vector,  electromagnetic,  1290 
Energy  level,  1488 
Energy  quantization,  1485 
Energy  quantization  formula: 
for  harmonic  oscillator,  1519 
for  hydrogen  atom,  1495 
for  particle  in  a box,  1488 
Engine: 

Carnot,  871 

external-combustion,  882 
heat,  848 

internal-combustion,  884 
Engine  cycle,  848 
Entropy,  830 

Equation  of  state,  756,  847 
ideal  gas,  754 
van  der  Waals,  891 
Equilibrium: 

hydrostatic,  706 
thermal,  789 

Equilibrium  macrostate,  829 

Equilibrium  thermodynamics,  842 

Equipartition  of  energy,  theorem  of,  795,  797 

Equipotential,  959 

Equipotential  surface,  959 

Equivalent  resistance,  1048 

Exchange interaction, 1162

Excited state, 1488

Expansion,  Joule-Kelvin,  880,  881 

Experimental  simulation,  802 

External-combustion  engine,  882 

External  parameter,  843 

Eye,  1395 

Eye glasses, 1394

Eye piece lens, 1399


Fahrenheit  temperature,  745 
Farad, 989 

Faraday’s  constant,  1016 
Faraday’s law, 1177
Fermat’s principle, 1375
Ferromagnetic domain, 1162
Ferromagnetism, 1152, 1161
Fiber  optic  bundle,  1337 
Field(s): 

crossed,  1079 
demagnetizing, 1153
depolarization,  1004 
electric,  917 
electromagnetic,  1266 
induced electric, 1185
magnetic, 1063, 1066
steady, Maxwell’s equations for, 1107
Field coils, 1193
Field  energy  density: 
electric,  998 
magnetic,  1203 
Field  lines: 
electric,  924 
magnetic,  1064 

Finite square-well potential, 1525
First law of thermodynamics, 834, 844
Fixed  point,  745 
Flow,  706 
laminar,  719 



turbulent,  723 
Fluid,  706 
ideal,  725 
Flux,  726 
electric,  927 
energy,  1290 
magnetic,  1097 
mass,  726 
F-number,  1426 
Focal  length,  1384 
Focal  point,  1378 
Focus,  1377 

of  parabola,  1320 
Force: 

electric, 895
electromagnetic, 894
electromotive  (see  emf) 
Lorentz,  1077 
magnetic,  1066 
Four-stroke  Otto  cycle,  884 
Fourier  integral,  1483 
Fraunhofer  diffraction,  1344 
Free  space: 

permeability  of,  1091 
permittivity  of,  932 
Free-stream  speed,  722 
Free-stream  velocity,  721 
Frequency: 
angular,  1223 
cutoff,  1437 
cyclotron,  1072 
driving,  1235 
natural,  1234 
resonant,  1240,  1246 
Fresnel diffraction, 1344


Galilean telescope, 1429
Galvanometer, 1054, 1146
ballistic, 1035
Gas, 710

Gas  constant,  universal,  755 
Gauss’  law,  931 

for  magnetic  field,  1098 
Gaussian  surface,  932 
Gay-Lussac’s law, 747
Generator, Van de Graaff, 1013
Geometric  optics,  1374 
Grating: 

diffraction,  1352 
master,  1352 
reflection, 1352
replica, 1353
transmission, 1352
Grating  spacing,  1352 
Ground  state,  1488 
Guard  ring,  992 


Half-wave  dipole  antenna,  1288 
transmitting,  1305 

Half-width  at  half-maximum  intensity,  1351 

Hall  angle,  1118,  1119 

Hall  coefficient,  1083 

Hall  effect,  1081 

Hall  voltage,  1082 

Harmonic  oscillator,  energy  quantization  formula  for,  1519 

Heat,  763 

Heat  capacity,  761 

per  atom  for  solids,  798 
molar at constant pressure, 851
molar  at  constant  volume,  844,  850 
molecular,  792 
Heat  engine,  848 
Heat  engine  cycle,  848 
Heat  pump,  873 

coefficient  of  performance,  878 
Heat  sink,  867 
Heat  source,  867 

Helium  atom,  singly  ionized,  1500 


Helmholtz  coils,  1121 
Henry, 1196
Hole,  1075 

Huygens’ construction,  1315 
Hydraulic press, 737

Hydrogen  atom,  energy  quantization  formula  for,  1495 
Hydrostatic  equilibrium,  706 
Hydrostatic  pressure,  709 


Ideal  gas,  748 

equation  of  state  for,  754 
Ideal-gas  law,  754,  755 
Ideal-gas  model,  777 
Ideal  fluid,  725 
Ideal  wake,  724 
Image,  1388 
real,  1393 
virtual,  1393 
Image  distance,  1388 
Image  plane,  1388 
Images,  method  of,  1009 
Imaginary  number,  1507 
Impact  parameter,  909 
Impedance, 1251
Incidence, angle of, 1318, 1329
Incompressible, defined, 710
Independence requirement, 809
Index  of  refraction,  1327 
for  various  substances,  1328 
Induced amperean current, 1153
Induced current, 1173
Induced dipole moment, 1153
Induced electric dipole, 1002
Induced electric field, 1185
Induced magnetism, 1062, 1152
Inductance, 1199
mutual, 1196
Induction: 

electromagnetic, 1173
magnetic,  1066 
Inductive  circuit,  1243 
Inductive  reactance,  1243 
Inductor, 1199
Infrared, 1314
Insulator, 897
Intensity, 1350
Interaction:

exchange, 1162
thermal,  843 
thermodynamic,  842 
Interference,  1341 
order  of,  1344 
Interferometer,  stellar,  1372 
Internal-combustion  engine,  884 
Internal  energy,  843 
Internal reflection, total, 1334
Internal  resistance,  1048 
Ionize,  1496 
IR  drop,  1052 

Iron-core  electromagnet,  1163 
Irreversible  process,  844 
Isentropic,  857 
Isobaric,  defined,  846 
Isometric,  defined,  844 
Isotherm,  855 
Isothermal,  defined,  855 
Isothermal  bulk  modulus,  714 
Isotope,  1080 
Isotropic,  699 
Isotropic scattering, 1038


Joule-Kelvin  expansion,  880,  881 
Joule’s  law,  1045 


Kelvin,  748 

Kelvin-Planck  statement  of  second  law  of  thermodynamics,  877 
Kelvin  temperature,  748 
Kilocalorie,  762 




Kilomole,  754 
Kinetic  theory,  778 
Kirchhoff's rules, 1051


Lamina, 718
Laminar flow, 719
Laplace’s  equation,  978 
iterative  solution  of,  986 
Laser,  1310 
Latent  heat,  766 
Lateral  magnification,  1390 
Law  of  reflection,  1318 
Law  of  refraction,  1331 
LC  circuit,  1221 
Least  action,  principle  of,  1377 
Lens: 

converging,  1386 
cylindrical,  1377 
diverging,  1386 
spherical,  1378 
telephoto,  1430 
thin,  1381 
zoom,  1424 

Lens  maker's  formula,  1384 
Lenz’s law, 1181
Lifetime,  1500 
Light,  1314 

Light-gathering  ability,  1400 

Light  pipe,  1336 

Light  ray  matrix,  1410 

Lightning,  983 

Lightning  rod,  983 

Like  poles,  1062 

Line(s): 

of  antinodes,  1342 
of  grating,  1352 
of  nodes,  1342 
of  spectrum,  1497 
Line  spectrum,  1497 
Linear  expansion,  coefficient  of,  757 
Linearly  polarized,  1312 
Liquid,  710 
Loop  of  circuit,  1051 
Loop  rule,  1051 
Lorentz  force,  1077 
LRC circuit, 1226
Lyman  series,  1499 


Macrostate,  808 
equilibrium,  829 
Magnet: 
bar,  1061 
permanent,  1061 
Magnetic  circulation,  1102 
Magnetic  dipole,  1063,  1140 
Magnetic  dipole  moment,  1095 
of  bar  magnet,  1141 
of circular loop, 1140
of planar loop, 1145
Magnetic  field,  1063,  1066 
Gauss’  law  for,  1098 
for  various  circumstances,  1067 
Magnetic  field  energy  density,  1203 
Magnetic  field  lines,  1064 
right-hand  rule  for,  1086 
Magnetic  flux,  1097 
Magnetic  flux  element,  1097 
Magnetic  force,  1066 
Magnetic  induction,  1066 
Magnetic moment, electron-spin, 1158
Magnetic monopole, 1064
Magnetic pole strength, 1140
Magnetic remanence, 1163
Magnetic susceptibility, 1155
Magnetism, induced, 1062, 1152
Magnetization,  1154 
Magnetizing  current,  1258 
Magneto,  1191 

Magnetometer, search coil, 1179


Magnification: 

angular, of astronomical telescope, 1399
angular,  of  magnifying  glass,  1398 
lateral,  1390 

overall,  of  microscope,  1404 
Magnifying  glass,  1396 
Mass,  effective,  1074 
Mass  flux,  726 
Mass  spectroscopy,  1080 
Master grating, 1352
Matrix: 

light  ray,  1410 
refraction,  1410 
system,  1412,  1413 
thin-lens,  1424 
transmission,  1410 
two-by-one,  1407 
two-by-two,  1407 
Matrix  equation,  1407 
Matrix  multiplication,  1407 
Matrix  notation,  1407 
Matter  wave,  1456 
Matthiessen's rule, 1044
Maxwell-Boltzmann speed distribution, 820
Maxwell’s equations, 1273
for steady fields, 1107
in  vacuum,  1274 

Maxwell’s  generalization  of  Ampere’s  law,  1269 
Mean  free  path,  836,  1041 
Mean  scattering  time,  1039 
Mechanics: 

quantum, 1505
statistical, 800
Meson theory, 1482
Method of images, 1009
Microscope,  1402 
electron,  1464 
oil  immersion,  1427 
overall  magnification  of,  1404 
Microstate,  807,  808 
Mirror: 

parabolic,  1320 
plane,  1318 

Mobile  charge  density,  1020 
Model,  776 

Modulus  of  rigidity,  705 
Molar  heat  capacity: 

at constant pressure, 851
at  constant  volume,  844,  850 
Molecular  heat  capacity,  792 
at  constant  volume,  793 
for monatomic gases, 793
for polyatomic gases, 794
Molecular weight, 822, 1017
Moment: 

electric dipole, 971
induced dipole, 1153
magnetic, of electron-spin, 1158
magnetic dipole, 1095
magnetic dipole, of bar magnet, 1141
magnetic dipole, of circular loop, 1140
magnetic dipole, of planar loop, 1145
Momentum density, 1294
Monochromatic aberration, 1401
Monopole,  magnetic,  1064 
Most  probable  speed,  822 
Motor, electric, 1148
Multislit diffraction, 1349
Muonic atom, 1500
Mutual inductance, 1196


Nanometer,  1328 

Natural  frequency,  1234 

Negative  charge,  897 

Net reactive voltage phasor, 1244

Network,  1048 

Node,  of  circuit,  1051 

Node  rule,  1051 

Nodes, line of, 1342

Nonquasistatic,  845 




Normalization  condition,  806,  1490 
Normalize,  806 

Normalized  a priori  probability,  807 
North  pole  of  magnet,  1062 
Nucleus,  907 
Null  instrument,  1054 


Object,  virtual,  1394 
Object  distance,  1388 
Object  plane,  1388 
Objective lens, 1398
Ohm,  1025 

Ohmic  conductor,  1028 

Ohmic  system,  1025 

Ohm’s  law,  1025 

Oil  immersion  microscope,  1427 

Optics: 

geometric,  1374 
ray,  1374 

ray, basic law of, 1374
wave,  1315 

Order  of  interference,  1344 
Orientational  potential  energy,  971 

Oscillator,  harmonic,  energy  quantization  formula  for,  1519 
Otto  cycle,  four-stroke,  884 


Parabolic  mirror,  1320 
Parallel  ac  circuit,  1245 
Parallel  beam,  1378 
Parallel connection, 994, 1048
Paramagnetism, 1152, 1157
Parameter:
external, 843
impact, 909
Paraxial ray, 1380

Particle-in-a-box  energy  quantization  formula,  1488 
Particle-wave  duality,  1454 
Partition  function,  837 
Pascal,  701 
Pascal-second,  719 
Pascal’s  law,  709 
Pascal’s  principle,  709 
Paschen  series,  1499 
Peak-to-peak  voltage,  1256 
Peak voltage, 1256
Permanent  magnet,  1061 
Permanent  magnetism,  1152 
Permeability,  1153 
of  free  space,  1091 
Permittivity  of  free  space,  932 
Perpetual  motion  machine: 
of  first  kind,  870 
of  second  kind,  877 
Phase,  of  matter,  746 
Phase  angle,  1232 
Phase  change,  763 
Phase  difference,  1359 
Phase  difference  angle,  1232 
Phasor,  1241 

net  reactive  voltage,  1244 
Phasor  diagram,  1242 
Phenomenological,  743 
Photocell, 1436
Photoelectric  effect,  1436 
Photoelectron,  1436 
Photon,  1439 
Photon  correlation,  1452 
Planck's  constant,  1439,  1444 
Plane  mirror,  1318 
Plane-parallel  capacitor,  989 
Plane  wave,  1275 
Point  charge,  902 
Poiseuille’s  law,  741 
Poisson's  equation,  1010 
Poisson’s  ratio,  704 
Polarity,  1048 
Polarized: 

circularly,  1312 
linearly,  1312 



Polarized  wave,  1277 

Polarizer,  1365 

Pole,  1061 

Pole face, 1066

Pole  strength,  magnetic,  1141 

Poles: 

like,  1062 
unlike,  1062 

Position-momentum  uncertainty  principle,  1465 
Positive  charge,  897 
Positron,  906 
Positronium,  1500 
Potential,  1012 
contact,  1440 
electric,  948 
finite square-well, 1525
stopping,  1436 
Potential  difference,  1012 
electric,  988 
Potential  energy: 
electric,  946 
orientational,  971 
Potentiometer,  1058 
Power: 

average,  1256 
radiated,  1303 
Power factor, 1257
Power  input,  1045 
Poynting  vector,  1290 
Pressure,  708 
hydrostatic,  709 
radiation,  1294 
stagnation,  734 
Pressure head, 711
Primary  circuit,  1258 
Primary  current,  1258 
Primary  voltage,  1258 
Primary  winding,  1258 
Principle  of  least  action,  1377 
Prism  spectrometer,  1339 
Probability,  807 

normalized  a priori,  807 
postulate  of  equal  a priori,  801 
Probability  density,  1490 
Proton,  905 
Pulsating  dc,  1192 
p-V  diagram,  847 


Q factor,  1252 
Quantity  of  heat,  763 
Quantization,  energy,  1485 
Quantization  formula: 

harmonic oscillator, 1519
hydrogen  atom,  1495 
particle-in-a-box,  1488 
Quantized,  defined,  1436,  1485,  1487 
Quantum  mechanics,  1505 
Quantum  number,  1487 
Quantum  state,  1488 
Quasistatic,  845 

Radiated power, 1302
Radiation: 

electromagnetic,  1286 
synchrotron,  1308 
thermal,  1435 
Radiation  pressure,  1294 
Radiation  resistance,  1306 
Ray, 1373

paraxial,  1380 
Ray  optics,  1374 
basic  law  of,  1374 
Rayleigh  scattering,  1451 
Rayleigh’s  criterion  for  resolution,  1400 
RC  circuit,  1217 
Reactance,  1249 
capacitive,  1242 
inductive,  1243 

Reactive  voltage  phasor,  net,  1244 


Real  image,  1393 
Reflecting telescope, 1402
Reflection: 

angle  of,  1318 
law  of,  1318 
total  internal,  1334 
Reflection  grating,  1352 
Refracting  telescope,  astronomical,  1398 
Refraction,  1328 
angle  of,  1329 
index  of,  1327 

index of, for various substances, 1328
law of, 1331

Refraction  matrix,  1410 
Refrigerator,  878 

Refrigerator coefficient of performance, 878
Rejected heat, 867
Relative permeability, 1153
Remanence, magnetic, 1163
Replica  grating,  1353 
Resistance: 
electric,  1025 
equivalent,  1048 
internal,  1048 
radiation, 1306
Resistive  circuit,  1242 
Resistivity,  1027 
for  metals,  1029 
Resistor,  1046 

Resolution,  Rayleigh’s  criterion  for,  1400 
Resolving  power,  1352 
Resonance,  1240,  1251 
cyclotron,  1073 
Resonance  condition,  1246 
Resonance  curve,  1252 
Resonant  frequency,  1240,  1246 
Reversibility,  1322 
Reversible process, 845
Reynolds  number,  724 
critical,  724,  725 
Right-hand  rule: 

for  magnetic  field  line,  1086 
for surface element vector, 1106
Rigidity,  modulus  of,  705 
RL  circuit,  1212 

Root-mean-square,  defined,  1255 
Root-mean-square speed, 822
Root-mean-square  voltage,  1255 
Rydberg  constant  for  hydrogen,  1497 

Saturation of magnetization, 1158, 1159
Saturation  current,  1436 
Scattered  particle,  907 
Scattering: 

Compton,  1451 
isotropic,  1038 
Rayleigh,  1451 
Thomson,  1313 
Scattering angle, 913
Scattering time, mean, 1039
Schrödinger equation, 1507
time-dependent, 1507
time-independent, 1507
Sea-floor spreading, 1165
Search-coil  magnetometer,  1179 
Second  law  of  thermodynamics,  830,  831 
Clausius  statement  of,  876 
Kelvin-Planck statement of, 877
Secondary circuit, 1258
Secondary winding, 1258
Self-inductance, 1199
Series  ac  circuit,  1235 
Series  connection,  994,  1048 
Series  limit,  1522 
Shear,  705 
Shear  modulus,  705 
Shear  strain,  705 
Shear  stress,  705 
Siemens,  1024 

Simulation,  experimental,  802 


Single-object state, 811
Single-slit  diffraction,  1355 
Singly  ionized  helium  atom,  1500 
Sink  of  fluid,  727 
Sink  temperature,  868 
Skin  effect,  1263 
Slip ring, 1192
Snell's law, 1331
Solenoid, 1107
Source: 
ac,  1235 
dc,  1047 
of  emf,  1014 
of  fluid,  727 
steady,  1047 
steady  of  dc,  1047 
Source  charge,  917 
Source  temperature,  868 
Sourcelet,  1315 
South  pole  of  magnet,  1062 
Specific  heat  capacity,  763 
Specific  heat  ratio,  763,  764,  853 
for  gases,  854 
Spectrometer, prism, 1339
Spectroscopy, 1339
mass, 1080
Spectrum:

electromagnetic, 1286, 1443
line, 1497

Spherical lens, 1378
Square-well  potential,  finite,  1525 
Stagnation  point,  734 
Stagnation  pressure,  734 
State: 

excited,  1488 
ground,  1488 
quantum,  1488 
of  system,  756,  842 
State  variable,  848 
Static, 1185

Statistical mechanics, 800
Steady fields, Maxwell’s equations for, 1107
Steady  source,  1047 
of  dc,  1047 
Steady  state,  726 
Steady-state  equation,  730 
Stellar  interferometer,  1372 
Stokes'  law,  722 
Stop in optical system, 1383
Stopping  potential,  1436 
Strain,  699 
shear,  705 
Streamlines,  722 
Stress,  701 

compressive,  701 
shear,  705 
tensile,  701 
uniaxial,  701 
Strong  focusing,  1430 
Superconductor,  1029 
Surface: 

equipotential,  959 
gaussian,  932 

Surface current, amperean, 1150
Surface element vector, 927
right-hand rule for, 1106
Susceptibility, magnetic, 1155
Synchrotron  radiation,  1308 
System  matrix,  1412,  1413 


Telephoto lens, 1430
Telescope: 

Galilean,  1429 
reflecting,  1402 
refracting,  1398 
Temperature: 
absolute,  786 
Celsius,  746 
Curie, 1164
Fahrenheit,  745 



Kelvin,  748 
thermodynamic,  892 

Temperature  coefficient  of  resistivity,  1029 
for metals, 1029
Tensile  stress,  701 
Terminal  of  emf  source,  1014 
Tesla,  1067 
Test  charge,  917 
Thermal  equilibrium,  789 
Thermal  interaction,  843 
Thermal  radiation,  1435 
Thermocouple,  1025 
Thermodynamic  efficiency,  869 
Thermodynamic  interaction,  842 
Thermodynamic  temperature,  892 
Thermodynamics: 
equilibrium,  842 
first  law  of,  834,  844 
second law of, 830, 831, 876, 877
third law of, 885
zeroth law of, 790
Thin lens, 1381
Thin-lens  matrix,  1424 
Third  law  of  thermodynamics,  885 
Thomson  scattering,  1313 
Time  constant,  1214,  1218 
Time-dependent  wave  function,  1507 
Time-dependent Schrödinger equation, 1507
Time-energy uncertainty principle, 1466
Time-independent Schrödinger equation, 1507
Tolman-Stewart  experiment,  1034,  1036 
Toroid,  1111 
Torricelli's  theorem,  732 
Torsion  balance,  899 
Total energy density, 1289
Total  internal  reflection,  1334 
Transformer,  1257 
Transmission  grating,  1352 
Transmission  matrix,  1410 
Transverse  wave,  1275,  1276 
Triple point, 746
Tube  of  flow,  726 
Turbulent  flow,  723 
Two-by-one  matrix,  1407 
Two-by-two  matrix,  1407 
Two-slit  diffraction,  1341 


Ultraviolet,  1314 
Uncertainty,  1465 
Uncertainty  principle: 

position-momentum, 1465
time-energy, 1466
Uniaxial stress, 701
Universal gas constant, 755
Unlike  poles,  1062 
Unmagnetized,  1062 


Valence,  1015 

Van de Graaff generator, 1013
Van der Waals equation of state, 891
Van Leeuwen's theorem, 1157
Vector: 

electromagnetic  energy  flux,  1290 
Poynting,  1290 
surface  element,  927 


Velocity,  drift,  1021 
Venturi  effect,  732 
Venturi  meter,  733 
Vertex  of  optical  system,  1409 
Virtual  plane,  1409 
Virtual  event,  1482 
Virtual  image,  1393 
Virtual  level,  1526 
Virtual  object,  1394 
Viscometer,  720 
Viscosity,  719 

coefficient of, 719
Viscous  drag,  718 
Volt,  949 
Voltage,  1025 
Hall, 1082
peak,  1256 
peak-to-peak,  1256 
primary,  1258 
rms,  1255 
Voltage  drop,  1052 
Voltage phasor, net reactive, 1244
Voltaic cell, 1015
Voltmeter, 1146, 1148
Volume  expansion,  coefficient  of,  759 
for  liquids,  759 
Vortex, 723


Wake,  723 
ideal,  724 
Wave: 

matter,  1456 
plane,  1275 
polarized,  1277 
transverse,  1276 

Wave  equations,  electromagnetic,  1280 
Wave  front,  1275 
Wave  function,  1476 
time-dependent,  1507 
Wave group, 1476
Wave  optics,  1315 

Wave-particle  duality  (see  Particle-wave  duality) 
Wavelength: 

association  with  color,  1337 
Compton,  1449 
de  Broglie,  1458 
Wavelet,  1315 
Weber,  1097 
Wheatstone  bridge,  1053 
Width: 

half,  at  half-maximum  intensity,  1351 
of  diffraction  maximum,  1351 
of  wave  group,  1471 
Work  function,  1439 
Working  fluid,  848 


Young's  modulus,  701,  702 
Yukawa’s  meson  theory,  1482 


Zero-point  energy,  1488 
Zero-point  motion,  1488 
Zeroth  law  of  thermodynamics,  790 
Zoom  lens,  1424 




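Important Physical Constants*

Quantity                                            Symbol                          Value
Universal gravitational constant                    $G$                             $6.67 \times 10^{-11}\ \mathrm{N \cdot m^2/kg^2}$
Speed of light                                      $c$                             $3.00 \times 10^{8}\ \mathrm{m/s}$
Permeability of free space                          $\mu_0$                         $4\pi \times 10^{-7}\ \mathrm{T \cdot m/A}$ (by definition)
                                                    $\mu_0/4\pi$                    $1 \times 10^{-7}\ \mathrm{T \cdot m/A}$
Permittivity of free space                          $\epsilon_0\ (= 1/\mu_0 c^2)$   $8.85 \times 10^{-12}\ \mathrm{C^2/N \cdot m^2}$
                                                    $1/4\pi\epsilon_0$              $8.99 \times 10^{9}\ \mathrm{N \cdot m^2/C^2}$
Elementary charge (magnitude of electron charge)    $e$                             $1.60 \times 10^{-19}\ \mathrm{C}$
Boltzmann's constant                                $k$                             $1.38 \times 10^{-23}\ \mathrm{J/K}$
Avogadro’s number                                   $A$                             $6.02 \times 10^{26}\ \mathrm{kmol^{-1}}$
Universal gas constant                              $R\ (= Ak)$                     $8.31 \times 10^{3}\ \mathrm{J/kmol \cdot K}$
Faraday’s constant                                  $\mathcal{F}\ (= Ae)$           $9.65 \times 10^{7}\ \mathrm{C/kmol}$
Electron rest mass                                  $m_e$                           $9.11 \times 10^{-31}\ \mathrm{kg}$
Electron rest mass energy                           $m_e c^2$                       $5.11 \times 10^{5}\ \mathrm{eV}$
Electron charge/mass ratio                          $e/m_e$                         $1.76 \times 10^{11}\ \mathrm{C/kg}$
Proton rest mass                                    $m_p$                           $1.67 \times 10^{-27}\ \mathrm{kg}$
Proton rest mass energy                             $m_p c^2$                       $9.38 \times 10^{8}\ \mathrm{eV}$
Planck's constant                                   $h$                             $6.63 \times 10^{-34}\ \mathrm{J \cdot s}$
Bohr magneton                                       $\mu_B$                         $9.27 \times 10^{-24}\ \mathrm{A \cdot m^2}$

* More precise values are cited in text; see index.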
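
Three entries in the table of constants are defined in terms of others: the permittivity of free space as 1/(μ0 c²), the universal gas constant as Ak, and Faraday's constant as Ae. The following is a minimal Python sketch, assuming the rounded three-figure values listed in the table, that recomputes those entries so they can be compared with the tabulated numbers. The function name `derived_constants` and the variable names are illustrative and do not come from the text.

```python
import math

# Rounded values copied from the table of constants (three significant figures).
MU_0 = 4 * math.pi * 1e-7   # permeability of free space, T·m/A
C = 3.00e8                  # speed of light, m/s
E_CHARGE = 1.60e-19         # elementary charge, C
K_B = 1.38e-23              # Boltzmann's constant, J/K
N_A = 6.02e26               # Avogadro's number, kmol^-1


def derived_constants():
    """Recompute the table entries that are defined from other entries."""
    epsilon_0 = 1.0 / (MU_0 * C**2)               # permittivity of free space, C^2/N·m^2
    coulomb_k = 1.0 / (4 * math.pi * epsilon_0)   # 1/(4 pi epsilon_0), N·m^2/C^2
    gas_constant = N_A * K_B                      # R = A k, J/kmol·K
    faraday = N_A * E_CHARGE                      # Faraday's constant = A e, C/kmol
    return epsilon_0, coulomb_k, gas_constant, faraday


if __name__ == "__main__":
    names = ("epsilon_0", "1/(4 pi epsilon_0)", "R", "Faraday's constant")
    for name, value in zip(names, derived_constants()):
        print(f"{name} = {value:.3e}")
```

Because the inputs are rounded, the recomputed values should be expected to match the table only to within a few parts in a thousand, not exactly.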

Mathematical Symbols

$\propto$              is proportional to
$=$                    is equal to
$\neq$                 is not equal to
$\approx$              is approximately equal to
$\equiv$               is identical to
$>$                    is greater than
$<$                    is less than
$\geq$                 is greater than or equal to
$\leq$                 is less than or equal to
$|z|$                  the absolute value of z
$\langle z \rangle$    the average value of z


Powers-of-Ten Notation

Power of ten    Equivalent value        Prefix    Symbol
$10^{-12}$      0.000 000 000 001       pico      p
$10^{-9}$       0.000 000 001           nano      n
$10^{-6}$       0.000 001               micro     $\mu$
$10^{-3}$       0.001                   milli     m
$10^{-2}$       0.01                    centi     c
$10^{-1}$       0.1                     deci      d*
$10^{0}$        1                       -         -
$10^{1}$        10                      deka      da*
$10^{2}$        100                     hecto     h*
$10^{3}$        1000                    kilo      k
$10^{6}$        1 000 000               mega      M
$10^{9}$        1 000 000 000           giga      G
$10^{12}$       1 000 000 000 000       tera      T

* Prefix and symbol not currently in wide use in the physical sciences.


The Greek Alphabet

Name       Capital    Lowercase
Alpha      Α          α
Beta       Β          β
Gamma      Γ          γ
Delta      Δ          δ
Epsilon    Ε          ε
Zeta       Ζ          ζ
Eta        Η          η
Theta      Θ          θ
Iota       Ι          ι
Kappa      Κ          κ
Lambda     Λ          λ
Mu         Μ          μ
Nu         Ν          ν
Xi         Ξ          ξ
Omicron    Ο          ο
Pi         Π          π
Rho        Ρ          ρ
Sigma      Σ          σ
Tau        Τ          τ
Upsilon    Υ          υ
Phi        Φ          φ
Chi        Χ          χ
Psi        Ψ          ψ
Omega      Ω          ω


